Semester : SEMESTER 1
Subject : Data Warehousing & Mining
Year : 2017
Term : DECEMBER
Branch : COMPUTER SCIENCE AND ENGINEERING
Scheme : 2015 Full Time
Course Code : 01 CS 6151
Page:2
PART B
a. The training data for a classifier is given below: 6.5
Using a Bayesian Classifier, classify the tuple (Red, SUV, Domestic) as stolen or
not stolen.
Example No.  Color        Type         Origin       Stolen?
1            Red          Sports       Domestic     [illegible]
2            Red          Sports       Domestic     No
3            Red          Sports       Domestic     [illegible]
4            Yellow       Sports       Domestic     [illegible]
5            Yellow       [illegible]  Imported     No
6            Yellow       SUV          [illegible]  No
7            [illegible]  SUV          Domestic     [illegible]
8            Red          SUV          Imported     No
9            Red          SUV          Imported     No
10           [illegible]  Sports       [illegible]  [illegible]
b. Differentiate between agglomerative clustering and divisive clustering.
a. Explain the k-means and k-medoids algorithms that perform effective clustering. 5.5
Illustrate the strengths and weaknesses of k-means in comparison with the k-medoids
algorithm.
b. What are decision trees? Explain how decision trees are useful in data mining. 5.0
a. Suppose that the data mining task is to cluster points (with (x, y) representing 6.5
location) into three clusters, where the points are A2(2,5), A3(8,4),
B1(5,8), B2(7,5), B3(6,4), C1(1,2), C2(4,9).
The distance function is Euclidean distance. Suppose initially we assign A1, B1,
and C1 as the center of each cluster, respectively. Use the k-means algorithm to
show
(i) The three cluster centers after the first round of execution.
(ii) The final three clusters.
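A hand computation for the k-means question above can be checked with a minimal Python sketch such as the one below. The coordinates of A1 do not appear in the point list as printed; the value (2, 10) used here is an assumption taken from the common textbook form of this exercise, so replace it if the original paper differs. The function names (assign, update, kmeans) are illustrative only.

```python
from math import dist  # Euclidean distance, Python 3.8+

def assign(points, centers):
    """Map each point name to the index of its nearest center."""
    return {
        name: min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        for name, p in points.items()
    }

def update(points, labels, k):
    """Recompute each center as the mean of its assigned points.

    Empty clusters are not handled in this sketch.
    """
    centers = []
    for i in range(k):
        members = [points[n] for n, lab in labels.items() if lab == i]
        centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return centers

def kmeans(points, centers, max_rounds=100):
    """Run k-means until assignments stop changing.

    Returns the final labels and the list of centers after each round.
    """
    history = []
    labels = None
    for _ in range(max_rounds):
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
        centers = update(points, labels, len(centers))
        history.append(centers)
    return labels, history

points = {
    "A1": (2, 10),  # ASSUMPTION: not present in the printed point list
    "A2": (2, 5), "A3": (8, 4),
    "B1": (5, 8), "B2": (7, 5), "B3": (6, 4),
    "C1": (1, 2), "C2": (4, 9),
}
initial = [points["A1"], points["B1"], points["C1"]]
labels, history = kmeans(points, initial)

print("Centers after round 1:", history[0])
for i in range(len(initial)):
    print("Cluster", i + 1, sorted(n for n, lab in labels.items() if lab == i))
```

Under the assumed A1, history[0] holds the centers asked for in (i) and labels gives the grouping asked for in (ii). Assignments are recomputed from scratch each round, and the loop stops at the first round in which no point changes cluster.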
b. What are the issues faced by decision-tree-based classification algorithms? 4.0
PART C
a. Explain how the spatial data structures R-tree and k-d tree differ. 3.0
b. How do context-focused crawlers improve the performance of web search? 6.0
a. Illustrate Count Distribution Algorithm (CDA) with the help of an example. 6.0
b. What are Hidden Markov Models (HMMs)? 3.0