Research paper

Choose the area of your preference, whatever you would like to describe in a dataset and explain using data mining. For example: actresses/actors, food, movies, sports, music bands, or anything you want.

Create a data file in .arff format containing about 20 entries, each described by about 4 attributes, with the last attribute containing your preference (the class attribute), e.g.

@relation food

@attribute calories numeric
@attribute taste {sweet, sour, bitter, salty}
@attribute course {appetizer, main, dessert, drink}
@attribute vegetarian {yes, no}
@attribute like_it {yes, no}

@data
100, sweet, dessert, yes, yes % icecream
80, bitter, drink, yes, yes % beer
2, sweet, dessert, yes, no % cake
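Before loading a hand-written file into a data mining tool, it can help to check that every nominal attribute uses matching curly braces and that every data row has one value per attribute. The following is a minimal sketch of such a check, not a full ARFF parser. The function name `parse_arff` is hypothetical, only numeric and nominal attributes are handled, and any `%` is assumed to start a comment (quoted strings are not supported).

```python
def parse_arff(text):
    """Parse a minimal ARFF document into (attributes, rows).

    attributes: list of (name, kind), where kind is "numeric" or a list
    of allowed nominal values. rows: list of lists of parsed values.
    Raises ValueError on a malformed declaration or data row.
    """
    attributes, rows = [], []
    in_data = False
    for raw in text.splitlines():
        # Assumption: '%' always starts a comment (no quoted strings).
        line = raw.split("%", 1)[0].strip()
        if not line:
            continue
        lower = line.lower()
        if not in_data:
            if lower.startswith("@attribute"):
                _, name, spec = line.split(None, 2)
                if spec.startswith("{") and spec.endswith("}"):
                    kind = [v.strip() for v in spec[1:-1].split(",")]
                elif spec.lower() in ("numeric", "real", "integer"):
                    kind = "numeric"
                else:
                    raise ValueError(f"bad attribute type for {name}: {spec}")
                attributes.append((name, kind))
            elif lower.startswith("@data"):
                in_data = True
            # Other header lines such as @relation are skipped.
            continue
        values = [v.strip() for v in line.split(",")]
        if len(values) != len(attributes):
            raise ValueError(f"expected {len(attributes)} values: {line}")
        row = []
        for value, (name, kind) in zip(values, attributes):
            if kind == "numeric":
                row.append(float(value))
            elif value in kind:
                row.append(value)
            else:
                raise ValueError(f"{value!r} is not a declared value of {name}")
        rows.append(row)
    return attributes, rows


# The food sample, embedded as a string for the check.
SAMPLE = """\
@relation food
@attribute calories numeric
@attribute taste {sweet, sour, bitter, salty}
@attribute course {appetizer, main, dessert, drink}
@attribute vegetarian {yes, no}
@attribute like_it {yes, no}
@data
100, sweet, dessert, yes, yes % icecream
80, bitter, drink, yes, yes % beer
2, sweet, dessert, yes, no % cake
"""

attributes, rows = parse_arff(SAMPLE)
print(f"{len(attributes)} attributes, {len(rows)} rows")
```

A declaration that opens with a parenthesis instead of a curly brace, or a row with a misspelled nominal value, makes the sketch raise `ValueError` with the offending line.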

Compare 3 algorithms for classifying your data: a decision tree learner, a classification or association rule learner, and naive Bayes. For each algorithm, check the classification error (which algorithm explains your personal preferences best?) and examine the generated rules or model (do they tell you anything interesting?).
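To make "checking the error" concrete, here is a minimal sketch of one of the three learners, naive Bayes, scored by leave-one-out error. It is an illustration under stated assumptions, not the required method: a data mining tool would normally do this for you. The function names are hypothetical, only nominal attributes are used (calories is dropped), probabilities use add-one (Laplace) smoothing, and all rows beyond the first three are made-up placeholders.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows):
    """Count class and per-attribute value frequencies.

    Each row is a tuple of nominal values; the last one is the class.
    """
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> counts
    domains = defaultdict(set)           # attribute index -> values seen
    for row in rows:
        for i, value in enumerate(row[:-1]):
            value_counts[(i, row[-1])][value] += 1
            domains[i].add(value)
    return class_counts, value_counts, domains


def predict(model, features):
    """Return the class with the highest smoothed log-probability."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(features):
            # Add-one smoothing over the values seen for this attribute.
            k = len(domains[i] | {value})
            score += math.log((value_counts[(i, label)][value] + 1) / (count + k))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def leave_one_out_error(rows):
    """Fraction of rows misclassified when each is held out in turn."""
    errors = 0
    for i, held_out in enumerate(rows):
        model = train_naive_bayes(rows[:i] + rows[i + 1:])
        if predict(model, held_out[:-1]) != held_out[-1]:
            errors += 1
    return errors / len(rows)


# Columns: taste, course, vegetarian, like_it.
ROWS = [
    ("sweet", "dessert", "yes", "yes"),    # icecream (from the sample)
    ("bitter", "drink", "yes", "yes"),     # beer (from the sample)
    ("sweet", "dessert", "yes", "no"),     # cake (from the sample)
    ("salty", "main", "no", "yes"),        # hypothetical placeholder
    ("sour", "appetizer", "yes", "no"),    # hypothetical placeholder
    ("salty", "appetizer", "no", "yes"),   # hypothetical placeholder
]

error = leave_one_out_error(ROWS)
print(f"leave-one-out error: {error:.2f}")
```

Running the same `leave_one_out_error` procedure with each of the three learners gives error figures that can be compared directly. With only about 20 entries, a difference of one or two misclassified rows is not strong evidence that one algorithm is better.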
