Wednesday 11 January 2012

Overview of Data Mining Techniques

Hello Everyone,
Since we worked through some clustering examples in today's BA class, I thought I would give you all a gist of the other techniques used in data mining. These include statistics, nearest neighbor, decision trees, neural networks and rule induction.

Statistics - The crux of statistics is collecting data and summarizing it. Tools such as histograms, pie charts and linear regression help in summarizing and analyzing the collected data, which makes it possible to spot patterns that help in making useful decisions.
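
To make the statistics idea concrete, here is a small Python sketch of a text histogram and a least-squares line fit. The data and the helper name fit_line are made up for illustration and do not come from the class examples.

```python
from collections import Counter

# Hypothetical sample: number of purchases made by ten customers.
purchases = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

# Histogram: count how often each value occurs and draw it as text bars.
histogram = Counter(purchases)
for value in sorted(histogram):
    print(value, "#" * histogram[value])


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend (x) against sales (y).
ad_spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]
slope, intercept = fit_line(ad_spend, sales)
print("sales is roughly", slope, "* spend +", intercept)
```

The histogram summarizes how the values are distributed, while the fitted line summarizes how one variable moves with another.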

Nearest Neighbor - This technique is very similar to clustering, since both rely on the idea that objects which are close to each other are similar. The difference lies in the goal: clustering groups objects that are close together, whereas nearest neighbor is a prediction technique. To predict a value for a new record, it looks up the most similar ('nearest') records in the historical data and uses their known outcomes.
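
A minimal sketch of nearest-neighbor prediction in Python is given here, assuming straight-line (Euclidean) distance and a majority vote among the k closest records. The function name knn_predict and the customer data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(records, query, k=3):
    """Predict a label for `query` by majority vote among the k closest records."""
    by_distance = sorted(records, key=lambda rec: math.dist(rec[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical history: (age, yearly income in thousands) -> bought the product?
history = [
    ((25, 30), "no"),
    ((27, 32), "no"),
    ((30, 35), "no"),
    ((45, 80), "yes"),
    ((50, 85), "yes"),
    ((48, 90), "yes"),
]

# A new customer is labelled according to the most similar past customers.
prediction = knn_predict(history, (47, 82))
print(prediction)
```

In practice the attributes would usually be rescaled first, so that one attribute with large values (such as income) does not dominate the distance.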

Decision Tree - A decision tree breaks a decision down into a series of questions. Each branch represents one possible answer and each leaf represents an outcome, often with a probability attached. It can also be used for segmentation, since every split separates dissimilar objects into different groups. This helps in zeroing in on which approach to follow in order to achieve the desired goal.
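
The segmentation step of a decision tree can be sketched in Python as a search for the single question that best separates the data. This sketch assumes Gini impurity as the measure of how mixed a segment is, which is one common choice among several. The names gini and best_split and the data are illustrative only.

```python
def gini(labels):
    """Gini impurity: 0.0 when all labels agree, higher when they are mixed."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows, labels):
    """Return ((feature index, threshold), score) for the purest two-way split."""
    best = None
    best_score = float("inf")
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must produce two non-empty segments
            # Weighted average impurity of the two segments.
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best_score = score
                best = (feature, threshold)
    return best, best_score


# Hypothetical customers: (age, yearly income in thousands) and whether they bought.
rows = [(25, 30), (27, 32), (30, 35), (45, 80), (50, 85), (48, 90)]
labels = ["no", "no", "no", "yes", "yes", "yes"]

split, score = best_split(rows, labels)
print("split on feature", split[0], "at threshold", split[1], "impurity", score)
```

A full tree would repeat this search inside each resulting segment until the segments are pure enough or too small to split further.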

Neural Network - It mimics the behavior of neurons in the human nervous system. It is a network of linked nodes (artificial neurons) that pass signals from the inputs through to the outputs. Each link stores a parameter called a 'weight', which determines how much influence that input has on the calculation.
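
The role of the weights can be illustrated with a single artificial neuron in Python. This sketch assumes the classic perceptron learning rule and a logical AND task, neither of which was covered in class. The names step, predict and train_perceptron are made up for the example.

```python
def step(x):
    """Activation function: the neuron fires (1) only for a positive signal."""
    return 1 if x > 0 else 0


def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_perceptron(samples, epochs=20, lr=1.0):
    """Nudge the weights whenever the neuron's output differs from the target."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


# Training data for logical AND: the output is 1 only when both inputs are 1.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(and_samples)
for inputs, target in and_samples:
    print(inputs, "->", predict(weights, bias, inputs), "expected", target)
```

A real neural network connects many such neurons in layers and uses a smoother activation function, but the idea of learning by adjusting weights is the same.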

Rule Induction - This is one of the most commonly used data mining techniques. It converts the regularities hidden in the collected data into 'if-then' rules. Each rule comes with a measure of how strong the pattern is and how often it occurs, which helps a user judge which patterns matter most.
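
As a rough Python sketch, simple 'if A then B' rules can be mined from shopping baskets by measuring support (how often A and B occur together) and confidence (how often B occurs when A does). The function name induce_rules, the thresholds and the basket data are assumptions made for illustration.

```python
from itertools import permutations


def induce_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Find single-item rules 'if a then b' with their support and confidence."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    rules = []
    for a, b in permutations(items, 2):
        count_a = sum(1 for t in transactions if a in t)
        count_ab = sum(1 for t in transactions if a in t and b in t)
        support = count_ab / n  # share of all baskets containing both items
        confidence = count_ab / count_a if count_a else 0.0  # share of a-baskets with b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

rules = induce_rules(baskets)
for a, b, support, confidence in rules:
    print("if", a, "then", b, "| support", support, "| confidence", confidence)
```

Rules that pass both thresholds are the stronger patterns. Raising the thresholds gives fewer but more reliable rules.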

Bye

Sources:
http://www.thearling.com/text/dmtechniques/dmtechniques.htm
http://lightning.eecs.ku.edu/Rule-Induction-new.pdf