One of the techniques of multivariate analysis: a data analysis and classification technique, or the general term for algorithms that divide a set of objects (a collection of data) into several groups (clusters) according to the similarity (distance) between samples. In particular, it is a mathematical method for classifying, automatically and quantitatively, data for which no external classification criterion exists.
As a concrete procedure, similarity is defined first, and the similarity between samples is then expressed numerically. From this, the distance between each pair of samples is calculated, the samples are grouped together (clustering), and the distances between the resulting clusters are calculated in turn. Distance measures include the Euclidean distance, the squared Euclidean distance, the standardized Euclidean distance, and others.
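The three distance measures named above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function names and the sample data are hypothetical, and the standardized Euclidean distance is assumed here to divide each squared coordinate difference by the sample variance of that variable.

```python
import math
import statistics


def squared_euclidean(x, y):
    """Squared Euclidean distance: sum of squared coordinate differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def euclidean(x, y):
    """Euclidean distance: square root of the squared Euclidean distance."""
    return math.sqrt(squared_euclidean(x, y))


def feature_variances(samples):
    """Sample variance of each variable (column) across all samples."""
    return [statistics.variance(column) for column in zip(*samples)]


def standardized_euclidean(x, y, variances):
    """Euclidean distance with each squared difference scaled by a variance.

    Assumption: the scaling uses per-variable sample variances.
    """
    return math.sqrt(
        sum((a - b) ** 2 / v for a, b, v in zip(x, y, variances))
    )


def distance_matrix(samples, dist):
    """Pairwise distances between all samples under the measure `dist`."""
    n = len(samples)
    return [[dist(samples[i], samples[j]) for j in range(n)] for i in range(n)]


# Hypothetical data: three samples described by two variables.
samples = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
variances = feature_variances(samples)
matrix = distance_matrix(samples, euclidean)
```

The distance matrix is the input that a clustering step works from; swapping `euclidean` for another measure changes which samples count as similar.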
Many clustering techniques (algorithms) have also been proposed, and they are classified in various ways depending on the analysis and the application, but they are most often explained in terms of hierarchical and non-hierarchical methods. Hierarchical methods include the shortest distance method, the longest distance method, the median method, the centroid method, the group average method, Ward's method and the flexible method; non-hierarchical methods include the K-means method (c-means method) and the self-organizing map.
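The contrast between the two families can be sketched in a few lines. The code below is an illustrative assumption, not a definitive implementation. The hierarchical part repeatedly merges the two closest clusters and covers only the shortest distance, longest distance and group average methods, under the labels "single", "complete" and "average". The non-hierarchical part is the usual K-means iteration with initial centers supplied by the caller. All function names and the sample points are hypothetical; the median, centroid, Ward and flexible methods and the self-organizing map are not shown.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cluster_distance(c1, c2, points, linkage):
    """Distance between two clusters given as lists of point indices."""
    d = [euclidean(points[i], points[j]) for i in c1 for j in c2]
    if linkage == "single":    # shortest distance method
        return min(d)
    if linkage == "complete":  # longest distance method
        return max(d)
    if linkage == "average":   # group average method
        return sum(d) / len(d)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerative(points, n_clusters, linkage="single"):
    """Hierarchical: merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(clusters[a], clusters[b], points, linkage)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters]


def k_means(points, initial_centers, max_iter=100):
    """Non-hierarchical: assign to nearest center, then recompute centers."""
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(centers)), key=lambda k: euclidean(p, centers[k]))
            for p in points
        ]
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[k])  # keep center of empty cluster
        if new_centers == centers:  # converged: centers no longer move
            break
        centers = new_centers
    return labels, centers


# Hypothetical data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
groups = agglomerative(points, n_clusters=2, linkage="average")
labels, centers = k_means(points, initial_centers=[points[0], points[3]])
```

The hierarchical procedure needs no initial guess and can be stopped at any number of clusters, whereas K-means requires the number of clusters and starting centers up front and its result can depend on that choice.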
Cluster analysis originally developed as a method of numerical phenetics (numerical taxonomy) within biological taxonomy, but at present it is widely used, in fields ranging from psychology, sociology and cognitive science to business analysis, marketing and the development of various products, as a general |