Document Type: Article
School of Industrial Engineering, Iran University of Science & Technology, Tehran, Iran.
Clustering is one of the main methods of data mining, and K-means is among the most widely used clustering algorithms because of its efficiency and ease of use. One challenge in clustering is identifying an appropriate label for each cluster, that is, a label that properly describes the records the cluster contains. In some cases, the results and structure of the clusters make choosing such a label difficult. The aim of this study is to present an algorithm based on K-means clustering that facilitates assigning a label to each cluster. Moreover, in many data mining problems the data set contains a large number of fields, so identifying the relevant fields and extracting the required subset of them is an important task. The proposed algorithm also identifies the important and influential variables of the data set and selects the required subset of fields.
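The abstract does not give the details of the proposed algorithm, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It runs plain K-means (Lloyd's algorithm) and then ranks fields by how strongly the cluster centroids differ on each one, since fields on which the centroids differ most are natural candidates for describing, and hence labeling, the clusters. The function names `kmeans` and `rank_fields`, the variance-ratio score, the toy data, and the choice of the first `k` records as initial centroids are all assumptions made for this example.

```python
from statistics import mean, pvariance


def kmeans(points, k, max_iters=100):
    """Plain Lloyd's K-means on a list of equal-length numeric tuples.

    Assumption for this sketch: the first k records serve as the initial
    centroids, which keeps the example deterministic.
    """
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(max_iters):
        # Assignment step: each record goes to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2
                                  for x, m in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(mean(col) for col in zip(*members)))
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


def rank_fields(points, centroids):
    """Rank field indices by a hypothetical importance score.

    Score = variance of the centroid values on a field divided by the
    overall variance of that field. A high score means the clusters are
    well separated along that field.
    """
    scores = []
    for j in range(len(points[0])):
        overall = pvariance([p[j] for p in points])
        between = pvariance([c[j] for c in centroids])
        scores.append(between / overall if overall > 0 else 0.0)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Toy data (made up for illustration): field 0 separates two groups,
# field 1 is small noise that carries no cluster information.
points = [(1.0, 0.5), (1.2, 0.4), (0.8, 0.6),
          (9.0, 0.6), (9.2, 0.5), (8.8, 0.4)]

labels, centroids = kmeans(points, k=2)
ranking = rank_fields(points, centroids)
print("labels:", labels)
print("fields ranked by importance:", ranking)
```

On this toy data the ranking places field 0 first, so a label for each cluster would be phrased in terms of that field (for example "low" versus "high" values), and field 1 could be dropped from the selected subset.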