Parallel Design of Hash and K-means Algorithm in the Context of Big Data
		
		
		DOI: 10.25236/mmmce.2019.129
		
			Author(s)
			Xing Lei, Zhang Xiang, Guo Zhengkun, Guo Fuwang
		 
		
			
Corresponding Author
			Xing Lei		
		
			
Abstract
			To further improve the efficiency of the K-means algorithm on large-scale data clustering, this paper analyzes the optimization of K-means clustering and proposes a scheme for selecting the initial cluster centers based on a hash algorithm. Massive high-dimensional data are hashed into a compressed space to uncover their clustering relations, so that the selected initial cluster centers lie as close as possible to the converged state. This greatly reduces the number of clustering iterations and improves clustering accuracy.
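
			The idea of hash-based seeding can be illustrated with a minimal sketch. The abstract does not specify the hash function or the parallel design, so everything below is an assumption made for illustration: the hash is a sign-of-random-projection code, the seeds are the means of the most populated buckets, and the names `hash_seed_centers`, `kmeans`, and `n_planes` are hypothetical. It is a single-machine sketch, not the authors' implementation.

```python
import random
from collections import defaultdict


def hash_seed_centers(points, k, n_planes=4, seed=0):
    """Pick k initial centers by bucketing points with a random-projection hash.

    Assumption: this hash is illustrative only; the paper's actual hash
    function is not given in the abstract.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # One random hyperplane per hash bit.
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]
    buckets = defaultdict(list)
    for p in points:
        # The bucket key is the sign pattern of the projections.
        key = tuple(sum(w * x for w, x in zip(plane, p)) >= 0 for plane in planes)
        buckets[key].append(p)
    # Use the means of the k most populated buckets as initial centers.
    largest = sorted(buckets.values(), key=len, reverse=True)[:k]
    centers = [[sum(col) / len(b) for col in zip(*b)] for b in largest]
    # Fall back to random points if fewer than k buckets are occupied.
    while len(centers) < k:
        centers.append(list(rng.choice(points)))
    return centers


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations; returns (centers, number of iterations used)."""
    for it in range(1, max_iter + 1):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = [
            [sum(col) / len(c) for col in zip(*c)] if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            return new_centers, it
        centers = new_centers
    return centers, max_iter
```

			Calling `kmeans(points, hash_seed_centers(points, k))` runs standard K-means from the hashed seeds. The returned iteration count can be compared against a run started from randomly chosen centers to see whether the seeding helps on a given data set.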
		
			
Keywords
			K-means, Hash, mass data