
K-Means Clustering With Optimal Centroid Estimation Mechanism for Electricity Usage Pattern Discovery

N Ranjani, B Kalaavathi

Abstract


Clustering techniques group transactions by relevance. Two families are in common use: hierarchical clustering, which relies on the structure of the data as well as its values, and partition clustering, which relies on similarity between data items and divides the transactions into small groups. The K-means clustering (KMC) algorithm is one of the most widely used partition clustering algorithms. It achieves high accuracy within individual clusters but does not take inter-cluster relationships into account.
The KMC algorithm requires the cluster count as its main input. Transactions are compared with the centroid of each cluster, and the system chooses random transactions as the initial centroids. Cluster accuracy depends on this initial centroid estimation, because every transaction is assigned to a cluster by comparison with the initial centroids. Random selection may pick transactions that are closely related to one another as centroids. In that case the distance between the centroid values is small, which limits the cluster accuracy.
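The role of the initial centroids can be illustrated with a minimal K-means sketch. It is not the authors' implementation: the function names, the `seed` and `max_iter` parameters, and the use of Python rather than the Java system described in this abstract are all assumptions made for brevity.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: euclidean(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, seed=0, max_iter=100):
    """Plain K-means with random transactions as initial centroids.

    `seed` and `max_iter` are illustrative parameters, not from the source.
    """
    rng = random.Random(seed)
    # Random initial centroids: k distinct transactions drawn from the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        new_labels = assign(points, centroids)
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
    return centroids, labels
```

Calling `kmeans(points, k)` with different `seed` values changes only the starting centroids, which makes it a convenient way to observe how sensitive the final partition is to that choice.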
The K-means clustering with simulated annealing (KMC-SA) scheme is built to partition the data with an explicit centroid selection step, which is carried out using the simulated annealing mechanism. The electricity usage pattern discovery system is designed to improve the KMC algorithm with optimal centroid estimation models. Cosine and Euclidean distance measures are used to estimate the similarity between transactions. Precision, recall, and purity measures are used to test cluster accuracy. The system is developed in Java with an Oracle database.
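One way simulated annealing could drive centroid selection is sketched below. The abstract does not give the details of KMC-SA, so everything here is an assumption: the cost function (sum of squared distances to the nearest candidate centroid), the swap move, the geometric cooling schedule, and the parameter names `t0`, `cooling`, and `steps`. The sketch is in Python for brevity, not the Java used by the authors, and it also shows the two distance measures and the purity measure named above.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def cost(points, idx, dist):
    """Sum of squared distances from each point to its nearest candidate."""
    return sum(min(dist(p, points[i]) for i in idx) ** 2 for p in points)


def anneal_centroids(points, k, dist=euclidean, t0=1.0, cooling=0.95,
                     steps=500, seed=0):
    """Pick k transactions as initial centroids (requires k < len(points))."""
    rng = random.Random(seed)
    current = rng.sample(range(len(points)), k)
    cur_cost = cost(points, current, dist)
    best, best_cost = list(current), cur_cost
    t = t0
    for _ in range(steps):
        # Neighbour move: swap one chosen transaction for an unchosen one.
        cand = list(current)
        outside = [i for i in range(len(points)) if i not in current]
        cand[rng.randrange(k)] = rng.choice(outside)
        cand_cost = cost(points, cand, dist)
        delta = cand_cost - cur_cost
        # Metropolis rule: accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, cur_cost = cand, cand_cost
            if cur_cost < best_cost:
                best, best_cost = list(current), cur_cost
        t = max(t * cooling, 1e-9)  # geometric cooling with a small floor
    return [points[i] for i in best], best_cost


def purity(labels, truth):
    """Fraction of points in the majority true class of their cluster."""
    total = 0
    for c in set(labels):
        members = [t for lab, t in zip(labels, truth) if lab == c]
        total += Counter(members).most_common(1)[0][1]
    return total / len(labels)
```

The centroids returned by `anneal_centroids` would replace the random draw at the start of K-means, and passing `dist=cosine_distance` switches the similarity measure. `purity` needs ground-truth class labels, which the abstract does not describe, so its inputs here are hypothetical.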

DOI: https://doi.org/10.37628/ijods.v3i1.247
