
Dynamic Ensemble Discovery and Analysis for Data Stream Classification

R. Preethi, P. Sumathi

Abstract


Classification techniques assign category labels to transactions through a learning phase and a testing phase. Conventional classification operates on data stored in databases, with no limit on the number of data scans or on processing time, and its training set must consist of labeled transactions. Stream-based classification instead operates on data collected from remote machines and is constrained in both processing time and the number of data scans, so classification must be completed within short time intervals.
Most existing data stream classification techniques ignore one important aspect of stream data: the arrival of a novel class. A data stream classification technique is adapted to integrate a novel class detection mechanism into traditional classifiers. The system enables automatic detection of novel classes before the true labels of the novel class instances arrive. Novel class detection becomes more challenging in the presence of concept drift, when the underlying data distributions evolve in the stream. To determine whether an instance belongs to a novel class, the classification model sometimes needs to wait for more test instances so that it can discover similarities among them. Novel class identification therefore incurs additional waiting time while the class instances are evaluated, and data point identification is a time-consuming process.
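The waiting behaviour described above can be illustrated with a minimal sketch. It is not the authors' method: it assumes each known class is summarised by a centroid and a radius, treats an instance outside every class sphere as an outlier, and declares a novel class only once enough mutually close outliers have been buffered. The names `NovelClassDetector`, `min_outliers` and `cohesion_radius` are hypothetical and chosen only for illustration.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


class NovelClassDetector:
    """Illustrative detector (an assumption, not the paper's algorithm)."""

    def __init__(self, labeled, min_outliers=3, cohesion_radius=1.0):
        # labeled: dict mapping a class label to its training points.
        self.spheres = {}
        for label, pts in labeled.items():
            c = centroid(pts)
            r = max(math.dist(c, p) for p in pts)
            self.spheres[label] = (c, r)
        self.min_outliers = min_outliers
        self.cohesion_radius = cohesion_radius
        self.buffer = []  # outliers waiting for more evidence

    def classify(self, x):
        """Return the nearest enclosing class label, or None if x is an outlier."""
        best = None
        for label, (c, r) in self.spheres.items():
            d = math.dist(c, x)
            if d <= r and (best is None or d < best[0]):
                best = (d, label)
        if best is not None:
            return best[1]
        self.buffer.append(x)  # defer the decision until more instances arrive
        return None

    def novel_class_found(self):
        """True once enough buffered outliers lie close to their own centroid."""
        if len(self.buffer) < self.min_outliers:
            return False
        c = centroid(self.buffer)
        return all(math.dist(c, p) <= self.cohesion_radius for p in self.buffer)


labeled = {"a": [(0, 0), (1, 0), (0, 1)], "b": [(10, 10), (11, 10), (10, 11)]}
detector = NovelClassDetector(labeled)
known = detector.classify((0.5, 0.5))       # falls inside class "a"
first = detector.classify((5, 5))           # outlier, buffered
early = detector.novel_class_found()        # too few outliers so far
detector.classify((5.2, 5.1))
detector.classify((4.9, 5.3))
late = detector.novel_class_found()         # three cohesive outliers
```

In this sketch the delay is explicit: a single outlier is never enough, which mirrors the trade-off between detection confidence and waiting time noted in the abstract.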
The stream-based mining model collects data from streams originating at remote machines. The stream-based classification model is used to detect novel classes in a concept-drifting environment. Class ensembles are used to compute transaction similarity measures. The Class Based ensembles for Class Evaluation (CBCE) scheme is applied to discover the classes in the streams. The class detection scheme is enhanced to assign class labels in a dynamic feature set environment.
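As a rough illustration of a class-based ensemble, the sketch below keeps one small model per class and adds a new model whenever a previously unseen label arrives. It is a simplified assumption, not the CBCE scheme itself: each class model is just a running centroid, similarity is negative Euclidean distance, and the names `ClassModel` and `ClassBasedEnsemble` are hypothetical.

```python
import math


class ClassModel:
    """Running centroid for one class, updated one instance at a time."""

    def __init__(self, dim):
        self.count = 0
        self.mean = [0.0] * dim

    def update(self, x):
        # Incremental mean: a single pass over the stream, no stored history.
        self.count += 1
        for i, v in enumerate(x):
            self.mean[i] += (v - self.mean[i]) / self.count

    def score(self, x):
        # Higher is more similar; negative distance to the class centroid.
        return -math.dist(tuple(self.mean), tuple(x))


class ClassBasedEnsemble:
    """One model per class; the ensemble grows as new classes appear."""

    def __init__(self):
        self.models = {}

    def learn(self, x, label):
        if label not in self.models:
            self.models[label] = ClassModel(len(x))  # new class enters the ensemble
        self.models[label].update(x)

    def predict(self, x):
        if not self.models:
            return None
        return max(self.models, key=lambda lbl: self.models[lbl].score(x))


ensemble = ClassBasedEnsemble()
empty_prediction = ensemble.predict((0, 0))
ensemble.learn((0, 0), "a")
ensemble.learn((0, 2), "a")
ensemble.learn((10, 10), "b")
before = ensemble.predict((1, 1))
ensemble.learn((20, 0), "c")                 # a class that emerges later in the stream
after = ensemble.predict((19, 1))
```

Keeping the models separate per class means an emerging class only adds one member to the ensemble, leaving the existing class models untouched.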


DOI: https://doi.org/10.37628/ijods.v3i1.246
