ON THE USE OF SIDE INFORMATION FOR MINING TEXT DATA

Authors

  • Laxmi Mehetre, Computer Engineering, D. Y. Patil College Of Engineering, Ambi, Pune
  • Durgesh Patil, Computer Engineering, D. Y. Patil College Of Engineering, Ambi, Pune
  • Manish Pimple, Computer Engineering, D. Y. Patil College Of Engineering, Ambi, Pune
  • Akshay Satkar, Computer Engineering, D. Y. Patil College Of Engineering, Ambi, Pune

Keywords:

Text Mining, Side Information, COATES, Clustering, Data Mining.

Abstract

Side information is available alongside the text documents in several text mining applications. This side
information can be the links within the documents, web logs that record user access behavior, provenance
information, links to any document, or any other non-textual attributes embedded in the text document. These
attributes may contain a large amount of information that is useful for clustering. However, clustering can
become more difficult when some of this information is noisy. In such cases it is risky to merge side information
into the mining process, because it can either improve the quality of the representation used for mining or add
noise to the system. Thus, a principled way of performing the mining process is needed, one that makes use of
side information so as to maximize its advantage. Therefore, we propose designing an efficient algorithm that
combines a classical partitioning algorithm with probabilistic models to create an effective clustering approach.
The clustering approach is then extended to a classification approach on real data sets, which shows the
advantages of using such an approach.
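
The idea of combining a partitioning algorithm with a probabilistic model can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the function names (cosine, side_posterior, cluster), the naive Bayes estimate over side attributes, the equal weighting of the two signals, and the toy documents are all assumptions made for illustration. Each iteration recomputes centroids from the text content (the partitioning step) and then reassigns every document using a blend of cosine similarity and the estimated probability of each cluster given the document's side attributes (the probabilistic step).

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def side_posterior(attrs, members, side, k, alpha=1.0):
    """Estimate P(cluster | side attributes) with a smoothed naive Bayes model.

    This is an illustrative stand-in for the probabilistic model; the
    smoothing constant alpha is an assumption.
    """
    n = sum(len(m) for m in members)
    scores = []
    for j in range(k):
        p = (len(members[j]) + alpha) / (n + alpha * k)  # cluster prior
        for a in attrs:
            hits = sum(1 for i in members[j] if a in side[i])
            p *= (hits + alpha) / (len(members[j]) + 2 * alpha)
        scores.append(p)
    total = sum(scores)
    return [s / total for s in scores]


def cluster(texts, side, seeds, iterations=5, weight=0.5):
    """Partition documents using text content plus side attributes.

    seeds are indices of the documents used as initial centroids;
    weight balances content similarity against the side-information
    posterior (0 means text only).
    """
    vectors = [Counter(t.lower().split()) for t in texts]
    k = len(seeds)
    centroids = [Counter(vectors[s]) for s in seeds]
    # Initial assignment uses text content only.
    labels = [max(range(k), key=lambda j: cosine(v, centroids[j]))
              for v in vectors]
    for _ in range(iterations):
        members = [[i for i, lab in enumerate(labels) if lab == j]
                   for j in range(k)]
        # Partitioning step: rebuild each centroid from its members.
        centroids = []
        for m in members:
            c = Counter()
            for i in m:
                c.update(vectors[i])
            centroids.append(c)
        # Probabilistic step: blend content similarity with the posterior.
        new_labels = []
        for i, v in enumerate(vectors):
            post = side_posterior(side[i], members, side, k)
            score = [(1 - weight) * cosine(v, centroids[j]) + weight * post[j]
                     for j in range(k)]
            new_labels.append(max(range(k), key=score.__getitem__))
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Toy data (hypothetical): the last document mixes vocabulary from both topics.
texts = [
    "football match goal score",
    "football team goal win",
    "election vote party policy",
    "election party vote campaign",
    "team party win campaign",
]
side = [{"sports_site"}, {"sports_site"},
        {"news_site"}, {"news_site"}, {"news_site"}]
labels = cluster(texts, side, seeds=[0, 2])
print(labels)
```

Setting weight to 0 reduces the sketch to a purely content-based partitioning, which gives a simple baseline for checking whether a given kind of side information helps or only adds noise.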

Published

2017-03-25

How to Cite

Laxmi Mehetre, Durgesh Patil, Manish Pimple, & Akshay Satkar. (2017). ON THE USE OF SIDE INFORMATION FOR MINING TEXT DATA. International Journal of Advance Engineering and Research Development (IJAERD), 4(3), 431–433. Retrieved from https://ijaerd.com/index.php/IJAERD/article/view/2056