Agglomerative algorithm to discover semantics from unstructured big data

I-Jen Chiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The paper presents a graph model and an agglomerative algorithm for text document clustering. Given a set of documents, the associations among terms that frequently co-occur in any of the documents naturally form a graph, which can be decomposed into connected components at various levels. Each connected component represents a concept in the collection. These concepts can categorize documents into different semantic classes. Experiments on three data sets drawn from news, the Web, and medical literature show that our algorithm performs significantly better than traditional clustering algorithms such as k-means, principal direction divisive partitioning, AutoClass, and hierarchical clustering.
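The idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the whitespace tokenization, the `min_support` threshold, the union-find grouping, and the overlap-based `assign` step are all assumptions made for illustration, and the function names are hypothetical. Terms that co-occur in at least `min_support` documents are linked by an edge, the connected components of the resulting graph are treated as concepts, and each document is labeled with the concept it shares the most terms with.

```python
from collections import Counter
from itertools import combinations


def concept_components(docs, min_support=2):
    """Group terms into connected components of a co-occurrence graph."""
    # Count in how many documents each unordered term pair co-occurs.
    pair_counts = Counter()
    for doc in docs:
        terms = sorted(set(doc.lower().split()))
        pair_counts.update(combinations(terms, 2))

    # Union-find over terms joined by a frequent co-occurrence edge.
    parent = {}

    def find(t):
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for (a, b), count in pair_counts.items():
        if count >= min_support:
            parent[find(a)] = find(b)

    # Collect the terms under each root; every set is one "concept".
    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return list(groups.values())


def assign(doc, concepts):
    """Return the index of the concept sharing the most terms with doc, or None."""
    terms = set(doc.lower().split())
    overlaps = [len(terms & c) for c in concepts]
    if not overlaps or max(overlaps) == 0:
        return None
    return overlaps.index(max(overlaps))


# Toy corpus (invented for this sketch, not data from the paper).
docs = [
    "stock market shares fall",
    "stock market shares rise",
    "gene protein expression study",
    "gene protein expression cancer",
]
concepts = concept_components(docs, min_support=2)
labels = [assign(d, concepts) for d in docs]
```

Changing `min_support` changes which edges survive, which is one simple way to obtain components at different levels of granularity. The paper's own decomposition may differ.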

Original language: English
Title of host publication: Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1556-1563
Number of pages: 8
ISBN (Print): 9781479999255
DOIs: https://doi.org/10.1109/BigData.2015.7363920
Publication status: Published - Dec 22 2015
Event: 3rd IEEE International Conference on Big Data, IEEE Big Data 2015 - Santa Clara, United States
Duration: Oct 29 2015 → Nov 1 2015

Other

Other: 3rd IEEE International Conference on Big Data, IEEE Big Data 2015
Country: United States
City: Santa Clara
Period: 10/29/15 → 11/1/15

Keywords

  • agglomerative document categorization/clustering
  • association rules
  • hierarchical clustering
  • hypergraph

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
  • Software


Cite this

    Chiang, I-J. (2015). Agglomerative algorithm to discover semantics from unstructured big data. In Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015 (pp. 1556-1563). [7363920] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData.2015.7363920