Formal concept analysis and document clustering via granular computing

Tsau Young Lin, I-Jen Chiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A text/web document is a knowledge representation of a human idea (a structured set of thoughts). This paper refines TFIDF and Extended TFIDF (ETFIDF) [16]; these values actually measure the co-occurrences of tokens. ETFIDF captures the semantics more accurately. Tokens with high TFIDF values are called keywords. Sets of (n+1) co-occurring keywords with high ETFIDF values are called n-granules. The collection of keywords and n-granules can be interpreted geometrically: they form a non-closed simplicial complex. The corresponding non-closed polyhedron is called the Latent Semantic Space (LSS). LSS is a geometric knowledge base that provides semantics to search engines.
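
The pipeline described in the abstract (score tokens, keep the high-scoring ones as keywords, then group co-occurring keywords into n-granules) can be illustrated with a minimal Python sketch. The abstract does not define ETFIDF, so the sketch assumes a stand-in: the number of documents in which a keyword set co-occurs. The function names `tfidf` and `granules` and the `min_support` threshold are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def tfidf(docs):
    """Return {token: highest TF-IDF score over all documents}.

    docs is a list of token lists. TF is the relative frequency of a token
    in a document; IDF is log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = {}
    for doc in docs:
        for tok in set(doc):
            doc_freq[tok] = doc_freq.get(tok, 0) + 1

    scores = {}
    for doc in docs:
        for tok in set(doc):
            tf = doc.count(tok) / len(doc)
            idf = math.log(n_docs / doc_freq[tok])
            scores[tok] = max(scores.get(tok, 0.0), tf * idf)
    return scores


def granules(docs, keywords, n, min_support):
    """Return sets of n+1 keywords that co-occur in >= min_support documents.

    ASSUMPTION: the co-occurrence count stands in for ETFIDF, whose
    definition is not given in the abstract.
    """
    counts = {}
    for doc in docs:
        present = sorted(set(doc) & keywords)
        for combo in combinations(present, n + 1):
            counts[combo] = counts.get(combo, 0) + 1
    return {frozenset(c) for c, k in counts.items() if k >= min_support}


docs = [
    ["wall", "street", "stock", "market"],
    ["wall", "street", "bank"],
    ["stone", "wall", "garden"],
    ["garden", "flower", "street"],
]
scores = tfidf(docs)
keywords = {"wall", "street", "stock", "garden"}  # chosen by hand for the demo
one_granules = granules(docs, keywords, n=1, min_support=2)
two_granules = granules(docs, keywords, n=2, min_support=1)
```

In this toy corpus the only pair that co-occurs in two documents is {"street", "wall"}, so it is the only 1-granule at that threshold. Each n-granule plays the role of an n-simplex whose vertices are keywords. Because a set is kept only when it passes the threshold itself, a granule can be present while some of its subsets are absent, which is one way to read the "non-closed" complex mentioned in the abstract.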

Original language: English
Title of host publication: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
Pages: 4763-4767
Number of pages: 5
Volume: 6
DOIs: https://doi.org/10.1109/ICSMC.2006.385058
Publication status: Published - 2007
Event: 2006 IEEE International Conference on Systems, Man and Cybernetics - Taipei, Taiwan
Duration: Oct 8 2006 to Oct 11 2006

Other

Other: 2006 IEEE International Conference on Systems, Man and Cybernetics
Country: Taiwan
City: Taipei
Period: 10/8/06 to 10/11/06

Keywords

  • Granules
  • Latent semantic space
  • Simplex

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Lin, T. Y., & Chiang, I-J. (2007). Formal concept analysis and document clustering via granular computing. In Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics (Vol. 6, pp. 4763-4767). [4274667] https://doi.org/10.1109/ICSMC.2006.385058