Discover the semantic topology in high-dimensional data

I. J. Chiang

Research output: Contribution to journal, Article

7 Citations (Scopus)

Abstract

Discovering homogeneous concept groups in high-dimensional data sets and clustering the data accordingly is a contemporary challenge. Conventional clustering techniques are often based on a Euclidean metric. However, this metric is ad hoc and not intrinsic to the semantics of the documents. In this paper, we propose a novel approach in which the semantic space of high-dimensional data is structured as a simplicial complex in Euclidean space (a hypergraph, but with a different focus). Such a simplicial structure intrinsically captures the semantics of the data; for example, the coherent topics of documents will appear in the same connected component. Finally, we cluster the data according to the structure of concepts, which is organized by this geometry.
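The abstract does not spell out the construction, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes that documents are bags of terms, that a set of terms co-occurring in at least `min_support` documents is treated as a simplex (in the spirit of the "Association rules" keyword), and that clusters are read off the connected components of the resulting complex. All function names, parameters, and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_simplices(docs, min_support=2, max_dim=2):
    """Return term sets that co-occur in at least min_support documents.

    A set of q+1 terms is treated as a q-simplex. Support can only shrink
    as a term set grows, so every face of a frequent set is also frequent
    and the result is closed under taking faces (a simplicial complex).
    """
    counts = Counter()
    for doc in docs:
        terms = sorted(set(doc))
        for size in range(1, max_dim + 2):
            for combo in combinations(terms, size):
                counts[combo] += 1
    return {frozenset(c) for c, n in counts.items() if n >= min_support}


def connected_components(simplices):
    """Group vertices (terms) that are linked through shared simplices."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for simplex in simplices:
        verts = list(simplex)
        for v in verts:
            find(v)  # register every vertex
        for v in verts[1:]:
            parent[find(v)] = find(verts[0])  # union with the first vertex

    groups = {}
    for v in list(parent):
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


def cluster_documents(docs, components):
    """Label each document with the component sharing the most terms."""
    labels = []
    for doc in docs:
        overlaps = [len(set(doc) & comp) for comp in components]
        best = max(range(len(components)), key=overlaps.__getitem__, default=None)
        labels.append(best if best is not None and overlaps[best] > 0 else None)
    return labels


# Hypothetical toy corpus: three finance-like and two biology-like documents.
docs = [
    ["wall", "street", "stock"],
    ["wall", "street", "market"],
    ["stock", "market", "wall"],
    ["protein", "gene", "cell"],
    ["gene", "cell", "dna"],
]
simplices = frequent_simplices(docs, min_support=2)
components = connected_components(simplices)
labels = cluster_documents(docs, components)
```

On this toy corpus the frequent term sets form two connected components, one per topic, and each document is labelled with the component it overlaps most. The order of the components is not fixed, so only the grouping of the labels is meaningful, not their numeric values.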

Original language: English
Pages (from-to): 256-262
Number of pages: 7
Journal: Expert Systems with Applications
Volume: 33
Issue number: 1
DOI: 10.1016/j.eswa.2006.05.033
Publication status: Published - Jul 2007

Fingerprint

Semantics
Topology
Geometry

Keywords

  • Association rules
  • Document clustering
  • Hierarchical clustering
  • Simplicial complex

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications

Cite this

Discover the semantic topology in high-dimensional data. / Chiang, I. J.

In: Expert Systems with Applications, Vol. 33, No. 1, 07.2007, p. 256-262.

@article{032213de4ac44568a69acf107942ea26,
title = "Discover the semantic topology in high-dimensional data",
abstract = "Discovering homogeneous concept groups in high-dimensional data sets and clustering the data accordingly is a contemporary challenge. Conventional clustering techniques are often based on a Euclidean metric. However, this metric is ad hoc and not intrinsic to the semantics of the documents. In this paper, we propose a novel approach in which the semantic space of high-dimensional data is structured as a simplicial complex in Euclidean space (a hypergraph, but with a different focus). Such a simplicial structure intrinsically captures the semantics of the data; for example, the coherent topics of documents will appear in the same connected component. Finally, we cluster the data according to the structure of concepts, which is organized by this geometry.",
keywords = "Association rules, Document clustering, Hierarchical clustering, Simplicial complex",
author = "Chiang, {I. J.}",
year = "2007",
month = "7",
doi = "10.1016/j.eswa.2006.05.033",
language = "English",
volume = "33",
pages = "256--262",
journal = "Expert Systems with Applications",
issn = "0957-4174",
publisher = "Elsevier Limited",
number = "1",

}

TY - JOUR

T1 - Discover the semantic topology in high-dimensional data

AU - Chiang, I. J.

PY - 2007/7

Y1 - 2007/7

N2 - Discovering homogeneous concept groups in high-dimensional data sets and clustering the data accordingly is a contemporary challenge. Conventional clustering techniques are often based on a Euclidean metric. However, this metric is ad hoc and not intrinsic to the semantics of the documents. In this paper, we propose a novel approach in which the semantic space of high-dimensional data is structured as a simplicial complex in Euclidean space (a hypergraph, but with a different focus). Such a simplicial structure intrinsically captures the semantics of the data; for example, the coherent topics of documents will appear in the same connected component. Finally, we cluster the data according to the structure of concepts, which is organized by this geometry.

KW - Association rules

KW - Document clustering

KW - Hierarchical clustering

KW - Simplicial complex

UR - http://www.scopus.com/inward/record.url?scp=33845625115&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33845625115&partnerID=8YFLogxK

U2 - 10.1016/j.eswa.2006.05.033

DO - 10.1016/j.eswa.2006.05.033

M3 - Article

VL - 33

SP - 256

EP - 262

JO - Expert Systems with Applications

JF - Expert Systems with Applications

SN - 0957-4174

IS - 1

ER -