Tea in benefits of health: A literature analysis using text mining and latent dirichlet allocation

Ching Hsue Cheng, Wei Lun Hung

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

Tea originated in Asia, where it was initially used as a medicinal herb. Tea varieties are distinguished by their manufacturing processes and levels of oxidation, and different varieties affect health to different degrees. This study therefore applied text mining and Latent Dirichlet Allocation (LDA) to analyze the literature on the health effects of tea. Web of Science served as the literature source, and the search covered 2007 to 2017. A total of 1230 journal articles were collected, and their titles, abstracts, and keywords were used as the experimental dataset. Experimental results show that the variational expectation-maximization (VEM) method yields significantly lower perplexity than Gibbs sampling. This study chooses K=150 topics, the value at which both VEM and Gibbs sampling reach their minimal perplexity. Many topics relate to tea and its compounds, while some contain terms related to health and disease. The top 10 topics suggest that tea could reduce the risk of disease and benefit health.
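
The perplexity-based choice of K described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it fits LDA with a small collapsed Gibbs sampler only (no VEM), and the toy corpus, the hyperparameters `alpha` and `beta`, the candidate values of K, and the function names are all assumptions made for illustration. For simplicity it scores perplexity on the training documents; in practice held-out documents are usually preferred, since in-sample perplexity tends to keep falling as K grows.

```python
import math
import random


def fit_lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
                  iters=50, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (theta, phi).

    docs is a list of documents, each a list of integer word ids.
    alpha and beta are symmetric Dirichlet priors (assumed values).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    topics = range(num_topics)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the full conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in topics
                ]
                z = rng.choices(topics, weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed point estimates of the document-topic and topic-word distributions.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in topics]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
         for w in range(vocab_size)]
        for k in topics
    ]
    return theta, phi


def perplexity(docs, theta, phi):
    """exp of the negative mean per-token log-likelihood; lower is better."""
    log_lik = 0.0
    n_tokens = 0
    for d, doc in enumerate(docs):
        for w in doc:
            p = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


# Hypothetical toy corpus standing in for titles, abstracts, and keywords.
raw_docs = [
    "green tea catechin antioxidant catechin".split(),
    "black tea oxidation theaflavin oxidation".split(),
    "green tea catechin cancer risk".split(),
    "tea consumption cardiovascular disease risk".split(),
    "oolong tea oxidation polyphenol antioxidant".split(),
    "tea polyphenol diabetes risk disease".split(),
]
vocab = sorted({w for doc in raw_docs for w in doc})
word_id = {w: i for i, w in enumerate(vocab)}
docs = [[word_id[w] for w in doc] for doc in raw_docs]

# Score each candidate K and keep the one with the lowest perplexity.
scores = {}
for k in (2, 3, 4):
    theta, phi = fit_lda_gibbs(docs, k, len(vocab))
    scores[k] = perplexity(docs, theta, phi)
best_k = min(scores, key=scores.get)
```

Here `best_k` plays the role of the K=150 reported above. Repeating the same loop with a second inference method would give the kind of VEM versus Gibbs comparison the abstract describes.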

Original language: English
Title of host publication: ICMHI 2018 - Proceedings of 2018 the 2nd International Conference on Medical and Health Informatics
Publisher: Association for Computing Machinery (ACM)
Pages: 148-155
Number of pages: 8
ISBN (Electronic): 9781450363891
Publication status: Published - Jun 8 2018
Externally published: Yes
Event: 2nd International Conference on Medical and Health Informatics, ICMHI 2018 - Tsukuba, Japan
Duration: Jun 8 2018 to Jun 10 2018

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2nd International Conference on Medical and Health Informatics, ICMHI 2018
Country: Japan
City: Tsukuba
Period: 6/8/18 to 6/10/18

Keywords

  • Health
  • Latent Dirichlet Allocation
  • LDA
  • Literature analysis
  • Tea
  • Text mining
  • Topic model

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software
