Page 187 - Proceedings of the 2017 ITU Kaleidoscope
TASIS: TREND ANALYSIS SYSTEM FOR INTERNATIONAL STANDARDS
Myeongha Hwang 1, Minkyo In 2, Suwook Ha 2, Kangchan Lee 2
1 University of Science and Technology, Daejeon, Korea, hmh929@etri.re.kr
2 Electronics and Telecommunications Research Institute, Daejeon, Korea,
{mkin, sw.ha, chan}@etri.re.kr
ABSTRACT

Text mining has recently emerged as a technology for analyzing meaningful trends and topics in document collections. Despite its increasing use in various research areas, no previous studies have applied it to document collections of international standards. In this paper, we propose the Trend Analysis System for International Standards (TASIS), which automatically performs topic modeling and trend analysis on document collections of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) Recommendations, based on a latent Dirichlet allocation (LDA) algorithm. Offered as a Web service, TASIS performs topic modeling with user-defined parameters, such as the number of topics and iterations, and its results list the documents in which each keyword of a topic appears. TASIS also draws a TreeMap that reflects the size of each extracted topic, giving a graphical representation that is easier to understand.

Keywords: Text Mining, Latent Dirichlet Allocation, International Standards, Topic Modeling, Trend Analysis

1. INTRODUCTION

Text mining broadly describes a range of technologies for analyzing and processing semi-structured and unstructured text data [1]. In particular, as text data becomes more important because of the explosion of Internet users [2], text mining can summarize documents as well as analyze human emotions [3, 4]. Additionally, research has been carried out to identify technology trend patterns from patent documents, and in some cases the evolution of patents related to specific products and technologies has been traced and the direction of next-generation development suggested [5].

Here, we apply text mining to standard documents to better understand trend analysis and research trends. International standard documents are a record of societal orientation and have great historical value for technologies. Therefore, we analyzed the Recommendations of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) to perform an objective analysis of international standards and information technology (IT) research.

Although topic modeling, a text mining technique, is a statistical inference model developed for finding hidden topics in text, it has not been used for the analysis of international standards. Therefore, we collected the international standard documents published by ITU-T, an international standards organization, and examined the topics of these standards by performing topic modeling experiments based on a latent Dirichlet allocation (LDA) algorithm [6]. Additionally, we developed the Trend Analysis System for International Standards (TASIS), which performs topic modeling and trend analysis automatically, making it possible to analyze trends at various points, in accordance with user requirements.

2. RELATED WORK

2.1. Trend Analysis

Trend analysis is defined as a method of identifying and describing specific changes over a long period of time, so that the future can be predicted using past patterns [7]. Trend analysis for predicting the rapidly advancing IT field is becoming increasingly important. Qualitative research and trend analysis methods based on expert opinion are prone to individual subjectivity. On the other hand, quantitative research and trend analysis methods are employed for performance evaluation and for predicting the future using collected data, such as papers and articles. Therefore, researchers are trying to overcome these limitations by combining quantitative and qualitative research methods [8, 9]. One solution to these limitations is a text mining methodology that analyzes trends based on text data. Trend analysis using text mining is a technique for extracting meaningful patterns from digitized text in unstructured data. As such, we can extract the main topics in related fields from accumulated research literature or papers, and determine trend patterns using international standard documents.

Trend analysis research using text mining has been conducted in various fields. First, topic modeling has been applied to Proceedings of the National Academy of Sciences (PNAS) abstracts to determine the topics of recent active research [10]. Second, there are studies on topic detection and trend analysis methodology using LDA algorithms [11]. Moreover, research has contributed to the realization of business intelligence for banks by analyzing its application from 2002 to 2013 using an LDA algorithm [12].
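
To make the LDA-based topic modeling discussed above more concrete, the following is a minimal sketch, not the TASIS implementation. It assumes a collapsed Gibbs sampler (the text does not state which inference method is used) and exposes the number of topics and iterations as parameters. The function names `fit_lda`, `top_keywords`, and `keyword_documents`, the hyperparameters `alpha` and `beta`, and the toy documents and titles are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def fit_lda(docs, num_topics, iterations, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]            # tokens of doc d in topic k
    topic_word = [defaultdict(int) for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                          # tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word, topic_total, doc_topic


def top_keywords(topic_word, n=3):
    """Return up to n of the most frequent words for each topic."""
    return [
        [w for w, c in sorted(tw.items(), key=lambda x: (-x[1], x[0])) if c > 0][:n]
        for tw in topic_word
    ]


def keyword_documents(docs, titles, keywords):
    """Map each keyword to the titles of the documents that contain it."""
    return {kw: [t for t, doc in zip(titles, docs) if kw in doc] for kw in keywords}


# Toy corpus with hypothetical titles; real input would be preprocessed standards text.
titles = ["doc-A", "doc-B", "doc-C", "doc-D"]
docs = [
    "video coding codec video compression".split(),
    "optical fibre network transmission optical".split(),
    "video codec compression quality".split(),
    "network transmission fibre cable".split(),
]

topic_word, topic_total, doc_topic = fit_lda(docs, num_topics=2, iterations=200)
keywords = top_keywords(topic_word, n=3)
index = keyword_documents(docs, titles, {w for kws in keywords for w in kws})
# Share of all tokens per topic; one possible basis for sizing TreeMap cells.
shares = [n / sum(topic_total) for n in topic_total]

for k, kws in enumerate(keywords):
    print(f"topic {k} (share {shares[k]:.2f}): {kws}")
for kw, found in sorted(index.items()):
    print(f"{kw}: {found}")
```

The keyword-to-document index mirrors the result listing described in the abstract, and the per-topic token share is one plausible quantity for sizing a TreeMap. Both are illustrative choices rather than a description of how TASIS computes them.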
978-92-61-24291-6/CFP1768P-ART © 2017 ITU – 171 – Kaleidoscope