Page 190 - Proceedings of the 2017 ITU Kaleidoscope
2017 ITU Kaleidoscope Academic Conference
parameter of the keyword in the topic representing each document as per the expression below, and the occurrence rates sum to 100.

As per Table 2, the keywords of the representative topic of Recommendation ITU-T Y.3501: Cloud computing – Framework and high-level requirements are service, cloud, CSP, CSC, computing, datum, capability, application, resource, and note. By calculating the occurrence rate, TASIS shows that the “service” keyword accounts for 24.7% of the representative topic in the Y.3501 international standard document.

4.3. Topic Modeling Function

Topic modeling, which runs with user-defined values, is implemented using the Mallet API, with the topic results provided by the LDA algorithm. Mallet, the MAchine Learning for LanguagE Toolkit [18], is a Java-based API that can perform various machine learning functions, such as natural language processing of text, document classification, document clustering, topic modeling, and information extraction. Additionally, TASIS provides a graph based on the Dirichlet parameters. The graph is rendered as a TreeMap, which recursively subdivides an area into rectangles [19]. As a result, the topic represented by the cluster with the largest colored area among all clusters is the representative topic in Figure 4.
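
The recursive subdivision behind a TreeMap can be illustrated with a short sketch. This is not the TASIS implementation: it assumes the simple "slice-and-dice" layout, in which a rectangle is cut into strips whose areas are proportional to the node weights and the cut direction alternates at each nesting level. The function names `weight` and `slice_and_dice`, the nested-tuple input format, and the 823 x 100 canvas are illustrative assumptions; the weights are the Dirichlet parameters listed in Table 2.

```python
# Hypothetical sketch of a slice-and-dice treemap layout (not TASIS code).
# A node is (label, number) for a leaf or (label, [child nodes]) for a group.

def weight(node):
    """Weight of a leaf is its number; weight of a group is the sum of its children."""
    _, payload = node
    if isinstance(payload, list):
        return sum(weight(child) for child in payload)
    return payload

def slice_and_dice(node, x, y, w, h, vertical=True):
    """Return a list of (label, x, y, w, h) rectangles, one per leaf.

    The rectangle (x, y, w, h) is cut into strips proportional to the
    children's weights; the cut direction flips at each nesting level.
    """
    label, payload = node
    if not isinstance(payload, list):
        return [(label, x, y, w, h)]
    total = weight(node)
    rects = []
    offset = 0.0  # fraction of the parent rectangle already used
    for child in payload:
        share = weight(child) / total
        if vertical:   # strips placed side by side along the x axis
            rects += slice_and_dice(child, x + offset * w, y, share * w, h, False)
        else:          # strips stacked along the y axis
            rects += slice_and_dice(child, x, y + offset * h, w, share * h, True)
        offset += share
    return rects

# Dirichlet parameters of the Y.3501 keywords, taken from Table 2.
tree = ("Y.3501", [
    ("service", 204), ("cloud", 177), ("CSP", 103), ("CSC", 67),
    ("computing", 56), ("datum", 51), ("capability", 48),
    ("application", 44), ("resource", 40), ("note", 33),
])

rects = slice_and_dice(tree, 0.0, 0.0, 823.0, 100.0)
largest = max(rects, key=lambda r: r[3] * r[4])  # rectangle with the largest area
print(largest[0])
```

Under these assumptions the keyword with the largest Dirichlet parameter receives the largest rectangle, which mirrors how the largest colored area identifies the representative topic.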
4.4. Trend Analysis and Document Find Function

When the user clicks a keyword in the table, the corresponding Trend Analysis and Document Find Function provides the trend analysis and the international standard document list shown in Figure 5. TASIS lists the international standard documents together with their occurrence rate from the topic table. The documents can be sorted in descending order by the occurrence rate of the keyword. The table displayed on the web page is composed of a document number (Document), publication year (Year), occurrence rate (Occurrence Rate), and document title (Title). Additionally, the document number is hyperlinked, which gives the user a detailed document page and a link for downloading.

Table 2. Example of Result of Recommendation ITU-T Y.3501 in Topic Table

Series id   Topic         dirichlet_parameter   occurrence_rate
Y.3501      service       204                   24.7
Y.3501      cloud         177                   21.5
Y.3501      CSP           103                   12.5
Y.3501      CSC           67                    8.1
Y.3501      computing     56                    6.8
Y.3501      datum         51                    6.1
Y.3501      capability    48                    5.8
Y.3501      application   44                    5.3
Y.3501      resource      40                    4.8
Y.3501      note          33                    4
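
The occurrence rates in Table 2 can be reproduced from the Dirichlet parameters. The following is a minimal sketch, assuming the rate is a keyword's Dirichlet parameter divided by the sum of the parameters over the topic's keywords, multiplied by 100, and then truncated (not rounded) to one decimal place. The truncation is an inference from the table values, and the function name `occurrence_rates` is hypothetical rather than part of TASIS.

```python
# Dirichlet parameters of the Y.3501 representative topic, copied from Table 2.
DIRICHLET = {
    "service": 204, "cloud": 177, "CSP": 103, "CSC": 67, "computing": 56,
    "datum": 51, "capability": 48, "application": 44, "resource": 40, "note": 33,
}

def occurrence_rates(params):
    """Return each keyword's share of the parameter total, in percent.

    Assumption: shares are truncated to one decimal place, which is the
    behaviour that matches the occurrence_rate column of Table 2.
    """
    total = sum(params.values())
    # Integer arithmetic in tenths of a percent avoids float rounding surprises.
    return {keyword: (value * 1000 // total) / 10 for keyword, value in params.items()}

rates = occurrence_rates(DIRICHLET)
for keyword, rate in rates.items():
    print(f"{keyword:<12} {rate}")
```

Before truncation the shares sum to exactly 100, in line with the text; the truncated values shown in the table sum to slightly less.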
Fig. 4. Topic Modeling Function