Page 109 - Proceedings of the 2018 ITU Kaleidoscope
Machine learning for a 5G future
preparing a machine or a resource to be ready and on time for production.

Besides physical resources, the planning and allocation of resources can also refer to the cloud. With the advent of Industry 4.0 and the automation of cyber-physical systems, many information systems are being executed in the cloud. In this context, on-demand elasticity is a key aspect. In cloud computing, elasticity is defined as "the degree to which a system is able to adapt to workload changes by provisioning and de-provisioning resources in an autonomic manner, such that at each point in time the available resources match the current demand as closely as possible" [20]. Knowing in advance which activity of a business process will be executed next is key to proactively releasing or reserving resources to support elasticity in the cloud.

4. PREDICTING BUSINESS PROCESS ACTIVITIES

This section introduces a methodology to predict activities in business processes from information registered in event logs derived from the execution of business processes.

The proposed methodology is based on the LSTM neural network and consists of three phases: 1) pre-processing of the event log, 2) categorization, and 3) prediction model based on LSTM, as shown in Figure 1.

Figure 1 – The methodology for predicting activities of a business process using the proposed approach.

4.1 Phase 1: Event Log Preprocessing

The pre-processing phase of the event log consists of the following stages:

Data Extraction. A detailed analysis of the event logs (.XES file format) is performed to identify the different attributes contained in the event log and to select the attributes required for a prediction, in this case the attribute "activity".

Trace Identification. It consists of identifying and obtaining the traces with their respective events. The traces are then added to a text file, maintaining their order of appearance.

Segmentation. A segmentation task is applied to the text file generated in the previous stage. It consists of creating a list of all the events of a trace, using a separation criterion between events. Then, each event is represented as a unique integer, so that the traces are converted into sequences of integers. This generates two lists of integer sequences: the first consisting of input activities (X), and the second of output activities (Y). Finally, the sequence list of input activities is transformed into a two-dimensional matrix of shape (number of sequences, maximum sequence length).

4.2 Phase 2: Categorization

The intermediate categorization phase converts the sequence of integers corresponding to the output activities (Y) into a one-hot encoding representation, in which the number of classes is equal to the size of the vocabulary.

4.3 Phase 3: Prediction Model

The prediction model phase, based on an LSTM network, is composed of the following stages:

Network Design. It consists of designing the LSTM network layer by layer. First, an input layer (embedding) is generated, then the hidden layer (LSTM units) is created, and finally an output layer is built. In each of these layers, some necessary parameters are defined.

Network Training. The LSTM network is trained using as training data the integer sequences of activities contained in the matrix (X) and their one-hot representation (Y).

Model Selection. The results of the training allow choosing one model of the LSTM network as the final model to be implemented. A network whose training reaches a high degree of accuracy should be selected as the model to make the predictions. Otherwise, it is recommended to modify the design of the network, adjust the required parameters, and run the training again.

Prediction. It is the output generated by the LSTM neural network, which, once trained, predicts the next activity in a business process model from an input activity or a sequence of input activities, as explained in the following sections of the document.

4.4 Implementation

The proposed approach is based on the definition of an LSTM recurrent neural network, a network with a special structure consisting of memory blocks and memory cells, together with the gate units that contain them [21]; i.e., an LSTM unit consists of a cell and three gates (input, forget, and output). Through this special structure, an LSTM network can select which information is forgotten or remembered.
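
To make the role of the three gates concrete, the following is a minimal sketch of one time step of a single-unit LSTM cell, written in plain Python. It follows the commonly used LSTM formulation with a forget gate, which may differ in detail from the variant described in [21]. The function name `lstm_step`, the `params` dictionary, and the weight values are illustrative assumptions, not part of the proposed approach.

```python
import math


def sigmoid(z):
    """Logistic function, used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell with a scalar input.

    params maps each of "input", "forget", "output" and "candidate"
    to a tuple (input weight, recurrent weight, bias). These names
    are hypothetical and chosen only for this sketch.
    """
    def pre_activation(name):
        w, u, b = params[name]
        return w * x + u * h_prev + b

    i = sigmoid(pre_activation("input"))       # how much new content to write
    f = sigmoid(pre_activation("forget"))      # how much old cell state to keep
    o = sigmoid(pre_activation("output"))      # how much of the cell to expose
    g = math.tanh(pre_activation("candidate")) # candidate content
    c = f * c_prev + i * g                     # updated cell state
    h = o * math.tanh(c)                       # updated hidden state
    return h, c


# Illustrative weights (assumptions): the biases saturate the gates.
KEEP = {"input": (0.0, 0.0, -20.0), "forget": (0.0, 0.0, 20.0),
        "output": (0.0, 0.0, 20.0), "candidate": (1.0, 0.0, 0.0)}
REPLACE = {"input": (0.0, 0.0, 20.0), "forget": (0.0, 0.0, -20.0),
           "output": (0.0, 0.0, 20.0), "candidate": (1.0, 0.0, 0.0)}

# Forget gate open, input gate closed: the old cell state is remembered.
_, c_keep = lstm_step(1.0, 0.0, 0.5, KEEP)
# Forget gate closed, input gate open: the old state is overwritten.
_, c_replace = lstm_step(1.0, 0.0, 0.5, REPLACE)
```

With the first set of weights the cell state stays close to its previous value of 0.5; with the second it is replaced by the candidate value tanh(1.0). This is the sense in which the gates select what is remembered or forgotten.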
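
As a rough illustration of Phases 1 and 2 (Sections 4.1 and 4.2), the sketch below segments traces, maps each activity to a unique integer, builds the input (X) and output (Y) lists, pads X into a two-dimensional matrix, and one-hot encodes Y. Several details are assumptions made for this example only: the text does not state how X and Y are paired, so each prefix of a trace is paired here with the activity that follows it; padding uses the value 0 on the left; integer ids start at 1; and the number of classes is taken as the number of distinct activities plus one for the padding index. The activity names and helper functions are hypothetical.

```python
def segment(trace_line, sep=" "):
    """Split one line of the trace text file into its events."""
    return [event for event in trace_line.strip().split(sep) if event]


def build_vocabulary(traces):
    """Assign a unique integer (starting at 1) to each activity."""
    vocab = {}
    for trace in traces:
        for activity in trace:
            if activity not in vocab:
                vocab[activity] = len(vocab) + 1
    return vocab


def make_pairs(encoded_traces):
    """Assumed pairing: every prefix of a trace predicts the next activity."""
    X, Y = [], []
    for trace in encoded_traces:
        for k in range(1, len(trace)):
            X.append(trace[:k])
            Y.append(trace[k])
    return X, Y


def pad(sequences, pad_value=0):
    """Left-pad to a matrix of shape (number of sequences, max length)."""
    max_len = max(len(s) for s in sequences)
    return [[pad_value] * (max_len - len(s)) + s for s in sequences]


def one_hot(labels, num_classes):
    """One-hot encode integer labels."""
    return [[1 if j == y else 0 for j in range(num_classes)] for y in labels]


# Hypothetical traces, one per line, events separated by spaces.
lines = ["register check approve", "register check reject"]
traces = [segment(line) for line in lines]
vocab = build_vocabulary(traces)
encoded = [[vocab[a] for a in trace] for trace in traces]

X, Y = make_pairs(encoded)
X_matrix = pad(X)
num_classes = len(vocab) + 1  # assumption: index 0 is reserved for padding
Y_onehot = one_hot(Y, num_classes)
```

In a deep learning framework, the padding and one-hot steps would usually be delegated to library utilities; they are written out here only to show the shapes involved.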