Page 217 - Kaleidoscope Academic Conference Proceedings 2020
STCCS: SEGMENTED TIME CONTROLLED COUNT-MIN SKETCH
Ismail Khram ¹; Maha Shamseddine ¹; Wassim Itani ²
1 Beirut Arab University, Beirut, Lebanon
2 University of Houston-Victoria, Texas, USA
ABSTRACT

IoT is a concept consisting of many components powered by different techniques and technologies. However, due to computation restrictions, encryption algorithms have had to be adapted, often at the expense of data security level and strength. To maintain data privacy between source and sink, we present in this paper a data sketching algorithm that makes efficient use of bandwidth by providing a summary of the data to the cloud. The input data stream goes through a hashing algorithm, which produces a hexadecimal representation of the data, before going through the sketching algorithm. In the sketching algorithm the data is categorized and the corresponding hash cell value is updated. The arrival time of data considered anomalous is also recorded, allowing the manager to take corrective action if it is deduced that the periodic appearance of the information is successive in nature.

Keywords – compact, control, count-min sketch, IoT, time

1. INTRODUCTION

IoT has found its way into almost every facet of our lives. In industrial applications, for example, sensors are placed around key areas of the production process in manufacturing facilities to provide valuable insight into production-level peak times and the number of defective products coming off the production line. Another important area that has gained interest is agriculture, where sensors are placed around the field to monitor plant growth, soil pH, water intake, etc. The data can be used to better care for the crops and keep mineral waste to a minimum. An additional crucial use for IoT is in medical applications. The devices are placed on the body (wearables), inside the body (implanted devices), or in the surrounding area (environment monitoring). The devices provide round-the-clock monitoring, with data analysis and feedback available at any point in the day, so that the physician can make a more informed decision on treatment and tests can be performed as needed. There are many more use cases for IoT devices, but these areas stand out the most in current research and in the industry as a whole.

The IoT system is comprised of a group of application-specific sensors connected in a certain topology, mostly in a star or mesh depending on the need for redundancy and feasibility. The sensors then send the sensed data to a controller or master arranged in a hierarchical structure to prevent network congestion due to multiple simultaneous connection establishments. The master node usually performs the data processing, and the data is then sent to the cloud for storage and analysis. This multi-tier architecture exposes the system to many security risks in the form of denial-of-service attacks, malware, data manipulation and man-in-the-middle attacks, to name a few.

To mitigate security threats and keep computation to a minimum, researchers today are designing algorithms that are optimized for memory allocation and bandwidth utilization, which makes them suitable for IoT ecosystems. Of the methods considered, data sketching is by far the most popular. Data sketching is a way of looking at a selected sample of the overall data stream in a random or pseudorandom fashion to determine patterns in the data and formulate solutions while allowing for a margin of error. Count-min sketch and the HyperLogLog sketch are two of the most well-known sketching algorithms used to detect anomalies today. Count-min sketch [1], which is used for counting event occurrences, works by passing the event through multiple hash functions and mapping each result to a cell in the table row associated with that hash function. As more events get hashed, the values inside the table get incremented. To find an event's count, list the values of all cells corresponding to this event and choose their minimum. What is special is that even if the data size were to increase exponentially with the passage of time, the sketch size remains constant. The HyperLogLog sketch [2] estimates the number of unique elements in a set. This is done by defining a number of buckets, hashing the event, and taking the prefix of the hash to identify one of the buckets; in the bits that are left, the number of leading zeros is counted. HyperLogLog keeps only the maximum number of leading zeros seen in each bucket and computes the average of these maxima.

In this paper we introduce a variation on the traditional count-min sketch which takes into account the time variable. The added hash functions add depth to the sketch by reducing the number of collisions. Time was not considered in the reviewed literature as a way of adding more dimensionality
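
The update and query steps of the traditional count-min sketch described in the introduction can be illustrated with a minimal Python sketch. This is not the STCCS algorithm of this paper, only the baseline it builds on. The class name, the table dimensions, the sample readings, and the use of row-salted SHA-256 in place of independent hash functions are all assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency summary: `depth` hash rows of `width` counters each."""

    def __init__(self, width=64, depth=4):
        # width and depth are illustrative defaults, not values from the paper.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, event):
        # Salting SHA-256 with the row index stands in for `depth`
        # independent hash functions (an assumption of this example).
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{event}".encode("utf-8")).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, event, count=1):
        # Increment one counter per row.
        for row, col in self._cells(event):
            self.table[row][col] += count

    def estimate(self, event):
        # Collisions can only inflate a counter, so the minimum over the
        # rows is the tightest available estimate (never an undercount).
        return min(self.table[row][col] for row, col in self._cells(event))


sketch = CountMinSketch()
for reading in ["temp_high"] * 5 + ["temp_ok"] * 20 + ["door_open"] * 2:
    sketch.add(reading)

print(sketch.estimate("temp_high"))  # never below the true count of 5
```

The table keeps the same `depth` x `width` size no matter how many events are added, which is the constant-size property noted in the text. The price is that an estimate may exceed the true count when events collide in every row.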
978-92-61-31391-3/CFP2068P @ ITU 2020 – 159 – Kaleidoscope
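
The bucket and leading-zero steps of HyperLogLog mentioned in the introduction can be illustrated in the same way. The version below follows the commonly published formulation, in which the per-bucket maxima are combined through a bias-corrected harmonic mean with a small-range correction. The class name, the 10-bit prefix, the 64-bit truncated SHA-256 hash, and the sample identifiers are assumptions chosen for illustration and are not taken from this paper.

```python
import hashlib
import math


class HyperLogLog:
    """Distinct-count estimator using 2**prefix_bits buckets (registers)."""

    def __init__(self, prefix_bits=10):
        # The bias constant used in estimate() assumes at least 128 buckets,
        # so prefix_bits should be 7 or more.
        self.p = prefix_bits
        self.m = 1 << prefix_bits
        self.registers = [0] * self.m

    def add(self, event):
        digest = hashlib.sha256(str(event).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")      # 64-bit hash value
        rest_bits = 64 - self.p
        bucket = h >> rest_bits                    # hash prefix selects the bucket
        rest = h & ((1 << rest_bits) - 1)          # remaining bits
        # Rank = number of leading zeros in the remaining bits, plus one.
        rank = rest_bits - rest.bit_length() + 1
        # Each bucket remembers only the largest rank it has seen.
        self.registers[bucket] = max(self.registers[bucket], rank)

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)      # bias correction for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        empty = self.registers.count(0)
        if raw <= 2.5 * self.m and empty:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / empty)
        return raw


hll = HyperLogLog()
for i in range(10000):
    hll.add(f"sensor-{i}")
first_estimate = hll.estimate()

for i in range(10000):          # re-adding the same events changes nothing
    hll.add(f"sensor-{i}")
second_estimate = hll.estimate()

print(round(first_estimate), round(second_estimate))
```

Because each bucket stores only a maximum, duplicates leave the registers untouched, so the estimate depends on the set of distinct events rather than on how often each one occurs.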