Page 217 - Kaleidoscope Academic Conference Proceedings 2020

STCCS: SEGMENTED TIME CONTROLLED COUNT-MIN SKETCH




Ismail Khram¹; Maha Shamseddine¹; Wassim Itani²
¹ Beirut Arab University, Beirut, Lebanon
² University of Houston-Victoria, Texas, USA


ABSTRACT

IoT is a concept consisting of many components powered by different techniques and technologies. However, due to computation restrictions, encryption algorithms have had to be adapted, often at the expense of data security level and strength. To maintain data privacy between source and sink, we present in this paper a data sketching algorithm that makes efficient use of bandwidth by providing a summary of the data to the cloud. The input data stream goes through a hashing algorithm, which produces a hexadecimal representation of the data, before going through the sketching algorithm. In the sketching algorithm the data is categorized and the corresponding hash cell value is updated. The arrival time of data considered anomalous is also recorded, allowing the manager to take corrective action if it is deduced that the periodic appearance of the information is successive in nature.

Keywords – compact, control, count-min sketch, IoT, time

1. INTRODUCTION

IoT has found its way into almost every facet of our lives. In industrial applications, for example, sensors are placed around key areas of the production process in manufacturing facilities to provide valuable insight into the production level peak times and the number of defective products coming off the production line. Another important area that has gained interest is agriculture, where sensors are placed around the field to monitor plant growth, soil pH, water intake, etc. The data can be used to better care for the crops and keep mineral waste to a minimum. An additional crucial use for IoT is in medical applications. The devices are placed on the body (wearables), inside the body (implanted devices), or in the surrounding area (environment monitoring). The devices provide round-the-clock monitoring, with data analysis and feedback available at any point in the day, so that the physician can make a more informed decision on treatment and tests can be performed as needed. There are many more use cases for IoT devices, but these areas stand out the most in current research and in the industry as a whole.

An IoT system comprises a group of application-specific sensors connected in a certain topology, mostly a star or a mesh, depending on the need for redundancy and on feasibility. The sensors send the sensed data to a controller or master arranged in a hierarchical structure to prevent network congestion due to multiple simultaneous connection establishments. The master node usually performs the data processing, and the data is then sent to the cloud for storage and analysis. This multi-tier architecture exposes the system to considerable security risk in the form of denial-of-service attacks, malware, data manipulation, and man-in-the-middle attacks, to name a few.

To keep security threats and computation to a minimum, researchers today are designing algorithms that are optimized for memory allocation and bandwidth utilization, which makes them suitable for IoT ecosystems. Of the methods considered, data sketching is by far the most popular. Data sketching is a way of looking at a selected sample of the overall data stream in a random or pseudorandom fashion to determine patterns in the data and formulate solutions while allowing for a margin of error. Count-min sketch and the HyperLogLog sketch are two of the most well-known sketching algorithms in use today for detecting anomalies. Count-min sketch [1], which is used for counting event occurrences, works by passing the event through multiple hash functions and mapping each result to a cell in the table row associated with that hash function. As more events are hashed, the values in those cells are incremented. To find the count of an event, look up the cell values that the hash functions assign to this event and choose their minimum. What is special is that even if the data size were to increase exponentially with the passage of time, the sketch size remains constant. HyperLogLog sketch [2] estimates the number of unique elements in a set. This is done by defining a number of buckets, hashing the event, using the prefix of the hash to select one of the buckets, and counting the number of leading zeros in the remaining bits. HyperLogLog keeps only the maximum number of leading zeros seen in each bucket and computes the average of these maxima.

In this paper we introduce a variation on the traditional count-min sketch which takes into account the time variable. The added hash functions add depth to the sketch by reducing the number of collisions. Time was not considered in the reviewed literature as a way of adding more dimensionality
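
As an illustration of the count-min update and query steps described in the introduction, a minimal Python sketch follows. The class name CountMinSketch, the parameters depth and width, and the use of row-salted SHA-256 as a stand-in for a family of independent hash functions are assumptions made for illustration only; they are not taken from [1], and the block does not represent the time-controlled variant proposed in this paper.

```python
import hashlib


class CountMinSketch:
    """Fixed-size table of counters with one row per hash function."""

    def __init__(self, depth=4, width=64):
        # depth = number of hash functions (rows), width = cells per row.
        # Both are hypothetical defaults chosen for the example.
        self.depth = depth
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _cell(self, row, event):
        # Salting the event with the row index makes each row behave
        # like a different hash function (an assumption of this sketch).
        digest = hashlib.sha256(f"{row}:{event}".encode("utf-8")).hexdigest()
        return int(digest, 16) % self.width

    def add(self, event, count=1):
        # Increment one cell in every row.
        for row in range(self.depth):
            self.table[row][self._cell(row, event)] += count

    def estimate(self, event):
        # Collisions can only inflate a cell, so the minimum over the
        # rows is the tightest available upper bound on the true count.
        return min(
            self.table[row][self._cell(row, event)]
            for row in range(self.depth)
        )


sketch = CountMinSketch()
for reading in ["temp_high", "temp_ok", "temp_high"]:
    sketch.add(reading)

# The estimate never undercounts: it is at least 2 for "temp_high".
print(sketch.estimate("temp_high"))
```

The table holds depth × width counters regardless of how many events are added, which is the constant-size property noted in the text.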





           978-92-61-31391-3/CFP2068P @ ITU 2020            – 159 –                                 Kaleidoscope