
2024 ITU Kaleidoscope Academic Conference




preprocessing module is completed, further optimizations and
categorizations are performed by the optimization module. This task is
necessary to bring the data into a state optimal for generating
vulnerability intelligence and includes data categorization, data
organization, and parameter optimization. While the data augmentation
module is responsible for collecting data from the CAPEC repository and
extending it using the CWE list so that it can be linked with the
aggregated vulnerability repository, the web user interface module is
responsible for providing the means of interaction between the system
users and the system itself. Finally, the collaboration module provides
the interoperability features for external systems that wish to work with
the aggregated data and/or the vulnerability intelligence generated by the
system without using the web user interface module. This delivers the
flexibility to enhance the data being requested without requesting changes
in the system itself.

5.2   Interoperability Aspects

The developed vulnerability intelligence platform has two chief
capabilities that allow the solution to interoperate with external
systems. First, it has a standardized data format to consume the
vulnerability data for future integrations, called the unified schema.
Second, it exposes well-defined RESTful APIs (APIs that follow the
Representational State Transfer architectural style for exchanging data
over the Internet) for external software systems to consume the aggregated
vulnerability repository. Further, the main point of access to the
platform is a web user interface, which has been built in compliance with
the latest web standards to ensure cross-browser interoperability.

5.3   Collaboration Aspects

The system supports collaboration with its users, both those who directly
use the vulnerability intelligence user interface and those who wish to
interact with the data consumed by external software systems.
Additionally, it provides a feature to add feedback for integrations with
external software systems, so that the vulnerability intelligence can be
re-organized accordingly to show restructured priority for the specific
usage.

5.4   Configurability Aspects

The developed system provides the ability to configure or modify several
parameters through its user interface. The default ecosystem for which
vulnerability intelligence is to be generated can be customized for each
user. The aggregated vulnerability repository can be enhanced from another
data source by configuring API information that can be consumed
periodically to append the information to the overall system. Further, the
collaboration with external software systems can also be managed through a
configuration user interface.

5.5   Data Aggregation and Augmentation Strategies

For collecting data from OSV, it was found feasible to periodically
download the complete zipped archive of JSON files from the data dumps
provided through their GCS bucket and process it (GCS bucket maintained by
OSV at gs://osv-vulnerabilities [15]). This can be done on a timely basis
through a scheduled application process over an encrypted web channel. For
integration with NVD, the API mode of integration [15] was best suited to
mirroring the data, once completely and then incrementally.

To enhance the aggregated vulnerability data so that it can aid the
forensic analysis process for software systems, the CAPEC repository has
been utilized for data augmentation. Given that the CAPEC repository and
the CWE list are available for download in CSV (comma-separated values)
format, their latest versions were both imported directly into the local
database and further processed according to the needs for mapping and
linking.

5.6   The Unified Schema for Seamless Integration

The OSV database transforms and stores the data from multiple open-source
databases into a custom schema that grew from the vulnerability
interchange schema, having gone through several iterations of change [5].
Similarly, the NVD database maintains a standard format to keep the
vulnerability records. These two record formats differ in several aspects,
and in order to integrate the data from these two sources and present the
converged vulnerability insights, the schemas needed to be unified. The
new unified schema developed for the system not only provides relevant
information without any data loss from either source, but also reduces the
noise from unnecessary data fields, or data fields that may not be needed
in the current context.

Under the unified schema, all records are aggregated together under a
unique identifier for every vulnerability record, called the VIP ID
(represented by vip_id in the format). This unified schema forms the crux
of the convergence of vulnerability insights from multiple sources. It is
structured in JSON format and is composed of the fields summarized in
Table 1.

              Table 1 – The Unified Schema Format

Field Name                   Requirement    Field Value Type
vip_id                       mandatory      string
source_name                  mandatory      string
source_vuln_id               mandatory      string
ecosystem                    optional       string
vulnerability_description    mandatory      object
source_published             mandatory      string
response_version             mandatory      string
response_timestamp           mandatory      string
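
As an illustration of how a consumer might check a record against the fields listed in Table 1, the following is a minimal Python sketch rather than the system's actual implementation. The field names, requirement levels, and value types are taken from Table 1; the function name validate_record, the layout of vulnerability_description, and every value in the example record are assumptions made for illustration only. The sketch covers only the fields listed above and ignores any others.

```python
import json

# Field name -> (mandatory?, expected Python type), transcribed from Table 1.
# A JSON "string" maps to str and a JSON "object" maps to dict.
UNIFIED_SCHEMA = {
    "vip_id": (True, str),
    "source_name": (True, str),
    "source_vuln_id": (True, str),
    "ecosystem": (False, str),
    "vulnerability_description": (True, dict),
    "source_published": (True, str),
    "response_version": (True, str),
    "response_timestamp": (True, str),
}


def validate_record(record):
    """Return a list of problems found in one unified-schema record.

    Hypothetical helper: the paper does not describe a validator.
    """
    problems = []
    for field, (mandatory, expected_type) in UNIFIED_SCHEMA.items():
        if field not in record:
            if mandatory:
                problems.append(f"missing mandatory field: {field}")
            continue
        if not isinstance(record[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    return problems


# Hypothetical record: all values are invented, and the inner structure of
# vulnerability_description is an assumption (the source does not give it).
example = json.loads("""
{
  "vip_id": "VIP-EXAMPLE-0001",
  "source_name": "OSV",
  "source_vuln_id": "EXAMPLE-0000-0000",
  "ecosystem": "PyPI",
  "vulnerability_description": {"summary": "Example description text"},
  "source_published": "2024-01-01T00:00:00Z",
  "response_version": "1.0",
  "response_timestamp": "2024-01-02T00:00:00Z"
}
""")

if __name__ == "__main__":
    print(validate_record(example))
```

A record that omits the optional ecosystem field still passes, while one that omits vip_id or supplies vulnerability_description as a plain string is reported as a problem.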





