Page 595 - AI for Good Innovate for Impact



               REQ-07: The system shall integrate with relevant government databases for data verification
               and cross-referencing.

               REQ-08: The system shall ensure secure data handling and provide a user interface for
               authorized access and investigation.

               REQ-09: The underlying AI/ML models shall be adaptable and updated to maintain detection
               accuracy.


               3.1 Separate AI Models and Infrastructure
               AI Models:

               •    Anomaly Detection: Trained on financial data using scikit-learn or TensorFlow for
                    detecting unusual transactions.
               •    Graph Analytics: Uses NetworkX to identify duplicate registrations via entity networks.
               •    NLP/Sentiment Analysis: Fine-tuned BERT models for analyzing social media/news.
               •    Geospatial Analysis: Custom algorithms to correlate satellite imagery with project
                    locations.
               •    OCR: Tesseract for document text extraction.
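As an illustration of the graph-analytics item above, the following is a minimal sketch of how duplicate registrations might be surfaced by linking records that share an identifying attribute. It is not the system's actual implementation: the record fields (`pan`, `phone`, `address`), the sample data, and the function name `find_duplicate_clusters` are assumptions made for this example. To stay dependency-free it uses a plain union-find structure in place of NetworkX; with NetworkX, the same grouping would correspond to the connected components of a graph whose edges join registrations sharing an attribute.

```python
from collections import defaultdict

# Hypothetical registration records; field names and values are illustrative only.
REGISTRATIONS = [
    {"id": "NGO-001", "pan": "AAAPL1234C", "phone": "9000000001", "address": "12 MG Road"},
    {"id": "NGO-002", "pan": "BBBPL5678D", "phone": "9000000001", "address": "7 Park Street"},
    {"id": "NGO-003", "pan": "BBBPL5678D", "phone": "9000000003", "address": "44 Lake View"},
    {"id": "NGO-004", "pan": "CCCPL9012E", "phone": "9000000004", "address": "3 Hill Lane"},
]


def find_duplicate_clusters(records, keys=("pan", "phone", "address")):
    """Group registrations linked, directly or transitively, by a shared attribute."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent pointers to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first registration seen for each (attribute, value) pair
    # and merge any later registration with the same pair into its cluster.
    first_seen = {}
    for r in records:
        for key in keys:
            value = r.get(key)
            if not value:
                continue
            attr = (key, value.strip().lower())
            if attr in first_seen:
                union(first_seen[attr], r["id"])
            else:
                first_seen[attr] = r["id"]

    clusters = defaultdict(list)
    for r in records:
        clusters[find(r["id"])].append(r["id"])

    # Only clusters with more than one member are potential duplicates.
    return sorted(sorted(c) for c in clusters.values() if len(c) > 1)


clusters = find_duplicate_clusters(REGISTRATIONS)
print(clusters)
```

In this sample, NGO-001 and NGO-002 share a phone number and NGO-002 and NGO-003 share a PAN, so all three fall into one cluster even though NGO-001 and NGO-003 have nothing directly in common. Exact matching on normalized strings is a simplification; real address data would likely need fuzzy matching.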
               Infrastructure:

               •    Data Sources:
                    •    Primary: NGO Darpan (ngodarpan.gov.in) for registration and financial data.
                    •    Secondary: ISRO for satellite imagery, MCA for CSR data, social media APIs (e.g.,
                         Twitter/X).
               •    APIs:
                    •    RESTful APIs for real-time data ingestion from NGO Darpan and MCA.
                    •    ISRO Bhuvan API for geospatial data.
                    •    Twitter/X API for sentiment analysis.
               •    Data Collection Protocols:
                    •    Secure HTTPS for data transfer.
                    •    Batch processing for historical data; streaming for real-time updates.
               •    SOPs:
                    •    Data Ingestion: Validate data formats (CSV, JSON) and log errors.
                    •    Model Training: Retrain models quarterly with new data; validate with 80/20 train-test
                         split.
                    •    Alert Handling: Auditors review high-risk alerts within 48 hours; escalate critical cases to
                         NIC leadership.
                    •    Privacy Compliance: Anonymize whistleblower data; encrypt financial records per India’s
                         PDP Act.
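The Data Ingestion SOP above (validate CSV and JSON inputs, log errors) can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the deployed pipeline: the required fields (`ngo_id`, `year`, `amount`), the logger name, and the functions `validate_record` and `ingest` are hypothetical, since the source does not specify a schema.

```python
import csv
import io
import json
import logging

logger = logging.getLogger("ingestion")

# Hypothetical required fields; the real schema is not given in the source.
REQUIRED_FIELDS = ("ngo_id", "year", "amount")


def validate_record(record, source, index):
    """Return a list of error messages for one record (empty if it is valid)."""
    if not isinstance(record, dict):
        return [f"{source}[{index}]: record is not an object"]
    errors = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            errors.append(f"{source}[{index}]: missing field '{field}'")
    amount = record.get("amount")
    if amount is not None and str(amount).strip() != "":
        try:
            float(amount)
        except (TypeError, ValueError):
            errors.append(f"{source}[{index}]: amount is not numeric: {amount!r}")
    return errors


def ingest(text, fmt, source="upload"):
    """Parse CSV or JSON text, log every problem, and return (valid_records, errors)."""
    if fmt == "csv":
        records = list(csv.DictReader(io.StringIO(text)))
    elif fmt == "json":
        try:
            records = json.loads(text)
        except json.JSONDecodeError as exc:
            message = f"{source}: invalid JSON ({exc.msg})"
            logger.error(message)
            return [], [message]
        if not isinstance(records, list):
            records = [records]  # accept a single object as a one-record batch
    else:
        raise ValueError(f"unsupported format: {fmt}")

    valid, errors = [], []
    for index, record in enumerate(records):
        problems = validate_record(record, source, index)
        if problems:
            for problem in problems:
                logger.error(problem)
            errors.extend(problems)
        else:
            valid.append(record)
    return valid, errors


csv_text = "ngo_id,year,amount\nNGO-001,2023,150000\nNGO-002,2023,abc\n,2023,500\n"
valid, errors = ingest(csv_text, "csv")
print(len(valid), "valid record(s);", len(errors), "error(s)")
```

Rejected records are logged and returned rather than silently dropped, so an auditor could later see which rows failed and why. Whether invalid rows should instead halt the whole batch is a policy choice the source does not state.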


















