Page 611 - AI for Good Innovate for Impact
for lending. These methods screen potential borrowers based on characteristics like payment
history, debt amount, credit history age, credit mix employed, recent behavior, and credit
utilization, each contributing to a credit score that reflects the creditworthiness of a person.
However, more advanced methods are now suggested by recent studies: Leevy et al. [3]
demonstrate that optimization of the decision threshold with a constraint TPR ≥ TNR can boost
Area Under the Precision-Recall Curve (AUPRC) without resampling; Kamimura et al. [4] survey
forty-six metaheuristic and Bayesian methods—like genetic algorithms or tree-structured Parzen
estimators—to optimize credit scoring models; Kyeong & Shin [5] introduce a two-stage Bayesian
logistic model to offer a trade-off between interpretability and performance; and the Khashei &
Mirahmadi hybrid soft-computing model indicates improvements in cases of high uncertainty.
Beyond scoring, anomaly-detection techniques (such as isolation forests and autoencoders) and
fraud-detection programs add further layers of risk management.
Building on these findings, our use case implements an adaptive, real-time decision-support
system whose acceptance threshold is continually updated. We use MCMC
(Metropolis–Hastings) to sample from the posterior distribution of the threshold parameters and
Bayesian optimization (through Hyperopt) to tune the hyperparameters of a LightGBM model.
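The Metropolis–Hastings step can be sketched as a random-walk sampler over a single threshold parameter. The text does not specify the posterior, so the Beta(3, 5)-shaped target below is purely a stand-in, and the function names, step size, burn-in length, and seed are assumptions chosen for illustration rather than the settings used in this use case.

```python
import math
import random


def log_posterior(tau, a=3.0, b=5.0):
    """Unnormalized log-density of a stand-in Beta(a, b) posterior on (0, 1)."""
    if not 0.0 < tau < 1.0:
        return -math.inf  # zero density outside the valid threshold range
    return (a - 1.0) * math.log(tau) + (b - 1.0) * math.log(1.0 - tau)


def metropolis_hastings(log_target, start, n_samples, step=0.2, burn_in=1000, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    current, current_lp = start, log_target(start)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            samples.append(current)  # a rejected move repeats the current value
    return samples


samples = metropolis_hastings(log_posterior, start=0.5, n_samples=20000)
posterior_mean = sum(samples) / len(samples)
print(round(posterior_mean, 3))
```

Because the proposal is symmetric, the acceptance ratio reduces to the ratio of target densities, and proposals outside (0, 1) are always rejected. In a real deployment the stand-in target would be replaced by the log-posterior of the threshold parameters given observed repayment outcomes.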
To put these innovations into practice, we use the Home Credit Default Risk
dataset, which comprises a set of linked tables that together give an overall picture of each
applicant's financial history and behavior. The application_train.csv and application_test.csv
tables hold each applicant's demographic information, income, credit amount, employment and
education data, and housing and family status. Supplementing this data, the bureau.csv and
bureau_balance.csv tables contain the applicant's external credit history and monthly
repayment statuses reported by other financial institutions. The POS_CASH_balance.csv
and credit_card_balance.csv tables contain monthly snapshots of point-of-sale and cash loan
balances and of credit card utilization. Additionally, the previous_application.csv table
summarizes the applicant's earlier loan applications to the institution, while
installments_payments.csv provides detailed records of repayments, including missed
repayments, on those loans. By bringing together data from these tables, we produce an
extensive blend of borrower attributes: demographic variables such as age, gender,
and housing situation; financial variables such as aggregate income, credit-to-income ratio,
and annuity amounts; employment characteristics such as days employed and contract type; and
behavioral variables such as delinquency counts and punctuality in installment payments. Our
dynamic threshold model optimizes profit while ensuring fairness, enhancing risk management
and increasing fair access to credit, especially for rural and underbanked populations.
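To make the feature blend concrete, the sketch below derives two of the attributes named above, the credit-to-income ratio and installment punctuality, from toy rows. The column names (SK_ID_CURR, AMT_INCOME_TOTAL, AMT_CREDIT, DAYS_INSTALMENT, DAYS_ENTRY_PAYMENT) are assumed to follow the public Home Credit schema, the rows are invented, and the feature definitions are illustrative assumptions rather than the ones used in this use case.

```python
from collections import defaultdict

# Toy stand-ins for rows of the application and installments_payments tables.
applications = [
    {"SK_ID_CURR": 1, "AMT_INCOME_TOTAL": 50000.0, "AMT_CREDIT": 150000.0},
    {"SK_ID_CURR": 2, "AMT_INCOME_TOTAL": 80000.0, "AMT_CREDIT": 120000.0},
]
# Day offsets are relative to the application date, so they are negative.
installments = [
    {"SK_ID_CURR": 1, "DAYS_INSTALMENT": -60, "DAYS_ENTRY_PAYMENT": -55},  # 5 days late
    {"SK_ID_CURR": 1, "DAYS_INSTALMENT": -30, "DAYS_ENTRY_PAYMENT": -31},  # 1 day early
    {"SK_ID_CURR": 2, "DAYS_INSTALMENT": -45, "DAYS_ENTRY_PAYMENT": -45},  # on the due date
]

# Aggregate installment behavior per applicant.
paid = defaultdict(int)
late = defaultdict(int)
for row in installments:
    applicant = row["SK_ID_CURR"]
    paid[applicant] += 1
    if row["DAYS_ENTRY_PAYMENT"] > row["DAYS_INSTALMENT"]:
        late[applicant] += 1  # payment entered after the due date

# Join the aggregates onto the application rows.
features = {}
for row in applications:
    applicant = row["SK_ID_CURR"]
    n_paid = paid[applicant]
    features[applicant] = {
        "credit_to_income": row["AMT_CREDIT"] / row["AMT_INCOME_TOTAL"],
        "late_count": late[applicant],
        # Share of installments paid on or before the due date (None if no history).
        "on_time_share": (n_paid - late[applicant]) / n_paid if n_paid else None,
    }

print(features)
```

On the full tables the same per-applicant aggregation would typically be done with a dataframe group-by followed by a join on the applicant identifier; plain dictionaries are used here only to keep the sketch dependency-free.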
Use Case Status: This use case is part of a larger research project.
Partners:
Commercial Bank, Credit Bureau, Academic Research Lab
2.2 Benefits of the use case
Our AI-driven financial model offers substantial benefits by making credit levels responsive to
local economic conditions, such as agricultural production and seasonal income fluctuations.
This approach opens up small loans and microfinance to rural families, enabling them to
smooth consumption, invest in small enterprises or farm inputs, and accumulate assets. By

