This repository contains the source code of the LASCA approach.

Gu-Youngfeng/LASCA_CODE

LASCA

The implementation for the paper "LASCA: A Large-Scale Stable Customer Segmentation Approach to Credit Risk Assessment".

Setup

Create the runtime environment using conda 23.5.2 and Python 3.7.16:

conda create -n lasca python==3.7.16
conda activate lasca

Install the requirements for running LASCA:

pip install -r requirements.txt

Real-world user datasets

Because the real-world datasets contain a significant amount of personal and commercially sensitive information, their contents are not publicly released.

To simulate a real-world dataset, we provide mock user data in the data/demo directory. It consists of two CSV files: df1_demo.csv, the pre-binning result (100 pre-bins over 3 months), and df2_demo.csv, the user data (5000 users' scores over 8 months). Both files are randomly generated and serve as the input user data for LASCA. Note that this demo dataset is intended only for academic research and does not represent any real business situation.
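To give a feel for the size of the demo inputs, here is a minimal sketch that generates random data with the same dimensions as described above. The column names (month, bin_id, total, bad, user_id, score) and the value ranges are assumptions made for illustration only. They are not the actual schema of df1_demo.csv or df2_demo.csv, so inspect the shipped files before relying on any layout.

```python
import random

# Dimensions taken from the description of the demo data.
N_BINS, BIN_MONTHS = 100, 3      # 100 pre-bins over 3 months
N_USERS, USER_MONTHS = 5000, 8   # 5000 users' scores over 8 months


def make_prebin_rows(rng):
    """One row per (month, pre-bin). Column names are hypothetical."""
    rows = []
    for month in range(1, BIN_MONTHS + 1):
        for bin_id in range(N_BINS):
            total = rng.randint(50, 500)   # assumed: users falling in the bin
            bad = rng.randint(0, total)    # assumed: defaulted users in the bin
            rows.append({"month": month, "bin_id": bin_id,
                         "total": total, "bad": bad})
    return rows


def make_user_rows(rng):
    """One row per (user, month). Column names are hypothetical."""
    rows = []
    for user_id in range(N_USERS):
        for month in range(1, USER_MONTHS + 1):
            # assumed: a risk score in [0, 1)
            rows.append({"user_id": user_id, "month": month,
                         "score": round(rng.random(), 4)})
    return rows


rng = random.Random(0)  # fixed seed so the mock data is reproducible
prebin_rows = make_prebin_rows(rng)
user_rows = make_user_rows(rng)
print(len(prebin_rows), len(user_rows))  # 300 40000
```

The long "one row per entity and month" layout is only one plausible choice. A wide layout with one column per month would carry the same information.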

Run LASCA

Run phase 1 of LASCA: high-quality dataset construction (HDC). This phase takes the user dataset as input and outputs the solutions dataset for data-driven optimization.

python experiments.py run_hdc

Run phase 2 of LASCA: reliable data-driven optimization (RDO). This phase takes the collected solutions dataset as input and outputs the optimized binning solutions.

python experiments.py run_rdo
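The two phases can also be chained from a small driver script. The sketch below assumes it is launched from the repository root, next to experiments.py, and that RDO should run only after HDC has finished successfully. The helper names build_commands and run_lasca are hypothetical and are not part of the repository.

```python
import subprocess
import sys

# Sub-commands as documented above, in execution order.
PHASES = ["run_hdc", "run_rdo"]


def build_commands(script="experiments.py"):
    """Return one argv list per phase, using the current Python interpreter."""
    return [[sys.executable, script, phase] for phase in PHASES]


def run_lasca(dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in build_commands():
        print("$", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so RDO is never started after a failed HDC run.
            subprocess.run(cmd, check=True)


# Dry run: only shows what would be executed.
run_lasca(dry_run=True)
```

Call run_lasca(dry_run=False) inside the activated lasca environment to actually execute both phases.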

The project structure

LASCA
├─ core
│  ├─ main.py
│  ├─ model.py
│  ├─ optimizer.py
│  └─ task.py
├─ data
│  └─ demo
│     ├─ hdc.csv
│     ├─ df1_demo.csv
│     └─ df2_demo.csv
├─ utils
│  ├─ logger.py
│  ├─ metric.py
│  └─ utils.py
├─ experiments.py
├─ README.md
└─ requirements.txt

Notes for the project structure:

  • The files in the core folder are the main components of the algorithm.
  • The files in the utils folder are utility functions used in the implementation of LASCA.
  • The files in the data folder are the user datasets for each task. Since the three real-world user datasets are confidential, only a demo user dataset is provided.

Citation

This paper has been accepted at the SIGKDD 2024 conference. If you find our work useful in your studies or work, please cite it:

Yongfeng Gu, Yupeng Wu, Huakang Lu, Xingyu Lu, Hong Qian, Jun Zhou, and Aimin Zhou. 2024. LASCA: A Large-Scale Stable Customer Segmentation Approach to Credit Risk Assessment. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’24).
