
This Jupyter Notebook focuses on credit risk prediction using a Random Forest Classifier. It covers data preprocessing, exploratory data analysis (EDA), model training, and handling class imbalance. Additionally, essential metrics such as precision, recall, F1-score, and confusion matrices are computed to evaluate the model's performance.


MahanPourhosseini/Credit-Risk-Prediction


Problem Statement

Credit risk refers to the possibility of a loss resulting from a borrower's failure to repay a loan or meet contractual obligations. Conventionally, it is the risk that a lender will not receive the owed interest and principal, which disrupts cash flows and increases collection costs. This notebook uses the German credit data. The aim is to predict credit risk when a person applies for a loan: build a model that predicts whether the person, described by the attributes of this dataset, is a good (1) or a bad (0) credit risk.
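
As a rough illustration of the workflow described above (Random Forest, class-imbalance handling, precision/recall/F1, confusion matrix), here is a minimal sketch using scikit-learn. It is not the notebook's actual code: the data is synthetic, and the 30/70 class ratio, the 80/20 split, `n_estimators=200`, and `class_weight="balanced"` are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the credit data: 1000 rows, 20 features.
# The 30/70 class ratio is an arbitrary assumption, not a dataset fact.
X, y = make_classification(
    n_samples=1000, n_features=20, weights=[0.3, 0.7], random_state=42
)

# Stratify so both classes keep the same proportion in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# class_weight="balanced" reweights classes inversely to their frequency,
# which is one common way to handle imbalance (resampling is another).
model = RandomForestClassifier(
    n_estimators=200, class_weight="balanced", random_state=42
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes, ordered [0, 1].
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(cm)
print(classification_report(y_test, y_pred, target_names=["bad (0)", "good (1)"]))
```

With the real data, `X` and `y` would come from the dataset's 20 feature columns and the `credit_risk` column instead of `make_classification`.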

Attribute Information

The credit risk dataset contains 1000 samples and 21 attributes (20 features plus the class label). A brief description of each attribute follows:

Categorical features:

  1. status: status of the debtor's checking account with the bank
  • 1 : no checking account
  • 2 : ... < 0 DM
  • 3 : 0<= ... < 200 DM
  • 4 : ... >= 200 DM / salary for at least 1 year
  1. credit_history: history of compliance with previous or concurrent credit contracts
  • 0 : delay in paying off in the past
  • 1 : critical account/other credits elsewhere
  • 2 : no credits taken/all credits paid back duly
  • 3 : existing credits paid back duly till now
  • 4 : all credits at this bank paid back duly
  1. purpose: purpose for which the credit is needed
  • 0 : others
  • 1 : car (new)
  • 2 : car (used)
  • 3 : furniture/equipment
  • 4 : radio/television
  • 5 : domestic appliances
  • 6 : repairs
  • 7 : education
  • 8 : vacation
  • 9 : retraining
  • 10 : business
  1. savings: debtor's savings
  • 1 : unknown/no savings account
  • 2 : ... < 100 DM
  • 3 : 100 <= ... < 500 DM
  • 4 : 500 <= ... < 1000 DM
  • 5 : ... >= 1000 DM
  1. personal_status: combined information on sex and marital status
  • 1 : male : divorced/separated
  • 2 : female : non-single or male : single
  • 3 : male : married/widowed
  • 4 : female : single
  1. other_debtors: Is there another debtor or a guarantor for the credit?
  • 1 : none
  • 2 : co-applicant
  • 3 : guarantor
  1. other_installment_plans: installment plans from providers other than the credit-giving bank
  • 1 : bank
  • 2 : stores
  • 3 : none
  1. housing: type of housing the debtor lives in
  • 1 : for free
  • 2 : rent
  • 3 : own
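
The categorical codes above are nominal labels stored as integers, so their numeric order carries no meaning. A minimal sketch of two common ways to handle them with pandas follows: mapping codes back to readable labels, and one-hot encoding. The rows in `df` are hypothetical values made up for the example, and whether the notebook itself one-hot encodes these columns is not stated in this README.

```python
import pandas as pd

# Hypothetical rows that use the integer codes documented above.
df = pd.DataFrame({
    "housing": [1, 2, 3, 3],
    "other_debtors": [1, 1, 3, 2],
    "amount": [1200, 2500, 800, 4300],
})

# Decode the integer codes into the labels from the attribute list.
HOUSING_LABELS = {1: "for free", 2: "rent", 3: "own"}
df["housing_label"] = df["housing"].map(HOUSING_LABELS)

# One-hot encode the nominal columns so that no artificial ordering
# (e.g. "own" > "rent") is implied by the integer codes.
encoded = pd.get_dummies(
    df.drop(columns="housing_label"),
    columns=["housing", "other_debtors"],
)
print(encoded.columns.tolist())
```

Tree-based models such as Random Forest can also split on the raw integer codes directly; one-hot encoding is shown here only as one option.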

Binary features:

  1. people_liable: number of persons who financially depend on the debtor (i.e., are entitled to maintenance)
  • 1 : 3 or more
  • 2 : 0 to 2
  1. telephone: Is there a telephone landline registered in the debtor's name?
  • 1 : no
  • 2 : yes (under customer name)
  1. foreign_worker: Is the debtor a foreign worker?
  • 1 : yes
  • 2 : no

Ordinal features:

  1. employment_duration: duration of debtor's employment with current employer
  • 1 : unemployed
  • 2 : < 1 yr
  • 3 : 1 <= ... < 4 yrs
  • 4 : 4 <= ... < 7 yrs
  • 5 : >= 7 yrs
  1. installment_rate: credit installments as a percentage of debtor's disposable income
  • 1 : >= 35
  • 2 : 25 <= ... < 35
  • 3 : 20 <= ... < 25
  • 4 : < 20
  1. present_residence: length of time (in years) the debtor has lived in the present residence
  • 1 : < 1 yr
  • 2 : 1 <= ... < 4 yrs
  • 3 : 4 <= ... < 7 yrs
  • 4 : >= 7 yrs
  1. property: the debtor's most valuable property, i.e. the highest applicable code is used. Code 2 is used if codes 3 and 4 are not applicable and there is a car or any other relevant property that does not fall under the savings variable (sparkont in the original German data).
  • 1 : unknown / no property
  • 2 : car or other
  • 3 : building soc. savings agr./life insurance
  • 4 : real estate
  1. number_credits: number of credits including the current one the debtor has (or had) at this bank
  • 1 : 1
  • 2 : 2-3
  • 3 : 4-5
  • 4 : >= 6
  1. job: quality of debtor's job
  • 1 : unemployed/unskilled - non-resident
  • 2 : unskilled - resident
  • 3 : skilled employee/official
  • 4 : manager/self-empl./highly qualif. employee

Continuous features:

  1. duration: credit duration in months

  2. amount: credit amount in DM

  3. age: age in years

Class labels:

  1. credit_risk: Has the credit contract been complied with (good) or not (bad)? (binary class)
  • 0 : bad
  • 1 : good
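
Because the labels are coded 0 = bad and 1 = good, it matters which class is treated as "positive" when reading precision, recall, and F1. The sketch below computes those metrics by hand for the bad class from a tiny, made-up set of predictions; `y_true` and `y_pred` are hypothetical and do not come from the notebook.

```python
# Hypothetical labels: 1 = good, 0 = bad (same coding as credit_risk).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 1, 1]

POSITIVE = 0  # treat "bad" as the class of interest

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == POSITIVE and p == POSITIVE)
fp = sum(1 for t, p in pairs if t != POSITIVE and p == POSITIVE)
fn = sum(1 for t, p in pairs if t == POSITIVE and p != POSITIVE)
tn = sum(1 for t, p in pairs if t != POSITIVE and p != POSITIVE)

precision = tp / (tp + fp)  # of the predicted bad risks, how many were bad
recall = tp / (tp + fn)     # of the actual bad risks, how many were caught
f1 = 2 * precision * recall / (precision + recall)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here two of the four bad risks are caught (recall 0.5) and two of the three "bad" predictions are correct (precision about 0.667), giving an F1 of about 0.571.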

