The use of statistical models in computer algorithms allows computers to make decisions and predictions, and to perform tasks that traditionally require human cognitive abilities. Machine learning is the interdisciplinary field at the intersection of statistics and computer science that develops such statistical models and interweaves them with computer algorithms. It underpins many modern technologies, such as speech recognition, Internet search, bioinformatics and computer vision. Amazon’s recommender system, Google’s driverless car and the most recent imaging systems for cancer diagnosis are all based on machine learning technology.
This course on Machine Learning uses real-world applications to explain how to build systems that learn and adapt. Topics include linear regression, logistic regression, deep neural networks and clustering, among others. The course is project-oriented, with emphasis on writing software implementations of learning algorithms and applying them to real-world problems, in particular credit risk, collections management and fraud detection.
Instructor: Dr. Alejandro Correa Bahnsen
- email: al.bahnsen@gmail.com
- twitter: @albahnsen
- github: albahnsen
- Python version 3.5;
- NumPy, the core numerical extensions for linear algebra and multidimensional arrays;
- SciPy, additional libraries for scientific programming;
- Matplotlib, excellent plotting and graphing libraries;
- IPython, with the additional libraries required for the notebook interface;
- pandas, a Python counterpart to R's data frame;
- scikit-learn, a machine learning library.
A good, easy-to-install option that supports Mac, Windows, and Linux, and that includes all of these packages (and many more), is Anaconda.
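One way to confirm that the packages listed above are available is a short check script. The sketch below is only an illustration and not part of the course material: the helper name `check_environment` is hypothetical, and it assumes the usual import names (`sklearn` for scikit-learn, `IPython` for IPython). It avoids f-strings so that it also runs on Python 3.5.

```python
import importlib
import sys

# Usual import names for the packages listed above
# (assumption: scikit-learn is imported as "sklearn").
PACKAGES = ["numpy", "scipy", "matplotlib", "IPython", "pandas", "sklearn"]


def check_environment(packages=PACKAGES):
    """Map each import name to its version string, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is missing from the current environment.
            versions[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    print("Python {}".format(sys.version.split()[0]))
    for name, version in check_environment().items():
        print("{}: {}".format(name, version or "NOT INSTALLED"))
```

Any package reported as `NOT INSTALLED` can then be added through whichever package manager the environment uses.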
Git! Unfortunately it is outside the scope of this class, but please take a look at these tutorials.
Session | Notebook link | Exercises |
---|---|---|
1 | Introduction to Machine Learning | |
2 | Introduction to Python | Python & Numpy |
3 | Pandas Data Frame | Baby names |
4 | Linear Regression | Income Prediction Rent |
5 | Logistic Regression | Credit Scoring |
6 | Data Preparation and Model Evaluation | Credit Scoring V2 |
7 | Feature Selection | Income Prediction V2 |
8 | Unbalance Datasets | Fraud Detection |
9 | Decision Trees | Fraud Detection V2 |
10 | Ensemble Methods - Bagging | Fraud Detection V3 |
11 | Ensemble Methods - Boosting | Credit Scoring V3 |
12 | SVM | Fraud Detection V4 |
13 | Cost-Sensitive Classification | Credit Scoring V4 |
14 | Deep Learning | |