NikosKont/ML-notebooks


Machine Learning Notebooks

This repository collects notebooks for machine learning applications and related tasks, including university coursework and personal projects. The notebooks are written in Python and rely mainly on PyTorch, scikit-learn, TensorFlow, Keras, statsmodels, NumPy, SciPy, Pandas, Matplotlib, and Seaborn. Many of the notebooks are hosted in external repositories, which are linked below.

Index

Course assignments for CS573: Optimization Methods (graduate level). Derivations and implementations (in pure numpy and scipy) of optimization and machine learning algorithms:

  • Regression & Completion: OLS regression using the pseudoinverse, LASSO regression by coordinate descent, matrix completion through nuclear norm minimization.

  • SVD & PCA: SVD using eigendecomposition, PCA through SVD, user similarity search in low-dimensional PCA space on the MovieLens dataset.
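As a rough illustration of two techniques named in the list above (OLS via the pseudoinverse and PCA through SVD), here is a minimal NumPy sketch. It is not taken from the notebooks: the function names `ols_pinv` and `pca_svd` and the synthetic data are assumptions made for this example.

```python
import numpy as np

def ols_pinv(X, y):
    """Least-squares coefficients via the Moore-Penrose pseudoinverse."""
    return np.linalg.pinv(X) @ y

def pca_svd(X, k):
    """Project X onto its top-k principal components using the SVD."""
    Xc = X - X.mean(axis=0)                       # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                           # rows are principal directions
    explained_var = S[:k] ** 2 / (X.shape[0] - 1) # variance along each direction
    return Xc @ components.T, components, explained_var

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true                                    # noiseless targets

w_hat = ols_pinv(X, y)                            # recovers w_true up to rounding
Z, components, explained_var = pca_svd(X, k=2)    # Z has shape (200, 2)
```

Because the singular values returned by `np.linalg.svd` are sorted in descending order, the first row of `components` is the direction of largest variance.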

Course assignments for CS485: Applied Data Science. Applications of machine learning methods and algorithms on different datasets, using popular Python libraries:

  • PCA: Principal Component Analysis and Linear Regression on the California Housing dataset, using sklearn.
  • GMM & MLE: Implementation of a basic Gaussian Mixture Model and an example of derivation and application of Maximum Likelihood Estimation, in pure numpy.
  • SVM: Support Vector Machines for classification on the Wine dataset, using sklearn.
  • MLP: Multi-Layer Perceptron for digit classification on the MNIST dataset, using keras.
  • CNN: Convolutional Neural Networks and transfer learning for classification on the Fashion MNIST dataset, using pytorch.
  • TSA: Analysis of the time series components and prediction on the Sunspots dataset, using ARIMA from statsmodels and LSTM from keras.
  • GNN: Node classification and embedding visualisation on the Planetoid PubMed dataset, using pytorch-geometric.
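The GMM & MLE entry above mentions Maximum Likelihood Estimation in pure numpy. A minimal sketch of what that could look like for a univariate Gaussian follows. The choice of distribution and the names `gaussian_mle` and `gaussian_log_likelihood` are assumptions for this example, not a description of the notebook.

```python
import numpy as np

def gaussian_mle(x):
    """Closed-form MLE of the mean and variance of a univariate Gaussian."""
    mu = x.mean()
    sigma2 = ((x - mu) ** 2).mean()   # the MLE divides by n, not n - 1
    return mu, sigma2

def gaussian_log_likelihood(x, mu, sigma2):
    """Log-likelihood of i.i.d. samples x under N(mu, sigma2)."""
    n = x.size
    return -0.5 * n * np.log(2 * np.pi * sigma2) - ((x - mu) ** 2).sum() / (2 * sigma2)

# Small hypothetical sample
x = np.array([1.0, 2.0, 3.0, 4.0])
mu_hat, sigma2_hat = gaussian_mle(x)
ll_hat = gaussian_log_likelihood(x, mu_hat, sigma2_hat)
```

Setting the derivatives of the log-likelihood with respect to `mu` and `sigma2` to zero gives exactly the sample mean and the 1/n sample variance, so any other parameter pair yields a lower value of `gaussian_log_likelihood` on the same data.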

Course assignments for CS473: Pattern Recognition. Implementations (from scratch, in pure numpy) and applications of the following machine learning algorithms:

Personal project. Q-Learning implementation for the estimation of the $p$ parameter in a binomial $B(n, p)$ distribution with a target of $k$ out of $n$ successes.

Personal project. Analysis aimed at identifying the most important statistics for a midfielder in football.

Course project for MEM264: Applied Statistics. Predicting movie revenue using multiple linear regression, based on data from TMDB.