Link to my portfolio: https://ayswarya-sundararaman.github.io/portfolio/
Data Engineer | Machine Learning Enthusiast | Automation Expert
I have over 4 years of experience in Python, SQL, and Tableau, delivering impactful data solutions. Currently pursuing a Master's in Informatics at Northeastern University, I'm passionate about using data-driven methods to solve real-world problems, from predictive modeling to automation.
- Languages: Python, SQL, R
- Machine Learning: Scikit-Learn, TensorFlow, Keras
- Data Visualization: Tableau, Matplotlib, Seaborn, Advanced Excel
- Big Data: Hadoop, Apache Spark, Apache Kafka, Airflow
- Tools: Git, Docker, Jupyter, Google Colab
- Databases: MySQL, MongoDB, Cassandra
Built an end-to-end convolutional neural network model based on NVIDIA's architecture to predict steering angles from front-view car images using SullyChen's self-driving car dataset. Applied data augmentation, dropout, and fine-tuned hyperparameters to achieve high accuracy in real-world scenarios.
- Tools: Python, TensorFlow, Keras, OpenCV
- Results: Significantly reduced training error; the model accurately predicts steering angles in unseen scenarios.
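
As a rough illustration of one common augmentation for steering-angle data (an assumption for this sketch, not necessarily the exact augmentation used in the project), an image can be mirrored left-to-right while the sign of its angle is flipped. The helper name `flip_horizontal` and the tiny list-based "image" are hypothetical.

```python
def flip_horizontal(image, angle):
    """Mirror an image left-to-right and negate its steering angle.

    `image` is a list of pixel rows; `angle` is the steering angle,
    where the sign encodes the turning direction.
    """
    flipped = [list(reversed(row)) for row in image]
    return flipped, -angle


# Tiny 2x3 "image" with a steering angle of 12.5 degrees
image = [[1, 2, 3],
         [4, 5, 6]]
flipped, flipped_angle = flip_horizontal(image, 12.5)
print(flipped)        # [[3, 2, 1], [6, 5, 4]]
print(flipped_angle)  # -12.5
```

Mirroring roughly doubles the training set and balances left and right turns, which is why it is a popular choice for this kind of dataset.
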
[Walmart Sales Forecasting](https://medium.com/@ays060/walmart-store-sales-forecasting-kaggle-challenge-53719d4f8ddf)
Developed a machine learning model to forecast weekly sales for different Walmart departments using time-series data. Integrated markdown event features, applied advanced feature engineering, and trained a model to predict future sales accurately.
- Tools: Python, Random Forest, XGBoost, Time Series Analysis
- Results: Improved the weighted mean absolute error (WMAE) by accounting for holiday effects and markdowns, reaching a Kaggle rank of 20.
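
For reference, a minimal sketch of how WMAE can be computed. It assumes the weighting used by the Kaggle competition, where holiday weeks count five times as much as regular weeks; the function name `wmae` and the sample numbers are made up for illustration.

```python
def wmae(actual, predicted, is_holiday, holiday_weight=5.0):
    """Weighted mean absolute error: holiday weeks weigh more than regular weeks."""
    weights = [holiday_weight if holiday else 1.0 for holiday in is_holiday]
    total_error = sum(
        w * abs(a - p) for w, a, p in zip(weights, actual, predicted)
    )
    return total_error / sum(weights)


# Illustrative weekly sales for three weeks; the last one is a holiday week
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]
is_holiday = [False, False, True]

# (1*10 + 1*10 + 5*30) / (1 + 1 + 5) = 170 / 7
print(wmae(actual, predicted, is_holiday))
```

Because holiday errors are weighted so heavily, features that capture holiday and markdown effects have an outsized influence on the final score.
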
Built a deep learning model using CNN and LSTM layers to classify human activities (walking, sitting, standing, etc.) using smartphone sensor data from the UCI HAR dataset. Applied a divide-and-conquer strategy to optimize classification for dynamic and static activities.
- Tools: Python, TensorFlow, Keras
- Results: Achieved 96% accuracy on the test dataset using time-series modeling.
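
The divide-and-conquer idea can be sketched as a two-stage router: a first stage separates dynamic from static activities, and a specialist handles each group. Everything below is a hypothetical stand-in; in the real project each stage would be a trained CNN/LSTM model, and the variance and mean thresholds are placeholders chosen only to make the example run.

```python
from statistics import pvariance


def is_dynamic(window, threshold=0.5):
    """Stage 1 stand-in: treat high-variance sensor windows as dynamic."""
    return pvariance(window) > threshold


def classify_dynamic(window):
    """Stage 2a stand-in for a model trained only on dynamic activities."""
    return "walking"


def classify_static(window):
    """Stage 2b stand-in for a model trained only on static activities."""
    mean = sum(window) / len(window)
    return "sitting" if mean < 0.5 else "standing"


def classify_activity(window):
    """Route a window to the specialist chosen by the first stage."""
    if is_dynamic(window):
        return classify_dynamic(window)
    return classify_static(window)


print(classify_activity([0.0, 2.0, -2.0, 1.5]))  # walking
print(classify_activity([1.0, 1.0, 1.0, 1.0]))   # standing
```

Splitting the problem this way lets each specialist focus on distinctions that are hard to learn jointly, such as sitting versus standing.
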
Developed a multi-label classification system for StackOverflow's question pairs using NLP techniques such as TF-IDF, word embeddings, and machine learning models. Built a robust feature engineering pipeline to predict relevant tags for each question.
- Tools: Python, NLP, Logistic Regression, XGBoost
- Results: Achieved 85% accuracy in predicting the most relevant tags for a given question.
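
To show what TF-IDF weighting does, here is a minimal pure-Python sketch using the textbook formula `tf * log(N / df)`. Libraries such as Scikit-Learn apply smoothed variants, so their numbers will differ; the `tfidf` helper and the sample questions are illustrative only.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


questions = [
    "python list sort",
    "python dict merge",
    "java list sort",
]
for weights in tfidf(questions):
    print(weights)
```

Terms that appear in only one question (such as `dict`) receive a higher weight than terms shared across questions (such as `python`), which is what makes them useful signals for tag prediction.
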
Applied K-Means and Agglomerative Clustering on the DonorsChoose dataset to identify patterns in project proposals. Utilized Truncated SVD for dimensionality reduction and visualized clusters to uncover insights into project types and approval likelihood.
- Tools: Python, Scikit-Learn, Matplotlib
- Results: Identified key clusters of successful projects, providing valuable insights for donors and organizers.
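
The core loop behind K-Means can be sketched in a few lines. This is a simplified illustration on made-up 2-D points with caller-supplied starting centroids, not the project's code, which ran Scikit-Learn on SVD-reduced features.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points with caller-supplied starting centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for x, y in points:
            distances = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            clusters[distances.index(min(distances))].append((x, y))
        # Update step: move each centroid to the mean of its cluster
        centroids = [
            (sum(p[0] for p in cluster) / len(cluster),
             sum(p[1] for p in cluster) / len(cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

On this toy data the two groups separate cleanly; on real text features, reducing dimensionality first keeps the distance computations meaningful and fast.
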
- LinkedIn: [Connect with me](https://www.linkedin.com/in/ayshwaryasund/)
- Email: [ayswaryasundararaman@gmail.com](mailto:ayswaryasundararaman@gmail.com) / [sundararaman.a@northeastern.edu](mailto:sundararaman.a@northeastern.edu)
Let's build something impactful together!