Projects Portfolio

2023 Data Science Job Market Analysis | Tools: SQL, Data Analysis, PostgreSQL

  • 📊 Dive into the data job market! Focusing on data scientist roles, this project explores 💰 top-paying jobs, 🔥 in-demand skills, and 📈 where high demand meets high salary in data science.

Code

Intelligent Rival for Nine Men’s Morris | Tools: Python, Machine Learning, AI

  • Built an adversarial-search agent for Nine Men’s Morris using Minimax with Alpha-Beta pruning and a dynamically adjusted search depth, achieving an 88% win rate against human players.
  • Competed in a class-wide AI competition organized by the professor, securing a top 3 position among 80+ participants.
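As a rough illustration of the search technique named above, here is a minimal, game-agnostic sketch of Minimax with Alpha-Beta pruning. It is not the project's actual code: the `children` and `evaluate` callbacks, the toy tree, and the `choose_depth` rule (with its thresholds) are all hypothetical stand-ins for a real Nine Men’s Morris move generator, board heuristic, and depth schedule.

```python
import math


def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    """Return the minimax value of `state`, searching `depth` plies with alpha-beta pruning."""
    kids = list(children(state))
    if depth == 0 or not kids:
        # Depth limit reached or terminal position: fall back to the heuristic.
        return evaluate(state)
    if maximizing:
        value = -math.inf
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing player will never allow this branch
        return value
    value = math.inf
    for child in kids:
        value = min(value, alphabeta(child, depth - 1, alpha, beta, True, children, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing player already has a better option
    return value


def choose_depth(pieces_on_board, base_depth=3, max_depth=6):
    """Hypothetical depth rule: search deeper once few pieces remain."""
    return max_depth if pieces_on_board <= 6 else base_depth


# Toy game tree (assumption for the demo): inner lists are positions, integers are leaf scores.
TOY_TREE = [[3, 5], [2, 9]]


def toy_children(node):
    return node if isinstance(node, list) else []


def toy_evaluate(node):
    return node if isinstance(node, int) else 0


best = alphabeta(TOY_TREE, 2, -math.inf, math.inf, True, toy_children, toy_evaluate)
# best == 3: the maximizer picks the left branch, whose minimum (3) beats the right branch's (2).
```

In the toy run, the leaf `9` is never evaluated: once the right branch yields `2`, it can no longer beat the `3` already secured, so the search prunes it.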

Code

Counter-Speech Generator against Online Hate Speech | Tools: PySpark, NLP, ML

  • Built a deep learning model using Hugging Face T5 to generate counter-speech against online hate speech, with an accuracy of 85%.
  • Evaluated the effectiveness of the generated counter-speech in mitigating negativity through sentiment analysis using TextBlob.
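The sentiment-based evaluation could look something like the sketch below. The project's exact metric is not stated, so this assumes one plausible approach: comparing TextBlob polarity of the original post with that of the generated reply. The function names and the pairing of inputs are hypothetical; only `TextBlob(text).sentiment.polarity` is the library's own API.

```python
from statistics import mean


def textblob_polarity(text):
    """Polarity in [-1.0, 1.0] from TextBlob's default sentiment analyzer."""
    from textblob import TextBlob  # third-party package, imported lazily

    return TextBlob(text).sentiment.polarity


def mean_polarity_shift(pairs, polarity=textblob_polarity):
    """Average of (reply polarity - original polarity) over (original, reply) pairs.

    A positive result suggests the replies read as less negative than the posts
    they answer. This is an assumed metric, not the project's documented one.
    """
    return mean(polarity(reply) - polarity(original) for original, reply in pairs)
```

Passing the scorer as a parameter keeps the metric independent of any one sentiment library, so a different analyzer can be swapped in without changing the aggregation.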

Code

Stock Price Prediction with LSTM-RNN | Tools: PySpark, Databricks, AWS

  • Conceived a scalable real-time Spark pipeline on AWS infrastructure for data ingestion, pre-processing, and stock prediction.
  • Leveraged Databricks with PySpark to build an LSTM-RNN model, achieving a squared error of 5% on historical data.
  • Collaborated with a team of 3 students to refine the model, improving its accuracy to 93% through hyperparameter tuning.

Code

Databricks Formula 1 Racing Analysis

  • Created Databricks notebooks to ingest, transform, and analyze Formula 1 racing data and produce reports.
  • Wrote Spark SQL queries to identify the dominant drivers and teams for visualization.
  • Scheduled the pipeline using Azure Data Factory (ADF) for monitoring and alerts.

Code

Impact of Airbnb on Housing Supply | Tools: MySQL, Data Analytics, Tableau

  • Led a team of 5 to analyze Airbnb’s impact on housing, completing the project on time.
  • Utilized GeoPandas and ZCTA5 shapefiles to extract granular data at the zip code level.
  • Created a scalable Tableau dashboard to assess risk for 1.5M Airbnb listings from state to zip code level.