Projects

My projects use a wide variety of current machine learning tools for developing and interpreting models. My strongest experience is in building models, interpreting them with visualization tools, and deploying them to web applications.

REAL ESTATE ANALYTICS PROJECT: LEVERAGING MACHINE LEARNING FOR MARKET INSIGHTS

[Python, Apache Airflow, CI/CD]

• Built an MLOps (Machine Learning Operations) pipeline using GitHub Actions, Docker, Apache Airflow, DVC (Data Version Control), AWS, MLflow, and Flask for the development, deployment, and maintenance of machine learning models.
• Established and maintained a continuous integration and deployment (CI/CD) pipeline on AWS with GitHub Actions, which streamlined model deployment and automated the monitoring of model performance and accuracy in real-world use.

DATA ANALYTICS AND VISUALIZATION OF IPL PERFORMANCE

[Tableau, Python, Data Visualization, Dashboard]

• Transformed complex data into actionable insights, effectively communicating the strengths and capabilities of players across 14 IPL seasons using four key metrics.
• Created executive-level interactive data visualizations showing each team's win percentage against other teams.
• Developed Tableau dashboards that provided overviews of team performance, player statistics, and team preferences.
• Ensured data integrity through meticulous data cleaning, eliminating missing, duplicate, and irrelevant data, standardizing data types, and harmonizing team information.
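
The cleaning and win-percentage steps above can be illustrated with a minimal sketch. It is not the project's actual code: the row layout (`match_id`, `team1`, `team2`, `winner`), the `ALIASES` table, and the function names are assumptions made for this example. The two renames shown are real IPL franchise renames.

```python
# Hypothetical alias table used to harmonize renamed franchises.
ALIASES = {
    "Delhi Daredevils": "Delhi Capitals",
    "Kings XI Punjab": "Punjab Kings",
}


def harmonize(name):
    """Trim whitespace and map an old franchise name to its current one."""
    name = name.strip()
    return ALIASES.get(name, name)


def clean_matches(rows):
    """Drop duplicate match ids and rows without a winner; harmonize team names."""
    seen, cleaned = set(), []
    for row in rows:
        if not row.get("winner") or row["match_id"] in seen:
            continue
        seen.add(row["match_id"])
        cleaned.append({
            "match_id": row["match_id"],
            "team1": harmonize(row["team1"]),
            "team2": harmonize(row["team2"]),
            "winner": harmonize(row["winner"]),
        })
    return cleaned


def win_percentage(matches, team, opponent):
    """Percentage of head-to-head matches that `team` won against `opponent`."""
    played = [m for m in matches if {m["team1"], m["team2"]} == {team, opponent}]
    if not played:
        return None
    wins = sum(m["winner"] == team for m in played)
    return 100.0 * wins / len(played)


# Made-up sample rows, including a duplicate and a match with no result.
raw = [
    {"match_id": 1, "team1": "Delhi Daredevils", "team2": "Kings XI Punjab", "winner": "Delhi Daredevils"},
    {"match_id": 1, "team1": "Delhi Daredevils", "team2": "Kings XI Punjab", "winner": "Delhi Daredevils"},
    {"match_id": 2, "team1": "Punjab Kings", "team2": "Delhi Capitals", "winner": "Punjab Kings"},
    {"match_id": 3, "team1": "Delhi Capitals", "team2": "Punjab Kings ", "winner": "Delhi Capitals"},
    {"match_id": 4, "team1": "Delhi Capitals", "team2": "Punjab Kings", "winner": None},
]
matches = clean_matches(raw)
delhi_vs_punjab = win_percentage(matches, "Delhi Capitals", "Punjab Kings")
```

Harmonizing names before counting matters here: without it, matches played under an old franchise name would be treated as a different opponent and split the head-to-head record.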

DESIGN AND DEVELOPMENT OF SPORTS DATABASE

[SQL, Python, MySQL Workbench, Microsoft SQL Server]

• Established a database on Microsoft SQL Server to manage player, staff, team, match, and country information. The database uses functions for automated calculations, triggers for data integrity, and table-level constraints for data consistency.
• Developed a detailed Entity Relationship Diagram (ERD) comprising 14 tables, including one-to-one and one-to-many relationships.
• Automated data import processes using Python scripts. These scripts dynamically generated Data Definition Language (DDL) statements based on Excel data, streamlining data integration into the database.
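
As an illustration of the DDL generation described above, here is a minimal sketch, not the original scripts. It assumes the Excel sheet has already been read into a list of dictionaries (for example with a spreadsheet library); the `SQL_TYPES` mapping, the function names, and the `Player` sample are hypothetical.

```python
from datetime import date, datetime

# Assumed mapping from Python value types to SQL Server column types.
SQL_TYPES = {
    bool: "BIT",
    int: "INT",
    float: "FLOAT",
    date: "DATE",
    datetime: "DATETIME2",
    str: "NVARCHAR(255)",
}


def infer_sql_type(values):
    """Pick one SQL type for a column, falling back to NVARCHAR for mixed data."""
    kinds = {type(v) for v in values if v is not None}
    if len(kinds) == 1:
        return SQL_TYPES.get(kinds.pop(), "NVARCHAR(255)")
    if kinds == {int, float}:
        return "FLOAT"
    return "NVARCHAR(255)"


def create_table_ddl(table, rows):
    """Build a CREATE TABLE statement from rows given as a list of dicts."""
    lines = []
    for name in rows[0]:
        values = [row.get(name) for row in rows]
        nullable = "NULL" if any(v is None for v in values) else "NOT NULL"
        lines.append(f"    [{name}] {infer_sql_type(values)} {nullable}")
    return f"CREATE TABLE [{table}] (\n" + ",\n".join(lines) + "\n);"


# Made-up rows standing in for one worksheet.
rows = [
    {"player_id": 1, "name": "A. Batter", "batting_average": 41.5},
    {"player_id": 2, "name": "B. Bowler", "batting_average": None},
]
ddl = create_table_ddl("Player", rows)
print(ddl)
```

The sketch does not escape identifiers or add keys and constraints; real column names taken from a spreadsheet would need validation before being placed into a statement.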

INCOME CLASSIFIER BASED ON DEMOGRAPHIC DATA

[Python, Classification, Analysis]

• Conducted thorough Exploratory Data Analysis (EDA) to uncover patterns, inconsistencies, biases, outliers, and missing values.
• Executed meticulous data cleaning to address missing and duplicate data, handle data type inconsistencies, and eliminate irrelevant columns based on Variance Inflation Factor (VIF) values, enhancing the accuracy of machine learning models by 2-3%.
• Developed and optimized a range of machine learning algorithms, including KNN, Decision Trees, Random Forest, Logistic Regression, and Neural Networks, achieving a best classification accuracy of 86.54% and an F1 score of 92%.
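
The VIF-based column elimination mentioned above can be sketched as follows. This is an illustration under assumptions, not the project's code: the cutoff of 10 is a common rule of thumb rather than a value stated here, the column names and data are made up, and the pure-Python linear algebra is only there to keep the example self-contained. The VIF of a feature is 1 / (1 - R²), where R² comes from regressing that feature on the others, which equals the matching diagonal entry of the inverse correlation matrix.

```python
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Pearson correlation matrix for a dict of equally long numeric columns.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    names = list(columns)
    n = len(columns[names[0]])
    z = {}
    for name in names:
        mu, sd = mean(columns[name]), pstdev(columns[name])
        z[name] = [(v - mu) / sd for v in columns[name]]
    return [[sum(a * b for a, b in zip(z[r], z[c])) / n for c in names] for r in names]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    size = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(size)] for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def vif_scores(columns):
    """VIF per column, read off the diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return {name: inverse[i][i] for i, name in enumerate(columns)}


def drop_high_vif(columns, threshold=10.0):
    """Repeatedly drop the column with the highest VIF until all are <= threshold."""
    kept = dict(columns)
    while len(kept) > 1:
        scores = vif_scores(kept)
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:
            break
        del kept[worst]
    return kept


# Made-up demographic features: age and experience are nearly collinear.
features = {
    "age": [25, 32, 47, 51, 38, 29, 60, 44],
    "experience": [3, 11, 24, 30, 15, 8, 37, 23],
    "hours": [40, 45, 38, 50, 42, 36, 40, 48],
}
kept = drop_high_vif(features)
```

Dropping one column at a time and recomputing matters because removing a single collinear feature usually lowers the VIF of the features it was correlated with.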
