Data Science Portfolio

Hello there, welcome to my DS Portfolio!

My name is João Pedro Vazquez. I'm an aspiring data scientist who has been practicing data science skills since early 2020.
The main goal of these projects is to demonstrate my ability to solve real business problems with data science concepts and tools, using public data. Here you can check out some of the data science projects I've made!

Skills and Abilities

My main focus is to create end-to-end data solutions for business problems by collecting, processing, and analysing data, and by implementing machine learning models that help improve business performance.

  • Programming language: ⭐ ⭐ ⭐ ⭐
  • • Python for Data Analysis
  • Data Collection and Databases: ⭐ ⭐
  • • SQLite, MySQL
  • Data Processing and Analysis: ⭐ ⭐ ⭐
  • • NumPy, pandas, SciPy
  • Data Visualization: ⭐ ⭐ ⭐
  • • Matplotlib, Seaborn, Plotly, Metabase
  • Machine Learning Modeling: ⭐ ⭐ ⭐
  • • Descriptive Statistics: central tendency, dispersion, asymmetry, kurtosis, density
  • • Regression, Classification, Clustering and embedding algorithms
  • • Data balancing, data preparation, feature selection and dimensionality reduction
  • • Performance metrics: RMSE, MAE, MAPE, F1-Score, Precision, Recall, Calibration Curve, Silhouette Score, DB-Index
  • • Machine Learning Packages: scikit-learn, SciPy
  • Deployment: ⭐ ⭐
  • • Heroku, Dash, Streamlit, Flask
  • Development: ⭐ ⭐
  • • Git, GitHub, GitLab, Cookiecutter, Virtual Environment

Data Science Projects

Check out the projects!

Churn Prediction Probability - TopBank

In this project, I created an XGBoost model that predicts each TopBank customer's probability of churning, and I formulated an action plan to tackle the churn problem: customers are offered a gift card according to their churn probability, chosen to maximize the customers' ROI.
In addition to estimating the financial return, I built an app with Dash and deployed it to production on Heroku.

The model achieved an F1-Score of 0.905, and the app displays how much revenue TopBank could save by preventing churn with gift cards, depending on the budget set by the user. To check out the app, click here.
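As a rough illustration of how a budget-constrained gift card plan could work, here is a minimal sketch. It is not the project's actual code: the `Customer` fields, the greedy ranking rule, and the assumption that a gift card always retains the customer are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # All fields are hypothetical; the real project may use different features.
    customer_id: int
    churn_probability: float  # model output in [0, 1]
    expected_revenue: float   # revenue lost if this customer churns
    gift_card_cost: float     # cost of the incentive offered to this customer


def select_customers(customers, budget):
    """Greedily pick customers with the highest expected saved revenue per unit cost.

    Simplifying assumption: a gift card always prevents the churn.
    Returns (selected ids, total spent, expected revenue saved).
    """
    ranked = sorted(
        customers,
        key=lambda c: c.churn_probability * c.expected_revenue / c.gift_card_cost,
        reverse=True,
    )
    selected, spent, saved = [], 0.0, 0.0
    for c in ranked:
        # Skip any customer whose gift card no longer fits in the remaining budget.
        if spent + c.gift_card_cost <= budget:
            selected.append(c.customer_id)
            spent += c.gift_card_cost
            saved += c.churn_probability * c.expected_revenue
    return selected, spent, saved


# Made-up example values, not TopBank data.
customers = [
    Customer(1, 0.9, 1000.0, 100.0),
    Customer(2, 0.2, 500.0, 50.0),
    Customer(3, 0.6, 2000.0, 200.0),
]
selected, spent, saved = select_customers(customers, budget=300.0)
```

A greedy ranking is only a heuristic for this kind of knapsack problem, so it may not find the best possible allocation for every budget.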

Sales Prediction - Rossmann Stores

In this project, I created an XGBoost model that predicts each store's sales for the next six weeks. The sales forecast served as an input for allocating the budget for the stores' infrastructure renovation.
The model achieved an MAE of 763.11, and the sales predictions per store can be easily accessed by the Rossmann CFO through a Telegram bot. To access the Telegram bot, click here.
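For readers unfamiliar with the metric, MAE (mean absolute error) is the average absolute difference between predicted and actual values, expressed in the same units as the sales figures. A minimal sketch follows; the function name and the sample numbers are illustrative only, not Rossmann data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all observations."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one observation is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up daily sales values for illustration.
actual_sales = [100.0, 200.0, 300.0]
predicted_sales = [110.0, 190.0, 330.0]
mae = mean_absolute_error(actual_sales, predicted_sales)
```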

Banking Marketing Strategy - Clustering

Ongoing Project

In this project, the main goal is to build a clustering-based data solution that segments customers in order to guide the bank's marketing strategy.

Contact

Feel free to send a message!
For any criticism or suggestions, please open an issue or pull request on GitHub.