Data Science Project Portfolio
Data Science Professional Practice Module Portfolio
This portfolio is a work in progress; I will curate it further once I have decided what information I would like to publish.
I am currently in the first year of a 3-year Level 6 Data Science degree apprenticeship course. Redirect link to GitHub Profile Page
I have completed 4 modules and demonstrated skills in the following areas:
| Area | Skills |
|---|---|
| Data engineering | ETL processing in Power Query for PowerBI and Jupyter Notebooks for Python |
| Data visualisation and dashboarding | PowerBI dashboarding with interactive visualisations; data visuals in Python using Jupyter Notebooks |
| Data analytics | Linear regression and logistic regression modelling in Python using Jupyter Notebooks |
I have also completed many other courses and am self-taught in a number of areas. I will update this section once I have decided what I would like to include.
| Type | Logistic Regression - Companies House | K-Means Clustering - Allrecipes |
|---|---|---|
| Question | To what extent can company structure and ownership variables be used to predict late filing behaviour in UK companies using logistic regression, as an indicator of regulatory non-compliance? | How effectively can K-Means clustering be applied to segment recipes, based on macronutrient composition and preparation time, in order to support time-constrained individuals in making nutritionally informed choices? |
| GitHub Links | Logistic Regression with Companies House Data | Clustering recipes with data from https://www.allrecipes.com/ |
| Data Sources | Companies House - Free Company Product; People with Significant Control (PSC) Snapshot | all_recipes.csv |
| Scripts | Get Data; EDA and Data Cleansing; Logistic Regression Modelling | EDA and Cleansing Notebook |
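
To give a flavour of the K-Means question in the table above, here is a minimal sketch of clustering recipes on macronutrients and preparation time. It is not the code from the project notebooks: the recipe values, the column order and the function names `standardise` and `k_means` are all made up for illustration, and it uses only the Python standard library so it runs anywhere. A real notebook would more likely rely on a library such as scikit-learn.

```python
import math
import random

# Hypothetical recipes: (protein_g, carbs_g, fat_g, prep_minutes).
# These values are invented for illustration, not taken from all_recipes.csv.
RECIPES = [
    (30.0, 10.0, 8.0, 15.0),
    (28.0, 12.0, 9.0, 20.0),
    (32.0, 8.0, 7.0, 10.0),
    (5.0, 60.0, 20.0, 90.0),
    (6.0, 65.0, 22.0, 100.0),
    (4.0, 58.0, 18.0, 85.0),
]


def standardise(rows):
    """Scale each column to zero mean and unit variance so minutes do not dominate grams."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        # Fall back to 1.0 for a constant column to avoid dividing by zero.
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return [
        tuple((v - m) / s for v, m, s in zip(row, means, stds))
        for row in rows
    ]


def k_means(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: assign each point to its nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


labels, centroids = k_means(standardise(RECIPES), k=2)
print(labels)  # one cluster label per recipe, in the same order as RECIPES
```

Standardising first matters here because preparation time in minutes spans a much larger range than grams of protein, and K-Means relies on Euclidean distance. The choice of `k=2` is arbitrary for this toy data.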