About me

Click here to go directly to my GitHub profile

Hi, I’m a Data Scientist with a product development and strategic analysis background. I hold a Master's degree in Mechanical Engineering and Aeronautics with a focus on numerical analysis and simulation. I have extensive experience in the automotive and energy sectors, in product development, testing and strategic insight functions.

My commercial projects have included:

  • Fleet telematics: Deployment of a vehicle fleet telematics system including ETL pipeline, data processing, visualisation and dashboarding.
  • Battery health forecasting: Regression model to forecast EV battery failure and add data-driven insight to support a >£10 million warranty analysis.
  • Pipe inspection tools: Developed data processing and reporting tools for two cutting-edge subsea pipe test inspection products: laser bore scanning and 3D strain imaging.

My technical expertise is in data mining, visualisation and software development (Python, MATLAB, etc.). I work on data analytics and machine learning projects in my free time. Some personal projects include:

Projects

Used car pricing analysis

  • Built a used car valuation model to optimise the selling price for my own car.
  • Web-scraped 1000+ adverts from Autotrader and analysed price trends against features such as age, mileage and engine size.
  • An SVR (support vector regression) model achieved an R^2 of 0.97 and an MAE of £961. The most influential features on price were age and mileage.
  • The final result valued my car within £200 of Autotrader's own recommended selling price.
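For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how R^2 and MAE are defined. The prices are made up for illustration and are not the scraped Autotrader data; scikit-learn's `r2_score` and `mean_absolute_error` compute the same quantities.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted prices."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Share of price variance explained by the model (1.0 is a perfect fit)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up prices in GBP, for illustration only (not the project data).
actual_prices = [5000, 7500, 10000, 12500]
predicted_prices = [5500, 7000, 10500, 12000]

mae = mean_absolute_error(actual_prices, predicted_prices)
r2 = r_squared(actual_prices, predicted_prices)
print(f"MAE: £{mae:.0f}, R^2: {r2:.3f}")
```

MAE is in the same units as the price, so it reads directly as a typical valuation error in pounds.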

Tools: Python Pandas NumPy Requests BeautifulSoup4 Matplotlib Seaborn Scikit-learn

Read More

Job market analysis (NLP)

  • Created an NLP job title classifier with data scraped from indeed.com.
  • Automation of job titling could boost recruitment efficiency and better reach the most suitable candidates.
  • Extracted 'skill tags' for each role (Python, Cloud, Machine Learning, etc.).
  • Deployed a web-app using Streamlit, allowing anyone to classify a job as 'Data Scientist', 'Data Analyst', or 'Data Engineer'.
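The skill-tag step can be illustrated with a simple keyword match. This is only a sketch under assumptions: the keyword list and the `extract_skill_tags` helper are hypothetical, and the project itself may have used a different approach.

```python
import re

# Hypothetical keyword list; the tags used in the project may differ.
SKILL_KEYWORDS = ["Python", "SQL", "Cloud", "Machine Learning"]


def extract_skill_tags(description, keywords=SKILL_KEYWORDS):
    """Return the keywords that appear as whole words in a job description."""
    tags = []
    for keyword in keywords:
        # Word boundaries stop "SQL" from matching inside "MySQL".
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, description, flags=re.IGNORECASE):
            tags.append(keyword)
    return tags


# Made-up advert text, for illustration only.
advert = "We need a data scientist with Python and machine learning experience."
tags = extract_skill_tags(advert)
print(tags)
```

Matching is case-insensitive, and tags are returned in the order of the keyword list rather than the order they appear in the advert.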

Libraries: Requests BeautifulSoup4 Pandas NLTK Seaborn Scikit-learn Streamlit

Read More

MOT data analysis

  • Analysed 30 million MOT tests from GOV.UK for trends in vehicle ownership, pass/fail rates, etc.
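As an illustration of the kind of pass-rate query this analysis involves, here is a minimal SQLite sketch. The `mot_tests` table, its columns, the `'P'`/`'F'` result codes and the rows are all hypothetical simplifications, not the published MOT dataset schema.

```python
import sqlite3

# In-memory database with a hypothetical, simplified schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mot_tests (make TEXT, test_result TEXT)")
conn.executemany(
    "INSERT INTO mot_tests VALUES (?, ?)",
    [
        ("FORD", "P"), ("FORD", "F"), ("FORD", "P"), ("FORD", "F"),
        ("TOYOTA", "P"), ("TOYOTA", "P"), ("TOYOTA", "P"), ("TOYOTA", "F"),
    ],
)

# In SQLite a comparison evaluates to 0 or 1, so averaging it gives a pass rate.
query = """
    SELECT make,
           COUNT(*) AS tests,
           AVG(test_result = 'P') AS pass_rate
    FROM mot_tests
    GROUP BY make
    ORDER BY make
"""
pass_rates = conn.execute(query).fetchall()
conn.close()

for make, tests, pass_rate in pass_rates:
    print(f"{make}: {tests} tests, {pass_rate:.0%} passed")
```

Aggregating inside the database keeps memory use low, and the small result set can then be loaded into Pandas for plotting.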

Libraries: SQLite Pandas Seaborn Scikit-learn

Read More

Energy Demand Forecasting

  • Predicted survival probability for passengers on board the Titanic.

Libraries:

Read More

Data science tools

  • Languages: Python SQL MATLAB
  • Databases: SQLite BigQuery
  • Machine learning: sklearn
  • Text analytics: nltk spaCy
  • Data manipulation: pandas numpy dask scipy
  • Visualisation: matplotlib seaborn plotly Tableau streamlit

Other skills and tools

  • Google Cloud: BigQuery Data Studio
  • Web scraping: BeautifulSoup4
  • Other: pptk (point cloud visualisation)

Currently learning

  • tensorflow/keras
  • Databricks spark
  • Airflow