Pole Position Prediction - A tidymodels Example

Hello readers, in today’s blog I will be looking at predicting the Formula 1 grid using the tidymodels collection of R packages. The idea is to use data from the Friday practice sessions to give an idea of what the grid is expected to be for Sunday’s race, before qualifying takes place on Saturday. […]

Twenty20 Win Probability Added

Hello readers, welcome to today’s blog. I am going to implement a win probability added model for Twenty20 cricket. This is nothing new: a quick Google search turns up many sources for it. CricViz is possibly the most famous version, which you may have seen on the app. The idea is the model […]

Predicting Qualifying — 2

In the last blog I outlined creating a model which predicts the fastest time for each driver in F1 qualifying. theparttimeanalyst.com/2019/07/10/predicting-f1-qualifying/ Today I am going to be dissecting the model to understand its strengths and weaknesses and to see if there is any bias within the model. First, let’s look at the importance matrix. The […]

Predicting F1 Qualifying

Hello, welcome to this blog. A few days ago I tweeted the graph below. It was the output from the model I have created, which predicts the qualifying time for each driver. I will review the outputs of the model in the next blog, but today I’m going […]

Replacing Nikita 1

Today I have a challenge to go through. It was run by FC Rstats; however, the submission was a few weeks ago, I did it in a rush, and I don’t think it was my best work. Therefore this is a rehash of my submission, so it could end up with different results. The challenge is simple […]

Tidy Tuesday Board Games – XGBoost Model

Hello, today we are going to ask a question and try to answer it with an analysis of data. I want to create a board game. In order to do that, I want to understand what makes one board game more highly rated than another. I want to create a popular board game after all! To do […]

The Next Chris Gayle

Hello, welcome to today’s data adventure where we are going to be scouting for the next Chris Gayle. I am going to be using K-means clustering in order to achieve this. The first question is what numbers am I going to use for this? Previously I detailed the creation of several metrics with which to […]

Predicting Future Twenty20 Batting Performance

Hello, welcome to the next R stats adventure. I am going to be looking at developing a model which can be used to scout batsmen of the future in Twenty20 cricket. You may have seen previously (and if you haven’t, go check it out) I looked at the Second XI county championship in order […]

Does the Dog Get Adopted?? — P2

Today we are going to look into the second part of creating the classification tree to examine the outcomes of dogs in the Dallas animal shelter. Today it’s the exciting stuff: creating the actual classification tree. If you want to understand how I have prepared the data, go and check out the first blog […]

Does the Dog Get Adopted?? — P1

Hello, welcome to the next blog. I was inspired by this week’s Tidy Tuesday dataset. I’m sure I have said this before, but if you want to learn rstats it’s a great resource, with a weekly dataset to practice your burgeoning skills. This week’s data was from the Dallas open data project, and the particular […]