Kaggle Playground Series – Tidymodels

Hello readers, we are entering another Kaggle playground competition, so get your Yorkshire tea ready and enjoy the process of joining. This month the competition I entered is this one: https://www.kaggle.com/competitions/playground-series-s3e7 It looks like the data are cancellations from hotels and, spoiler alert, I had a lot of fun with this dataset. EDA First, I […]

Pole Position Prediction - A tidymodels Example

Hello readers, in today’s blog I will be looking at predicting the Formula 1 grid using the tidymodels collection of R packages. The idea is to use data from the Friday practice sessions to give an idea of what the grid is expected to be for the race on Sunday, before qualifying on Saturday. […]

Twenty20 Win Probability Added

Hello readers, welcome to today’s blog. I am going to implement a win probability added model for Twenty20 cricket. Now this is nothing new: a quick Google search turns up many sources for it. CricViz is possibly the most famous version, which you may have seen on the app. The idea is the model […]

Predicting Qualifying — 2

In the last blog I outlined creating a model which predicts the fastest time for each driver in F1 qualifying (theparttimeanalyst.com/2019/07/10/predicting-f1-qualifying/). Today I am going to be dissecting the model to understand its strengths and weaknesses and to see if there is any bias within the model. First let’s look at the importance matrix. The […]

Predicting F1 Qualifying

Hello, welcome to this blog. A few days ago I tweeted the graph below. It was the output from the model I have created, which predicts the qualifying time for each driver. I will get into the review of the model’s outputs in the next blog, but today I’m going […]

Replacing Nikita 1

Today I have a challenge to go through. It was run by FC Rstats; however, the submission was a few weeks ago, I did it in a rush, and I don’t think it was my best work. Therefore this is a rehash of my submission, so it could end up with different results. The challenge is simple […]

Tidy Tuesday Board Games – XGBoost Model

Hello, today we are going to ask a question and try to answer it with an analysis of data. I want to create a board game. In order to do that, I want to understand what makes one board game more highly rated than another. I want to create a popular board game, after all! To do […]

The Next Chris Gayle

Hello, welcome to today’s data adventure where we are going to be scouting for the next Chris Gayle. I am going to be using K-means clustering in order to achieve this. The first question is what numbers am I going to use for this? Previously I detailed the creation of several metrics with which to […]

Predicting Future Twenty20 Batting Performance

Hello, welcome to the next R stats adventure. I am going to be looking at developing a model which can be used to scout batsmen of the future in Twenty20 cricket. You may have seen previously (and if you haven’t, go check it out) that I looked at the Second XI County Championship in order […]