Kaggle Playground Series – Tidymodels

Hello readers, we are entering another Kaggle playground competition, so get your Yorkshire tea ready and enjoy the process of joining. This month the competition I entered is this one: https://www.kaggle.com/competitions/playground-series-s3e7 It looks like the data covers cancellations from hotels and spoiler alert – I had a lot of fun with this dataset. EDA First, I […]

Kaggle January Playground Series – Tidymodels

Hello, hope you have your Yorkshire tea ready. This is going to be a new series on the blog in which each month I am going to be tackling Kaggle’s monthly playground series. Feel free to find the link to January’s below: https://www.kaggle.com/competitions/playground-series-s3e1 So let’s get started. EDA Above is the structure of the training dataset. […]

Sliced – New York Air BnB

Hello, welcome to today’s blog, in which I am going to go through my attempt at Sliced. If you have never heard of Sliced, it’s competitive data science in which you have 2 hours to create a machine learning model. Catch the show here on Tuesdays, late at night for us Europeans: https://www.twitch.tv/nickwan_datasci One of the recent rounds […]

F1 Drivers Rated – Version 2

Hello, so a year and a half ago I created a new metric for measuring F1 drivers’ performance, based around their performance in the race and their expected performance in the race (see the blog here: F1 Drivers Rated). Since then my laptop BSOD’ed and, me being useless, I never committed the code to […]

F1 2020 – Season So Far and Why Racing Point’s Method of Designing the Car is Controversial

Hello readers, today I’m going to do a little exploration of the data from the F1 2020 season so far, exploring a number of questions about the season. First of all, looking at qualifying and why a lot of teams are annoyed by (t)Racing Point and the strategy they have used to […]

Twenty20 Win Probability Added

Hello readers, welcome to today’s blog. I am going to implement a win probability added model for Twenty20 cricket. Now this is nothing new; a quick Google search turns up many sources for it. CricViz is possibly the most famous version, which you may have seen on the app. The idea is the model […]

F1 Drivers Rated

Hello, welcome to today’s blog, in which I’m going to be developing methods to evaluate F1 drivers. Currently there is no real way to tell if an F1 driver is any good. It seems sort of arbitrary how it is decided whether a racing driver is good or not. Being a data fan I […]

Predicting Qualifying — 2

In the last blog I outlined creating a model which predicts the fastest time for each driver in F1 qualifying: theparttimeanalyst.com/2019/07/10/predicting-f1-qualifying/ Today I am going to be dissecting the model to understand its strengths and weaknesses and to see if there is any bias within the model. First let’s look at the importance matrix. The […]

Predicting F1 Qualifying

Hello, welcome to this blog. A few days ago I tweeted the below graph. It was the output from the model I have created, which predicts the qualifying time for each driver. I will get into the review of the outputs of the model in the next blog, but today I’m going […]

The Value of a Wicket

Hello, today I am going to be looking at the value of a wicket. In one-day cricket, be it 20 over or 50 over, you have 2 resources: the number of balls remaining and the number of wickets remaining. The balls remaining influence what risks the batsman takes; however, how much does taking a […]

Sport Data Science

Join the R Stats Adventure Here!

Category: rstats