Cricket Weighted Batting Average in R

Hello, I hope you have your Yorkshire tea ready, as today I am going to be exploring weighted averages using R. I used the code above to generate the table of the top 15 players by batting average in the 2022 County Championship. Now the whole point of this blog is to devise a weighted […]

Predicting Twenty 20 Cricket Result with Tidy Models

Hello, I hope you have your Yorkshire tea to hand and are sitting comfortably, ready to read today’s blog. In it I am going to be doing some machine learning with tidymodels to predict the outcome of some Twenty20 cricket matches. I am using the data from Cricsheet, as used in this blog, and using the win […]

F1 Strategy Analysis

I was recently browsing Reddit and found this AMA from a former Mercedes strategy engineer. The most surprising thing was that most of the race strategy was calculated using VBA in Excel. This isn’t some start-up outfit; this is the mighty Mercedes, winners of the last 7 constructors’ and drivers’ championships. In this blog […]

Finding Undervalued Air Bnb’s

Hello, today I am going to do an EDA (exploratory data analysis) on Airbnb in the New York area. This data set is available on Kaggle (https://www.kaggle.com/dgomonov/new-york-city-airbnb-open-data). Let’s read the data into R and take a look at it. I can see there are 17 columns and over 48,000 records with information covering […]

Replacing Nikita — 3

Hello, welcome to the third and final part of my Replacing Nikita Parris series. If you haven’t caught the first 2 blogs, go check them out before this one, as they take you through the various parts of the process. We identified our 3 best candidates to replace Nikita Parris: Francesca Kirby, Millie Farrow and Vivianne […]

Finding the Next James Anderson

Hello, welcome to today’s blog, in which we will be scouting for the next James Anderson. Possibly. The ideas behind this blog are nothing new; in fact, I have stolen the idea from another sport. The Rangers Report blog wrote a piece about scouting for the best youth players by using age-adjusted stats. This idea […]

Does the Dog Get Adopted?? — P2

Today we are going to be looking into the second part of creating the classification tree for the outcomes of dogs in the Dallas animal shelter. Today it’s the exciting stuff: creating the actual classification tree. If you want to understand how I have prepared the data, go and check out the first blog […]

F1 Circuit Cluster Analysis – 3

Hello and welcome to what was meant to be the final F1 circuit cluster analysis blog. However, I have thought of some ideas to extend it to a fourth, so we shall see how that goes. The idea is that today we will review what we can from the season so far and in the next one look […]

Tidy Tuesday 2 – World Life Expectancy

Hello, welcome to today’s blog, which is going to be my second one covering the Tidy Tuesday dataset. This week’s dataset covers life expectancy for every country in the world since 1950. I decided you could do some cluster analysis on this dataset and then once you have the clusters […]

F1 Circuit Cluster Analysis – Part 2

Hello and welcome to the second part of my mini-series using cluster analysis to categorise Formula 1 circuits. Please go and check the first part; it outlines the basic data we are using to categorise the circuits and gives an overview of the method used for hierarchical clustering. Today we are going to go with […]

Sport Data Science

Tag: analytics