Tidy Tuesday – Films Dataset

Hello, today we are going to be looking at this week's Tidy Tuesday dataset. This is just a quick EDA, as I got a bit carried away with the dataset: I initially set out to do just one interesting graph but kept finding more and more interesting insights. Below you can see the structure of the dataset:

[Figure: structure of the dataset]

As you can see, it has 3,401 observations of 9 variables, all different films. First of all, let's look at how the production budgets are distributed with a histogram:

[Figure: histogram of production budgets]

We can see that the vast majority of films in this dataset have a budget of less than 25 million dollars. But how does that change for each genre in the dataset?

[Figure: histograms of production budget by genre]

Now you can see some interesting insights. Comedy, drama and horror films have clear peaks at the lower end of the budget range, while action and adventure films are spread much more evenly across all production budgets.
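For anyone following along in R, a faceted histogram like the one above takes only a few lines of ggplot2. This is a minimal sketch; the data frame name `films` and the column names `production_budget` and `genre` are my assumptions about the cleaned data.

```r
library(ggplot2)

# Production budget histograms, one panel per genre
# (`films`, `production_budget` and `genre` are assumed names)
ggplot(films, aes(x = production_budget / 1e6)) +
  geom_histogram(binwidth = 25) +
  facet_wrap(~ genre) +
  labs(x = "Production budget ($ million)", y = "Number of films")
```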

[Figure: production budget vs gross by genre]

Now let's look at how the production budget influences how much the film grosses. Action and adventure have the steepest slopes, so on average the more money put into these films, the higher the reward.
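A scatter plot with a straight-line fit per genre is one way to get those slopes. Again a sketch only, assuming a `worldwide_gross` column alongside the names used above.

```r
library(ggplot2)

# Budget vs gross with a linear trend per genre (`worldwide_gross` is an assumed column)
ggplot(films, aes(x = production_budget / 1e6, y = worldwide_gross / 1e6, colour = genre)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "Production budget ($ million)", y = "Worldwide gross ($ million)")
```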

[Figure: release dates by genre]

Above you can see when each film is released during the year. Notice the peak for horror in the tenth month: Halloween. Drama seems to increase toward the end of the year, in time for Oscar season, and adventure has a peak in July for the summer blockbusters and another in December, perhaps aiming for the holiday season.

[Figure: median profit percentage per month by genre]

Finally, for now, let's look at the median profit percentage per month; can we get any idea of when it's best to release a particular genre? Action, adventure and comedy seem to have two peaks: one in the middle of the year and one towards the end. Horror generally has the highest median earnings. This dataset is a simple one, but one in which insights are easy to come by. I could definitely write at least another blog with more information I have found from this dataset. Another day perhaps.
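The monthly medians above come down to a group-and-summarise. A rough dplyr sketch, where the profit definition and the column names are my assumptions:

```r
library(dplyr)
library(lubridate)

# Median profit percentage per release month and genre
# (profit definition and column names are assumptions)
films %>%
  mutate(
    month      = month(release_date, label = TRUE),
    profit_pct = 100 * (worldwide_gross - production_budget) / production_budget
  ) %>%
  group_by(genre, month) %>%
  summarise(median_profit_pct = median(profit_pct, na.rm = TRUE))
```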


Biketown EDA – P2

Hello, welcome to the second part of this blog doing exploratory data analysis on the Biketown dataset. If you haven't read the first one, go check it out. As an overview, we found that most of the records were either subscribers to the system or casual users who might just use it every now and then, so I decided to compare those two groups. We saw in the smaller sample that most of the casual users used the bikes for recreation and the subscribers used them mostly for commuting. Today we are going to look at the distances and speeds the groups travel and where they rent the bikes from, but first we are going to look at when they rent the bikes:

[Figure: rentals by hour of day for each payment plan]

This graph totally fits the idea that subscribers tend to rent the bikes for commuting, as there are two large spikes at around 8am and around 5pm, the peak hours for people going to and from work. Casual users don't tend to use the system in the morning; however, there's consistent usage throughout the afternoon. This could be tourists exploring the city, for instance. It's strange that both groups don't reach their minimum until well after 2am. Are these people using the system to get home from their nights out? Beware of the drunk Portland cyclist!
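The hourly breakdown is essentially a count by hour and payment plan. A minimal sketch, assuming a `trips` data frame with `StartTime` stored as an "HH:MM" string and a `PaymentPlan` column:

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Rentals per hour of day, split by payment plan
# (`trips`, `StartTime` and `PaymentPlan` are assumed names/formats)
trips %>%
  mutate(start_hour = hour(hm(StartTime))) %>%
  count(PaymentPlan, start_hour) %>%
  ggplot(aes(x = start_hour, y = n, colour = PaymentPlan)) +
  geom_line() +
  labs(x = "Hour of day", y = "Number of rentals")
```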

[Figure: density plot of trip distances by payment plan]

The density plot of the distances the two groups travel is interesting, but I think it does fit the pattern so far. Subscribers are looking to do shorter journeys because they might only be covering the last few miles to work, whereas the casual users are maybe exploring the city and therefore cover more distance.

[Figure: speed vs distance curves by payment plan]

Now let's check out the speed curves for both groups: the further a person travels, the slower their average speed. Subscribers are clearly the faster riders at all distances. The increase in speed towards the 25-mile mark is, I think, down to the small amount of data out there. At shorter distances subscribers are significantly faster; are they using the system to get from A to B as quickly as possible?
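One way to sketch the speed curves is to derive a speed from distance and duration and let geom_smooth draw one curve per group. The column names and units (miles and minutes) are assumptions:

```r
library(dplyr)
library(ggplot2)

# Average speed against trip distance, one smoothed curve per payment plan
# (assumes Distance_Miles in miles and Duration in minutes; names are assumptions)
trips %>%
  mutate(speed_mph = Distance_Miles / (Duration / 60)) %>%
  ggplot(aes(x = Distance_Miles, y = speed_mph, colour = PaymentPlan)) +
  geom_smooth() +
  labs(x = "Trip distance (miles)", y = "Average speed (mph)")
```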

[Figure: heat map of casual riders' start locations]

Now let's look at the start and end locations for the casual riders. The heat map above shows that most trips are clustered around the city centre, possibly moving from one tourist spot to another. The distribution across the central locations is fairly even.

[Figure: heat map of subscribers' start locations]

The subscriber heat map above shows that subscribers generally take the bikes from much further out than the casual riders, who are possibly just using the system to get between the tourist hot spots. Once again most of the usage is on the left side of the river, which must be the area where people like to get about. The start locations further out are also much denser, so it's clear people are using the system to ride in from the outskirts towards the centre of the city.
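One way to sketch a heat map like these is a 2-D density layer over the start coordinates; the column names below are assumptions and no basemap is drawn here.

```r
library(dplyr)
library(ggplot2)

# 2-D density "heat map" of start locations for subscribers
# (StartLongitude / StartLatitude are assumed column names)
trips %>%
  filter(PaymentPlan == "Subscriber") %>%
  ggplot(aes(x = StartLongitude, y = StartLatitude)) +
  stat_density_2d(aes(fill = after_stat(level)), geom = "polygon", alpha = 0.4) +
  coord_quickmap() +
  labs(x = "Longitude", y = "Latitude", fill = "Density")
```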

That's it for this exploratory data analysis. In this blog I think we have found some interesting insights and, at a minimum, been able to confirm what you would expect. I hope you have found this interesting and informative; let me know your thoughts, and if you have looked at this dataset yourself, let's see what you found. Check out the code on my GitHub, which should be linked somewhere on the website.

Nike Biketown – Exploratory Data Analysis

Hello, welcome to today's blog, in which we are going to take a large dataset and do some exploratory data analysis on it. I am going to look at the Biketown dataset, which featured on Tidy Tuesday. If you're new around here, Tidy Tuesday is a hashtag on Twitter which the R for Data Science online learning community actively promotes every Tuesday. If you're inspired to learn R and data science like I was, it is a really great community full of wonderful people to start with. I am not going to post code snippets within the blog as I think it gets too long; however, the full code used will be posted on my GitHub.

[Figure: structure of the data frame]

Above is the structure of the data frame. The data comes in numerous CSV files, so I read them all in and created one large data frame structured like so. The second column, the payment plan, seems an interesting one: it has three values, casual, subscriber and blank. The system in Portland has a way for a regular user to automate payments to save time. Let's look at how much of the dataset falls under each of the three payment types:
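Reading all the files and stacking them into one data frame is a one-liner with purrr. A sketch, where the folder path is an assumption:

```r
library(readr)
library(purrr)

# Read every CSV in the download folder and stack them into one data frame
# (the folder path is an assumption)
csv_files <- list.files("data/biketown", pattern = "\\.csv$", full.names = TRUE)
trips <- map_dfr(csv_files, read_csv)
```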

[Figure: count of records by payment plan]

As you can see, the vast majority of this dataset is either casual or subscriber, and I think it would be interesting to review the differences between the people on the two main payment plans. Therefore, going forward in the EDA we are going to remove the entries without a recorded payment plan. After this, we could possibly look at a method for working out which payment plan the blanks belong to. First things first, let's have a look at what type of trips either group takes:

[Figures: trip type counts for casual and subscriber users]

The big issue here is that drawing any conclusion on the type of trips each group takes is going to be difficult: each group has over 200 thousand entries, but fewer than 1,000 trips per group have a recorded trip type. What we can say is that, within this smaller sample, subscribers tend to use the system for commuting and casual users use it for recreation, which makes total sense.

[Figure: payment methods used by each group]

Now we look at the payment methods that both groups have used. By far the three main payment types for both groups are keypad, mobile and keypad_rfid_card, with subtle differences between the two groups. The RFID card is clearly higher among subscribers, which must be because subscribers are given a card in order to gain access to the bikes. Casual users, meanwhile, are much more likely to use their mobile to gain access. In both groups the vast majority use the keypad system.

That's it for part 1 of today's exploratory data analysis on the Biketown data. Tomorrow we will look at the distances the groups travel, as well as locations and usage times. Let me know your comments on the first part.

Does the Dog Get Adopted?? — P2

Today we are going to look at the second part of creating the classification tree for the outcomes of dogs in the Dallas animal shelter. Today it's the exciting stuff: creating the actual classification tree. If you want to understand how I prepared the data, go and check out the first blog, where I go into the data preparation in detail.

As previously mentioned, we are using the classification tree method, and the columns we will initially base the prediction on are intake_type, intake_condition, chip_status, animal_origin and pedigree. I am using the rpart package to create the classification trees and will split the data so there's 75% to train the tree on and 25% to test it on.
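The split and the first model can be set up in a few lines. This is a sketch of the approach described, not necessarily the exact code in the screenshot below; the data frame name `dogs` and the seed are assumptions.

```r
library(rpart)

# 75% / 25% train-test split (`dogs` and the seed are assumed)
set.seed(123)
train_rows <- sample(nrow(dogs), size = floor(0.75 * nrow(dogs)))
dogs_train <- dogs[train_rows, ]
dogs_test  <- dogs[-train_rows, ]

# First classification tree on the intake columns
tree_1 <- rpart(outcome_type ~ intake_type + intake_condition + chip_status +
                  animal_origin + pedigree,
                data = dogs_train, method = "class")
```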

[Code screenshot: fitting the first classification tree]

[Figure: first classification tree]

Above you can see the code and resulting classification tree for the first model. One thing immediately obvious is that it's highly complex and possibly overfitted to the data. Let's check how this tree performs:

[Figure: confusion matrix and accuracy of the first model]

Well, that’s not great at all. This classification tree seems to be barely better than random chance! This really isn’t ideal and means currently the model is pretty much worthless. Let’s have a look at what we can do to improve this.

The first thing I am going to do is look at the intake_condition column.

[Figure: counts of each intake_condition category]

There are 7 different categories within this column; however, I think these can be simplified into Healthy, Treatable and Unhealthy. So let's do this and check the results:
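Collapsing the factor levels is straightforward with forcats. A sketch only: the original level names used below are illustrative assumptions, not the real labels in the Dallas data.

```r
library(dplyr)
library(forcats)

# Collapse the seven intake conditions into three broader groups
# (the original level names below are assumptions)
dogs <- dogs %>%
  mutate(intake_condition = fct_collapse(
    intake_condition,
    Healthy   = c("HEALTHY"),
    Treatable = c("TREATABLE MANAGEABLE", "TREATABLE REHABILITABLE"),
    Unhealthy = c("UNHEALTHY UNTREATABLE", "SICK", "INJURED", "CRITICAL")
  ))
```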

[Figure: second classification tree]

Success, a much simpler tree. However:

[Figure: confusion matrix and accuracy of the second model]

The accuracy of the model has gone down! It is now less accurate than random chance; I am actually just wasting my time here. Let's take a step back and look at the table which shows the predictions against the actual outcomes. As you can see, the model is currently predicting lots of dogs which died or were not adopted as being adopted.
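For reference, the confusion table and accuracy come from predicting on the held-out set; `tree_2` is an assumed name for the refitted model.

```r
# Performance of the second tree on the test set (`tree_2` is an assumed name)
pred_2 <- predict(tree_2, newdata = dogs_test, type = "class")

# Confusion matrix: predicted vs actual outcome
table(predicted = pred_2, actual = dogs_test$outcome_type)

# Overall accuracy
mean(pred_2 == dogs_test$outcome_type)
```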

[Figure: column composition for dogs predicted to be adopted that actually died]

I reviewed the composition of each column in the data frame after filtering for dogs predicted to be adopted whose actual outcome was that they died. The biggest difference was seen in the condition column. Apparently, a lot of dogs that died were classed as treatable; how can that be?

[Figure: outcomes of dogs classed as treatable]

I took a step back, went back to the original dataset and filtered for the dogs which are treatable, but made no change to the outcome_type column, as I guessed this could be where the problem was. The above graph looks at the outcomes of the dogs classed as treatable. A lot of dogs are clearly euthanized, which is possibly where the confusion is coming from: these are classed as having died, whereas normal logic says you would expect a treatable dog to survive. This is interesting, as it highlights how the decisions you make at the start of any analysis can affect it later on. The next question is whether there is another column in the data frame that can be used to identify euthanization.

[Figure: kennel_status for euthanized dogs]

I think I found it with the kennel_status column. By far the most common kennel for the euthanized dogs to go in is the lab. Therefore we are going to add kennel_status to the analysis and see where it goes:

[Figure: third classification tree]


[Figure: confusion matrix and accuracy of the third model]

Success: the model is now much better at predicting the outcome for each dog at the shelter. However, the classification tree is back to being over-complex and could possibly overfit the training data. Next, I will see if this tree can be pruned.

[Figure: complexity parameter plot for the third model]

Above you can see the complexity plot for the overly complex classification tree. The tree isn't much improved once you go beyond 4-7 levels and a complexity parameter of around 0.00075. This pruning can be done either before or after the model is created. Here I am going to prune before running the model, so I am going to fit the fourth and hopefully final version of the model.
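The complexity plot itself comes from rpart's built-in helpers, and pruning after the fact is also possible; `tree_3` is an assumed name for the third model.

```r
library(rpart)

# Complexity parameter table and plot for the overgrown tree (`tree_3` is an assumed name)
printcp(tree_3)
plotcp(tree_3)

# Post-hoc alternative: cut the already-fitted tree back at the chosen cp
tree_3_pruned <- prune(tree_3, cp = 0.00075)
```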

[Figure: final pruned classification tree]

[Code screenshot: final model with rpart.control]

Above you can see the final classification tree and the code used to create it. In my call to rpart, I have used the control argument, limiting the complexity parameter to 0.00075 based on the complexity plot and the maximum depth to 5. This has produced a much less complex tree, and performance was similar to the previous, complex tree.
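A sketch of what that final call might look like, following the description above (variable names are assumptions carried over from the earlier sketches):

```r
library(rpart)
library(rpart.plot)

# Fourth model: the intake columns plus kennel_status, with complexity and depth limited up front
tree_4 <- rpart(outcome_type ~ intake_type + intake_condition + chip_status +
                  animal_origin + pedigree + kennel_status,
                data = dogs_train, method = "class",
                control = rpart.control(cp = 0.00075, maxdepth = 5))

rpart.plot(tree_4)
```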

This could be further developed with more data: does the sex of the dog have an effect on the results, or the size or type of dog? Small dogs could be more likely to be adopted, and certain types could be more likely to be euthanized. This could also be built on further and a random forest model created. Thanks for reading, and well done if you got to the end; it's a bit longer than what I normally aim for. Please let me know your thoughts, or if there is anything I have missed or could have included.

Does the Dog Get Adopted?? — P1

Hello, welcome to the next blog. I was inspired by this week's Tidy Tuesday dataset. I'm sure I have said this before, but if you want to learn rstats it's a great resource, with a weekly dataset to practice your burgeoning skills on. This week's data was from the Dallas open data project, and the particular dataset was from the Dallas animal shelter. I thought: wouldn't it be great to create a model which, based on the information about an animal when it arrives at the shelter, could predict what might happen to it?

[Figure: structure of the data frame]

Above is the structure of the data frame. The first thing to notice is that a number of different animals are logged in the dataset. Creating a model for the five different types could be quite complicated, so I am going to focus on dogs; out of the 35,000 or so observations, I think dogs will make up most of them anyway. The model type I think is most suited to this problem is a classification tree. A classification tree works by building a network of yes/no splits with the various outcomes at the end. It works well when you have lots of factor variables, which this dataset is full of.

Now we need to select the columns the model is going to be based on, summarised below:

animal_breed – identifies the type of dog. I think this is key information; some breeds of dog are more likely to be adopted than others.

intake_type – how the dog arrived at the shelter. There will clearly be an effect on the dog's outcome.

intake_condition – the dog's condition when it arrived. An unhealthy dog is possibly less likely to be adopted.

chip_status – whether the dog had a microchip. Dogs with microchips are more likely to be reunited with their owners.

animal_origin – where the animal was found, or how it came to the shelter.

outcome_type – finally, the most important column, as this is what we will be predicting.

Now we have our columns selected, we need to prepare the data. The first thing we will look at is the outcome column; I wanted to make sure there are not too many outcomes the prediction is based on. Looking back at the structure of the column, there are 12 separate outcomes, which is far too many, so let's see if we can group some together. Below is a summary of the different values of the outcome_type column, and I think there is definitely scope to group some of them.

[Figure: summary of outcome_type values]

Dead on arrival should be excluded: if the dog is dead on arrival, that is the outcome and there's nothing to predict. Died and euthanized I am going to group into a single died outcome, as predicting how the dog died is beyond the scope of this prediction. Foster, transfer and other will be grouped under unadopted. The remaining outcomes will then be filtered out.

The final thing in this opening blog of preparing the data is the animal_breed column. There are over 100 different dog breeds in this column, which would be impossible for the classification tree. On closer inspection, the column consists of either individual dog breeds or mixed, which I assumed means not pedigree. I decided to convert this to a column marking the dog as either pedigree or cross breed. The final data preparation code is therefore below:

[Code screenshot: final data preparation]
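For readers who want a starting point, here is a rough sketch of that preparation in dplyr. The raw column names and level labels (e.g. "ADOPTION", "MIX") are assumptions, not the exact values in the Dallas data.

```r
library(dplyr)
library(stringr)

# Sketch of the data preparation described above (names and labels are assumptions)
dogs <- animals %>%
  filter(animal_type == "DOG", outcome_type != "DEAD ON ARRIVAL") %>%
  mutate(
    outcome_type = case_when(
      outcome_type %in% c("DIED", "EUTHANIZED")          ~ "died",
      outcome_type %in% c("FOSTER", "TRANSFER", "OTHER") ~ "unadopted",
      outcome_type == "ADOPTION"                         ~ "adopted",
      TRUE                                               ~ NA_character_
    ),
    # breeds containing "mix" are treated as cross breed, everything else as pedigree
    pedigree = if_else(str_detect(animal_breed, regex("mix", ignore_case = TRUE)),
                       "cross breed", "pedigree")
  ) %>%
  filter(!is.na(outcome_type)) %>%
  select(intake_type, intake_condition, chip_status, animal_origin, pedigree, outcome_type)
```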

That's it for today's opening blog. Tomorrow we will look at the results of the model and, if required, how I optimise it.

F1 Circuit Cluster Analysis – 3

Hello and welcome to what was meant to be the final F1 circuit cluster analysis blog; however, I have thought of some ideas to extend it to a fourth, so we shall see how that goes. The idea is that today we will review what we can from the season so far and, in the next one, look at some methods for predicting how the rest of the season will pan out. That one might not be until the summer break.

[Figure: season calendar coloured by circuit cluster]

Above is a summary of the season with each circuit coloured by the cluster it belongs to. Circuits have been grouped using hierarchical clustering; please see the other blogs in the series for the method used. The tracks that belong to clusters 1 and 2 are pretty evenly distributed across the season. What is interesting is that the two wildcard tracks, which don't belong to any of the other three clusters, are still to come. Could they prove crucial in the fight for the title?
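As a quick reminder of the kind of clustering step covered in the earlier posts, it looks roughly like the sketch below. The circuit feature columns and the number of clusters here are assumptions; the earlier posts describe the actual choices.

```r
# Hierarchical clustering of circuits on scaled track characteristics
# (feature columns and k are assumptions; see the earlier posts for the real setup)
circuit_features <- scale(circuits[, c("avg_lap_speed", "pct_full_throttle", "slow_corners")])

hc <- hclust(dist(circuit_features), method = "ward.D2")
circuits$cluster <- cutree(hc, k = 4)
```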

[Figure: pace deficit to the fastest car by team and cluster]

Above you can see, for each cluster, the pace difference to the fastest car: 0 is the fastest car in each cluster, up to around 3%, which is the gap to the slowest car. The first thing to take away is that in clusters 1 and 2 Mercedes and Ferrari are neck and neck, with Mercedes slightly, but only slightly, quicker overall. Red Bull get better the more low-speed corners and the fewer straights there are, highlighting the car's engine weakness, with the slow, twisty circuits of cluster 3 being their forte; they must be looking forward to Hungary next. Elsewhere, apart from the top three, one of the big stories is Haas. Their car looks well suited to the fast, flowing circuits of cluster 1 but is the slowest at the stop-start circuits with short straights. That is clearly a car whose strengths are high-speed downforce and engine power. The gap between the top three teams and the rest is pretty consistent across all the clusters.

[Figure: driver pace comparison by cluster]

Finally, we look at how the drivers rank across the different clusters. Some interesting points are apparent. In cluster 1 Hamilton seems to have a clear advantage over the others: it's close, but he's clearly on average faster than the other drivers. There's also a significant difference between Hamilton and his teammate Bottas, suggesting fast, twisty circuits could be Bottas's weakness. Compare that with cluster 2, where Hamilton, Vettel and Bottas are very evenly matched. At Ferrari, Raikkonen is a lot closer to Vettel on the fast, twisty circuits than in cluster 2, which has much larger braking zones and slower corners. The opposite pattern is seen at Red Bull: Ricciardo is a good distance behind Verstappen on the fast, twisty circuits, but is actually slightly faster on the slower circuits. Elsewhere, Alonso has a clear advantage over Vandoorne on both types of circuit.

So that's it for today's blog. I am going to put the R code and the spreadsheet for this on GitHub, so if you have any further ideas for what can be done with this dataset, I'd love to see what you come up with. There will be a fourth part in this series where we look at historical trends and then at forecasting the future.

Tidy Tuesday 2 – World Life Expectancy

Hello, welcome to today's blog, which is going to be my second one covering the Tidy Tuesday dataset. This week it was a dataset with life expectancy for every country in the world since 1950. I decided you could do some cluster analysis on this dataset and then, once you have the clusters, analyse them further to understand trends. We are going to use k-means clustering to group the countries, then look for trends and differences between the clusters. The dataset has country, year (between 1950 and 2015) and the life expectancy for that year. In order to do clustering you need at least two measures, so I created one with the change in life expectancy per year. The other measure is the life expectancy in 2015.

In order to find our value of k, I produced the silhouette plot below. You're meant to use the value of k with the highest average silhouette width, which in this case would be 3. However, with so many different countries I feel that would group the countries up too much. There are further spikes at 6 and 10.

[Figure: average silhouette width for different values of k]
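The silhouette sweep and the final k-means fit can be sketched as below, assuming a two-column feature matrix `life_features` (change in life expectancy per year and life expectancy in 2015, scaled):

```r
library(cluster)

# Average silhouette width for a range of k values (`life_features` is an assumed, scaled matrix)
d <- dist(life_features)
avg_sil <- sapply(2:15, function(k) {
  km <- kmeans(life_features, centers = k, nstart = 25)
  mean(silhouette(km$cluster, d)[, "sil_width"])
})
plot(2:15, avg_sil, type = "b", xlab = "k", ylab = "Average silhouette width")

# Final clustering with the chosen k
set.seed(1)
km_10 <- kmeans(life_features, centers = 10, nstart = 25)
```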

I decided to produce the cluster plot for both k values, 6 and 10. The plot for 10 is below.

[Figure: k-means clusters for k = 10]

10 seems like a good value, as there are not too many clusters to deal with but there is still good variation between the different clusters. We will take k equal to 10 for further analysis.

[Figure: mean share of deaths by cause for each cluster]

The comparison above looks at causes of death; I have grouped the data to get the mean for each cause within each cluster. Conclusions that can be made:

  • Cancer is prevalent across all clusters; however, the higher the life expectancy, the more prevalent it is. This could be because you're more likely to get cancer at older ages.
  • Dementia is another cause which seems to increase with higher life expectancy.
  • HIV is highest in the two lowest life expectancy clusters; the same goes for neonatal deaths.
  • Finally, road accidents are an interesting cause: by far the highest is cluster 7, which seems to be the cluster with the greatest increase in life expectancy over the last 65 years. Could this be because these are fast-developing nations that do not yet have safe road infrastructure in place?


That's it for a little intro into reviewing the data this way. Let me know your thoughts and comments. There are lots of datasets on the World Health Organisation's website, as well as other datasets such as economic growth, that I could add to this analysis to develop it further.