Twenty20 Data Exploratory Data Analysis 1

Hello, and welcome to today's blog, in which I am going to do some exploratory data analysis on Twenty20 cricket data. You may have seen my earlier blogs looking at Twenty20 batting metrics. Those were all calculated as they stand, but it's likely they are affected by the state of the match, the series and the location the match is being played at. The aim of this blog is to explore some of these trends so I can further refine the metrics.

First things first: the data. I can't just do this kind of exploratory analysis on the IPL data I previously had from Kaggle. Therefore I would like to say a huge thank you to whiteballanalytics for providing some of their data for free. Check it out at the link below:


https://www.whiteballanalytics.com/articles/2018/6/23/ball-by-ball-data

That data set contains ball-by-ball data from 7 Twenty20 competitions around the world and over 200 matches, so it's a comprehensive data set for this type of analysis. My first question is: how does strike rate vary across a Twenty20 game?

First things first: strike rates generally increase as the innings goes on. Bowling earlier is therefore good for a bowler's economy rate, while players batting earlier will often have a lower strike rate. There is also a significant drop in strike rates after the end of the power play. Is this an area where teams could increase their scores? If that drop is flattened, or the strike rate continues to increase, that's significantly more runs scored, so it could be an area to exploit. Another observation is that in the second innings strike rates are often higher in the early part of the innings. Is that because batsmen are more willing to take risks when they know they are chasing?
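For anyone wanting to reproduce this kind of view, here is a minimal sketch of how strike rate per over and innings could be worked out from ball-by-ball records. It is not the exact code behind my charts, and the field names ('innings', 'over', 'runs_off_bat', 'is_wide') are hypothetical, so they will need mapping to whatever the real data uses. Strike rate is taken as runs off the bat per 100 balls faced, with wides left out because they don't count as a ball faced.

```python
from collections import defaultdict


def strike_rate_by_over(balls):
    """Return {(innings, over): strike rate} from ball-by-ball records.

    Each record is a dict with the hypothetical keys 'innings', 'over',
    'runs_off_bat' and 'is_wide'. These names are assumptions, not the
    real column names of any particular data set.
    """
    runs = defaultdict(int)
    faced = defaultdict(int)
    for ball in balls:
        if ball["is_wide"]:
            # A wide is not a ball faced, so it is skipped entirely.
            continue
        key = (ball["innings"], ball["over"])
        runs[key] += ball["runs_off_bat"]
        faced[key] += 1
    # Strike rate = runs per 100 balls faced.
    return {key: 100.0 * runs[key] / faced[key] for key in faced}


# Made-up deliveries purely to show the shape of the input.
sample = [
    {"innings": 1, "over": 1, "runs_off_bat": 0, "is_wide": False},
    {"innings": 1, "over": 1, "runs_off_bat": 4, "is_wide": False},
    {"innings": 1, "over": 1, "runs_off_bat": 0, "is_wide": True},
    {"innings": 1, "over": 7, "runs_off_bat": 1, "is_wide": False},
    {"innings": 2, "over": 1, "runs_off_bat": 6, "is_wide": False},
]
rates = strike_rate_by_over(sample)
```

Plotting the resulting values against over number, one line per innings, gives the kind of curve discussed above.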

Comparing the strike rates by competition, they all follow the same trends we have seen previously: a general increase, but with a drop in strike rates after the power play. You can also see the big difference in strike rates between competitions, with the New Zealand competition having significantly higher strike rates than the other series. Also of note is the significant increase in strike rate in the PSL. It looks to have the highest strike rates at the death, with its average ball-by-ball strike rate going from the bottom to the middle of the pack.
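One rough way to put numbers on the competition comparison is to split each innings into phases and work out a strike rate per competition and phase. The sketch below assumes the power play is overs 1 to 6; the split of the remaining overs into "middle" (7 to 15) and "death" (16 to 20) is my own assumption, and the competition labels and deliveries are made up for illustration.

```python
from collections import defaultdict


def phase_of(over):
    """Map an over number (1-20) to a phase of the innings.

    Overs 1-6 are the power play. The middle/death boundary used here
    is an assumed convention, not something taken from the data.
    """
    if over <= 6:
        return "powerplay"
    if over <= 15:
        return "middle"
    return "death"


def strike_rate_by_phase(balls):
    """Return {(competition, phase): strike rate}.

    `balls` is an iterable of (competition, over, runs_off_bat) tuples,
    one per legal ball faced (a hypothetical layout).
    """
    totals = defaultdict(lambda: [0, 0])  # [runs, balls faced]
    for competition, over, runs in balls:
        entry = totals[(competition, phase_of(over))]
        entry[0] += runs
        entry[1] += 1
    return {key: 100.0 * r / b for key, (r, b) in totals.items()}


# Made-up deliveries for two placeholder competitions.
sample_balls = [
    ("comp_a", 1, 4), ("comp_a", 2, 0),
    ("comp_a", 8, 1), ("comp_a", 9, 1),
    ("comp_a", 19, 6), ("comp_a", 20, 2),
    ("comp_b", 3, 1), ("comp_b", 18, 4),
]
phase_rates = strike_rate_by_phase(sample_balls)
```

Comparing the "powerplay" and "middle" values for a competition gives a quick measure of how big the post-power-play dip is.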

Let's just check whether the innings trend is present across all the competitions:

Yes it is. As previously seen, all the competitions have an overall upward trend throughout an innings. They also all see a dip in strike rate after the power play, and chasing teams generally start off at higher strike rates.

Finally, let's look at the effect of the ground the game is played on. There are 111 grounds in the dataset; below is a summary:

Above are the twenty grounds with the highest strike rates. What's apparent is that there are a lot of English grounds from the Twenty20 Cup. This could be down to small boundary dimensions and/or good wickets.

The twenty grounds with the lowest strike rates are also above. There are a few Caribbean grounds in there, as well as a few grounds which have hosted the IPL. Therefore a batsman batting at Trent Bridge can't be directly compared with one batting at the Sir Vivian Richards Stadium, and their stats must be adjusted accordingly.
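I haven't settled on how to do that adjustment yet, but one simple possibility is a "ground factor": the ground's strike rate divided by the overall strike rate, with a batsman's raw strike rate then divided by that factor. The sketch below is only an illustration of that idea under my own assumptions; the function names, ground labels and totals are all hypothetical.

```python
def ground_factors(ground_totals):
    """Return {ground: factor} from {ground: (runs, balls_faced)}.

    A factor above 1 means the ground scores faster than the overall
    average, and a factor below 1 means it scores slower.
    """
    total_runs = sum(runs for runs, _ in ground_totals.values())
    total_balls = sum(balls for _, balls in ground_totals.values())
    overall = total_runs / total_balls
    return {
        ground: (runs / balls) / overall
        for ground, (runs, balls) in ground_totals.items()
    }


def adjusted_strike_rate(raw_strike_rate, factor):
    """Scale a raw strike rate to what it might be at an average ground."""
    return raw_strike_rate / factor


# Made-up totals for two placeholder grounds.
totals = {"ground_a": (150, 100), "ground_b": (100, 100)}
factors = ground_factors(totals)
fast_ground = adjusted_strike_rate(150.0, factors["ground_a"])
slow_ground = adjusted_strike_rate(100.0, factors["ground_b"])
```

In this toy example a strike rate of 150 at the fast ground and 100 at the slow ground both come out at about 125 after adjustment. A real version would also need to deal with grounds that have only hosted a handful of matches, where the factor would be very noisy.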

That's it for part one. I plan to continue this for the next few weeks and see what else I can find. I will then use this knowledge to rationalise my metrics. If you have any comments or questions, let me know.
