Finding the Next James Anderson

Hello, and welcome to today’s blog, in which we will be scouting for the next James Anderson. Possibly. The ideas behind this blog are nothing new; in fact, I have stolen the idea from another sport. The Rangers Report blog wrote a piece about scouting for the best youth players using age-adjusted stats, an idea based on Vollman’s work on ice hockey. I always like to say that a lot of the best ideas are repurposed from other areas! So how am I going to apply it? Well, we are going to take the bowling stats from the Second XI County Championship, age-adjust them and see which bowler aged 21 or under looks the most promising.

[Table image: bowlerdata]

Above are the first 20 rows of data for the 350 bowlers we are going to be looking at. The top wicket-taker is Nijjar of Essex; however, he is 24, so he will not be included in the final list. The first player aged 21 or under on the wicket list is 20-year-old Mike of Leicestershire. So let’s adjust the data and see where we end up.

The first issue is that the players have all bowled different numbers of balls, so I need to put them on the same footing. To do that, I took each bowler’s strike rate over the balls actually bowled and extended it to a full season of, say, 1,500 balls. In the table below, the estwicket column is the number of wickets a bowler would take if his strike rate continued over a full 1,500 balls.

[Table image: estimated wickets (estwicket) for each bowler over 1,500 balls]
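For anyone who wants to try this at home, here is a minimal sketch of the extrapolation step in Python. The function name and the example bowler are my own illustration and are not taken from the data; the only figure from the post is the 1,500-ball season.

```python
FULL_SEASON_BALLS = 1500  # the "full season" workload used in this post


def estimated_wickets(wickets, balls, season_balls=FULL_SEASON_BALLS):
    """Project wickets over a full season, assuming the bowler's
    wickets-per-ball rate so far carries on unchanged."""
    if balls <= 0:
        raise ValueError("balls must be positive")
    # Multiply before dividing so round numbers stay exact.
    return wickets * season_balls / balls


# Hypothetical bowler (not from the data): 20 wickets from 600 balls.
print(estimated_wickets(20, 600))  # 50.0
```

A bowler with very few balls bowled gets a very large multiplier here, which is exactly why the raw estimate cannot be trusted on its own.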

However, a bowler who has taken a lot of wickets from a small number of balls is unlikely to keep up that rate to the end of a season (sorry, Ben Coad). Regression to the mean is a well-established statistical phenomenon, so when we extrapolate performance we need to account for it.

In the Rangers Report blog, they applied 1% regression for every match of projected performance. I can’t do that, as my analysis is based on balls, and in multi-day matches bowlers bowl different numbers of balls. Therefore I decided to apply the same 1% regression for every 100 extrapolated balls.

[Table image: regreewick]

The table shows how much this affects the bowlers on the list, with most seeing a reduction in estimated wickets.
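Here is a rough Python sketch of one way to read the "1% per 100 extrapolated balls" rule. The assumptions are mine: the bowler's own rate is blended towards a league-average wickets-per-ball rate, the blend weight grows by 0.01 for every 100 balls that have to be projected, and only the projected balls are regressed. The league rate of 0.02 and both bowlers are made-up figures for illustration.

```python
def regressed_wickets(wickets, balls, league_rate,
                      season_balls=1500, regress_per_100=0.01):
    """Project season wickets, pulling the projected part of the
    season towards the league-average rate (one possible reading
    of the 1%-per-100-balls rule, not a definitive one)."""
    extra_balls = max(season_balls - balls, 0)   # balls we must extrapolate
    weight = min(regress_per_100 * extra_balls / 100, 1.0)
    own_rate = wickets / balls                   # wickets per ball so far
    blended_rate = (1 - weight) * own_rate + weight * league_rate
    return wickets + extra_balls * blended_rate


# Hypothetical figures: league average of 0.02 wickets per ball.
hot_start = regressed_wickets(20, 600, league_rate=0.02)
slow_start = regressed_wickets(5, 600, league_rate=0.02)
print(round(hot_start, 2))   # about 48.92, down from a raw 50
print(round(slow_start, 2))  # rises above the raw 12.5
```

Under this reading, a bowler above the league rate loses wickets and a bowler below it gains a few, and the more balls that have to be projected, the bigger the pull.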

Now the final step is to adjust the total wickets for each player’s age. The first part is to filter for all players aged 21 and younger, and then apply Vollman’s age curve, shown below.

[Image: volmanage.png]

The age curve is given by year and month; however, I only have each player’s age in years, so I will just be using the numbers in the first column. The graph below shows the results.

[Graph image: bowlers]
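The age step can be sketched in Python along these lines. A big caveat: the factors in AGE_FACTORS are placeholders I have made up purely so the example runs; they are not the values from the age curve, and I am also assuming the curve is applied as a simple multiplier on estimated wickets. The bowler names and numbers are hypothetical too.

```python
# PLACEHOLDER multipliers by age in whole years. These are NOT the
# real age-curve values; swap in the first column of the curve.
AGE_FACTORS = {17: 1.30, 18: 1.20, 19: 1.12, 20: 1.06, 21: 1.00}


def age_adjust(bowlers, factors=AGE_FACTORS, max_age=21):
    """Keep bowlers aged max_age or under and scale their estimated
    wickets by the factor for their age, best first."""
    adjusted = []
    for b in bowlers:
        if b["age"] > max_age or b["age"] not in factors:
            continue  # too old, or no factor available for this age
        adjusted.append({**b, "adj_wickets": b["est_wickets"] * factors[b["age"]]})
    return sorted(adjusted, key=lambda b: b["adj_wickets"], reverse=True)


# Hypothetical bowlers, not taken from the data.
sample = [
    {"name": "Bowler A", "age": 24, "est_wickets": 60.0},
    {"name": "Bowler B", "age": 20, "est_wickets": 40.0},
    {"name": "Bowler C", "age": 18, "est_wickets": 38.0},
]
for b in age_adjust(sample):
    print(b["name"], round(b["adj_wickets"], 1))
```

With these placeholder factors the 18-year-old jumps ahead of the 20-year-old despite fewer estimated wickets, which is the whole point of age-adjusting.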

Based on this method, Szymanski comes out as the best young bowler of the Second XI County Championship. However, the large dot means we have had to extrapolate a lot for Szymanski. Another interesting point is that 3 of the top 5 estimated wicket-takers are left-arm spinners. The England team is badly missing a reliable spinner, particularly away from home. Could one of the 3 be the future England spinner?

There is lots more work that can be done with this data, such as looking back historically at how these numbers relate to future County Championship averages. A similar model could also be applied to batsmen, which will be detailed in a future blog. Let me know your thoughts: have you seen Szymanski bowl? Ideally, I would have preferred to include younger players, but I think they play in the Under-17 County Championship.

 

 
