Hello, and welcome to the second part of this blog doing exploratory data analysis on the bike town dataset. If you haven’t read the first part, go check it out. As a quick overview, we found that most of the records belonged either to subscribers to the system or to casual users who might just use it every now and then, so I decided to compare those two groups. In the smaller sample we saw that most casual users rode for recreation while subscribers mostly rode for commuting. Today we are going to look at the distances and speeds the groups travel and where they rent the bikes from, but first we are going to look at when they rent the bikes:
This graph supports the idea that subscribers tend to rent the bikes for commuting, as there are two large spikes at around 8am and around 5pm, the peak hours for people going to and from work. Casual users don’t tend to use the system in the morning; however, there’s consistent usage throughout the afternoon. This could be tourists exploring the city, for instance. It’s strange that neither group reaches its minimum until well after 2am. Are these people using the system to get home from their nights out? Beware of the drunk Portland cyclist!
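If you want to reproduce this kind of hourly comparison yourself, here is a minimal sketch in plain Python. The sample records, the `rider_type` and `start_time` field names and the `rentals_per_hour` helper are all made up for illustration, so the real dataset's columns may well be named differently.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical sample records; the field names are assumptions, not the real schema.
trips = [
    {"rider_type": "Subscriber", "start_time": "2018-05-01 08:05"},
    {"rider_type": "Subscriber", "start_time": "2018-05-01 08:40"},
    {"rider_type": "Subscriber", "start_time": "2018-05-01 17:10"},
    {"rider_type": "Casual", "start_time": "2018-05-01 14:20"},
    {"rider_type": "Casual", "start_time": "2018-05-01 15:45"},
]


def rentals_per_hour(records):
    """Count rentals by start hour (0-23) for each rider type."""
    counts = defaultdict(Counter)
    for rec in records:
        hour = datetime.strptime(rec["start_time"], "%Y-%m-%d %H:%M").hour
        counts[rec["rider_type"]][hour] += 1
    return counts


hourly = rentals_per_hour(trips)
for rider_type, by_hour in hourly.items():
    print(rider_type, sorted(by_hour.items()))
```

Plotting each group's counts against the hour of the day would give curves like the ones described above.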
The density plot of the distances the two groups travel is interesting; however, I think it does fit the pattern so far. Subscribers tend to make shorter journeys, perhaps because they are covering the last few miles to work, whereas the casual users are maybe exploring the city and therefore cover more distance.
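A density plot needs a plotting library, so as a rough stand-in this sketch just compares the median distance per group, which is a much cruder summary than the full distribution. The data and the `median_distance_by_group` helper are invented for illustration.

```python
from statistics import median

# Hypothetical (rider_type, distance in miles) pairs, invented for illustration.
trips = [
    ("Subscriber", 0.8), ("Subscriber", 1.2), ("Subscriber", 1.5),
    ("Casual", 2.1), ("Casual", 3.4), ("Casual", 2.8),
]


def median_distance_by_group(records):
    """Return {rider_type: median distance in miles}."""
    groups = {}
    for rider_type, miles in records:
        groups.setdefault(rider_type, []).append(miles)
    return {rider_type: median(values) for rider_type, values in groups.items()}


medians = median_distance_by_group(trips)
print(medians)
```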
Now let’s check out the speed curves for both groups. For both, the further the person travelled, the slower the speed. The subscribers are generally the much faster riders at all distances. The increase in speed towards the 25-mile distance is, I think, down to there being less data at that range. At the lower distances the subscribers have significantly faster speeds, perhaps because they are using the system to get from A to B as quickly as possible.
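Speed here is just distance divided by time, so a curve like this can be approximated by averaging speed within distance bins. The sketch below assumes each trip has a distance in miles and a duration in minutes. The sample trips, the 5-mile bin width and the helper names are my own illustrative choices, not something taken from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trips: (rider_type, distance in miles, duration in minutes).
trips = [
    ("Subscriber", 1.0, 6.0),
    ("Subscriber", 1.5, 10.0),
    ("Subscriber", 6.0, 45.0),
    ("Casual", 1.0, 12.0),
    ("Casual", 6.0, 90.0),
]


def speed_mph(miles, minutes):
    """Average speed in miles per hour."""
    return miles / (minutes / 60.0)


def mean_speed_by_bin(records, bin_width=5.0):
    """Average speed per (rider_type, distance bin start) pair."""
    speeds = defaultdict(list)
    for rider_type, miles, minutes in records:
        if minutes <= 0:
            continue  # skip records with no usable duration
        bin_start = int(miles // bin_width) * bin_width
        speeds[(rider_type, bin_start)].append(speed_mph(miles, minutes))
    return {key: mean(values) for key, values in speeds.items()}


binned = mean_speed_by_bin(trips)
for key in sorted(binned):
    print(key, round(binned[key], 2))
```

Counting the trips in each bin as well would show how little data sits behind the long-distance end of the curve.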
Now let’s look at the start and end locations for the casual riders. The heat map above shows that most riders are clustered around the city centre, possibly moving from one tourist spot to another. There seems to be a fairly even distribution across the centre locations.
The subscriber heat map above shows that, generally, subscribers pick up bikes from much further out than the casual riders, who are possibly just using them to get to the tourist hot spots. Once again most of the usage is on the left side of the river, which must be the area where people like to get about. Also, the start locations around the outside are much denser, so it seems clear people are using the system from further out to go into the centre of the city.
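Underneath, a heat map like these is a count of points per map cell. Here is a minimal sketch of that binning step, assuming start points are available as latitude and longitude pairs. The coordinates, the `cell_size` value and the `grid_counts` helper are invented for illustration, and a real heat map would usually smooth these counts before drawing them on a map.

```python
from collections import Counter

# Hypothetical start points as (latitude, longitude); the values are invented.
starts = [
    (45.5231, -122.6765),
    (45.5234, -122.6768),
    (45.5312, -122.6590),
]


def grid_counts(points, cell_size=0.005):
    """Count points per grid cell; a cell is identified by integer (row, col)."""
    counts = Counter()
    for lat, lon in points:
        cell = (int(lat // cell_size), int(lon // cell_size))
        counts[cell] += 1
    return counts


cells = grid_counts(starts)
print(cells.most_common(1))
```

Running this separately for start and end points, and for each rider group, gives the raw counts behind the four maps.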
That’s it for this exploratory data analysis. I think we have found some interesting insights and, at the minimum, have been able to confirm what you would expect. I hope you have found this interesting and informative. Let me know your thoughts, and if you have looked at this dataset yourself, let’s see what you found. Check out the code on my GitHub, which should be linked somewhere on the website.