Showing posts with label distribution. Show all posts

Friday, July 15, 2011

Defensive Distribution Analysis

Analyzing the defensive side of the ball is a greater challenge because fewer defensive statistics are available.  Steals, blocks, rebounds and fouls do not tell the whole story of how a player contributes, whether on the ball or helping teammates.  A great on-ball defender doesn't always get steals to reflect his on-ball defense, and a great help defender doesn't always get blocks to reflect his help defense.  Until I can get my hands on better defensive data, this is my effort using the standard, readily available defensive statistics.


Measuring defensive efficiency, or defensive points per possession, using the standard deviations of various defensive statistics produced rather simple results.  Lineup data was used, and standard deviation variables were created for personal fouls, steals, blocks, and defensive rebounds based on the percentage of the lineup's total in each category that each of the five players accounted for.
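As a rough sketch of how such standard deviation variables could be built, the snippet below converts each player's count in a category to a share of the lineup total and takes the standard deviation of the five shares.  The lineup counts and the `share_std` helper are hypothetical, and the population standard deviation is an assumption; the sample standard deviation would serve the same purpose.

```python
from statistics import pstdev

def share_std(player_totals):
    """Population SD of each player's share of the lineup total (hypothetical helper)."""
    total = sum(player_totals)
    if total == 0:
        return 0.0  # no events recorded, so treat the distribution as even
    shares = [value / total for value in player_totals]
    return pstdev(shares)

# Hypothetical five-man lineup: raw counts per player, same player order in every list.
lineup = {
    "personal_fouls": [10, 10, 10, 10, 10],      # perfectly even
    "steals": [12, 2, 2, 2, 2],                  # one player gets most of them
    "blocks": [0, 0, 2, 6, 12],                  # skewed towards the big men
    "defensive_rebounds": [10, 15, 20, 25, 30],  # fairly even
}

# One standard deviation variable per category.
std_variables = {category: share_std(counts) for category, counts in lineup.items()}
for category, value in std_variables.items():
    print(f"{category}: {value:.3f}")
```

A lower value means the category is spread more evenly across the five players; a higher value means it is concentrated in one or two of them.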

Defensive efficiency was predicted using lineup per possession statistics for personal fouls, steals, and blocks, as well as defensive rebounding percentage.  Each of these categories increased with a more even distribution, that is, a lower standard deviation.  As a result, for the positive statistics of rebounding percentage, steals and blocks, a lower standard deviation (a more even distribution) predicted lower defensive points per possession, by as much as 4, 9 and 1 point(s) per 100 possessions, respectively.  In the case of personal fouls, a negative statistic, the reverse was true: as the standard deviation of personal fouls increased, both personal fouls per possession and defensive points per possession decreased.

So what do these results mean and why do I refer to these results as simple?  They are simple because they essentially mean that we want everyone in a lineup playing defense.  We want everyone rebounding, everyone playing strong defense on the ball and getting steals, and everyone contesting shots and getting blocks. 

Obviously, when players are getting steals and blocks successfully, they aren't fouling, and ideally players get those steals and blocks without fouling.  So why is a wider distribution of personal fouls beneficial for a defense?  Perhaps it means we want a player or two in a lineup committing fouls when a steal or a block isn't a realistic possibility.  These fouls are likely skewed towards players in the paint, where fouling is preferred to allowing easy buckets on dunks and layups, which would lead to a higher standard deviation.


Friday, May 27, 2011

True Distribution Analysis with 2-Point FG replacing All FG

As suspected, replacing the field goal data with two-point field goal data revealed a stronger relationship with lineup offensive points per possession (OPPP).  The standard deviation of the percentage of two-point field goal attempts taken by each player in the lineup is more strongly related to lineup OPPP than the standard deviation of the percentage of all field goal attempts.




This graph illustrates that increasing the standard deviation of the percentage of two-point field goals attempted by each of the players in a lineup can increase the lineup OPPP by as much as 3 points per 100 possessions, more than was found for field goal attempts overall.


Interestingly, the coefficient for the standard deviation of the percentage of two-point field goal attempts was negative with respect to the dependent variable of lineup two-point field goals made per possession, while it was positive with respect to the dependent variables of lineup three-point field goals made per possession and lineup free throws made per possession.  This indicates that while a wider distribution of two-point field goal attempts predicts a greater number of three-point field goals made per possession and, to a lesser extent, a greater number of free throws made per possession, it also predicts fewer two-point field goals made per possession.

Friday, May 20, 2011

True Distribution Analysis

To best analyze how the distribution of various roles influences offensive efficiency, we want to do so independent of the sum.  This means reducing each player's per possession numbers to a percentage of the team or lineup total, which is realistic with lineup or by position data, as opposed to the original data set, which was composed of players who were not on the floor together much of the time.

Season lineup data was used in this analysis.  By position data proved poorly suited to analyzing the distribution of roles or statistics, for a few reasons.  Most importantly, the same player can account for statistics at multiple positions throughout a game.  Also, by position data combines multiple players at each position, each of whom may bring a different set of skills.  This makes it difficult to determine how the roles and statistics are actually distributed.  For example, three-point attempts may be taken by only the guards in a starting lineup, but by only the forwards in a second unit.  Although the three-point attempts are concentrated among a few players in each lineup (a wide distribution), when the statistics of the two lineups are combined, the distribution looks even.  This wasn't as much of a problem with the prior analysis, as the standard deviations analyzed there were not independent of the sums, and those sums weighed heavily on the results.  With lineup data, we are able to look at specific players, with specific skills, which is ideal for studying the distribution of roles.  The true distribution analysis for offensive efficiency is below.
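To make the guards-and-forwards example concrete, here is a small sketch with made-up three-point attempt counts.  The numbers, the position slots and the `share_std` helper are all hypothetical, and the population standard deviation is an assumption.

```python
from statistics import pstdev

def share_std(attempts):
    """Population SD of each slot's share of the total attempts (hypothetical helper)."""
    total = sum(attempts)
    return pstdev([a / total for a in attempts])

# Hypothetical three-point attempts by position slot: PG, SG, SF, PF, C.
starters = [6, 6, 0, 0, 0]      # only the guards shoot threes
second_unit = [0, 0, 6, 6, 0]   # only the forwards shoot threes

# By position data adds the two units together slot by slot.
by_position = [s + b for s, b in zip(starters, second_unit)]

print("starters:   ", round(share_std(starters), 3))
print("second unit:", round(share_std(second_unit), 3))
print("by position:", round(share_std(by_position), 3))
```

Each lineup on its own shows the same concentrated distribution, but the combined by position figure is much lower, so the aggregate hides how the attempts were actually distributed on the floor.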

Field Goal Attempts


The analysis shows that a wider distribution of field goal attempts can make a difference of as much as 1.7 points per 100 possessions.  This may not appear too significant, but when considered with the results regarding the distribution of three-point attempts, where an even distribution proves beneficial, we can conclude that a wider distribution of two-point field goal attempts has a greater effect than the distribution of overall field goal attempts suggests. 

Three-Point Attempts




Unlike the previous results, this analysis shows that a more even distribution of three-point attempts can increase offensive efficiency by 12 points per 100 possessions.  This implies that the previous results regarding three-point attempts were influenced more by the total three-point attempts than the distribution of those attempts, indicating that offensive efficiency increases as three-point attempts per possession increase, but that a more even distribution of those attempts is preferred.  So, not only do we want skilled three-point shooters taking many shots, we want many of them spread around the floor.


Free Throw Attempts 



The analysis shows that a more even distribution of free throw attempts can make a difference of as much as 10 points per 100 possessions.  This implies that the previous results regarding free throw attempts were influenced more by the total free throw attempts than the distribution of those attempts, indicating that offensive efficiency increases as free throw attempts per possession increase, but that a more even distribution of those attempts is preferred.  Lineups with multiple players capable of attacking the rim and earning free throw attempts are more efficient than those that get most of their free throw attempts from a player or two.


Offensive Rebounds



This analysis shows that a more even distribution of offensive rebounds can increase offensive efficiency by almost 3 points per 100 possessions.  This indicates that lineups with multiple players capable of grabbing offensive rebounds are more efficient than those with a player or two that get most of the lineup's offensive rebounds.  
   
Assists




Unlike the previous results, this analysis shows that a more even distribution of assists can increase offensive efficiency by 6 points per 100 possessions.  This implies that the previous results regarding assists were influenced more by the total assists than the distribution of assists, indicating that offensive efficiency increases as assists per possession increase, but that a more even distribution of assists is preferred.
  
Turnovers



Unlike the previous results, this analysis shows that a wider distribution of turnovers can increase offensive efficiency by 15 points per 100 possessions.  This implies that the previous results regarding turnovers were influenced more by the total turnovers than the distribution of turnovers, indicating that although offensive efficiency increases as turnovers per possession decrease, a wider distribution of turnovers is preferred.

Friday, May 13, 2011

NCSSORS Follow-up

After NCSSORS, I had the opportunity to further discuss my research with Dean Oliver, author of Basketball on Paper and former Director of Quantitative Analysis for the Denver Nuggets, who kindly provided data better suited to my analysis.  Rather than using season data for the top five or eight players in minutes played to predict offensive efficiency for the team, season lineup data and game by position data were used in the follow-up analysis.  This data fully accounts for the offensive efficiency being analyzed (OPPP), which appears on the Y-axis in all graphs.

SEASON LINEUP DATA

Using season data rather than game data takes into consideration all opponents faced, rather than a specific opponent, giving us individual data points less affected by defenses that may be outliers, weak or strong.  Lineup data also focuses on specific players rather than the contributions of multiple players to a single position as by position data does, making it easier to pinpoint the skills of that player as opposed to those of multiple players accounting for the statistics accumulated at a particular position during a game.     


Using lineup data in the analysis shows a stronger relationship between the standard deviation of three-point attempts per possession and offensive points per possession than in the original analysis.  Going from the lowest standard deviations in the sample to the highest made a difference of nearly 7 points per 100 possessions.
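For readers who want to see how a "lowest to highest" difference like this can be read off a fitted line, here is a minimal single-predictor sketch.  It is a simplification of the multiple regression used in the analysis, the data points are made up for illustration only, and the names `ols_slope_intercept`, `std_3pa` and `oppp` are hypothetical.

```python
from statistics import mean

def ols_slope_intercept(x, y):
    """Closed-form simple least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Made-up lineup observations, NOT the data behind the graphs in this post:
# standard deviation of three-point attempts per possession, and lineup OPPP.
std_3pa = [0.02, 0.04, 0.06, 0.08, 0.10]
oppp = [1.00, 1.01, 1.03, 1.04, 1.05]

slope, intercept = ols_slope_intercept(std_3pa, oppp)

# Predicted change in OPPP when moving from the lowest standard deviation in
# the sample to the highest, scaled to points per 100 possessions.
range_effect_per_100 = slope * (max(std_3pa) - min(std_3pa)) * 100
print(f"slope = {slope:.3f}")
print(f"difference across the sample range = {range_effect_per_100:.1f} points per 100 possessions")
```

The same reading applies to a coefficient from a multiple regression, with the other predictors held constant.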

In their recent series against the Los Angeles Lakers, the Dallas Mavericks showed great success taking many threes and focusing most of the attempts on a player or two in each lineup, with Jason Terry, Jason Kidd and Peja Stojakovic (who have only played 18 possessions together through 10 games in two playoff series) taking the great majority of them.  Dirk Nowitzki also took and made three point attempts, but he mostly played to his greatest strength in taking mid-range jumpers that are difficult to stop, forcing the defense to help, leading to many uncontested three point attempts for the others.  Particularly in Game 4 against the Lakers, the Mavericks played the game plan of draw and kick to perfection, with the offense almost exclusively starting with Nowitzki’s mid-range or post-up game or with Kidd, Barea or even Terry penetrating off screens, forcing rotation that just could not keep up with the ball. 


The relationship between the standard deviation of free throws attempted per possession and offensive points per possession was also stronger using lineup data.  A difference of 17 points per 100 possessions separates the lowest standard deviations from the highest.  Needless to say, having a player (or two) who can get to the line at will provides a significant advantage.  It certainly hasn’t hurt the Oklahoma City Thunder this year with Kevin Durant and Russell Westbrook.  James Harden also gets to the line at a high rate for the Thunder, providing another option who can attack the rim and earn free throw attempts.


The relationship between points per possession and the standard deviation of offensive rebounds is not a strong one, though lineups that have a lower standard deviation of offensive rebounds per possession still score about 1.6 points per 100 possessions more than lineups with a greater standard deviation of offensive rebounds.  Determining whether this is a result of more offensive rebounds being available because of more missed field goals requires further analysis.


Assists are where the most significant advantage can be found: a higher standard deviation of assists per possession predicts an increase of as much as 20 points per 100 possessions.


Since turnovers are a negative stat, the relationship found is a negative one.  Like with assists, the primary ball handler generally leads his team in this category.  If that player limits his turnovers, clearly a good thing for a team’s efficiency, the standard deviation will decrease.  Here you can see that from one end of the spectrum to the other, decreasing the standard deviation of turnovers per possession can make a difference of as much as 16 points per 100 possessions.    

GAME BY POSITION DATA

Game by position data accounts for what a team gets out of each position for the duration of each game.  It takes into consideration the contribution of every player that played in that game rather than a specific group of five, as is the case with lineup data.  The following graphs represent how the stated standard deviation variables influence offensive points per possession.  These results reflect those from the lineup analysis, with the exception of offensive rebounds, where the relationship reversed.  In addition, the standard deviation of field goal attempts per possession proved significant in this analysis as well, although just slightly, with a more even distribution of field goal attempts per possession predicting an increase of up to 0.8 points per 100 possessions.


This graph illustrates that when observing game by position data, increasing the standard deviation of three point attempts per possession predicts an increase of as much as 17 points per 100 possessions.


This graph illustrates that when observing game by position data, increasing the standard deviation of free throw attempts per possession predicts an increase of as much as 13 points per 100 possessions.


Unlike the previous results, this graph illustrates that when observing game by position data, increasing the standard deviation of offensive rebounds per possession predicts an increase of as much as 1.2 points per 100 possessions.  This is not a strong relationship, much like the negative relationship found in the original and lineup analyses, so the change is not a great one.  In either case, the distribution of offensive rebounds does not have a strong relationship with offensive efficiency.


This graph illustrates that when observing game by position data, increasing the standard deviation of assists per possession predicts an increase of as much as 20 points per 100 possessions. 


This graph illustrates that when observing game by position data, decreasing the standard deviation of turnovers per possession predicts an increase of as much as 8 points per 100 possessions. 

NEXT

The standard deviations of the per possession numbers used in the above analyses are affected not only by the distribution of those categories, but also by the aggregate team per possession numbers in those categories.  For example, if Team A averages twice as many assists per possession as Team B, but those assists are similarly distributed between the five players on the floor, with 60% going to one player and 10% each going to the other four players, the standard deviation variable for Team A will be twice that of Team B, and Team A will score more points per possession, all else being equal, as reflected above.  The same holds true for free throws attempted.  This applies to turnovers as well, though with the opposite effect.
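The Team A and Team B example can be checked with a few lines of code.  The 60/10/10/10/10 split comes from the example above; the assists per possession totals are hypothetical, and the population standard deviation is an assumption (the ratio is the same with the sample version).

```python
from statistics import pstdev

# Distribution from the example: one player with 60%, four players with 10% each.
shares = [0.60, 0.10, 0.10, 0.10, 0.10]

# Hypothetical totals: Team A records twice the assists per possession of Team B.
team_b_total = 0.15
team_a_total = 2 * team_b_total

# Per-player assists per possession for each team.
team_a = [team_a_total * s for s in shares]
team_b = [team_b_total * s for s in shares]

std_a, std_b = pstdev(team_a), pstdev(team_b)
print(f"Team A: {std_a:.4f}  Team B: {std_b:.4f}  ratio: {std_a / std_b:.2f}")
```

The ratio of the two standard deviations equals the ratio of the totals, even though the two teams distribute their assists identically.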


Given this, in addition to looking at the distribution of various statistical categories, this analysis lends insight into the influence of the aggregate as well, which is more obvious with respect to certain per possession statistical categories, like free throws attempted, assists and turnovers, than it is with field goals attempted, three-point field goals attempted and offensive rebounds.  When the relationships found above are consistent with the assumed relationship between the per possession total in a particular category and offensive points per possession, the effect of the distribution of those statistical categories remains unclear.  To exclusively analyze the distribution of roles or categories independent of the total, the total must be removed from the analysis.  Reducing each player's per possession totals to a percentage of the team per possession total, and using the standard deviation of those percentages in the analysis instead of the per possession values, will better predict the influence of distribution alone.  This analysis and the implication of those results with respect to the above results will soon follow.
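A short sketch of why the percentage version matters: with made-up numbers, the raw and percentage-based standard deviations can rank two lineups in opposite order.  The lineups and the `to_shares` helper are hypothetical, and the population standard deviation is an assumption.

```python
from statistics import pstdev

def to_shares(per_possession):
    """Convert per-player values to fractions of the lineup total (hypothetical helper)."""
    total = sum(per_possession)
    return [value / total for value in per_possession]

# Hypothetical per-player values in one category, per possession.
lineup_x = [0.10, 0.08, 0.06, 0.04, 0.02]  # high total, spread fairly evenly
lineup_y = [0.04, 0.01, 0.01, 0.01, 0.01]  # low total, concentrated in one player

raw_x, raw_y = pstdev(lineup_x), pstdev(lineup_y)
share_x, share_y = pstdev(to_shares(lineup_x)), pstdev(to_shares(lineup_y))

print(f"raw values:  X = {raw_x:.4f}, Y = {raw_y:.4f}")
print(f"percentages: X = {share_x:.4f}, Y = {share_y:.4f}")
```

On raw per possession values, lineup X looks like the wider distribution only because its total is larger; on percentages, lineup Y is correctly identified as the more concentrated one.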

Initial Analysis

My efforts to define chemistry in basketball began with research for my Masters in Sport Management at the University of San Francisco.  I presented that research via poster at the 2010 Northern California Symposium on Statistics and Operations Research in Sports (NCSSORS), held by Dr. Ben Alamar, the Director of Basketball Analytics and Research for the Oklahoma City Thunder, and Peter Keating later wrote about it in his blog for ESPN.


With wins being a function of offensive and defensive efficiency, this study sought to predict offensive efficiency, using a multiple linear regression model, in terms of how various statistics or roles are distributed among the players on the floor: shooting from the field generally, from the three-point line and from the free throw line, passing, and offensive rebounding.  This was accomplished by using the standard deviations of the pace- and minute-adjusted field goals attempted, free throws attempted, three-point field goals attempted, offensive rebounds, assists and turnovers among the top five or top eight players in minutes played to represent how those roles are distributed, and to predict field goals made, free throws made, three-point field goals made and turnovers, the significant variables in predicting offensive efficiency.


NBA regular season team and player data since 1997 was gathered from Basketball-Reference.com and used to create the data sets for this study.  370 teams were observed.  The 1997-1998 season was chosen as the initial season in the data set as it was when the three-point line was moved back to 23’9”, where it has since remained.  Team and player stats were adjusted for team pace and minutes played.  


The charts on the right side of the poster display the following results:  A wider distribution of three-point attempts predicts an increase of as much as 4 points per 100 possessions.  A wider distribution of free throw attempts predicts an increase of as much as 3 points per 100 possessions.  A wider distribution of assists predicts an increase of as much as 2 points per 100 possessions.  A more even distribution of turnovers predicts an increase of as much as 2 points per 100 possessions.  A more even distribution of offensive rebounds predicts an increase of as much as 4 points per 100 possessions.  The distribution of field goal attempts was not significant in this analysis.


There are some significant differences in points scored between the widest and most even distribution of various statistical categories.  With the exception of offensive rebounds and turnovers, a wider distribution proved more beneficial, indicating that teams built with more defined roles in shooting and distributing are more efficient.  This is just the beginning of this analysis and the implication of the results of analyzing how roles are distributed deserves more thought, as Kevin Pelton suggested in his recap of the NCSSORS conference.