Archived

This topic is now archived and is closed to further replies.

Zack

Correlation of Team Salary and Wins over last 3 Years


Zack    0

A common question that arises in discussion of how we would like our favorite NBA team to spend money is the relationship between spending money and winning.

 

Usually the argument starts like this...

 

Fan A: "I wish my team wasn't so cheap and would spend some money and at least act like they care about winning"

 

Fan B: "The team shouldn't spend money just to spend money. There is no guarantee that spending money produces wins. I mean, look at the Knicks!!! And San Antonio and Utah win with low payrolls"

 

I wanted to delve deeper into this question of how total team salary relates to winning in the NBA.

 

To do this, I needed to get total team salary data from years past. I was able to find, with hooba's help, a website that does have the last 3 years of salary information saved for the NBA. Here is the link...

 

Here is what I did:

 

--Collect total team salary and regular season wins for each team over the last 3 years.

--Turn total team salary into dollars above or below the league salary cap. This way I normalize each year to each other by looking at how much a team spends relative to the salary cap (i.e., spending 58.7 million last year would be the same as spending 53.1 million 3 years ago)

--Plot the team salary above/below cap vs number of wins for each year and for all 3 years combined.

--Calculate correlation coefficient in order to look at the strength of the correlation between the two variables.
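
The steps above can be sketched in a few lines of Python. This is only an illustration, not the original calculation (which was done in Matlab): the team rows and the cap figure below are hypothetical placeholders, and the function names are made up for the example.

```python
from math import sqrt

def relative_to_cap(salary_m, cap_m):
    """Salary in millions of dollars above (+) or below (-) the cap."""
    return salary_m - cap_m

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical placeholder data, NOT the real spreadsheet values:
# (team label, total salary in $M, regular season wins)
cap_m = 55.0  # assumed cap for this made-up season
rows = [("A", 50.0, 30), ("B", 60.0, 41), ("C", 70.0, 52), ("D", 65.0, 35)]

x = [relative_to_cap(salary, cap_m) for _, salary, _ in rows]
y = [wins for _, _, wins in rows]
r = pearson_r(x, y)
print(f"r = {r:.2f}")
```

Subtracting the cap from every salary in a given season shifts all points by the same amount, so it does not change that season's correlation; it only matters once seasons with different caps are combined.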

 

Here is a spreadsheet with some of the basic information on it.

 

[Image: spreadsheet of team salary and wins]

 

First I will plot each year's salary cap difference vs wins. The x-axis is the amount of money above or below the cap a team is (in units of millions of dollars). The y-axis is the number of wins for that year.

 

[Images: three plots of salary relative to the cap (millions of dollars, x-axis) vs wins (y-axis), one per season]

 

Observations and Notes

--The first thing one will notice is the outlier data points in the plots above. They are actually very important when looking at the quantitative correlation between team salary and wins. The correlation coefficient for a pair of variables responds most strongly to outliers when there is a small number of data points, as is the case here. FYI....the correlation coefficient ranges between -1 and 1. As a rough guide here, a coefficient above 0.5 suggests a significant correlation, provided the chance of getting that correlation at random is also low (below 0.5%).

--The two main outliers are NY and Dallas.

--When looking at the qualitative correlation with our eyes, the outliers stand out, but can be ignored easily to focus on the trend of the cloud of points grouped together. Below are some observations for just the cloud of points without the significant outliers.

--06/07 doesn't seem to indicate much correlation between team salary and wins.

--07/08 starts to suggest a stronger correlation.

--08/09 shows a strong correlation between team salary and wins if we ignore the 3 outlier points on the right.

 

Idea

In order to average out the year-to-year variation, I think looking at all 3 years summed together and averaged gives a better view of how spending more money correlates to more wins. Here are a few reasons why I think that:

 

--Rookie Contracts. Teams that have a lot of good young players on rookie contracts are able to spend less for a competitive team, but the time eventually comes when they need to pay those players. By averaging over years, we get a better sense of how much a team spends in relation to the cap.

 

--Time-lapse Effects. Sometimes a team might be forced to take on extra salary one year in order to get a player that will contribute for years to come. Averaging over years better captures the money spent to acquire talent and the effect that talent had on long term winning.

 

--Outliers. By averaging over a few years, outliers will be reduced. Only if a team is consistently an outlier will they stay an outlier.

 

Results for Correlation of Team Salary and Wins for Last 3 Years Combined

 

[Image: plot of salary relative to the cap vs wins, all 3 years combined]

 

Observations/Notes

--The Grizzlies have spent the least money over the last 3 years and have won the fewest games over those 3 years.

--Despite the positive slope on the best-fit line to the data points, the correlation coefficient does not suggest a significant correlation between spending more money and winning more games.

--There are still 2 outliers, New York and Dallas. Let's look at the data without those 2 teams.

Results for Correlation of Team Salary and Wins for Last 3 Years Combined Without NY and Dallas

 

[Image: plot of salary relative to the cap vs wins, all 3 years combined, without NY and Dallas]

 

Observations

 

--Correlation coefficient was 0.53. The probability of getting that correlation by random chance is 0.4% (that is a low number). This suggests a statistically significant correlation between spending more money and winning more games.
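
For anyone wanting to check that chance figure, here is a rough sketch in Python. It assumes 28 data points (30 NBA teams minus NY and Dallas, one averaged point per team) and the standard two-tailed t-test for a Pearson correlation, which may or may not be the exact test used for the number above. The function names are made up for illustration, and the tail area is integrated numerically so nothing beyond the standard library is needed.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(r, n, steps=10_000):
    """Two-tailed p-value for a Pearson r computed from n points (|r| < 1)."""
    df = n - 2
    t = abs(r) * sqrt(df / (1 - r * r))
    # Simpson's rule (steps must be even) for the area between 0 and t.
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # The density is symmetric, so both tails together are 1 - 2 * area.
    return 1 - 2 * area_0_to_t

p = two_tailed_p(0.53, 28)  # n = 28 is an assumption, see above
print(f"p = {p:.4f}")
```

Under those assumptions the result lands at a few tenths of a percent, which is consistent with the figure quoted in the bullet.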

 

--This assumes that NY and Dallas ran business-basketball models different enough over the last 3 years to justify ignoring them.

 

--The best fit line over this 3 year averaged data says that if a team simply spends the salary cap, they will average 31 wins (of course we know that isn't true!!!!)

 

--The best fit line over this 3 year averaged data says that for every 10 million a team spends over the salary cap, they will win 9.5 more games. This measure (the slope of the best fit line) is more meaningful than the intercept in the previous bullet.
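
Written out, the best fit line described in the last two bullets is roughly wins = 31 + 0.95 x (millions over the cap). A tiny Python sketch, using only the two numbers quoted above (the constant and function names are made up for illustration):

```python
# Values taken from the bullets above: about 31 wins at the cap,
# and about 9.5 extra wins per 10 million spent over the cap.
INTERCEPT_WINS = 31.0
SLOPE_WINS_PER_MILLION = 9.5 / 10

def predicted_wins(millions_over_cap):
    """Wins implied by the best fit line (a trend, not a guarantee)."""
    return INTERCEPT_WINS + SLOPE_WINS_PER_MILLION * millions_over_cap

for over in (-5, 0, 10, 20):
    print(f"{over:+d}M relative to cap -> {predicted_wins(over):.1f} wins")
```

The line only describes the 28-team cloud it was fit to, so plugging in values far outside that range (such as NY's spending) is not meaningful.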

 

Feedback

Give me some feedback on this. I'm thinking about modifying the post and putting it on the blog. Due to the TrueHoop affiliation, I want to not rush the post and improve the concept as much as possible. So, please give some feedback on anything. What do you see in the data? Do you want to see something else plotted against each other? Can I improve the plots? Larger? What should I expand on or summarize better?

GF#1    0

Wow.

 

An absolutely terrific post. Very nice work, Zack.

 

Thoughts to come later. I need to read it a little better! ;)

bgassassin    0

Too... Much... Informa...

 

ScannersExplodingHead.gif

 

I'm going to need a good night's rest to process all this Dr. Zack. We definitely see that Ph. D. is being put to use. ^_^

Zack    0

and just to be clear.....

 

correlation does not equal causation.

 

They are different things....

 

and I'm NOT suggesting a team should spend, spend, spend with this post. The economic-basketball business plan is a complicated process that is dynamic from year to year and differs from team to team.........

 

I just wanted to do a calculation that looks at the question of "Do teams that spend more, win more?"

 

I'd love to do the calculation over a lot more years than 3, however, I can't find team salary data before 2006........

 

I really believe that is the best way to go, because so many decisions a team makes affect it for years, although the salary associated with that decision may not.....and the way the rookie pay scale works, averaging over, say 8 years, effectively averages out that set-in-stone low salary with a possible higher salary the player receives based on merit.....

guillermo    0

Awesome job dude. I couldn't be more impressed.

 

The only thing I would recommend would be to make a summary graph-plotting the final outcome(wins/salary)w/team logos- if that makes any sense. ??

 

you should definitely put this on a blog.

 

 

 

 

bgassassin    0

I will say this even after only initial thought: 06/07 will be affected due to Eddie Jones' salary being on there despite him being bought out. From what I understand from glancing over it, it would seem that would actually make the Grizzlies look better for that season as they "spent money" even though they had few wins.

Zack    0
ya.....

 

I think we could play that game with every team, every year.......which is why I'm hesitant to look at individual years and draw any conclusions.......and prefer the averaging over all years.....when looking at the league as a whole....

 

I do agree that if discussing the Grizzlies' specific position within the rest of the league in any given year, we might want to think of whatever odd salary cap quirks went into that year.....

 

(FYI....What I plotted was the actual team salary that was used to calculate lux tax payments, including all those 10 day contracts, money spent on training camp players, actual money paid to players due to buyouts and other things)

 

using the Eddie Jones thing as an example....

 

when we traded for him, we knew his salary AND his basketball production was going to affect the team in the present and the future.......we paid him 32 Million (guessing) over 2 years, and if he only played the first year of those 2 years, then we should consider it took the whole 32 million to get his 1st year production......which is why I think the summed and averaged data points are more realistic in capturing the correlation and not an individual season......


1) First thought is yes, there is a relationship, but we can't determine causation. It's also possible, and possibly even probable, that low wins over 3 years result in the need to curb cap spending. The more a team wins, the more they are willing to spend. This would definitely be true over a three year period, where wins would result in future spending.

 

2) Three years is a small sample size and also coincides with the beginning of the recession. Wonder how that affects things.

 

3) I am way too sleepy to think any deeper than that.

 

 

Good post. Will try to look at it more deeply tomorrow.

grizzdizzle    0

great info, Zack.

 

(disclaimer: the following comes from someone currently in the process of writing a paper to be submitted to a scientific journal, so it's probably more nitpicky than is necessary)

Some general feedback:

1) Definitely increase font size of titles/axis tick labels/marker size to make it more readable. I like the idea that guillermo had for using the team logos as marker labels, but that could take some effort.

 

2) I am confused about plotting versus the under/over salary cap on the x-axis. Are these values normalized to 1 being the lowest, or are they the magnitude? I would be more interested in seeing the actual values (either relative to the cap, i.e. -4.5 mil to +20mil or whatever it is, or absolute amounts in millions); that way you can clearly see how much they are spending. In any event, the axis title and the numbers used don't make sense to me as they're currently presented.

 

Good work.

 

Something else that I think would be interesting to look at is the number/position of draft picks versus the number of wins. What I am wondering is, for instance, of the teams in the lottery, how many go on to have success 2, 3, 4 years down the road, and how many stay situated in the lottery?

Zack    0
Quote (grizzdizzle): "2) I am confused about plotting versus the under/over salary cap on the x-axis. Are these values normalized to 1 being the lowest, or are they the magnitude? I would be more interested in seeing the actual values (either relative to the cap, i.e. -4.5 mil to +20mil or whatever it is, or absolute amounts in millions); that way you can clearly see how much they are spending. In any event, the axis title and the numbers used don't make sense to me as they're currently presented."

that is exactly what they are

 

then expressed in units of millions of dollars....

 

they are the amount a team spent relative to the salary cap that year.....

 

look at the yearly plots and the numbers go from -5 Million to 60 million

 

yes, the knicks spent 60 million more than the salary cap one year......

 

the spreadsheet should make the numbers make sense.....take column B minus B2.....

 

and I should have said 'relative' rather than 'over/under'.....the source I was using for the salaries had calculated the amount relative to the salary cap and used the term over the cap or under the cap and I kept using that terminology.....

 

regarding the figures.....

 

ya, I know....I don't like what flickr did to the images....

 

I made them in matlab like I would for a paper, powerpoint or poster.....

 

I made the fonts big and pumped up the resolution, but I think flickr downsamples and downscales the figure and makes them jpeg (which is lossy).....ugh....i'll work on it some....

grizzdizzle    0

*I should pay more attention. I looked at it wrong the first time then assumed the origin was (0,y) from there on out.

In Matlab, I use 'FontSize',20 and 'MarkerSize'>=8, then just export it as a jpeg since that's what they'll convert it to anyways.

Zack    0

I'll do an example.....

 

for instance, the grizzlies....

 

06/07

spent 62.5 million...cap at 53.1 million.....so they were over the cap by ~9 million....and you can see that the grizzlies dot that is highlighted plots just under 10 million on the x-axis

 

in 07/08

spent 51.1 million....cap at 55.6, so we were under the cap by ~4 million.....and you can see that the grizzlies dot that is highlighted plots to the left (negative) of 0 (zero).....

 

in 08/09

spent 55.4 million....cap at 58.6, so we were under by ~3 million....and you can see that the grizzlies dot that is highlighted plots to the left (negative) of 0 (zero).....

 

then for all 3 years averaged together....

 

take those 3 relative values summed together (9-4-3=2) and divided by 3 (2/3=.67) and you can see the grizzlies dot plotted just to the right (positive) of 0 (zero)
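
The same Grizzlies arithmetic as a short Python sketch, using the unrounded salary and cap figures quoted in this post (variable names are made up for illustration):

```python
# (team salary, salary cap) in millions of dollars, as quoted above
seasons = {
    "06/07": (62.5, 53.1),
    "07/08": (51.1, 55.6),
    "08/09": (55.4, 58.6),
}

# Amount relative to the cap per season: positive is over, negative is under.
relative = {
    season: round(salary - cap, 1) for season, (salary, cap) in seasons.items()
}

# Three-year average, i.e. the x-value of the dot on the combined plot.
average = sum(relative.values()) / len(relative)

print(relative)
print(f"3-year average relative to cap: {average:.2f} million")
```

With the unrounded inputs the average comes out a little under 0.6 million, versus about 0.67 with the rounded 9, -4 and -3. Either way the dot sits just to the right of zero.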

 

[EDIT]Just saw your reply....nevermind then....

 

I used 16 font on those, but will try 20 for the next go round......[/EDIT]

Timmy_D    0

Great post Zack. It's something I've always assumed but was too lazy to really look into. And I totally agree that we shouldn't just spend money to spend it, but in an offseason like this one, in which we actually have an advantage over other teams and it might be our one shot to land some good players, we once again seem to be looking for a cheap way out. And yes, Zach Randolph is the cheap way out; I don't care how big his contract is.

new_skool91    0

Great, thorough post. The key will be understanding the implications of this data.

 

As you have pointed out, and it has been discussed prior, spending more doesn't directly translate into winning more; however, as a team gains experience and gets beyond its rookie contracts, it will become more expensive. Better teams will cost more, but more expensive teams are not necessarily better.

 

Thanks for the insight.

El Lobo    0
Quote (bgassassin): "Too... Much... Informa... I'm going to need a good night's rest to process all this Dr. Zack. We definitely see that Ph. D. is being put to use. ^_^"

 

 

Me too. I feel like I just watched an episode of "House".

 

Kinda like where a patient goes in with post nasal drip, and ends up on life support. <_<


Great post Zack--this is the kind of stuff I'd like to see more of on this board (and also something I wish I had more time to do!). I like that you didn't just throw the data together and say "well, no correlation, nothing to see here" and move on.

 

I think it's completely legitimate to toss out the extreme outliers, i.e. Dallas and NY. I think when you're questioning whether there is a correlation between spending and winning percentage, you are implicitly making the assumption that the money is being spent wisely--and in the case of Dallas and especially NY I think it's safe to say that this isn't the case. I would argue that this is the case in Memphis as well, but that's another argument. :)

 

One suggestion...as far as normalizing the salaries, you could take it a step further in order to eliminate the effects of shifting salary caps for the case where you want to look at multiple years. You could determine each team's salary relative to the others in a particular year by setting the minimum salary as zero, the maximum salary as unity, and then scaling each team to fit in between accordingly. The formula for each year would look like (Team Salary - Minimum Salary)/(Maximum Salary - Minimum Salary). For a period of three years it's probably not an issue, but if you were looking at a longer time frame then it might affect the analysis.
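
A minimal Python sketch of that scaling, assuming one dictionary of team salaries per season. The team labels and salary figures are hypothetical placeholders, not real payroll data, and the function name is made up for illustration.

```python
def min_max_scale(salaries):
    """Scale one season's salaries so the lowest is 0 and the highest is 1.

    Implements (Team Salary - Minimum Salary) / (Maximum Salary - Minimum Salary).
    """
    lowest = min(salaries.values())
    highest = max(salaries.values())
    if highest == lowest:
        raise ValueError("all salaries are equal; scaling is undefined")
    return {
        team: (salary - lowest) / (highest - lowest)
        for team, salary in salaries.items()
    }

# Hypothetical placeholder season, salaries in millions of dollars
season = {"Team A": 50.0, "Team B": 75.0, "Team C": 100.0}
scaled = min_max_scale(season)
print(scaled)
```

One thing to keep in mind with this approach: the scale is set by the single lowest and highest spender each season, so one extreme payroll compresses everyone else toward zero.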

 

Also, you might consider time-shifting things to see if there's a lag between investing in talent and success. I think your grouping of three years is a way to get at that, but if you have time to fool around with the data you might see if anything pops out.

 

Last suggestion--you might do something like calculate the winning percentage for the bottom 25th percentile of spenders, calculate the winning percentage for the top 25th percentile of spenders, and compare the results. I would probably want much more data (like 15-20 years worth) for that to be meaningful though, so you'd probably have to find a different source for your data.


I suppose somebody understood this, maybe Sir Isaac Newton.

 

Here's my feedback: huh, all those boxy colored things er purdyyy.

 

 

Seriously, I don't need a graph. Smart spending correlates to more wins, e.g. San Antonio. Dumb spending leads to nothing, e.g. New York. Teams with the lowest salaries usually have more rookies, and the more rookies or young players a team has, the fewer wins they have.


well the problem is that it's not a linear fit at all. I don't think you can call it correlative by throwing out data points.

 

if you include all the teams - you can see once you start pumping money into the program it's almost a crapshoot on which side of the curve you land (variance is too high).

 

But one thing can be concluded:

 

if you don't spend money you aren't going to win.

 

 

I was just thinking more, and in fact I have a similar paper that I will try to get a reference to, which was done on German futbol teams. hold on...


doh - ok I was thinking of a slightly different problem of predicting outcomes of matches.

 

in this book: Simple Heuristics chapter 4 they are describing a study where they had different students predicting german soccer team matches based on these variables:

 

National capital (is city national capital?)

Exposition site (Was city an exposition site?)

Intercity train connection (is city on a major train line?)

State capital ( is city the state Capital?)

License plate (Is the abbrev. one letter long?)

University (Is city home to a major university?)

Industrial belt (Is city in the industrial belt?)

East Germany (Was city formerly in East Germany?)

 

With these cues they were able to calculate parameters for ecological validities or regression coefficients.

This study was for reasoning under limited knowledge.

SixthMan    0

I think you are on the right track, but the analysis is flawed using dollars over the cap.

 

For example, you could determine for each team how much more over the cap they would need to spend in order to win 82 games. The problem is that if every team did that, every team wouldn't/can't win 82 games. The teams compete against each other. That being the case, you need to do a correlation on a variable that "competes" against the other teams too -- like the team's salary compared to the salaries of the other teams. Using average team salary would probably be better, but even that has problems, due to the number of times each team is played. In other words, if all of the Eastern Conference teams had high payrolls and all of the Western Conference teams had low payrolls, you wouldn't necessarily see all of the Eastern Conference teams with better overall records. That's because they would play each other twice as many times as Western Conference foes (and vice versa).

 

However, what you are demonstrating is pretty straightforward:

- To win, you need talent

- To get talent, you have to pay to get them

 

However, there are going to be variables that will impact the results, where there isn't a good correlation between each player's contribution and their pay. The more a team has these problems, the more likely they are to be outliers. Problems like this include:

- Injuries to high-paid players

- Really bad/dumb contracts

- Good players taking "pay cuts" just to get on a good team

 

You almost have to do a team-by-team, game-by-game analysis and see how likely one team is to beat another, based on each team's payroll, and then sum up the results. This takes into account how many times each team plays each other.

 

I do commend you on taking the concept and trying to quantify an estimated ROI on spending and winning. I think you are on to something.

Grizzhype    0

Interesting post... You've extrapolated salary data and analyzed the relationship between wins and salary... A key factor missing to me is the tenure of key rotation and impact players.. A team of experienced vets usually has a higher salary vs a rebuilding club with a few bad contracts.. I guess in essence you've attempted to show: If you don't spend you don't win... But that's not always the case, Portland is under the salary cap by 9 mil, yet last year they were a 50 win team.. They had bad contract players who didn't even contribute last year that had them over the salary cap.. An analysis that showed the upward trend of wins and the factors that influenced them would better show any implicit relationships and their consequences... Go Grizz.. ;)

kmkemp    0

Just as an additional gotcha that makes your data ever so slightly flawed, I think you should be graphing based on percentage above or below the cap for a given year since the salary cap is not a static number. Being 9 million over the cap in 02-03, for example, is higher over the cap than it would be 5 years later in 07-08. Great post, though. ~
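
A small Python sketch of the percentage idea. The two cap values (53.1 and 58.6 million) are the ones quoted elsewhere in this thread for 06/07 and 08/09; the function name is made up for illustration.

```python
def percent_relative_to_cap(salary_m, cap_m):
    """Team salary as a percentage above (+) or below (-) the cap."""
    return (salary_m - cap_m) / cap_m * 100

# The same 9 million over the cap, measured against two different caps
early = percent_relative_to_cap(53.1 + 9.0, 53.1)
later = percent_relative_to_cap(58.6 + 9.0, 58.6)
print(f"{early:.1f}% over vs {later:.1f}% over")
```

The same dollar amount over the cap is a larger percentage when the cap is lower, which is the effect described above.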

Zack    0
Quote: "Great post Zack [...] One suggestion...as far as normalizing the salaries, you could take it a step further [...] Also, you might consider time-shifting things to see if there's a lag between investing in talent and success. [...] Last suggestion--you might do something like calculate the winning percentage for the bottom 25th percentile of spenders, calculate the winning percentage for the top 25th percentile of spenders, and compare the results. [...]"

thanks for the suggestions......those are really good...

Zack    0
Quote (SixthMan): "I think you are on the right track, but the analysis is flawed using dollars over the cap. [...] The problem is that if every team did that, every team wouldn't/can't win 82 games. The teams compete against each other. [...] You almost have to do a team-by-team, game-by-game analysis [...]"

good thoughts....thanks....

 

your reply makes me think I should find a way to normalize wins to the amount of wins needed to get into the playoffs.....that would address your comment on some teams have to lose and some have to win over the course of 82 games.......

 

going game by game though, might be too time consuming....

Zack    0
Quote (kmkemp): "Just as an additional gotcha that makes your data ever so slightly flawed, I think you should be graphing based on percentage above or below the cap for a given year since the salary cap is not a static number. Being 9 million over the cap in 02-03, for example, is higher over the cap than it would be 5 years later in 07-08. Great post, though. ~"

ya, I might try that......in this 3 year span, the cap only moved some, so the percentage thing isn't such a big deal, but it does matter.......thx
