Advanced Analytics

NBA Player Movement Data in R

Boston

Everyone’s excited about the newly released NBA player movement data that’s been released – at least I am. I stumbled across this post which shows how to visualize player movement data in Python, but I wanted to figure out how to do the same in R. We’ll use a combination of several R packages including RCurl, jsonlite, png and plotrix.

We will be replicating a play in R while shadowing James Harden (#13):


First, we’ll read in the movement data and plot a basketball court.

library(dplyr)
library(RCurl)
library(jsonlite)
library(png)
library(plotrix)

## Read in the player movement data for this specific play for this game
url = "http://stats.nba.com/stats/locations_getmoments/?eventid=308&gameid=0041400235"
x<-getURL(url)
the.data<-fromJSON(x)

## read in the full court image and rasterize it in R
con <- url("https://tcbanalytics.com/uploads/fullcourt.png", open="rb")
rawpng <- readBin(con, what='raw', n=50000)
close(con)
the.court <- readPNG(rawpng)
plot(0:94, xlim=c(0,94), ylim=c(50,0),  type="n")
lim <- par()
rasterImage(the.court, lim$usr[1], lim$usr[3], lim$usr[2], lim$usr[4] )

We have a basketball court!


Next, we need to overlay James’ movement on this image.
Here we will capture the home and away team information.

## Read in the home and away team player information
home.team <- the.data$home
away.team <- the.data$visitor

This will capture the movement moment data and provide column headers when we are done compiling all of the data.

moments <- the.data$moments
headers = c("team_id", "player_id", "x_loc", "y_loc", "radius", "game_clock", "shot_clock", "quarter")

quarters <- unlist(lapply(moments, function(x) x[1]))
game.clock <- unlist(lapply(moments, function(x) x[3]))
shot.clock <- unlist(lapply(moments, function(x) x[4]))

## Add the quarter, game clock and shot clock information to each moment 
moment.details <- lapply(moments, function(x) x[[6]])
x<-mapply(function(a,b,c,d) cbind(a,b,c,d), moment.details, game.clock, shot.clock, quarters)
all.movement<-do.call('rbind', x)
colnames(all.movement) <- headers
all.movement<-data.frame(all.movement)
all.movement<-all.movement[order(all.movement$game_clock),]

home.players <- home.team$players
away.players <- away.team$players
colnames(home.players)[3] <- "player_id"
colnames(away.players)[3] <- "player_id"

Next we add the player name information to each set of movement data.

## Add the player name information to each movement moment
home.movements<-merge(home.players, all.movement, by="player_id")
away.movements<-merge(away.players, all.movement, by="player_id")
home.movements <- home.movements[order(home.movements$game_clock, decreasing = TRUE),]
away.movements <- away.movements[order(away.movements$game_clock, decreasing = TRUE),]
all.movements <- rbind(home.movements, away.movements)

## Use James Harden as an example
james <- home.movements[which(home.movements$lastname == "Harden"),]
lines(james$x_loc, james$y_loc, type="b", col=cut(james$game_clock, breaks=3))

Here is James Harden’s movement with green being earlier in the play and black being later.

We can also calculate player distance traveled:

## Function to calculate player distance traveled
travelDist <- function(xloc, yloc){
        diffx <- diff(xloc)
        diffy <- diff(yloc)
        diffx2 <- diffx ^ 2
        diffy2 <- diffy ^ 2
        a<- diffx2 + diffy2
        b<-sqrt(a)
        return (sum(b)) 
}

travelDist(james$x_loc, james$y_loc)
[1] 197.4482   

## Calculate distance traveled for each player
player.groups <- group_by(all.movements, player_id)
dist.traveled.players <- summarise(player.groups, totalDist=travelDist(x_loc, y_loc))
all.players <- rbind(home.players, away.players)
player.travel <- merge(all.players, dist.traveled.players, by="player_id")
arrange(player.travel, desc(totalDist))

James Harden moved the most around the court on that play while JJ Redick moved the second most.


player_id lastname firstname jersey position totalDist
1 201935 Harden James 13 G 197.4482
2 200755 Redick JJ 4 G 184.5041
3 101108 Paul Chris 3 G 176.1983
4 1891 Terry Jason 31 G 173.3089
5 2746 Smith Josh 5 F 162.2261
6 2440 Barnes Matt 22 F 161.9764
7 2772 Ariza Trevor 1 F 153.3894
8 201933 Griffin Blake 32 F 153.0766
9 2730 Howard Dwight 12 C 123.4396
10 201599 Jordan DeAndre 6 C 119.9199

Please tweet at us and let us know if you have any improvements upon this work or other ideas. The possibilities are endless with what we can start to do with this data.