Date Tags R / NBA

Everyone's excited about the newly released NBA player movement data that's been released - at least I am. I stumbled across this post which shows how to visualize player movement data in Python, but I wanted to figure out how to do the same in R. We'll use a combination of several R packages including RCurl, jsonlite, png and plotrix.

This is the exact play we will be replicating in R while shadowing James Harden (#13):

First, we'll read in the movement data and plot a basketball court.


## Read in the player movement data for this specific play for this game
url = ""

## read in the full court image and rasterize it in R
con <- url("", open="rb")
rawpng <- readBin(con, what='raw', n=50000)
the.court <- readPNG(rawpng)
plot(0:94, xlim=c(0,94), ylim=c(50,0),  type="n")
lim <- par()
rasterImage(the.court, lim$usr[1], lim$usr[3], lim$usr[2], lim$usr[4] )

We have a basketball court!

full court image
Next, we need to overlay James' movement on this image. Here we will capture the home and away team information.

## Read in the home and away team player information <-$home <-$visitor

This will capture the movement moment data and provide column headers when we are done compiling all of the data.

moments <-$moments
headers = c("team_id", "player_id", "x_loc", "y_loc", 
        "radius", "game_clock", "shot_clock", "quarter")

quarters <- unlist(lapply(moments, function(x) x[1]))
game.clock <- unlist(lapply(moments, function(x) x[3]))
shot.clock <- unlist(lapply(moments, function(x) x[4]))

## Add the quarter, game clock and shot clock information to each moment 
moment.details <- lapply(moments, function(x) x[[6]])
x<-mapply(function(a,b,c,d) cbind(a,b,c,d), moment.details, game.clock, shot.clock, quarters)
all.movement<'rbind', x)
colnames(all.movement) <- headers

home.players <-$players
away.players <-$players
colnames(home.players)[3] <- "player_id"
colnames(away.players)[3] <- "player_id"

Next we add the player name information to each set of movement data.

## Add the player name information to each movement moment
home.movements<-merge(home.players, all.movement, by="player_id")
away.movements<-merge(away.players, all.movement, by="player_id")
home.movements <- home.movements[order(home.movements$game_clock, decreasing = TRUE),]
away.movements <- away.movements[order(away.movements$game_clock, decreasing = TRUE),]
all.movements <- rbind(home.movements, away.movements)

## Use James Harden as an example
james <- home.movements[which(home.movements$lastname == "Harden"),]
lines(james$x_loc, james$y_loc, type="b", col=cut(james$game_clock, breaks=3))

Here is James Harden's movement with green being earlier in the play and black being later.

Harden Movement

We can also calculate player distance traveled:

## Function to calculate player distance traveled
travelDist <- function(xloc, yloc){
    diffx <- diff(xloc)
    diffy <- diff(yloc)
    diffx2 <- diffx ^ 2
    diffy2 <- diffy ^ 2
    a<- diffx2 + diffy2
    return (sum(b)) 

travelDist(james$x_loc, james$y_loc)
[1] 197.4482

## Calculate distance traveled for each player
player.groups <- group_by(all.movements, player_id)
dist.traveled.players <- summarise(player.groups, totalDist=travelDist(x_loc, y_loc))
all.players <- rbind(home.players, away.players) <- merge(all.players, dist.traveled.players, by="player_id")
arrange(, desc(totalDist))

James Harden moved the most around the court on that play while JJ Redick moved the second most.

     player_id lastname firstname jersey position totalDist
1     201935   Harden     James     13        G  197.4482
2     200755   Redick        JJ      4        G  184.5041
3     101108     Paul     Chris      3        G  176.1983
4       1891    Terry     Jason     31        G  173.3089
5       2746    Smith      Josh      5        F  162.2261
6       2440   Barnes      Matt     22        F  161.9764
7       2772    Ariza    Trevor      1        F  153.3894
8     201933  Griffin     Blake     32        F  153.0766
9       2730   Howard    Dwight     12        C  123.4396
10    201599   Jordan   DeAndre      6        C  119.9199

Please tweet at us and let us know if you have any improvements upon this work or other ideas. The possibilities are endless with what we can start to do with this data.


comments powered by Disqus