Everyone’s excited about the newly released NBA player movement data that’s been released – at least I am. I stumbled across this post which shows how to visualize player movement data in Python, but I wanted to figure out how to do the same in R. We’ll use a combination of several R packages including RCurl, jsonlite, png and plotrix.
We will be replicating a play in R while shadowing James Harden (#13):
library(dplyr) library(RCurl) library(jsonlite) library(png) library(plotrix) ## Read in the player movement data for this specific play for this game url = "http://stats.nba.com/stats/locations_getmoments/?eventid=308&gameid=0041400235" x<-getURL(url) the.data<-fromJSON(x) ## read in the full court image and rasterize it in R con <- url("https://tcbanalytics.com/uploads/fullcourt.png", open="rb") rawpng <- readBin(con, what='raw', n=50000) close(con) the.court <- readPNG(rawpng) plot(0:94, xlim=c(0,94), ylim=c(50,0), type="n") lim <- par() rasterImage(the.court, lim$usr[1], lim$usr[3], lim$usr[2], lim$usr[4] )
We have a basketball court!

Next, we need to overlay James’ movement on this image.
Here we will capture the home and away team information.
## Read in the home and away team player information home.team <- the.data$home away.team <- the.data$visitor
This will capture the movement moment data and provide column headers when we are done compiling all of the data.
moments <- the.data$moments headers = c("team_id", "player_id", "x_loc", "y_loc", "radius", "game_clock", "shot_clock", "quarter") quarters <- unlist(lapply(moments, function(x) x[1])) game.clock <- unlist(lapply(moments, function(x) x[3])) shot.clock <- unlist(lapply(moments, function(x) x[4])) ## Add the quarter, game clock and shot clock information to each moment moment.details <- lapply(moments, function(x) x[[6]]) x<-mapply(function(a,b,c,d) cbind(a,b,c,d), moment.details, game.clock, shot.clock, quarters) all.movement<-do.call('rbind', x) colnames(all.movement) <- headers all.movement<-data.frame(all.movement) all.movement<-all.movement[order(all.movement$game_clock),] home.players <- home.team$players away.players <- away.team$players colnames(home.players)[3] <- "player_id" colnames(away.players)[3] <- "player_id"
Next we add the player name information to each set of movement data.
## Add the player name information to each movement moment home.movements<-merge(home.players, all.movement, by="player_id") away.movements<-merge(away.players, all.movement, by="player_id") home.movements <- home.movements[order(home.movements$game_clock, decreasing = TRUE),] away.movements <- away.movements[order(away.movements$game_clock, decreasing = TRUE),] all.movements <- rbind(home.movements, away.movements) ## Use James Harden as an example james <- home.movements[which(home.movements$lastname == "Harden"),] lines(james$x_loc, james$y_loc, type="b", col=cut(james$game_clock, breaks=3))
Here is James Harden’s movement with green being earlier in the play and black being later.

## Function to calculate player distance traveled travelDist <- function(xloc, yloc){ diffx <- diff(xloc) diffy <- diff(yloc) diffx2 <- diffx ^ 2 diffy2 <- diffy ^ 2 a<- diffx2 + diffy2 b<-sqrt(a) return (sum(b)) } travelDist(james$x_loc, james$y_loc) [1] 197.4482 ## Calculate distance traveled for each player player.groups <- group_by(all.movements, player_id) dist.traveled.players <- summarise(player.groups, totalDist=travelDist(x_loc, y_loc)) all.players <- rbind(home.players, away.players) player.travel <- merge(all.players, dist.traveled.players, by="player_id") arrange(player.travel, desc(totalDist))
James Harden moved the most around the court on that play while JJ Redick moved the second most.
player_id lastname firstname jersey position totalDist
1 201935 Harden James 13 G 197.4482
2 200755 Redick JJ 4 G 184.5041
3 101108 Paul Chris 3 G 176.1983
4 1891 Terry Jason 31 G 173.3089
5 2746 Smith Josh 5 F 162.2261
6 2440 Barnes Matt 22 F 161.9764
7 2772 Ariza Trevor 1 F 153.3894
8 201933 Griffin Blake 32 F 153.0766
9 2730 Howard Dwight 12 C 123.4396
10 201599 Jordan DeAndre 6 C 119.9199
Please tweet at us and let us know if you have any improvements upon this work or other ideas. The possibilities are endless with what we can start to do with this data.
•