Introducing the Baseball Scorecard

By Alec Barrett

Developer at TWO-N. @alecbarrett

In honor of the World Series, TWO-N is excited to share our Baseball Scorecard, a novel and concise way of representing the major events of a professional baseball game. We took the fundamental data points of a baseball game -- runs, hits, and errors -- and arranged these spatially over the (usually nine) innings of a game. The Scorecard has more detail than the traditional box score: we broke down each half of an inning into the three outs that define it, and showed every pitch thrown. Viewers can quickly scan the entire course of the game to understand the trajectory of each team, or focus in on a single inning and digest the sequence of pitches, at-bats, and outs.

We'll produce these after every game of the 2017 World Series, and share them on Twitter @2nfo and Instagram @2ngrams. Follow us to stay up to date throughout the series!

Baseball Scorecard by TWO-N

How to read the Baseball Scorecard

Innings are arranged horizontally from left to right. The darker blue panels indicate the top and bottom of the innings, when each team is batting, and contain information about hits (in gold diamonds) and runs (in gold lines). The lighter panels directly above or below indicate the team fielding at the same time, and contain pitches (deep blue circles) and pitcher substitutions (deep blue ticks) that correspond with the hits and runs. As a result, all of the away team’s actions, both offensive and defensive, are in the top half of the graph, and all of the home team’s actions are on the bottom.

Each half of the inning contains three outs, so each panel has three columns of data. Almost every column has pitches, though a column will be empty when a double play skips over it. (For example, if the inning went from zero outs to two outs in a single play, no pitches were thrown with one out, so that column is empty.)

Hits and pitches use the same Y axis, on the left side of the graph, so you can find the pitch that corresponds with a hit in the same column, at the same height in its panel, directly above or below. A hit is necessarily related to an In Play pitch, though not all In Play pitches are hits: many are balls that were hit and successfully fielded, resulting in an out.

Runs have their own Y axis, on the right side, and are cumulative, so you can see a team’s score at any point in the game. A darker line indicates that a team is leading; a lighter line means the team is trailing or the game is tied.
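To make the cumulative run line concrete, here is a minimal sketch in Python of how the running score and the line shade could be derived from runs per inning. The function names and the "dark"/"light" labels are illustrative assumptions, not taken from our production code, and the two input lists are assumed to have the same length.

```python
from typing import List, Tuple

def score_timeline(away_runs: List[int], home_runs: List[int]) -> List[Tuple[int, str, int, int]]:
    """Return (inning, half, away_total, home_total) after each half-inning.

    Assumes both lists hold runs per inning and are the same length.
    """
    timeline = []
    away_total = 0
    home_total = 0
    for inning, (a, h) in enumerate(zip(away_runs, home_runs), start=1):
        away_total += a  # the away team bats in the top of the inning
        timeline.append((inning, "top", away_total, home_total))
        home_total += h  # the home team bats in the bottom
        timeline.append((inning, "bottom", away_total, home_total))
    return timeline

def line_shade(own_total: int, other_total: int) -> str:
    """Darker line when leading; lighter when trailing or tied."""
    return "dark" if own_total > other_total else "light"

timeline = score_timeline([1, 0, 2], [0, 3, 0])
```

With this hypothetical three-inning input, the away team's line would be dark after the top of the first (leading 1-0) and light after the top of the third, when the score is tied 3-3.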

Pitching substitutions are shown with a line between the two pitches, within a given column, where the change was made. The pitch directly below the line is the last thrown by the pitcher coming out of the game; the one directly above is the first thrown by the new pitcher.

Errors, when applicable, are shown in the fielding panels, at the top of the column in which the error occurred.

The design process

This project draws on the unique structure and language of baseball games and of American baseball culture. From the layout of the graph to the symbols we chose, like diamonds for hits and patterned circles for pitches, we wanted a graph that feels like baseball before you even start to interpret it.

The spatial names for the halves of innings -- top and bottom -- come from each team's position on a scoreboard, and the Scorecard's layout captures that. Teams also alternate between batting (offense) and fielding (defense), creating an elegant symmetry between the two halves of each inning when laid out one on top of the other. If you look at some of our tweets from earlier in the MLB Playoffs, you can see the evolution of our design.

The design also evokes two physical artifacts of American baseball: the score sheet, on which the official record of what transpired during a game is documented, and the baseball card, that childhood collectible that captures vital information in a memorable, shareable, tradeable form.

The data preparation process

Baseball is a traditionally data-friendly sport, with the whole discipline of sabermetrics dedicated to the study and use of baseball data. The data for this project was retrieved from MLB’s publicly available Gameday API at http://gd2.mlb.com/components/game/mlb/. (TWO-N’s Baseball Scorecard is not affiliated with MLB.)

A lot of the information in the visualization, like the type of pitch (ball, strike, or in play), comes directly from the API. The API provides information about halves of innings broken down by at-bats, and labels each at-bat with an event describing how it ended. We wrote our own logic to determine from an event whether a hit, an out, or an error had occurred during that at-bat, and to group at-bats into outs.
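As a rough illustration of that logic, the Python sketch below classifies an at-bat by its event and places each at-bat in one of three columns according to how many outs there were when it began. The event labels, the dictionary shape, and the function names are assumptions made for this example; the real API vocabulary is larger, and outs made on the bases in the middle of an at-bat are ignored here.

```python
from typing import Dict, List

# Hypothetical event labels, for illustration only.
HIT_EVENTS = {"Single", "Double", "Triple", "Home Run"}
ERROR_EVENTS = {"Field Error"}
# Outs recorded by each out-producing event (assumed values).
OUT_EVENTS = {"Strikeout": 1, "Groundout": 1, "Flyout": 1, "Double Play": 2}

def classify(event: str) -> str:
    """Map an at-bat's ending event to hit, error, out, or other."""
    if event in HIT_EVENTS:
        return "hit"
    if event in ERROR_EVENTS:
        return "error"
    if event in OUT_EVENTS:
        return "out"
    return "other"

def group_by_outs(at_bats: List[Dict[str, str]]) -> List[List[Dict[str, str]]]:
    """Put each at-bat in the column for the number of outs when it began."""
    columns: List[List[Dict[str, str]]] = [[], [], []]
    outs = 0
    for at_bat in at_bats:
        if outs >= 3:
            break  # the half-inning is over
        columns[outs].append(at_bat)
        outs += OUT_EVENTS.get(at_bat["event"], 0)
    return columns

half_inning = [{"event": "Single"}, {"event": "Double Play"}, {"event": "Strikeout"}]
columns = group_by_outs(half_inning)
```

In this made-up half-inning the double play takes the count from zero outs to two, so the middle column stays empty, which matches the empty columns described in the reading guide.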

We made some simplifications in this process: we reclassified Catcher Interference, Hit by Pitch, Intent Walk, and Fan Interference as Walks, and on the graph we show Walks in the same category as hits, though they are not scored that way. This helps the viewer see base-reaching events besides hits, which could lead to runs being scored.
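A simplification like this can be expressed as a small lookup. The sketch below is one possible way to write it in Python; the four reclassified event names come from the list above, while the hit labels and function names are assumptions for illustration.

```python
# Events we fold into the Walk category, as listed above.
RECLASSIFIED_AS_WALK = {
    "Catcher Interference",
    "Hit by Pitch",
    "Intent Walk",
    "Fan Interference",
}

# Hypothetical hit labels, for illustration only.
HIT_EVENTS = {"Single", "Double", "Triple", "Home Run"}

def normalize_event(event: str) -> str:
    """Collapse the reclassified events into a single Walk label."""
    return "Walk" if event in RECLASSIFIED_AS_WALK else event

def shown_with_hits(event: str) -> bool:
    """True if the event is drawn in the same category as hits on the graph."""
    normalized = normalize_event(event)
    return normalized == "Walk" or normalized in HIT_EVENTS
```

Keeping the mapping in one place makes it easy to revisit the decision later, for example to draw walks with their own symbol.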

Our data processing code on GitHub will be linked here soon.

What’s next for the Baseball Scorecard

Not all of the data behind the Scorecard is displayed in the static image. We also know who was pitching and batting and the speeds and types of pitches thrown, and we can easily calculate the number of players on base at any given point in the game. We are exploring ways to add interactivity that would allow the viewer access, through clicks or hovers, to this additional dimension of data, which would be overwhelming if it were shown all at once.

The MLB Gameday API also updates during a game, which means it’s possible that in the future, a user could see a live version of the Scorecard update in real time during a game. Our processing is virtually instantaneous, so we could provide live updates for multiple games simultaneously during the regular season.
