Catch the “perfect game” during this year’s World Series between the Astros and Phillies?
I’m not referring to Wednesday night’s no-hitter.
This perfect game had nothing to do with how well the pitchers threw the ball and everything to do with the umpire behind the plate during Game 2: Pat Hoberg. According to the website and Twitter bot Umpire Scorecards, Hoberg called every ball and strike correctly, an impressive feat.
Based on Umpire Scorecards’ data, which date back to 2015, it’s the first time this has happened in Major League Baseball.
The Umpire Scorecards website has nearly 1.2 million page views this year, and the @UmpScorecard Twitter account reached 300,000 followers Tuesday. Few people know the popular software — which scrapes MLB data and uses algorithms to determine the accuracy, consistency, and biases of umpire calls at the plate — was built by a 21-year-old Boston University junior.
“Not that it’s really a secret, but I just don’t talk about it that much,” Ethan Singer said from his dorm room in Boston, which he shares with seven others.
What began as a personal project in 2020 after Singer graduated from high school in Bethesda, Md., has become a widely cited source on baseball stats. A couple dozen professional baseball players follow the Twitter account, and its insights sometimes make the news.
“It sort of snowballed,” said Singer, who majors in statistics and computer science and minors in public policy.
All the data that feed into the algorithms come from MLB. The program Singer wrote captures just five of the 89 attributes the league publishes on every pitch: its horizontal and vertical position when it crosses the plate, the top and bottom of the strike zone (adjusted for a batter’s stance), and the umpire’s call.
When an umpire gets a call wrong, the accuracy score drops. Umpire Scorecards also generates more sophisticated stats, such as “impactful missed calls,” which identify bad calls that affected the game.
Every time a pitcher throws the ball, there are 288 possible “game states,” Singer said, a figure that accounts for the 12 potential pitch counts, how many outs there are, and which bases are occupied.
“Each one of those states is ascribed a run expectancy, which is just the average number of runs MLB teams have scored from that situation to the end of an inning since 2015,” Singer said.
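The arithmetic behind the 288 figure can be sketched in a few lines of Python. This is only an illustration, not Singer's code: the function names and the `inning_records` input format are assumptions, and the 12 pitch counts are taken to be the usual combinations of zero to three balls and zero to two strikes.

```python
from itertools import product

BALLS = range(4)    # 0 to 3 balls
STRIKES = range(3)  # 0 to 2 strikes, giving 4 * 3 = 12 pitch counts
OUTS = range(3)     # 0 to 2 outs
# First, second, and third base are each empty (False) or occupied (True).
BASES = list(product([False, True], repeat=3))  # 8 combinations


def all_game_states():
    """Enumerate every (balls, strikes, outs, bases) combination."""
    return list(product(BALLS, STRIKES, OUTS, BASES))


def run_expectancy(inning_records):
    """Average runs scored from each state to the end of the inning.

    inning_records is a hypothetical iterable of
    (state, runs_scored_to_end_of_inning) pairs.
    """
    totals, counts = {}, {}
    for state, runs in inning_records:
        totals[state] = totals.get(state, 0) + runs
        counts[state] = counts.get(state, 0) + 1
    return {state: totals[state] / counts[state] for state in totals}


print(len(all_game_states()))  # 12 counts * 3 outs * 8 base states
```

In practice the records would cover every pitch thrown since 2015; the averaging step is the same regardless of how many records there are.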
All of this information gets pulled from MLB the morning after a game. It then populates Singer’s graphic templates, which auto-post to Twitter.
In the early days, Singer said, his algorithms called every pitch inside the strike zone a strike and every pitch outside a ball. “Then we got some feedback from umpires that there’s error in the measurement systems, so it’s a little silly to use strict cut-offs,” he said.
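A strict cut-off of the kind Singer describes might look like the sketch below. It is a simplified illustration under stated assumptions: the `Pitch` field names are hypothetical, positions are assumed to be in feet, and the zone's width is taken to be the 17-inch plate with the ball's radius ignored.

```python
from dataclasses import dataclass

# Assumption: the zone is exactly as wide as the 17-inch plate (in feet),
# ignoring the radius of the ball.
HALF_PLATE_FT = 17 / 12 / 2


@dataclass
class Pitch:
    px: float            # horizontal position at the plate, feet from center
    pz: float            # vertical position at the plate, feet above ground
    sz_top: float        # top of this batter's strike zone, feet
    sz_bot: float        # bottom of this batter's strike zone, feet
    called_strike: bool  # the umpire's call


def in_zone(pitch):
    """Strict cut-off: inside the rectangle is a strike, anything else a ball."""
    return (abs(pitch.px) <= HALF_PLATE_FT
            and pitch.sz_bot <= pitch.pz <= pitch.sz_top)


def accuracy(pitches):
    """Share of calls that agree with the strict cut-off."""
    if not pitches:
        raise ValueError("need at least one called pitch")
    correct = sum(in_zone(p) == p.called_strike for p in pitches)
    return correct / len(pitches)
```

Under this rule, a pitch measured a fraction of an inch off the edge counts as a miss, which is the problem the umpires' feedback pointed to.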
To account for errors in MLB stadium camera tracking systems, Umpire Scorecards uses a mathematical technique to simulate the possible locations of a ball as it crosses the plate (there are around 500) and deems a call wrong only when it’s 90 percent sure.
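One way to picture the approach is a simple Monte Carlo simulation, sketched below. The article does not name the exact technique, so the bell-curve noise model, the `SIGMA_FT` value, and the function names are all assumptions made for illustration; only the roughly 500 simulated locations and the 90 percent threshold come from Singer's description.

```python
import random

HALF_PLATE_FT = 17 / 12 / 2  # half of the 17-inch plate, in feet
N_SAMPLES = 500              # "around 500" simulated locations
CONFIDENCE = 0.90            # flag a call only when 90 percent sure
SIGMA_FT = 0.04              # assumed tracking noise (about half an inch)


def strike_probability(px, pz, sz_top, sz_bot, rng,
                       sigma=SIGMA_FT, n=N_SAMPLES):
    """Fraction of simulated true locations that land inside the zone."""
    hits = 0
    for _ in range(n):
        # Assumption: measurement error is Gaussian and independent per axis.
        x = rng.gauss(px, sigma)
        z = rng.gauss(pz, sigma)
        if abs(x) <= HALF_PLATE_FT and sz_bot <= z <= sz_top:
            hits += 1
    return hits / n


def is_missed_call(called_strike, p_strike, confidence=CONFIDENCE):
    """A call is wrong only if the opposite call is at least 90% likely."""
    if called_strike:
        return (1 - p_strike) >= confidence
    return p_strike >= confidence


rng = random.Random(0)  # seeded so the sketch is repeatable
edge = strike_probability(HALF_PLATE_FT, 2.5, 3.5, 1.5, rng)
print(is_missed_call(True, edge), is_missed_call(False, edge))
```

A pitch measured right on the edge of the plate comes out as roughly a coin flip, so neither a strike call nor a ball call gets flagged as a miss.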
Other programs generate stats on umpires. But Singer said MLB’s Baseball Savant, as well as stats produced by ESPN, does not account for how the strike zone moves up and down with each batter’s height and stance.
Umpire Scorecards maps the strike zone for every pitch and focuses on the ball’s position relative to that area, rather than the ball’s location in space when it passes the plate, Singer said.
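The difference between absolute and relative position is easy to show with a small calculation. The sketch below is illustrative only: the function name and the zone measurements for the two batters are made up, and the vertical scale (0 at the bottom of the zone, 1 at the top) is one possible convention, not necessarily the one Umpire Scorecards uses.

```python
def relative_height(pz, sz_top, sz_bot):
    """Rescale a pitch's height so 0.0 is the zone's bottom and 1.0 its top."""
    if sz_top <= sz_bot:
        raise ValueError("top of the zone must be above the bottom")
    return (pz - sz_bot) / (sz_top - sz_bot)


# The same pitch, 3.4 feet off the ground, against two hypothetical batters.
taller = relative_height(3.4, sz_top=3.7, sz_bot=1.7)
shorter = relative_height(3.4, sz_top=3.2, sz_bot=1.4)

# Values between 0 and 1 are inside the zone vertically; above 1 is high.
print(round(taller, 2), round(shorter, 2))
```

The pitch sits inside the taller batter's zone but above the shorter batter's, even though its location in space is identical.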
While some Twitter bots post daily videos of missed calls, Singer said people follow Umpire Scorecards because it publishes the same data on every game played, rather than curating big moments. The website also allows people to filter through games, umpires, and teams, so it’s become a research tool, he said.
These days, Umpire Scorecards is a fully automated program running on a Google Cloud server. (Singer used to keep a laptop on top of the microwave in his dorm and run a Python program every morning before heading out to class.)
Singer said he’s in the process of setting up Umpire Scorecards as an official business entity. He runs ads on the website and collects Venmo donations. “It’s not an income, but it’s not nothing,” he said.
As for the future, Singer has no plans to turn his idea into something bigger. Ultimately, he wants to pursue a career in public policy.
“Baseball analytics is not my life,” he said.