SMRs and AMRs

Saturday, November 17, 2012

Everything went wrong, but at the end of the day it turned out right


Three members of Obama's tech team, from left to right: Harper Reed, Dylan Richard, and Mark Trammell (Photo by Daniel X. O'Neil).

When the Nerds Go Marching In
By Alexis Madrigal, The Atlantic

How a dream team of engineers from Facebook, Twitter, and Google built the software that drove Barack Obama's reelection

The Obama campaign's technologists were tense and tired. It was game day and everything was going wrong.

Josh Thayer, the lead engineer of Narwhal, had just been informed that they'd lost another one of the services powering their software. That was bad: Narwhal was the code name for the data platform that underpinned the campaign and let it track voters and volunteers. If it broke, so would everything else.

They were talking with people at Amazon Web Services, but all they knew was that they had packet loss. Earlier that day, they lost their databases, their East Coast servers, and their memcache clusters. Thayer was ready to kill Nick Hatch, a DevOps engineer who was the official bearer of bad news. Another of their vendors, PalominoDB, was fixing databases, but needed to rebuild the replicas. It was going to take time, Hatch said. They didn't have time.

They'd been working 14-hour days, six or seven days a week, trying to reelect the president, and now everything had been broken at just the wrong time. It was like someone had written a Murphy's Law algorithm and deployed it at scale.

(More here.)
