
How I made atariemailarchive.org

An artist's rendition of the author reading Jed Margolin's emails on the NYC subway. Credit: katiebcartoons.com
🥳
I've published the data behind atariemailarchive.org under a Creative Commons license. You can find it on GitHub.
🎙️
Update (March 29, 2023): You can listen to me talk about atariemailarchive.org on episode 5 of Data is Plural's inaugural podcast season.

In 2015 I discovered Jed Margolin's emails from his tenure as a hardware engineer at Atari: 4,128 messages he sent and received between 1983 and 1992.

... and then I read them all during my subway commute to work for the next several months.

This is what it looks like reading Jed's emails from his website on an iPhone. Not pictured: being jostled around on a crowded train commute while reading it.

Not all four thousand messages in Jed's inbox are captivating. I'll admit I glossed over a lot of the 1989 project status reports for Stun Runner. But they were collectively fascinating. I kept finding gems.

I loved all the day-to-day mundanity too: coworkers throwing wee jokes into their status reports, minutes from engineering meetings, Jed's constant displeasure with Bob Frye from facilities. (Why are there always ants in the cafeteria, Bob?!)

Technology's changed since the 80's and that came across loudly when I read Jed's emails. But all the human interaction and how engineers built technology and did creative work felt the same. Reading Jed's emails was relatable and comforting.

In 2017, two years after I discovered Jed's emails, I quit my job and took some time off. That spring I contacted Jed for permission to work on what would later become atariemailarchive.org.

Hi, Vikram.

Yes, I am ok with it (as long as you do not use it for any nefarious
purposes). :-)

I don't think there is any legal stuff to do.

And thank you. I don't want this stuff to disappear when I am gone. (I am
old and decrepit.)

Regards,

Jed

Confident that I met Jed's stringent criteria not to weaponize his publicly-available professional emails from the 80's, I began work.

I contacted Jed on May 15th and released atariemailarchive.org on December 15th. I worked on the project on and off over seven months.

atariemailarchive.org allows folks to enjoy Jed's emails without obsessively reading everything like I did. It gives readers threaded conversations, curation with a "best of" list, and an easier-to-read interface than a giant text file.

But that's not how it started. I built a version and wrote a bunch of code that I scrapped before I built what you see on atariemailarchive.org today.

The version that never saw the light of day

My goal with atariemailarchive.org is simple: I want it to be able to hook you during your lunch hour.

As a proxy to test if you might get hooked, I just want any indication that you are consumed by atariemailarchive.org when I tell you about it. If you tap around, chuckle a lot, and start ignoring me, that is sufficient.

By this rigorous and scientific measure, the first version I built failed spectacularly. Readers got bored quickly.

I had initially built a search interface and an easier way to read and navigate all of Jed's individual messages. When I showed it to folks, they tapped around, searched for jobs and wozniak (neither appears in the archive), and then asked me what they should look for.

If you aspire to curate someone's public emails some day, here's a tip: search is an awful way to curate text when your readers don't know what is interesting about it. This is obvious in retrospect, but it didn't occur to me until I showed what I built to folks.

For example, a lot of folks have created neat search interfaces for the Enron emails. I know there are fascinating emails in that archive, yet I've barely read any of them. I don't know where to look.

I realize it is significantly harder given that the Enron archive contains 500,000 messages, but I hope someone creates something like atariemailarchive.org for the Enron emails some day, somehow.

I know I'd spend my lunch hour reading it.

The version that did see the light of day

I realized that if I wanted folks to see what I saw in the archive, I'd need to guide them directly to it. atariemailarchive.org would need to resemble the emails I sent to friends and coworkers when I wanted to share fun messages I found in Jed's inbox.

That's what led me to do three things:

  1. I created conversation threads from every message in the archive.
  2. I assembled a "best of" list with some commentary.
  3. I tagged threads.

I actually tried #2 and #3 on Jed's individual messages first to test it out. It worked. Folks tapped around and chuckled a lot. They told me they wanted more.

But when I tried to tag more messages and add them to the "best of" list, I quickly realized how many messages existed in a larger thread, and how important that context was for all the little discoveries I made to truly hit. You can see this easily if you click around on atariemailarchive.org today.

Threads did not exist in the 80's. Jed's inbox was just a bunch of email ordered chronologically. So I bit the bullet and created conversation threads from all 4,128 of Jed's messages.

As you can imagine, this was tedious. It was the most time-consuming part of this project. I was living in Seattle at the time and I'd like to thank the folks at Porchlight Coffee for letting me hang out there over many, many afternoons while I manually threaded all of Jed's emails.

What's next for the archive?

I decided to publish the dataset I created for atariemailarchive.org under a Creative Commons license today. You can find it on GitHub.

I would be thrilled to see people do interesting things with it. If you do, please let me know.

People use the archive in different ways. Occasionally readers will link to messages that help them in some investigation or research they're doing. Here's one of my favorites. Jed received a price list for custom parts made for Atari hardware, and the person who authored this tweet found those parts in a breakdown of an arcade cabinet. You can see the part IDs match in the image and the email.

It has been a joy to see folks link to and enjoy the archive and I hope to see more of that. I get messages of appreciation every now and then. I love receiving those. It validates all the time I spent on this bizarre obsession.

Technical notes for those interested

atariemailarchive.org began its life as a Python Pyramid app with a SQLite back-end in 2017. It ran on a $5/month Linode instance until early September 2022.

This week I ported it to Python Flask and deployed it to Fly.io. The app is trivial and porting it to Python Flask took two afternoons. Deploying atariemailarchive.org to Fly.io took very little time. Fly.io was a joy to use, and it will cost me nothing to host the archive there.

The dataset on GitHub is in a SQLite file. It contains structured, threaded messages that I parsed and assembled from text files on Jed's site. It does not contain any of the curation and editorialization on atariemailarchive.org.
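
If you want to poke at the dataset, a reasonable first step is to list its tables and columns before writing any queries. Here is a minimal sketch using Python's built-in sqlite3 module. It assumes nothing about the schema, which is not described here. The function name describe_sqlite is a placeholder for illustration, not something from the repository.

```python
import sqlite3
from contextlib import closing


def describe_sqlite(path):
    """Return {table_name: [column_name, ...]} for each table in a SQLite file.

    Hypothetical helper for illustration; it makes no assumption about
    which tables or columns the dataset actually contains.
    """
    with closing(sqlite3.connect(path)) as conn:
        # sqlite_master lists every object in the file; skip SQLite's
        # own internal tables, whose names start with "sqlite_".
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # Quote the table name so unusual names do not break the PRAGMA.
            quoted = '"' + table.replace('"', '""') + '"'
            # PRAGMA table_info returns one row per column; index 1 is its name.
            columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
```

Call describe_sqlite with the path to the SQLite file you downloaded and print the result to see what there is to query. Note that sqlite3.connect creates an empty file if the path does not exist, so an empty result usually means the path is wrong.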

Thanks to Katie Brookoff.

Questions? Comments? Feedback? Please send me a note!

Email me at voberoi@gmail.com or holler at me on Twitter or Threads.