Internet Archive

From RobLa

The Internet Archive is a multi-petabyte repository of all sorts of crazy things. The Wayback Machine lives there, but there's a lot of other stuff too.

Petabox

Wikipedia has a really outdated article about the Petabox. It's what folks at the Internet Archive call the racks and racks of machines that serve as a giant distributed cluster, which (IIRC) holds over 50 petabytes of stuff that people have uploaded to https://archive.org.

Wayback Machine

main article: Wayback Machine

The Wayback Machine runs on top of the Petabox. It's confusing. I never figured it out.

2019

main article: 2019

I worked at the Internet Archive back in 2019. It was quite the experience.

Here's the feed of all the official blog posts that I wrote when I was at Internet Archive:

You probably only see one blog post.[1] That's because it's all I was able to publish. Jonah Edwards microblogged about it.[2] I wanted to write a follow-up, but I didn't get the opportunity. Writing the kind of follow-up I envisioned would have involved getting much more of Jonah's time than I was able to get, and he was way too busy just trying to keep the Internet Archive's Petabox running. He was also making wifi work in the woods.[3] The Internet Archive has a lot going on.

I blogged about leaving the Internet Archive.[4]

Links

Footnotes