How to own your data
The internet was once pitched as an archive, a total codex of human knowledge. In reality, nothing on the internet is really trustworthy. The whole thing is about as messy and shambolic as the rest of the planet. Stuff is hard to find, or it goes away entirely.
The internet is best for finding the thing, but keeping that thing is on you. One need only parse Gwern’s list of internet search tips to see how ephemeral things are. “Archives” are not really. “Caches” die. That means the onus is on you, the individual, to keep your own archive of what matters to you. There is nothing more reliable than redundant backups that are controlled by you.
Here’s what I recommend everyone do:
- Buy disk. Disk is cheap, and it’s always getting cheaper. I use SSDs, but even platter drives work. They are more prone to failure, so they work best if you have a redundant backup.
- Own your music. Buy albums. Own MP3s or FLACs. Transfer them manually to the players of your choice.
- Own your video. Use yt-dlp to archive videos. For the rest, there are trackers.
- Own your photos. Create a separate backup from your existing photo libraries on first-party services. Keep your backups somewhere safe.
- Own your books. Since there are no truly open commercial ebook standards, most of what you want will be on Libgen. Save everything as PDF or EPUB.
- Do everything in plain text. All other file formats will degrade. Plain text will not. My bookmarks file is in plain text. All of my notes are in plain text. This text is in plain text. Markdown works because it is plain text. Note that plain text locked into a proprietary platform does not count as plain text. It has to exist as an actual plain text file on a disk you control.
- Save pages. Did you find something cool? Great, now you get to save it locally as a full archive. The best browser for doing this, one which preserves the imagery and overall layout, is Firefox. One of the best bookmarking engines for doing this turned out to be run by a fairly problematic person. Don’t get stuck depending on someone else’s service.
- Save your work. Many large tech companies allow you to download comprehensive archives of your data. Use them, often. (I recommend every month, but quarterly is fine, too, I guess.)
- Make backups of everything. I use Backblaze for offsite backups. I keep multiple copies of the important stuff.
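If you want to check that a redundant copy really matches the original, something like the following Python sketch could work. It is only an illustration, not a replacement for a real backup tool: the function names (`sha256_of`, `build_manifest`, `find_mismatches`, `mirror_and_verify`) and the paths in the usage note are made up for this example, and it assumes Python 3.9 or newer with nothing beyond the standard library.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(source: Path, backup: Path) -> list[str]:
    """List files in source that are missing from backup or differ in content."""
    original = build_manifest(source)
    copied = build_manifest(backup)
    return [name for name, checksum in original.items()
            if copied.get(name) != checksum]


def mirror_and_verify(source: Path, backup: Path) -> list[str]:
    """Copy source into backup, then report anything that did not copy cleanly."""
    # dirs_exist_ok lets repeated runs refresh an existing backup folder.
    shutil.copytree(source, backup, dirs_exist_ok=True)
    return find_mismatches(source, backup)
```

A call such as `mirror_and_verify(Path("archive"), Path("/mnt/backup/archive"))` (both paths hypothetical) returns an empty list when every file copied intact. Running `find_mismatches` again months later is a cheap way to notice a copy that has quietly gone bad. Note that it only checks files present in the source, so extra files in the backup are ignored.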
I’m sure many of you are groaning about all of this, but I personally believe it is the easiest way to ensure that your future self will have access to your past. What you put in front of yourself defines you. So I believe that you cannot start any of this soon enough – especially now, when so many platforms have proven that they are no longer your friend.