General, minimal principles





How-To Geek's "What's the Best Way to Back Up My Computer?"
Eric Griffith's "The Beginner's Guide to PC Backup"
/r/techsupport's "backuptools wiki"





Data you could back up





Good idea to save snapshots of disk configuration into files, and back up those files, so you can rebuild the configuration of your system if necessary. Maybe a (Linux) script containing:

sudo blkid | grep -v squashfs >saved.blkid.txt
cp /etc/fstab saved.etcfstab.txt
lsblk --fs --list --paths >saved.lsblk.txt
sudo fdisk --list >saved.fdisk.list.txt
sudo inxi -Fmpx >saved.inxi.txt
tar --create --file saved.dot-ecryptfs.tar ~/.ecryptfs/*
# .ecryptfs files let you use ecryptfs-recover-private to recover access

Good idea to save browser things such as bookmarks, settings of "trained" browser add-ons (such as uBlock Origin, uMatrix, Privacy Badger, CanvasBlocker), digital certificates, into files and back those up. Also export RSS feed subscriptions out of email client or RSS reader to a file that will get backed up (but list of feeds may not be enough, you may want the whole database that shows which items you have/haven't read).



Have backups, don't just keep your data online:

From DrStephenPoop on reddit:
> BACK UP YOUR DATA

And not just what's on your hard drive.

Do not trust the cloud!

Google recently ended my account for an unidentified TOS violation. I am not sure what I did. I just logged into gmail one day and instead of an inbox I saw a message saying my account had been disabled. I lost:

8 years of email contacts

6 years of favorited YouTube videos

About a dozen videos I made with my brother that were uploaded to YouTube.

All my Drive/Doc files including original writing.

My passwords to several sites, including banking and insurance sites.

Three albums I had purchased from Google Play.

Here's the kicker: I was a google believer. I am one of the 5 or so non-developers who actually owns a first generation Chromebook. I believed in the cloud!

Use and enjoy Google's services, but do NOT rely on them. Even though you buy their computers and purchase music from them, you are STILL not the consumer with google. You are the product (sold to advertisers). So when you are shut out from their garden, you have no customer service to appeal to, or to even find out why you got tossed. You might as well be staring at an angel with a flaming sword, wondering where your pants are.

> Didn't you contact Support ?

When you get the "your account has been disabled" screen, they give you a link to voice your grievance. After submitting, you get a message that says something to the effect of: "If we find we have reason to contact you, we will contact you."

You can also go the community forums and plead for help. Sometimes someone associated with google will actually say: "I'll have people take a look at this." In all my pleas, I never got a response. That is as far as support goes. You are not a customer. You are the product, and you are merely a commodity. Have you ever heard of "commodity support"?
Tienlon Ho's "Can You Live Without Google?"
Gonzalo Sainz Trapaga's "A new and innovative way for Google to kill your SaaS startup"

From someone on reddit:
A few days ago my Facebook account was disabled suddenly and without warning. I've gone through what I thought was a fairly routine appeals process - filled in the form they link you to when you try to log in and included a scan of my photo ID as they requested to prove I'm a real person etc. However, I just received an email from Facebook saying the following:

> ... Upon investigation, we have determined that you
> are ineligible to use Facebook. ... Unfortunately, for
> safety and security reasons, we cannot provide
> additional information as to why your account
> was disabled. This decision is final. ...

This is really bizarre and quite upsetting - it's easy to forget just how much we rely on this service. If I can't get my account reactivated, that's six years of content (and memories) lost, and a huge blow to my ability to keep in contact with some friends and family.

The only possible reason I can think of for my account being disabled is what I was doing at the time - sending some photos to someone through the private messaging system. Some of the photos were (mildly) adult in nature (at her request!) which could be deemed a breach of the Community Standards if you look at it in strict black and white terms ("Facebook has a strict policy against the sharing of pornographic content"). However I can't bring myself to believe that there is someone monitoring private message attachments and instantly banning people if they see boobs. Beyond that, I genuinely can't conceive of a reason as to why my account was singled out for anything.

Any advice would be appreciated as to what I should do next - I am not yet willing to just give up and lose all of that content. I have replied to the email, though I doubt anyone will read it, but beyond that there's really no other contact options I can see, and Googling this problem does not produce much beyond more horror stories like this.

From /u/sugarbreach on reddit:
I am writing this to warn Google users to back up their data, and to realize that everything you take for granted can be taken away in an instant.

About a week ago I attempted to log into my Gmail account and was greeted with a page saying my account was disabled. It says that it was disabled due to a perceived violation of the terms of service and product specific polices. I have read and reread the google terms of service, and I know I haven't done anything to violate them. The only possibility I can think of is that someone may have hacked into my account. I have been an enthusiastic gmail user since it first came out in beta, and you had to be invited to get an account. I have relied on google apps to make my life easier. I have filled in their account recovery form, and even tried calling members of the Gmail team, but have had no luck. I also have posted on the gmail help forum, but an expert there said he contacted google and there was nothing he could do and google wouldn't tell him anything "for privacy reasons".

This has created the ultimate real-life nightmare, and has turned my life upside down, a few examples of which are listed below.

All of my contacts were linked to this account. I now do not have access to emails, phone numbers, addresses, etc.

My google voice telephone number is no longer working. I had this phone number on my business cards and email signature, and now when someone dials the number, they are given an error recording. "We could not complete your call, please try again".

My youtube account with many videos I cherished of my children are now gone.

I have all of my photos backed up to the account for nearly my entire life, as I thought this was the safest place to keep them (the cloud!) I have photos of my beloved grandparents who have since passed away, and the thought that I can no longer access these photos makes me sick. I also have thousands of pictures from vacations and of my children that I fear are gone forever.

A nice chromebook that I purchased to access all of the google apps is now almost useless since my account has been disabled.

I have multiple documents in my google drive that I have spent hours of work on, and can no longer access them.

I placed an enormous amount of faith and trust into google's products and services, as millions of people have worldwide. It is a shame that something this important in someone's life cannot even warrant a response from a live person at Google.

I have been very depressed because my entire life was encased in google's products, and now everything is gone.

Again, I am writing this to warn others that this can happen to anyone at any time, so it would be wise to back up treasured items in your google account. Ironically, google provides the means to do this through their "takeout" app, which I did not learn about until after my account was disabled. If there is anyone out there reading this that can offer any guidance for getting my account reinstated, I would sure appreciate it!

Jon Christian's "Deleting the Family Tree"
DanDeals' "PSA: Don't Mess With The Google!"
Alex Hern's "Pixel phone resellers banned from using Google accounts"
"A few reasons not to organise on Facebook"
Killed by Google

Matthew Miller's "SIM swap horror story: I've lost decades of data and Google won't lift a finger"
David Murphy's "I Lost Nine Years of Photos by Locking Myself Out of My Google Account"
Leo Notenboom's "A One-step Way to Lose Your Account ... Forever"

Paraphrased from someone on reddit 11/2019:
"As a prank, a friend changed the name of our WhatsApp group to something obscene. WhatsApp then banned the group and the accounts of everyone in the group ! My account has been banned !"
[Related: don't let unknown people add you to groups; you could get suspended or banned for being added to a malicious group. In Android app, relevant setting is Settings / Account / Privacy / Groups.]

Paraphrased from someone on reddit 12/2019:
"My Facebook account got banned (maybe for creating two accounts ?), and then a week later my WhatsApp account got banned too, I assume because my Facebook account got banned."

New Google TOS quoted by someone on reddit 11/2020:
"If your account is inactive in Gmail, Drive or Photos for more than two years, Google 'may' delete the content in that product. So if you use Gmail but don't use Photos for two years because you use another service, Google may delete any old photos you had stored there. And if you stay over your storage limit for two years, Google 'may delete your content across Gmail, Drive and Photos.'"

I've heard that some photo-storage sites feel free to downgrade the resolution of large photos, assuming that human viewers won't be able to see the difference. Know the exact policy of the site before using it, if this matters to you.



If you lose a cloud account, you can lose stored data, your calendar, remaining time on a subscription, any accumulated credit or "reputation" or gift cards, network link that makes some device (such as Amazon Echo, Google Home, etc) work, playlists, contact list, media you had bought or stored there, etc.

Do NOT use Facebook or Google or Apple or Microsoft as your login to lots of other web sites. Not only does it let your activity get shared with that provider, but if the provider ever deactivates your account for some reason, you've lost access to those other sites too.

Do NOT use Google's online password manager (holding passwords you've saved in Chrome or Android). If Google ever deactivates your account for some reason, maybe you've lost access to those other sites too, I'm not sure.

Do NOT use Facebook or Google or Pinterest or Amazon or etc as the sole, critical host of your business, if you can avoid it. They give the "appearance of ownership", but in fact you do not own the platform, you have "digital tenancy". If the service ever deactivates your account for some reason, your business is dead. And content you write on them (in FB Pages, Amazon items for sale, etc) probably is in a non-standard format and hard to move to elsewhere. If you absolutely must use such a service as your critical host, plan for the possibility that they may drop you. Keep backups, have a separate web site and email, have pages on other services, etc.

Do NOT rely on a high page-rank in Facebook or Google, or a high reputation rating in Amazon or iTunes or YouTube or AirBNB or Yelp or something, as the critical asset of your business, if you can avoid it. The algorithms behind those can change at any time. A couple of bad reviews from users can harm you greatly.

Do NOT use a free email account supplied by your ISP or cell-phone service provider. If you ever change service provider for some reason, you may lose that email account.

Maybe some people don't consider their email/messenger to be "cloud data", but it is. If you're saving 10 years of past messages in GMail or WhatsApp or something, it may be valuable to you, and it may be used or deleted by a hacker if your account gets hacked. It also may be hard to back up, and may be hard to move to elsewhere. I'm a big believer in keeping your email account as close to empty as feasible. Clean it out !

If you're running a business on a cloud service (Facebook, eBay, Shopify, Etsy, GMail, Amazon, AirBNB, etc), back up your data. The service may or may not be backing it up for you. Even if they are backing it up, getting it restored may take a while. And if they turn off your account for some reason, you need that data so you can move to another platform and continue to serve your customers. These services give the "appearance of ownership", but in fact you do not own the platform, you have "digital tenancy". If there's a way to use a custom domain name that you own, that's safer than using one provided by the service: if the service fails then you can make the domain name point to some new server. Same is true of a phone number, especially a VOIP number: you don't really own it, the provider owns it, and you can lose the number through disuse or failure to pay or some other mishap.

Do you actually "own" the things you think you own ? If a friend set up your domain registration or email account for you, is it in their name or yours ? If an employee administers the company email accounts on GMail, is the employee's personal account the only administrator for the whole company ? If someone gave you a used computer or phone or something, whose name is on any accounts or subscriptions associated with it ? If your relationship with your spouse or partner is failing, whose name is registered as the owner of various accounts ?

If you do lose access to something important, be wary of threats in search results. Lots of sites have been set up to provide "Facebook Support phone number" or "Unlock your banned WhatsApp account" or similar in search-engine results. But these big vendors with free services (Google, Facebook, WhatsApp, etc) deliberately do not HAVE a phone support number you can call. They have hundreds of millions or billions of free users; the LAST thing they want is for users to be able to call humans at their company. Any search result that gives you such a phone number is trying to connect you to a scammer. At best, they'll try to sell you something. At worst, they'll install ransomware, steal your money, and sell your information.



Other things to back up:

Do "backups" of old non-electronic data, such as family photos and diplomas and such. Scan them and back up the images.

From Justin Carroll on an ITRH podcast:
Kinds of information (for you and everyone in family, and pets) you should have backed up and available (carry with you) in event of a disaster:
  • Biographical (driver's license, passport, birth certificate, wedding license, divorce decree, firearm licenses, military history, etc).
  • Medical records (prescriptions, vaccination record, test results, etc).
  • Ownership and Financial records (titles of house, vehicles, insurance policies, bank accounts and statements, photos and info of expensive items, credit reports).
  • Other (family photos, etc).
Lisa Rowan's "Keep These Financial Records in Your 'Go Bag'"

Do a "backup" of your own memory: in a simple text file, write a summary autobiography. Dates and places you lived, went to school, worked, traveled, etc. Names of friends, roommates, coworkers, etc. Memory fades over time.



You don't have to back up everything. Consider what you're willing to lose. For example, I don't back up my operating system or applications. I can re-install them easily from the standard places.





Destinations to back up to





Backups to the cloud:

If you do backups to the cloud, don't leave those backups accessible from your machine via a "cloud drive" that is always mounted (shows up as drive H: or something). If you get hit by malware, it may affect files on all accessible drives, including your backups in the cloud.

Apparently, automatic cloud backups of your phone data can expire and be deleted if you don't use your phone for many months. Android backups in Google Drive Backup are deleted if you don't use the phone for 2 months ? iPhone backups in iCloud are deleted if the iCloud account is not used for 6 months ?

A factor to consider: today's cloud backup may be encrypted so well that no one can crack it. But that encrypted data may still be available somewhere in the cloud 20 years from now, and maybe 20-years-future technology WILL be able to crack today's encryption.

Eric Griffith's "Back Up Your Cloud: How to Download All Your Data"
Adam Dachis's "How to Protect Your Data in the Event of a Webapp Shutdown"



Note: a Btrfs or ZFS snapshot stored on the same disk as the original data is not a backup. If the disk fails, you lose the data and the snapshot.

Note: many forms of RAID are not backups. If you accidentally delete data, that data is deleted from everywhere in the RAID.





Ways to manage the backup process



Type of backups:

  • Full image: a block-level copy of the whole raw disk/partition contents.

    Good: Everything is copied, even hidden data and bootloader etc. Restoring is simple, everything gets copied back in one operation.

    Bad: Takes a lot of space and time, even if few files have changed. Restoring is an all-or-nothing operation, you can't restore just part of the system.

  • Full file-level: all files under some directory are copied, as files.

    Good: Simple and clear. You can do it manually if you wish. You can see and test what you did. Restoring is flexible, you can copy back only the files you need. Uses only standard default OS applications.

    Bad: Takes a lot of space and time, even if few files have changed. May miss hidden stuff or bootloader. Will miss partition table.

  • Incremental file-level: changed files under some directory are copied, as files, and stored as deltas from previous versions of the files.

    Good: Fast and space-efficient. Restoring is flexible, you can copy back only the files you need.

    Bad: Something has to decide what files have changed, which is tricky and maybe slow. May miss hidden stuff or bootloader. Restoring requires specific software, usually.

  • Incremental filesystem-level: filesystem marks "snapshot" points, and keeps track of state of everything at each snapshot.

    Good: Fast and space-efficient. Every detail of filesystem can be saved/restored.

    Bad: You have to decide when to take a snapshot, and you may forget, or do it at inappropriate times. I think all things that have changed will be saved/restored; no detailed control. Not supported by older filesystems. A newish feature, so may have bugs.
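
A rough sketch of the difference between a full and an incremental file-level backup, using GNU tar's --listed-incremental option. Assumptions: GNU tar is installed, and every directory and file name below is a throwaway example inside a temp dir, not anything from a real system.

```shell
#!/usr/bin/env bash
# Sketch: full and incremental file-level backups with GNU tar.
# Assumes GNU tar (--listed-incremental); all paths are throwaway examples.

work=$(mktemp -d)
mkdir -p "$work/source" "$work/backups" "$work/restore"
echo "first version" > "$work/source/notes.txt"
echo "never changes" > "$work/source/static.txt"

# Level 0 (full) backup; tar records file state in the snapshot file.
tar --create --file "$work/backups/full.tar" \
    --listed-incremental "$work/backups/state.snar" \
    --directory "$work/source" .

# Change one file, then take an incremental backup against the same snapshot file.
sleep 1
echo "second version" > "$work/source/notes.txt"
tar --create --file "$work/backups/incr1.tar" \
    --listed-incremental "$work/backups/state.snar" \
    --directory "$work/source" .

# Show what the incremental archive holds (the changed file, not static.txt).
tar --list --file "$work/backups/incr1.tar"

# Restore: extract the full archive, then each incremental in order.
tar --extract --file "$work/backups/full.tar" \
    --listed-incremental /dev/null --directory "$work/restore"
tar --extract --file "$work/backups/incr1.tar" \
    --listed-incremental /dev/null --directory "$work/restore"
cat "$work/restore/notes.txt"
```

This shows the "restoring requires specific software" trade-off: the incremental archive is small, but it is only useful together with the full archive and a tool that understands both.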




In incremental file-level backup, how to decide if a file has "changed":

One or more of:
  • Metadata:
    • File size.
      (Danger: some files such as encrypted container files never change size.)
    • Modified time.
      (Danger: some database or encryption apps may not change modified time.)
    • Inode number.
    • Access permissions / ownership.


  • Contents:
    • Hash.
    • Byte-by-byte comparison.
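
The checks listed above can be sketched as small shell functions. This is only an illustration, assuming GNU coreutils (stat -c, sha256sum) and cmp; the function names are made up here and are not taken from any backup tool. The demo builds two files with the same size and modified-time but different contents, which is the case that fools a metadata-only check.

```shell
#!/usr/bin/env bash
# Sketch: three ways to decide whether a file differs from its backup copy.
# Assumes GNU coreutils (stat -c, sha256sum) and cmp; function names are made up.

# Metadata check: compare size and modified-time only (fast, can be fooled).
changed_by_metadata() {
    local src=$1 dst=$2
    [ "$(stat -c '%s %Y' "$src")" != "$(stat -c '%s %Y' "$dst")" ]
}

# Content check via hash: reads both files completely.
changed_by_hash() {
    local src=$1 dst=$2
    [ "$(sha256sum < "$src")" != "$(sha256sum < "$dst")" ]
}

# Content check via byte-by-byte comparison.
changed_by_bytes() {
    ! cmp -s "$1" "$2"
}

# Demo: same size, same modified-time, different contents.
demo=$(mktemp -d)
printf 'AAAA' > "$demo/original"
printf 'BBBB' > "$demo/backup"
touch -r "$demo/original" "$demo/backup"   # copy the timestamp across

changed_by_metadata "$demo/original" "$demo/backup" \
    && echo "metadata: changed" || echo "metadata: unchanged"
changed_by_hash "$demo/original" "$demo/backup" \
    && echo "hash: changed" || echo "hash: unchanged"
changed_by_bytes "$demo/original" "$demo/backup" \
    && echo "bytes: changed" || echo "bytes: unchanged"
```

The metadata check is cheap because it never opens the file; the two content checks have to read every byte, which is why tools usually fall back to them only on request.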





Ways to do the backup:

  • Copy files across manually, using Windows Explorer or similar.

  • Use general-purpose backup software.

    Some choices:
    • rclone (CLI only).
      This does an incremental file-level backup. What gets used to decide if a file has changed varies by type of destination. For example, with Mega, rclone only copies a file to Mega if the file size has changed. But locally, rclone also can filter via modified-time.

      
      # Installed it through Ubuntu's software store; it's a Snap.
      rclone --version	# v1.36
      rclone --help
      
      # https://rclone.org/mega/
      rclone config
      # type n for "new remote"
      # give it a name of "remote"
      # type number for MEGA
      # WHOOPS: this version doesn't support MEGA
      snap remove rclone
      
      # same with deb in Ubuntu MATE 20.04 repo
      # same with deb in Kubuntu 20.10 repo 12/2020, version 1.50.2
      
      
      # Go to https://rclone.org/downloads/ and download the "Intel/AMD - 64 Bit - deb"
      # Double-click on .deb file and install.
      rclone --version	# v1.52.n or better
      
      # https://rclone.org/mega/
      rclone config
      # type n for "new remote"
      # give it a name of "remote"
      # type number for MEGA (21 ?)
      # type username of your MEGA acct
      # type password of your MEGA acct
      # type n to not edit advanced config
      # type s to set configuration password
      # type a to add password
      # best to use MEGA acct password ?
      # type q to quit to main menu
      # type q to quit config
      
      # list top-level dirs
      rclone lsd remote:
      # FAIL: have to turn off 2FA on MEGA account
      
      # list top-level dirs with modified times
      rclone lsd remote:
      # list top-level dir names only
      rclone lsf --dirs-only remote: | sort
      # list all dir names only, recursively
      rclone lsf --dirs-only -R remote: | sort
      
      # show quota information about remote account
      rclone about remote:
      
      # Copy changed files from source to remote.
      # Recursive down dirtree.
      # Copies if contents have changed, not if only timestamp has changed.
      rclone sync ~/MYDIRNAME remote:MYDIRNAME --progress
      
      # Apparently rclone has no way to read its commands from a file.
      # But if doing multiple commands, set variable RCLONE_CONFIG_PASS
      # to avoid getting prompted for password multiple times.
      

      Simplified version of bash script I use:
      
      #!/usr/bin/env bash
      
      if [ -b /dev/mapper/veracrypt1 ] || [ -b /dev/mapper/veracrypt2 ]
      then
      	ls /dev/mapper/veracrypt*
      	echo Dismount VeraCrypt volumes before doing backup
      	read -p 'OK? '
      	exit
      fi
      
      export RCLONE_CONFIG_PASS
      read -rsp "What is rclone password ? " RCLONE_CONFIG_PASS
      echo	# silent read leaves the cursor on the prompt line
      
      set -o verbose
      
      rclone lsd remote:
      
      rclone copy ~/DIR1 remote:DIR1 --progress
      rclone copy ~/DIR2 remote:DIR2 --progress
      
      rclone about remote:
      
      set +o verbose
      
      read -p 'OK? '
      

      Test your script VERY carefully. I found this MAJOR issue with using rclone with Mega.nz: "Mega does not support modification times or hashes yet." from https://rclone.org/mega/ , which means rclone only copies a file to Mega if the file size has changed. Some files (e.g. VeraCrypt containers) never change size; for other files some edits may leave the sizes unchanged. You may think these files are being backed up, but they're not. [Someone said "apparently mega encrypts all files locally before uploading them, which makes hashing not work".]

      Adding "--no-check-dest" will force copying always. But there seems to be no way to check for changed hashes and modified-times only on the client side. Maybe use "--max-age 7d" and then do a backup at least once a week ?

      Also, there is a distinction between "sync" and "copy": if you rename a file from A to B, with "sync" you will end up with only file B on the backup, but with "copy" you will end up with both files A and B on the backup.
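
To make the "sync" versus "copy" distinction concrete, here is an illustration that mimics the two behaviours with plain cp and find. It is NOT rclone and does not call rclone; the function names copy_like and sync_like are made up for this example, and everything runs in a temp dir.

```shell
#!/usr/bin/env bash
# Illustration only: mimics "copy" vs "sync" semantics with cp and find.
# This is NOT rclone; the function names are made up for the example.

# "copy": add/overwrite files at the destination, never delete anything there.
copy_like() {
    local src=$1 dst=$2
    mkdir -p "$dst"
    cp -a "$src/." "$dst/"
}

# "sync": make the destination match the source, deleting extras.
sync_like() {
    local src=$1 dst=$2
    copy_like "$src" "$dst"
    # Remove destination files that no longer exist in the source.
    (cd "$dst" && find . -type f -print0) |
        while IFS= read -r -d '' f; do
            [ -e "$src/$f" ] || rm -- "$dst/$f"
        done
}

demo=$(mktemp -d)
mkdir "$demo/src"
echo data > "$demo/src/A"
copy_like "$demo/src" "$demo/copy_dst"
sync_like "$demo/src" "$demo/sync_dst"

mv "$demo/src/A" "$demo/src/B"        # rename A to B in the source
copy_like "$demo/src" "$demo/copy_dst"
sync_like "$demo/src" "$demo/sync_dst"

ls "$demo/copy_dst"    # both A and B remain
ls "$demo/sync_dst"    # only B remains
```

The same property that keeps the "sync" destination tidy is what makes it dangerous as a backup: a deletion on the source is faithfully repeated on the destination.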

      Linux Uprising's "How To Encrypt Cloud Storage Files With Rclone"


    • duplicity (CLI only).
      Duplicity is available as a deb or snap.
      Ubuntu Community Help Wiki's "DuplicityBackupHowto"
      "duplicity --help | less"
      URL format for MEGA back-end: "megav2://user[:password]@other.host/some_dir"

      Deja Dup is a GUI front-end for duplicity.
      It's available as a deb or snap.


    • BorgBackup

      BorgBackup docs
      Vickie Li's "Backing Up With Borg"
      Teknikal's_Domain's "BorgBackup"


    • rdiff-backup
      This does an incremental file-level backup, using size, modified-time, inode number to decide if a file has changed. (I turn off the inode-number-checking.)

      Patrik Dufresne's "What's new with rdiff-backup?"
      Patrik Dufresne's "Manage your Linux backups with Rdiffweb"

      
      sudo apt install rdiff-backup
      
      # put this script in a dir on the backup drive and run it from THERE:
      mkdir MYRDIFF
      rdiff-backup ~/MYDIR1 MYRDIFF/MYDIR1	# back up a dir tree
      
      # flags I use: --use-compatible-timestamps --no-compare-inode --create-full-path --print-statistics
      
      # If you want to back up just a single file ~/MYDIR1/FILENAME1:
      rdiff-backup --include '**FILENAME1' --exclude MYDIR1 ~/MYDIR1 MYRDIFF/MYDIR1
      

      When backing up to a USB disk, if none of your files have changed since last backup, the scan of all files is VERY fast.

      What exactly is being tested to determine "file has changed" ? I think only metadata: size and modified-time. Not checking file contents. (Which means you should do a "touch" on VeraCrypt container files beforehand to make sure they get backed up.)
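
A minimal sketch of the "touch the containers first" step, assuming GNU find and touch. The "*.hc" pattern and the directory passed in are assumptions for the example; use whatever names and locations your container files really have, and dismount them before touching.

```shell
#!/usr/bin/env bash
# Sketch: bump the modified-time of container files so a metadata-based
# backup tool sees them as changed. The "*.hc" pattern and the directory
# are assumptions; adjust to wherever your containers actually live.

touch_containers() {
    local dir=$1
    # Only touch regular files matching the pattern.
    find "$dir" -type f -name '*.hc' -exec touch {} +
}

# Demo in a temp dir with a fake container that looks old.
demo=$(mktemp -d)
printf 'container' > "$demo/vault.hc"
touch -d '2020-01-01 00:00:00' "$demo/vault.hc"
before=$(stat -c %Y "$demo/vault.hc")
touch_containers "$demo"
after=$(stat -c %Y "$demo/vault.hc")
echo "mtime before: $before, after: $after"
```

The cost of this approach is that every container gets backed up on every run, changed or not, so it only makes sense for a small number of files.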


    Mehedi Hasan's "Open Source Backup Software for Linux"


  • Use backup software specific to a particular (destination) cloud backup service.

  • Use backup software specific to a particular (source) cloud service.

    Alan Pope's "Safely Backup Google Photos" (to Linux)


  • Use backup software specific to a particular filesystem type.

    For Btrfs: digint / btrbk, or Timeshift.


Note that a "sync" feature is not a backup. If something is deleted or corrupted on one end of it, that thing will be deleted or corrupted on the other end too. Usually. And if one system gets a bad date/time setting, the "sync" may copy old files over new files. Be careful.
David Murphy's "Why Did iCloud Delete All of My Photos?"
Goktug Kayaalp's "Do not use Syncthing"

/r/techsupport's "backuptools wiki"





Do before starting to make a backup



Clean out any caches in app-profiles that you're going to back up (browser cache, email client cache).

Maybe clean out system temp files or cache files or crash dumps before doing a backup.
See "Clean up space on disk" section of my Using Linux page.

Dismount any VeraCrypt or LUKS filesystems whose container files you want to back up.

Close running apps that may be using files you want to back up: browser, email client, VeraCrypt, password manager, database server, text editor, IDE, RSS reader/downloader, torrent client.
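
A possible pre-backup check for the "close running apps" step, sketched under these assumptions: a Linux system with /proc mounted, and process names that match the example list below (the list is illustrative, not complete, and the function name is made up).

```shell
#!/usr/bin/env bash
# Sketch of a pre-backup check: warn while apps that may hold files open
# are still running. Assumes Linux with /proc; the app names below are
# examples only, and running_apps is a made-up helper name.

running_apps() {
    # Print the name of each listed program that currently has a process.
    # /proc/PID/comm holds the process name (truncated to 15 characters).
    local name
    for name in "$@"; do
        if grep -qsxF -- "$name" /proc/[0-9]*/comm; then
            echo "$name"
        fi
    done
}

still_running=$(running_apps firefox thunderbird veracrypt keepassxc)
if [ -n "$still_running" ]; then
    # Unquoted on purpose so the names are joined onto one line.
    echo "Close these before backing up:" $still_running
else
    echo "No listed apps running; OK to start the backup."
fi
```

In a real backup script the first branch would normally stop the script rather than just print a message.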





Scheduled backups or not ?







Restore



Schrodingers Backup

Think about how you would restore to a complete new computer if necessary.
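
One way to find out whether a backup actually works is to restore it into a scratch directory and compare it against the live data. A minimal sketch, assuming GNU tar and diff; a tar archive stands in here for whatever backup tool you really use, and verify_restore is a made-up name.

```shell
#!/usr/bin/env bash
# Sketch: a backup is only known to be good once a restore has been tried.
# Restore into a scratch directory and compare against the live data.
# A tar archive is a stand-in for whatever backup tool you really use.

verify_restore() {
    local archive=$1 live_dir=$2
    local scratch status
    scratch=$(mktemp -d) || return 1
    tar --extract --file "$archive" --directory "$scratch" || return 1
    # diff -r exits 0 only if every file matches.
    diff -r "$live_dir" "$scratch"
    status=$?
    rm -rf "$scratch"
    return "$status"
}

demo=$(mktemp -d)
mkdir "$demo/data"
echo "important" > "$demo/data/file.txt"
tar --create --file "$demo/backup.tar" --directory "$demo/data" .

if verify_restore "$demo/backup.tar" "$demo/data"; then
    echo "restore test passed"
else
    echo "restore test FAILED"
fi
```

A comparison against live data only passes if nothing has changed since the backup was taken, so run it right after the backup or compare against a known-good copy.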





Linux Software



Syncing your primary disk to a secondary disk, or syncing a primary disk to the cloud, is not the same as backing up that primary disk. With syncing, if you delete something from the primary or it gets corrupted, the problem will be copied to the other place, and you've lost data. Usually in a backup, the destination maintains multiple historical copies of each file, so a mistake/problem on your primary disk does not wipe out the previously-backed-up data.
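
A small sketch of why historical copies matter, using plain cp into dated directories. This is for illustration only: real backup tools store history far more space-efficiently, and the function name, directory names, and date labels are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: keep dated copies so a deletion on the primary disk does not
# wipe out earlier backups. Plain cp is used for clarity; names and
# date labels here are hypothetical.

snapshot_copy() {
    local src=$1 backup_root=$2 label=$3
    mkdir -p "$backup_root/$label"
    cp -a "$src/." "$backup_root/$label/"
}

demo=$(mktemp -d)
mkdir "$demo/primary"
echo "thesis draft" > "$demo/primary/thesis.txt"

snapshot_copy "$demo/primary" "$demo/backups" "2024-01-01"

rm "$demo/primary/thesis.txt"          # accidental deletion on the primary
snapshot_copy "$demo/primary" "$demo/backups" "2024-01-02"

# The newest copy mirrors the mistake, but the older one still has the file.
ls "$demo/backups/2024-01-01" "$demo/backups/2024-01-02"
cat "$demo/backups/2024-01-01/thesis.txt"
```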



Aaron Kili's "24 Outstanding Backup Utilities for Linux Systems in 2018"





Miscellaneous



If you're concerned about your backups functioning for 10, 20, 50 years:

Don't focus so much on "what media to use ?". Focus on "when should I make a perfect copy from my current media to new media ?". Where "new media" could be same type as current media, or some new type as the current type nears EOL.

For example, if you have backups on floppy disks, you need to copy to new floppy disks as the old ones threaten to degrade, or copy to USB flash sticks as the market threatens to stop making floppy drives and floppy disks.

Then in 10 or 20 years, you'll be copying from your USB flash sticks to whatever the new medium is (DNA or something ?).
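
A sketch of what "make a perfect copy to new media" could look like in practice: copy everything, then prove the copy is bit-identical with a checksum manifest. Assumptions: GNU coreutils (sha256sum), two directories standing in for the old and new media, and a manifest file name (SHA256SUMS) and function name chosen just for this example.

```shell
#!/usr/bin/env bash
# Sketch: copy an archive to new media and check that the copy is
# bit-identical using a checksum manifest. Assumes GNU coreutils; the
# directories stand in for the old and new media.

migrate_and_verify() {
    local old_media=$1 new_media=$2
    mkdir -p "$new_media"
    cp -a "$old_media/." "$new_media/" || return 1
    # Hash every file on the OLD media and store the manifest on the new one.
    (cd "$old_media" && find . -type f ! -name SHA256SUMS -exec sha256sum {} +) \
        > "$new_media/SHA256SUMS" || return 1
    # Re-hash the copies on the NEW media and compare against the manifest.
    (cd "$new_media" && sha256sum --check --quiet SHA256SUMS)
}

demo=$(mktemp -d)
mkdir "$demo/old"
echo "family photo bytes" > "$demo/old/photo.jpg"
migrate_and_verify "$demo/old" "$demo/new" && echo "copy verified"

# Later, the same manifest can detect silent corruption on the new media.
echo "bit rot" > "$demo/new/photo.jpg"
(cd "$demo/new" && sha256sum --check --quiet SHA256SUMS) || echo "corruption detected"
```

Keeping the manifest next to the data means the next migration, years later, can start by checking that the current media is still intact.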



Howard Fosdick's "My open source disaster recovery strategy for the home office"



My "Computer Theft Recovery" page
My "Computer Security and Privacy" page