Recovery on a salvaged PostgreSQL cluster with missing files
Hi! It's been a while. This post has been in my drafts since November of last year. I finally decided to finish it up, since the information presented here was quite hard to pinpoint.
As some of you may be aware, my main server suffered an SSD failure on October 14th, 2023, with no backups available. This SSD held the data for many of the services I host, including my Akkoma instance, the mint.lgbt Matrix and XMPP servers, and even this blog! Thankfully, with the help of the donors now listed on the Coral Castle homepage, I sent it to a local Chilean data recovery service called Nekiori, located in the Los Leones commercial neighborhood of the Providencia district. They did a great job, managing to recover around 300 GB of data, not to mention their excellent communication and pricing.
However, no data recovery job is perfect, and in many cases you'll still end up with a few files missing. This is what happened with the salvaged PostgreSQL cluster, and I'll explain how I was able to get a huge chunk of it back within about 2 hours.
Starting up the salvaged cluster
Before beginning, MAKE A COPY OF THE ORIGINAL DATA DIRECTORY. We'll be making lots of destructive changes, so if anything goes wrong, you'll want a checkpoint you can restart from. I know duplicating clusters is time-consuming and may take up lots of space, but it's better to be safe than sorry (although I expect whoever has to go through this to have already learned that valuable lesson). Additionally, this guide only covers getting past PostgreSQL's errors, not regenerating any pages that were permanently lost.
Let's start by installing PostgreSQL on our work environment, making sure to match the version the recovered cluster was originally running on. On Alpine Linux, for instance, we can specify the cluster's version to be 15.x by running:
# apk add postgresql15
Once installed, we'll skip the daemon provided by the package and instead run the server straight from the CLI, in order to have better control over each parameter. I'd recommend using a multiplexer like Byobu, so you can view logs easily and gracefully shut down the database when needed. If you're working on a full-blown desktop instead, you can use multiple terminals. Let's break down the following command:
$ postgres -k /tmp -p 5433 -D postgresql/15/data/
– postgres is the PostgreSQL server executable.
– -k /tmp sets the Unix socket location to the host's ramdisk. You can set this location anywhere you'd like, as long as you use it in every other command that interacts with PostgreSQL.
– -p 5433 sets the port our server will listen on. Even if you're not using TCP/IP, the port number is part of the socket's file name, so it's still important to set it.
– -D postgresql/15/data/ is our salvaged cluster. Make sure to point it to the copy we made earlier.
Once we've set each parameter, let's run it!
...
All good? No? Don't worry. The first launch is likely to fail, complaining that it can't find a directory inside the data folder. If that's the case, create the missing directory (its name should be shown in the log) with:
$ mkdir [data]/[dir]
Then, try to execute the server again. Rinse, repeat, and eventually the server should begin running without a crash.
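This mkdir-and-retry cycle can also be scripted. The sketch below is an assumption built on PostgreSQL's usual `could not open directory "…"` error shape, so double-check the pattern against what your own server actually logs:

```shell
#!/bin/sh
# Sketch of the mkdir-and-retry loop. The error pattern is an assumption;
# adjust it to whatever your server actually logs.

# Print the directory name a failed startup complained about, if any.
missing_dir() {
    grep -o 'could not open directory "[^"]*"' "$1" | head -n 1 | cut -d '"' -f 2
}

# Keep restarting the server, creating each missing directory as we go.
# Once the server comes up, this blocks until it is shut down cleanly.
recover_start() {
    data=$1
    while ! postgres -k /tmp -p 5433 -D "$data" 2> server.log; do
        dir=$(missing_dir server.log)
        if [ -z "$dir" ]; then
            cat server.log  # a different error: stop and read the log
            return 1
        fi
        mkdir -p "$data/$dir"
    done
}
```

Calling `recover_start postgresql/15/data/` then replaces the manual rinse-and-repeat cycle.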
Locate a database
Now, since this cluster is in an unstable state, it shouldn't be used in production. Instead, we'll locate the most relevant databases we want to keep, so we can dump them later. Enter the psql shell with the following command:
$ psql -h /tmp -p 5433 postgres postgres
Again, here's the breakdown:
– psql is the command for the PostgreSQL interactive shell.
– -h /tmp is the location of the Unix socket we defined in the last step.
– -p 5433 is the port we also defined earlier.
– The first postgres is the database name. Most PostgreSQL clusters use this database as a sort of starting point.
– The second postgres is the user we're authenticating as, which in most clusters is also a superuser, granting us elevated privileges.
If all went well, you should now be staring at a screen like this:
psql (15.6)
Type "help" for help.
postgres=#
Now, to list our available databases, we need to run the following command:
postgres=# \l
After pressing enter, we should see a table of every database present in the cluster. Write down the names of the databases to preserve, as shown in the “Name” column. If you want to explore the contents of the database, you can execute a series of commands like these:
postgres=# \c shop           -- connects to the database "shop"
shop=# \dt                   -- displays all tables in the database
shop=# SELECT * FROM orders; -- displays all rows in the table "orders"
Once we've noted down all the databases we want to preserve, we'll exit psql by running:
postgres=# \q
Dump away!
To ease this process, I've created a script that reads the output of the pg_dump command and creates blank pages in the likely case that some can't be found. We can download it from my GitHub Gist using the following command:
$ curl -LO https://gist.githubusercontent.com/otoayana/7623d05b0b60c71160c37771398bfcaf/raw/ada88e9ad936e317ba19787f8886e24e2c96b123/pg_taprec
Once downloaded, we'll make it executable using chmod, and then open it with a text editor. For the sake of simplicity, I'll be using nano as an example.
$ chmod +x pg_taprec
$ nano pg_taprec
Inside this file, we'll need to modify the following variables:
– DATA: path to the cluster's data directory.
– HOST: the location of the PostgreSQL server's Unix socket, as defined earlier.
– DB: the database we want to recover. Set it to one of the names we noted down earlier.
– USER: a superuser within our cluster. By default, this should stay as postgres.
Once we've edited the file, let's execute it, passing the output file as its argument.
$ ./pg_taprec [file]
This process may take a while, as it needs to retry dumping whenever it encounters a missing page. In the meantime, go prepare whatever hot drink you like, and maybe read another article on this blog.
Once it's done, rinse and repeat, changing the DB variable to the next database we want to dump and passing a different path as the argument; otherwise we'd overwrite the previous dump.
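With several databases to dump, this edit-and-rerun cycle can itself be scripted. The sketch below assumes pg_taprec stores its target on a plain `DB=...` line, so verify that against your copy of the script before trusting the sed pattern:

```shell
#!/bin/sh
# Hypothetical wrapper around pg_taprec for dumping several databases in
# one go. Assumes the script stores its target as a plain `DB=...` line.

# Point the script's DB variable at a new database name.
set_db() {
    script=$1 db=$2
    sed -i "s/^DB=.*/DB=$db/" "$script"
}

# Dump every listed database to its own file under dumps/.
dump_all() {
    mkdir -p dumps
    for db in "$@"; do
        set_db pg_taprec "$db"
        ./pg_taprec "dumps/$db.sql"
    done
}

# Usage: dump_all shop blog matrix
```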
Shut down, then import
Once we've finished dumping every picked database, let's close the terminal running the server (or terminate it with Ctrl+C). We can now delete the duplicated cluster and keep the original somewhere safe, just in case we need it later.
Let's copy the database dumps onto our new server and, if it isn't installed yet, install a fresh PostgreSQL server on it, as specified at the beginning of this post.
Once the new cluster is running, let's create a new database corresponding to a dump, with an owner which can interact with it:
# su - postgres
$ createuser [owner]
$ createdb [database] -O [owner]
Finally, let's import the database dump into it. We can do so using the following command, making sure to replace [dump] with the path where our dump is stored on the server:
$ psql [database] < [dump]
Rinse and repeat for each dumped database.
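That rinse-and-repeat can be sketched as a loop, under the assumption that each dump file is named `<database>.sql` and that each database should be owned by a role of the same name; adjust both to your setup:

```shell
#!/bin/sh
# Sketch of the restore loop. Assumes dumps are named <database>.sql and
# that the owning role should match the database name.

# Derive the database name from a dump's file name.
db_name() {
    basename "$1" .sql
}

# Recreate and import every dump in the given directory.
restore_all() {
    for dump in "$1"/*.sql; do
        db=$(db_name "$dump")
        createuser "$db" 2> /dev/null || true  # role may already exist
        createdb "$db" -O "$db"
        psql "$db" < "$dump"
    done
}

# Usage (as the postgres user): restore_all dumps
```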
Conclusion
Congrats! You've managed to salvage what you could. However, it's always better to prepare for an emergency such as this one. Consider the following tips for preventing other catastrophes:
- By ALL MEANS keep constant, verified backups. If you can't afford downtime, you can replicate a cluster across multiple servers. If you don't have multiple servers, take a pg_basebackup monthly, and archive WAL segments afterwards. PostgreSQL's official documentation explains how to do so.
- If you're doing a hardware upgrade, check the hardware you're getting. It's not uncommon to install bad RAM or SSDs that are about to fail, both of which can corrupt your database beyond repair.
- Keep your backups in multiple locations. If your datacenter/house burns down, at least you'll have the files you kept on the cloud! I'm using Backblaze (not sponsored), and my bills are safely within the single USD digits.
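As a starting point for that first tip, a monthly base backup routine could look something like this. The paths and the archive_command are illustrative only; PostgreSQL's continuous archiving chapter covers the details:

```shell
#!/bin/sh
# Illustrative monthly backup routine, e.g. run from cron. Between base
# backups, postgresql.conf should also archive completed WAL segments:
#   archive_mode = on
#   archive_command = 'test ! -f /backups/wal/%f && cp %p /backups/wal/%f'

# Directory for this month's base backup (one per month, illustrative path).
backup_target() {
    printf '/backups/base/%s\n' "$(date +%Y-%m)"
}

# Compressed tar-format base backup, with progress reporting.
monthly_backup() {
    pg_basebackup -D "$(backup_target)" -Ft -z -P
}
```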