4. Routine maintenance tasks

QuasarDB requires certain tasks to be performed regularly to maintain optimal performance. The tasks discussed in this chapter are required, but they are repetitive in nature and can easily be automated using standard tools.

4.1. Graceful shutdown

QuasarDB ensures data consistency by using a Write-Ahead Log (WAL) <https://en.wikipedia.org/wiki/Write-ahead_logging> on top of its data store. This ensures that, in the event of a fatal system crash, we can restore a node to a known good state and as such achieve single-node durability.

This recovery procedure can be expensive, may cause a full database repair upon next boot, and is typically something you want to avoid in production whenever possible.

Additionally, forcefully disconnecting a node from a cluster leads to client timeouts and delays cluster rebalancing.

To work around these limitations, QuasarDB supports a graceful shutdown. On UNIX-like systems, this can be achieved by sending the QuasarDB daemon the SIGTERM or SIGQUIT signal. This will ensure QuasarDB gracefully leaves the cluster, and does the necessary housekeeping to ensure the data store is in a consistent state.

If you have installed QuasarDB through one of our packages, our packaging scripts typically take care of this. However, if you are using custom scripts to start/stop QuasarDB, please ensure the daemon is sent a SIGTERM and given time to exit on its own, rather than being killed outright. For example, with systemd, you will want to use an infinite termination timeout, as follows:

[Service]
Type=simple
User=qdb
Group=qdb
ExecStart=/usr/bin/qdbd -c /etc/qdb/qdbd.conf
Restart=on-failure
LimitNOFILE=65536
TimeoutStopSec=infinity

Here, TimeoutStopSec=infinity explicitly tells systemd to wait as long as it takes for QuasarDB to terminate gracefully, rather than killing the process once a stop timeout expires.
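For custom stop scripts that do not go through systemd, the same "send SIGTERM, then wait indefinitely" behaviour might be sketched in Python as follows. This is a minimal illustration, not part of QuasarDB: the function names are hypothetical, and how you obtain the daemon's PID (a PID file, your process supervisor, and so on) is left to your setup.

```python
import os
import signal
import time


def request_graceful_stop(pid: int) -> None:
    """Ask the process to shut down by sending SIGTERM (never SIGKILL)."""
    os.kill(pid, signal.SIGTERM)


def wait_for_exit(pid: int, poll_interval: float = 0.5) -> None:
    """Block until no process with this PID exists, however long that takes."""
    while True:
        try:
            os.kill(pid, 0)  # signal 0 delivers nothing; it only checks the PID
        except ProcessLookupError:
            return  # the process is gone
        except PermissionError:
            pass  # the process exists but belongs to another user
        time.sleep(poll_interval)
```

A stop script would call request_graceful_stop(pid) followed by wait_for_exit(pid). The wait deliberately has no timeout, mirroring TimeoutStopSec=infinity. It is intended for a daemon that is not a child of the calling script, since an unreaped child process would still appear to exist.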

4.2. Cluster trimming

QuasarDB requires periodic trimming of the dataset. When trimming, it performs several operations:

  • it compacts the Write-Ahead Log and merges it into the database;

  • it removes MVCC-transaction metadata;

  • it reclaims any unused space.

While the database also performs these tasks automatically, we strongly recommend periodic explicit trimming, for example using a daily or weekly cron job. As trimming can be an expensive operation, we recommend avoiding peak hours.

You can issue a cluster trim using the QuasarDB shell:

$ qdbsh
qdbsh > cluster_trim

If you observe the logs of the database daemon, you should see something like this:

2019.11.05-16.37.42.953826008   3339    3438       info         trimming all 412,160 entries (325,396 versions in memory)
2019.11.05-16.37.57.683002389   3339    3438       debug        successfully cleaned allocator buffers
2019.11.05-16.37.57.683022210   3339    3438       info         434,001/325,594 entries were trimmed (for a size of 14.91 GiB). we now have 428,327 entries in total and 309,597 entries in memory for a total size of 60.238 GiB (16.0 EiB in memory)
2019.11.05-16.37.57.713021310   3339    3438       info         requesting complete database compaction
2019.11.05-16.37.57.720022210   3339    3438       info         complete database compaction took 47 ms
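The recommendation to trim periodically, but outside peak hours, could be automated with a small script run from cron. The sketch below rests on two assumptions that you should verify for your installation: that qdbsh executes commands fed to it on standard input as if they were typed at its prompt, and that peak hours run from 08:00 to 20:00 local time. The function names are hypothetical.

```python
import subprocess
from datetime import datetime, time as dtime

# Assumption: peak hours are 08:00-20:00 local time; adjust to your workload.
PEAK_START = dtime(8, 0)
PEAK_END = dtime(20, 0)

# Assumption: qdbsh runs commands read from standard input.
TRIM_COMMAND = ("qdbsh",)


def is_off_peak(now: datetime, start: dtime = PEAK_START, end: dtime = PEAK_END) -> bool:
    """Return True when `now` falls outside the [start, end) peak window."""
    return not (start <= now.time() < end)


def trim_cluster(command=TRIM_COMMAND) -> int:
    """Feed `cluster_trim` to the shell and return the shell's exit code."""
    result = subprocess.run(command, input="cluster_trim\n", text=True)
    return result.returncode


def main() -> int:
    """Trim only when outside peak hours; intended to be called from cron."""
    if not is_off_peak(datetime.now()):
        print("peak hours: skipping cluster trim")
        return 0
    return trim_cluster()
```

Calling main() from a daily cron entry keeps the schedule in cron, while the guard ensures that a manually triggered run during the day does nothing.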

4.3. Log rotation

QuasarDB does not support log rotation out of the box, and its log files will increase in size over time. We recommend:

  • weekly rotation of all log files stored in /var/log/qdb;

  • removal of all logs older than 7 days.

Additionally, you might want to set up a centralised log aggregator to ease the management of your logs.
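The seven-day retention rule recommended above could be enforced with a short script such as the one below. It is a sketch under two assumptions: that every regular file in /var/log/qdb is a log file, and that a file's last-modification time is an acceptable measure of its age. It covers only the removal of old files, not the rotation itself, and the function name is hypothetical.

```python
import time
from pathlib import Path

LOG_DIR = Path("/var/log/qdb")   # log directory named in this section
MAX_AGE_SECONDS = 7 * 24 * 3600  # the 7-day retention recommended above


def prune_old_logs(log_dir=LOG_DIR, max_age=MAX_AGE_SECONDS, now=None):
    """Delete regular files in log_dir last modified more than max_age seconds ago.

    Returns the sorted list of removed paths.
    """
    now = time.time() if now is None else now
    removed = []
    for path in Path(log_dir).iterdir():
        # Assumption: every regular file in this directory is a log file.
        if path.is_file() and now - path.stat().st_mtime > max_age:
            path.unlink()
            removed.append(path)
    return sorted(removed)
```

Because the check uses the modification time, a log file that the daemon is still writing to is never removed.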

4.4. Backups

Although it’s not required, it’s recommended you take periodic backups of your database and store them in a safe, offsite location.

We recommend the following backup process:

  1. Schedule your backups during off-hours, for example at night.

  2. Before making a backup, issue a cluster trim in order to remove any unnecessary metadata and compact the Write-Ahead Log (WAL).

  3. Shut down the QuasarDB daemon on the node you are about to back up, for example using systemctl stop qdbd. Make sure the daemon receives the normal termination signal, SIGTERM, so that the process shuts down gracefully.

  4. Copy the data directory, including the WAL, to your backup location. The precise location of your data directory depends upon your storage configuration, but is typically /var/lib/qdb.

  5. After the copy operation has completed, you can restart the QuasarDB daemon.

Repeat this process for each of the nodes you want to back up.
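Steps 2 to 5 above can be chained in a script so that the daemon is always restarted, even when the copy fails. The following is a sketch, not a supported tool: it assumes that qdbsh accepts commands on standard input, that the service is managed by systemd under the name qdbd, and that the backup directory does not exist yet. The function names and the runner parameter are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def run(command, **kwargs):
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(command, check=True, **kwargs)


def backup_node(data_dir, backup_dir, service="qdbd", runner=run):
    """Trim, stop the daemon, copy the data directory, then restart the daemon."""
    # Step 2: trim first (assumes qdbsh reads commands from standard input).
    runner(["qdbsh"], input="cluster_trim\n", text=True)
    # Step 3: systemctl stop sends SIGTERM by default and blocks until stopped.
    runner(["systemctl", "stop", service])
    try:
        # Step 4: copy the whole data directory, WAL included.
        # copytree raises if backup_dir already exists.
        shutil.copytree(data_dir, backup_dir)
    finally:
        # Step 5: restart even if the copy failed, so the node does not stay down.
        runner(["systemctl", "start", service])
    return Path(backup_dir)
```

For example, backup_node("/var/lib/qdb", "/mnt/backup/qdb-node1") would back up one node, where the destination path is purely illustrative. The runner parameter exists so that the command sequence can be checked in a dry run without touching a real service.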