6.2. Reaching your performance goals

6.2. Reaching your performance goals

We have designed the default settings to work on as many environments as possible, to allow you to test the fitness of QuasarDB for your needs quickly. To reach peak performance, you may need to review your system architecture, deamon configuration, and API usage. A worthwhile investment: this review can yield one to two orders of magnitude performance improvement. In this section, we will go through all the essential parts.

6.2.1. System architecture

6.2.1.1. Client-server

In general, it is advised to ensure that the client and the server reside on two separate machines. Client and server may indeed fight for resources.

6.2.1.2. Rest server

The rest server should not be installed on the same machine as the QuasarDB to ensure the daemon has all memory available for caching.

6.2.1.3. Storage

QuasarDB’s write and read speed are limited by the underlying storage technology that you choose.

Storage can be physically connected to the computer, a remote attached file system, or object storage.

In theory, a physical connection is the fastest, then remote attached, then object storage. However, this much depends on the underlying media.

In the cloud, it is even made more complicated by I/O rate limiting, which may be a function of your instance type.

That is why it is highly recommended that you first benchmark your storage and ensure the raw speed is sufficient for your needs before doing any QuasarDB tuning.

6.2.1.4. Memory

Memory has a significant impact on performance. Ideally, you want your working set to fit in RAM. Since QuasarDB is distributed, that doesn’t mean you need a machine with a large amount of RAM: the load will be distributed across the cluster.

6.2.1.5. System tuning

We have a complete section dedicated to tuning the host system, see daemon tuning.

6.2.2. QuasarDB daemon configuration

For more information about daemon configuration, see daemon configuration.

6.2.2.1. Limiter

By default, QuasarDB will use half of the system memory for caching. You can change this value in the limiter section of the configuration.

6.2.2.2. Persistence

The default persistence layer is RocksDB. Several parameters have a massive impact on performance:

  • The disable_wal option: If set to true, the WAL will be in-memory only, significantly increasing the capacity of a node to absorb write spikes, at the cost of durability.

  • The threads option: Increasing the number of threads can speed-up RocksDB compaction

  • Persistence cache: You can use a local disk as a cache of remote disk to speed up reads.

The caching size of RocksDB has a limited impact on performance. That is because the QuasarDB cache that sits on top of RocksDB significantly reduces its importance.

6.2.3. Database architecture

6.2.3.1. One table or many tables?

Traditionally, in a Relational DataBase Management System (RDBMS), you would want to limit the number of tables in the database.

However, QuasarDB has unique mechanisms that enable you to have thousands of tables.

Pros

  • Higher write performance

  • Higher read performance

  • Higher ASOF join performance

  • Faster WHERE clause

  • More flexibility

  • Potentially better compression

Cons

  • Before 3.10, slower to read all the data from many tables than from a single table.

  • Some joins may be slower

For more information, see managing tables.

6.2.3.2. Shard size

When you create a table, you need to choose a shard size, which is crucial for performance.

The shard size is a time duration that the database will use to bucket the timeseries across the cluster.

When your shard size is too large, meaning that you have a lot of points per bucket, performance will suffer as you may keep updating the same shard as you insert. Additionally, you may get a lot of read-amplification when running queries. Read amplification is when you read more data than you need.

When your shard size is too small, meaning that you don’t have a lot of points per bucket, performance will suffer because you may spend more time working with metadata and slicing requests than doing useful work. Additionally, you will prevent compression from working effectively.

Ideally, you want a shard size to contain between 100,000 and 1,000,000 points. It’s ok if a couple of buckets are out of this zone, but you want to keep most of them within that limit.

6.2.3.3. Data type

When possible, it is recommended to use integers or floats for data types. Compared to strings or blobs, these types are:

  • More compact: they use less memory and disk space

  • Have dedicated, high-performance compression algorithms

  • Have optimized SIMD aggregation functions

  • A faster scan speed which means faster WHERE clause execution

Starting with QuasarDB 3.10, QuasarDB supports a dynamic symbol table that is an excellent replacement for strings and blobs without the hassle of manually handling a mapping table.

6.2.4. Importing data

6.2.4.1. Using Railgun

When loading a CSV (or a TSV) into QuasarDB, we strongly recommend using Railgun.

The bottleneck is reading the file, not writing it into QuasarDB. Railgun has been heavily optimized to deliver optimal performance.

While Railgun can load unsorted files, it greatly increases performance if the file is sorted by the primary timestamp.

An experimental version of Railgun for parquet files is available upon request.

For more information, see the railgun documentation.

6.2.4.2. Batch API

The Batch API solves a lot of headaches related to efficiently loading data into QuasarDB and is the best choice when writing custom loaders.

When writing to multiple tables at once, using the “pin column” API is recommended for optimal performance.

6.2.5. Running queries

6.2.5.1. About the query engine

When you run a query, QuasarDB splits it into sub-queries and run it on all relevant nodes of the cluster. This process is transparent.

6.2.5.2. Locating the bottleneck

If you are not satisfied with the speed at which your query is running, you need to find out where you spend most of your time.

Is it in the client or on the server? If it is on the server, where is it?

Enabling performance profiling in the server and the shell will give you a comprehensive overview.

If you have identified the server’s bottleneck, go through or list recommendations for server processing.

If the bottleneck is on the client-side, ensure custom code is not the cause.

Your solutions architect will be happy to help you debug your performance problem.

6.2.5.3. API considerations

Avoid creating and destroying handles. Keep your handles long-lived. Creating a handle and establishing a connection to the cluster is a very expensive operation.

By default, the client API uses a buffer size of 128 MiB. Increasing the buffer size can improve performance if you intend to have large replies.

6.2.5.4. Multi-threading

Starting with 3.10, the client API can dynamically multi-thread sub queries to leverage the parallel architecture of QuasarDB.

Before that, each sub-query would be run in sequence on a single thread. That could result in a very long-running query despite a low CPU usage on the cluster.