quasardb daemon#

Introduction#

The quasardb daemon is a highly scalable data repository that handles requests from multiple clients. The data is cached in memory and persisted on disk. It can be distributed on several servers to form a cluster.

The persistence layer is based on RocksDB (c) RocksDB authors. All rights reserved. The network distribution uses the Chord protocol.

The quasardb daemon does not require privileges (unless listening on a port under 1024) and can be launched from the command line. From this command line it can safely be stopped with CTRL-C. On UNIX, CTRL-Z will also result in the daemon being suspended.

Important

Without a valid license (see License), quasardb will run in “community edition” mode. The community edition is limited to 16 GiB of storage and 4 GiB of RAM per node, with a maximum of two nodes per cluster.

Quick Reference#

Option                    | Usage                        | Default         | Global | Req. Version
--------------------------|------------------------------|-----------------|--------|-------------
-h, --help                | display help                 |                 | No     |
-v, --version             | display version information  |                 | No     |
--gen-config              | generate default config file |                 | No     | >=1.1.3
-c, --config              | specify config file          |                 | No     | >=1.1.3
-d, --daemonize           | daemonize                    |                 | No     |
--license-file            | specify license              | qdb_license.txt | No     |
-a, --address             | address to listen on         | 127.0.0.1:2836  | No     |
-s, --sessions            | max client sessions          | 20000           | No     |
--idle-timeout            | idle timeout in ms           | 600,000         | No     |
--request-timeout         | request timeout in ms        | 60,000          | No     |
--peer                    | one peer to form a cluster   |                 | No     |
--id                      | set the node id              | generated       | No     |
--storage_engine          | select the storage engine    | rocksdb         | Yes    | >=3.1.0
-r, --rocksdb-root        | persistence directory        | ./db            | Yes    |
--replication             | sets the replication factor  | 1               | Yes    |
--rocksdb-max-open-files  | rocksdb maximum open files   | 256             | No     | >=3.0.0
--limiter-max-bytes-soft  | max bytes in cache (soft)    | Automatic       | No     | >=3.5.0
--limiter-max-bytes-hard  | max bytes in cache (hard)    | Automatic       | No     | >=3.5.0
-l, --log-directory       | log in the given directory   |                 | No     |
--log-syslog              | log on syslog                |                 | No     |
--log-level               | change log level             | info            | No     |
--log-flush-interval      | change log flush             | 3               | No     |

Configuration#

Global and local options#

When a node connects to a ring, it will first download the configuration of this ring and overwrite its parameters with the ring’s parameters.

This way, you can be sure that parameters are consistent over all the nodes. This is especially important for parameters such as replication where you need all nodes to agree on a single replication factor.

This is also important for persistence, as having a mix of transient and non-transient nodes will result in undefined behaviour and unwanted data loss.

However, not all options are taken from the ring. It makes sense to have a heterogeneous logging threshold, for example, as you may want to analyze the behaviour of a specific part of your cluster.

In addition, some parameters are node specific, such as the listening address or the node ID.

An option that applies cluster-wide is said to be global, whereas other options are said to be local. The value of a global option is set by the first node that creates the ring; all other nodes copy these parameters. Local options, on the other hand, are read from the command line as you run the daemon.
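The rule above can be pictured with a small sketch. It is only an illustration under stated assumptions: the function name `join_ring` is hypothetical, and the configuration is modelled as a plain dictionary with the "local" and "global" sections used by the configuration file. It does not describe how the daemon actually transfers the parameters.

```python
import copy


def join_ring(node_config, ring_global):
    """Return the effective configuration of a node joining an existing ring.

    Hypothetical helper: the "global" section is replaced by the ring's
    values, while the "local" section is kept as the node defined it.
    """
    effective = copy.deepcopy(node_config)
    effective["global"] = copy.deepcopy(ring_global)
    return effective


# A node started with its own idea of the replication factor...
node = {
    "local": {
        "logger": {"log_level": 1},
        "network": {"listen_on": "192.168.1.2:2836"},
    },
    "global": {"cluster": {"replication_factor": 1}},
}
# ...joins a ring whose first node chose a replication factor of 3.
ring = {"cluster": {"replication_factor": 3}}

effective = join_ring(node, ring)
```

After the call, the replication factor is the ring's, while the logging threshold and the listening address remain those of the node.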

Network distribution#

qdbd distribution is peer-to-peer. This means:

  • The unavailability of one server does not compromise the whole cluster

  • The memory load is automatically distributed amongst all instances within a cluster

Each server within one cluster needs:

  • A unique address on which to listen (-a); the unspecified “any” address cannot be used

  • At least one node within the cluster to contact (--peer)

Note

It’s counter-productive to run several instances on the same node. qdbd is hyper-scalar and will be able to use all the memory and processors of your server. The same applies to virtual machines: running quasardb multiple times in multiple virtual machines on a single physical server will not improve performance.

The daemon will automatically launch an appropriate number of threads to handle connection accepts and requests, depending on the actual hardware configuration of your server.

Logging#

By default, a non-daemonized qdbd will log to the console. If daemonized, logging is disabled unless configured to log to files (--log-directory) or to the syslog (--log-syslog) on Unix.

There are six different log levels: detailed, debug, info, warning, error and panic. You can change the log level with --log-level; it defaults to info.

You can also change the log flush interval (--log-flush-interval), which defaults to 3,000 ms.

Cache#

QuasarDB caches data in RAM in addition to the cache provided by the persistence layers. QuasarDB avoids loading data if it can answer a query based on information contained in the indexes, and uses an LRU strategy for timeseries buckets. Inserting data while querying it is well supported and will not pollute the cache.

The daemon will start to evict entries when the process memory usage reaches the soft limit. It will stop all processing and evict as much as it can when the process memory usage hits the hard limit. When QuasarDB evicts entries, it will throttle down queries to prevent a situation where users would load data faster than QuasarDB could evict it.

Thus, the memory usage is kept between the soft and the hard limit, possibly below the soft limit.

The memory usage measurement is based on the process size, which means that file system caches, and persistence layer caches are included in this measurement. It is thus important to ensure that the soft limit is well above the sum of all caches of the persistence layer. Failure to do so may result in continuous eviction and poor performance.

By default, the hard limit will be 80% of the physical RAM present on the machine, and the soft limit will be 75% of the hard limit.

On a machine with 64 GiB of RAM, the hard limit will thus be around 51 GiB, and the soft limit will be 38 GiB.

Each parameter can be configured independently, but the hard limit must always be greater than the soft limit.
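The default limits can be reproduced with a short calculation. The sketch below only restates the percentages given above; the function names and the three state labels are illustrative and are not part of quasardb.

```python
GIB = 1024 ** 3


def default_limits(physical_ram_bytes):
    """Return (soft, hard) in bytes: hard is 80% of RAM, soft is 75% of hard."""
    hard = physical_ram_bytes * 80 // 100
    soft = hard * 75 // 100
    return soft, hard


def limiter_state(usage_bytes, soft, hard):
    """Classify the process memory usage against the two limits."""
    if usage_bytes >= hard:
        return "stop"    # processing stops, as much as possible is evicted
    if usage_bytes >= soft:
        return "evict"   # entries are evicted and queries are throttled
    return "normal"


# The 64 GiB machine used as an example in the text.
soft, hard = default_limits(64 * GIB)
state_at_40_gib = limiter_state(40 * GIB, soft, hard)
```

For 64 GiB of RAM this gives a hard limit of about 51.2 GiB and a soft limit of about 38.4 GiB, which matches the rounded figures quoted above.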

Ideally, you want your working set to fit in memory. The working set for a single node is close to the total working set divided by the number of nodes.

Note

The cache size has a huge impact on performance. Your QuasarDB solutions architect will be happy to assist you in finding the optimal setting for your use case.

Data Storage#

Note

Data storage options are global for any given ring.

QuasarDB has two storage engines:

  • RocksDB (default)

  • Transient (no data is written to disk)

Entries are often kept resident in a write cache so the daemon can rapidly serve a large number of simultaneous requests. Data may not be synced to the disk at all times.

The RocksDB persistence engine allows you to sync every write to disk, if needed, through the “sync_every_write” setting.

For more information, see Data Storage and Data Transfer.

RocksDB#

RocksDB is an open-source, persistent, key-value store for fast storage environments. It is based on LevelDB and uses LSM trees.

To enable RocksDB, set the storage engine configuration parameter to “rocksdb” and set a “root” directory in the rocksdb configuration section. QuasarDB will then write the data under this directory using our tuned RocksDB implementation. The data stored is 100% compatible with RocksDB and can be used via the RocksDB API, if needed.

The directory can be absolute or relative. For production, we recommend using an absolute path.

It is possible to limit the amount of space a node will occupy with the “max_bytes” option. Writes to the node will fail when the disk usage reaches that limit, with warnings being emitted before that point. The write-ahead log is not counted in the space usage, meaning that the actual disk usage may be greater than the limit. Compression may also reduce the actual disk usage.

Note

RocksDB is a safe default; however, it can limit the performance of your QuasarDB cluster.

Persistent read cache#

Note

The persistent read cache is only available for the RocksDB persistence layer.

The persistent read cache optimizes I/O by buffering data from a (potentially remote) storage into faster, local storage (see Persistent read cache). The persistent read cache does not buffer writes.

There are three configuration settings:

  • The path of the persistent read cache. It should be used on local, fast storage only (SSD, NVMe, or Optane). Using remote or slow storage will have a detrimental effect on performance.

  • The maximum size of the persistent read cache

  • Whether the persistent read cache should be optimized for NVMe

The persistent read cache is disabled by default.

Partitions#

A partition can be seen as a worker thread. The more partitions, the more work can be done in parallel. However, if the number of partitions is too high relative to your server’s ability to actually do parallel work, performance will decrease.

quasardb is highly scalable and partitions do not interfere with each other. The daemon’s scheduler will assign incoming requests to the partition with the least workload.

The ideal number of partitions is close to the number of physical cores your server has. By default the daemon chooses the best compromise it can. If this value is not satisfactory, you can use the partitions_count config file option to set the value manually.

Note

Unless a performance issue is identified, it is best to let the daemon compute the partition count.

Asynchronous time series inserter#

The server has a built-in asynchronous timeseries inserter that is accessed through the batch insert API. The inserter buffers updates and commits them in batches for optimal performance.

When an asynchronous request arrives, the following happens:

  • The server validates the request

  • The request is dispatched to one of the pipelines through an asynchronous queue, if there is room.

  • If the queue is full, the server waits a random amount of time and tries again. If it still fails after three attempts, the server returns the error “busy, try again later” to the client.

The asynchronous workers run in a configurable number of pipelines which run in independent threads.

Each pipeline does the following:

  • If the queue is half full, or if a configured amount of time has elapsed, the content of the queue will be inspected.

  • Requests updating the same timeseries bucket will be merged into a single request

  • Merged requests will be written to disk

The following parameters are configurable:

  • The number of pipelines (by default, 1)

  • The amount of data a queue may contain

  • The number of requests a queue may contain

  • The maximum amount of time before the content of a queue is inspected and written to disk
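The work done by a pipeline can be sketched as follows. This is a simplified illustration, not the server's implementation: the function names, the `(bucket, rows)` request shape and the bucket identifiers are all assumptions made for the example.

```python
def should_flush(queue_length, queue_capacity, elapsed, deadline):
    """A queue is inspected when it is half full or when the deadline has passed."""
    return queue_length * 2 >= queue_capacity or elapsed >= deadline


def merge_by_bucket(requests):
    """Merge queued (bucket, rows) requests that update the same bucket.

    The result holds one request per bucket, in order of first appearance.
    """
    merged = {}
    for bucket, rows in requests:
        merged.setdefault(bucket, []).extend(rows)
    return list(merged.items())


# Three queued requests, two of which update the same (hypothetical) bucket.
queue = [("ts1/bucket0", [1, 2]), ("ts2/bucket0", [3]), ("ts1/bucket0", [4])]
batches = merge_by_bucket(queue)
```

Here the three queued requests collapse into two writes, one per bucket, which is the reason buffering improves insertion performance.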

Statistics#

By default the server will collect statistics. Statistics are stored in blobs or integer keys for convenient consumption. They represent the value since the latest refresh. The refresh interval is configurable and by default 5,000 milliseconds (5 seconds).

Statistics key names have the form “$qdb.statistics.{node id}.{stat name}”. For example, for the node ID 1-0-0-0, the key holding the cumulative number of bytes written to disk is “$qdb.statistics.1-0-0-0.persistence.bytes_written”.
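A client that reads statistics needs to build these key names. The helper below is a minimal sketch of the naming scheme only; `statistics_key` is a hypothetical name and is not part of any quasardb API.

```python
def statistics_key(node_id, stat_name):
    """Build the key under which a node stores one of its statistics."""
    return f"$qdb.statistics.{node_id}.{stat_name}"


key = statistics_key("1-0-0-0", "persistence.bytes_written")
```

The resulting key can then be read like any other blob or integer entry, depending on the type of the metric.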

Supported metrics#

The currently supported metrics are:

  • engine_build_date: blob, the QuasarDB engine build date

  • engine_version: blob, the QuasarDB engine version

  • hardware_concurrency: int64, the detected concurrent number of hardware threads supported on the system

  • startup: int64, the startup timestamp

  • node_id: blob, a string representing the node id

  • operating_system: blob, a string representing the operating system

  • partitions_count: int64, the number of partitions

  • cpu.idle: int64, the cumulated CPU idle time

  • cpu.system: int64, the cumulated CPU system time

  • cpu.user: int64, the cumulated CPU user time

  • disk.path: blob, the persistence path

  • disk.bytes_free: int64, the bytes free on the persistence path

  • disk.bytes_total: int64, the bytes total on the persistence path

  • memory.bytes_resident_size: int64, the computed amount of RAM used for data by QuasarDB

  • memory.physmem.bytes_total: int64, physical RAM total bytes count

  • memory.physmem.bytes_used: int64, physical RAM used bytes count

  • memory.resident_count: int64, the number of entries in RAM

  • memory.vm.bytes_total: int64, virtual memory total bytes count

  • memory.vm.bytes_used: int64, virtual memory used bytes count

  • network.current_users_count: int64, the current users count

  • network.sessions.max_count: int64, the configured maximum number of sessions

  • network.sessions.available_count: int64, the current number of available sessions

  • network.sessions.unavailable_count: int64, the current number of used sessions

  • persistence.bytes_capacity: int64, the persistence layer storage capacity, in bytes. May be 0 if the value is unknown.

  • persistence.bytes_utilized: int64, how many bytes are currently used in the persistence layer.

  • persistence.bytes_read: int64, the cumulated number of bytes read

  • persistence.bytes_written: int64, the cumulated number of bytes written

  • persistence.entries_count: int64, the current number of entries in the persistence layer

  • requests.total_count: int64, the cumulated number of requests

  • requests.successes_count: int64, the cumulated number of successful operations

  • requests.bytes_out: int64, the cumulated number of bytes sent by the server

Performance data#

When performance profiling is enabled, each request will store an accumulator of the time spent, in nanoseconds, in each step of the process.

The performance metrics are stored in the “$qdb.statistics.{node id}.perf.” subfield.

Operating limits#

Theoretical limits#

Entry size

An entry cannot be larger than the amount of virtual memory available on a single node. This ranges from several megabytes to several gigabytes depending on the amount of physical memory available on the system. It is recommended to keep entry sizes well below the amount of available physical memory.

Key size

As is the case for entries, a key cannot be larger than the amount of virtual memory available on a single node.

Number of nodes in a grid

The maximum number of nodes is \(2^{63}\) (9,223,372,036,854,775,808)

Number of entries on a single grid

The maximum number of entries is \(2^{63}\) (9,223,372,036,854,775,808)

Node maximum capacity

The node capacity depends on the available disk space on a given node. The community edition is limited to 16 GiB on disk and 4 GiB in RAM.

Total amount of data

The total amount of data a single grid may handle is 16 EiB (that’s 18,446,744,073,709,551,616 bytes).

Practical limits#

Entry size

Very small entries (below a hundred bytes) do not offer very good throughput because the network overhead is larger than the payload. This is a limitation of TCP. Very large entries (larger than 10% of the node RAM) impact performance negatively and are probably not optimal to store on a quasardb cluster “as is”. It is generally recommended to slice very large entries into smaller entries and handle reassembly in the client program. If you have a lot of RAM (several gigabytes per node), do not be afraid to add large entries to a quasardb cluster. For optimal performance, it’s better if the “hot data” (the data that is frequently accessed) can fit in RAM.
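Slicing and reassembly on the client side can be as simple as the sketch below. It is an illustration only: the "{key}.{index}" naming of the slices is an assumption made for the example, and storing or fetching the slices through a quasardb client is left out.

```python
def slice_entry(key, payload, chunk_size):
    """Split a large payload into (chunk_key, chunk_bytes) pairs."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (f"{key}.{index}", payload[offset:offset + chunk_size])
        for index, offset in enumerate(range(0, len(payload), chunk_size))
    ]


def reassemble(chunks):
    """Rebuild the payload from chunks given in their original order."""
    return b"".join(data for _, data in chunks)


payload = bytes(range(256)) * 10           # 2,560 bytes of sample data
chunks = slice_entry("big", payload, 1000)  # three slices: 1000 + 1000 + 560 bytes
restored = reassemble(chunks)
```

The client stores each slice as its own entry and fetches them in index order to rebuild the original payload.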

Simultaneous clients

A single instance can serve thousands of clients simultaneously. The actual limit is the network bandwidth, not the server. You can set -s to a higher number to handle more simultaneous clients per node. You should also make sure that clients connect to the nodes of the cluster in a load-balanced fashion.

Parameters Reference#

Parameters can be supplied in any order and are prefixed with --. The arguments format is parameter dependent.

Instance specific parameters only apply to the instance, while global parameters are for the whole ring. Global parameters are applied when the first instance of a ring is launched.

Instance specific#

-h, --help#

Displays basic usage information.

Example

To display the online help, type:

qdbd --help

-v, --version#

Displays qdbd version information.

--gen-config#

Generates a JSON configuration file with default values and prints it to STDOUT.

Example

To create a new config file with the name “qdbd_default_config.json”, type:

qdbd --gen-config > qdbd_default_config.json

Note

The --gen-config argument is only available with quasardb 1.1.3 or higher.

-c, --config#

Specifies a configuration file to use. See Config File Reference.

  • Any other command-line options will be ignored.

  • If an option is omitted in the config file, the default will be used.

  • If an option is malformed in the config file, it will be ignored.

Argument

The path to a valid configuration file.

Example

To use a configuration file named “qdbd_default_config.json”, type:

qdbd --config=qdbd_default_config.json

Note

The --config argument is only available with quasardb 1.1.3 or higher.

-d, --daemonize#

Runs the server as a daemon (UNIX only). In this mode, the process will fork and prevent console interactions. This is the recommended running mode for UNIX environments.

Example

To run as a daemon:

qdbd -d

Note

Logging to the console is not allowed when running as a daemon.

--license-file#

Specifies the location of the license file. A valid license is required to run the daemon (see License).

Argument

The path to a valid license file.

Default value

qdb_license.txt

Example

Load the license from license.txt:

qdbd --license-file=license.txt

-a <address>:<port>, --address=<address>:<port>#

Specifies the address and port on which the server will listen.

Argument

A string representing one address the server listens on and a port. The address string can be a host name, an IP address or an interface (BSD and Linux only).

Default value

127.0.0.1:2836, the IPv4 localhost and the port 2836

Example

Listen on localhost and the port 5910:

qdbd --address=localhost:5910

Listen on eth0 port 2836:

qdbd --address=eth0:2836

Note

The unspecified address (0.0.0.0 for IPv4, :: for IPv6) is not allowed.
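A deployment script may want to reject an unusable listening address before starting the daemon. The check below is a sketch under stated assumptions: `is_allowed_listen_address` is a hypothetical helper, IPv6 literals are assumed to be written in brackets, and host names or interface names are accepted without being resolved.

```python
import ipaddress


def is_allowed_listen_address(address):
    """Return False for a malformed "<address>:<port>" or an unspecified address."""
    host, separator, port = address.rpartition(":")
    if not separator or not port.isdigit() or not 0 < int(port) < 65536:
        return False
    host = host.strip("[]")  # assumed bracket notation for IPv6 literals
    try:
        return not ipaddress.ip_address(host).is_unspecified
    except ValueError:
        # Not an IP literal: a host name or an interface name such as eth0.
        return bool(host)


checks = {
    address: is_allowed_listen_address(address)
    for address in ("127.0.0.1:2836", "0.0.0.0:2836", "[::]:2836", "eth0:2836")
}
```

Only the two unspecified addresses are rejected; the loopback address and the interface name pass the check.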

-s <count>, --sessions=<count>#

Specifies the number of simultaneous sessions per partition.

Argument

A number greater than or equal to fifty (50) representing the number of allowed simultaneous sessions.

Default value

64

Example

Allow 10,000 simultaneous sessions:

qdbd --sessions=10000

Note

The sessions count determines the number of simultaneous clients the server may handle at any given time. Increasing the value increases the memory load. This value may be limited by your license.

--idle-timeout=<duration>#

Sets the timeout after which inactive sessions will be considered for termination.

Argument

An integer representing the number of milliseconds after which an idle session will be considered for termination.

Default value

600,000 (600 seconds, 10 minutes)

Example

Set the timeout to one minute:

qdbd --idle-timeout=60000

--request-timeout=<timeout>#

Sets the timeout after which a request from the server to another server must be considered to have timed out.

Argument

An integer representing the number of milliseconds after which a request must be considered to have timed out.

Default value

60,000 (60 seconds, 1 minute)

Example

Set the timeout to two minutes:

qdbd --request-timeout=120000

--peer=<address>:<port>#

The address and port of a peer to which to connect within the cluster. It can be any server belonging to the cluster.

Argument

The address and port of a machine where a quasardb daemon is running. The address string can be a host name or an IP address.

Default value

None

Example

Join a cluster where the machine 192.168.1.1 listening on the port 2836 is already connected:

qdbd --peer=192.168.1.1:2836

--id=<id string>#

Sets the node ID.

Argument

A string representing the ID of the node. This can be a 256-bit number in hexadecimal form, the value “random”, or the indexed syntax. This value may not be zero (0-0-0-0). You are strongly encouraged to use the indexed syntax. See Clustering.

Default value

Unique random value.

Example

Set the node ID to 1-a-2-b:

qdbd --id=1-a-2-b

Set the node ID to a random value:

qdbd --id=random

Set the node ID to the ideal value for the third node of a cluster totalling 8 nodes:

qdbd --id=3/8

Warning

Having two nodes with the same ID on the ring leads to undefined behaviour. By default the daemon generates an ID that is guaranteed to be unique on any given ring. Only modify the node ID if the topology of the ring is unsatisfactory and you are certain no two node IDs are the same.

-l <path>, --log-directory=<path>#

Logs in the designated directory.

Argument

A string representing a path to a directory where log files will be created.

Example

Log in /var/log/qdb:

qdbd --log-directory=/var/log/qdb

--log-syslog#

UNIX only, activates logging to syslog.

--log-level=<value>#

Specifies the log verbosity.

Argument

A string representing the amount of logging required. Must be one of:

  • detailed (most output)

  • debug

  • info

  • warning

  • error

  • panic (least output)

Default value

info

Example

Request a debug level logging:

qdbd --log-level=debug

--log-flush-interval=<delay>#

How frequently log messages are flushed to output, in milliseconds.

Argument

An integer representing the number of milliseconds between each flush.

Default value

3,000

Example

Flush the log every minute:

qdbd --log-flush-interval=60000

--rocksdb-max-open-files=<count>#

Sets the maximum number of open files for the RocksDB persistence layer. The default value is configured in a way that it will not stress the operating system, at a potential performance cost. Increasing the default value is recommended for production servers. This only applies to the RocksDB persistence layer.

Argument

An integer representing the maximum number of open files at any point in time.

Default value

256

Example

Increase the number to 10,240:

qdbd --rocksdb-max-open-files=10240

--rocksdb-max-bytes=<size-in-bytes>#

Sets the maximum amount of disk usage for each node’s database in bytes. Any write operations that would overflow the database will return a qdb_e_system_remote error stating “disk full”. The write-ahead log is not accounted in the disk usage.

Due to excessive meta-data or uncompressed db entries, the actual database size may exceed this set value by up to 20%.

Only applies to the RocksDB storage engine.

Argument

An integer representing the maximum size of the database on disk in bytes. The minimum value is 134,217,728 (128 MiB).

Default value

0 (disabled)

Example A

To limit the database size on each node to 12 Terabytes:

\[\begin{split}\text{Max Depot Size Value} &= \text{12 Terabytes} \: * \: \frac{1024^4 \: \text{Bytes}}{\text{1 Terabyte}}\\ &= \text{13194139533312 Bytes}\end{split}\]

And thus the command:

qdbd --rocksdb-max-bytes=13194139533312

This database may expand out to approximately 14.4 Terabytes due to meta-data and uncompressed db entries.

Example B

This example will limit the database size to ensure it fits within 1 Terabyte of free space. Since limiting to a specific overhead is important in this example, the filesystem cluster size is also taken into account; the default for most filesystems is 4096 bytes.

\[\begin{split}\text{Max Depot Size Value} &= \text{1099511627776 Bytes} - \text{(1099511627776 Bytes} \: * \: 0.2 \text{)} - \text{Cluster Size of 4096} \\ &= \text{1099511627776 Bytes} - \text{219902325555.2 Bytes} - \text{4096 Bytes} \\ &= \text{879609298124.8 Bytes}\end{split}\]

And thus the command, truncating down to an integer:

qdbd --rocksdb-max-bytes=879609298124

This database should not exceed 1 Terabyte.
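Both calculations can be scripted to avoid arithmetic slips. The sketch below reproduces Example A and Example B with integer arithmetic; the function names are illustrative, and the 20% margin and the 4096-byte cluster size are the values used in the examples above.

```python
TERABYTE = 1024 ** 4  # the examples above count 1024^4 bytes per terabyte


def max_bytes_for_size(terabytes):
    """Example A: convert a size in terabytes to a --rocksdb-max-bytes value."""
    return terabytes * TERABYTE


def max_bytes_for_free_space(free_bytes, margin_percent=20, fs_cluster_size=4096):
    """Example B: keep a margin of the free space, plus one filesystem cluster."""
    return free_bytes * (100 - margin_percent) // 100 - fs_cluster_size


example_a = max_bytes_for_size(12)
example_b = max_bytes_for_free_space(1 * TERABYTE)

print(f"qdbd --rocksdb-max-bytes={example_a}")
print(f"qdbd --rocksdb-max-bytes={example_b}")
```

The integer division truncates the result in the same way as the manual calculation of Example B.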

Note

The --rocksdb-max-bytes argument is only available with quasardb 1.1.2 or higher.

Note

Using a max depot size may cause a slight performance penalty on writes.

--limiter-max-bytes-hard=<value>#

The hard limit after which the system will take drastic measures to lower memory usage. When the hard limit is reached, processing is temporarily stopped, every entry is evicted from memory, the file cache is flushed, and every buffer is purged to free memory.

The total process memory usage is measured, including system caches.

Argument

An integer representing the hard limit, in bytes.

Default value

0 (automatic, 4/5 the available physical memory).

Example

To allow only 100 KiB of entries:

qdbd --limiter-max-bytes-hard=102400

To allow up to 8 GiB:

qdbd --limiter-max-bytes-hard=8589934592

Note

This value has a huge impact on performance.

--limiter-max-bytes-soft=<value>#

The limit after which the system will start to evict entries. When the soft limit is reached, queries will be throttled down to let the system clear memory. The soft limit must always be lower than the hard limit.

The total process memory usage is measured, including system caches.

Argument

An integer representing the soft limit, in bytes.

Default value

0 (automatic, 3/4 of the hard limit).

Example

To allow only 100 KiB of entries:

qdbd --limiter-max-bytes-soft=102400

To allow up to 8 GiB:

qdbd --limiter-max-bytes-soft=8589934592

Note

This value has a huge impact on performance.

-r <path>, --rocksdb-root=<path>#

Specifies the directory where data will be persisted when using the RocksDB storage engine for the node where the process has been launched.

Argument

A string representing a full path to the directory where data will be persisted.

Default value

The “db” subdirectory relative to the current working directory.

Example

Persist data in /var/quasardb/db

qdbd --rocksdb-root=/var/quasardb/db

Note

Although this parameter is global, the directory refers to the local node of each instance.

--security=<boolean>#

Enables or disables cluster security.

Argument

A boolean specifying whether or not security should be enabled.

Default value

True

Example

To disable security completely:

qdbd --security=false

Note

To work, security needs a cluster private key and a users list.

--cluster-private-file=<path>#

A path to the cluster private key file.

Argument

A string representing a full path to the cluster private key file.

Example

Use the file /etc/qdbd/cluster_private.key:

qdbd --cluster-private-file=/etc/qdbd/cluster_private.key

Note

A cluster private key file is required for security to work (see quasardb cluster key generator).

--user-list=<path>#

A path to the users list, a JSON file containing the user names and their respective public keys.

Example

Use the file /etc/qdbd/users.cfg:

qdbd --user-list=/etc/qdbd/users.cfg

Note

A users list is required for security to work (see quasardb user adder).

Global#

--replication=<factor>#

Specifies the replication factor (global parameter). For more information, see Data replication.

Argument

A positive integer between 1 and 4 (inclusive) specifying the replication factor. If the integer is higher than the number of nodes in the cluster, it will be automatically reduced to the cluster size.

Default value

1 (replication disabled)
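The rule for the effective replication factor can be written down in a few lines. This is a sketch of the documented behaviour only; `effective_replication` is a hypothetical name, and raising an error for an out-of-range value is an assumption of the example.

```python
def effective_replication(factor, cluster_size):
    """Return the replication factor actually used by a cluster.

    The factor must be between 1 and 4 (inclusive) and is reduced to the
    cluster size when the cluster has fewer nodes than requested copies.
    """
    if not 1 <= factor <= 4:
        raise ValueError("replication factor must be between 1 and 4")
    return min(factor, cluster_size)


two_nodes = effective_replication(3, cluster_size=2)   # reduced to the cluster size
five_nodes = effective_replication(3, cluster_size=5)  # used as requested
```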

Example

Keep two copies of every entry in the cluster (the original and one replica):

qdbd --replication=2

--storage_engine=<engine>#

Specifies the storage engine.

Argument

A string representing the storage engine to use. Can be either ‘transient’ or ‘rocksdb’.

Default value

“rocksdb”

Example

Use the transient persistence engine:

qdbd --storage_engine=transient

Config File Reference#

As of quasardb version 1.1.3, the qdbd daemon can read its parameters from a JSON configuration file provided by the -c command-line argument. Using a configuration file is recommended.

Some things to note when working with a configuration file:

  • If a configuration file is specified, all other command-line options will be ignored. Only values from the configuration file will be used.

  • The configuration file must be valid JSON in ASCII format.

  • If a key or value is missing from the configuration file or malformed, the default value will be used.

  • If a key or value is unknown, it will be ignored.
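The rules above amount to overlaying the file on top of the defaults. The sketch below shows one possible reading of them and is not the daemon's parser: `apply_defaults` is a hypothetical helper, and "malformed" is assumed to mean a value whose JSON type differs from the type of the default.

```python
import json


def apply_defaults(defaults, user):
    """Overlay user settings on the defaults.

    Missing or wrongly typed values fall back to the default, and keys
    that do not exist in the defaults are ignored.
    """
    result = {}
    for key, default in defaults.items():
        value = user.get(key, default)
        if isinstance(default, dict):
            nested = value if isinstance(value, dict) else {}
            result[key] = apply_defaults(default, nested)
        elif type(value) is type(default):
            result[key] = value
        else:
            result[key] = default  # assumed meaning of "malformed"
    return result


defaults = json.loads(
    '{"local": {"logger": {"log_level": 2, "flush_interval": 3000}}}'
)
user = json.loads(
    '{"local": {"logger": {"log_level": "verbose", "flush_interval": 1000, "colour": true}}}'
)
config = apply_defaults(defaults, user)
```

In this example the wrongly typed log_level falls back to its default, the valid flush_interval is kept, and the unknown "colour" key is dropped.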

The default configuration file is shown below:

{
    "local": {
        "depot": {
            "rocksdb": {
                "sync_every_write": false,
                "root": "db",
                "max_bytes": 0,
                "storage_warning_level": 90,
                "storage_warning_interval": 3600000,
                "disable_wal": false,
                "direct_read": false,
                "direct_write": false,
                "max_total_wal_size": 1073741824,
                "metadata_mem_budget": 268435456,
                "data_cache": 134217728,
                "threads": 4,
                "hi_threads": 2,
                "max_open_files": 0,
                "persistent_cache_path": "",
                "persistent_cache_size": 0,
                "persistent_cache_nvme_optimization": false
            },
            "async_ts": {
                "pipelines": 1,
                "pipeline_buffer_size": 1073741824,
                "pipeline_queue_length": 1000000,
                "flush_deadline": 4000
            }
        },
        "user": {
            "license_file": "",
            "license_key": "",
            "daemon": false
        },
        "limiter": {
            "max_resident_entries": 0,
            "max_bytes_soft": 0,
            "max_bytes_hard": 0
        },
        "logger": {
            "log_level": 2,
            "flush_interval": 3000,
            "log_directory": "",
            "log_to_console": false,
            "log_to_syslog": false
        },
        "network": {
            "server_sessions": 64,
            "partitions_count": 4,
            "idle_timeout": 600000,
            "client_timeout": 60000,
            "max_in_buffer_size": 134217728,
            "max_out_buffer_size": 134217728,
            "listen_on": "127.0.0.1:2836",
            "advertise_as": "127.0.0.1:2836",
            "profile_performance": false
        },
        "chord": {
            "node_id": "0-0-0-0",
            "no_stabilization": false,
            "bootstrapping_peers": [],
            "min_stabilization_interval": 100,
            "max_stabilization_interval": 60000
        }
    },
    "global": {
        "cluster": {
            "storage_engine": "rocksdb",
            "enable_statistics": true,
            "statistics_refresh_interval": 5000,
            "replication_factor": 1,
            "max_versions": 3,
            "max_transaction_duration": 15000,
            "acl_cache_duration": 60000,
            "acl_cache_size": 100000,
            "persisted_firehose": "$qdb.firehose"
        },
        "security": {
            "enable_stop": false,
            "enable_purge_all": false,
            "enabled": false,
            "encrypt_traffic": false,
            "cluster_private_file": "",
            "user_list": ""
        }
    }
}

local::depot::rocksdb::sync_every_write

A boolean representing whether or not the node should sync to disk every write. This option has a huge negative impact on performance, especially on high latency media and adds only marginal safety compared to the sync option. Disabled by default. This setting only applies to RocksDB.

local::depot::rocksdb::root

A string representing the relative or absolute path to the directory where data will be stored. Specifying this string will enable RocksDB as the persistence layer.

local::depot::rocksdb::max_bytes

An integer representing the maximum amount of disk usage for each node’s database in bytes. Any write operations that would overflow the database will return a qdb_e_system_remote error stating “disk full”.

Due to excessive meta-data or uncompressed db entries, the actual database size may exceed this set value by up to 20%.

See --rocksdb-max-bytes for more details and examples to calculate the max_bytes value.

local::depot::storage_warning_level

An integer between 50 and 100 (inclusive) specifying the percentage of disk usage at which a warning about depleting disk space will be emitted. See also local::depot::storage_warning_interval.

local::depot::storage_warning_interval

An integer representing how often quasardb will emit a warning about depleting disk space, in milliseconds. See also local::depot::storage_warning_level.
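The following Python sketch shows the kind of check the warning level implies, assuming a plain percentage comparison. It is an illustration only, not quasardb's implementation, and the function name is hypothetical.

```python
def should_warn(used_bytes, total_bytes, warning_level):
    """Return True when disk usage has reached the configured warning level."""
    if not 50 <= warning_level <= 100:
        raise ValueError("storage_warning_level must be between 50 and 100")
    # Compare percentages with integers only: used/total >= level/100.
    return used_bytes * 100 >= warning_level * total_bytes


print(should_warn(90, 100, 90))
print(should_warn(89, 100, 90))
```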

local::depot::rocksdb::disable_wal

A boolean representing whether or not the write-ahead log should be disabled. When you write data to quasardb, it is added to a buffer that is backed by a disk file called the write-ahead log. In case of failure, quasardb is able to recover by reading from the write-ahead log. Applications that need maximum write performance may want to disable the write-ahead log. However, disabling the write-ahead log means that you can lose data should a failure occur before the buffer is flushed into the database. False by default (that is, by default, the write-ahead log is used and buffers are backed by disk).

local::depot::rocksdb::direct_read

A boolean representing whether or not reads from the disk should be direct (i.e. bypass OS buffers). This setting has an impact on performance and memory usage depending on the hardware configuration. It is generally advised not to use direct reads with spinning disks.

local::depot::rocksdb::direct_write

This option currently does not have any effect.

local::depot::rocksdb::max_total_wal_size

The maximum size, in bytes, of the write-ahead log.

local::depot::rocksdb::metadata_mem_budget

An integer representing the approximate amount of memory (RAM) that should be dedicated to the management of metadata.

local::depot::rocksdb::data_cache

An integer representing the approximate amount of memory (RAM) that should be used for caching data blocks. This setting only applies to RocksDB.

local::depot::rocksdb::threads

An integer representing the number of threads dedicated to the persistence layer.

local::depot::rocksdb::hi_threads

An integer representing the number of high-priority threads dedicated to the persistence layer, in addition to the normal priority threads.

local::depot::rocksdb::max_open_files

An integer representing the maximum number of files to keep open at a time. 0 for auto-detection.

local::depot::rocksdb::persistent_cache_path

A string representing the path to the persistent read cache. An empty value means the persistent read cache is disabled.

local::depot::rocksdb::persistent_cache_size

An integer representing the maximum size of the persistent read cache. This value cannot be lower than 10 MiB.

local::depot::rocksdb::persistent_cache_nvme_optimization

A boolean specifying whether the persistent read cache should be optimized for NVMe or not. Disabled by default.

local::depot::async_ts::pipelines

The number of asynchronous time series pipelines (see Asynchronous time series inserter).

local::depot::async_ts::pipeline_buffer_size

The maximum amount of data in a pipeline buffer before requests are refused (see Asynchronous time series inserter).

local::depot::async_ts::pipeline_queue_length

The maximum count of requests in the pipeline buffer before requests are refused (see Asynchronous time series inserter).

local::depot::async_ts::flush_deadline

The maximum amount of time before the pipeline buffer gets inspected and flushed to disk (see Asynchronous time series inserter).

local::user::license_file

A string representing the relative or absolute path to the license file. Providing an empty string to both license_key and license_file runs quasardb in evaluation mode.

local::user::license_key

A string representing the license. Providing an empty string to both license_key and license_file runs quasardb in evaluation mode.

local::user::daemon

A boolean value representing whether or not the quasardb daemon should daemonize on launch.

local::limiter::max_bytes_soft

The limit after which the system will start to evict entries. When the soft limit is reached, queries will be throttled down to let the system clear memory. The soft limit must always be lower than the hard limit. The total process memory usage is measured, including system caches.

local::limiter::max_bytes_hard

The hard limit after which the system will take drastic measures to lower memory usage. When the hard limit is reached, processing is temporarily stopped, every entry is evicted from memory, the file cache is flushed, and every buffer is purged to free memory.
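The relationship between the two limits can be sketched in Python as follows. The function and the state names are hypothetical. The sketch only models which regime a given memory usage falls into, not what the daemon does internally.

```python
def memory_state(used_bytes, soft_limit, hard_limit):
    """Classify memory usage against the soft and hard limits."""
    if soft_limit >= hard_limit:
        raise ValueError("the soft limit must be lower than the hard limit")
    if used_bytes >= hard_limit:
        return "hard"  # processing paused, everything evicted and purged
    if used_bytes >= soft_limit:
        return "soft"  # eviction starts, queries are throttled
    return "ok"


GIB = 1024 ** 3
print(memory_state(3 * GIB, soft_limit=4 * GIB, hard_limit=6 * GIB))
print(memory_state(5 * GIB, soft_limit=4 * GIB, hard_limit=6 * GIB))
```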

local::limiter::max_trim_queue_length

An integer representing the total maximum number of updated entries that may be queued for asynchronous trimming. Trimming is a background process that optimizes disk space.

local::logger::log_level

An integer representing the verbosity of the log output. Acceptable values are:

0 = detailed (most output)
1 = debug
2 = info (default)
3 = warning
4 = error
5 = panic (least output)
local::logger::flush_interval

An integer representing how frequently quasardb log messages should be flushed to the log locations, in milliseconds. Default value: 3,000 ms.

local::logger::log_directory

A string representing the relative or absolute path to the directory where log files will be created.

local::logger::log_to_console

A boolean value representing whether or not the quasardb daemon should log to the console it was spawned from. This value is ignored if local::user::daemon is true.

local::logger::log_to_syslog

A boolean value representing whether or not the quasardb daemon should log to the syslog.

local::network::server_sessions

An integer representing the number of server sessions the quasardb daemon can provide.

local::network::partitions_count

An integer representing the number of partitions, or worker threads, quasardb can spawn to perform operations. The ideal number of partitions is close to the number of physical cores your server has. If set to 0, the daemon will choose the best compromise it can.

local::network::idle_timeout

An integer representing the number of milliseconds after which an inactive session will be considered for termination.

local::network::client_timeout

An integer representing the number of milliseconds after which a client session will be considered for termination.

local::network::max_in_buffer_size

The maximum input size that will be allowed by the server in a single message. Any incoming message larger than this value may be dropped by the server.

local::network::max_out_buffer_size

The maximum output size that will be allowed by the server in a single message. Any outgoing message larger than this value may be dropped by the server.

local::network::max_ts_buffered_queue_length

The maximum length of the asynchronous write queue. When doing asynchronous updates to a timeseries, updates are queued to be processed later. If the length of the queue exceeds this value, updates may be refused by the server.

local::network::max_ts_async_writer_interval

The interval, in milliseconds, at which the queue of asynchronous time series updates will be processed.

local::network::listen_on

A string representing an address and port the daemon should listen on. The string can be a host name or an IP address. Must have name or IP separated from port with a colon.

local::network::advertise_as

A string representing the address and port under which the daemon will advertise itself to other daemons. This setting is mainly used for cloud and container deployments. Must have name or IP separated from port with a colon.
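Both listen_on and advertise_as use the same name-or-IP, colon, port form. As a minimal sketch, such a string can be checked in Python before it is written to the configuration file. The helper name split_endpoint is hypothetical, and IPv6 literals are deliberately not handled specially.

```python
def split_endpoint(endpoint):
    """Split 'host:port' into (host, port), raising ValueError if malformed."""
    # Split on the last colon so the port is always the final component.
    host, separator, port = endpoint.rpartition(":")
    if not separator or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {endpoint!r}")
    port_number = int(port)
    if not 0 < port_number < 65536:
        raise ValueError(f"port out of range in {endpoint!r}")
    return host, port_number


print(split_endpoint("127.0.0.1:2836"))
```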

local::network::profile_performance

Enable performance metrics collection and return the data to the client for every call. Disabled by default for performance and security reasons.

local::chord::node_id

A string representing the ID of the node. This can be a 256-bit number in hexadecimal form, the value “random”, or the indexed syntax. This value may not be zero (0-0-0-0). If left at the default of 0-0-0-0, the daemon will assign a random node ID at startup. You are strongly encouraged to use the indexed syntax. See Clustering.

local::chord::no_stabilization

A read-only boolean value representing whether or not this node should stabilize upon startup. Even if set to true, stabilization will still occur.

local::chord::min_stabilization_interval

The minimum wait interval between two stabilizations, in milliseconds. The default value is 100 ms and rarely needs to be changed. This value cannot be zero.

local::chord::max_stabilization_interval

The maximum wait interval between two stabilizations, in milliseconds. Detecting that a node has disappeared will take at least that amount of time. The default value is 60,000 ms (one minute). This value must be greater than the minimum stabilization interval, and cannot be lower than 10 ms.
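The constraints on the two stabilization intervals can be collected into a small check. This is a sketch under the rules stated above. The function name is hypothetical and the daemon's own validation may differ.

```python
def check_stabilization(min_interval_ms, max_interval_ms):
    """Return a list of problems with the stabilization intervals (empty if valid)."""
    problems = []
    if min_interval_ms <= 0:
        problems.append("min_stabilization_interval cannot be zero")
    if max_interval_ms < 10:
        problems.append("max_stabilization_interval cannot be lower than 10 ms")
    if max_interval_ms <= min_interval_ms:
        problems.append("max_stabilization_interval must be greater than the minimum")
    return problems


print(check_stabilization(100, 60000))  # the defaults
print(check_stabilization(100, 50))
```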

local::chord::bootstrapping_peers

An array of strings representing other nodes in the cluster which will bootstrap this node upon startup. The string can be a host name or an IP address. Must have name or IP separated from port with a colon.

global::cluster::storage_engine

A string representing the storage engine to use for persistence. Must be either “rocksdb” or “transient”. The default is “rocksdb”.

global::cluster::enable_statistics

A boolean setting whether or not the cluster should gather statistics. Small performance impact. Enabled by default.

global::cluster::statistics_refresh_interval

An integer representing the refresh interval of the statistics, in milliseconds. May not be below 1,000 milliseconds (1 second).

global::cluster::replication_factor

An integer between 1 and 4 (inclusive) specifying the replication factor for the cluster. A higher value means more copies of each entry are kept across the nodes of the cluster.

global::cluster::max_versions

An integer representing the maximum number of copies the cluster keeps for transaction history. If an entry has more versions than this value, the oldest versions are garbage collected.
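The effect of max_versions can be pictured with a short Python sketch that keeps only the newest versions of an entry. The function name and the list representation are assumptions for illustration and do not reflect how quasardb stores versions.

```python
def prune_versions(versions, max_versions):
    """Keep the newest max_versions items; versions are ordered oldest to newest."""
    if max_versions < 1:
        raise ValueError("max_versions must be at least 1")
    # Slicing from the end drops the oldest versions first.
    return versions[-max_versions:]


history = ["v1", "v2", "v3", "v4", "v5"]
print(prune_versions(history, 3))
```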

global::cluster::max_transaction_duration

An integer representing the maximum guaranteed duration of a transaction, in milliseconds. Transactions lasting longer than this interval will be rolled back. Default value: 15,000 ms.

global::cluster::acl_cache_duration

An integer representing the maximum guaranteed duration of cached ACL information. Nodes may cache ACL information to improve performance.

global::security::enable_stop

Allows a node to be remotely stopped via an API call. False by default.

global::security::enable_purge_all

Allows the cluster to be remotely purged via an API call. False by default.

global::security::enabled

Require cryptographically strong authentication to connect to the cluster. True by default.

global::security::encrypt_traffic

In addition to requiring authentication, encrypt all network traffic. This setting can have a negative performance impact. False by default.

global::security::cluster_private_file

Specifies the path to the cluster private key file (see quasardb cluster key generator). This file must be accessible to the daemon only.

global::security::user_list

Specifies the path to the users list (see quasardb user adder). This file must be writable by the administrator only.

Performance considerations#

Persistence#

When using RocksDB as a persistence layer, a number of parameters must be set properly. Default values were engineered to be “safe” to use without any system-wide tuning. For production environments, it’s generally advised to change these default values.

Important settings to look at:

  • local::depot::rocksdb::disable_wal: Disabling the WAL means that fresh updates will be kept in memory and not written to disk. This can greatly increase performance at the cost of potential data loss.

  • local::depot::rocksdb::max_open_files: The default value of 0 makes QuasarDB use as many open files as is safe on the current system. If the automatic value isn’t satisfactory, ensure that your file descriptor limit is high enough. If the value is still incorrect, you can specify one manually. A low value will make RocksDB open and close files more often than necessary.

  • local::depot::rocksdb::sync_every_write: Disabled by default. Enabling it will greatly reduce performance for a small reliability benefit.

It is not recommended to touch the caching and threading parameters without the assistance of a QuasarDB Solutions Architect.

Network#

The default network settings ensure QuasarDB will not exhaust sockets and can be used for testing purposes. In production environments where the connections can be in the realm of thousands, the default settings will not be adequate.

Important settings to look at:

  • local::network::server_sessions: The default value of 64 will be insufficient for serious production use. Provided you have increased the file descriptor limits, this value can be greater than 1,024. Each partition will preallocate its sessions, meaning high values can result in high memory usage as well.

  • local::network::partitions_count: The number of independent processing partitions. Having more partitions than cores is generally counter-productive. Allocating half of your cores is a safe start, as QuasarDB will consume additional cores for persistence and upkeep tasks.

  • local::network::idle_timeout: The default value of 10 minutes means that a daemon guarantees it will not initiate termination of a client connection before 10 minutes. This setting may be too high if the daemon has to handle misbehaving applications that do not properly terminate the connection, resulting in premature session exhaustion.
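The "half of your cores" starting point for partitions_count can be computed as follows. This is a sketch only, and the function name is hypothetical. Note that os.cpu_count() reports logical CPUs, which may be higher than the number of physical cores on machines with simultaneous multithreading.

```python
import os


def suggested_partitions(core_count):
    """Half of the cores, but never fewer than one partition."""
    return max(1, core_count // 2)


# os.cpu_count() may return None when the count cannot be determined.
cores = os.cpu_count() or 1
print(suggested_partitions(cores))
```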

Memory management#

QuasarDB manages memory paging for you, meaning it will load data from disk if it is missing from memory and evict it from memory if usage exceeds the configured thresholds. Memory usage measurement is always approximate, which is why QuasarDB has an additional parameter that bases eviction on the number of entries in memory.

If the eviction thresholds are too low, QuasarDB will not properly use all the memory available and may spend a lot of time paging entries in and out.

If the eviction thresholds are too high, system memory may be exhausted, which at best results in dramatic performance loss and at worst in operating system failure.

Important settings to look at:

  • local::limiter::max_bytes: The maximum memory usage the daemon is allowed to have. When this setting is zero, QuasarDB will use half of the available memory. In the community edition, this setting is capped to 4 GiB. Ensure the baseline memory usage of QuasarDB is significantly lower than this value, otherwise QuasarDB will keep evicting entries.

  • local::limiter::max_resident_entries: The maximum number of entries allowed in memory. When this value is exceeded, QuasarDB will evict entries down to half of this setting. When this setting is zero, QuasarDB will use max_bytes divided by 1,024. This setting exists to protect QuasarDB and the OS should QuasarDB for some reason be unable to accurately measure memory usage.

Do not hesitate to contact your QuasarDB Solutions Architect should you be unsure of your memory management settings.
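The defaulting and eviction rules described for max_resident_entries amount to simple arithmetic, sketched here in Python. The function names are hypothetical and the 4 GiB figure is only an example value for max_bytes.

```python
GIB = 1024 ** 3


def effective_max_resident_entries(max_resident_entries, max_bytes):
    """A value of zero means 'max_bytes divided by 1,024'."""
    if max_resident_entries == 0:
        return max_bytes // 1024
    return max_resident_entries


def eviction_target(entry_limit):
    """When the limit is exceeded, entries are evicted down to half of it."""
    return entry_limit // 2


limit = effective_max_resident_entries(0, 4 * GIB)
print(limit, eviction_target(limit))
```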

Security#

Authentication has little to no impact on server performance. However, encrypting the traffic will cap the bandwidth at the encryption speed of your server. QuasarDB uses AEGIS-256 for traffic encryption. On 10 Gbit networks this can result in an observable drop in maximum transfer bandwidth.