2. Configuration
2.1. QuasarDB Daemon
Docker and the QuasarDB daemon each offer a distinct set of configuration options; both are covered below.
QuasarDB stores its configuration options in a qdbd.conf file. The exact location of this file depends on your installation method, but is typically /etc/qdb/qdbd.conf.
Each setting in this section is listed by its config file key, command-line argument, and environment variable, followed by its description.
2.1.1. License
You can set a license either by file or by putting the license key as a string directly into your configuration.
Config File |
Command-Line Argument |
Environment Variable |
Description |
---|---|---|---|
|
|
|
Absolute location to the license file. Be careful to ensure the license file is readable by the |
|
|
|
License key as a string. |
2.1.2. Parallelism
Warning
Improper parallelism configuration will negatively impact performance.
QuasarDB has been designed for multicore architectures. Queries coming from clients are processed in parallel in distinct shards called “partitions”. Each partition has a configurable number of threads.
In most instances, you want to use the unified setting, which gives QuasarDB a “budget” and lets it pick the right number of partitions and threads per partition for you.
Config File |
Command-Line Argument |
Environment Variable |
Description |
---|---|---|---|
|
|
|
How many cores to allocate to process queries. The default of 0 will let QuasarDB pick a number depending on the computing resources available on the computer. |
It’s possible to have total control over the configuration by using the partitions_count and threads_per_partition settings. If these values are not 0, the parallelism setting will be ignored and QuasarDB will use the exact configured number of partitions and threads per partition. It is strongly advised to set partitions_count and threads_per_partition to 0 and use the parallelism setting instead.
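The precedence between these settings can be sketched in a few lines of Python. This is an illustration only, not QuasarDB code: the function name and the returned strings are hypothetical, and it assumes the explicit pair takes effect only when both values are non-zero, which the text above does not state precisely.

```python
def resolve_parallelism(parallelism, partitions_count, threads_per_partition):
    """Describe which setting would win, following the rules in this section."""
    # Assumption: the explicit pair overrides `parallelism` only when both are non-zero.
    if partitions_count != 0 and threads_per_partition != 0:
        total = partitions_count * threads_per_partition
        return (f"explicit: {partitions_count} partitions x "
                f"{threads_per_partition} threads ({total} threads)")
    if parallelism == 0:
        # The default of 0 lets QuasarDB choose based on available resources.
        return "automatic: QuasarDB picks based on available resources"
    return f"budget: {parallelism} cores"


print(resolve_parallelism(24, 0, 0))
print(resolve_parallelism(24, 4, 2))
```

The second call shows why leaving the explicit settings at 0 is advised: once they are set, the parallelism budget no longer has any effect.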
Note
The storage layer has independent multithreading settings that are not affected by the settings above. In other words, the cores attributed to storage come in addition to the ones used for query processing.
For Docker: QuasarDB will dedicate this number of threads to serving requests. Use in conjunction with ROCKSDB_THREADS to properly tune for your available processing power.
2.1.3. Storage
Config File, Command-Line Argument & Environment Variable |
Description |
---|---|
|
Specifies which storage engine to use. Accepted values: |
|
Location to the root folder of where your database files are stored. Defaults to |
|
Location to the Write-Ahead-Log (WAL) folder. Defaults to empty, which will make it a subdirectory of the root directory called “wal”. |
|
An array in the form [{“path”: “path1”, “size”: size1}, {“path”: “path2”, “size”: size2}] that specifies which directories, in order, should be used for storage. When a storage location exceeds its capacity, older data is moved to the next location in the list. Actual usage may exceed the size specified. Size is specified in bytes. |
|
The compaction strategy to use. By default it is “leveled”, but “universal” is also accepted. Universal improves write amplification at the cost of space amplification. |
|
When this option is set to true, writes will not go to the Write-Ahead-Log (WAL), increasing performance at the cost of durability. Writes go to the WAL by default. Use this option with caution. |
|
Size, in bytes, of the Write-Ahead-Log (WAL). The WAL is where every update gets written. When the WAL is full, memtables are flushed to disk. If the WAL is too small, memtables may thus be flushed before they are full, impacting write speed. However, a large WAL means increased memory usage and potentially higher compaction. Your WAL should be large enough to absorb spikes, but cannot be used to compensate for a persistence layer too slow to absorb the load or memtables being too small. See |
|
How many threads will be dedicated to writing data to disk. Write heavy scenarios may benefit from a higher count. Defaults to |
|
The maximum amount of data to store, in bytes. Inserting data will fail with “quota exceeded” when the limit is reached. If set to 0, no check is done. |
|
A path to a local disk to be used as a persistent cache. May increase performance when data is stored in a remote disk. |
|
The maximum size of the persistent cache, in bytes. Cannot be zero if the persistent cache path is specified. |
|
If your persistent cache is on an NVMe drive, enabling this option may increase performance. |
|
Specifies additional configuration options to fine-tune RocksDB storage behavior. |
|
Allows configuration of specific options for block-based tables used in RocksDB storage. |
|
An integer representing how often quasardb will emit a warning about depleting disk space, in milliseconds. |
|
The interval for synchronizing the read-only RocksDB-cloud with the master database. It cannot exceed 30 minutes - half the time it takes for deleted files. |
|
A boolean value indicating whether the storage engine should create a new database if it doesn’t already exist. When set to true, the storage engine will create a new database if one with the specified name is not found. |
|
A boolean value that, when set to true, disables automatic compaction of SST files in the storage engine. Compaction is a process that merges and optimizes data files to improve storage efficiency. Disabling auto-compaction means users must manually run cluster_compact in qdbsh often. This boosts write performance but can harm read performance. |
|
A boolean value that, when set to true, prevents the storage engine from synchronizing manifest files with disk. Manifest files are used to track the state of the SST files in the storage engine. |
|
An integer specifying the maximum number of log files to keep in the storage engine’s log directory. Log files contain records of database operations and are used for crash recovery. |
|
An integer representing the time interval, in milliseconds, after which log files in the storage engine will be rolled. Rolling involves closing the current log file and starting a new one to manage the size of log files. |
|
An integer specifying the maximum size of log files in the storage engine, in bytes. When a log file reaches this size, it will be rolled to a new file. |
|
A boolean value indicating whether to perform extra checks for data integrity and correctness. When enabled, the storage engine will perform additional validation to catch potential errors. |
|
A boolean value indicating whether to perform extra checks on data files for integrity and correctness. Similar to paranoid_checks, this option applies specifically to data files in the storage engine. |
|
The rate limit is specified in bytes per second; a value of 0 indicates that rate limiting is disabled. This rate limit applies to the amount of I/O allocated for flushing and compaction, with flushing operations prioritized over compaction. |
|
A boolean value indicating whether the database should be opened in read-only mode. When set to true, write operations will be disallowed. |
|
Specifies the interval for emitting warnings about depleting disk space. This warning helps users take timely action to prevent running out of disk space. |
|
Should be increased if the daemon is under constant insertion pressure, and there is still available CPU: it increases the amount of threads the storage engine has available for background operations such as compaction. |
|
The number of threads allocated to high-priority operations such as flushes to disk. The default setting of 2 is sufficient for most cases. Increase if the server cannot keep up with the persistence layer. |
|
The AWS access key id to use for authentication. |
|
The secret key to use for authentication. You need both the access key and secret key set for authentication to work. |
|
Specifies a constant size for SST files within the SST file manager in cloud storage. The value is set to -1, indicating that the default size is used. |
|
Determines the number of objects listed in a single iteration when interacting with cloud storage. The value is set to 5000. |
|
Specifies the path to the SSL certificate authority (CA) certificate for secure communication with cloud storage. The value is an empty string. |
|
Controls SSL verification when communicating with cloud storage. The value is set to true. |
|
The size of the local SST cache, in bytes. If 0, no SST file will be kept locally. The default value is 18446744073709551615, which means every SST file will be kept locally and if the files are missing at startup, they will be downloaded in the background. |
|
The timeout used for cloud queries, in milliseconds. The default of 0 means that the cloud provider’s default will be used. |
|
Will create the bucket if it does not exist. |
|
The bucket name where to store the data. On AWS, the bucket name must be unique. If name is empty, the database will not upload its content to the cloud. |
|
The bucket name where to read the data from. On AWS, the bucket name must be unique. If name is empty, the database will not read its content from the cloud. |
|
The configuration file to use for authentication. When using a configuration file, |
|
Enabling this option leverages the parallelism of uploads and downloads of files to and from S3. This optimization enhances performance but necessitates increased memory usage for various buffers. |
|
The number of threads to be used by the AWS transfer manager. |
|
The buffer size for the AWS transfer manager. Files larger than this size are broken into smaller pieces and transferred in parallel. |
|
This configuration is intended for non-AWS S3 deployments. It allows you to specify a different hostname or IP address to communicate with, replacing the default AWS S3 endpoint. |
|
The section name in the configuration file with the |
|
Enabling server-side encryption, if set to true, utilizes AWS KMS when used with encryption_key_id in S3 mode. In other cases, it employs Amazon’s automatically generated S3 server-side encryption key, if provided, to perform encryption. |
|
When enabled, this configuration utilizes EC2 instance metadata for automatic authentication credential resolution when communicating with the S3 bucket. It is typically recommended to set this to |
|
Quasardb will use |
|
The region where the bucket is located. |
|
A string, either “none” or “aws”, specifying the cloud provider used to store the cloud data. |
For more information regarding RocksDB tuning, refer to the RocksDB Tuning Guide.
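The tiered storage array described in the table above is easy to get wrong by hand, since sizes are given in bytes. The following Python sketch builds such an array from (path, size) pairs. The helper name and both paths are hypothetical, and only the array shape is taken from the description.

```python
import json

GIB = 1024 ** 3  # sizes in the array are specified in bytes


def tiered_storage(tiers):
    """Build the ordered [{"path": ..., "size": ...}] array from (path, size_in_bytes) pairs."""
    # Order matters: older data moves to the next entry when one fills up.
    return [{"path": path, "size": size} for path, size in tiers]


# Hypothetical layout: a fast disk first, a larger and slower disk second.
config_value = tiered_storage([
    ("/mnt/fast/qdb", 100 * GIB),
    ("/mnt/slow/qdb", 2000 * GIB),
])
print(json.dumps(config_value, indent=2))
```

Keep in mind that, as noted in the table, actual usage may exceed the configured size, so leave headroom on each disk.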
2.1.4. Networking
The QuasarDB daemon uses TCP for network communications and listens on two ports. The primary port is configurable and is the port used by client applications. Its default value is 2836. The secondary port is used for cluster discovery mechanisms, and its queries run on separate, higher-priority partitions. This design ensures that when the server is busy processing queries, it can still service cluster management requests and preserve stability.
The value of the secondary port is set to the value of the primary port plus one (1). Thus, by default, the value of the secondary port is 2837.
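When preparing firewall rules or port mappings, it can help to derive the secondary port from the primary one rather than hard-coding both. A minimal Python sketch of the rule above follows; the function and constant names are hypothetical.

```python
DEFAULT_PRIMARY_PORT = 2836  # default primary port stated above


def secondary_port(primary_port=DEFAULT_PRIMARY_PORT):
    """Return the secondary port, defined as the primary port plus one."""
    # The secondary port must itself be a valid TCP port, hence the 65534 upper bound.
    if not 1 <= primary_port <= 65534:
        raise ValueError(f"primary port out of range: {primary_port}")
    return primary_port + 1


print(secondary_port())      # secondary port for the default primary port
print(secondary_port(3000))  # secondary port for a custom primary port
```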
Config File |
Command-Line Argument |
Environment Variable |
Description |
---|---|---|---|
|
|
|
Local address and port to bind to for accepting new connections. Can also refer to network adapter, e.g. |
|
|
|
Specify this if the local address the daemon binds to is not the address other nodes or clients should use to connect to it (e.g. when behind a NAT). Defaults to |
|
|
|
Time (in milliseconds) until a server socket is automatically recycled. |
|
|
|
Time (in milliseconds) until a client connection without activity is automatically disconnected. |
|
|
|
Size of the buffer the server allocates for receiving data from clients. Increase this size if you experience unexpected closed connections from the client, or the server logs mention |
|
|
|
Size of the buffer the server allocates for sending data to clients. Increase this size if you experience unexpected closed connections from the client, or the server logs mention |
|
|
|
Enables publication of changes on the firehose. |
|
|
|
Endpoint on which the firehose listens for subscribers. Only active if the firehose publisher is enabled. |
|
|
|
A string representing the name of the persisted firehose. An empty string disables the persisted firehose. The default value is “$qdb.firehose”. |
|
|
|
Specifies the size (in milliseconds) of a shard in a persisted firehose. A shard is a logical unit of data within a firehose stream. |
|
|
|
Enable replication. Enabled by default. |
|
|
|
Specifies the number of threads used for publishing data to Firehose. The default value is 1. |
|
|
|
Controls whether the system checks for new versions. |
|
|
|
Defines the maximum number of total server sessions allowed. |
|
|
|
Specifies the threshold in milliseconds for logging slow network operations. |
2.1.5. Clustering
Config File |
Command-Line Argument |
Environment Variable |
Description |
---|---|---|---|
|
|
|
Unique identifier for this node. See also: Node ID configuration. |
|
|
|
One or more peers to connect to in order to discover and bootstrap the cluster. Peers should be specified as a JSON array of tuples of address:port, e.g. |
|
|
|
Replication factor of data stored inside the cluster. Must be equal to or lower than the number of nodes in your cluster, and not higher than |
|
--global-cluster-acl-cache-duration |
|
An integer representing the maximum guaranteed duration of cached ACL information. Nodes may cache ACL information to enhance performance. |
|
--global-cluster-acl-cache-size |
|
An integer representing the maximum number of cached ACL information entries per node. Cached ACL information helps optimize performance. |
|
--global-cluster-enable-remote-acl-fetch |
|
Enabling this feature allows the cluster to ensure that ACL information remains up-to-date across distributed systems. When set to true, the cluster will actively fetch ACL data remotely. |
|
|
|
An integer representing the maximum guaranteed duration of a transaction, in milliseconds. Transactions exceeding this interval are rolled back. The default value is 15,000 ms. |
|
|
|
An integer indicating the maximum number of copies the cluster retains for transaction history. If an entry has more versions than this value, the oldest versions are subject to garbage collection. |
|
|
|
The maximum wait interval between two stabilizations, in milliseconds. Nodes disappearance will take at least that amount of time. The default value is 60,000 ms (one minute). This value must be greater than the minimum stabilization interval and cannot be lower than 10 ms. |
|
|
|
The minimum wait interval between two stabilizations, in milliseconds. The default value is 100 ms; it is rarely needed to change this value. This value cannot be zero. |
|
|
|
A read-only boolean value representing whether or not this node should stabilize upon startup. Even if set to true, stabilization will still occur. |
|
|
|
Enable or disable synchronization of data before joining the cluster. Enabling it might provoke spurious data downloads and higher disk usage. Disabling it on a clear node might result in missing historical data on this node. |
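The constraints on the two stabilization intervals listed in the table above can be checked before deploying a configuration. The Python sketch below encodes only the rules stated in the descriptions; the function name and message wording are hypothetical, and the daemon may apply further checks of its own.

```python
def check_stabilization_intervals(min_ms, max_ms):
    """Return a list of rule violations for the stabilization wait intervals."""
    problems = []
    if min_ms == 0:
        problems.append("minimum interval cannot be zero")
    if max_ms < 10:
        problems.append("maximum interval cannot be lower than 10 ms")
    if max_ms <= min_ms:
        problems.append("maximum interval must be greater than the minimum interval")
    return problems


# Documented defaults: 100 ms minimum, 60,000 ms maximum.
print(check_stabilization_intervals(100, 60_000))
print(check_stabilization_intervals(100, 50))
```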
2.1.6. Security
Config File |
Command-Line Argument |
Environment Variable |
Description |
---|---|---|---|
|
|
|
Boolean that determines whether security should be enabled. Valid values are |
|
|
|
Absolute location to the file that contains the QuasarDB users. Typically should be |
|
|
|
Absolute location to the cluster private key file. Typically should be |
|
|
|
Boolean that determines whether node-to-node communication should be encrypted. Valid values are |
|
|
|
Allows the cluster to be remotely purged via an API call. False by default. |
|
|
|
Allows a node to be remotely stopped via an API call. False by default. |
2.1.7. Performance tuning
Config File |
Command-Line Argument |
Environment Variable |
Description |
---|---|---|---|
|
|
|
How many sessions (connections) each server thread is able to handle concurrently. Too few will cause significant performance degradations. Increase this value if you see “out of free server sessions” messages in the logs. |
|
|
|
The number of threads in the main worker pool. Handles blocking I/O operations, and should be increased if your clients are waiting on the server but the server is not at 100% CPU. These are the queries handled by the main port. |
|
|
|
The number of threads dedicated to high-priority service messages. The default value of 2 is adequate for most setups and should only be increased for clusters with more than 20 nodes. These are the queries handled by the secondary port. |
|
|
|
Maximum number of open files that the storage engine is allowed to use. This should typically be close to the number of files the |
|
|
|
The maximum amount of memory (in bytes) QuasarDB will use. Any usage above this will cause QuasarDB to evict entries from its cache. |
|
|
|
Same as |
|
|
|
Determines whether newly written data is automatically cached in memory. When set to |
|
|
|
An integer (0-100) representing the percentage of the hard limit at which drastic measures are taken. For example, with a value of 90, drastic measures are triggered when memory usage exceeds 90% of the hard limit. |
|
|
|
An integer (0-100) representing the percentage of the soft limit. When memory usage approaches this percentage, entries are evicted to prevent memory exhaustion. |
|
|
|
An integer representing the total maximum number of updated entries that may be queued for asynchronous trimming. Trimming is a background process that optimizes disk space. |
|
|
|
An integer representing the interval (in ms) at which memory statistics are refreshed. Provides insight into system memory usage. |
|
|
|
An integer representing the intensity of soft eviction. It determines the number of entries that are evicted for each newly cached item when the memory usage approaches the soft limit. |
|
|
|
An integer representing the threshold (in bytes) above which memory allocations use huge pages, improving memory performance for large allocations. |
|
|
|
An integer representing the soft memory limit for the Thread Building Blocks (TBB) allocator, responsible for memory management in the cluster. |
|
|
|
When set to true, the TBB allocator uses huge pages for memory allocations to improve memory performance for large allocations. |
2.1.8. Observability
Config File |
Command-Line Argument |
Environment Variable |
Description |
---|---|---|---|
|
|
|
An integer representing verbosity. Use |
|
|
|
Absolute path where log files will be written into. Defaults to |
|
|
|
Enables logging in JSON file format. Required to display user properties. Disabled by default. |
|
|
|
Enables performance profiling from the server side, so that clients can start running performance traces. It has a small performance impact, and is disabled by default. |
|
|
|
Logs a warning message when a push increases shard volume by less than the set percentage. The default value is 0%, which disables this logging. |
|
|
|
Enables runtime statistics collected server side. It has a small performance impact, and is disabled by default. |
|
|
|
Time (in milliseconds) between refreshes of the statistics. Should typically be set to the interval in which you query the node statistics; e.g. if you only poll your statistics once a minute, this value should be |
2.1.9. Asynchronous Timeseries Inserter
The asynchronous timeseries inserter is a mechanism in QuasarDB to buffer inserts server-side, which you can tune using the configuration options below.
Config File |
Command-Line Argument |
Environment Variable |
Description |
---|---|---|---|
|
|
|
The number of asynchronous time series pipelines. Increase this number if you are experiencing slow inserts, but CPU is not yet at 100%. |
|
|
|
Maximum size of the buffer in bytes for each of the pipelines. Increase this value if you are experiencing slow inserts and CPU is at 100%. |
|
|
|
Maximum number of entries to buffer for each of the pipelines. Increase this value if you are experiencing slow inserts and CPU is at 100%. |
|
|
|
Maximum time duration between consecutive pipeline flushes. |
|
|
|
The maximum time duration, in milliseconds, allowed for old data to remain in the asynchronous time series flush pipeline before it is forced to be flushed. Data that remains in the pipeline beyond this deadline will be flushed to storage. The default value is 3,600,000 ms (1 hour). |
|
|
|
A threshold that determines which data is considered “old” and should be processed by the asynchronous time series inserter. Data older than this threshold, expressed in milliseconds, will be processed. The default value is 0 ms. |
2.2. QuasarDB Rest Server
See quasardb REST API for detailed information on configuring the QuasarDB Rest Server, including available options for command-line arguments, environment variables, and JSON config files.
2.3. Docker
As it can be a bit tedious to edit a configuration file inside a Docker container, the bureau14/qdb Docker container provides several environment variables you can use to set the most common configuration options.
Below is an overview of the environment variables the image supports:
Variable |
Example usage |
---|---|
|
docker run -d \
-e QDB_LOCAL_USER_LICENSE_KEY="$your_license_key" \
bureau14/qdb
|
|
docker run -d \
-v /path/to/my/qdb.key:/qdb.key \
-e QDB_LOCAL_USER_LICENSE_FILE=/qdb.key \
bureau14/qdb
|
|
docker run -d \
-e QDB_LOCAL_NETWORK_LISTEN_ON=0.0.0.0:2836 \
bureau14/qdb
|
|
docker run -d \
-e QDB_LOCAL_NETWORK_ADVERTISE_AS="172.16.64.8" \
bureau14/qdb
|
|
docker run -d \
-e QDB_LOCAL_LOGGER_LOG_LEVEL=1 \
bureau14/qdb
|
|
docker run -d \
-v /my/qdb/log/output:/logs \
-e QDB_LOCAL_LOGGER_LOG_DIRECTORY="/logs" \
bureau14/qdb
|
|
docker run -d \
-e QDB_GLOBAL_SECURITY_ENABLED=true \
\
-v /path/to/my/qdb/private.key:/private.key \
-e QDB_GLOBAL_SECURITY_CLUSTER_PRIVATE_FILE=/private.key \
\
-e QDB_GLOBAL_SECURITY_USER_LIST=... \
bureau14/qdb
|
|
docker run -d \
-e QDB_GLOBAL_CLUSTER_REPLICATION_FACTOR=3 \
\
bureau14/qdb
|
|
docker run -d \
-e QDB_LOCAL_NETWORK_TOTAL_SERVER_SESSIONS=8192 \
\
bureau14/qdb
|
|
docker run -d \
--cpus=32 \
-e QDB_LOCAL_NETWORK_PARALLELISM=24 \
-e QDB_LOCAL_DEPOT_ROCKSDB_THREADS=8 \
-e QDB_LOCAL_DEPOT_ROCKSDB_HI_THREADS=1 \
\
bureau14/qdb
|
|
docker run -d \
-e QDB_GLOBAL_CLUSTER_PUBLISH_FIREHOSE="true" \
\
bureau14/qdb
|
|
docker run -d \
-e QDB_GLOBAL_CLUSTER_PUBLISH_FIREHOSE="true" \
-e QDB_LOCAL_NETWORK_FIREHOSE_PUBLISHING_THREADS=4 \
\
bureau14/qdb
|
|
docker run -d \
-e QDB_LOCAL_LIMITER_MAX_BYTES_SOFT=17179869184 \
-e QDB_LOCAL_LIMITER_MAX_BYTES_HARD=25769803776 \
bureau14/qdb
|
|
docker run -d \
-e QDB_LOCAL_LIMITER_MAX_BYTES_SOFT=17179869184 \
-e QDB_LOCAL_LIMITER_MAX_BYTES_HARD=25769803776 \
bureau14/qdb
|
|
docker run -d \
-e QDB_LOCAL_DEPOT_ROCKSDB_THREADS=8 \
bureau14/qdb
|
|
docker run -d \
-e QDB_LOCAL_DEPOT_ROCKSDB_THREADS=16 \
-e QDB_LOCAL_DEPOT_ROCKSDB_HI_THREADS=2 \
bureau14/qdb
|
|
docker run -d \
-e QDB_LOCAL_DEPOT_ROCKSDB_CLOUD_PROVIDER="aws" \
bureau14/qdb
|
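Several of the variables above take values in bytes, which are easy to mistype. As a rough sketch, the Python helper below assembles a docker run command from a dictionary of environment variables and computes the byte values from GiB. The helper name is hypothetical; the variable names and the image name are the ones used in the examples above.

```python
import shlex

GIB = 1024 ** 3


def docker_run_command(env, image="bureau14/qdb"):
    """Build a `docker run -d` command line with one -e flag per environment variable."""
    args = ["docker", "run", "-d"]
    for name, value in env.items():
        args += ["-e", f"{name}={value}"]
    args.append(image)
    # shlex.join quotes any argument that needs it for a POSIX shell.
    return shlex.join(args)


cmd = docker_run_command({
    "QDB_LOCAL_LIMITER_MAX_BYTES_SOFT": 16 * GIB,
    "QDB_LOCAL_LIMITER_MAX_BYTES_HARD": 24 * GIB,
})
print(cmd)
```

The 16 GiB and 24 GiB inputs reproduce the byte values shown in the memory limiter examples above (17179869184 and 25769803776).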
2.4. Configuring Short-Term Credentials for Amazon S3
When connecting to Amazon S3 using short-term credentials, it’s essential to properly configure your application. This section guides you through the necessary steps and parameters.
2.4.1. Configuration Parameters
You need to specify two configuration parameters in your application:
rocksdb.cloud.aws.config_file: The name of the configuration file.
rocksdb.cloud.aws.config_file_section: The section name in the configuration file.
2.4.2. Format of the Configuration File
Within the specified configuration file, you need to create a section (e.g., [section_name]) with three key-value pairs:
aws_access_key_id: Your AWS access key ID.
aws_secret_access_key: Your AWS secret access key.
aws_session_token: Your AWS session token (for short-term credentials).
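A file in this format can be generated with Python's standard configparser module. The sketch below is illustrative: the helper name, the section name qdb_short_term, and the placeholder values are hypothetical, and it assumes the usual INI layout of AWS credential files.

```python
import configparser
import io


def render_credentials(section, access_key_id, secret_access_key, session_token):
    """Render a credential file with one section holding the three required keys."""
    # interpolation=None keeps '%' characters in values from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser[section] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        "aws_session_token": session_token,
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Placeholder values only; substitute the credentials issued to you.
text = render_credentials("qdb_short_term", "EXAMPLE_KEY_ID", "EXAMPLE_SECRET", "EXAMPLE_TOKEN")
print(text)
```

Write the result to a dedicated file and point config_file at it, with config_file_section set to the section name you chose.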
2.4.3. QDBD Configuration
The daemon configuration for short-term credentials uses the following JSON fields:
"aws": {
    "config_file": "[credential file path]",
    "config_file_section": "",
    "bucket": {
        "destination_bucket": "[bucket name]",
        "path_prefix": "",
        "region": "",
        "source_bucket": "[bucket name]"
    },
    "use_instance_auth": false
},
"provider": "aws",
"config_file": The path to the credential file.
"config_file_section": The section name in the credential file.
"bucket": Configuration related to the S3 bucket.
"use_instance_auth": A boolean value indicating whether to use instance authentication (set to false for local machines).
"provider": The provider (in this case, “aws”).
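Before starting the daemon, it can be useful to confirm that the section named in config_file_section actually contains the three keys required for short-term credentials. The following Python sketch performs that check on the text of a credential file. The function name and sample content are hypothetical, and it assumes the file uses the standard INI layout.

```python
import configparser

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key", "aws_session_token")


def missing_credential_keys(credential_text, section):
    """Return the required keys that are absent from the given section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credential_text)
    if not parser.has_section(section):
        # A missing section means every required key is missing.
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if not parser.has_option(section, key)]


# Sample content with the session token left out on purpose.
sample = """[qdb_short_term]
aws_access_key_id = EXAMPLE_KEY_ID
aws_secret_access_key = EXAMPLE_SECRET
"""
print(missing_credential_keys(sample, "qdb_short_term"))
```

An empty list means the section is complete; any listed key must be added before the daemon can use the file for short-term credentials.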
2.4.4. Copying Credentials
It is also possible to copy and paste the credential file from either the Command Line or Programmatic Access. Remember to create a new credential file rather than overwrite the default AWS credential file.