3. Metrics reference

This document lists all kernel-level metrics available in QuasarDB, categorized by subsystem. Each metric includes its name, type (accumulator or gauge), and a concise description. This structure is optimized for both human readability and AI model training purposes.
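
The two metric types are read differently: a gauge is meaningful as a single sample, while an accumulator is usually turned into a rate by comparing two samples. The sketch below illustrates this. It assumes accumulators only grow between samples, and the reset handling and sample values are illustrative assumptions, not documented QuasarDB behaviour.

```python
def accumulator_rate(previous, current, interval_sec):
    """Per-second rate of an accumulator between two samples."""
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    delta = current - previous
    # Assumption of this sketch: a negative delta means the counter was
    # reset between samples, so the new value is used as the delta.
    if delta < 0:
        delta = current
    return delta / interval_sec


# Hypothetical samples of requests.total_count taken 10 seconds apart.
rate = accumulator_rate(previous=1_000, current=1_250, interval_sec=10)
print(rate)  # 25.0
```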

3.1. Cache

Metric Name

Type

Description

evicted.count

accumulator

How many evictions were done (from the QuasarDB cache)

evicted.total_bytes

accumulator

How many bytes were evicted

memory.persistence.cache_bytes

gauge

The total size, in bytes, of all block caches (RocksDB); the row cache is not used at this time

memory.persistence.memtable_bytes

gauge

The number of bytes in the memtables (RocksDB)

memory.persistence.memtable_unflushed_bytes

gauge

The number of unflushed bytes in the memtables (RocksDB)

memory.persistence.table_reader_bytes

gauge

Memory usage of all RocksDB table readers

memory.persistence.total_bytes

gauge

Total memory usage of the persistence layer: memtable bytes, table reader bytes, and cache bytes

memory.physmem.total_bytes

gauge

The total amount of physical memory in bytes detected on the machine

memory.physmem.used_bytes

gauge

The amount of physical memory in bytes used on the machine

memory.resident_bytes

gauge

The size in bytes of all entries currently in memory

memory.resident_count

gauge

The number of entries in memory. An entry is a value in the internal hash table; this count is correlated with, but not identical to, the number of tables.

memory.tbb.global_loc_total_bytes

gauge

Advanced internal stat used for debugging complex memory issues only

memory.tbb.huge_threshold_bytes

gauge

When the allocator will use huge pages (if supported)

memory.tbb.large_object_bytes

gauge

The total bytes of large objects (e.g. big allocations that don’t fit in the optimized structures)

memory.tbb.large_object_count

gauge

The total count of large objects (e.g. big allocations that don’t fit in the optimized structures)

memory.tbb.large_unaligned_bytes

gauge

Advanced internal stat used for debugging complex memory issues only

memory.tbb.max_requested_bytes

gauge

The largest allocation request ever made

memory.tbb.softlimit_bytes

gauge

The threshold, in bytes, where TBB will return memory to the OS. Below that threshold, TBB will hold the bytes.

memory.tbb.total_bytes

gauge

Total bytes currently allocated (managed) by TBB. Note that not every allocation in Quasar goes through TBB.

memory.tbb.total_count

gauge

The number of allocations made to TBB

memory.vm.total_bytes

gauge

How many bytes of virtual memory the process can use. This value is usually extremely high on 64-bit operating systems.

memory.vm.used_bytes

gauge

How many bytes of virtual memory the process is currently using. This can be much higher than the actual memory usage when memory is reserved but not actually used.

pageins.count

accumulator

How many page-ins were done by Quasar (from disk to Quasar cache)

pageins.total_bytes

accumulator

How many bytes were paged in

3.2. Cache - LRU2

QuasarDB uses a two-level LRU (LRU2) caching strategy consisting of a cold and hot layer. New entries are first placed in the cold cache and only promoted to the hot cache on repeated access. This design improves hit rates for frequently accessed data while avoiding pollution by one-time reads.

The LRU2 metrics help observe:
  • Cold/hot cache pressure (evictions, promotions)

  • Cache efficiency (hit ratios)

  • I/O load due to cache misses (page-ins)

Metric Name

Type

Description

lru2.cold.pagein.count

accumulator

Total number of entries read from disk into the cold cache layer

lru2.cold.pagein.total_bytes

accumulator

Total bytes read from disk into the cold cache layer

lru2.cold.evicted.count

accumulator

Number of entries removed from the cold cache before promotion to hot

lru2.cold.evicted.total_bytes

accumulator

Bytes removed from the cold cache before promotion to hot

lru2.cold.count

gauge

Current number of entries in the cold cache

lru2.hot.evicted.count

accumulator

Total number of evictions from the hot cache

lru2.hot.evicted.total_bytes

accumulator

Total bytes evicted from the hot cache

lru2.hot.promoted.count

accumulator

Total number of entries promoted from cold to hot cache

lru2.hot.promoted.total_bytes

accumulator

Total bytes promoted from cold to hot cache

lru2.hot.hit.count

accumulator

Number of cache hits in the hot layer (entry already promoted)

lru2.hot.hit.total_bytes

accumulator

Total bytes hit in the hot cache

lru2.hot.count

gauge

Current number of entries in the hot cache
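
As an illustration of how these counters can be combined, the sketch below derives two ratios from a snapshot of the LRU2 metrics. The formulas are assumptions of this example rather than definitions from QuasarDB: the hit ratio uses hot hits plus cold page-ins as its denominator, and the snapshot is a plain dictionary with made-up values.

```python
def lru2_summary(metrics):
    """Derive cache-efficiency ratios from LRU2 accumulators."""
    hits = metrics["lru2.hot.hit.count"]
    pageins = metrics["lru2.cold.pagein.count"]
    promoted = metrics["lru2.hot.promoted.count"]
    # Assumed definition: every lookup is either a hot hit or a page-in.
    lookups = hits + pageins
    return {
        "hot_hit_ratio": hits / lookups if lookups else 0.0,
        # Share of paged-in entries that were later promoted to hot.
        "promotion_ratio": promoted / pageins if pageins else 0.0,
    }


# Hypothetical snapshot; values are invented for illustration.
sample = {
    "lru2.hot.hit.count": 900,
    "lru2.cold.pagein.count": 100,
    "lru2.hot.promoted.count": 40,
}
print(lru2_summary(sample))
```

A low promotion ratio combined with a high cold eviction count would suggest that most page-ins are one-time reads, which is the pattern the cold layer is meant to absorb.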

3.3. Clustering

Metric Name

Type

Description

chord.invalid_requests_count

accumulator

How many times the client sent a request to the wrong node

chord.predecessor_changes_count

accumulator

How many times the predecessor changed. More than a couple of changes indicates that the cluster has issues.

chord.successor_changes_count

accumulator

Same as chord.predecessor_changes_count, but for the successor

chord.unstable_errors_count

accumulator

How many times we returned “unstable cluster” to the user

sync_with_master.elapsed_sec

accumulator

Cluster-to-cluster synchronization time elapsed, in seconds

sync_with_master.failures_count

accumulator

Cluster-to-cluster synchronization error count

sync_with_master.successes_count

accumulator

Cluster-to-cluster synchronization success count

3.4. Environment

Metrics related to the environment in which QuasarDB is running, such as the OS, license, QuasarDB version, etc.

Metric Name

Type

Description

hardware_concurrency_count

gauge

The value returned by std::thread::hardware_concurrency(); very useful for diagnosing problems

license.attribution_date_epoch

gauge

When the license was attributed, in seconds from epoch

license.expiration_date_epoch

gauge

Expiration date in seconds from epoch

license.max_memory_bytes

gauge

The maximum number of bytes allowed by the node

license.remaining_days_count

gauge

Number of days left until the license expires

license.support_until_epoch

gauge

When the support will expire in seconds from epoch

startup_epoch

gauge

Startup timestamp in seconds from epoch

3.5. Indexes

Metrics that relate to the microindex subsystem of QuasarDB, which speeds up queries.

Metric Name

Type

Description

queries.microindex.aggregation.match_count

accumulator

How many times an aggregation successfully leveraged the microindex

queries.microindex.aggregation.miss_count

accumulator

How many times an aggregation could not leverage the microindex

queries.microindex.filter.match_count

accumulator

How many times a filter (e.g. WHERE) successfully leveraged the microindex

queries.microindex.filter.miss_count

accumulator

How many times a filter (e.g. WHERE) could not leverage the microindex
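
The match and miss counters are easiest to read as a ratio. The sketch below computes the share of operations that used the microindex; the function name and the sample values are hypothetical, and the metrics are assumed to be available as a plain dictionary.

```python
def microindex_match_ratio(metrics, kind):
    """Share of operations of the given kind that used the microindex.

    kind is "aggregation" or "filter", matching the metric names above.
    """
    match = metrics[f"queries.microindex.{kind}.match_count"]
    miss = metrics[f"queries.microindex.{kind}.miss_count"]
    attempts = match + miss
    return match / attempts if attempts else 0.0


# Hypothetical snapshot; values are invented for illustration.
sample = {
    "queries.microindex.aggregation.match_count": 75,
    "queries.microindex.aggregation.miss_count": 25,
    "queries.microindex.filter.match_count": 10,
    "queries.microindex.filter.miss_count": 30,
}
print(microindex_match_ratio(sample, "aggregation"))  # 0.75
print(microindex_match_ratio(sample, "filter"))       # 0.25
```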

3.6. Network

Network-related metrics, useful for understanding the number of requests, simultaneous users and network throughput.

Metric Name

Type

Description

network.current_users_count

gauge

How many users currently have an active session

network.partitions_count

gauge

How many partitions there are

network.sessions.available_count

gauge

How many sessions are available

network.sessions.max_count

gauge

How many sessions exist in total

network.sessions.unavailable_count

gauge

How many sessions are currently busy

network.threads_per_partition_count

gauge

How many threads each partition has

requests.in_bytes

accumulator

How many bytes came in across all calls

requests.out_bytes

accumulator

How many bytes went out across all calls

requests.slow_count

accumulator

How many requests lasted longer than the threshold defined by the log slow operation setting

requests.total_count

accumulator

How many requests (across all calls) we have received: successes + failures

requests.successes_count

accumulator

How many successes (across all calls)

requests.failures_count

accumulator

How many failures / errors across all calls
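
A minimal sketch of how the session gauges and request accumulators might be combined into health indicators. The indicator names and sample values are hypothetical. The consistency check relies on requests.total_count being successes plus failures, as described above; on a live node the three counters may be sampled at slightly different instants, so a brief mismatch is not necessarily an error.

```python
def network_health(metrics):
    """Summarize session usage and request failures from one snapshot."""
    max_sessions = metrics["network.sessions.max_count"]
    busy = metrics["network.sessions.unavailable_count"]
    total = metrics["requests.total_count"]
    failures = metrics["requests.failures_count"]
    successes = metrics["requests.successes_count"]
    return {
        # Fraction of sessions that are currently busy.
        "session_utilization": busy / max_sessions if max_sessions else 0.0,
        # Fraction of all requests that failed.
        "failure_ratio": failures / total if total else 0.0,
        # total_count is described as successes + failures.
        "counts_consistent": total == successes + failures,
    }


# Hypothetical snapshot; values are invented for illustration.
sample = {
    "network.sessions.max_count": 48,
    "network.sessions.unavailable_count": 12,
    "requests.total_count": 200,
    "requests.successes_count": 195,
    "requests.failures_count": 5,
}
print(network_health(sample))
```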

3.7. Performance profiling

These metrics are only collected when network.profile_performance is enabled in qdbd. They are useful for understanding how busy the cluster is and where the majority of the time is spent.

Metric Name

Type

Description

perf.[name].[metric].total_ns

accumulator

Time spent, in nanoseconds, on the given perf metric of the function [name]

perf.[name].total_ns

accumulator

Aggregated total for the function [name]

perf.total_ns

accumulator

Total of all measured functions in the current performance trace; helpful for computing the share of a given function
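
The ratio mentioned for perf.total_ns can be sketched as follows. The function names in the sample (ts_insert, query) are hypothetical placeholders, and the key parsing assumes that [name] contains no dots.

```python
def perf_shares(metrics):
    """Share of perf.total_ns attributed to each measured function."""
    total = metrics.get("perf.total_ns", 0)
    shares = {}
    for key, value in metrics.items():
        parts = key.split(".")
        # perf.[name].total_ns has exactly three dot-separated parts;
        # perf.[name].[metric].total_ns has four and is skipped here.
        if len(parts) == 3 and parts[0] == "perf" and parts[2] == "total_ns":
            shares[parts[1]] = value / total if total else 0.0
    return shares


# Hypothetical snapshot; names and values are invented for illustration.
sample = {
    "perf.total_ns": 1000,
    "perf.ts_insert.total_ns": 750,
    "perf.ts_insert.deserialization.total_ns": 250,
    "perf.query.total_ns": 250,
}
print(perf_shares(sample))  # {'ts_insert': 0.75, 'query': 0.25}
```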

3.8. Storage

Metric Name

Type

Description

persistence.bucket.total_bytes

accumulator

How many bytes were written to disk for buckets, including large buckets

persistence.bucket.total_count

accumulator

How many times we wrote buckets to disk, including large buckets

persistence.bucket.total_us

accumulator

How many microseconds we spent writing buckets, including large buckets

persistence.bucket_deletion_count

accumulator

How many times we deleted from a bucket

persistence.bucket_insert_count

accumulator

How many times we inserted into a bucket

persistence.bucket_read_count

accumulator

How many times we read from a bucket

persistence.bucket_update_count

accumulator

How many times we updated a bucket

persistence.cloud_local_cache_bytes

gauge

The current size, in bytes, of the cloud cache (RocksDB + S3)

persistence.entries_count

gauge

The number of entries in the persistence layer, correlated with the number of tables/buckets, but usually higher

persistence.large_bucket.total_bytes

accumulator

How many bytes were written to disk for all the large buckets

persistence.large_bucket.total_count

accumulator

How many times we wrote a large bucket

persistence.large_bucket.total_us

accumulator

How many microseconds we spent writing large buckets

persistence.persistent_cache_bytes

gauge

The current size, in bytes, of the persisted cache. The persisted cache is used to cache slower I/O on faster I/O. Not to be confused with the cloud cache.

persistence.ts_write.failures_count

accumulator

How many “writes” (all ts operations) failed

persistence.ts_write.successes_count

accumulator

How many “writes” (all ts operations) succeeded

persistence.utilized_bytes

gauge

How many bytes used on disk, low-level RocksDB metric

persistence.read_bytes

gauge

How many bytes read from disk, low-level RocksDB metric

persistence.written_bytes

gauge

How many bytes written to disk, low-level RocksDB metric
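
Because the bucket write metrics come as a bytes/count/microseconds triple, averages per write can be derived from them. The sketch below is one way to do this; the function name and sample values are hypothetical, and the same approach would apply to the persistence.large_bucket.* triple.

```python
def bucket_write_averages(metrics, prefix="persistence.bucket"):
    """Average duration and size per bucket write for the given prefix."""
    count = metrics[f"{prefix}.total_count"]
    if count == 0:
        return {"avg_us": 0.0, "avg_bytes": 0.0}
    return {
        "avg_us": metrics[f"{prefix}.total_us"] / count,
        "avg_bytes": metrics[f"{prefix}.total_bytes"] / count,
    }


# Hypothetical snapshot; values are invented for illustration.
sample = {
    "persistence.bucket.total_bytes": 409_600,
    "persistence.bucket.total_count": 100,
    "persistence.bucket.total_us": 5_000,
}
print(bucket_write_averages(sample))  # {'avg_us': 50.0, 'avg_bytes': 4096.0}
```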

3.9. Storage - Async Pipelines

These metrics relate to the async pipelines storage subsystem, which is typically used in streaming-data use cases and can be heavy in CPU and memory usage.

Metric Name

Type

Description

async_pipelines.[number].buffer.bytes

gauge

How many bytes we have in the “merge” map of the async pipelines (a buffer)

async_pipelines.[number].buffer.count

gauge

How many entries we have in the “merge” map of the async pipelines (a buffer)

async_pipelines.buffer.total_bytes

accumulator

The number of bytes merged by the async pipelines (e.g. smaller requests merged into a larger one)

async_pipelines.buffer.total_count

accumulator

The number of merge operations

async_pipelines.busy_denied_count

accumulator

Writes denied because the pipe is full, for a given user

async_pipelines.busy_denied_count.total

accumulator

Same as async_pipelines.busy_denied_count, but for all users

async_pipelines.errors_count

accumulator

Errors for the current user ID

async_pipelines.errors_count.total

accumulator

Errors for all users

async_pipelines.low.state_write.duration_us

accumulator

The time elapsed to write the state of the low priority async pipes

async_pipelines.pulled.total_bytes

accumulator

How many bytes were pulled from the pipelines by the merger

async_pipelines.pulled.total_count

accumulator

How many times data was pulled from the pipelines by the merger

async_pipelines.pushed.total_bytes

accumulator

How many bytes were pushed to the pipelines by a user

async_pipelines.pushed.total_count

accumulator

How many times data was pushed to the pipelines by a user

async_pipelines.write.bytes_total

accumulator

How many bytes were written to disk

async_pipelines.write.elapsed_us

accumulator

How much time was spent writing to disk; this includes serialization, inserting into the timeseries structure in memory, etc.

async_pipelines.write.failures_count

accumulator

How many failures for the given user

async_pipelines.write.failures_count.total

accumulator

How many failures for all users

async_pipelines.write.successes_count

accumulator

How many successes for the given user

async_pipelines.write.successes_count.total

accumulator

How many successes for all users

3.10. Storage - Backups

These metrics relate to backups of the storage subsystem.

Metric Name

Type

Description

backup.elapsed_sec

accumulator

How much time, in seconds, we spent backing up

backup.failures_count

accumulator

Background backup failures

backup.successes_count

accumulator

Background backup successes

backup.total_bytes

accumulator

How many bytes were written to disk

3.11. Storage - Optimization

These metrics relate to background tasks and operations for the storage subsystem that help maintain performance and manage data lifecycle.

Metric Name

Type

Description

compact.cancelations_count

accumulator

Background compaction cancelations

compact.elapsed_sec

accumulator

How much time, in seconds, we spent compacting

compact.failures_count

accumulator

Background compaction failures

compact.successes_count

accumulator

Background compaction successes (not automatic, explicit calls)

trim.cancelations_count

accumulator

Background trim cancelations

trim.elapsed_sec

accumulator

Background trim duration

trim.failures_count

accumulator

Background trim failures

trim.successes_count

accumulator

Background trim successes

3.12. Metric Unit Interpretation

Metric names use suffixes to indicate the unit or value type:

Suffix

Meaning

_ns

Duration in nanoseconds

_us

Duration in microseconds

_sec

Duration in seconds

_epoch

Timestamp (seconds since Unix epoch)

_bytes

Byte count (e.g., memory or I/O)

_count

Count of operations or events

_total

Cumulative count or size
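
The suffix convention lends itself to a small lookup when labelling metrics in a dashboard or exporter. The sketch below maps a metric name to its unit using the suffixes listed above; the function name is hypothetical, and names that end in none of these suffixes return None.

```python
# Suffixes and meanings taken from the list above.
SUFFIX_UNITS = {
    "_ns": "nanoseconds",
    "_us": "microseconds",
    "_sec": "seconds",
    "_epoch": "seconds since Unix epoch",
    "_bytes": "bytes",
    "_count": "count",
    "_total": "cumulative count or size",
}


def unit_of(metric_name):
    """Return the unit implied by the metric name's suffix, or None."""
    for suffix, unit in SUFFIX_UNITS.items():
        if metric_name.endswith(suffix):
            return unit
    return None


print(unit_of("perf.total_ns"))                 # nanoseconds
print(unit_of("sync_with_master.elapsed_sec"))  # seconds
print(unit_of("memory.vm.used_bytes"))          # bytes
```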