1.5. Storage Optimization Techniques

This section describes the storage-engine concepts behind QuasarDB’s storage efficiency: log-structured merge trees, sorted string tables, and compaction.

1.5.1. LSM, Compaction, and SST Concepts

1.5.1.1. LSM

Log-structured merge trees (LSM trees) are the foundation of QuasarDB’s storage engine. They improve write performance by buffering writes in memory and writing them to disk sequentially: new data is first written to a memtable (an in-memory structure), and once enough data has accumulated, the memtable is flushed to disk as an immutable SSTable. On disk, SSTables are organized into levels. Each SSTable is sorted by key, which keeps read operations efficient.
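The write path described above can be sketched in a few lines of Python. This is a toy model for illustration only, not QuasarDB’s or RocksDB’s implementation: the class name `ToyLSM`, the `memtable_limit` threshold, and the use of in-memory lists in place of on-disk files are all assumptions.

```python
from typing import Dict, List, Optional, Tuple

# An "SSTable" here is just an immutable-by-convention list sorted by key.
SSTable = List[Tuple[str, str]]


class ToyLSM:
    """Toy LSM write path: writes go to a memtable, which is flushed as a sorted run."""

    def __init__(self, memtable_limit: int = 3) -> None:
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable: Dict[str, str] = {}
        self.sstables: List[SSTable] = []  # oldest first

    def put(self, key: str, value: str) -> None:
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Write the memtable out as a new sorted table and start a fresh memtable."""
        if not self.memtable:
            return
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key: str) -> Optional[str]:
        # The memtable holds the most recent writes, so check it first.
        if key in self.memtable:
            return self.memtable[key]
        # Then search tables from newest to oldest; the first hit wins.
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None


db = ToyLSM(memtable_limit=2)
db.put("b", "1")
db.put("a", "2")  # reaches the limit, so the memtable is flushed
db.put("a", "3")  # newer value stays in the memtable for now
```

After these calls, `db.sstables` holds one sorted table, and `db.get("a")` returns the newer value from the memtable rather than the older one on "disk".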

Note

For further understanding, refer to: Log-structured merge trees on Wikipedia

1.5.1.2. SST

Sorted string tables (SSTables) are the immutable files in which QuasarDB stores data on disk. Each SSTable reflects the data as it was when the table was written. Because its contents are sorted, a lookup can locate an entry without scanning the whole file, which reduces disk reads and improves read performance.

Sorted order also makes range queries efficient, since all keys in a range are stored next to each other. SSTables are never modified in place; instead, compaction periodically merges them, removing redundant data and improving storage efficiency.
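To illustrate why sorted order speeds up access, here is a minimal sketch of point and range lookups using binary search. The `SortedTable` class is hypothetical and keeps everything in memory; a real SSTable adds on-disk blocks, indexes, and filters that are not modeled here.

```python
from bisect import bisect_left
from typing import Iterable, List, Optional, Tuple


class SortedTable:
    """In-memory stand-in for an SSTable: keys are kept in sorted order."""

    def __init__(self, items: Iterable[Tuple[str, str]]) -> None:
        pairs = sorted(dict(items).items())
        self._keys: List[str] = [k for k, _ in pairs]
        self._values: List[str] = [v for _, v in pairs]

    def get(self, key: str) -> Optional[str]:
        """Point lookup via binary search: O(log n) comparisons."""
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def scan(self, lo: str, hi: str) -> List[Tuple[str, str]]:
        """Return all pairs with lo <= key < hi as one contiguous slice."""
        start = bisect_left(self._keys, lo)
        end = bisect_left(self._keys, hi)
        return list(zip(self._keys[start:end], self._values[start:end]))


table = SortedTable([("c", "3"), ("a", "1"), ("b", "2")])
```

With this table, `table.get("b")` finds its entry by bisection, and `table.scan("a", "c")` returns the entries for `a` and `b` without touching the rest of the table.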

1.5.1.3. Compaction and Periodic Compaction

Compaction keeps data organized and storage usage under control. During compaction, multiple SSTables are merged, and redundant or obsolete entries, such as overwritten values and deleted keys, are discarded. The result is fewer, larger SSTables, which improves read performance and reduces the number of files on disk. Without compaction, storage would grow with every write and read performance would degrade over time.
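The merge step can be sketched as follows. This is a simplified illustration, not RocksDB’s compaction algorithm: the function name `compact`, the use of `None` as a deletion marker, and the assumption that every input table has unique keys are all choices made for this example.

```python
import heapq
from typing import List, Optional, Tuple

# A value of None marks a deleted key (a "tombstone") in this sketch.
Entry = Tuple[str, Optional[str]]


def compact(tables: List[List[Entry]]) -> List[Entry]:
    """Merge sorted tables (ordered oldest first) into a single sorted table.

    For each key, only the entry from the newest table survives.
    Keys whose newest entry is a tombstone are dropped entirely.
    """
    # Tag entries with the negated table index so that, for equal keys,
    # the entry from the newest table is yielded first by the merge.
    tagged = [
        [(key, -index, value) for key, value in table]
        for index, table in enumerate(tables)
    ]
    merged: List[Tuple[str, str]] = []
    last_key = None
    for key, _, value in heapq.merge(*tagged, key=lambda e: (e[0], e[1])):
        if key == last_key:
            continue  # an older, overwritten version of a key already handled
        last_key = key
        if value is not None:
            merged.append((key, value))
    return merged


older = [("a", "1"), ("b", "2"), ("c", "3")]
newer = [("b", "20"), ("c", None)]  # overwrites b, deletes c
result = compact([older, newer])
```

Here `result` contains only `a` with its original value and `b` with its newer value. Dropping tombstones outright, as this sketch does, is only safe when the merge includes every older table that might still hold the deleted key.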

In addition, QuasarDB’s underlying storage engine, RocksDB, provides a “periodic compaction” feature. It automatically recompacts SST files once they reach a configured age, even if no other compaction would have selected them. This ensures that data whose Time to Live (TTL) has expired is removed within a bounded time instead of lingering in files that are rarely rewritten.

For further understanding, refer to: Periodic compaction

Note

The Time to Live (TTL) feature relies on compaction for data retention: data whose TTL has expired is deleted during the compaction process, not at the moment it expires. Expired data may therefore remain on disk until the SSTable that contains it is next compacted.
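The interaction between TTL and periodic compaction can be sketched as two small checks. Both functions, their parameter names, and the record layout are hypothetical and only illustrate the idea; they do not reflect QuasarDB’s configuration options or RocksDB’s API.

```python
from typing import List, Tuple

# (key, write_timestamp_in_seconds, value); layout assumed for this sketch.
Record = Tuple[str, float, str]


def drop_expired(table: List[Record], ttl_seconds: float, now: float) -> List[Record]:
    """Filter applied while a table is rewritten: keep only unexpired records."""
    return [record for record in table if now - record[1] < ttl_seconds]


def needs_recompaction(file_created_at: float, period_seconds: float, now: float) -> bool:
    """Periodic compaction check: a file is due once it reaches the configured age."""
    return now - file_created_at >= period_seconds


table = [("k1", 0.0, "old"), ("k2", 90.0, "fresh")]
kept = drop_expired(table, ttl_seconds=60, now=100.0)
due = needs_recompaction(file_created_at=0.0, period_seconds=50, now=100.0)
```

In this example the file is due for recompaction, and rewriting it drops `k1`, whose TTL has elapsed, while keeping `k2`. Without the periodic check, `k1` would stay on disk until some other compaction happened to rewrite the file.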