Big question
LSM trees trade read performance for write performance. This paper presents bLSM, a Log Structured Merge (LSM) tree that combines the advantages of B-Trees and log-structured approaches. It claims near-optimal read and scan performance, and bounded write latency through its spring and gear merge scheduler.
Background
bLSM is designed as a backing store for Yahoo’s geographically distributed key-value storage system and for Walnut, an elastic cloud storage system.
Specific question
Reduce read amplification
It seems they use a fractal-tree-like structure, but the design is not clear to me. I do not yet understand how they implement the idea.
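To make "read amplification" concrete for myself, here is a minimal sketch of a generic LSM tree, not bLSM's actual design. The class `ToyLSM`, its `memtable_limit` parameter, and the use of plain dicts in place of on-disk components are all my own assumptions for illustration. It only counts how many components a lookup has to probe.

```python
class ToyLSM:
    """Toy LSM tree: one in-memory component plus immutable flushed runs.

    Hypothetical illustration only; each dict in `runs` stands in for an
    on-disk component.
    """

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []  # newest run first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        # Flush the in-memory component once it reaches its size limit.
        if len(self.memtable) >= self.memtable_limit:
            self.runs.insert(0, dict(self.memtable))
            self.memtable = {}

    def get(self, key):
        """Return (value, probes); probes counts the components searched."""
        probes = 1  # the in-memory component is always checked first
        if key in self.memtable:
            return self.memtable[key], probes
        for run in self.runs:  # newest to oldest
            probes += 1
            if key in run:
                return run[key], probes
        return None, probes


lsm = ToyLSM(memtable_limit=2)
for i in range(6):
    lsm.put(f"k{i}", i)

print(lsm.get("k5"))       # recent key: found in the newest run
print(lsm.get("k0"))       # old key: every component is probed
print(lsm.get("missing"))  # absent key: every component is probed too
```

In this toy, old keys and missing keys cost one probe per component, which is the cost any technique for reducing read amplification has to attack.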
Write pause
Level scheduler: a merge scheduler.
The explanation in the paper is not clear to me and is hard to follow.
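Since the paper's scheduler is still unclear to me, the sketch below is not the spring and gear algorithm. It is only a toy cost model, under my own assumptions, of why pacing merge work can bound write latency. The function names, the unit "cost" per entry, and the `merge_steps_per_write` parameter are hypothetical. The first policy makes the write that fills the in-memory component wait for the whole merge; the second spreads the same merge work over later writes.

```python
def blocking_write_costs(n_writes, c0_capacity):
    """Per-write cost when a full in-memory component is merged all at once."""
    costs = []
    c0_fill = 0
    for _ in range(n_writes):
        cost = 1  # the insert itself
        c0_fill += 1
        if c0_fill == c0_capacity:
            cost += c0_capacity  # this write pauses for the entire merge
            c0_fill = 0
        costs.append(cost)
    return costs


def paced_write_costs(n_writes, c0_capacity, merge_steps_per_write=1):
    """Per-write cost when each write does a small, fixed amount of merge work."""
    costs = []
    c0_fill = 0
    backlog = 0  # frozen entries still waiting to be merged
    for _ in range(n_writes):
        cost = 1  # the insert itself
        c0_fill += 1
        if c0_fill == c0_capacity:
            backlog += c0_fill  # freeze the full component, keep accepting writes
            c0_fill = 0
        work = min(merge_steps_per_write, backlog)
        backlog -= work
        cost += work
        costs.append(cost)
    return costs


print(max(blocking_write_costs(100, 10)))  # worst case grows with c0_capacity
print(max(paced_write_costs(100, 10)))     # worst case stays constant
```

The paced version assumes there is memory to hold a frozen component while new writes arrive. Whether and how bLSM does that is something I still need to check in the paper.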