\chapter{Related Work}
\label{chap:related-work}

While we have already discussed at length the most directly relevant
background work in the area of dynamization in
Chapter~\ref{chap:background}, there are a number of other lines of work
that relate, either directly or superficially, to our ultimate goal of
automating some, or all, of the process of constructing a database
index. In this chapter, we discuss the most notable of these works.

\section{Existing Applications of Dynamization}

We begin with the most directly relevant topic: other papers that use
dynamization to construct specific dynamic data structures. A few works
introduce a new static data structure and apply dynamization to it
simply for the purpose of adding update support; for example, both the
PGM learned index~\cite{pgm} and the KDS tree~\cite{xie21} propose
static structures and then apply dynamization techniques to support
updates. In this section, however, we focus on works in which
dynamization is a major focus of the paper, not merely an incidental
tool.

One of the older applications of the Bentley-Saxe method is the
Bkd-tree~\cite{bkd-tree}, a search tree for multi-dimensional searching
based on the kd-tree~\cite{kd-tree} and designed for use on external
storage.
While it was not the first external kd-tree, existing implementations
struggled to support updates, which were typically inefficient and
resulted in node structures that utilized space poorly (i.e., many nodes
were mostly empty, leading to less efficient searches). To resolve these
problems, the authors took a statically constructed external
structure,\footnote{To clarify, the K-D-B-tree does support updates, but
poorly. For the Bkd-tree, the authors exclusively used the K-D-B-tree's
highly efficient static bulk-loading.} the K-D-B-tree, and combined it
with the logarithmic method to create a full-dynamic structure (the
K-D-B-tree supports deletes natively, so the problem is deletion
decomposable). The major contribution of this paper, per the authors,
was not the structure itself so much as a demonstration of the viability
of the logarithmic method: they showed extensively that their dynamized
static structure was able to outperform natively dynamic implementations
on external storage.

A more recent application of the logarithmic method is to the Mantis
structure for large-scale DNA sequence search~\cite{mantis-dyn}.
Mantis~\cite{mantis} is one of the fastest and most space-efficient
structures for sequence search, but it is static. To create a
half-dynamic version of Mantis, the authors first design an algorithm to
efficiently merge multiple Mantis structures together (turning their
problem into an MDSP, though they do not use this terminology). They
then apply a modified version of the logarithmic method, including
LSM-tree-inspired features such as leveling, tiering, and a scale
factor. The resulting structure was shown to perform quite well compared
to other existing solutions.
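Both of the systems above are built on the logarithmic method, whose
insert procedure works like incrementing a binary counter over blocks of
geometrically increasing size. The following is a minimal illustrative
sketch, assuming a simple membership query as the decomposable search
problem; all names (\texttt{Block}, \texttt{build}) are hypothetical and
not drawn from any of the cited systems.

```python
# Hypothetical sketch of the Bentley-Saxe logarithmic method. A "block"
# stands in for a static structure built over a set of records; here it
# is modeled as a sorted tuple so that queries can be answered by
# searching every block and combining the results.

def build(records):
    """Stand-in for an expensive static construction."""
    return tuple(sorted(records))

class LogarithmicMethod:
    def __init__(self):
        self.blocks = []          # blocks[i] is None or holds ~2^i records

    def insert(self, record):
        # Like incrementing a binary counter: collect the new record and
        # every filled slot up to the first empty one, then rebuild them
        # all into a single larger block.
        pending = [record]
        i = 0
        while i < len(self.blocks) and self.blocks[i] is not None:
            pending.extend(self.blocks[i])
            self.blocks[i] = None
            i += 1
        if i == len(self.blocks):
            self.blocks.append(None)
        self.blocks[i] = build(pending)

    def query(self, key):
        # Decomposable search problem: query every block, merge results.
        return any(b is not None and key in b for b in self.blocks)
```

In the merge-decomposable setting used by the dynamized Mantis, the
\texttt{build} call over raw records would be replaced by a direct merge
of the collected blocks, which is exactly the merge algorithm that paper
contributes.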
Another notable work on dynamization techniques applies the logarithmic
method to produce full-dynamic versions of various metric indexing
structures~\cite{naidan14}. In this paper, the logarithmic method is
directly applied to two static metric indexing structures, the VPTree
and the SSS-tree, to create full-dynamic versions (using weak deletes).
These dynamized structures are compared to two dynamic baselines, the
DSA-tree and EGNAT, for both multi-dimensional range scans and $k$-NN
searches. The paper contains extensive benchmarks demonstrating that the
dynamized versions perform quite well, even beating the dynamic
structures in query performance under certain circumstances. It is worth
noting that we also tested a dynamized VPTree for $k$-NN in
Chapter~\ref{chap:framework} and obtained results in line with theirs.

Finally, LSMGraph is a recently proposed system that applies
dynamization techniques\footnote{The authors make a point of saying that
they are \emph{not} applying dynamization, but are instead embedding
their structure inside of an LSM tree, noting the challenges associated
with applying dynamization directly to graphs. While the specific
techniques they use are not directly taken from any of the dynamization
techniques we discussed in Chapter~\ref{chap:background}, we nonetheless
consider this work an example of dynamization, at least in principle,
because it decomposes a static structure into smaller blocks and handles
inserts by rebuilding these blocks systematically.} to the compressed
sparse row (CSR) matrix representation of graphs to produce a dynamic,
external graph storage system~\cite{lsmgraph}. This is a particularly
interesting example, because graphs and graph algorithms are \emph{not}
remotely decomposable.
Adjacent vertices in the graph may be spread across many levels, which
means that graph algorithms cannot be decomposed: traversals must access
adjacent vertices regardless of which block contains them. To resolve
this problem, the authors discard the general query model and build a
tightly integrated system that uses an index to map each vertex to the
block containing it, along with a vertex-adjacency-aware reconstruction
process that helps ensure adjacent vertices are compacted into the same
blocks during reconstruction.

\section{LSM Tree}

The Log-structured Merge-tree (LSM tree)~\cite{oneil96} is a data
structure proposed by O'Neil \emph{et al.} in the mid-90s, designed to
optimize for write throughput in external indexing contexts. While
O'Neil never cites any of the dynamization work we have considered, the
proposed structure is strikingly similar to a decomposed data structure,
and subsequent developments have driven it in a direction that looks
much like Bentley and Saxe's logarithmic method. In fact, several of the
examples of dynamization in the previous section (as well as this work)
either borrow concepts from modern LSM trees or go so far as to use the
term ``LSM'' as a synonym for what we call dynamization. However, the
work on LSM trees is distinct from dynamization, at least in its general
form, because it leans heavily on very specific aspects of the search
problems (point lookups and single-dimensional range searches) and the
data structure in ways that do not generalize well. In this section, we
discuss a few of the relevant works on LSM trees and attempt to
differentiate them from dynamization.

\subsection{The Structure}

The modern LSM tree is a single-dimensional range data structure that is
commonly used in key-value stores such as RocksDB.
It consists of a small, dynamic in-memory structure called a memtable
and a sequence of static, external structures on disk of geometrically
increasing size. These structures are organized into levels, each of
which can contain either one structure or several, with the former
strategy called leveling and the latter tiering. The individual
structures are often simple sorted arrays (with some attached metadata)
called runs, which can be further decomposed into smaller files called
sorted string tables (SSTs). Records are inserted into the memtable
first. When the memtable fills, it is flushed and its records are merged
into the top level of the structure, with reconstructions proceeding
according to various merge policies to make room as necessary. LSM trees
typically support point lookups and range queries, answering them by
searching all of the runs in the structure and merging the results
together. To accelerate point lookups, Bloom filters~\cite{bloom70} are
often built over the records in each run, allowing runs that do not
contain the key being searched for to be skipped. Deletes are typically
handled using tombstones.

\subsection{Design Space}

The bulk of the work on LSM trees that is of interest to us focuses on
the structure's design space and performance tuning. A very large number
of papers discuss different ways of decomposing the structure,
performing reconstructions, allocating resources to filters and other
auxiliary structures, and so on, in order to optimize resource usage and
enable performance tuning. We summarize a few of these works here.

One major line of work involves optimizing the allocation of Bloom
filter memory to the sorted runs within the structure. Bloom filters are
commonly used in LSM trees to accelerate point lookups, because these
queries must examine each run, from top to bottom, until a matching key
is found.
Bloom filters improve performance by allowing some runs to be skipped
during this search. Because LSM trees are external data structures, the
savings can be quite large. There are a number of works in this
area~\cite{dayan18-1, zhu21}, but we will highlight
Monkey~\cite{dayan17} specifically. Monkey optimizes the allocation of
Bloom filter memory across the levels of an LSM tree, based on the
observation that the worst-case lookup cost (i.e., the cost of a point
lookup for a key that does not exist within the LSM tree) is directly
proportional to the sum of the false positive rates across all levels of
the tree. Memory can thus be allocated to the filters in a way that
minimizes this sum. These works could be useful in the context of
dynamization for problems that allow similar optimizations, such as a
dynamized structure for point lookups using Bloom filters, or possibly
range scans using range filters such as SuRF~\cite{zhang18} or
Rosetta~\cite{siqiang20}, but they are not directly applicable to the
general problem of dynamization.

Other work on LSM trees considers different merge policies.
Dostoevsky~\cite{dayan18} introduces two new merge policies, and the
so-called Wacky Continuum~\cite{dayan19} introduces a general design
space that includes Dostoevsky's policies, the traditional LSM policies,
and a new policy called the LSM bush. As it encompasses all of these, we
will summarize only the Wacky Continuum here. Wacky defines a merge
policy based on four parameters: the capping ratio ($C$), the growth
exponential ($X$), the merge greed ($K$), and the largest-level merge
greed ($Z$). The merge greed parameters define the merge threshold,
which effectively allows merge policies that sit between leveling and
tiering. Each level contains an ``active'' run into which new records
are merged.
Once this run contains a specified fraction of the level's total
capacity (determined by the merge greed parameters), a new run is added
to the level and made active. Leveling can be simulated in this model by
setting the merge greed parameter to 1, so that 100\% of the level's
capacity is allocated to a single run. Tiering is simulated by setting
this parameter such that each active run can only hold a single set of
merged records, so that a new run is created each time records are
merged into the level. The Wacky Continuum allows the merge greed of the
last level to be configured independently of the inner levels, enabling
lazy leveling, a policy in which the largest level of the LSM tree
contains a single run while the inner levels are tiered. The other
design parameters are simpler: the capping ratio allows the size ratio
of the last level to be varied independently of the inner levels, and
the growth exponential allows the size ratio between adjacent levels to
grow as the levels get larger. The work also introduces an optimization
system for determining good values for all of these parameters given a
workload.

It is worth taking a moment to address why we did not consider the Wacky
Continuum design space in our own attempts to introduce a design space
into dynamization. These concepts would appear to be useful to us, given
that we imported the basic leveling and tiering concepts from LSM trees.
However, we believe that this particular set of design parameters is not
broadly useful outside of the LSM context. The reason is shown in the
experimental section of the Wacky paper itself: for workloads involving
range reads, the standard leveling and tiering designs show perfectly
reasonable (and sometimes even superior) performance trade-offs.
In large part, the Wacky Continuum extends the authors' earlier work on
Monkey, as it is most effective at improving trade-offs for point-lookup
performance, which is strongly influenced by Bloom filters. The range
scan results are the ones most closely related to the general
dynamization case we have been considering in this work, where filters
cannot be assumed and filter memory allocation is not an important
consideration; in that case, the new merge policies available within the
Wacky Continuum did not provide enough advantage to be considered here.

Another aspect of the LSM tree design space is compaction granularity,
which was studied in Spooky~\cite{dayan22}. Because LSM trees are built
over sorted runs of data, it is possible to perform partial merges
between levels, moving only a small portion of the data from one level
to another. This can improve write performance by making reconstructions
smaller, and it can also reduce the transient storage requirements of
the structure, but it comes at the cost of additional write
amplification, as files are re-written on the same level more
frequently. The storage benefit comes from the fact that LSM trees
require extra working storage to perform reconstructions; making the
reconstructions smaller reduces this space requirement. Spooky is a
method for determining effective approaches to partial reconstruction.
It works by partitioning the largest level into equally sized files, and
then dynamically partitioning the other levels of the structure based on
the key ranges of the last level's files. This approach is shown to
provide a good balance between write amplification and reconstruction
storage requirements. Compaction granularity is another example of an
LSM-tree-specific design element that does not generalize well.
It could be used in the context of dynamization, but only for
single-dimensional sorted data, and so we have not considered it as part
of our general techniques.

\subsection{Tail Latency Control}

LSM trees are susceptible to insertion tail latency problems similar to
those of dynamization. While tail latency can be controlled using
partial reconstructions, as in Spooky~\cite{dayan22} (though this is not
a focus of that work), there is also work specifically on controlling
tail latency. One notable result in this area is
SILK~\cite{balmau19,silk-plus}, which focuses on reducing tail latency
through intelligent scheduling of reconstruction operations.

After performing an experimental evaluation of various LSM-tree-based
key-value systems, the authors determine three main principles for
designing their tail latency control system:
\begin{enumerate}
    \item I/O bandwidth should be opportunistically allocated to
          reconstructions.
    \item Reconstructions at smaller levels of the tree should be
          prioritized.
    \item Reconstructions at larger levels of the tree should be
          preemptable by those at smaller levels.
\end{enumerate}
The resulting system prioritizes buffer flushes, which are given
dedicated threads and priority access to I/O bandwidth. The next highest
priority operations are reconstructions involving levels $0$ and $1$,
which must complete to allow flushes to proceed. These reconstructions
can preempt any other running compaction if no thread is available when
one is scheduled. All other reconstructions run at lower priority and
may need to be discarded entirely if a higher-priority reconstruction
invalidates them. The system also includes sophisticated I/O bandwidth
controls, as bandwidth is a constrained resource in external contexts.
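The three principles above amount to a fixed priority ordering over
reconstruction jobs. The following sketch models that ordering only; it
is a simplified illustration under our own assumptions (jobs tagged with
their source level, preemption modeled purely at scheduling time), not
SILK's actual implementation.

```python
# Illustrative sketch of SILK-style scheduling priorities: flushes
# first, then reconstructions touching levels 0-1, then everything else.
# Names and job representation are hypothetical.
import heapq

FLUSH, SMALL_RECON, LARGE_RECON = 0, 1, 2   # lower value = higher priority

def classify(job):
    """Map a job to a priority tier based on the level it touches."""
    if job["kind"] == "flush":
        return FLUSH
    return SMALL_RECON if job["level"] <= 1 else LARGE_RECON

class Scheduler:
    def __init__(self):
        self.queue = []
        self.counter = 0                     # FIFO tie-break within a tier

    def submit(self, job):
        heapq.heappush(self.queue, (classify(job), self.counter, job))
        self.counter += 1

    def next_job(self):
        # A level-0/1 reconstruction sorts ahead of any queued
        # large-level reconstruction, modeling preemption at the
        # scheduling level; true mid-job preemption is omitted.
        return heapq.heappop(self.queue)[2] if self.queue else None
```

A large-level reconstruction submitted first is still dispatched after
any later-arriving flush or level-0/1 reconstruction, which is the
essence of the priority scheme described above.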
Some of the core concepts underlying SILK inspired the tail latency
control system we proposed in Chapter~\ref{chap:tail-latency}, but our
system is quite distinct from it. SILK leverages consequences of the LSM
tree design space in ways that our system cannot. For example, SILK uses
a two-version buffer (as we do), but it is able to allocate enough I/O
bandwidth to ensure that one buffer version can be flushed before the
other fills up. Given the constraints of our dynamization system, with
an unsorted buffer, this was not possible. Additionally, SILK uses
partial compactions to reduce the size of reconstructions. These factors
let SILK maintain the LSM tree structure without resorting to insertion
throttling, as we do in our system.

\section{GiST and GIN}