| author | Douglas B. Rumbaugh <doug@douglasrumbaugh.com> | 2025-07-06 18:21:32 -0400 |
|---|---|---|
| committer | Douglas B. Rumbaugh <doug@douglasrumbaugh.com> | 2025-07-06 18:21:32 -0400 |
| commit | 0dc1a8ea20820168149cedaa14e223d4d31dc4b6 (patch) | |
| tree | 2bc726803cf6de6d669958b1f5a79cde59722e00 /chapters | |
| parent | 0fff4753fac809a6ba17f428df3a041cebe692e0 (diff) | |
| download | dissertation-0dc1a8ea20820168149cedaa14e223d4d31dc4b6.tar.gz | |
updates
Diffstat (limited to 'chapters')
| -rw-r--r-- | chapters/abstract.tex | 52 |
| -rw-r--r-- | chapters/background.tex | 2 |
| -rw-r--r-- | chapters/beyond-dsp.tex | 24 |
| -rw-r--r-- | chapters/design-space.tex | 127 |
| -rw-r--r-- | chapters/dynamic-extension-sampling.tex | 2 |
| -rw-r--r-- | chapters/dynamization.tex | 15 |
| -rw-r--r-- | chapters/related-works.tex | 4 |
| -rw-r--r-- | chapters/sigmod23/background.tex | 8 |
| -rw-r--r-- | chapters/sigmod23/framework.tex | 18 |
| -rw-r--r-- | chapters/tail-latency.tex | 19 |
10 files changed, 142 insertions, 129 deletions
diff --git a/chapters/abstract.tex b/chapters/abstract.tex index 5ddfd37..602edd4 100644 --- a/chapters/abstract.tex +++ b/chapters/abstract.tex @@ -10,33 +10,29 @@ result, a large number of potentially useful data structures are excluded from use in such systems, or at the very least require a large amount of development time to be made useful. -This work seeks to address this difficulty by introducing a framework for -automatic data structure dynamization. Given a static data structure and -an associated query, satisfying certain requirements, this proposed work -will enable automatically adding support for concurrent updates, with -minimal modification to the data structure itself. It is based on a -body of theoretical work on dynamization, often called the "Bentley-Saxe -Method", which partitions data into a number of small data structures, -and periodically rebuilds these as records are inserted or deleted, in -a manner that maintains asymptotic bounds on worst case query time, -as well as amortized insertion time. These techniques, as they currently -exist, are limited in usefulness as they exhibit poor performance in -practice, and lack support for concurrency. But, they serve as a solid -theoretical base upon which a novel system can be built to address -these concerns. +This work seeks to address this difficulty by introducing a framework +for automatic data structure dynamization. Given a static data structure +and an associated query, satisfying certain requirements, this proposed +work will enable automatically adding support for concurrent updates, +with minimal modification to the data structure itself. It is based on a +body of theoretical work on dynamization, often called the ``Bentley-Saxe +Method'', which partitions data into a number of small data structures, +and periodically rebuilds these as records are inserted or deleted, in a +manner that maintains asymptotic bounds on worst case query time, as well +as amortized insertion time. These techniques, as they currently exist, +are limited in usefulness as they are restricted in the situations they +can be applied, lack support for configuration and concurrency, and have +poor insertion tail latency performance. Despite these shortcomings, +these techniques can serve as a solid theoretical base upon which a +novel system can be built to address these concerns. -To develop this framework, sampling queries (which are not well served -by existing dynamic data structures) are first considered. The results -of this analysis are then generalized to produce a framework for -single-threaded dynamization that is applicable to a large number -of possible data structures and query types, and the general framework -evaluated across a number of data structures and query types. These -dynamized static structures are shown to equal or exceed the performance -of existing specialized dynamic structures in both update and query -performance. - -Finally, this general framework is expanded with support for concurrent -operations (inserts and queries), and the use of scheduling and -parallelism is studied to provide worst-case insertion guarantees, -as well as a rich trade-off space between query and insertion performance. +To develop this framework, we first consider dynamizing data structures +for sampling queries (which are not well served by existing dynamic data +structures). 
We then generalize these results to produce a framework +that is able to provide concurrent insertion and deletion support for +a wide range of data structures and query types. Next, we examine the +design space of our framework and show that it supports useful trade-offs +between insertion and query performance. Finally, we examine the use +of concurrency and parallelism to provide better worst-case insertion +performance for our system. diff --git a/chapters/background.tex b/chapters/background.tex index ef30685..26513c4 100644 --- a/chapters/background.tex +++ b/chapters/background.tex @@ -45,7 +45,7 @@ index.~\cite{cowbook} For our purposes here, we'll focus on the first of these, but the use of other codmains wouldn't have any material effect on our discussion. -We will use the following definition of a "classical" database index, +We will use the following definition of a ``classical'' database index, \begin{definition}[Classical Index~\cite{cowbook}] Consider a set of database records, $\mathcal{D}$. An index over diff --git a/chapters/beyond-dsp.tex b/chapters/beyond-dsp.tex index 9e6adf0..7632261 100644 --- a/chapters/beyond-dsp.tex +++ b/chapters/beyond-dsp.tex @@ -3,7 +3,7 @@ \begin{center} \emph{The following chapter is an adaptation of work completed in collaboration with Dr. Dong Xie and Dr. Zhuoyue Zhao and published - in PVLDB Volume 17, Issue 11 (July 2024) under the title "Towards Systematic Index Dynamization". + in PVLDB Volume 17, Issue 11 (July 2024) under the title ``Towards Systematic Index Dynamization''. } \hrule \end{center} @@ -248,8 +248,8 @@ using the following interface, a final query result, $R$. \end{itemize} -Let $P(n)$ be the cost of $\mathbftt{local\_preproc}$, $D(n)$ be -the cost of $\mathbftt{distribute\_query}$, $\mathscr{Q}_\ell(n)$ +Let $P(n)$ be the cost of $\mathbftt{local\_preproc}$, $\mathscr{D}(n)$ be +the cost of $\mathbftt{distribute\_query}$, $\mathscr{Q}_S(n)$ be the cost of $\mathbftt{local\_query}$, and $C_e(n)$ be the cost $\mathbftt{combine}$. To solve a search problem with this interface requires calling $\mathbftt{local\_preproc}$ and $\mathbftt{local\_query}$ @@ -258,12 +258,12 @@ $\mathbftt{combine}$ once. For a Bentley-Saxe dynamization then, with $O(\log_2 n)$ blocks, the worst-case cost of answering an eDSP is, \begin{equation} \label{eqn:edsp-cost} -O \left( \log_2 n \cdot P(n) + D(n) + \log_2 n \cdot \mathscr{Q}_\ell(n) + C_e(n) \right) +O \left( \log_2 n \cdot P(n) + \mathscr{D}(n) + \log_2 n \cdot \mathscr{Q}_S(n) + C_e(n) \right) \end{equation} As an example, we'll express IRS using the above interface and analyze its complexity to show that the resulting solution is the -same $\Theta(log^2 n + k)$ cost as the specialized solution from +same $\Theta(\log^2 n + k)$ cost as the specialized solution from Chapter~\ref{chap:sampling}. We use $\mathbftt{local\_preproc}$ to determine the number of records on each block falling on the interval $[l, u]$ and return this, as well as $i_l$ and $i_u$ as the @@ -332,8 +332,8 @@ each of these operations in pseudo-code. } \end{algorithm} -These operations result in $P(n) \in \Theta(\log n)$, $D(n) \in -\Theta(\log n)$, $\mathscr{Q}(n,k) \in \Theta(k)$, and $C_e(n) \in +These operations result in $P(n) \in \Theta(\log n)$, $\mathscr{D}(n) \in +\Theta(\log n)$, $\mathscr{Q}_S(n,k) \in \Theta(k)$, and $C_e(n) \in \Theta(1)$. At first glance, it would appear that we arrived at a solution with a query cost of $O\left(\log_2^2 n + k\log_2 n\right)$, and thus fallen short of our goal. 
However, Equation~\ref{eqn:edsp-cost} @@ -455,7 +455,7 @@ following provides an upper bound on the worst-case query complexity of an IDSP, \begin{equation*} - O\left(\log_2 n \cdot P(n) + D(n) + R(n) \left(\log_2 n \cdot Q_s(n) + + O\left(\log_2 n \cdot P(n) + \mathscr{D}(n) + R(n) \left(\log_2 n \cdot Q_S(n) + C_e(n)\right)\right) \end{equation*} @@ -1147,7 +1147,7 @@ a constant number of times per level (at most once for Bentley-Saxe, exactly once for tiering, and at most $s$ times for leveling), and there are at most $\log_s n$ levels. Thus, the amortized insertion cost is, \begin{equation*} - I_a(n) \in \Theta\left(\frac{B(n)}{n} \cdot \log_s n\right) + I_A(n) \in \Theta\left(\frac{B(n)}{n} \cdot \log_s n\right) \end{equation*} for standard search problems. Slightly more efficient solutions are possible for merge decomposable search problems based on the cost of @@ -1319,7 +1319,7 @@ in Section~\ref{ssec:dyn-idsp}, with slight modifications to account for the different cost function of buffer querying and preprocessing. The cost is, \begin{equation*} -\mathscr{Q}(n) \in O \left(P_B(N_B) + \log_s n \cdot P(n) + D(n) + R(n)\left( +\mathscr{Q}(n) \in O \left(P_B(N_B) + \log_s n \cdot P(n) + \mathscr{D}(n) + R(n)\left( Q_B(n) + \log_s n \cdot Q_s(n) + C_e(n)\right)\right) \end{equation*} where $P_B(n)$ is the cost of pre-processing the buffer, and $Q_B(n)$ is @@ -1747,7 +1747,7 @@ standard DDSP, we compare with the Bentley-Saxe Method (\textbf{BSM})\footnote{ point-lookup directly on the VPTree. } and a dynamic data structure for the same search problem called an M-tree~\cite{mtree,mtree-impl} (\textbf{MTree}), which is an example of a so-called -"ball tree" structure that partitions high dimensional space using nodes +``ball tree'' structure that partitions high dimensional space using nodes representing spheres, which are merged and split to maintain balance in a manner not unlike a B+tree. We also consider a static instance of a VPTree built over the same set of records (\textbf{VPTree}). We used @@ -1875,7 +1875,7 @@ same massive degradation in query performance that PGM's native update support does in its own update-optimized configuration.\footnote{ It's also worth noting that PGM implements tombstone deletes by inserting a record with a matching key to the record to be deleted, - and a particular "tombstone" value, rather than using a header. This + and a particular ``tombstone'' value, rather than using a header. This means that it can not support duplicate keys when deletes are used, unlike our approach. It also means that the records are smaller, which should improve query performance, but we're able to beat it even diff --git a/chapters/design-space.tex b/chapters/design-space.tex index f85883c..22773e5 100644 --- a/chapters/design-space.tex +++ b/chapters/design-space.tex @@ -127,72 +127,89 @@ be $N_B (s - 1) \cdot s^i$, where $N_B$ is the size of the buffer. The resulting structure will have at most $\log_s n$ shards. The resulting policy is described in Algorithm~\ref{alg:design-bsm}. -Unfortunately, the approach used by Bentley and Saxe to calculate the -amortized insertion cost of the BSM does not generalize to larger bases, -and so we will need to derive this result using a different approach. +Analyzing the amortized insertion performance of BSM is slightly +complicated by the fact that each record is \emph{not} written on +every level. 
For the purposes of our analysis, establishing a reasonable +upper bound on the amortized insertion cost is sufficient, however, so +we will settle for a looser bound to keep things simple. \begin{theorem} The amortized insertion cost for generalized BSM with a growth factor of -$s$ is $\Theta\left(\frac{B(n)}{n} \cdot s\log_s n)\right)$. +$s$ is $O\left(\frac{B(n)}{n} \cdot s\log_s n)\right)$. \end{theorem} \begin{proof} +In generalized BSM, each record will be written at most $s$ times +per level. We will use this result to provide an upper-bound on the +amortized insertion performance. The worst case cost associated with a +reconstruction in BSM is a full compaction of the structure, which will +require $B(n)$ time to complete. Further, there are $O(\log_s n)$ levels +in the decomposition. As a result, the amortized insertion cost can be +bounded above by, +\begin{equation} +I_A(n) \in O\left(\frac{B(n)}{n} \cdot s \log_s n\right) +\end{equation} +\end{proof} -In order to calculate the amortized insertion cost, we will first -determine the average number of times that a record is involved in a -reconstruction, and then amortize those reconstructions over the records -in the structure. -If we consider only the first level of the structure, it's clear that -the reconstruction count associated with each record in that structure -will follow the pattern, $1, 2, 3, 4, ..., s-1$ when the level is full. -Thus, the total number of reconstructions associated with records on level -$i=0$ is the sum of that sequence, or -\begin{equation*} -W(0) = \sum_{j=1}^{s-1} j = \frac{1}{2}\left(s^2 - s\right) -\end{equation*} -Considering the next level, $i=1$, each reconstruction involving this -level will copy down the entirety of the structure above it, adding -one more write per record, as well as one extra write for the new record. -More specifically, in the above example, the first "batch" of records in -level $i=1$ will have the following write counts: $1, 2, 3, 4, 5, ..., s$, -the second "batch" of records will increment all of the existing write -counts by one, and then introduce another copy of $1, 2, 3, 4, 5, ..., s$ -writes, and so on. +% \begin{proof} -Thus, each new "batch" written to level $i$ will introduce $W(i-1) + 1$ -writes from the previous level into level $i$, as well as rewriting all -of the records currently on level $i$. +% In order to calculate the amortized insertion cost, we will first +% determine the average number of times that a record is involved in a +% reconstruction, and then amortize those reconstructions over the records +% in the structure. -The net result of this is that the number of writes on level $i$ is given -by the following recurrence relation (combined with the $W(0)$ base case), +% If we consider only the first level of the structure, it's clear that +% the reconstruction count associated with each record in that structure +% will follow the pattern, $1, 2, 3, 4, ..., s-1$ when the level is full. +% Thus, the total number of reconstructions associated with records on level +% $i=0$ is the sum of that sequence, or +% \begin{equation*} +% W(0) = \sum_{j=1}^{s-1} j = \frac{1}{2}\left(s^2 - s\right) +% \end{equation*} -\begin{equation*} -W(i) = sW(i-1) + \frac{1}{2}\left(s-1\right)^2 \cdot s^i -\end{equation*} +% Considering the next level, $i=1$, each reconstruction involving this +% level will copy down the entirety of the structure above it, adding +% one more write per record, as well as one extra write for the new record. 
+% More specifically, in the above example, the first ``batch'' of records in +% level $i=1$ will have the following write counts: $1, 2, 3, 4, 5, ..., s$, +% the second ``batch'' of records will increment all of the existing write +% counts by one, and then introduce another copy of $1, 2, 3, 4, 5, ..., s$ +% writes, and so on. -which can be solved to give the following closed-form expression, -\begin{equation*} -W(i) = s^i \cdot \left(\frac{1}{2} (s-1) \cdot (s(i+1) - i)\right) -\end{equation*} -which provides the total number of reconstructions that records in -level $i$ of the structure have participated in. As each record -is involved in a different number of reconstructions, we'll consider the -average number by dividing $W(i)$ by the number of records in level $i$. - -From here, the proof proceeds in the standard way for this sort of -analysis. The worst-case cost of a reconstruction is $B(n)$, and there -are $\log_s(n)$ total levels, so the total reconstruction costs associated -with a record can be upper-bounded by, $B(n) \cdot -\frac{W(\log_s(n))}{n}$, and then this cost amortized over the $n$ -insertions necessary to get the record into the last level. We'll also -condense the multiplicative constants and drop the additive ones to more -clearly represent the relationship we're looking to show. This results -in an amortized insertion cost of, -\begin{equation*} -\frac{B(n)}{n} \cdot s \log_s n -\end{equation*} -\end{proof} +% Thus, each new ``batch'' written to level $i$ will introduce $W(i-1) + 1$ +% writes from the previous level into level $i$, as well as rewriting all +% of the records currently on level $i$. + +% The net result of this is that the number of writes on level $i$ is given +% by the following recurrence relation (combined with the $W(0)$ base case), + +% \begin{equation*} +% W(i) = sW(i-1) + \frac{1}{2}\left(s-1\right)^2 \cdot s^i +% \end{equation*} + +% which can be solved to give the following closed-form expression, +% \begin{equation*} +% W(i) = s^i \cdot \left(\frac{1}{2} (s-1) \cdot (s(i+1) - i)\right) +% \end{equation*} +% which provides the total number of reconstructions that records in +% level $i$ of the structure have participated in. As each record +% is involved in a different number of reconstructions, we'll consider the +% average number by dividing $W(i)$ by the number of records in level $i$. + +% From here, the proof proceeds in the standard way for this sort of +% analysis. The worst-case cost of a reconstruction is $B(n)$, and there +% are $\log_s(n)$ total levels, so the total reconstruction costs associated +% with a record can be upper-bounded by, $B(n) \cdot +% \frac{W(\log_s(n))}{n}$, and then this cost amortized over the $n$ +% insertions necessary to get the record into the last level. We'll also +% condense the multiplicative constants and drop the additive ones to more +% clearly represent the relationship we're looking to show. This results +% in an amortized insertion cost of, +% \begin{equation*} +% \frac{B(n)}{n} \cdot s \log_s n +% \end{equation*} +% \end{proof} \begin{theorem} The worst-case insertion cost for generalized BSM with a scale factor @@ -586,7 +603,7 @@ reconstructions, one per level. \hline & \textbf{Gen. 
BSM} & \textbf{Leveling} & \textbf{Tiering} \\ \hline $I(n)$ & $\Theta(B(n))$ & $\Theta\left(B\left(\frac{s-1}{s} \cdot n\right)\right)$ & $ \Theta\left(\sum_{i=0}^{\log_s n} B(s^i)\right)$ \\ \hline -$I_A(n)$ & $\Theta\left(\frac{B(n)}{n} s\log_s n)\right)$ & $\Theta\left(\frac{B(n)}{n} s\log_s n\right)$& $\Theta\left(\frac{B(n)}{n} \log_s n\right)$ \\ \hline +$I_A(n)$ & $O\left(\frac{B(n)}{n} s\log_s n)\right)$ & $\Theta\left(\frac{B(n)}{n} s\log_s n\right)$& $\Theta\left(\frac{B(n)}{n} \log_s n\right)$ \\ \hline $\mathscr{Q}(n)$ &$O\left(\log_s n \cdot \mathscr{Q}_S(n)\right)$ & $O\left(\log_s n \cdot \mathscr{Q}_S(n)\right)$ & $O\left(s \log_s n \cdot \mathscr{Q}_S(n)\right)$\\ \hline $\mathscr{Q}_B(n)$ & $\Theta(\mathscr{Q}_S(n))$ & $O(\log_s n \cdot \mathscr{Q}_S(n))$ & $O(\log_s n \cdot \mathscr{Q}_S(n))$ \\ \hline \end{tabular} diff --git a/chapters/dynamic-extension-sampling.tex b/chapters/dynamic-extension-sampling.tex index 2f0a1c3..a8f284e 100644 --- a/chapters/dynamic-extension-sampling.tex +++ b/chapters/dynamic-extension-sampling.tex @@ -3,7 +3,7 @@ \begin{center} \emph{The following chapter is an adaptation of work completed in collaboration with Dr. Dong Xie and published - in PACT Volume 1, Issue 4 (December 2023) under the title "Practical Dynamic Extension of Sampling Indexes". + in PACT Volume 1, Issue 4 (December 2023) under the title ``Practical Dynamic Extension of Sampling Indexes''. } \hrule \end{center} diff --git a/chapters/dynamization.tex b/chapters/dynamization.tex index 1012597..5e4cdec 100644 --- a/chapters/dynamization.tex +++ b/chapters/dynamization.tex @@ -33,7 +33,7 @@ the word \emph{query} \footnote{ The term query is often abused and used to refer to several related, but slightly different things. In the vernacular, a query can refer to either a) a general type of search - problem (as in "range query"), b) a specific instance of a search + problem (as in ``range query''), b) a specific instance of a search problem, or c) a program written in a query language. } is often used within the database systems literature: to refer to a @@ -302,6 +302,19 @@ blocks, we represent it with the notation $\mathscr{I} = \{\mathscr{I}_1, \ldots, \mathscr{I}_m\}$, where $\mathscr{I}_i$ is the $i$th block. \end{example} +In this example, the decomposition resulted in a reduction of the +worst-case insert cost. However, many decomposition schemes that we will +examine do not affect the worst-case cost, despite having notably better +performance in practice. As a result, it is more common to consider the +\emph{amortized} insertion cost, $I_A(n)$ when examining +dynamization. This cost function has the form, +\begin{equation*} + I_A(n) = \frac{B(n)}{n} \cdot \text{A} +\end{equation*} +where $\text{A}$ is the number of times that a record within the +structure participates in a reconstruction, often called the write +amplification. + Much of the existing work on dynamization has considered different decomposition methods for static data structures, and the effects that these methods have on insertion and query performance. However, before diff --git a/chapters/related-works.tex b/chapters/related-works.tex index c9d9357..1ece6fc 100644 --- a/chapters/related-works.tex +++ b/chapters/related-works.tex @@ -22,7 +22,7 @@ on works in which dynamization appears as a major focus of the paper, not simply as an incidental tool. One of the older applications of the Bentley-Saxe method is in the -creation of a data structure called the Bkd-tree~\cite{bkd-tree}. 
+creation of a data structure called the Bkd-tree~\cite{bkdtree}. This structure is a search tree, based on the kd-tree~\cite{kd-tree}, for multi-dimensional searching, that has been designed for use in external storage. While it was not the first external kd-tree, existing @@ -43,7 +43,7 @@ to outperform native dynamic implementations on external storage. A more recent paper discussing the application of the logarithmic method to a specific example is its application to the Mantis structure for -large-scale DNA sequence search~\cite{mantis-dyn}. Mantis~\cite{mantis} +large-scale DNA sequence search~\cite{almodaresi23}. Mantis~\cite{mantis} is one of the fastest and most space efficient structures for sequence search, but is static. To create a half-dynamic version of Mantis, the authors first design an algorithm to efficiently merge multiple Mantis diff --git a/chapters/sigmod23/background.tex b/chapters/sigmod23/background.tex index 984e36c..8d3a88f 100644 --- a/chapters/sigmod23/background.tex +++ b/chapters/sigmod23/background.tex @@ -21,8 +21,8 @@ set; the specific usage should be clear from context. In each of the problems considered, sampling can be performed either with-replacement or without-replacement. Sampling with-replacement means that a record that has been included in the sample set for a given -sampling query is "replaced" into the dataset and allowed to be sampled -again. Sampling without-replacement does not "replace" the record, +sampling query is ``replaced'' into the dataset and allowed to be sampled +again. Sampling without-replacement does not ``replace'' the record, and so each individual record can only be included within the a sample set once for a given query. The data structures that will be discussed support sampling with-replacement, and sampling without-replacement can @@ -38,7 +38,7 @@ in the sample set to match the distribution of source data set. This requires that the sampling of a record does not affect the probability of any other record being sampled in the future. Such sample sets are said to be drawn i.i.d (independently and identically distributed). Throughout -this chapter, the term "independent" will be used to describe both +this chapter, the term ``independent'' will be used to describe both statistical independence, and identical distribution. Independence of sample sets is important because many useful statistical @@ -192,7 +192,7 @@ requiring greater than $k$ traversals to obtain a sample set of size $k$. \Paragraph{Static Solutions.} There are also a large number of static data structures, which we'll call static sampling indices (SSIs) in this chapter,\footnote{ - We used the term "SSI" in the original paper on which this chapter + We used the term ``SSI'' in the original paper on which this chapter is based, which was published prior to our realization that a strong distinction between an index and a data structure would be useful. I am retaining the term SSI in this chapter for consistency with the diff --git a/chapters/sigmod23/framework.tex b/chapters/sigmod23/framework.tex index 218c290..b413802 100644 --- a/chapters/sigmod23/framework.tex +++ b/chapters/sigmod23/framework.tex @@ -252,7 +252,7 @@ of the major limitations of the ghost structure approach for handling deletes is that there is not a principled method for removing deleted records from the decomposed structure. The standard approach is to set an arbitrary number of delete records, and rebuild the entire structure when -this threshold is crossed~\cite{saxe79}. 
Mixing the "ghost" records into +this threshold is crossed~\cite{saxe79}. Mixing the ``ghost'' records into the same structures as the original records allows for deleted records to naturally be cleaned up over time as they meet their tombstones during reconstructions using a technique called tombstone cancellation. This @@ -280,7 +280,7 @@ mechanism. The cost of using a tombstone delete in a Bentley-Saxe dynamization is the same as a simple insert, \begin{equation*} -\mathscr{D}(n)_A \in \Theta\left(\frac{B(n)}{n} \log_2 (n)\right) +D_A(n) \in \Theta\left(\frac{B(n)}{n} \log_2 (n)\right) \end{equation*} with the worst-case cost being $\Theta(B(n))$. Note that there is also a minor performance effect resulting from deleted records appearing @@ -309,7 +309,7 @@ on a Bentley-Saxe decomposition of that SSI will require, at worst, executing a point-lookup on each block, with a total cost of \begin{equation*} -\mathscr{D}(n) \in \Theta\left( L(n) \log_2 (n)\right) +D(n) \in \Theta\left( L(n) \log_2 (n)\right) \end{equation*} If the SSI being considered does \emph{not} support an efficient @@ -391,7 +391,7 @@ a natural way of controlling the number of deleted records within the structure, and thereby bounding the rejection rate. During reconstruction, we have the opportunity to remove deleted records. This will cause the record counts associated with each block of the structure to gradually -drift out of alignment with the "perfect" powers of two associated with +drift out of alignment with the ``perfect'' powers of two associated with the Bentley-Saxe method, however. In the theoretical literature on this topic, the solution to this problem is to periodically re-partition all of the records to re-align the block sizes~\cite{merge-dsp, saxe79}. This @@ -450,7 +450,7 @@ deleted records involved in the reconstruction will be dropped. Tombstones may require multiple cascading rounds of compaction to occur, because a tombstone record will only cancel when it encounters the record that it deletes. However, because tombstones always follow the record they -delete in insertion order, and will therefore always be "above" that +delete in insertion order, and will therefore always be ``above'' that record in the structure, each reconstruction will move every tombstone involved closer to the record it deletes, ensuring that eventually the bound will be satisfied. @@ -526,7 +526,7 @@ and LevelDB~\cite{leveldb}. This work has produced an incredibly large and well explored parametrization of the reconstruction procedures of LSM trees, a good summary of which can be bounded in this recent tutorial paper~\cite{sarkar23}. Examples of this design -space exploration include: different ways to organize each "level" +space exploration include: different ways to organize each ``level'' of the tree~\cite{dayan19, dostoevsky, autumn}, different growth rates, buffering, sub-partitioning of structures to allow finer-grained reconstruction~\cite{dayan22}, and approaches for allocating resources to @@ -739,19 +739,19 @@ Assuming that $N_B \ll n$, the first two terms of this expression are constant. 
Dropping them and amortizing the result over $n$ records give us the amortized insertion cost, \begin{equation*} -I_a(n) \in \Theta\left(\frac{B_M(n)}{n}\log_s(n)\right) +I_A(n) \in \Theta\left(\frac{B_M(n)}{n}\log_s(n)\right) \end{equation*} If the SSI being considered does not support a more efficient construction procedure from other instances of the same SSI, and the general Bentley-Saxe \texttt{unbuild} and \texttt{build} -operations must be used, the the cost becomes $I_a(n) \in +operations must be used, the the cost becomes $I_A(n) \in \Theta\left(\frac{B(n)}{n}\log_s(n)\right)$ instead. \Paragraph{Delete.} The framework supports both tombstone and tagged deletes, each with different performance. Using tombstones, the cost of a delete is identical to that of an insert. When using tagging, the cost of a delete is the same as the cost of a point lookup, because the -"delete" itself only sets a bit in the header of the record, +``delete'' itself only sets a bit in the header of the record, once it has been located. There will be $\Theta(\log_s n)$ total shards in the structure, each with a look-up cost of $L(n)$ using either the SSI's native point-lookup, or an auxiliary hash table, and the lookup diff --git a/chapters/tail-latency.tex b/chapters/tail-latency.tex index ee578a1..a0db592 100644 --- a/chapters/tail-latency.tex +++ b/chapters/tail-latency.tex @@ -136,7 +136,7 @@ that does not perform any re-partitioning. In this case, the technique provides the following worst-case insertion and query bounds, \begin{align*} I(n) &\in \Theta\left(\frac{n}{f(n)}\right) \\ -\mathscr{Q}(n) &\in \Theta\left(f(n) \cdot \mathscr{Q}\left(\frac{n}{f(n)}\right)\right) +\mathscr{Q}(n) &\in \Theta\left(f(n) \cdot \mathscr{Q}_S\left(\frac{n}{f(n)}\right)\right) \end{align*} where $f(n)$ is the number of blocks. @@ -336,19 +336,6 @@ with largest window to preemptively schedule reconstructions. Most of our discussion in this chapter could also be applied to leveling, albeit with worse results. However, BSM \emph{cannot} be used at all. -\Paragraph{Nomenclature.} For the discussion that follows, it will -be convenient to define a few terms for discussing levels relative to -each other. While these are all fairly straightforward, to alleviate any -potential confusion, we'll define them all explicitly here. We define the -term \emph{last level}, $i = \ell$, to mean the level in the dynamized -structure with the largest index value (and thereby the most records) -and \emph{first level} to mean the level with index $i=0$. Any level -with $0 < i < \ell$ is called an \emph{internal level}. A reconstruction -on level $i$ involves the combination of all blocks on that level into -one, larger, block, that is then appended level $i+1$. Relative to some -level at index $i$, the \emph{next level} is the level at index $i + -1$, and the \emph{previous level} is at index $i-1$. - \subsection{Concurrent Reconstructions} Our proposed approach is as follows. We will fully detach reconstructions @@ -766,7 +753,7 @@ creating a new shard in \texttt{L2} using the two shards in \texttt{V1}'s The internal structure of the dynamized data structure (ignoring the buffer) can be thought of as a list of immutable levels, $\mathcal{V} = \{\mathscr{L}_0, \ldots \mathscr{L}_h\}$, where each level -contains immutable shards, $\mathcal{L}_i = \{\mathscr{I}_0, \ldots +contains immutable shards, $\mathscr{L}_i = \{\mathscr{I}_0, \ldots \mathscr{I}_m\}$. 
Buffer flushes and reconstructions can be thought of as functions, which accept a version as input and produce a new version as output. Namely, @@ -1135,7 +1122,7 @@ ensure sufficient resources to fully parallelize all reconstructions (we'll consider resource constrained situations later). We tested -ISAM tree with the 200 million record SOSD \texttt{OSM} -dataset~\cite{sosd}, as well as VPTree with the one million, +dataset~\cite{sosd-datasets}, as well as VPTree with the one million, 300-dimensional, \texttt{SBW} dataset~\cite{sbw}. For each test, we inserted $30\%$ of the records to warm up the structure, and then measured the individual latency of each insert after that. We measured |
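
The chapters touched by this diff describe the Bentley-Saxe method as partitioning the data into logarithmically many static blocks (block $i$ holding roughly $2^i$ records), handling an insert by rebuilding a prefix of full blocks into the first empty slot, and answering a query by querying every block and merging the partial results. Below is a minimal, illustrative sketch of that classical scheme for a simple decomposable search problem. It is not the dissertation's framework (which uses the richer `local_preproc` / `distribute_query` / `local_query` / `combine` interface discussed in the beyond-dsp chapter); the names `BentleySaxe`, `build`, `query`, and `combine` are assumptions made here for illustration only.

```python
import bisect


class BentleySaxe:
    """Sketch of the classical Bentley-Saxe (logarithmic) decomposition.

    build(records)      -> static structure over a list of records
    query(block, *args) -> partial result for one block
    combine(partials)   -> final result from the per-block partial results
    """

    def __init__(self, build, query, combine):
        self.build = build
        self.query = query
        self.combine = combine
        self.blocks = []    # blocks[i] is None or a static structure of ~2^i records
        self.records = []   # records[i] keeps the raw records of blocks[i] for rebuilds

    def insert(self, record):
        # Binary-counter style insert: sweep up the leading run of full blocks,
        # collect their records plus the new one, and rebuild them as a single
        # block in the first empty slot.
        carried = [record]
        i = 0
        while i < len(self.blocks) and self.blocks[i] is not None:
            carried.extend(self.records[i])
            self.blocks[i] = None
            self.records[i] = []
            i += 1
        if i == len(self.blocks):
            self.blocks.append(None)
            self.records.append([])
        self.records[i] = carried
        self.blocks[i] = self.build(carried)

    def answer(self, *args):
        # Query every non-empty block and merge the partial results.
        partials = [self.query(b, *args) for b in self.blocks if b is not None]
        return self.combine(partials)


if __name__ == "__main__":
    # Example: a decomposable range-count query over sorted blocks.
    ds = BentleySaxe(
        build=sorted,
        query=lambda blk, lo, hi: bisect.bisect_right(blk, hi) - bisect.bisect_left(blk, lo),
        combine=sum,
    )
    for x in [5, 3, 8, 1, 9, 2]:
        ds.insert(x)
    assert ds.answer(2, 8) == 4   # records 2, 3, 5, 8 fall in [2, 8]
```

In this sketch each record participates in $O(\log_2 n)$ rebuilds over its lifetime, which corresponds to the amortized insertion cost $I_A(n) = \frac{B(n)}{n} \cdot A$ with write amplification $A \in \Theta(\log_2 n)$ quoted in the dynamization chapter above.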