| author | Douglas Rumbaugh <dbr4@psu.edu> | 2025-05-25 19:09:29 -0400 |
|---|---|---|
| committer | Douglas Rumbaugh <dbr4@psu.edu> | 2025-05-25 19:09:29 -0400 |
| commit | 9f730256a5526564f7db0d6f802bd0b82178731e (patch) | |
| tree | ede529f36889661c1ac10c8d5a954c6d553bc4d5 | |
| parent | 842ad973095ba36b7022f86a8af4517742cdb40b (diff) | |
| download | dissertation-9f730256a5526564f7db0d6f802bd0b82178731e.tar.gz | |
Updates
| -rw-r--r-- | chapters/app-reverse-cdf.tex | 2 |
| -rw-r--r-- | chapters/design-space.tex | 564 |
| -rw-r--r-- | chapters/dynamization.tex | 2 |
| -rw-r--r-- | chapters/tail-latency.tex | 2 |
| -rw-r--r-- | img/design-space/isam-insert-dist.pdf | bin 0 -> 36231 bytes |
| -rw-r--r-- | img/design-space/vptree-insert-dist.pdf | bin 0 -> 31076 bytes |
| -rw-r--r-- | img/isam_insert.pdf | bin 0 -> 36231 bytes |
| -rw-r--r-- | img/vptree_insert.pdf | bin 0 -> 31076 bytes |
| -rw-r--r-- | paper.tex | 2 |
9 files changed, 547 insertions, 25 deletions
diff --git a/chapters/app-reverse-cdf.tex b/chapters/app-reverse-cdf.tex
new file mode 100644
index 0000000..85542e1
--- /dev/null
+++ b/chapters/app-reverse-cdf.tex
@@ -0,0 +1,2 @@
+\chapter{Interpreting ``reverse'' CDFs}
+\label{append:rcdf}
diff --git a/chapters/design-space.tex b/chapters/design-space.tex
index 10278bd..f639999 100644
--- a/chapters/design-space.tex
+++ b/chapters/design-space.tex
@@ -56,6 +56,7 @@ in Chapter~\ref{chap:framework} show that, for other types of problem,
the technique does not fare quite so well.
\section{Asymptotic Analysis}
+\label{sec:design-asymp}
Before beginning with derivations for the cost functions of dynamized structures within the context of our
@@ -65,7 +66,14 @@ involves adjusting constants, we will leave the design-space related
constants within our asymptotic expressions. Additionally, we will perform the analysis for a simple decomposable search problem. Deletes will be entirely neglected, and we won't make any assumptions about
-mergability. These assumptions are to simplify the analysis.
+mergability. We will also neglect the buffer size, $N_B$, during this
+analysis. Buffering isn't fundamental to the techniques we are examining
+in this chapter, and including it would increase the complexity of the
+analysis without contributing any useful insights.\footnote{
+ The contribution of the buffer size is simply to replace each of the
+ individual records considered in the analysis with batches of $N_B$
+ records. The same patterns hold.
+}
\subsection{Generalized Bentley Saxe Method}
As a first step, we will derive a modified version of the Bentley-Saxe
@@ -76,7 +84,6 @@ like this before simply out of a lack of interest in constant factors
in theoretical asymptotic analysis. During our analysis, we'll intentionally leave these constant factors in place.
-
When generalizing the Bentley-Saxe method for arbitrary scale factors, we decided to maintain the core concept of binary decomposition. One interesting mathematical property of a Bentley-Saxe dynamization is that the internal
@@ -87,21 +94,49 @@ with all other levels being empty. If we represent a full level with a
1 and an empty level with a 0, then we'd have $1100$, which is $20$ in base 2.
+\begin{algorithm}
+\caption{The Generalized BSM Layout Policy}
+\label{alg:design-bsm}
+
+\KwIn{$r$: set of records to be inserted, $\mathscr{I}$: a dynamized structure, $n$: number of records in $\mathscr{I}$}
+
+\BlankLine
+\Comment{Find the first non-full level}
+$target \gets -1$ \;
+\For{$i=0\ldots \log_s n$} {
+    \If {$|\mathscr{I}_i| < N_B (s - 1)\cdot s^i$} {
+        $target \gets i$ \;
+        break \;
+    }
+}
+
+\BlankLine
+\Comment{If the structure is full, we need to grow it}
+\If {$target = -1$} {
+    $target \gets 1 + (\log_s n)$ \;
+}
+
+\BlankLine
+\Comment{Build the new structure}
+$\mathscr{I}_{target} \gets \text{build}(\text{unbuild}(\mathscr{I}_0) \cup \ldots \cup \text{unbuild}(\mathscr{I}_{target}) \cup r)$ \;
+\BlankLine
+\Comment{Empty the levels used to build the new shard}
+\For{$i=0\ldots target-1$} {
+    $\mathscr{I}_i \gets \emptyset$ \;
+}
+\end{algorithm}
+
Our generalization, then, is to represent the data as an $s$-ary decomposition, where the scale factor represents the base of the representation. To accomplish this, we set the capacity of level $i$ to
-be $N_b (s - 1) \cdot s^i$, where $N_b$ is the size of the buffer. The
-resulting structure will have at most $\log_s n$ shards. Unfortunately,
-the approach used by Bentley and Saxe to calculate the amortized insertion
-cost of the BSM does not generalize to larger bases, and so we will need
-to derive this result using a different approach. Note that, for this
-analysis, we will neglect the buffer size $N_b$ for simplicity. It cancels
-out in the analysis, and so would only serve to increase the complexity
-of the expressions without contributing any additional insights.\footnote{
- The contribution of the buffer size is simply to replace each of the
- individual records considered in the analysis with batches of $N_b$
- records. The same patterns hold.
-}
+be $N_B (s - 1) \cdot s^i$, where $N_B$ is the size of the buffer. The
+resulting structure will have at most $\log_s n$ shards. The resulting
+policy is described in Algorithm~\ref{alg:design-bsm}.
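As a concrete illustration of the $s$-ary decomposition (a hypothetical example, assuming $N_B = 1$ and $s = 4$), a structure holding $n = 20$ records corresponds to the base-4 representation $110$,
\begin{equation*}
n = 20 = \underbrace{1 \cdot 4^2}_{\text{level } 2} + \underbrace{1 \cdot 4^1}_{\text{level } 1} + \underbrace{0 \cdot 4^0}_{\text{level } 0},
\end{equation*}
so level 2 holds $16$ records, level 1 holds $4$, level 0 is empty, and no level exceeds its capacity of $(s - 1) \cdot s^i$ records.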
+
+Unfortunately, the approach used by Bentley and Saxe to calculate the
+amortized insertion cost of the BSM does not generalize to larger bases,
+and so we will need to derive this result using a different approach.
+
\begin{theorem}
The amortized insertion cost for generalized BSM with a growth factor of
@@ -185,38 +220,519 @@ in the dynamized structure, and will thus have a cost of $I(n) \in
\begin{theorem}
The worst-case query cost for generalized BSM for a decomposable
-search problem with cost $\mathscr{Q}_s(n)$ is $\Theta(\log_s(n) \cdot
+search problem with cost $\mathscr{Q}_S(n)$ is $O(\log_s(n) \cdot
\mathscr{Q}_S(n))$.
\end{theorem}
\begin{proof}
+The worst-case scenario for queries in BSM occurs when every existing
+level is full. In this case, there will be $\log_s n$ levels that must
+be queried, with the $i$th level containing $(s - 1) \cdot s^i$ records.
+Thus, the total cost of querying the structure will be,
+\begin{equation}
+\mathscr{Q}(n) = \sum_{i=0}^{\log_s n} \mathscr{Q}_S\left((s - 1) \cdot s^i\right)
+\end{equation}
+The number of records per shard is upper bounded by $O(n)$, so
+\begin{equation}
+\mathscr{Q}(n) \in O\left(\sum_{i=0}^{\log_s n} \mathscr{Q}_S(n)\right) \in O\left(\log_s n \cdot \mathscr{Q}_S(n)\right)
+\end{equation}
\end{proof}
\begin{theorem}
-The best-case insertion cost for generalized BSM for a decomposable
-search problem is $I_B \in \Theta(1)$.
+The best-case query cost for generalized BSM for a decomposable
+search problem with a cost of $\mathscr{Q}_S(n)$ is $\mathscr{Q}(n)
+\in \Theta(\mathscr{Q}_S(n))$.
\end{theorem}
\begin{proof}
+The best-case scenario for queries in BSM occurs when a new level is
+added, which results in every record in the structure being compacted
+into a single structure. In this case, there is only a single data
+structure in the dynamization, and so the query cost over the dynamized
+structure is identical to the query cost of a single static instance of
+the structure. Thus, the best-case query cost in BSM is,
+\begin{equation*}
+\mathscr{Q}_B(n) \in \Theta \left( 1 \cdot \mathscr{Q}_S(n) \right) \in \Theta\left(\mathscr{Q}_S(n)\right)
+\end{equation*}
\end{proof}
\subsection{Leveling}
+Our leveling layout policy is described in
+Algorithm~\ref{alg:design-level}. Each level contains a single structure
+with a capacity of $N_B\cdot s^i$ records. When a reconstruction occurs,
+the first level $i$ that has enough space to hold the records from level
+$i-1$ is selected as the target, and a new structure is built at level
+$i$ containing the records from both level $i$ and level $i-1$. Then,
+all levels $j < (i - 1)$ are shifted down by one, to level $j+1$. This
+process clears space in level $0$ to contain the buffer flush.
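To make the procedure concrete, the following is a minimal illustrative sketch of this reconstruction procedure, written in the same style as Algorithm~\ref{alg:design-bsm}; the details of the actual Algorithm~\ref{alg:design-level} may differ.

\begin{algorithm}
\caption{A sketch of the leveling layout policy (illustrative only)}

\KwIn{$r$: set of records flushed from the buffer, $\mathscr{I}$: a dynamized structure}

\BlankLine
\Comment{Find the first level able to absorb the level above it, adding a new empty level if none exists}
$target \gets$ smallest $i \geq 1$ such that $|\mathscr{I}_i| + |\mathscr{I}_{i-1}| \leq N_B \cdot s^i$ \;

\BlankLine
\Comment{Merge level $target - 1$ into level $target$}
$\mathscr{I}_{target} \gets \text{build}(\text{unbuild}(\mathscr{I}_{target}) \cup \text{unbuild}(\mathscr{I}_{target-1}))$ \;

\BlankLine
\Comment{Shift the remaining levels down by one to open up level $0$}
\For{$j=target-2\ldots 0$} {
    $\mathscr{I}_{j+1} \gets \mathscr{I}_j$ \;
}
\Comment{Place the buffer flush in the now-empty level $0$}
$\mathscr{I}_0 \gets \text{build}(r)$ \;
\end{algorithm}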
+
+\begin{theorem}
+The amortized insertion cost of leveling with a scale factor of $s$ is
+\begin{equation*}
+I_A(n) \in \Theta\left(\frac{B(n)}{n} \cdot \frac{1}{2}(s+1)\log_s n\right)
+\end{equation*}
+\end{theorem}
+\begin{proof}
+Similarly to generalized BSM, the records in each level will be rewritten
+up to $s$ times before they move down to the next level. Thus, the
+amortized insertion cost for leveling can be found by determining how
+many times a record is expected to be rewritten on a single level, and
+how many levels there are in the structure.
+
+On any given level, the total number of writes required to fill the level
+is given by the expression,
+\begin{equation*}
+B(s + (s - 1) + (s - 2) + \ldots + 1)
+\end{equation*}
+where $B$ is the number of records added to the level during each
+reconstruction (i.e., $N_B$ for level $0$ and $N_B\cdot s^{i-1}$ for any
+other level).
+
+This is because the first batch of records entering the level will be
+rewritten each of the $s$ times that the level is rebuilt before the
+records are merged into the level below. The next batch will be rewritten
+one fewer time, and so on. Thus, the total number of writes is,
+\begin{equation*}
+B\sum_{i=0}^{s-1} (s - i) = B\left(s^2 - \sum_{i=0}^{s-1} i\right) = B\left(s^2 - \frac{(s-1)s}{2}\right)
+\end{equation*}
+which can be simplified to get,
+\begin{equation*}
+\frac{1}{2}s(s+1)\cdot B
+\end{equation*}
+writes occurring on each level.\footnote{
+ This write count is not cumulative over the entire structure. It only
+ accounts for the number of writes occurring on this specific level.
+}
+
+To obtain the total number of times records are rewritten, we need to
+calculate the average number of times a record is rewritten per level,
+and sum this over all of the levels.
+\begin{equation*}
+\sum_{i=0}^{\log_s n} \frac{\frac{1}{2}B_i s (s+1)}{s B_i} = \frac{1}{2} \sum_{i=0}^{\log_s n} (s + 1) = \frac{1}{2} (s+1) \log_s n
+\end{equation*}
+To calculate the amortized insertion cost, we multiply this write amplification
+number by the cost of rebuilding the structures, and divide by the total number
+of records,
+\begin{equation*}
+I_A(n) \in \Theta\left(\frac{B(n)}{n}\cdot \frac{1}{2} (s+1) \log_s n\right)
+\end{equation*}
+\end{proof}
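To make the write amplification concrete, consider a hypothetical scale factor of $s = 4$: a level is rebuilt four times before it spills into the level below, for a total write volume on that level of
\begin{equation*}
B(4 + 3 + 2 + 1) = 10B = \frac{1}{2} \cdot 4 \cdot (4 + 1) \cdot B,
\end{equation*}
or an average of $\frac{s+1}{2} = 2.5$ writes for each of the $4B$ records that pass through the level.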
+
+\begin{theorem}
+The worst-case insertion cost for leveling with a scale factor of $s$ is
+\begin{equation*}
+\Theta\left(\frac{s-1}{s} \cdot B(n)\right)
+\end{equation*}
+\end{theorem}
+\begin{proof}
+Unlike in BSM, where the worst-case reconstruction involves all of the
+records within the structure, in leveling it only includes the records
+in the last two levels. In particular, the worst-case behavior occurs
+when the last level is one reconstruction away from its capacity, and the
+level above it is full. In this case, the reconstruction will involve,
+\begin{equation*}
+\left(s^{\log_s n} - s^{\log_s n - 1}\right) + s^{\log_s n - 1}
+\end{equation*}
+records, where the first parenthesized term represents the records in
+the last level, and the second the records in the level above it.
+\end{proof}
+
+\begin{theorem}
+The worst-case query cost for leveling for a decomposable search
+problem with cost $\mathscr{Q}_S(n)$ is
+\begin{equation*}
+O\left(\mathscr{Q}_S(n) \cdot \log_s n \right)
+\end{equation*}
+\end{theorem}
+\begin{proof}
+The worst-case scenario for leveling is right before the structure gains
+a new level, at which point there will be $\log_s n$ data structures,
+each with $O(n)$ records. Thus the worst-case cost will be the cost
+of querying each of these structures,
+\begin{equation*}
+O\left(\mathscr{Q}_S(n) \cdot \log_s n \right)
+\end{equation*}
+\end{proof}
+
+\begin{theorem}
+The best-case query cost for leveling for a decomposable search
+problem with cost $\mathscr{Q}_S(n)$ is
+\begin{equation*}
+\mathscr{Q}_B(n) \in O(\mathscr{Q}_S(n) \cdot \log_s n)
+\end{equation*}
+\end{theorem}
+\begin{proof}
+Unlike BSM, leveling will never have empty levels. The policy ensures
+that there is always a data structure on every level. As a result, the
+best-case query still must query $\log_s n$ structures, and so has a
+best-case cost of,
+\begin{equation*}
+\mathscr{Q}_B(n) \in O\left(\mathscr{Q}_S(n) \cdot \log_s n\right)
+\end{equation*}
+\end{proof}
+
\subsection{Tiering}
+\begin{theorem}
+The amortized insertion cost of tiering with a scale factor of $s$ is,
+\begin{equation*}
+I_A(n) \in \Theta\left(\frac{B(n)}{n} \cdot \log_s n \right)
+\end{equation*}
+\end{theorem}
+\begin{proof}
+For tiering, each record is written \emph{exactly} one time per
+level. As a result, each record will be involved in exactly $\log_s n$
+reconstructions over the lifetime of the structure. Each reconstruction
+will have cost $B(n)$, and thus the amortized insertion cost must be,
+\begin{equation*}
+I_A(n) \in \Theta\left(\frac{B(n)}{n} \cdot \log_s n\right)
+\end{equation*}
+\end{proof}
+
+\begin{theorem}
+The worst-case insertion cost of tiering with a scale factor of $s$ is,
+\begin{equation*}
+I(n) \in \Theta\left(B(n)\right)
+\end{equation*}
+\end{theorem}
+\begin{proof}
+The worst-case reconstruction in tiering involves performing a
+reconstruction on each level. Of these, the largest level will
+contain $\Theta(n)$ records, and thus dominates the cost of the
+reconstruction. More formally, the total cost of this reconstruction
+will be,
+\begin{equation*}
+I(n) = \sum_{i=0}^{\log_s n} B(s^i) = B(1) + B(s) + B(s^2) + \ldots + B(s^{\log_s n})
+\end{equation*}
+Of these, the final term $B(s^{\log_s n}) = B(n)$ dominates the others,
+resulting in an asymptotic worst-case cost of,
+\begin{equation*}
+I(n) \in \Theta\left(B(n)\right)
+\end{equation*}
+\end{proof}
+
+\begin{theorem}
+The worst-case query cost for tiering for a decomposable search
+problem with cost $\mathscr{Q}_S(n)$ is
+\begin{equation*}
+\mathscr{Q}(n) \in O( \mathscr{Q}_S(n) \cdot s \log_s n)
+\end{equation*}
+\end{theorem}
+\begin{proof}
+As with the previous two policies, the worst-case query occurs when the
+structure is completely full. In the case of tiering, that means that there
+will be $\log_s n$ levels, each containing $s$ shards with a size bounded
+by $O(n)$. Thus, there will be $s \log_s n$ structures to query, and the
+query cost must be,
+\begin{equation*}
+\mathscr{Q}(n) \in O \left(\mathscr{Q}_S(n) \cdot s \log_s n \right)
+\end{equation*}
+\end{proof}
+
+\begin{theorem}
+The best-case query cost for tiering for a decomposable search problem
+with cost $\mathscr{Q}_S(n)$ is $O(\mathscr{Q}_S(n) \cdot \log_s n)$.
+\end{theorem}
+\begin{proof}
+The tiering policy ensures that there are no internal empty levels, and
+as a result the best-case scenario for tiering occurs when each level is
+populated by exactly $1$ shard. In this case, there will only be $\log_s n$
+shards to query, resulting in,
+\begin{equation*}
+\mathscr{Q}_B(n) \in O\left(\mathscr{Q}_S(n) \cdot \log_s n \right)
+\end{equation*}
+best-case query cost.
+\end{proof}
\section{General Observations}
+The asymptotic results from the previous section are summarized in
+Table~\ref{tab:policy-comp}. When the scale factor is accounted for
+in the analysis, we can see that possible trade-offs begin to manifest
+within the space. We've seen some of these in action directly in
+the experimental sections of previous chapters.
-\begin{table*}[!t]
+\begin{table*}
\centering
-\begin{tabular}{|l l l l l|}
+\small
+\renewcommand{\arraystretch}{1.6}
+\begin{tabular}{|l l l l|}
\hline
-\textbf{Policy} & \textbf{Worst-case Query Cost} & \textbf{Worst-case Insert Cost} & \textbf{Best-case Insert Cost} & \textbf{Amortized Insert Cost} \\ \hline
-Gen. Bentley-Saxe &$\Theta\left(\log_s(n) \cdot Q(n)\right)$ &$\Theta\left(B(n)\right)$ &$\Theta\left(1\right)$ &$\Theta\left(\frac{B(n)}{n} \cdot \frac{1}{2}(s-1) \cdot ( (s-1)\log_s n + s)\right)$ \\
-Leveling &$\Theta\left(\log_s(n) \cdot Q(n)\right)$ &$\Theta\left(B(\frac{n}{s})\right)$ &$\Theta\left(1\right)$ &$\Theta\left(\frac{B(n)}{n} \cdot \frac{1}{2} \log_s(n)(s + 1)\right)$ \\
-Tiering &$\Theta\left(s\log_s(n) \cdot Q(n)\right)$ &$\Theta\left(B(n)\right)$ &$\Theta\left(1\right)$ &$\Theta\left(\frac{B(n)}{n} \cdot \log_s(n)\right)$ \\\hline
+& \textbf{Gen. BSM} & \textbf{Leveling} & \textbf{Tiering} \\ \hline
+$\mathscr{Q}(n)$ &$O\left(\log_s n \cdot \mathscr{Q}_S(n)\right)$ & $O\left(\log_s n \cdot \mathscr{Q}_S(n)\right)$ & $O\left(s \log_s n \cdot \mathscr{Q}_S(n)\right)$\\ \hline
+$\mathscr{Q}_B(n)$ & $\Theta(\mathscr{Q}_S(n))$ & $O(\log_s n \cdot \mathscr{Q}_S(n))$ & $O(\log_s n \cdot \mathscr{Q}_S(n))$ \\ \hline
+$I(n)$ & $\Theta(B(n))$ & $\Theta(\frac{s - 1}{s}\cdot B(n))$ & $\Theta(B(n))$\\ \hline
+$I_A(n)$ & $\Theta\left(\frac{B(n)}{n} \frac{1}{2}(s-1)\cdot((s-1)\log_s n +s)\right)$ & $\Theta\left(\frac{B(n)}{n} \frac{1}{2}(s+1)\log_s n\right)$& $\Theta\left(\frac{B(n)}{n} \log_s n\right)$ \\ \hline
\end{tabular}
-\caption{Comparison of cost functions for various reconstruction policies for DSPs}
+
+\caption{Comparison of cost functions for various layout policies for DSPs}
\label{tab:policy-comp}
\end{table*}
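As a rough numerical illustration of these trade-offs (a hypothetical configuration, ignoring the buffer and constant factors), take $s = 8$ and $n = 2^{30}$, so that $\log_s n = 10$:
\begin{itemize}
    \item all three policies maintain roughly $10$ levels;
    \item tiering rewrites each record about $\log_s n = 10$ times, but may have to query as many as $s \log_s n = 80$ shards;
    \item leveling and generalized BSM query at most $10$ shards, but leveling rewrites each record about $\frac{1}{2}(s+1)\log_s n = 45$ times.
\end{itemize}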
+% \begin{table*}[!t]
+% \centering
+% \begin{tabular}{|l l l l l|}
+% \hline
+% \textbf{Policy} & \textbf{Worst-case Query Cost} & \textbf{Worst-case Insert Cost} & \textbf{Best-cast Insert Cost} & \textbf{Amortized Insert Cost} \\ \hline
+% Gen. Bentley-Saxe &$\Theta\left(\log_s(n) \cdot Q(n)\right)$ &$\Theta\left(B(n)\right)$ &$\Theta\left(1\right)$ &$\Theta\left(\frac{B(n)}{n} \cdot \frac{1}{2}(s-1) \cdot ( (s-1)\log_s n + s)\right)$ \\
+% Leveling &$\Theta\left(\log_s(n) \cdot Q(n)\right)$ &$\Theta\left(B(\frac{n}{s})\right)$ &$\Theta\left(1\right)$ &$\Theta\left(\frac{B(n)}{n} \cdot \frac{1}{2} \log_s(n)(s + 1)\right)$ \\
+% Tiering &$\Theta\left(s\log_s(n) \cdot Q(n)\right)$ &$\Theta\left(B(n)\right)$ &$\Theta\left(1\right)$ &$\Theta\left(\frac{B(n)}{n} \cdot \log_s(n)\right)$ \\\hline
+% \end{tabular}
+% \caption{Comparison of cost functions for various reconstruction policies for DSPs}
+% \label{tab:policy-comp-old}
+% \end{table*}
+
+% \begin{table*}[!t]
+% \centering
+% \begin{tabular}{|l l l l|}
+% %stuff &\textbf{Gen. BSM} & \textbf{Leveling} & \textbf{Tiering} \\
+% % \textbf{Worst-case Query} &$O\left(\log_s(n)\cdot \mathscr{Q}_S(n)\right)$ & $O\left(\log_s(n) \mathscr{Q}_S(n)\right)$ & $O\left(s \log_s(n) \mathscr{Q}_S(n)\right)$\\ \hline
+% % \textbf{Best-case Query} & & & \\ \hline
+% % \textbf{Worst-case Insert} & & & \\ \hline
+% % \textbf{Amortized Insert} & & & \\ \hline
+
+% \caption{Comparison of cost functions for various reconstruction policies for DSPs}
+% \label{tab:policy-comp}
+% \end{table*}
+
+\section{Experimental Evaluation}
+
+In the previous sections, we mathematically proved various claims about
+the performance characteristics of our three layout policies to assess
+the trade-offs that exist within the design space. While this analysis is
+useful, the effects we are examining are at the level of constant factors,
+and so it would be useful to perform experimental testing to validate
+that these claimed performance characteristics manifest in practice. In
+this section, we will do just that, running various benchmarks to explore
+the real-world performance implications of the configuration parameter
+space of our framework.
+
+
+\subsection{Asymptotic Insertion Performance}
+
+We'll begin by validating our results for the insertion performance
+characteristics of the three layout policies. For this test, we
+consider two data structures: the ISAM tree and the VP tree. The ISAM
+tree structure is merge-decomposable using a sorted-array merge, with
+a build cost of $B_M(n) \in \Theta(n \log k)$, where $k$ is the number
+of structures being merged. The VPTree, by contrast, is \emph{not}
+merge decomposable, and is built in $B(n) \in \Theta(n \log n)$ time. We
+use the $200,000,000$ record SOSD \texttt{OSM} dataset~\cite{sosd} for
+ISAM testing, and the $1,000,000$ record, $300$-dimensional Spanish
+Billion Words (\texttt{SBW}) dataset~\cite{sbw} for VPTree testing.
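As an illustration of where the $\Theta(n \log k)$ build cost comes from, the following sketch shows a standard heap-based $k$-way merge of sorted runs (illustrative only; the framework's actual merge routine may differ). Each of the $n$ records passes through a constant number of $O(\log k)$ heap operations.

\begin{algorithm}
\caption{A sketch of a $k$-way sorted-array merge costing $\Theta(n \log k)$}

\KwIn{$k$ sorted runs $R_1, \ldots, R_k$ containing $n$ records in total}
\KwOut{$A$: a single sorted array of all $n$ records}

\BlankLine
\Comment{Each heap operation below costs $O(\log k)$, and there are $\Theta(n)$ of them}
$H \gets$ a min-heap containing the first record of each non-empty run \;
\While{$H$ is not empty} {
    $(r, j) \gets \text{pop-min}(H)$ \;
    append $r$ to $A$ \;
    \If {$R_j$ has records remaining} {
        push the next record of $R_j$ onto $H$ \;
    }
}
\end{algorithm}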
+
+For our first experiment, we will examine the latency distribution for
+inserts into our structures. We tested the three layout policies, using a
+common scale factor of $s=2$. This scale factor was selected to minimize
+its influence on the results (we've seen before in Sections~\ref{}
+and \ref{} that the scale factor affects leveling and tiering in opposite
+ways) and to isolate the influence of the layout policy alone to as great
+a degree as possible. We used a buffer size of $N_B=12000$ for the ISAM
+tree structure, and $N_B=1000$ for the VPTree.
+
+We generated this distribution by inserting $30\%$ of the records from
+the set to ``warm up'' the dynamized structure, and then measuring the
+insertion latency for each individual insert for the remaining $70\%$
+of the data. Note that, due to timer resolution issues at nanosecond
+scales, the specific latency values associated with the faster end of
+the insertion distribution are not precise. However, it is our intention
+to examine the latency distribution, not the values themselves, and so
+this is not a significant limitation for our analysis.
+
+\begin{figure}
+\subfloat[ISAM Tree Insertion Latencies]{\includegraphics[width=.5\textwidth]{img/design-space/isam-insert-dist.pdf} \label{fig:design-isam-ins-dist}}
+\subfloat[VPTree Insertion Latencies]{\includegraphics[width=.5\textwidth]{img/design-space/vptree-insert-dist.pdf} \label{fig:design-vptree-ins-dist}} \\
+\caption{Insertion Latency Distributions for Layout Policies}
+\label{fig:design-policy-ins-latency}
+\end{figure}
+
+The resulting distributions are shown in
+Figure~\ref{fig:design-policy-ins-latency}. These distributions are
+represented using a ``reversed'' CDF with log scaling on both axes. This
+representation has proven very useful for interpreting the latency
+distributions that we see in evaluating dynamization, but it is slightly
+unusual, and so we've included a guide to interpreting these charts
+in Appendix~\ref{append:rcdf}.
+
+The first notable point is that, for both the ISAM tree
+in Figure~\ref{fig:design-isam-ins-dist} and the VPTree in
+Figure~\ref{fig:design-vptree-ins-dist}, the Leveling
+policy results in a measurably lower worst-case insertion
+latency. This result is in line with our theoretical analysis in
+Section~\ref{ssec:design-leveling-proofs}. However, there is a major
+deviation from the theoretical analysis in the worst-case performance of Tiering
+and BSM. Both of these should have similar worst-case latencies, as
+the worst-case reconstruction in both cases involves every record in
+the structure. Yet, we see tiering consistently performing better,
+particularly for the ISAM tree.
+
+The reason for this has to do with the way that the records are
+partitioned in these worst-case reconstructions. In Tiering with a scale
+factor of $s=2$, the worst-case reconstruction consists of $\Theta(\log_2
+n)$ distinct reconstructions, each involving exactly $2$ structures. BSM,
+on the other hand, will use exactly $1$ reconstruction involving
+$\Theta(\log_2 n)$ structures. This explains why ISAM performs much better
+in Tiering than BSM, as the actual reconstruction cost function there is
+$\Theta(n \log_2 k)$. For tiering, this results in $\Theta(n)$ cost in
+the worst case. BSM, on the other hand, has $\Theta(n \log_2 \log_2 n)$ cost,
+as many more distinct structures must be merged in the reconstruction,
+and it is thus asymptotically worse off. The VPTree, on the other hand, sees
+less of a difference because it is \emph{not} merge decomposable, and so
+the number of structures involved in a reconstruction matters less. Having
+the records more partitioned still hurts performance, most likely due to
+cache effects, but less so than in the MDSP case.
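Under this cost model, the two worst cases can be compared directly (a back-of-the-envelope comparison, ignoring the buffer and constant factors),
\begin{equation*}
\underbrace{\sum_{i=0}^{\log_2 n} \Theta\left(2^i \cdot \log_2 2\right) \in \Theta(n)}_{\text{Tiering: } \log_2 n \text{ merges of } 2 \text{ structures each}}
\qquad \text{versus} \qquad
\underbrace{\Theta\left(n \log_2 \log_2 n\right)}_{\text{BSM: one merge of } \log_2 n \text{ structures}}
\end{equation*}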
+
+\begin{figure}
+
+\caption{Insertion Throughput for Layout Policies}
+\label{fig:design-ins-tput}
+\end{figure}
+
+Next, in Figure~\ref{fig:design-ins-tput}, we show the overall
+insertion throughput for the three policies. This result should
+correlate with the amortized insertion costs for each policy derived in
+Section~\ref{sec:design-asymp}. As expected, tiering has the highest
+throughput.
+
+
+\subsection{General Insert vs. Query Trends}
+
+For our next experiment, we will consider the trade-offs between insertion
+and query performance that exist within this design space. We benchmarked
+each layout policy for a range of scale factors, measuring both their
+respective insertion throughputs and query latencies for both the ISAM Tree
+and the VPTree.
+
+\begin{figure}
+
+\caption{Insertion Throughput vs. Query Latency}
+\label{fig:design-tradeoff}
+\end{figure}
+
+\subsection{Query Size Effects}
+
+One potentially interesting aspect of decomposition-based dynamization
+techniques is that, asymptotically, the additional cost added by
+decomposing the data structure vanishes for sufficiently expensive
+queries. Bentley and Saxe proved that for query costs of the form
+$\mathscr{Q}_S(n) \in \Omega(n^\epsilon)$ for $\epsilon > 0$, the
+overall query cost is unaffected (asymptotically) by the decomposition.
+This would seem to suggest that, as the cost of the query over a single
+shard increases, the effectiveness of our design space for tuning query
+performance should diminish. This is because our tuning space consists
+of adjusting the number of shards within the structure, and so as the
+effects of decomposition on the query cost diminish, we should see all
+configurations approaching a similar query performance.
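The intuition behind the Bentley-Saxe result can be seen directly from the decomposed query cost: for $\mathscr{Q}_S(n) \in \Theta(n^\epsilon)$, the per-level costs form a geometric series that is dominated by the largest shard,
\begin{equation*}
\sum_{i=0}^{\log_s n} \left((s-1) \cdot s^{i}\right)^{\epsilon} \in \Theta\left(\left(s^{\log_s n}\right)^{\epsilon}\right) = \Theta\left(n^{\epsilon}\right),
\end{equation*}
so the overhead contributed by the smaller shards is asymptotically negligible.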
+
+In order to evaluate this effect, we tested the query latency of range
+queries of varying selectivity against various configurations of our
+framework to see at what points the query latencies begin to converge. We
+also tested $k$-NN queries with varying values of $k$.
+
+\begin{figure}
+\caption{Query ``Size'' Effect Analysis}
+\label{fig:design-query-sze}
+\end{figure}
+
+\section{Asymptotically Relevant Trade-offs}
+
+Thus far, we have considered a configuration system that trades in
+constant factors only. In general asymptotic analysis, all possible
+configurations of our framework in this scheme collapse to the same basic
+cost functions when the constants are removed. While we have demonstrated
+that, in practice, the effects of this configuration are measurable, there
+do exist techniques in the classical literature that provide asymptotically
+relevant trade-offs, such as the equal block method~\cite{maurer80} and
+the mixed method~\cite[pp. 117-118]{overmars83}. These techniques have
+cost functions that are derived from arbitrary, positive, monotonically
+increasing functions of $n$ that govern various ways in which the data
+structure is partitioned, and changing the selection of function allows
+for ``tuning'' the performance. However, to the best of our knowledge,
+these techniques have never been implemented, and no useful guidance in
+the literature exists for selecting these functions.
+
+Nevertheless, it is useful to consider the general approach of these
+techniques. They accomplish asymptotically relevant trade-offs by tying
+the decomposition of the data structure directly to a function of $n$,
+the number of records, in a user-configurable way. We can import a similar
+concept into our existing configuration framework for dynamization
+to enable similar trade-offs, by replacing the constant scale factor,
+$s$, with some function $s(n)$. However, we must take extreme care when
+doing this to select a function that doesn't catastrophically impair
+query performance.
+
+Recall that, generally speaking, our dynamization technique requires
+multiplying the cost function for the data structure being dynamized by
+the number of shards that the data structure has been decomposed into. For
+search problems that are solvable in sub-polynomial time, this results in
+a worst-case query cost of,
+\begin{equation}
+\mathscr{Q}(n) \in O(S(n) \cdot \mathscr{Q}_S(n))
+\end{equation}
+where $S(n)$ is the number of shards and, for our framework, is $S(n) \in
+O(s \log_s n)$. The user can adjust $s$, but this tuning does not have
+asymptotically relevant consequences. Unfortunately, there is not much
+room, practically, for adjustment. If, for example, we were to allow the
+user to specify $S(n) \in \Theta(n)$, rather than $\Theta(\log n)$, then
+query performance would be greatly impaired. We need a function that is
+sub-linear to ensure useful performance.
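To see how severe the impairment can be, consider a hypothetical structure with $\mathscr{Q}_S(n) \in \Theta(\log n)$,
\begin{equation*}
S(n) \in \Theta(\log n) \implies \mathscr{Q}(n) \in O\left(\log^2 n\right), \qquad
S(n) \in \Theta(n) \implies \mathscr{Q}(n) \in O\left(n \log n\right),
\end{equation*}
and the latter is worse than simply scanning the entire data set for every query.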
+
+To accomplish this, we propose adding a second scaling factor, $k$, such
+that the number of records on level $i$ is given by,
+\begin{equation}
+\label{eqn:design-k-expr}
+N_B \cdot \left(s \log_2^k(n)\right)^{i}
+\end{equation}
+with $k=0$ being equivalent to the configuration space we have discussed
+thus far. The addition of $k$ allows for the dependency of the number of
+shards on $n$ to be slightly biased upwards or downwards, in a way that
+\emph{does} show up in the asymptotic analysis for inserts and queries,
+but also ensures sub-polynomial additional query cost.
+
+In particular, we prove the following asymptotic properties of this
+configuration.
+\begin{theorem}
+The worst-case query latency of a dynamization scheme where the
+capacity of each level is provided by Equation~\ref{eqn:design-k-expr} is
+\begin{equation}
+\mathscr{Q}(n) \in O\left(\frac{\log n}{\log (k \log n)} \cdot \mathscr{Q}_S(n)\right)
+\end{equation}
+\end{theorem}
+\begin{proof}
+The number of levels within the structure is given by $\log_s (n)$,
+where $s$ is the scale factor. The addition of $k$ to the parameterization
+replaces this scale factor with $s \log^k n$, and so we have
+\begin{equation*}
+\log_{s \log^k n}n = \frac{\log n}{\log\left(s \log^k n\right)} = \frac{\log n}{\log s + \log\left(k \log n\right)} \in O\left(\frac{\log n}{\log (k \log n)}\right)
+\end{equation*}
+by the application of various logarithm rules and the change-of-base formula.
+
+The cost of a query against a decomposed structure is $O(S(n) \cdot \mathscr{Q}_S(n))$, and
+there are $\Theta(1)$ shards per level. Thus, the worst-case query cost is
+\begin{equation*}
+\mathscr{Q}(n) \in O\left(\frac{\log n}{\log (k \log n)} \cdot \mathscr{Q}_S(n)\right)
+\end{equation*}
+\end{proof}
+
+\begin{theorem}
+The amortized insertion cost of a dynamization scheme where the capacity of
+each level is provided by Equation~\ref{eqn:design-k-expr} is,
+\begin{equation*}
+I_A(n) \in \Theta\left(\frac{B(n)}{n} \cdot \frac{\log n}{\log ( k \log n)}\right)
+\end{equation*}
+\end{theorem}
+\begin{proof}
+\end{proof}
+
+\subsection{Evaluation}
+
+In this section, we'll assess the effect that modifying $k$ in our
+new parameter space has on the insertion and query performance of our
+dynamization framework.
+
+
+\section{Conclusion}
+
+
diff --git a/chapters/dynamization.tex b/chapters/dynamization.tex
index 0ee77d3..b5bf404 100644
--- a/chapters/dynamization.tex
+++ b/chapters/dynamization.tex
@@ -459,6 +459,8 @@ individually. More formally, for any query running in $\mathscr{Q}(n) \in
cost of answering a decomposable search problem from a BSM dynamization
is $\Theta\left(\mathscr{Q}(n)\right)$.~\cite{saxe79}
+\subsection{The Mixed Method}
+
\subsection{Merge Decomposable Search Problems}
\subsection{Delete Support}
diff --git a/chapters/tail-latency.tex b/chapters/tail-latency.tex
index 110069d..a96a897 100644
--- a/chapters/tail-latency.tex
+++ b/chapters/tail-latency.tex
@@ -1 +1,3 @@
\chapter{Controlling Insertion Tail Latency}
+
+\section{Introduction}
diff --git a/img/design-space/isam-insert-dist.pdf b/img/design-space/isam-insert-dist.pdf
Binary files differ
new file mode 100644
index 0000000..4cb02a3
--- /dev/null
+++ b/img/design-space/isam-insert-dist.pdf
diff --git a/img/design-space/vptree-insert-dist.pdf b/img/design-space/vptree-insert-dist.pdf
Binary files differ
new file mode 100644
index 0000000..0128e13
--- /dev/null
+++ b/img/design-space/vptree-insert-dist.pdf
diff --git a/img/isam_insert.pdf b/img/isam_insert.pdf
Binary files differ
new file mode 100644
index 0000000..4cb02a3
--- /dev/null
+++ b/img/isam_insert.pdf
diff --git a/img/vptree_insert.pdf b/img/vptree_insert.pdf
Binary files differ
new file mode 100644
index 0000000..0128e13
--- /dev/null
+++ b/img/vptree_insert.pdf
diff --git a/paper.tex b/paper.tex
@@ -394,7 +394,7 @@ of Engineering Science and Mechanics
% lines that redefine the \thechapter and \thesection:
%\renewcommand\thechapter{}
%\renewcommand\thesection{\arabic{section}}
-% \include{Appendix-A/Appendix-A}
+%\include{chapters/app-reverse-cdf}
% \include{Appendix-B/Appendix-B}
% \include{Appendix-C/Appendix-C}
% \include{Appendix-D/Appendix-D}