From bc531da7b924a4c51a5e95a65cf2f1d1d13645d9 Mon Sep 17 00:00:00 2001
From: Douglas Rumbaugh
Date: Thu, 29 May 2025 20:36:47 -0400
Subject: updates

---
 chapters/tail-latency.tex | 63 +++++++++++++++++++++++------------------------
 1 file changed, 31 insertions(+), 32 deletions(-)

(limited to 'chapters/tail-latency.tex')

diff --git a/chapters/tail-latency.tex b/chapters/tail-latency.tex
index 4b629cc..38e8f27 100644
--- a/chapters/tail-latency.tex
+++ b/chapters/tail-latency.tex
@@ -19,12 +19,12 @@ consider the insertion performance in Figure~\ref{fig:tl-btree-isam},
 which compares the insertion latencies of a dynamized ISAM tree with
 that of its most direct dynamic analog: a B+Tree. While, as shown in
 Figure~\ref{fig:tl-btree-isam-tput}, the dynamized structure has
-comparable average performance to the native dynamic structure, the
-latency distributions, shown in Figure~\ref{fig:tl-btree-isam-lat}
-are quite different. While the dynamized structure has much better
-"best-case" performance, the worst-case performance is exceedingly
-poor. That the structure exhibits reasonable performance on average is
-the result of these two ends of the distribution balancing each other out.
+superior average performance to the native dynamic structure, the
+latency distributions, shown in Figure~\ref{fig:tl-btree-isam-lat} are
+quite different. The dynamized structure has much better ``best-case''
+performance, but the worst-case performance is exceedingly poor. That
+the structure exhibits reasonable performance on average is the result
+of these two ends of the distribution balancing each other out.
 
 This poor worst-case performance is a direct consequence of the different
 approaches to update support used by the dynamized structure and B+Tree.
@@ -39,9 +39,9 @@ proportion of it (for leveling).
 The fact that our dynamization technique uses buffering, and most of
 the shards involved in reconstruction are kept small by the logarithmic
 decomposition technique used to partition it, ensures that the majority
 of inserts are low cost compared to the
-B+Tree, but at the extreme end of the latency distribution, the local
-reconstruction strategy used by the B+Tree results in better worst-case
-performance.
+B+Tree. At the extreme end of the latency distribution, though, the
+local reconstruction strategy used by the B+Tree results in significantly
+better worst-case performance.
 
 Unfortunately, the design space that we have been considering thus far
 is limited in its ability to meaningfully alter the worst-case insertion
@@ -69,10 +69,9 @@ worst-case performance to be seen here.
 Adjusting the scale factor does have an effect on the distribution,
 but not in a way that is particularly useful from a configuration
 standpoint, and adjusting the mutable buffer has almost no effect on
 the worst-case latency at all, or even on the
-distribution; particularly when tiering is used. This is to be expected,
-ultimately the worst-case reconstructions largely the same regardless
-of scale factor or buffer size: a reconstruction involving $\Theta(n)$
-records.
+distribution, particularly when tiering is used. This is to be expected;
+ultimately the worst-case reconstruction size is largely the same
+regardless of scale factor or buffer size: $\Theta(n)$ records.
 
 The selection of configuration parameters can influence \emph{when}
 these reconstructions occur, as well as slightly influence their size, but
@@ -120,10 +119,10 @@ between insertion and query performance by controlling the number of
 blocks in the decomposition. Placing a bound on this number is necessary
 to bound the worst-case query cost, and is done using reconstructions
 to either merge (in the case of the Bentley-Saxe method) or re-partition
-(in the case of the equal block method) them. Performing less
-reconstructions reduces the amount of work associated with inserts,
-at the cost of allowing more blocks to accumulate and thereby hurting
-query performance.
+(in the case of the equal block method) the blocks. Performing less
+frequent reconstructions reduces the amount of work associated with
+inserts, at the cost of allowing more blocks to accumulate and thereby
+hurting query performance.
 
 This trade-off between insertion and query performance by way of block
 count is most directly visible in the equal block method described
@@ -135,10 +134,19 @@ I(n) &\in \Theta\left(\frac{n}{f(n)}\right) \\
 \end{align*}
 where $f(n)$ is the number of blocks.
 
+\begin{figure}
+\centering
+\subfloat[Insertion vs. Query Trade-off]{\includegraphics[width=.5\textwidth]{img/design-space/isam-insert-dist.pdf} \label{fig:tl-ebm-tradeoff}}
+\subfloat[Insertion Latency Distribution]{\includegraphics[width=.5\textwidth]{img/design-space/vptree-insert-dist.pdf} \label{fig:tl-ebm-tail-latency}} \\
+
+\caption{The equal block method with varying values of $f(n)$.}
+\label{fig:tl-ebm}
+\end{figure}
+
 Unlike the design space we have proposed in
 Chapter~\ref{chap:design-space}, the equal block method allows for
 \emph{both} trading off between insert and query performance, \emph{and}
-controlling the tail latency. Figure~\ref{fig:tl-ebm)} shows the results
+controlling the tail latency. Figure~\ref{fig:tl-ebm} shows the results
 of testing an implementation of a dynamized ISAM tree using the equal
 block method, with
 \begin{equation*}
@@ -152,25 +160,16 @@ serve to demonstrate the relevant properties in the clearest possible
 manner.
 
 Figure~\ref{fig:tl-ebm-tail-latency} shows that the equal block
-method provides a very direct relationship between the tail latency,
+method provides a very direct relationship between the tail latency
 and the number of blocks. The worst-case insertion performance is
 dictated by the size of the largest reconstruction, and so increasing
 the block count results in smaller blocks, and better insertion
 performance. These worst-case results also translate directly into
 improved average throughput, at the cost of query latency, as shown in
-Figure~\ref{fig:tl-ebm-tradeoff}. Note that, contrary to our Bentley-Saxe
-inspired dynamization system, the equal block method provides clear and
-direct relationships between insertion and query performance, as well
-as direct control over tail latency, through its design space.
-
-\begin{figure}
-\centering
-\subfloat[Insertion vs. Query Trade-off]{\includegraphics[width=.5\textwidth]{img/design-space/isam-insert-dist.pdf} \label{fig:tl-ebm-tradeoff}}
-\subfloat[Insertion Latency Distribution]{\includegraphics[width=.5\textwidth]{img/design-space/vptree-insert-dist.pdf} \label{fig:tl-ebm-tail-latency}} \\
-
-\caption{The equal block method with varying values of $f(n)$.}
-\label{fig:tl-ebm}
-\end{figure}
+Figure~\ref{fig:tl-ebm-tradeoff}. These results show that, in contrast to our
+Bentley-Saxe inspired dynamization system, the equal block method provides
+clear and direct relationships between insertion and query performance,
+as well as direct control over tail latency, through its design space.
 
 Unfortunately, the equal block method is not well suited for our
 purposes. Despite having a much cleaner trade-off space, its
--
cgit v1.2.3
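A note on the cost behavior the patch describes: the worst-case insert under a Bentley-Saxe-style binary decomposition is a $\Theta(n)$ reconstruction regardless of scale factor or buffer size, while the equal block method bounds it at $\Theta(n/f(n))$. This can be illustrated with a toy cost model. The sketch below is hypothetical and is not the chapter's implementation: the function names, the choice of $f(n) = 64$, and the assumption of unit cost per merged record are all illustrative.

```python
# Illustrative-only cost model for per-insert reconstruction work.
# Assumption: cost of an insert = number of records merged/rebuilt by it.

def bsm_insert_costs(n_inserts):
    """Bentley-Saxe-style binary decomposition behaves like a binary
    counter: the i-th insert merges blocks whose total size is the
    largest power of two dividing i, so the worst case is Theta(n)."""
    return [i & -i for i in range(1, n_inserts + 1)]

def equal_block_insert_costs(n_inserts, f):
    """Equal block method sketch: records are spread over f(n) roughly
    equal blocks and an insert rebuilds only the smallest block, so the
    worst-case cost is about n / f(n) records."""
    blocks = []
    costs = []
    for i in range(1, n_inserts + 1):
        while len(blocks) < f(i):      # maintain the target block count
            blocks.append(0)
        j = min(range(len(blocks)), key=blocks.__getitem__)
        blocks[j] += 1                 # rebuild cost ~ resulting block size
        costs.append(blocks[j])
    return costs

if __name__ == "__main__":
    bsm = bsm_insert_costs(10_000)
    ebm = equal_block_insert_costs(10_000, lambda n: 64)
    print(max(bsm))  # -> 8192: a Theta(n) merge at the power-of-two boundary
    print(max(ebm))  # -> 157: bounded near n / f(n) = 10000 / 64
```

Under this model the two methods have the same flavor of average behavior, but only the equal block method's tail is controlled by its parameter, which is the relationship the patched text attributes to Figure~\ref{fig:tl-ebm-tail-latency}.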