From bc531da7b924a4c51a5e95a65cf2f1d1d13645d9 Mon Sep 17 00:00:00 2001
From: Douglas Rumbaugh
Date: Thu, 29 May 2025 20:36:47 -0400
Subject: updates

---
 chapters/tail-latency.tex | 63 +++++++++++++++++++++++------------------------
 1 file changed, 31 insertions(+), 32 deletions(-)

(limited to 'chapters/tail-latency.tex')

diff --git a/chapters/tail-latency.tex b/chapters/tail-latency.tex
index 4b629cc..38e8f27 100644
--- a/chapters/tail-latency.tex
+++ b/chapters/tail-latency.tex
@@ -19,12 +19,12 @@ consider the insertion performance in Figure~\ref{fig:tl-btree-isam},
 which compares the insertion latencies of a dynamized ISAM tree with
 that of its most direct dynamic analog: a B+Tree. While, as shown in
 Figure~\ref{fig:tl-btree-isam-tput}, the dynamized structure has
-comparable average performance to the native dynamic structure, the
-latency distributions, shown in Figure~\ref{fig:tl-btree-isam-lat}
-are quite different. While the dynamized structure has much better
-"best-case" performance, the worst-case performance is exceedingly
-poor. That the structure exhibits reasonable performance on average is
-the result of these two ends of the distribution balancing each other out.
+superior average performance to the native dynamic structure, the
+latency distributions, shown in Figure~\ref{fig:tl-btree-isam-lat} are
+quite different. The dynamized structure has much better ``best-case''
+performance, but the worst-case performance is exceedingly poor. That
+the structure exhibits reasonable performance on average is the result
+of these two ends of the distribution balancing each other out.
 
 This poor worst-case performance is a direct consequence of the different
 approaches to update support used by the dynamized structure and B+Tree.
@@ -39,9 +39,9 @@ proportion of it (for leveling).
 The fact that our dynamization technique uses buffering, and most of
 the shards involved in reconstruction are kept small by the logarithmic
 decomposition technique used to partition it, ensures that the majority
 of inserts are low cost compared to the
-B+Tree, but at the extreme end of the latency distribution, the local
-reconstruction strategy used by the B+Tree results in better worst-case
-performance.
+B+Tree. At the extreme end of the latency distribution, though, the
+local reconstruction strategy used by the B+Tree results in significantly
+better worst-case performance.
 
 Unfortunately, the design space that we have been considering thus far
 is limited in its ability to meaningfully alter the worst-case insertion
@@ -69,10 +69,9 @@ worst-case performance to be seen here.
 Adjusting the scale factor does have an effect on the distribution,
 but not in a way that is particularly useful from a configuration
 standpoint, and adjusting the mutable buffer has almost no effect on
 the worst-case latency at all, or even on the
-distribution; particularly when tiering is used. This is to be expected,
-ultimately the worst-case reconstructions largely the same regardless
-of scale factor or buffer size: a reconstruction involving $\Theta(n)$
-records.
+distribution, particularly when tiering is used. This is to be expected;
+ultimately the worst-case reconstruction size is largely the same
+regardless of scale factor or buffer size: $\Theta(n)$ records.
 
 The selection of configuration parameters can influence \emph{when}
 these reconstructions occur, as well as slightly influence their size, but
@@ -120,10 +119,10 @@ between insertion and query performance by controlling the number of
 blocks in the decomposition. Placing a bound on this number is necessary
 to bound the worst-case query cost, and is done using reconstructions
 to either merge (in the case of the Bentley-Saxe method) or re-partition
-(in the case of the equal block method) them. Performing less
-reconstructions reduces the amount of work associated with inserts,
-at the cost of allowing more blocks to accumulate and thereby hurting
-query performance.
+(in the case of the equal block method) the blocks. Performing less
+frequent reconstructions reduces the amount of work associated with
+inserts, at the cost of allowing more blocks to accumulate and thereby
+hurting query performance.
 
 This trade-off between insertion and query performance by way of block
 count is most directly visible in the equal block method described
@@ -135,10 +134,19 @@ I(n) &\in \Theta\left(\frac{n}{f(n)}\right) \\
 \end{align*}
 where $f(n)$ is the number of blocks.
 
+\begin{figure}
+\centering
+\subfloat[Insertion vs. Query Trade-off]{\includegraphics[width=.5\textwidth]{img/design-space/isam-insert-dist.pdf} \label{fig:tl-ebm-tradeoff}}
+\subfloat[Insertion Latency Distribution]{\includegraphics[width=.5\textwidth]{img/design-space/vptree-insert-dist.pdf} \label{fig:tl-ebm-tail-latency}} \\
+
+\caption{The equal block method with varying values of $f(n)$.}
+\label{fig:tl-ebm}
+\end{figure}
+
 Unlike the design space we have proposed in
 Chapter~\ref{chap:design-space}, the equal block method allows for
 \emph{both} trading off between insert and query performance, \emph{and}
-controlling the tail latency. Figure~\ref{fig:tl-ebm)} shows the results
+controlling the tail latency. Figure~\ref{fig:tl-ebm} shows the results
 of testing an implementation of a dynamized ISAM tree using the equal
 block method, with
 \begin{equation*}
@@ -152,25 +160,16 @@ serve to demonstrate the relevant properties in the clearest possible
 manner.
 
 Figure~\ref{fig:tl-ebm-tail-latency} shows that the equal block
-method provides a very direct relationship between the tail latency,
+method provides a very direct relationship between the tail latency
 and the number of blocks. The worst-case insertion performance is
 dictated by the size of the largest reconstruction, and so increasing
 the block count results in smaller blocks, and better insertion
 performance. These worst-case results also translate directly into
 improved average throughput, at the cost of query latency, as shown in
-Figure~\ref{fig:tl-ebm-tradeoff}. Note that, contrary to our Bentley-Saxe
-inspired dynamization system, the equal block method provides clear and
-direct relationships between insertion and query performance, as well
-as direct control over tail latency, through its design space.
-
-\begin{figure}
-\centering
-\subfloat[Insertion vs. Query Trade-off]{\includegraphics[width=.5\textwidth]{img/design-space/isam-insert-dist.pdf} \label{fig:tl-ebm-tradeoff}}
-\subfloat[Insertion Latency Distribution]{\includegraphics[width=.5\textwidth]{img/design-space/vptree-insert-dist.pdf} \label{fig:tl-ebm-tail-latency}} \\
-
-\caption{The equal block method with varying values of $f(n)$.}
-\label{fig:tl-ebm}
-\end{figure}
+Figure~\ref{fig:tl-ebm-tradeoff}. These results show that, in contrast to our
+Bentley-Saxe inspired dynamization system, the equal block method provides
+clear and direct relationships between insertion and query performance,
+as well as direct control over tail latency, through its design space.
 
 Unfortunately, the equal block method is not well suited for our
 purposes. Despite having a much cleaner trade-off space, its
--
cgit v1.2.3
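A note on the cost behavior the patch describes: the worst-case insert under a Bentley-Saxe-style binary decomposition is a $\Theta(n)$ reconstruction regardless of scale factor or buffer size, while the equal block method bounds it at $\Theta(n/f(n))$. This can be illustrated with a toy cost model. The sketch below is hypothetical and is not the chapter's implementation: the function names, the choice of $f(n) = 64$, and the assumption of unit cost per merged record are all illustrative.

```python
# Illustrative-only cost model for per-insert reconstruction work.
# Assumption: cost of an insert = number of records merged/rebuilt by it.

def bsm_insert_costs(n_inserts):
    """Bentley-Saxe-style binary decomposition behaves like a binary
    counter: the i-th insert merges blocks whose total size is the
    largest power of two dividing i, so the worst case is Theta(n)."""
    return [i & -i for i in range(1, n_inserts + 1)]

def equal_block_insert_costs(n_inserts, f):
    """Equal block method sketch: records are spread over f(n) roughly
    equal blocks and an insert rebuilds only the smallest block, so the
    worst-case cost is about n / f(n) records."""
    blocks = []
    costs = []
    for i in range(1, n_inserts + 1):
        while len(blocks) < f(i):      # maintain the target block count
            blocks.append(0)
        j = min(range(len(blocks)), key=blocks.__getitem__)
        blocks[j] += 1                 # rebuild cost ~ resulting block size
        costs.append(blocks[j])
    return costs

if __name__ == "__main__":
    bsm = bsm_insert_costs(10_000)
    ebm = equal_block_insert_costs(10_000, lambda n: 64)
    print(max(bsm))  # -> 8192: a Theta(n) merge at the power-of-two boundary
    print(max(ebm))  # -> 157: bounded near n / f(n) = 10000 / 64
```

Under this model the two methods have the same flavor of average behavior, but only the equal block method's tail is controlled by its parameter, which is the relationship the patched text attributes to Figure~\ref{fig:tl-ebm-tail-latency}.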