Diffstat (limited to 'chapters/tail-latency.tex')
| -rw-r--r-- | chapters/tail-latency.tex | 88 |
1 files changed, 88 insertions, 0 deletions
diff --git a/chapters/tail-latency.tex b/chapters/tail-latency.tex
index 9094e26..361dde0 100644
--- a/chapters/tail-latency.tex
+++ b/chapters/tail-latency.tex
@@ -2,3 +2,91 @@
\label{chap:tail-latency}
\section{Introduction}

\begin{figure}
\subfloat[Insertion Throughput]{\includegraphics[width=.5\textwidth]{img/design-space/isam-insert-dist.pdf} \label{fig:tl-btree-isam-tput}}
\subfloat[Insertion Latency Distribution]{\includegraphics[width=.5\textwidth]{img/design-space/vptree-insert-dist.pdf} \label{fig:tl-btree-isam-lat}} \\
\caption{Insertion Performance of Dynamized ISAM vs. B+Tree}
\label{fig:tl-btree-isam}
\end{figure}

Up to this point in our investigation, we have not directly addressed
one of the largest problems associated with dynamization: insertion
tail latency. While these techniques result in structures with
reasonable, or even good, insertion throughput, the latency of each
individual insert is wildly variable. To illustrate this problem,
consider the insertion performance in Figure~\ref{fig:tl-btree-isam},
which compares the insertion latencies of a dynamized ISAM tree with
those of its most direct dynamic analog: a B+Tree. As shown in
Figure~\ref{fig:tl-btree-isam-tput}, the dynamized structure has
average performance comparable to the native dynamic structure, but
the latency distributions are quite different.
Figure~\ref{fig:tl-btree-isam-lat} shows the two distributions. While
the dynamized structure has much better ``best-case'' performance, its
worst-case performance is exceedingly poor. That the structure exhibits
reasonable performance on average is the result of these two ends of
the distribution balancing each other out.

The reason for this poor tail latency is reconstruction. To provide
tight bounds on the number of shards within the structure, our
techniques must block inserts once the buffer has filled, until
sufficient room has been cleared in the structure to accommodate the
new records. This results in the worst-case insertion behavior that we
described mathematically in the previous chapter.

Unfortunately, the design space that we have considered thus far
offers little ability to meaningfully alter this worst-case insertion
performance. While the choice between leveling and tiering has some
effect, the benefit in terms of tail latency is small: leveling can
buy a modest reduction in the worst case, but at the cost of making
the majority of inserts slower through increased write amplification,
which is why it lags behind tiering in average insertion performance.

\begin{figure}
\subfloat[Scale Factor Sweep]{\includegraphics[width=.5\textwidth]{img/design-space/isam-insert-dist.pdf} \label{fig:tl-parm-sf}}
\subfloat[Buffer Size Sweep]{\includegraphics[width=.5\textwidth]{img/design-space/vptree-insert-dist.pdf} \label{fig:tl-parm-bs}} \\
\caption{Design Space Effects on Latency Distribution}
\label{fig:tl-parm-sweep}
\end{figure}

Additionally, the other tuning knobs available to us are of limited
use in controlling worst-case behavior. Figure~\ref{fig:tl-parm-sweep}
shows the latency distributions of our framework as we vary the scale
factor (Figure~\ref{fig:tl-parm-sf}) and the buffer size
(Figure~\ref{fig:tl-parm-bs}). There is no clear trend in worst-case
performance. This is to be expected: regardless of scale factor or
buffer size, the worst-case reconstruction is largely the same,
involving $\Theta(n)$ records. The choice of configuration parameters
can influence \emph{when} these reconstructions occur, and slightly
influence their size, but the question of ``which configuration has the
best tail-latency performance'' is ultimately more a question of how
many insertions the latency is measured over than of any fundamental
trade-off within the design space.
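The root of this insensitivity is visible in the shape of the insert
path itself. The sketch below is a C++-style illustration only: the
type names and interfaces are hypothetical rather than those of our
framework, and the reconstruction is collapsed into a single full merge
instead of the leveled or tiered layouts of
Chapter~\ref{chap:design-space}. It shows the essential structure of
the problem: most inserts are cheap buffer appends, but the insert that
fills the buffer blocks on a reconstruction whose worst-case cost is
determined by the total number of records rather than by the
configuration parameters.

\begin{verbatim}
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// A shard is an immutable sorted run of records, standing in
// for an ISAM-tree shard. (Illustrative sketch, not the
// framework's actual types.)
struct Shard {
    std::vector<int> records;
};

class DynamizedStructure {
public:
    explicit DynamizedStructure(std::size_t buffer_cap)
        : buffer_cap_(buffer_cap) {}

    // Most inserts are a cheap buffer append. The insert that
    // fills the buffer blocks while a reconstruction runs; in
    // the worst case that reconstruction rewrites every record
    // in the structure, i.e., Theta(n) work.
    void insert(int rec) {
        buffer_.push_back(rec);
        if (buffer_.size() >= buffer_cap_) {
            reconstruct();   // the source of the latency tail
        }
    }

private:
    // Simplified to a single full merge: combine the buffer and
    // all existing shards into one new shard. Leveling and
    // tiering change how often a merge of a given size occurs,
    // not the fact that a structure-wide merge eventually happens.
    void reconstruct() {
        Shard merged;
        for (const Shard &s : shards_) {
            merged.records.insert(merged.records.end(),
                                  s.records.begin(),
                                  s.records.end());
        }
        merged.records.insert(merged.records.end(),
                              buffer_.begin(), buffer_.end());
        std::sort(merged.records.begin(), merged.records.end());

        shards_.clear();
        shards_.push_back(std::move(merged));
        buffer_.clear();
    }

    std::size_t buffer_cap_;
    std::vector<int> buffer_;
    std::vector<Shard> shards_;
};
\end{verbatim}

Varying the buffer size or scale factor changes how often the
reconstruction path is taken and how large the intermediate
reconstructions are, but the occasional structure-wide merge remains,
and it is that merge which dominates the tail of the latency
distributions in Figure~\ref{fig:tl-parm-sweep}.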
\begin{example}
Consider two dynamized structures, $\mathscr{I}_A$ and $\mathscr{I}_B$,
with slightly different configurations. Regardless of the layout
policy used (of those discussed in Chapter~\ref{chap:design-space}),
the worst-case insertion will occur when the structure is completely
full, i.e., after
\begin{equation*}
n_\text{worst} = N_B + \sum_{i=0}^{\log_s n} N_B \cdot s^{i+1}
\end{equation*}
insertions. Let this be $n_a$ for $\mathscr{I}_A$ and $n_b$ for
$\mathscr{I}_B$, and let $\mathscr{I}_A$ be configured with scale
factor $s_a$ and $\mathscr{I}_B$ with scale factor $s_b$, such that
$s_a < s_b$. The differing scale factors mean that $n_a \neq n_b$, and
so the two structures reach their worst-case insertions at different
points in the insertion sequence, but in both cases that insertion
triggers a reconstruction involving every record currently in the
structure. Neither configuration meaningfully improves on the other's
tail latency; the choice merely shifts where the worst case is
observed.
\end{example}

The upshot of this discussion is that tail latencies are caused by the
worst-case reconstructions associated with this method, and that the
proposed design space does not provide the tools necessary to avoid or
reduce these costs.

\section{The Insertion-Query Trade-off}