| author | Douglas B. Rumbaugh <doug@douglasrumbaugh.com> | 2025-07-15 11:53:28 -0400 |
|---|---|---|
| committer | Douglas B. Rumbaugh <doug@douglasrumbaugh.com> | 2025-07-15 11:53:28 -0400 |
| commit | fe7842aa6177ad61b4ff6c97925918d02f1e72c0 (patch) | |
| tree | f12190ef1bec43a1e4a98b86884b8ffb536a1042 | |
| parent | 05aab7bd45e691a0b0f527d0ab4dd7cae0b3ec55 (diff) | |
| download | dissertation-fe7842aa6177ad61b4ff6c97925918d02f1e72c0.tar.gz | |
| -rw-r--r-- | chapters/tail-latency.tex | 50 |
| -rw-r--r-- | supplementary/nomenclature.tex | 1 |
2 files changed, 26 insertions, 25 deletions
diff --git a/chapters/tail-latency.tex b/chapters/tail-latency.tex index dbe867c..bba0081 100644 --- a/chapters/tail-latency.tex +++ b/chapters/tail-latency.tex @@ -346,8 +346,8 @@ according to the tiering policy. When a level contains $s$ blocks, a reconstruction will immediately be triggered to merge these blocks and push the result down to the next level. To ensure that the number of blocks in the structure remains bounded by $\Theta(\log_s n)$, we -will throttle the insertion rate by adding a stall time, $\delta$, to -each insert. $\delta$ will be determined such that it is sufficiently +will throttle the insertion rate by adding a stall time, $\gamma$, to +each insert. $\gamma$ will be determined such that it is sufficiently large to ensure that any scheduled reconstructions have enough time to complete before the shard count on any level exceeds $s$. This process is summarized in Algorithm~\ref{alg:tl-relaxed-recon}. Note that @@ -360,10 +360,10 @@ the appropriate amount of stalling occurs. \begin{algorithm} \caption{Insertion Algorithm with Stalling} \label{alg:tl-relaxed-recon} -\KwIn{$r$: a record to be inserted, $\mathscr{I} = (\mathcal{B}, \mathscr{L}_0 \ldots \mathscr{L}_\ell)$: a dynamized structure, $\delta$: insertion stall amount} +\KwIn{$r$: a record to be inserted, $\mathscr{I} = (\mathcal{B}, \mathscr{L}_0 \ldots \mathscr{L}_\ell)$: a dynamized structure, $\gamma$: insertion stall amount} \Comment{Stall insertion process by specified amount} -sleep($\delta$) \; +sleep($\gamma$) \; \BlankLine \Comment{Append to the buffer if possible} \If {$|\mathcal{B}| < N_B$} { @@ -411,8 +411,8 @@ record counts--each level has an increasing number of records per block.}} \end{figure} To ensure the correctness of this algorithm, it is necessary to show -that there exists a value for $\delta$ that ensures that the structural -invariants can be maintained. 
Logically, this $\delta$ can be thought +that there exists a value for $\gamma$ that ensures that the structural +invariants can be maintained. Logically, this $\gamma$ can be thought of as the amount of time needed to perform the active reconstruction operation, amortized over the inserts between when this reconstruction can be scheduled, and when it needs to be complete. We'll consider how @@ -452,13 +452,13 @@ to ensure that no more than $s$ shards exist on the last level. Assume that all inserts run on a single thread that can be scheduled alongside the reconstructions, and let each insert have a cost of \begin{equation*} -I(n) \in \Theta(1 + \delta) +I(n) \in \Theta(1 + \gamma) \end{equation*} -where $1$ is the cost of appending to the buffer, and $\delta$ +where $1$ is the cost of appending to the buffer, and $\gamma$ is a calculated stall time. During the stalling, the insert thread will be idle and reconstructions can be run on the execution unit. To ensure the last-level reconstruction is complete by the time that -$\Theta(n)$ inserts have finished, it is necessary that $\delta \in +$\Theta(n)$ inserts have finished, it is necessary that $\gamma \in \Theta\left(\frac{B(n)}{n}\right)$. However, this amount of stall is insufficient to maintain exactly $s$ @@ -473,7 +473,7 @@ for the time to complete these reconstructions as well. 
In the worst-case, there will be one active reconstruction on each of the $\log_s n$ levels, and thus we must introduce stalls such that, \begin{equation*} -I(n) \in \Theta(1 + \delta_0 + \delta_1 + \ldots \delta_{\log n - 1}) +I(n) \in \Theta(1 + \gamma_0 + \gamma_1 + \ldots \gamma_{\log n - 1}) \end{equation*} All of these internal reconstructions will be strictly less than the size of the last-level reconstruction, and so we can bound them all above by @@ -516,7 +516,7 @@ To see why this is important, consider an implementation that, contrary to Theorem~\ref{theo:worst-case-optimal}, only stalls enough to cover the last-level reconstruction. All other reconstructions are blocked until the last-level one has been completed. This approach would -result in $\delta = \frac{B(n)}{n}$ stall and complete the last +result in $\gamma = \frac{B(n)}{n}$ stall and complete the last level reconstruction after $\Theta(n)$ inserts. During this time, $\Theta(\frac{n}{N_B})$ blocks would accumulate in L0, ultimately resulting in a bound of $\Theta(n)$ blocks in the structure, rather than @@ -641,18 +641,18 @@ level $i$, divided by the number of inserts that can occur before the reconstruction must be done (i.e., the capacity of the index above this point). This gives, \begin{equation*} -\delta_i \in O\left( \frac{B(N_B \cdot s^{i+1})}{\sum_{j=0}^{i-1} N_B\cdot s^{j+1}} \right) +\gamma_i \in O\left( \frac{B(N_B \cdot s^{i+1})}{\sum_{j=0}^{i-1} N_B\cdot s^{j+1}} \right) \end{equation*} stall for each level. 
Noting that $s > 1$, $s \in \Theta(1)$, and that the denominator is the sum of a geometric progression, we have \begin{align*} -\delta_i \in &O\left( \frac{B(N_B\cdot s^{i+1})}{s\cdot N_B \sum_{j=0}^{i-1} s^{j}} \right) \\ - &O\left( \frac{(1-s) B(N_B\cdot s^{i+1})}{N_B\cdot (s - s^{i+1})} \right) \\ - &O\left( \frac{B(N_B\cdot s^{i+1})}{N_B \cdot s^{i+1}}\right) +\gamma_i \in &~O\left( \frac{B(N_B\cdot s^{i+1})}{s\cdot N_B \sum_{j=0}^{i-1} s^{j}} \right) \\ + &~O\left( \frac{(1-s) B(N_B\cdot s^{i+1})}{N_B\cdot (s - s^{i+1})} \right) \\ + &~O\left( \frac{B(N_B\cdot s^{i+1})}{N_B \cdot s^{i+1}}\right) \end{align*} For $B(n) \in \Omega(n)$, the numerator of the fraction will grow at -least as rapidly as the denominator, meaning that $\delta_\ell$ will +least as rapidly as the denominator, meaning that $\gamma_\ell$ will always be the largest. Thus, the stall necessary to cover the last-level reconstruction will be at least as much as is necessary for the internal reconstructions. @@ -1080,7 +1080,7 @@ that arise from direct throughput monitoring, and has a few additional benefits. It is based on a single parameter that can be readily updated on demand using atomics. Our current prototype uses a single, fixed value for the probability, but ultimately it should be dynamically tuned to -approximate the $\delta$ value from Theorem~\ref{theo:worst-case-optimal} +approximate the $\gamma$ value from Theorem~\ref{theo:worst-case-optimal} as closely as possible. It also doesn't require significant modification of the existing client interfaces, and can easily support multiple threads of insertion without needing an explicit serialization process to ensure @@ -1129,16 +1129,16 @@ tests with 32 background threads on a system with 40 physical cores to ensure sufficient resources to fully parallelize all reconstructions (we'll consider resource constrained situations later). 
-We tested -ISAM tree with the 200 million record SOSD \texttt{OSM} +We tested ISAM tree with the 200 million record SOSD \texttt{OSM} dataset~\cite{sosd-datasets}, as well as VPTree with the one million, 300-dimensional, \texttt{SBW} dataset~\cite{sbw}. For each test, we inserted $30\%$ of the records to warm up the structure, and then -measured the individual latency of each insert after that. We measured -the count of shards in the structure each time the buffer flushed -(including during the warmup period). Note that a rejection rate of -$1$ indicates no stalling at all, and values less than one indicate -$1 - \delta$ probability of an insert being rejected, after which the -insert thread sleeps for about a microsecond. A lower rejection rate means +measured the individual latency of each insert after that. We measured the +count of shards in the structure each time the buffer flushed (including +during the warmup period). Note that a rejection rate, $r$, of $r = +1$ indicates no stalling at all, and values less than one indicate $1 +- r$ probability of an insert being rejected, after which the insert +thread sleeps for about a microsecond. A lower rejection rate means more stalls are introduced. The tiering policy is strict tiering with a scale factor of $s=6$ using the concurrency control scheme described in Section~\ref{ssec:dyn-concurrency}. @@ -1360,7 +1360,7 @@ relatively small. \begin{figure} \centering \subfloat[Insertion Throughput vs. 
Query Latency]{\includegraphics[width=.5\textwidth]{img/tail-latency/recon-thread-scale.pdf} \label{fig:tl-latency-threads}} -\subfloat[Maximum Insertion Throughput for a Given Query Latency]{\includegraphics[width=.5\textwidth]{img/tail-latency/constant-query.pdf} \label{fig:tl-query-scaling}} \\ +\subfloat[Maximum Insertion Throughput for Query Latency Target]{\includegraphics[width=.5\textwidth]{img/tail-latency/constant-query.pdf} \label{fig:tl-query-scaling}} \\ \caption{Framework Thread Scaling} \label{fig:tl-threads} diff --git a/supplementary/nomenclature.tex b/supplementary/nomenclature.tex index 2846491..848a615 100644 --- a/supplementary/nomenclature.tex +++ b/supplementary/nomenclature.tex @@ -31,5 +31,6 @@ $\mathscr{L}_i$ & The $i$th level in a decomposition \\ \hline $\mathcal{V}_i$ & The $i$th version of a decomposed structure \\ \hline $f(n)$ & Number of blocks in an equal block size decomposition \\ \hline + $\gamma$ & Amount of extra delay added to insertions \\ \hline \end{tabular} \end{center} |
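The hunk at line 1129 of chapters/tail-latency.tex describes the rejection rate $r$: at $r = 1$ no stalling occurs, and for $r < 1$ an insert is rejected with probability $1 - r$, after which the insert thread sleeps for about a microsecond. A minimal sketch of that behaviour follows. It is not taken from the commit or the prototype: the names `try_insert` and `STALL_SECONDS` are hypothetical, a plain list stands in for the mutable buffer, and buffer capacity and flushing are ignored.

```python
import random
import time

# Assumed stall length: the text only says "about a microsecond".
STALL_SECONDS = 1e-6


def try_insert(buffer, record, r, rng=random.random, sleep=time.sleep):
    """Attempt one insert under a rejection rate r in [0, 1].

    With probability 1 - r the insert is rejected and the calling
    thread sleeps briefly; otherwise the record is appended.
    Returns True if the record was appended, False if rejected.
    """
    # rng() is drawn from [0, 1), so r = 1 never rejects.
    if rng() >= r:
        sleep(STALL_SECONDS)
        return False
    buffer.append(record)
    return True


def insert_with_retry(buffer, record, r):
    """Retry until the insert is accepted (requires r > 0)."""
    while not try_insert(buffer, record, r):
        pass
```

Lowering `r` makes rejections, and therefore stalls, more frequent, which matches the statement in the diff that a lower rejection rate introduces more stalls. The `rng` and `sleep` parameters exist only so the behaviour can be checked deterministically.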