Diffstat (limited to 'chapters')
| -rw-r--r-- | chapters/design-space.tex | 20 |
| -rw-r--r-- | chapters/tail-latency.tex | 30 |
2 files changed, 40 insertions, 10 deletions
diff --git a/chapters/design-space.tex b/chapters/design-space.tex
index 32fe546..32d9b9c 100644
--- a/chapters/design-space.tex
+++ b/chapters/design-space.tex
@@ -66,7 +66,7 @@ involves adjusting constants, we will leave the design-space related constants within our asymptotic expressions. Additionally, we will perform the analysis for a simple decomposable search problem. Deletes will be entirely neglected, and we won't make any assumptions about
-mergability. We will also neglect the buffer size, $N_B$, during this
+mergeability. We will also neglect the buffer size, $N_B$, during this
 analysis. Buffering isn't fundamental to the techniques we are examining in this chapter, and including it would increase the complexity of the analysis without contributing any useful insights.\footnote{
@@ -709,14 +709,14 @@ throughput for the three policies for both ISAM Tree and VPTree. This result should correlate with the amortized insertion costs for each policy derived in Section~\ref{sec:design-asymp}. At a scale factor of $s=2$, all three policies have similar insertion performance. This makes
-sense, as both leveling and Bentley-Saxe experience write-amplificiation
-proprotional to the scale factor, and at $s=2$ this isn't significantly
-larger than tiering's write amplificiation, particularly compared
+sense, as both leveling and Bentley-Saxe experience write-amplification
+proportional to the scale factor, and at $s=2$ this isn't significantly
+larger than tiering's write amplification, particularly compared
 to the other factors influencing insertion performance, such as reconstruction time. However, for larger scale factors, tiering shows \emph{significantly} higher insertion throughput, and Leveling and Bentley-Saxe show greatly degraded performance due to the large amount
-of additional write amplification. These reuslts are perfectly in line
+of additional write amplification. These results are perfectly in line
 with the mathematical analysis of the previous section.
 \subsection{General Insert vs. Query Trends}
@@ -758,7 +758,7 @@ performance degrades linearly with scale factor, and this is well demonstrated in the plot. The Bentley-Saxe method appears to follow a very similar trend to that
-of leveling, albiet with even more dramatic performance degredation as
+of leveling, albeit with even more dramatic performance degradation as
 the scale factor is increased. Generally it seems to be a strictly worse alternative to leveling in all but its best-case query cost, and we will omit it from our tests moving forward as a result.
@@ -793,21 +793,21 @@ also tested $k$-NN queries with varying values of $k$.
 \end{figure}
 Interestingly, for the range of selectivities tested for range counts, the
-overall query latency failed to converge, and there remains a consistant,
-albiet slight, stratification amongst the tested policies, as shown in
+overall query latency failed to converge, and there remains a consistent,
+albeit slight, stratification amongst the tested policies, as shown in
 Figure~\ref{fig:design-isam-sel}. As the selectivity continues to rise above those shown in the chart, the relative ordering of the policies remains the same, but the relative differences between them begin to shrink. This result makes sense given the asymptotics--there is still \emph{some} overhead associated with the decomposition, but as the cost of the query approaches linear, it makes up an increasingly irrelevant
-portion of the runtime.
+portion of the run time.
 The $k$-NN results in Figure~\ref{fig:design-knn-sel} show a slightly different story. This is also not surprising, because $k$-NN is a $C(n)$-decomposable problem, and the cost of result combination grows with $k$. Thus, larger $k$ values will \emph{increase} the effect that
-the decomposition has on the query runtime, unlike was the case in the
+the decomposition has on the query run time, unlike was the case in the
 range count queries, where the total cost of the combination is constant.
 % \section{Asymptotically Relevant Trade-offs}
diff --git a/chapters/tail-latency.tex b/chapters/tail-latency.tex
index 0cdeeab..e63f3c9 100644
--- a/chapters/tail-latency.tex
+++ b/chapters/tail-latency.tex
@@ -960,3 +960,33 @@ at a variety of stall proportions.
 \section{Conclusion}
+
+In this section, we addressed the final of the three major problems of
+dynamization: tail latency. We proposed a technique for limiting the
+rate of insertions to match the rate of reconstruction that is able to
+match the worst-case optimized approach of Overmars~\cite{overmars81} on
+a single thread, and able to exceed it given multiple parallel threads.
+We then implemented the necessary mechanisms to support this technique
+within our framework, including a significantly improved architecture
+for scheduling and executing parallel and background reconstructions,
+and a system for rate limiting by rejecting inserts via Bernoulli sampling.
+
+We evaluated this system for fixed insertion rejection rates, and found
+significant improvements in tail latencies, approaching the practical lower
+bound we established using the equal block method, without requiring
+significant degradation of query performance. In fact, we found that
+this rate limiting mechanism provides a design space with more effective
+trade-offs than the one we examined in Chapter~\ref{chap:design-space},
+with the system being able to exceed the query performance of an
+equivalently configured tiering system for certain rate limiting
+configurations. The method has limitations, assigning a fixed rejection
+rate of inserts works well for linear time constructable structures like
+the ISAM Tree, but was significantly less effective for the VPTree, which
+requires $\Theta(n \log n)$ time to construct. For structures like this,
+it will be necessary to dynamically scale the amount of throttling based
+on the record count and size of reconstruction. Additionally, our current
+system isn't easily capable of reaching the ``ideal'' goal of being able
+to reliably trade query performance and insertion latency at a fixed
+throughput. Nonetheless, the mechanisms for supporting such features
+are present, and even this simple implementation represents a marked
+improvement in terms of both insertion tail latency and configurability.
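
The added conclusion mentions rate limiting by rejecting inserts via Bernoulli sampling at a fixed rejection rate. A minimal sketch of that idea follows, for illustration only: the names `try_insert`, `insert_with_retry`, and `reject_prob` are hypothetical and are not taken from the framework in this commit, and a plain list stands in for the real buffer. It assumes a rejected insert is simply retried by the caller.

```python
import random


def try_insert(buffer, record, reject_prob, rng):
    """Attempt one insert, refusing it with probability reject_prob.

    Each call is an independent Bernoulli trial. `buffer` is a plain list
    standing in for the real structure (an assumption of this sketch).
    """
    if rng.random() < reject_prob:
        return False  # rejected: the caller is expected to retry
    buffer.append(record)
    return True


def insert_with_retry(buffer, record, reject_prob, rng):
    """Retry until the insert is accepted; return the number of attempts."""
    if not 0.0 <= reject_prob < 1.0:
        # A rejection probability of 1 would never accept anything.
        raise ValueError("reject_prob must be in [0, 1)")
    attempts = 1
    while not try_insert(buffer, record, reject_prob, rng):
        attempts += 1
    return attempts


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make runs repeatable
    buf = []
    total = sum(insert_with_retry(buf, i, 0.5, rng) for i in range(10000))
    print(len(buf), total / len(buf))
```

Under these assumptions the number of attempts per accepted insert is geometrically distributed with mean 1 / (1 - reject_prob), so a fixed rejection rate scales the effective insertion rate by a constant factor regardless of how large the pending reconstruction is.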