| author | Douglas Rumbaugh <dbr4@psu.edu> | 2025-05-12 19:59:26 -0400 |
|---|---|---|
| committer | Douglas Rumbaugh <dbr4@psu.edu> | 2025-05-12 19:59:26 -0400 |
| commit | 5ffc53e69e956054fdefd1fe193e00eee705dcab (patch) | |
| tree | 74fd32db95211d0be067d22919e65ac959e4fa46 /chapters/sigmod23/exp-parameter-space.tex | |
| parent | 901a04fd8ec9a07b7bd195517a6d9e89da3ecab6 (diff) | |
Updates
Diffstat (limited to 'chapters/sigmod23/exp-parameter-space.tex')
| -rw-r--r-- | chapters/sigmod23/exp-parameter-space.tex | 205 |
1 files changed, 128 insertions, 77 deletions
diff --git a/chapters/sigmod23/exp-parameter-space.tex b/chapters/sigmod23/exp-parameter-space.tex
index d2057ac..d53c592 100644
--- a/chapters/sigmod23/exp-parameter-space.tex
+++ b/chapters/sigmod23/exp-parameter-space.tex
@@ -1,105 +1,156 @@
-\subsection{Framework Design Space Exploration} +\subsection{Design Space Exploration} \label{ssec:ds-exp} -The proposed framework brings with it a large design space, described in -Section~\ref{ssec:design-space}. First, this design space will be examined -using a standardized benchmark to measure the average insertion throughput and -sampling latency of DE-WSS at several points within this space. Tests were run -using a random selection of 500 million records from the OSM dataset, with the -index warmed up by the insertion of 10\% of the total records prior to -beginning any measurement. Over the course of the insertion period, 5\% of the -records were deleted, except for the tests in -Figures~\ref{fig:insert_delete_prop}, \ref{fig:sample_delete_prop}, and -\ref{fig:bloom}, in which 25\% of the records were deleted. Reported update -throughputs were calculated using both inserts and deletes, following the -warmup period. The standard values -used for parameters not being varied in a given test were $s = 6$, $N_b = -12000$, $k=1000$, and $\delta = 0.05$, with buffer rejection sampling. +Our proposed framework has a large design space, which we briefly +described in Section~\ref{ssec:design-space}. The contents of this +space will be described in much more detail in Chapter~\ref{chap:design-space}, +but as part of this work we did perform an experimental examination of our +framework to compare insertion throughput and query latency over various +points within the space. + +We examined this design space by considering \texttt{DE-WSS} specifically, +using a random sample of $500,000,000$ records from the \texttt{OSM} +dataset.
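Concretely, the measurement protocol described here amounts to the following sketch. This is a hedged illustration only: `DEWSS` below is a hypothetical stand-in (a plain set) rather than the actual sampling structure, and for simplicity the deletes target only warmed-up records.

```python
import random
import time

class DEWSS:
    """Purely hypothetical stand-in for the dynamized WSS structure;
    it supports only the operations the benchmark driver needs."""
    def __init__(self):
        self.records = set()

    def insert(self, rec):
        self.records.add(rec)

    def delete(self, rec):
        self.records.discard(rec)

def benchmark(records, warmup_frac=0.10, delete_frac=0.05, seed=42):
    rng = random.Random(seed)
    structure = DEWSS()

    # Warm the structure up with 10% of the records before measuring.
    n_warmup = int(len(records) * warmup_frac)
    for rec in records[:n_warmup]:
        structure.insert(rec)

    # Randomly intermix deletes of 5% of the data with the remaining
    # inserts (simplification: only warmed-up records are deleted).
    deletes = rng.sample(records[:n_warmup], int(len(records) * delete_frac))
    ops = [("insert", r) for r in records[n_warmup:]]
    ops += [("delete", r) for r in deletes]
    rng.shuffle(ops)

    start = time.perf_counter()
    for op, rec in ops:
        (structure.insert if op == "insert" else structure.delete)(rec)
    elapsed = time.perf_counter() - start

    # Update throughput counts both inserts and deletes, post-warmup.
    return len(ops) / elapsed
```

As in the text, the warmup operations are excluded from the timed window, and the reported throughput covers the mixed insert/delete stream on a single thread.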
Prior to taking any measurements, we warmed the structure up by +inserting 10\% of the total records in the set. We then measured the +update throughput over the course of the insertion of the remaining +records, randomly intermixing delete operations of 5\% of the +total data. In the tests for Figures~\ref{fig:insert_delete_prop}, +\ref{fig:sample_delete_prop}, and \ref{fig:bloom}, we instead deleted +25\% of the data. + +The reported update throughputs were calculated based on all of the +inserts and deletes following the warmup, executed on a single thread. +Query latency numbers were measured after all of the inserts and +deletes had been completed. We used standardized values of $s = 6$, +$N_b = 12000$, $k = 1000$, and $\delta = 0.05$ for parameters not being +varied in a given test, and all buffer queries were answered using +rejection sampling. We show the results of this testing in +Figures~\ref{fig:parameter-sweeps1}, \ref{fig:parameter-sweeps2}, and +\ref{fig:parameter-sweeps3}. \begin{figure*} \centering \subfloat[Insertion Throughput vs. Mutable Buffer Capacity]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-mt-insert} \label{fig:insert_mt}} - \subfloat[Insertion Throughput vs. Scale Factor]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-sf-insert} \label{fig:insert_sf}} \\ + \subfloat[Insertion Throughput vs.
Scale Factor]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-sf-insert} \label{fig:insert_sf}} \\ - \subfloat[Insertion Throughput vs.\\Max Delete Proportion]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-tp-insert} \label{fig:insert_delete_prop}} - \subfloat[Per 1000 Sampling Latency vs.\\Mutable Buffer Capacity]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-mt-sample} \label{fig:sample_mt}} \\ + \subfloat[Per 1000 Sampling Latency vs.\\Mutable Buffer Capacity]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-mt-sample} \label{fig:sample_mt}} + \subfloat[Per 1000 Sampling Latency vs. Scale Factor]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-sf-sample} \label{fig:sample_sf}} - \caption{DE-WSS Design Space Exploration I} + \caption{DE-WSS Design Space Exploration: Major Parameters} \label{fig:parameter-sweeps1} \end{figure*} -The results of this testing are displayed in -Figures~\ref{fig:parameter-sweeps1},~\ref{fig:parameter-sweeps2},~and:wq~\ref{fig:parameter-sweeps3}. -The two largest contributors to differences in performance were the selection -of layout policy and of delete policy. Figures~\ref{fig:insert_mt} and -\ref{fig:insert_sf} show that the choice of layout policy plays a larger role -than delete policy in insertion performance, with tiering outperforming -leveling in both configurations. The situation is reversed in sampling -performance, seen in Figure~\ref{fig:sample_mt} and \ref{fig:sample_sf}, where -the performance difference between layout policies is far less than between -delete policies. +We first note that the two largest contributors to performance +differences across all of the tests were the selection of layout and delete +policy.
In particular, Figures~\ref{fig:insert_mt} and \ref{fig:insert_sf} +demonstrate that layout policy plays a very significant role in insertion +performance, with tiering outperforming leveling for both delete +policies. The next largest effect was the delete policy selection, +with tombstone deletes outperforming tagged deletes in insertion +performance. This result aligns with the asymptotic analysis of the two +approaches in Section~\ref{sampling-deletes}. It is interesting to note, +however, that the effect of layout policy was more significant in these +particular tests,\footnote{ + Although the largest performance gap in absolute terms was between + tiering with tombstones and tiering with tagging, the selection of + delete policy was not enough to overcome the relative difference + between leveling and tiering in these tests, hence our labeling of the + layout policy as more significant. +} despite both layout policies having the same asymptotic performance. +This was likely due to the small number of deletes (only 5\% of the total +operations) reducing their effect on the overall throughput. -The values used for the scale factor and buffer size have less influence than -layout and delete policy. Sampling performance is largely independent of them -over the ranges of values tested, as shown in Figures~\ref{fig:sample_mt} and -\ref{fig:sample_sf}. This isn't surprising, as these parameters adjust the -number of shards, which only contributes to shard alias construction time -during sampling and is is amortized over all samples taken in a query. The -buffer also contributes rejections, but the cost of a rejection is small and -the buffer constitutes only a small portion of the total weight, so these are -negligible. However, under tombstones there is an upward trend in latency with -buffer size, as delete checks occasionally require a full buffer scan. The -effect of buffer size on insertion is shown in Figure~\ref{fig:insert_mt}.
-{ There is only a small improvement in insertion performance as the mutable -buffer grows. This is because a larger buffer results in fewer reconstructions, -but these reconstructions individually take longer, and so the net positive -effect is less than might be expected.} Finally, Figure~\ref{fig:insert_sf} -shows the effect of scale factor on insertion performance. As expected, tiering -performs better with higher scale factors, whereas the insertion performance of -leveling trails off as the scale factor is increased, due to write -amplification. +The influence of scale factor on update performance is shown in +Figure~\ref{fig:insert_sf}. The effect is different depending on the +layout policy, with larger scale factors benefitting update performance +under tiering, and hurting it under leveling. The effect of the mutable +buffer size on insertion, shown in Figure~\ref{fig:insert_mt}, is a little +less clear, but does show a slight upward trend, with larger buffers +enhancing update performance in all cases. A larger buffer results in +fewer reconstructions, but increases the size of these reconstructions, +so the effect isn't as large as one might initially expect. + +Query performance follows broadly opposite trends to updates. We see in +Figures~\ref{fig:sample_sf} and \ref{fig:sample_mt} that query latency +is better under leveling than tiering, and that tagging is better than +tombstones. More interestingly, the relative effect of the two decisions +is also different. Here, the selection of delete policy has a larger +effect than layout policy, in the sense that the better layout policy +(leveling) with the worse delete policy (tombstones) loses to the worse +layout policy (tiering) with the better delete policy (tagging). In fact, +under tagging, the performance difference between the two layout policies +is almost indistinguishable. + +Scale factor, shown in Figure~\ref{fig:sample_sf}, has very little +effect on query performance.
Thus, in this context, it would appear +that the scale factor is primarily useful as an insertion performance +tuning tool. The mutable buffer size, in Figure~\ref{fig:sample_mt}, +also generally has no clear effect. This is expected, because the buffer +contains only a small number of records relative to the entire dataset, +and so has a fairly low probability of being selected to draw +a sample from. Even when it is selected, rejection sampling is very +inexpensive. The one exception to this trend is when using tombstones, +where the query performance degrades as the buffer size grows. This is +because the rejection check process for tombstones requires doing a full +buffer scan for every sample in some cases. \begin{figure*} \centering - \subfloat[Per 1000 Sampling Latency vs. Scale Factor]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-sf-sample} \label{fig:sample_sf}} + \subfloat[Insertion Throughput vs.\\Max Delete Proportion]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-tp-insert} \label{fig:insert_delete_prop}} \subfloat[Per 1000 Sampling Latency vs. Max Delete Proportion]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-tp-sample}\label{fig:sample_delete_prop}} \\ - \caption{DE-WSS Design Space Exploration II} + \caption{DE-WSS Design Space Exploration: Delete Bounding} \label{fig:parameter-sweeps2} \end{figure*} -Figures~\ref{fig:insert_delete_prop} and \ref{fig:sample_delete_prop} show the -cost of maintaining $\delta$ with a base delete rate of 25\%. The low cost of -an in-memory sampling rejection results in only a slight upward trend in the -sampling latency as the number of deleted records increases. While compaction -is necessary to avoid pathological cases, there does not seem to be a -significant benefit to aggressive compaction thresholds. -Figure~\ref{fig:insert_delete_prop} shows the effect of compactions on insert -performance.
There is little effect on performance under tagging, but there is -a clear negative performance trend associated with aggressive compaction when -using tombstones. Under tagging, a single compaction is guaranteed to remove -all deleted records on a level, whereas with tombstones a compaction can -cascade for multiple levels before the delete bound is satisfied, resulting in -a larger cost per incident. +We also considered the effect that bounding the proportion of deleted +records within the structure has on performance. In these tests, +25\% of all records were eventually deleted over the course of the +benchmark. Figure~\ref{fig:sample_delete_prop} shows the effect +that maintaining these bounds has on query performance. In our +testing, we saw very little query performance benefit from maintaining +more aggressive bounds on deletes. This is likely because +the cost of rejecting is relatively small in our query model. It +does have a clear effect on insertion performance, though, as shown +in Figure~\ref{fig:insert_delete_prop}. Under tagging, the cost of +maintaining increasingly tight bounds on deleted records is small, likely +because all deleted records can be dropped by a single reconstruction. +This means both that a violation of the bound can be resolved in a single +compaction, and also that violations of the bound are much less likely to +occur, as each reconstruction removes all deleted records. Tombstone-based +deletes require far more work to remove from the structure, and so we +would expect to see a degradation of insertion performance. Interestingly, +we see the opposite: tighter bounds result in improved performance. This is +because the sheer volume of deleted records has a measurable effect +on the size of the dynamized structure. The more proactive compactions +prune these records, resulting in better performance. \begin{figure*} \centering \subfloat[Sampling Latency vs.
Sample Size]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-samplesize} \label{fig:sample_k}} \subfloat[Per 1000 Sampling Latency vs. Bloom Filter Memory]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-bloom}\label{fig:bloom}} \\ - \caption{DE-WSS Design Space Exploration III} + \caption{DE-WSS Design Space Exploration: Misc.} \label{fig:parameter-sweeps3} \end{figure*} -Figure~\ref{fig:bloom} demonstrates the trade-off between memory usage for -Bloom filters and sampling performance under tombstones. This test was run -using 25\% incoming deletes with no compaction, to maximize the number of -tombstones within the index as a worst-case scenario. As expected, allocating -more memory to Bloom filters, decreasing their false positive rates, -accelerates sampling. Finally, Figure~\ref{fig:sample_k} shows the relationship -between average per sample latency and the sample set size. It shows the effect -of amortizing the initial shard alias setup work across an increasing number of -samples, with $k=100$ as the point at which latency levels off. +Finally, we consider two more parameters: memory usage for Bloom filters +and the effect of sample set size on query latency. Figure~\ref{fig:bloom} +shows the trade-off between memory allocated to filters and sampling +performance when tombstones are used. Recall that these Bloom filters +are specifically used for tombstones, not for general records, and +are used to accelerate rejection checks of sampled records. In this +test, 25\% of all records were deleted and $\delta$ was set to 0 to +disable all proactive compaction, to present a worst-case scenario in +terms of tombstones. Allocating additional memory to the Bloom filters +decreases their false positive rates and results in better sampling +performance.
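The role of these tombstone filters in the rejection check can be illustrated with a small sketch. The filter below is a generic textbook Bloom filter, not the dissertation's implementation, and `is_deleted` with its shard layout is hypothetical: the filter is consulted first, and the expensive tombstone lookup happens only on a filter hit.

```python
import hashlib

class BloomFilter:
    """Tiny illustrative Bloom filter using double hashing over blake2b."""
    def __init__(self, m_bits, k_hashes):
        self.m, self.k = m_bits, k_hashes
        self.bits = 0  # bit array stored as one big integer

    def _hashes(self, key):
        d = hashlib.blake2b(str(key).encode()).digest()
        h1 = int.from_bytes(d[:8], "little")
        h2 = int.from_bytes(d[8:16], "little") | 1  # odd, so strides differ
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, key):
        for h in self._hashes(key):
            self.bits |= 1 << h

    def might_contain(self, key):
        # No false negatives; false positives shrink as m_bits grows.
        return all(self.bits >> h & 1 for h in self._hashes(key))

def is_deleted(record, shards_with_filters):
    """Rejection check for a sampled record: consult each shard's
    tombstone filter first, and only do the real tombstone lookup
    on a filter hit."""
    for tombstones, bf in shards_with_filters:
        if bf.might_contain(record) and record in tombstones:
            return True
    return False
```

Allocating more bits per tombstone lowers the false-positive rate, so fewer live samples pay for a pointless tombstone lookup; that is the trade-off swept in Figure~\ref{fig:bloom}.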
Finally, Figure~\ref{fig:sample_k} compares the sample set +size and the average latency of drawing a single sample, to demonstrate +the ability of our procedure to amortize the preliminary work across +multiple samples in a sample set. After a sample set size of $k=100$, +we stop seeing a benefit from increasing the size, indicating the limit +of how much the preliminary work can be effectively amortized. -Based upon these results, a set of parameters was established for the extended -indexes, which is used in the next section for baseline comparisons. This -standard configuration uses tagging as the delete policy and tiering as the -layout policy, with $k=1000$, $N_b = 12000$, $\delta = 0.05$, and $s = 6$. +Based upon the results of this preliminary study, we established a set +of standardized parameters to use for the baseline comparisons in the +remainder of this section. We will use tagging for deletes, tiering as +the layout policy, $k=1000$, $N_b = 12000$, $\delta = 0.05$, and $s = +6$, unless otherwise stated.
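The amortization behavior measured in Figure~\ref{fig:sample_k} can be sketched with Walker's alias method, the kind of "shard alias" structure referenced above. The Python below is an illustrative rendering, not the actual implementation: the $O(n)$ table build is the preliminary work, and each of the $k$ subsequent draws is $O(1)$, so the per-sample share of the build cost falls off as $k$ grows.

```python
import random

def build_alias(weights):
    """Walker's alias method: O(n) preprocessing for O(1) weighted draws."""
    n = len(weights)
    total = sum(weights)
    prob = [w * n / total for w in weights]  # scaled so the mean is 1.0
    alias = [0] * n
    small = [i for i, p in enumerate(prob) if p < 1.0]
    large = [i for i, p in enumerate(prob) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        alias[s] = l                 # overflow from l fills s's slot
        prob[l] -= 1.0 - prob[s]
        (small if prob[l] < 1.0 else large).append(l)
    return prob, alias

def sample_set(weights, k, rng=random):
    """Draw a sample set of size k; the alias build is the preliminary
    work, paid once and amortized across all k samples."""
    prob, alias = build_alias(weights)   # O(n) setup
    out = []
    for _ in range(k):                   # each draw is O(1)
        i = rng.randrange(len(weights))
        out.append(i if rng.random() < prob[i] else alias[i])
    return out
```

With per-draw cost constant, total query latency is roughly $c_{\text{setup}} + k \cdot c_{\text{draw}}$, which is consistent with the per-sample latency flattening out once $k$ is large enough for the setup term to become negligible.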