| author | Douglas Rumbaugh <dbr4@psu.edu> | 2025-05-12 19:59:26 -0400 |
|---|---|---|
| committer | Douglas Rumbaugh <dbr4@psu.edu> | 2025-05-12 19:59:26 -0400 |
| commit | 5ffc53e69e956054fdefd1fe193e00eee705dcab (patch) | |
| tree | 74fd32db95211d0be067d22919e65ac959e4fa46 /chapters/sigmod23/exp-parameter-space.tex | |
| parent | 901a04fd8ec9a07b7bd195517a6d9e89da3ecab6 (diff) | |
Updates
Diffstat (limited to 'chapters/sigmod23/exp-parameter-space.tex')
| -rw-r--r-- | chapters/sigmod23/exp-parameter-space.tex | 205 |
1 files changed, 128 insertions, 77 deletions
diff --git a/chapters/sigmod23/exp-parameter-space.tex b/chapters/sigmod23/exp-parameter-space.tex
index d2057ac..d53c592 100644
--- a/chapters/sigmod23/exp-parameter-space.tex
+++ b/chapters/sigmod23/exp-parameter-space.tex
@@ -1,105 +1,156 @@
-\subsection{Framework Design Space Exploration} +\subsection{Design Space Exploration} \label{ssec:ds-exp} -The proposed framework brings with it a large design space, described in -Section~\ref{ssec:design-space}. First, this design space will be examined -using a standardized benchmark to measure the average insertion throughput and -sampling latency of DE-WSS at several points within this space. Tests were run -using a random selection of 500 million records from the OSM dataset, with the -index warmed up by the insertion of 10\% of the total records prior to -beginning any measurement. Over the course of the insertion period, 5\% of the -records were deleted, except for the tests in -Figures~\ref{fig:insert_delete_prop}, \ref{fig:sample_delete_prop}, and -\ref{fig:bloom}, in which 25\% of the records were deleted. Reported update -throughputs were calculated using both inserts and deletes, following the -warmup period. The standard values -used for parameters not being varied in a given test were $s = 6$, $N_b = -12000$, $k=1000$, and $\delta = 0.05$, with buffer rejection sampling. +Our proposed framework has a large design space, which we briefly +described in Section~\ref{ssec:design-space}. The contents of this +space will be described in much more detail in Chapter~\ref{chap:design-space}, +but as part of this work we did perform an experimental examination of our +framework to compare insertion throughput and query latency over various +points within the space. + +We examined this design space by considering \texttt{DE-WSS} specifically, +using a random sample of $500,000,000$ records from the \texttt{OSM} +dataset.
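Concretely, the measurement protocol described here amounts to the following sketch. This is a hedged illustration only: `DEWSS` below is a hypothetical stand-in (a plain set) rather than the actual sampling structure, and for simplicity the deletes target only warmed-up records.

```python
import random
import time

class DEWSS:
    """Purely hypothetical stand-in for the dynamized WSS structure;
    it supports only the operations the benchmark driver needs."""
    def __init__(self):
        self.records = set()

    def insert(self, rec):
        self.records.add(rec)

    def delete(self, rec):
        self.records.discard(rec)

def benchmark(records, warmup_frac=0.10, delete_frac=0.05, seed=42):
    rng = random.Random(seed)
    structure = DEWSS()

    # Warm the structure up with 10% of the records before measuring.
    n_warmup = int(len(records) * warmup_frac)
    for rec in records[:n_warmup]:
        structure.insert(rec)

    # Randomly intermix deletes of 5% of the data with the remaining
    # inserts (simplification: only warmed-up records are deleted).
    deletes = rng.sample(records[:n_warmup], int(len(records) * delete_frac))
    ops = [("insert", r) for r in records[n_warmup:]]
    ops += [("delete", r) for r in deletes]
    rng.shuffle(ops)

    start = time.perf_counter()
    for op, rec in ops:
        (structure.insert if op == "insert" else structure.delete)(rec)
    elapsed = time.perf_counter() - start

    # Update throughput counts both inserts and deletes, post-warmup.
    return len(ops) / elapsed
```

As in the text, the warmup operations are excluded from the timed window, and the reported throughput covers the mixed insert/delete stream on a single thread.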
Prior to taking any measurements, we warmed the structure up by +inserting 10\% of the total records in the set. We then measured the +update throughput over the course of the insertion of the remaining +records, randomly intermixing delete operations of 5\% of the +total data. In the tests for Figures~\ref{fig:insert_delete_prop}, +\ref{fig:sample_delete_prop}, and \ref{fig:bloom}, we instead deleted +25\% of the data. + +The reported update throughputs were calculated based on all of the +inserts and deletes following the warmup, executed on a single thread. +Query latency numbers were measured after all of the inserts and +deletes had been completed. We used standardized values of $s = 6$, +$N_b = 12000$, $k = 1000$, and $\delta = 0.05$ for parameters not being +varied in a given test, and all buffer queries were answered using +rejection sampling. We show the results of this testing in +Figures~\ref{fig:parameter-sweeps1}, \ref{fig:parameter-sweeps2}, and +\ref{fig:parameter-sweeps3}. \begin{figure*} \centering \subfloat[Insertion Throughput vs. Mutable Buffer Capacity]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-mt-insert} \label{fig:insert_mt}} - \subfloat[Insertion Throughput vs. Scale Factor]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-sf-insert} \label{fig:insert_sf}} \\ + \subfloat[Insertion Throughput vs.
Scale Factor]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-sf-insert} \label{fig:insert_sf}} \\ - \subfloat[Insertion Throughput vs.\\Max Delete Proportion]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-tp-insert} \label{fig:insert_delete_prop}} - \subfloat[Per 1000 Sampling Latency vs.\\Mutable Buffer Capacity]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-mt-sample} \label{fig:sample_mt}} \\ + \subfloat[Per 1000 Sampling Latency vs.\\Mutable Buffer Capacity]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-mt-sample} \label{fig:sample_mt}} + \subfloat[Per 1000 Sampling Latency vs. Scale Factor]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-sf-sample} \label{fig:sample_sf}} - \caption{DE-WSS Design Space Exploration I} + \caption{DE-WSS Design Space Exploration: Major Parameters} \label{fig:parameter-sweeps1} \end{figure*} -The results of this testing are displayed in -Figures~\ref{fig:parameter-sweeps1},~\ref{fig:parameter-sweeps2},~and:wq~\ref{fig:parameter-sweeps3}. -The two largest contributors to differences in performance were the selection -of layout policy and of delete policy. Figures~\ref{fig:insert_mt} and -\ref{fig:insert_sf} show that the choice of layout policy plays a larger role -than delete policy in insertion performance, with tiering outperforming -leveling in both configurations. The situation is reversed in sampling -performance, seen in Figure~\ref{fig:sample_mt} and \ref{fig:sample_sf}, where -the performance difference between layout policies is far less than between -delete policies. +We first note that the two largest contributors to performance +differences across all of the tests were the selection of layout and delete +policy.
In particular, Figures~\ref{fig:insert_mt} and \ref{fig:insert_sf} +demonstrate that layout policy plays a very significant role in insertion +performance, with tiering outperforming leveling for both delete +policies. The next largest effect was the delete policy selection, +with tombstone deletes outperforming tagged deletes in insertion +performance. This result aligns with the asymptotic analysis of the two +approaches in Section~\ref{sampling-deletes}. It is interesting to note, +however, that the effect of layout policy was more significant in these +particular tests,\footnote{ + Although the largest performance gap in absolute terms was between + tiering with tombstones and tiering with tagging, the selection of + delete policy was not enough to overcome the relative difference + between leveling and tiering in these tests, hence our labeling of the + layout policy as more significant. +} despite both layout policies having the same asymptotic performance. +This was likely due to the small number of deletes (only 5\% of the total +operations) reducing their effect on the overall throughput. -The values used for the scale factor and buffer size have less influence than -layout and delete policy. Sampling performance is largely independent of them -over the ranges of values tested, as shown in Figures~\ref{fig:sample_mt} and -\ref{fig:sample_sf}. This isn't surprising, as these parameters adjust the -number of shards, which only contributes to shard alias construction time -during sampling and is is amortized over all samples taken in a query. The -buffer also contributes rejections, but the cost of a rejection is small and -the buffer constitutes only a small portion of the total weight, so these are -negligible. However, under tombstones there is an upward trend in latency with -buffer size, as delete checks occasionally require a full buffer scan. The -effect of buffer size on insertion is shown in Figure~\ref{fig:insert_mt}.
-{ There is only a small improvement in insertion performance as the mutable -buffer grows. This is because a larger buffer results in fewer reconstructions, -but these reconstructions individually take longer, and so the net positive -effect is less than might be expected.} Finally, Figure~\ref{fig:insert_sf} -shows the effect of scale factor on insertion performance. As expected, tiering -performs better with higher scale factors, whereas the insertion performance of -leveling trails off as the scale factor is increased, due to write -amplification. +The influence of scale factor on update performance is shown in +Figure~\ref{fig:insert_sf}. The effect is different depending on the +layout policy, with larger scale factors benefitting update performance +under tiering, and hurting it under leveling. The effect of the mutable +buffer size on insertion, shown in Figure~\ref{fig:insert_mt}, is a little +less clear, but does show a slight upward trend, with larger buffers +enhancing update performance in all cases. A larger buffer results in +fewer reconstructions, but increases the size of these reconstructions, +so the effect isn't as large as one might initially expect. + +Query performance follows broadly opposite trends to updates. We see in +Figures~\ref{fig:sample_sf} and \ref{fig:sample_mt} that query latency +is better under leveling than tiering, and that tagging is better than +tombstones. More interestingly, the relative effect of the two decisions +is also different. Here, the selection of delete policy has a larger +effect than layout policy, in the sense that the better layout policy +(leveling) with the worse delete policy (tombstones) loses to the worse +layout policy (tiering) with the better delete policy (tagging). In fact, +under tagging, the performance difference between the two layout policies +is almost indistinguishable. + +Scale factor, shown in Figure~\ref{fig:sample_sf}, has very little +effect on query performance.
Thus, in this context, it would appear +that the scale factor is primarily useful as an insertion performance +tuning tool. The mutable buffer size, in Figure~\ref{fig:sample_mt}, +also generally has no clear effect. This is expected, because the buffer +contains only a small number of records relative to the entire dataset, +and so has a fairly low probability of being selected to draw +a sample from. Even when it is selected, rejection sampling is very +inexpensive. The one exception to this trend is when using tombstones, +where the query performance degrades as the buffer size grows. This is +because the rejection check process for tombstones requires doing a full +buffer scan for every sample in some cases. \begin{figure*} \centering - \subfloat[Per 1000 Sampling Latency vs. Scale Factor]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-sf-sample} \label{fig:sample_sf}} + \subfloat[Insertion Throughput vs.\\Max Delete Proportion]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-tp-insert} \label{fig:insert_delete_prop}} \subfloat[Per 1000 Sampling Latency vs. Max Delete Proportion]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-tp-sample}\label{fig:sample_delete_prop}} \\ - \caption{DE-WSS Design Space Exploration II} + \caption{DE-WSS Design Space Exploration: Delete Bounding} \label{fig:parameter-sweeps2} \end{figure*} -Figures~\ref{fig:insert_delete_prop} and \ref{fig:sample_delete_prop} show the -cost of maintaining $\delta$ with a base delete rate of 25\%. The low cost of -an in-memory sampling rejection results in only a slight upward trend in the -sampling latency as the number of deleted records increases. While compaction -is necessary to avoid pathological cases, there does not seem to be a -significant benefit to aggressive compaction thresholds. -Figure~\ref{fig:insert_delete_prop} shows the effect of compactions on insert -performance.
There is little effect on performance under tagging, but there is -a clear negative performance trend associated with aggressive compaction when -using tombstones. Under tagging, a single compaction is guaranteed to remove -all deleted records on a level, whereas with tombstones a compaction can -cascade for multiple levels before the delete bound is satisfied, resulting in -a larger cost per incident. +We also considered the effect that bounding the proportion of deleted +records within the structure has on performance. In these tests, +25\% of all records were eventually deleted over the course of the +benchmark. Figure~\ref{fig:sample_delete_prop} shows the effect +that maintaining these bounds has on query performance. In our +testing, we saw very little query performance benefit from maintaining +more aggressive bounds on deletes. This is likely because +the cost of rejecting is relatively small in our query model. It +does have a clear effect on insertion performance, though, as shown +in Figure~\ref{fig:insert_delete_prop}. Under tagging, the cost of +maintaining increasingly tight bounds on deleted records is small, likely +because all deleted records can be dropped by a single reconstruction. +This means both that a violation of the bound can be resolved in a single +compaction, and also that violations of the bound are much less likely to +occur, as each reconstruction removes all deleted records. Tombstone-based +deletes require far more work to remove from the structure, and so we +would expect to see a degradation of insertion performance. Interestingly, +we see the opposite: tighter bounds result in improved performance. This is +because the sheer volume of deleted records has a measurable effect +on the size of the dynamized structure. The more proactive compactions +prune these records, resulting in better performance. \begin{figure*} \centering \subfloat[Sampling Latency vs.
Sample Size]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-samplesize} \label{fig:sample_k}} \subfloat[Per 1000 Sampling Latency vs. Bloom Filter Memory]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-bloom}\label{fig:bloom}} \\ - \caption{DE-WSS Design Space Exploration III} + \caption{DE-WSS Design Space Exploration: Misc.} \label{fig:parameter-sweeps3} \end{figure*} -Figure~\ref{fig:bloom} demonstrates the trade-off between memory usage for -Bloom filters and sampling performance under tombstones. This test was run -using 25\% incoming deletes with no compaction, to maximize the number of -tombstones within the index as a worst-case scenario. As expected, allocating -more memory to Bloom filters, decreasing their false positive rates, -accelerates sampling. Finally, Figure~\ref{fig:sample_k} shows the relationship -between average per sample latency and the sample set size. It shows the effect -of amortizing the initial shard alias setup work across an increasing number of -samples, with $k=100$ as the point at which latency levels off. +Finally, we consider two more parameters: memory usage for Bloom filters +and the effect of sample set size on query latency. Figure~\ref{fig:bloom} +shows the trade-off between memory allocated to filters and sampling +performance when tombstones are used. Recall that these Bloom filters +are specifically used for tombstones, not for general records, and +are used to accelerate rejection checks of sampled records. In this +test, 25\% of all records were deleted and $\delta$ was set to 0 to +disable all proactive compaction, to present a worst-case scenario in +terms of tombstones. Allocating additional memory to the Bloom filters +decreases their false positive rates and results in better sampling +performance.
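The role of these tombstone filters in the rejection check can be illustrated with a small sketch. The filter below is a generic textbook Bloom filter, not the dissertation's implementation, and `is_deleted` with its shard layout is hypothetical: the filter is consulted first, and the expensive tombstone lookup happens only on a filter hit.

```python
import hashlib

class BloomFilter:
    """Tiny illustrative Bloom filter using double hashing over blake2b."""
    def __init__(self, m_bits, k_hashes):
        self.m, self.k = m_bits, k_hashes
        self.bits = 0  # bit array stored as one big integer

    def _hashes(self, key):
        d = hashlib.blake2b(str(key).encode()).digest()
        h1 = int.from_bytes(d[:8], "little")
        h2 = int.from_bytes(d[8:16], "little") | 1  # odd, so strides differ
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, key):
        for h in self._hashes(key):
            self.bits |= 1 << h

    def might_contain(self, key):
        # No false negatives; false positives shrink as m_bits grows.
        return all(self.bits >> h & 1 for h in self._hashes(key))

def is_deleted(record, shards_with_filters):
    """Rejection check for a sampled record: consult each shard's
    tombstone filter first, and only do the real tombstone lookup
    on a filter hit."""
    for tombstones, bf in shards_with_filters:
        if bf.might_contain(record) and record in tombstones:
            return True
    return False
```

Allocating more bits per tombstone lowers the false-positive rate, so fewer live samples pay for a pointless tombstone lookup; that is the trade-off swept in Figure~\ref{fig:bloom}.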
Finally, Figure~\ref{fig:sample_k} compares the sample set +size and the average latency of drawing a single sample, to demonstrate +the ability of our procedure to amortize the preliminary work across +multiple samples in a sample set. After a sample set size of $k=100$, +we stop seeing a benefit from increasing the size, indicating the limit +of how much the preliminary work can be effectively amortized. -Based upon these results, a set of parameters was established for the extended -indexes, which is used in the next section for baseline comparisons. This -standard configuration uses tagging as the delete policy and tiering as the -layout policy, with $k=1000$, $N_b = 12000$, $\delta = 0.05$, and $s = 6$. +Based upon the results of this preliminary study, we established a set +of standardized parameters to use for the baseline comparisons in the +remainder of this section. We will use tagging for deletes, tiering as +the layout policy, $k=1000$, $N_b = 12000$, $\delta = 0.05$, and $s = +6$, unless otherwise stated.
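The amortization behavior measured in Figure~\ref{fig:sample_k} can be sketched with Walker's alias method, the kind of "shard alias" structure referenced above. The Python below is an illustrative rendering, not the actual implementation: the $O(n)$ table build is the preliminary work, and each of the $k$ subsequent draws is $O(1)$, so the per-sample share of the build cost falls off as $k$ grows.

```python
import random

def build_alias(weights):
    """Walker's alias method: O(n) preprocessing for O(1) weighted draws."""
    n = len(weights)
    total = sum(weights)
    prob = [w * n / total for w in weights]  # scaled so the mean is 1.0
    alias = [0] * n
    small = [i for i, p in enumerate(prob) if p < 1.0]
    large = [i for i, p in enumerate(prob) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        alias[s] = l                 # overflow from l fills s's slot
        prob[l] -= 1.0 - prob[s]
        (small if prob[l] < 1.0 else large).append(l)
    return prob, alias

def sample_set(weights, k, rng=random):
    """Draw a sample set of size k; the alias build is the preliminary
    work, paid once and amortized across all k samples."""
    prob, alias = build_alias(weights)   # O(n) setup
    out = []
    for _ in range(k):                   # each draw is O(1)
        i = rng.randrange(len(weights))
        out.append(i if rng.random() < prob[i] else alias[i])
    return out
```

With per-draw cost constant, total query latency is roughly $c_{\text{setup}} + k \cdot c_{\text{draw}}$, which is consistent with the per-sample latency flattening out once $k$ is large enough for the setup term to become negligible.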