\subsection{Comparison to Baselines} Next, the performance of indexes extended using the framework is compared against tree sampling on the aggregate B+tree, as well as against problem-specific SSIs for WSS, WIRS, and IRS queries. Unless otherwise specified, IRS and WIRS queries were executed with a selectivity of $0.1\%$, and 500 million randomly selected records from the OSM dataset were used. The uniform and Zipfian synthetic datasets contained 1 billion records each. All benchmarks warmed up the data structure by inserting 10\% of the records, then measured the throughput of inserting the remaining records while deleting 5\% of them over the course of the benchmark. Once all records were inserted, sampling performance was measured. The reported update throughputs were calculated over both inserts and deletes, following the warmup period. \begin{figure*} \centering \subfloat[Insertion Throughput vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-wss-insert} \label{fig:wss-insert}} \subfloat[Sampling Latency vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-wss-sample} \label{fig:wss-sample}} \\ \subfloat[Insertion Scalability vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-sc-wss-insert} \label{fig:wss-insert-s}} \subfloat[Sampling Scalability vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-sc-wss-sample} \label{fig:wss-sample-s}} \caption{Framework Comparisons to Baselines for WSS} \end{figure*} Starting with WSS, Figure~\ref{fig:wss-insert} shows that the DE-WSS structure is competitive with the AGG B+tree in terms of insertion performance, achieving about 85\% of the AGG B+tree's insertion throughput on the Twitter dataset and beating it by similar margins on the other datasets. In terms of sampling performance, Figure~\ref{fig:wss-sample} shows that DE-WSS beats the B+tree handily and compares favorably with the static alias structure.
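For reference, the static alias structure used as the WSS baseline supports $O(1)$ weighted sampling after an $O(n)$ build. The following is a minimal sketch of Walker's alias method; the function names and layout are illustrative and are not taken from the evaluated implementation.

```python
import random

def build_alias(weights):
    """Build alias tables for Walker's alias method in O(n).

    Each bucket i keeps a probability prob[i] and a fallback
    index alias[i]; buckets with scaled weight below 1 borrow
    mass from buckets with scaled weight above 1.
    """
    n = len(weights)
    total = sum(weights)
    prob = [w * n / total for w in weights]  # scaled so the mean is 1
    alias = [0] * n
    small = [i for i, p in enumerate(prob) if p < 1.0]
    large = [i for i, p in enumerate(prob) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        alias[s] = l
        prob[l] -= 1.0 - prob[s]  # donate mass from the large bucket
        (small if prob[l] < 1.0 else large).append(l)
    return prob, alias

def sample(prob, alias):
    """Draw one index with probability proportional to its weight, in O(1)."""
    i = random.randrange(len(prob))
    return i if random.random() < prob[i] else alias[i]
```

Because every draw touches exactly one bucket, sampling cost is independent of the weight distribution; the trade-off, as the insertion results above show, is that the structure must be rebuilt to accommodate updates.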
Figures~\ref{fig:wss-insert-s} and \ref{fig:wss-sample-s} show the performance scaling of the three structures as the dataset size increases. All three structures exhibit the same pattern of performance degradation as the dataset grows. \begin{figure*} \centering \subfloat[Insertion Throughput vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-wirs-insert} \label{fig:wirs-insert}} \subfloat[Sampling Latency vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-wirs-sample} \label{fig:wirs-sample}} \caption{Framework Comparison to Baselines for WIRS} \end{figure*} Figures~\ref{fig:wirs-insert} and \ref{fig:wirs-sample} show the performance of the DE-WIRS index relative to the AGG B+tree and the alias-augmented B+tree. This comparison shows the same pattern of behavior as was seen with DE-WSS, though the margin between DE-WIRS and its corresponding SSI is much narrower. Additionally, the constant factors associated with the construction cost of the alias-augmented B+tree are much larger than those of the alias structure. The resulting loss of insertion performance is clearly visible in Figure~\ref{fig:wirs-insert}: the margin of advantage in insertion throughput between DE-WIRS and the AGG B+tree shrinks relative to the DE-WSS index, and the AGG B+tree's advantage on the Twitter dataset grows. \begin{figure*} \centering \subfloat[Insertion Scalability vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-sc-irs-insert} \label{fig:irs-insert-s}} \subfloat[Sampling Scalability vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-sc-irs-sample} \label{fig:irs-sample-s}} \\ \subfloat[Insertion Throughput vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-irs-insert} \label{fig:irs-insert1}} \subfloat[Sampling Latency vs. 
Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-irs-sample} \label{fig:irs-sample1}} \\ \subfloat[Delete Scalability vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-sc-irs-delete} \label{fig:irs-delete}} \subfloat[Sampling Latency vs. Sample Size]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-irs-samplesize} \label{fig:irs-samplesize}} \caption{Framework Comparison to Baselines for IRS} \end{figure*} Finally, Figures~\ref{fig:irs-insert1} and \ref{fig:irs-sample1} compare the in-memory DE-IRS index against the in-memory ISAM tree and the AGG B+tree for answering IRS queries. The cost of bulk-loading the ISAM tree is less than that of building either the alias structure or the alias-augmented B+tree, and so here DE-IRS beats the AGG B+tree by wider margins in insertion throughput, though its sampling performance advantage narrows significantly. DE-IRS was further tested to evaluate scalability. Figure~\ref{fig:irs-insert-s} shows average insertion throughput, Figure~\ref{fig:irs-delete} shows average delete latency (under tagging), and Figure~\ref{fig:irs-sample-s} shows average sampling latency for DE-IRS and the AGG B+tree over a range of data sizes. In all cases, DE-IRS and the B+tree show similar patterns of performance degradation as the data size grows. Note that the delete latencies of DE-IRS are worse than those of the AGG B+tree, because of the B+tree's cheaper point lookups. Figure~\ref{fig:irs-sample-s} also includes one other point of interest: the sampling performance of DE-IRS \emph{improves} when the data size grows from one million to ten million records. While at first glance this performance increase may appear paradoxical, it actually demonstrates an important result concerning the effect of the unsorted mutable buffer on index performance. 
At one million records, the buffer constitutes approximately 1\% of the total data size; because it therefore holds a correspondingly larger share of the total weight, it is sampled from more frequently than would be the case with larger data. The more frequently the buffer is sampled, the more rejections occur, and the worse sampling performance becomes. This illustrates the importance of keeping the buffer small, even when a scan is not used for buffer sampling. Finally, Figure~\ref{fig:irs-samplesize} shows how the per-sample cost of DE-IRS decreases as the number of records requested by a sampling query grows, compared to the AGG B+tree. Note that DE-IRS benefits significantly more from batching samples than the AGG B+tree does, and that the improvement is greatest up to $k=100$ samples per query.
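The rejection cost attributed to the buffer above can be illustrated with a toy model. The sketch below is a hedged illustration, not the paper's implementation: it assumes weighted sampling from an unsorted buffer is done by picking a record uniformly and accepting it with probability $w / w_{max}$, so the expected number of attempts per accepted sample is $n \cdot w_{max} / \sum_i w_i$. The skewed synthetic weights are an assumption chosen to make the effect visible; every query routed to the buffer pays this rejection overhead, which is why a relatively heavier buffer hurts sampling throughput.

```python
import random

def buffer_rejection_sample(weights, rng):
    """Weighted sampling from an unsorted buffer by rejection:
    pick a record uniformly, accept with probability w / w_max.
    Returns the sampled index and the number of attempts made."""
    w_max = max(weights)
    attempts = 0
    while True:
        attempts += 1
        i = rng.randrange(len(weights))
        if rng.random() < weights[i] / w_max:
            return i, attempts

rng = random.Random(7)
# A skewed buffer: one heavy record inflates w_max, so most
# uniform picks are rejected and attempts per sample grow.
weights = [1.0] * 99 + [100.0]
total_attempts = 0
for _ in range(5000):
    _, a = buffer_rejection_sample(weights, rng)
    total_attempts += a
print(total_attempts / 5000)  # roughly n * w_max / sum(w) = 100*100/199, about 50
```

Under this model, the per-sample cost of hitting the buffer is independent of buffer size but multiplies with how often the buffer is chosen, which is proportional to its share of the total weight, consistent with the behavior observed in Figure~\ref{fig:irs-sample-s}.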