\section{Evaluation}
\label{sec:experiment}
In this section, we provide comprehensive performance benchmarks
of implementations of the dynamized structures discussed in
Sections~\ref{sec:instance} and \ref{sec:discussion}. All of the code was
written using C++17. The full implementations, including benchmarking
code, are available on GitHub under the Modified BSD License at
\url{https://github.com/psu-db/sampling-extension-original}.\footnote{
	We also provide a ``cleaner'' implementation for WSS and WIRS,
	with a structure and nomenclature better aligned with this
	chapter, here: \url{https://github.com/psu-db/sampling-extension}.
}

\Paragraph{Experimental Setup.} We ran all of our experiments on Ubuntu
20.04 LTS using a server equipped with dual-socket Intel Xeon Gold 6242R
processors with 40 physical cores and 384 GiB of physical memory. We
performed testing of external structures with a 4 TB WD Red SA500 SATA
drive rated at 95000 IOPS for random reads and 82000 IOPS for random
writes. All benchmarking code was compiled with GCC version 11.3.0 with
the \texttt{-O3} optimization level.

\Paragraph{Datasets.} We used synthetic and real-world datasets with a
variety of key distributions to test sampling performance. For all
of our datasets, we treated the data as a sequence of key-value pairs
with a 64-bit integer key and a 32-bit integer value. Our dynamizations
introduced a 32-bit header to each record as well. This header was not
added to records when testing dynamic baselines. Additionally, weighted
testing attached a 64-bit integer weight to each record. This weight was
not included in the record for non-weighted testing. The weights and
keys were both taken directly from the datasets, and values were added
separately and were unique to each record.
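
The record formats described above can be summarized with a minimal C++
sketch. The type and function names here (\texttt{BaselineRecord},
\texttt{DynamizedRecord}, \texttt{WeightedDynamizedRecord}, and
\texttt{payload\_bytes}) are hypothetical and chosen for illustration only;
the field order and the layout used in the released code may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record types for illustration; not the released layout.

// Record as stored by the dynamic baselines: 64-bit key, 32-bit value.
struct BaselineRecord {
    std::uint64_t key;
    std::uint32_t value;
};

// Record as stored by a dynamized structure: adds a 32-bit header.
struct DynamizedRecord {
    std::uint64_t key;
    std::uint32_t value;
    std::uint32_t header;
};

// Weighted testing additionally attaches a 64-bit integer weight.
struct WeightedDynamizedRecord {
    std::uint64_t key;
    std::uint32_t value;
    std::uint32_t header;
    std::uint64_t weight;
};

// Bytes of actual field data per record, ignoring compiler padding.
constexpr std::size_t payload_bytes(bool dynamized, bool weighted) {
    return 8 + 4 + (dynamized ? 4 : 0) + (weighted ? 8 : 0);
}

// The structs must be at least as large as their field data.
static_assert(sizeof(BaselineRecord) >= payload_bytes(false, false),
              "baseline record holds key and value");
static_assert(sizeof(DynamizedRecord) >= payload_bytes(true, false),
              "dynamized record holds key, value, and header");
static_assert(sizeof(WeightedDynamizedRecord) >= payload_bytes(true, true),
              "weighted record also holds the weight");
```

Under these assumptions, the field data is 12 bytes for a baseline record,
16 bytes for a dynamized record, and 24 bytes for a weighted dynamized
record. The in-memory size of each struct depends on the alignment rules of
the target platform, so the sketch only asserts lower bounds.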

We used the following datasets for testing:
\begin{itemize}
\item \textbf{Synthetic Uniform.} A non-weighted, synthetically generated list 
                                  of keys drawn from a uniform distribution.
\item \textbf{Synthetic Zipfian.} A non-weighted, synthetically generated list 
                                  of keys drawn from a Zipfian distribution with 
                                  a skew of $0.8$.
\item \textbf{Twitter~\cite{data-twitter,data-twitter1}.} $41$ million Twitter user ids, weighted by follower counts.
\item \textbf{Delicious~\cite{data-delicious}.} $33.7$ million URLs, represented using unique integers, 
                          weighted by the number of associated tags.
\item \textbf{OSM~\cite{data-osm}.} $2.6$ billion geospatial coordinates for points
                    of interest, collected by OpenStreetMap. We used the latitude, converted
                    to a 64-bit integer, as the key and the number of
                    its associated semantic tags as the weight. 
\end{itemize}
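
As an illustration of how a synthetic Zipfian key list like the one above
could be produced, the following C++ sketch draws ranks by inverting a
precomputed cumulative distribution. The function name
\texttt{zipfian\_keys}, its parameters, and the choice to use the rank
itself as the key are assumptions made for this example; they do not
describe the generator actually used in our benchmarks.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: draw `count` ranks in [1, universe], where
// P(rank = i) is proportional to 1 / i^skew. Requires universe > 0.
std::vector<std::uint64_t> zipfian_keys(std::size_t count,
                                        std::size_t universe,
                                        double skew,
                                        std::uint64_t seed) {
    // Build the cumulative (unnormalized) weights of ranks 1..universe.
    std::vector<double> cdf(universe);
    double total = 0.0;
    for (std::size_t i = 0; i < universe; ++i) {
        total += 1.0 / std::pow(static_cast<double>(i + 1), skew);
        cdf[i] = total;
    }

    std::mt19937_64 rng(seed);
    std::uniform_real_distribution<double> unif(0.0, total);

    std::vector<std::uint64_t> keys;
    keys.reserve(count);
    for (std::size_t j = 0; j < count; ++j) {
        const double u = unif(rng);
        // Find the first rank whose cumulative weight exceeds u.
        const auto it = std::upper_bound(cdf.begin(), cdf.end(), u);
        std::size_t idx = static_cast<std::size_t>(it - cdf.begin());
        // Guard against floating-point rounding at the upper end.
        idx = std::min(idx, universe - 1);
        keys.push_back(static_cast<std::uint64_t>(idx + 1));
    }
    return keys;
}
```

A call such as \texttt{zipfian\_keys(n, u, 0.8, seed)} would correspond to
the skew of $0.8$ used here. The precomputed table costs $O(u)$ memory, so
this approach is only a sketch suitable for modest universe sizes.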

We did not use the synthetic uniform and Zipfian datasets for testing
WSS and WIRS, as these datasets lacked weights. We also did not use the
Twitter and Delicious datasets for unweighted testing, as they have
uninteresting key distributions.

\Paragraph{Structures Compared.} As a basis of comparison, we tested
both our dynamized SSI implementations and existing dynamic baselines
for each sampling problem considered. Specifically, we consider the
following dynamized structures:
\begin{itemize}

\item \textbf{DE-WSS.} An implementation of the dynamized alias
structure~\cite{walker74} for weighted set sampling discussed
in Section~\ref{ssec:wss-struct}. We compare this against a WSS
implementation of Olken's method on a B+Tree with aggregate weight tags
(\textbf{AGG-BTree})~\cite{olken95}, based on the B+Tree implementation
in the TLX library~\cite{tlx}.

\item \textbf{DE-IRS.} An implementation of the dynamized ISAM tree for
independent range sampling, discussed in Section~\ref{ssec:irs-struct}. We
also implement a concurrent version based on our discussion in
Section~\ref{ssec:ext-concurrency} and an external version from
Section~\ref{ssec:ext-external}. We compare the external and concurrent
versions against the AB-Tree~\cite{zhao22}, and the single-threaded,
in-memory version against an IRS implementation of Olken's
method on an AGG-BTree.

\item \textbf{DE-WIRS.} An implementation of the dynamized alias-augmented
B+Tree~\cite{afshani17} as discussed in Section~\ref{ssec:wirs-struct} for
weighted independent range sampling. We compare this against a WIRS
implementation of Olken's method on an AGG-BTree.

\end{itemize}
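
For readers unfamiliar with the static alias structure that DE-WSS
dynamizes, the following C++ sketch shows one common way to build and query
it, using the stack-based construction usually attributed to Vose. The class
name \texttt{AliasTable} and its interface are hypothetical; this is not the
code from our repository, and it omits the record headers and the
dynamization machinery entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical static alias table for weighted set sampling.
// Requires a non-empty weight vector with a positive total weight.
class AliasTable {
public:
    explicit AliasTable(const std::vector<std::uint64_t>& weights)
        : prob_(weights.size()), alias_(weights.size()) {
        const std::size_t n = weights.size();
        double total = 0.0;
        for (std::uint64_t w : weights) total += static_cast<double>(w);

        // Scale weights so that the average slot has weight exactly 1.
        std::vector<double> scaled(n);
        std::vector<std::size_t> small, large;
        for (std::size_t i = 0; i < n; ++i) {
            scaled[i] = static_cast<double>(weights[i]) *
                        static_cast<double>(n) / total;
            (scaled[i] < 1.0 ? small : large).push_back(i);
        }

        // Pair each under-full slot with an over-full one.
        while (!small.empty() && !large.empty()) {
            const std::size_t s = small.back(); small.pop_back();
            const std::size_t l = large.back(); large.pop_back();
            prob_[s] = scaled[s];
            alias_[s] = l;
            scaled[l] = (scaled[l] + scaled[s]) - 1.0;
            (scaled[l] < 1.0 ? small : large).push_back(l);
        }

        // Leftover slots are full up to rounding error.
        for (std::size_t i : large) { prob_[i] = 1.0; alias_[i] = i; }
        for (std::size_t i : small) { prob_[i] = 1.0; alias_[i] = i; }
    }

    // Draw one index with probability proportional to its weight.
    template <typename Rng>
    std::size_t sample(Rng& rng) const {
        std::uniform_int_distribution<std::size_t> slot(0, prob_.size() - 1);
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        const std::size_t i = slot(rng);
        return coin(rng) < prob_[i] ? i : alias_[i];
    }

private:
    std::vector<double> prob_;        // probability of keeping slot i
    std::vector<std::size_t> alias_;  // fallback index for slot i
};
```

Construction takes time linear in the number of weights, and each sample
costs one uniform slot choice plus one biased coin flip. The structure
cannot absorb an insert or delete without being rebuilt, which is the
limitation that dynamization addresses.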

All of the tested structures, with the exception of the external memory
DE-IRS implementation and AB-Tree, were wholly contained within system
memory. AB-Tree is a native external structure, so for the in-memory
concurrency evaluation we configured it with enough cache to maintain
the entire structure in memory to simulate an in-memory implementation.\footnote{
	Because of the nature of sampling queries, traditional
	efficient locking techniques for B+Trees cannot be
	used~\cite{zhao22}. The alternatives were to run AB-Tree in this
	manner, or to globally lock the B+Tree for every operation. We
	elected to use the former approach for this chapter and use the
	latter approach in the next chapter.
}