Diffstat (limited to 'chapters/sigmod23')
 chapters/sigmod23/abstract.tex            |  29
 chapters/sigmod23/background.tex          | 182
 chapters/sigmod23/conclusion.tex          |  17
 chapters/sigmod23/examples.tex            | 143
 chapters/sigmod23/exp-baseline.tex        |  98
 chapters/sigmod23/exp-extensions.tex      |  40
 chapters/sigmod23/exp-parameter-space.tex | 105
 chapters/sigmod23/experiment.tex          |  48
 chapters/sigmod23/extensions.tex          |  57
 chapters/sigmod23/framework.tex           | 573
 chapters/sigmod23/introduction.tex        |  20
 chapters/sigmod23/relatedwork.tex         |  33
 12 files changed, 1345 insertions(+), 0 deletions(-)
diff --git a/chapters/sigmod23/abstract.tex b/chapters/sigmod23/abstract.tex
new file mode 100644
index 0000000..3ff0c08
--- /dev/null
+++ b/chapters/sigmod23/abstract.tex
@@ -0,0 +1,29 @@
+\begin{abstract}
+
+ The execution of analytical queries on massive datasets presents challenges
+ due to long response times and high computational costs. As a result, the
+ analysis of representative samples of data has emerged as an attractive
+ alternative; this avoids the cost of processing queries against the entire
+ dataset, while still producing statistically valid results. Unfortunately,
+ the sampling techniques in common use sacrifice either sample quality or
+ performance, and so are poorly suited for this task. However, it is
+ possible to build high quality sample sets efficiently with the assistance
+ of indexes. This introduces a new challenge: real-world data is subject to
+ continuous update, and so the indexes must be kept up to date. This is
+ difficult, because existing sampling indexes present a dichotomy; efficient
+ sampling indexes are difficult to update, while easily updatable indexes
+ have poor sampling performance. This paper seeks to address this gap by
+ proposing a general and practical framework for extending most sampling
+ indexes with efficient update support, based on splitting indexes into
 smaller shards, combined with a systematic approach to their periodic
+ reconstruction. The framework's design space is examined, with an eye
+ towards exploring trade-offs between update performance, sampling
+ performance, and memory usage. Three existing static sampling indexes are
+ extended using this framework to support updates, and the generalization of
+ the framework to concurrent operations and larger-than-memory data is
+ discussed. Through a comprehensive suite of benchmarks, the extended
+ indexes are shown to match or exceed the update throughput of
+ state-of-the-art dynamic baselines, while presenting significant
+ improvements in sampling latency.
+
+\end{abstract}
diff --git a/chapters/sigmod23/background.tex b/chapters/sigmod23/background.tex
new file mode 100644
index 0000000..58324bd
--- /dev/null
+++ b/chapters/sigmod23/background.tex
@@ -0,0 +1,182 @@
+\section{Background}
+\label{sec:background}
+
+This section formalizes the sampling problem and describes relevant existing
+solutions. Before discussing these topics, though, a clarification of
+terminology is in order. The nomenclature used to describe sampling varies
+slightly throughout the literature. In this chapter, the term \emph{sample} is
+used to indicate a single record selected by a sampling operation, and a
+collection of these samples is called a \emph{sample set}; the number of
+samples within a sample set is the \emph{sample size}. The term \emph{sampling}
+is used to indicate the selection of either a single sample or a sample set;
+the specific usage should be clear from context.
+
+
+\Paragraph{Independent Sampling Problem.} When conducting sampling, it is often
+desirable for the drawn samples to have \emph{statistical independence}. This
+requires that the sampling of a record does not affect the probability of any
+other record being sampled in the future. Independence is a requirement for the
+application of statistical tools such as the Central Limit
+Theorem~\cite{bulmer79}, which is the basis for many concentration bounds.
+A failure to maintain independence in sampling invalidates any guarantees
+provided by these statistical methods.
+
+In each of the problems considered, sampling can be performed either with
+replacement (WR) or without replacement (WoR). It is possible to answer any WoR
+sampling query using a constant number of WR queries, followed by a
+deduplication step~\cite{hu15}, and so this chapter focuses exclusively on WR
+sampling.
+
+A basic version of the independent sampling problem is \emph{weighted set
+sampling} (WSS),\footnote{
+ This nomenclature is adopted from Tao's recent survey of sampling
+ techniques~\cite{tao22}. This problem is also called
+ \emph{weighted random sampling} (WRS) in the literature.
+}
+in which each record is associated with a weight that determines its
+probability of being sampled. More formally, WSS is defined
+as:
+\begin{definition}[Weighted Set Sampling~\cite{walker74}]
+ Let $D$ be a set of data whose members are associated with positive
+ weights $w: D \to \mathbb{R}^+$. Given an integer $k \geq 1$, a weighted
+ set sampling query returns $k$ independent random samples from $D$ with
+ each data point $d \in D$ having a probability of $\frac{w(d)}{\sum_{p\in
+ D}w(p)}$ of being sampled.
+\end{definition}
+Each query returns a sample set of size $k$, rather than a
+single sample. Queries returning sample sets are the common case, because the
+robustness of analysis relies on having a sufficiently large sample
+size~\cite{ben-eliezer20}. The common \emph{simple random sampling} (SRS)
+problem is a special case of WSS, where every element has unit weight.
+
+In the context of databases, it is also common to discuss a more general
+version of the sampling problem, called \emph{independent query sampling}
+(IQS)~\cite{hu14}. An IQS query samples a specified number of records from the
+result set of a database query. In this context, it is insufficient to merely
+ensure individual records are sampled independently; the sample sets returned
+by repeated IQS queries must be independent as well. This provides a variety of
+useful properties, such as fairness and representativeness of query
+results~\cite{tao22}. As a concrete example, consider simple random sampling on
+the result set of a single-dimensional range reporting query. This is
+called independent range sampling (IRS), and is formally defined as:
+
+\begin{definition}[Independent Range Sampling~\cite{tao22}]
+ Let $D$ be a set of $n$ points in $\mathbb{R}$. Given a query
+ interval $q = [x, y]$ and an integer $k$, an independent range sampling
+ query returns $k$ independent samples from $D \cap q$ with each
+ point having equal probability of being sampled.
+\end{definition}
+A generalization of IRS exists, called \emph{Weighted Independent Range
+Sampling} (WIRS)~\cite{afshani17}, which is similar to WSS. Each point in $D$
+is associated with a positive weight $w: D \to \mathbb{R}^+$, and samples are
+drawn from the range query results $D \cap q$ such that each data point has a
+probability of $\nicefrac{w(d)}{\sum_{p \in D \cap q}w(p)}$ of being sampled.
+
+
+\Paragraph{Existing Solutions.} While many sampling techniques exist,
+few are supported in practical database systems. The
+\texttt{TABLESAMPLE} operator provided by SQL and supported in major DBMS
+implementations~\cite{postgres-doc} requires either a linear scan (e.g.,
+Bernoulli sampling) that results in high sample retrieval costs, or relaxed
+statistical guarantees (e.g., block sampling~\cite{postgres-doc} used in
+PostgreSQL).
+
+Index-assisted sampling solutions have been studied
+extensively. Olken's method~\cite{olken89} is a classical solution to
+independent sampling problems. This algorithm operates upon traditional search
+trees, such as the B+tree used commonly as a database index. It conducts a
+random walk on the tree uniformly from the root to a leaf, resulting in a
+$O(\log n)$ sampling cost for each returned record. Should weighted samples be
+desired, rejection sampling can be performed. A sampled record $r$ is
+accepted with probability $\nicefrac{w(r)}{w_{max}}$, requiring an expected
+$\nicefrac{w_{max}}{w_{avg}}$ sampling attempts per element of the
+sample set. Olken's method can also be extended to support general IQS by
+rejecting all sampled records failing to satisfy the query predicate. It can be
+accelerated by adding aggregated weight tags to internal
+nodes~\cite{olken-thesis,zhao22}, allowing rejection sampling to be performed
+during the tree-traversal to abort dead-end traversals early.
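+
+To make the rejection step concrete, the following is a minimal C++ sketch of
+weighted rejection sampling over uniformly drawn records. It is purely
+illustrative and is not drawn from any implementation discussed in this
+chapter; the \texttt{sample\_uniform} argument stands in for the root-to-leaf
+random walk.
+\begin{verbatim}
+#include <random>
+
+struct Record { long key; double weight; };
+
+// Draw one weighted sample by rejection: repeatedly draw a uniform
+// sample (e.g., via a random root-to-leaf walk) and accept it with
+// probability w(r) / w_max; otherwise reject and retry. The expected
+// number of attempts per accepted sample is w_max / w_avg.
+template <typename UniformSampler>
+Record weighted_rejection_sample(UniformSampler sample_uniform,
+                                 double w_max, std::mt19937_64 &rng) {
+    std::uniform_real_distribution<double> coin(0.0, 1.0);
+    while (true) {
+        Record r = sample_uniform(rng);
+        if (coin(rng) <= r.weight / w_max)
+            return r;
+    }
+}
+\end{verbatim}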
+
+\begin{figure}
+ \centering
+ \includegraphics[width=.5\textwidth]{img/sigmod23/alias.pdf}
+ \caption{\textbf{A pictorial representation of an alias
+ structure}, built over a set of weighted records. Sampling is performed by
+ first (1) selecting a cell by uniformly generating an integer index on
+ $[0,n)$, and then (2) selecting an item by generating a
+ second uniform float on $[0,1]$ and comparing it to the cell's normalized
+ cutoff values. In this example, the first random number is $0$,
+ corresponding to the first cell, and the second is $.7$. This is larger
+ than $\nicefrac{.15}{.25}$, and so $3$ is selected as the result of the
+ query.
+ This allows $O(1)$ independent weighted set sampling, but adding a new
+ element requires a weight adjustment to every element in the structure, and
+ so isn't generally possible without performing a full reconstruction.}
+ \label{fig:alias}
+
+\end{figure}
+
+There also exist static data structures, referred to in this chapter as static
+sampling indexes (SSIs)\footnote{
+The name SSI was established in the published version of this paper prior to the
+realization that a distinction between the terms index and data structure would
+be useful. We'll continue to use the term SSI for the remainder of this chapter,
+to maintain consistency with the published work, but technically an SSI refers to
+ a data structure, not an index, in the nomenclature established in the previous
+ chapter.
+ }, that are capable of answering sampling queries in
+near-constant time\footnote{
+ The designation
+``near-constant'' is \emph{not} used in the technical sense of being constant
+to within a polylogarithmic factor (i.e., $\tilde{O}(1)$). It is instead used to mean
+constant per sample to within an additive polylogarithmic term per query.
+For example, drawing $k$ samples from $n$ records using a near-constant
+approach requires $O(\log n + k)$ time, in contrast to a tree-traversal
+approach, which requires $O(k\log n)$ time.
+} relative to the size of the dataset. An example of such a
+structure is used in Walker's alias method \cite{walker74,vose91}, a technique
+for answering WSS queries with $O(1)$ query cost per sample, but requiring
+$O(n)$ time to construct. It distributes the weight of the items across $n$
+cells, each shared by at most two items, such that the total share of cell
+capacity assigned to each item is proportional to its weight. A query
+selects one cell uniformly at random, then chooses between the (at most) two
+items in that cell according to their share of it; thus, items are selected
+with probability proportional to their weight in $O(1)$ time. A pictorial
+representation of this structure is shown in
+Figure~\ref{fig:alias}.
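+
+To make the construction and query procedures concrete, the following is a
+minimal C++ sketch of the alias structure, following Vose's variant of the
+construction algorithm~\cite{vose91}. It is illustrative only, and is not the
+shard implementation evaluated later in this chapter.
+\begin{verbatim}
+#include <cstddef>
+#include <random>
+#include <vector>
+
+// Minimal alias structure (Vose's construction): O(n) build time,
+// O(1) weighted sampling of an index in [0, n).
+class AliasStructure {
+    std::vector<double> cutoff_;  // normalized cutoff of each cell
+    std::vector<size_t> alias_;   // the "other" item sharing each cell
+public:
+    explicit AliasStructure(const std::vector<double> &weights) {
+        size_t n = weights.size();
+        cutoff_.resize(n);
+        alias_.resize(n);
+        double total = 0;
+        for (double w : weights) total += w;
+
+        // Scale weights so that the average cell weight is 1.
+        std::vector<double> scaled(n);
+        std::vector<size_t> small, large;
+        for (size_t i = 0; i < n; i++) {
+            scaled[i] = weights[i] * n / total;
+            (scaled[i] < 1.0 ? small : large).push_back(i);
+        }
+        // Pair each under-full cell with an over-full item.
+        while (!small.empty() && !large.empty()) {
+            size_t s = small.back(); small.pop_back();
+            size_t l = large.back(); large.pop_back();
+            cutoff_[s] = scaled[s];
+            alias_[s] = l;
+            scaled[l] -= (1.0 - scaled[s]);
+            (scaled[l] < 1.0 ? small : large).push_back(l);
+        }
+        for (size_t i : large) { cutoff_[i] = 1.0; alias_[i] = i; }
+        for (size_t i : small) { cutoff_[i] = 1.0; alias_[i] = i; }
+    }
+
+    // Draw one sample: pick a cell uniformly, then pick one of its
+    // (at most) two items using the cell's cutoff.
+    size_t sample(std::mt19937_64 &rng) const {
+        std::uniform_int_distribution<size_t> cell(0, cutoff_.size() - 1);
+        std::uniform_real_distribution<double> coin(0.0, 1.0);
+        size_t c = cell(rng);
+        return coin(rng) < cutoff_[c] ? c : alias_[c];
+    }
+};
+\end{verbatim}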
+
+The alias method can also be used as the basis for creating SSIs capable of
+answering general IQS queries using a technique called alias
+augmentation~\cite{tao22}. As a concrete example, previous
+papers~\cite{afshani17,tao22} have proposed solutions for WIRS queries using $O(\log n
++ k)$ time, where the $\log n$ cost is paid only once per query, after which
+elements can be sampled in constant time. This structure is built by breaking
+the data up into disjoint chunks of size $\nicefrac{n}{\log n}$, called
+\emph{fat points}, each with an alias structure. A B+tree is then constructed,
+using the fat points as its leaf nodes. The internal nodes are augmented with
+an alias structure over the total weight of each child. This alias structure
+is used instead of rejection sampling to determine the traversal path to take
+through the tree, and then the alias structure of the fat point is used to
+sample a record. Because rejection sampling is not used during the traversal,
+two traversals suffice to establish the valid range of records for sampling,
+after which samples can be collected without requiring per-sample traversals.
+More examples of alias augmentation applied to different IQS problems can be
+found in a recent survey by Tao~\cite{tao22}.
+
+There do exist specialized sampling indexes~\cite{hu14} with both efficient
+sampling and support for updates, but these are restricted to specific query
+types and are often very complex structures, with poor constant factors
+associated with sampling and update costs, and so are of limited practical
+utility. There has also been work~\cite{hagerup93,matias03,allendorf23} on
+extending the alias structure to support weight updates over a fixed set of
+elements. However, these solutions do not allow insertion or deletion in the
+underlying dataset, and so are not well suited to database sampling
+applications.
+
+\Paragraph{The Dichotomy.} Among these techniques, there exists a
+clear trade-off between efficient sampling and support for updates. Tree-traversal
+based sampling solutions pay a per-sample cost that grows with the size of the
+dataset, in exchange for
+update support. The static solutions lack support for updates, but support
+near-constant time sampling. While some data structures exist with support for
+both, these are restricted to highly specialized query types. Thus in the
+general case there exists a dichotomy: existing sampling indexes can support
+either data updates or efficient sampling, but not both.
diff --git a/chapters/sigmod23/conclusion.tex b/chapters/sigmod23/conclusion.tex
new file mode 100644
index 0000000..de6bffc
--- /dev/null
+++ b/chapters/sigmod23/conclusion.tex
@@ -0,0 +1,17 @@
+\section{Conclusion}
+\label{sec:conclusion}
+
+This chapter discussed the creation of a framework for the dynamic extension of
+static indexes designed for various sampling problems. Specifically, extensions
+were created for the alias structure (WSS), the in-memory ISAM tree (IRS), and
+the alias-augmented B+tree (WIRS). In each case, the SSIs were extended
+successfully with support for updates and deletes, without compromising their
+sampling performance advantage relative to existing dynamic baselines. This was
+accomplished by leveraging ideas borrowed from the Bentley-Saxe method and the
+design space of the LSM tree to divide the static index into multiple shards,
+which could be individually reconstructed in a systematic fashion to
+accommodate new data. This framework provides a large design space for trading
+between update performance, sampling performance, and memory usage, which was
+explored experimentally. The resulting extended indexes were shown to approach
+or match the insertion performance of the B+tree, while simultaneously
+performing significantly faster in sampling operations under most situations.
diff --git a/chapters/sigmod23/examples.tex b/chapters/sigmod23/examples.tex
new file mode 100644
index 0000000..cdbc398
--- /dev/null
+++ b/chapters/sigmod23/examples.tex
@@ -0,0 +1,143 @@
+\section{Framework Instantiations}
+\label{sec:instance}
+In this section, the framework is applied to three sampling problems and their
+associated SSIs. All three sampling problems draw random samples from records
+satisfying a simple predicate, and so result sets for all three can be
+constructed by directly merging the result sets of the queries executed against
+individual shards, the primary requirement for the application of the
+framework. The SSIs used for each problem are discussed, including their
+support of the remaining two optional requirements for framework application.
+
+\subsection{Dynamically Extended WSS Structure}
+\label{ssec:wss-struct}
+As a first example of applying this framework for dynamic extension,
+the alias structure for answering WSS queries is considered. This is a
+static structure that can be constructed in $O(n)$ time and supports WSS
+queries in $O(1)$ time. The alias structure will be used as the SSI, with
+the shards containing an alias structure paired with a sorted array of
+records. The use of sorted arrays for storing the records
+allows for more efficient point-lookups, without requiring any additional
+space. The total weight associated with a WSS query against
+a given alias structure is the total weight of all of its records,
+which can be tracked at the shard level and retrieved in constant time.
+
+Using the formulae from Section~\ref{sec:framework}, the worst-case
+costs of insertion, sampling, and deletion are easily derived. The
+initial construction cost from the buffer is $C_c(N_b) \in O(N_b
+\log N_b)$, requiring the sorting of the buffer followed by alias
+construction. After this point, the shards can be reconstructed in
+linear time while maintaining sorted order. Thus, the reconstruction
+cost is $C_r(n) \in O(n)$. As each shard contains a sorted array,
+the point-lookup cost is $L(n) \in O(\log n)$. The total weight can
+be tracked with the shard, requiring $W(n) \in O(1)$ time to access,
+and there is no necessary preprocessing, so $P(n) \in O(1)$. Samples
+can be drawn in $S(n) \in O(1)$ time. Plugging these results into the
+formulae for insertion, sampling, and deletion costs gives,
+
+\begin{align*}
+ \text{Insertion:} \quad &O\left(\log_s n\right) \\
+ \text{Sampling:} \quad &O\left(\log_s n + \frac{k}{1 - \delta}\cdot R(n)\right) \\
+ \text{Tagged Delete:} \quad &O\left(\log_s n \log n\right)
+\end{align*}
+where $R(n) \in O(1)$ for tagging and $R(n) \in O(\log_s n \log n)$ for
+tombstones.
+
+\Paragraph{Bounding Rejection Rate.} In the weighted sampling case,
+the framework's generic record-based compaction trigger mechanism
+is insufficient to bound the rejection rate. This is because the
+probability of a given record being sampled is dependent upon its
+weight, as well as the number of records in the index. If a highly
+weighted record is deleted, it will be preferentially sampled, resulting
+in a larger number of rejections than would be expected based on record
+counts alone. This problem can be rectified using the framework's user-specified
+compaction trigger mechanism.
+In addition to
+tracking record counts, each level also tracks its rejection rate,
+\begin{equation*}
+\rho_i = \frac{\text{rejections}}{\text{sampling attempts}}
+\end{equation*}
+A configurable rejection rate cap, $\rho$, is then defined. If $\rho_i
+> \rho$ on a level, a compaction is triggered. In the case of
+the tombstone delete policy, it is not the level containing the sampled
+record, but rather the level containing its tombstone, that is considered
+the source of the rejection. This is necessary to ensure that the tombstone
+is moved closer to canceling its associated record by the compaction.
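+
+A minimal C++ sketch of this trigger is given below; the type and member names
+are hypothetical and do not correspond to the actual implementation.
+\begin{verbatim}
+#include <cstdint>
+
+// Per-level statistics for the rejection-rate compaction trigger.
+struct LevelStats {
+    uint64_t rejections = 0;
+    uint64_t attempts = 0;
+
+    // Called once per sampling attempt attributed to this level. Under
+    // tombstones, a rejection is charged to the level holding the
+    // tombstone rather than the level holding the sampled record.
+    void record(bool rejected) {
+        attempts++;
+        if (rejected) rejections++;
+    }
+
+    // rho_i > rho triggers a proactive compaction of this level.
+    bool over_threshold(double rho) const {
+        return attempts > 0 &&
+               static_cast<double>(rejections) / attempts > rho;
+    }
+};
+\end{verbatim}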
+
+\subsection{Dynamically Extended IRS Structure}
+\label{ssec:irs-struct}
+Another sampling problem to which the framework can be applied is
+independent range sampling (IRS). The SSI in this example is the in-memory
+ISAM tree. The ISAM tree supports efficient point-lookups
+ directly, and the total weight of an IRS query can be
+easily obtained by counting the number of records within the query range,
+which is determined as part of the preprocessing of the query.
+
+The static nature of shards in the framework allows for an ISAM tree
+to be constructed with adjacent nodes positioned contiguously in memory.
+By selecting a leaf node size that is a multiple of the record size, and
+avoiding placing any headers within leaf nodes, the set of leaf nodes can
+be treated as a sorted array of records with direct indexing, and the
+internal nodes allow for faster searching of this array.
+Because of this layout, per-sample tree-traversals are avoided. The
+start and end of the range from which to sample can be determined using
+a pair of traversals, and then records can be sampled from this range
+using random number generation and array indexing.
+
+Assuming a sorted set of input records, the ISAM tree can be bulk-loaded
+in linear time. The insertion analysis proceeds like the WSS example
+previously discussed. The initial construction cost is $C_c(N_b) \in
+O(N_b \log N_b)$ and reconstruction cost is $C_r(n) \in O(n)$. The ISAM
+tree supports point-lookups in $L(n) \in O(\log_f n)$ time, where $f$
+is the fanout of the tree.
+
+The process for performing range sampling against the ISAM tree involves
+two stages. First, the tree is traversed twice: once to establish the index of
+the first record greater than or equal to the lower bound of the query,
+and again to find the index of the last record less than or equal to the
+upper bound of the query. This process has the effect of providing the
+number of records within the query range, and can be used to determine
+the weight of the shard in the shard alias structure. Its cost is $P(n)
+\in O(\log_f n)$. Once the bounds are established, samples can be drawn
+by randomly generating uniform integers between the upper and lower bound,
+in $S(n) \in O(1)$ time each.
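+
+The following C++ sketch illustrates this two-phase procedure over the
+contiguous leaf array, with binary searches standing in for the two
+internal-node traversals. It is a simplification for illustration only, not a
+reproduction of the shard implementation.
+\begin{verbatim}
+#include <algorithm>
+#include <cstddef>
+#include <random>
+#include <vector>
+
+// Preprocessing: locate the index range [lo, hi) of records falling in
+// [lower, upper]; (hi - lo) is also the shard's weight for the shard
+// alias structure. Each sample is then one uniform integer draw and an
+// array access.
+template <typename Key>
+std::vector<Key> irs_sample(const std::vector<Key> &leaves, Key lower,
+                            Key upper, size_t k, std::mt19937_64 &rng) {
+    // Stand-ins for the two root-to-leaf traversals.
+    size_t lo = std::lower_bound(leaves.begin(), leaves.end(), lower)
+                - leaves.begin();
+    size_t hi = std::upper_bound(leaves.begin(), leaves.end(), upper)
+                - leaves.begin();
+
+    std::vector<Key> result;
+    if (lo >= hi) return result;  // empty query range
+
+    std::uniform_int_distribution<size_t> idx(lo, hi - 1);
+    for (size_t i = 0; i < k; i++)
+        result.push_back(leaves[idx(rng)]);
+    return result;
+}
+\end{verbatim}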
+
+This results in the extended version of the ISAM tree having the following
+insert, sampling, and delete costs,
+\begin{align*}
+ \text{Insertion:} \quad &O\left(\log_s n\right) \\
+ \text{Sampling:} \quad &O\left(\log_s n \log_f n + \frac{k}{1 - \delta}\cdot R(n)\right) \\
+ \text{Tagged Delete:} \quad &O\left(\log_s n \log_f n\right)
+\end{align*}
+where $R(n) \in O(1)$ for tagging and $R(n) \in O(\log_s n \log_f n)$ for
+tombstones.
+
+
+\subsection{Dynamically Extended WIRS Structure}
+\label{ssec:wirs-struct}
+As a final example of applying this framework, the WIRS problem will be
+considered. Specifically, the alias-augmented B+tree approach, described
+by Tao \cite{tao22}, generalizing work by Afshani and Wei \cite{afshani17},
+and Hu et al. \cite{hu14}, will be extended.
+This structure allows for efficient point-lookups, as
+it is based on the B+tree, and the total weight of a given WIRS query can
+be calculated given the query range using aggregate weight tags within
+the tree.
+
+The alias-augmented B+tree is a static structure of linear space, capable
+of being built initially in $C_c(N_b) \in O(N_b \log N_b)$ time, being
+bulk-loaded from sorted lists of records in $C_r(n) \in O(n)$ time,
+and answering WIRS queries in $O(\log_f n + k)$ time, where the query
+cost consists of preliminary work to identify the sampling range
+and calculate the total weight, with $P(n) \in O(\log_f n)$ cost, and
+constant-time drawing of samples from that range with $S(n) \in O(1)$.
+This results in the following costs,
+\begin{align*}
+ \text{Insertion:} \quad &O\left(\log_s n\right) \\
+ \text{Sampling:} \quad &O\left(\log_s n \log_f n + \frac{k}{1 - \delta} \cdot R(n)\right) \\
+ \text{Tagged Delete:} \quad &O\left(\log_s n \log_f n\right)
+\end{align*}
+where $R(n) \in O(1)$ for tagging and $R(n) \in O(\log_s n \log_f n)$ for
+tombstones. Because this is a weighted sampling structure, the custom
+compaction trigger discussed in Section~\ref{ssec:wss-struct} is applied
+to maintain bounded rejection rates during sampling.
+
diff --git a/chapters/sigmod23/exp-baseline.tex b/chapters/sigmod23/exp-baseline.tex
new file mode 100644
index 0000000..9e7929c
--- /dev/null
+++ b/chapters/sigmod23/exp-baseline.tex
@@ -0,0 +1,98 @@
+\subsection{Comparison to Baselines}
+
+Next, the performance of indexes extended using the framework is compared
+against tree sampling on the aggregate B+tree, as well as problem-specific
+SSIs for WSS, WIRS, and IRS queries. Unless otherwise specified, IRS and WIRS
+queries were executed with a selectivity of $0.1\%$ and 500 million randomly
+selected records from the OSM dataset were used. The uniform and zipfian
+synthetic datasets were 1 billion records in size. All benchmarks warmed up the
+data structure by inserting 10\% of the records, and then measured the
+throughput of inserting the remaining records, while deleting 5\% of them over the
+course of the benchmark. Once all records were inserted, the sampling
+performance was measured. The reported update throughputs were calculated using
+both inserts and deletes, following the warmup period.
+
+\begin{figure*}
+ \centering
+ \subfloat[Insertion Throughput vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-wss-insert} \label{fig:wss-insert}}
+ \subfloat[Sampling Latency vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-wss-sample} \label{fig:wss-sample}} \\
+ \subfloat[Insertion Scalability vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-sc-wss-insert} \label{fig:wss-insert-s}}
+ \subfloat[Sampling Scalability vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-sc-wss-sample} \label{fig:wss-sample-s}}
+ \caption{Framework Comparisons to Baselines for WSS}
+\end{figure*}
+
+Starting with WSS, Figure~\ref{fig:wss-insert} shows that the DE-WSS structure
+is competitive with the AGG B+tree in terms of insertion performance, achieving
+about 85\% of the AGG B+tree's insertion throughput on the Twitter dataset, and
+beating it by similar margins on the other datasets. In terms of sampling
+performance in Figure~\ref{fig:wss-sample}, it beats the B+tree handily, and
+compares favorably to the static alias structure. Figures~\ref{fig:wss-insert-s}
+and \ref{fig:wss-sample-s} show the performance scaling of the three structures as
+the dataset size increases. All of the structures exhibit the same type of
+performance degradation with respect to dataset size.
+
+\begin{figure*}
+ \centering
+ \subfloat[Insertion Throughput vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-wirs-insert} \label{fig:wirs-insert}}
+ \subfloat[Sampling Latency vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-wirs-sample} \label{fig:wirs-sample}}
+ \caption{Framework Comparison to Baselines for WIRS}
+\end{figure*}
+
+Figures~\ref{fig:wirs-insert} and \ref{fig:wirs-sample} show the performance of
+the DE-WIRS index, relative to the AGG B+tree and the alias-augmented B+tree. This
+example shows the same pattern of behavior as was seen with DE-WSS, though the
+margin between the DE-WIRS and its corresponding SSI is much narrower.
+Additionally, the constant factors associated with the construction cost of the
+alias-augmented B+tree are much larger than those of the alias structure. The
+resulting loss of insertion performance is seen clearly in
+Figure~\ref{fig:wirs-insert}: DE-WIRS's insertion-throughput advantage over the
+AGG B+tree shrinks relative to that of DE-WSS, and the AGG B+tree's advantage
+on the Twitter dataset widens.
+
+\begin{figure*}
+ \subfloat[Insertion Scalability vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-sc-irs-insert} \label{fig:irs-insert-s}}
+ \subfloat[Sampling Scalability vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-sc-irs-sample} \label{fig:irs-sample-s}} \\
+
+ \subfloat[Insertion Throughput vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-irs-insert} \label{fig:irs-insert1}}
+ \subfloat[Sampling Latency vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-irs-sample} \label{fig:irs-sample1}} \\
+
+ \subfloat[Delete Scalability vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-sc-irs-delete} \label{fig:irs-delete}}
+ \subfloat[Sampling Latency vs. Sample Size]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-irs-samplesize} \label{fig:irs-samplesize}}
+ \caption{Framework Comparison to Baselines for IRS}
+
+\end{figure*}
+Finally, Figures~\ref{fig:irs-insert1} and \ref{fig:irs-sample1} show a
+comparison of the in-memory DE-IRS index against the in-memory ISAM tree and the AGG
+B+tree for answering IRS queries. The cost of bulk-loading the ISAM tree is less
+than the cost of building the alias structure, or the alias-augmented B+tree, and
+so here DE-IRS defeats the AGG B+tree by wider margins in insertion throughput,
+though the margin narrows significantly in terms of sampling performance
+advantage.
+
+DE-IRS was further tested to evaluate scalability.
+Figure~\ref{fig:irs-insert-s} shows average insertion throughput,
+Figure~\ref{fig:irs-delete} shows average delete latency (under tagging), and
+Figure~\ref{fig:irs-sample-s} shows average sampling latencies for DE-IRS and
+AGG B+tree over a range of data sizes. In all cases, DE-IRS and B+tree show
+similar patterns of performance degradation as the data size grows. Note that
+the delete latencies of DE-IRS are worse than those of the AGG B+tree, because
+of the B+tree's cheaper point-lookups.
+
+Figure~\ref{fig:irs-sample-s}
+also includes one other point of interest: the sampling performance of
+DE-IRS \emph{improves} when the data size grows from one million to ten million
+records. While at first glance the performance increase may appear paradoxical,
+it actually demonstrates an important result concerning the effect of the
+unsorted mutable buffer on index performance. At one million records, the
+buffer constitutes approximately 1\% of the total data size; this results in
+the buffer being sampled from with greater frequency (as it has more total
+weight) than would be the case with larger data. The greater the frequency of
+buffer sampling, the more rejections will occur, and the worse the sampling
+performance will be. This illustrates the importance of keeping the buffer
+small, even when a scan is not used for buffer sampling. Finally,
+Figure~\ref{fig:irs-samplesize} shows, for DE-IRS and the AGG B+tree, how the
+per-sample cost decreases as the number of records requested by a sampling
+query grows. Note that DE-IRS benefits significantly more from batching samples
+than AGG B+tree, and that the improvement is greatest up to $k=100$ samples per
+query.
+
diff --git a/chapters/sigmod23/exp-extensions.tex b/chapters/sigmod23/exp-extensions.tex
new file mode 100644
index 0000000..d929e92
--- /dev/null
+++ b/chapters/sigmod23/exp-extensions.tex
@@ -0,0 +1,40 @@
+\subsection{External and Concurrent Extensions}
+
+\begin{figure*}[h]%
+ \centering
+ \subfloat[External Insertion Throughput]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-ext-insert.pdf} \label{fig:ext-insert}}
+ \subfloat[External Sampling Latency]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-ext-sample.pdf} \label{fig:ext-sample}} \\
+
+ \subfloat[Concurrent Insert Latency vs. Throughput]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-cc-irs-scale} \label{fig:con-latency}}
+ \subfloat[Concurrent Insert Throughput vs. Thread Count]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-cc-irs-thread} \label{fig:con-tput}}
+
+ \caption{External and Concurrent Extensions of DE-IRS}
+ \label{fig:irs-extensions}
+\end{figure*}
+
+Proof of concept implementations of external and concurrent extensions were
+also tested for IRS queries. Figures \ref{fig:ext-sample} and
+\ref{fig:ext-insert} show the performance of the external DE-IRS sampling index
+against AB-tree. DE-IRS was configured with 4 in-memory levels, using at most
+350 MiB of memory in testing, including Bloom filters.
+For DE-IRS, the \texttt{O\_DIRECT} flag was used to disable OS caching, and
+CGroups were used to limit process memory to 1 GiB to simulate a memory
+constrained environment. The AB-tree implementation tested
+had a cache, which was configured with a memory budget of 64 GiB. This extra
+memory was provided to be fair to AB-tree. Because it uses per-sample
+tree-traversals, it is much more reliant on caching for good performance. DE-IRS was
+tested without a caching layer. The tests were performed with 4 billion (80 GiB)
+and 8 billion (162 GiB) uniform and zipfian
+records, and 2.6 billion (55 GiB) OSM records. DE-IRS outperformed the AB-tree
+by over an order of magnitude in both insertion and sampling performance.
+
+Finally, Figures~\ref{fig:con-latency} and \ref{fig:con-tput} show the
+multi-threaded insertion performance of the in-memory DE-IRS index with
+concurrency support, compared to AB-tree running entirely in memory, using the
+synthetic uniform dataset. Note that in Figure~\ref{fig:con-latency}, some of
+the AB-tree results are cut off, due to having significantly lower throughput
+and higher latency compared with the DE-IRS. Even without concurrent
+merging, the framework shows linear scaling up to 4 threads of insertion,
+before leveling off; throughput remains flat even up to 32 concurrent
+insertion threads. An implementation with support for concurrent merging would
+scale even better.
diff --git a/chapters/sigmod23/exp-parameter-space.tex b/chapters/sigmod23/exp-parameter-space.tex
new file mode 100644
index 0000000..d2057ac
--- /dev/null
+++ b/chapters/sigmod23/exp-parameter-space.tex
@@ -0,0 +1,105 @@
+\subsection{Framework Design Space Exploration}
+\label{ssec:ds-exp}
+
+The proposed framework brings with it a large design space, described in
+Section~\ref{ssec:design-space}. First, this design space will be examined
+using a standardized benchmark to measure the average insertion throughput and
+sampling latency of DE-WSS at several points within this space. Tests were run
+using a random selection of 500 million records from the OSM dataset, with the
+index warmed up by the insertion of 10\% of the total records prior to
+beginning any measurement. Over the course of the insertion period, 5\% of the
+records were deleted, except for the tests in
+Figures~\ref{fig:insert_delete_prop}, \ref{fig:sample_delete_prop}, and
+\ref{fig:bloom}, in which 25\% of the records were deleted. Reported update
+throughputs were calculated using both inserts and deletes, following the
+warmup period. The standard values
+used for parameters not being varied in a given test were $s = 6$, $N_b =
+12000$, $k=1000$, and $\delta = 0.05$, with buffer rejection sampling.
+
+\begin{figure*}
+ \centering
+ \subfloat[Insertion Throughput vs. Mutable Buffer Capacity]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-mt-insert} \label{fig:insert_mt}}
+ \subfloat[Insertion Throughput vs. Scale Factor]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-sf-insert} \label{fig:insert_sf}} \\
+
+ \subfloat[Insertion Throughput vs.\\Max Delete Proportion]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-tp-insert} \label{fig:insert_delete_prop}}
+ \subfloat[Per 1000 Sampling Latency vs.\\Mutable Buffer Capacity]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-mt-sample} \label{fig:sample_mt}} \\
+
+ \caption{DE-WSS Design Space Exploration I}
+ \label{fig:parameter-sweeps1}
+\end{figure*}
+
+The results of this testing are displayed in
+Figures~\ref{fig:parameter-sweeps1},~\ref{fig:parameter-sweeps2},~and~\ref{fig:parameter-sweeps3}.
+The two largest contributors to differences in performance were the selection
+of layout policy and of delete policy. Figures~\ref{fig:insert_mt} and
+\ref{fig:insert_sf} show that the choice of layout policy plays a larger role
+than delete policy in insertion performance, with tiering outperforming
+leveling in both configurations. The situation is reversed in sampling
+performance, seen in Figure~\ref{fig:sample_mt} and \ref{fig:sample_sf}, where
+the performance difference between layout policies is far less than between
+delete policies.
+
+The values used for the scale factor and buffer size have less influence than
+layout and delete policy. Sampling performance is largely independent of them
+over the ranges of values tested, as shown in Figures~\ref{fig:sample_mt} and
+\ref{fig:sample_sf}. This isn't surprising, as these parameters adjust the
+number of shards, which only contributes to shard alias construction time
+during sampling and is amortized over all samples taken in a query. The
+buffer also contributes rejections, but the cost of a rejection is small and
+the buffer constitutes only a small portion of the total weight, so these are
+negligible. However, under tombstones there is an upward trend in latency with
+buffer size, as delete checks occasionally require a full buffer scan. The
+effect of buffer size on insertion is shown in Figure~\ref{fig:insert_mt}.
+There is only a small improvement in insertion performance as the mutable
+buffer grows. This is because a larger buffer results in fewer reconstructions,
+but these reconstructions individually take longer, and so the net positive
+effect is less than might be expected. Finally, Figure~\ref{fig:insert_sf}
+shows the effect of scale factor on insertion performance. As expected, tiering
+performs better with higher scale factors, whereas the insertion performance of
+leveling trails off as the scale factor is increased, due to write
+amplification.
+
+\begin{figure*}
+ \centering
+ \subfloat[Per 1000 Sampling Latency vs. Scale Factor]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-sf-sample} \label{fig:sample_sf}}
+ \subfloat[Per 1000 Sampling Latency vs. Max Delete Proportion]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-tp-sample}\label{fig:sample_delete_prop}} \\
+ \caption{DE-WSS Design Space Exploration II}
+ \label{fig:parameter-sweeps2}
+\end{figure*}
+
+Figures~\ref{fig:insert_delete_prop} and \ref{fig:sample_delete_prop} show the
+cost of maintaining $\delta$ with a base delete rate of 25\%. The low cost of
+an in-memory sampling rejection results in only a slight upward trend in the
+sampling latency as the number of deleted records increases. While compaction
+is necessary to avoid pathological cases, there does not seem to be a
+significant benefit to aggressive compaction thresholds.
+Figure~\ref{fig:insert_delete_prop} shows the effect of compactions on insert
+performance. There is little effect on performance under tagging, but there is
+a clear negative performance trend associated with aggressive compaction when
+using tombstones. Under tagging, a single compaction is guaranteed to remove
+all deleted records on a level, whereas with tombstones a compaction can
+cascade for multiple levels before the delete bound is satisfied, resulting in
+a larger cost per incident.
+
+\begin{figure*}
+ \centering
+ \subfloat[Sampling Latency vs. Sample Size]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-samplesize} \label{fig:sample_k}}
+ \subfloat[Per 1000 Sampling Latency vs. Bloom Filter Memory]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-ps-wss-bloom}\label{fig:bloom}} \\
+ \caption{DE-WSS Design Space Exploration III}
+ \label{fig:parameter-sweeps3}
+\end{figure*}
+
+Figure~\ref{fig:bloom} demonstrates the trade-off between memory usage for
+Bloom filters and sampling performance under tombstones. This test was run
+using 25\% incoming deletes with no compaction, to maximize the number of
+tombstones within the index as a worst-case scenario. As expected, allocating
+more memory to Bloom filters, decreasing their false positive rates,
+accelerates sampling. Finally, Figure~\ref{fig:sample_k} shows the relationship
+between average per sample latency and the sample set size. It shows the effect
+of amortizing the initial shard alias setup work across an increasing number of
+samples, with $k=100$ as the point at which latency levels off.
+
+Based upon these results, a set of parameters was established for the extended
+indexes, which is used in the next section for baseline comparisons. This
+standard configuration uses tagging as the delete policy and tiering as the
+layout policy, with $k=1000$, $N_b = 12000$, $\delta = 0.05$, and $s = 6$.
diff --git a/chapters/sigmod23/experiment.tex b/chapters/sigmod23/experiment.tex
new file mode 100644
index 0000000..75cf32e
--- /dev/null
+++ b/chapters/sigmod23/experiment.tex
@@ -0,0 +1,48 @@
+\section{Evaluation}
+\label{sec:experiment}
+
+\Paragraph{Experimental Setup.} All experiments were run under Ubuntu 20.04 LTS
+on a dual-socket Intel Xeon Gold 6242R server with 384 GiB of physical memory
+and 40 physical cores. External tests were run using a 4 TB WD Red SA500 SATA
+SSD, rated for 95000 and 82000 IOPS for random reads and writes respectively.
+
+\Paragraph{Datasets.} Testing utilized a variety of synthetic and real-world
+datasets. For all datasets used, the key was represented as a 64-bit integer,
+the weight as a 64-bit integer, and the value as a 32-bit integer. Each record
+also contained a 32-bit header. The weight was omitted from IRS testing.
+Keys and weights were pulled from the dataset directly, and values were
+generated separately and were unique for each record. The following datasets
+were used,
+\begin{itemize}
+\item \textbf{Synthetic Uniform.} A non-weighted, synthetically generated list
+ of keys drawn from a uniform distribution.
+\item \textbf{Synthetic Zipfian.} A non-weighted, synthetically generated list
+ of keys drawn from a Zipfian distribution with
+ a skew of $0.8$.
+\item \textbf{Twitter~\cite{data-twitter,data-twitter1}.} $41$ million Twitter user ids, weighted by follower counts.
+\item \textbf{Delicious~\cite{data-delicious}.} $33.7$ million URLs, represented using unique integers,
+ weighted by the number of associated tags.
+\item \textbf{OSM~\cite{data-osm}.} $2.6$ billion geospatial coordinates for points
+ of interest, collected by OpenStreetMap. The latitude, converted
+ to a 64-bit integer, was used as the key and the number of
+ its associated semantic tags as the weight.
+\end{itemize}
+The synthetic datasets were not used for weighted experiments, as they do not
+have weights. For unweighted experiments, the Twitter and Delicious datasets
+were not used, as they have uninteresting key distributions.
+
+\Paragraph{Compared Methods.} In this section, indexes extended using the
+framework are compared against existing dynamic baselines. Specifically, DE-WSS
+(Section~\ref{ssec:wss-struct}), DE-IRS (Section~\ref{ssec:irs-struct}), and
+DE-WIRS (Section~\ref{ssec:wirs-struct}) are examined. In-memory extensions are
+compared against the B+tree with aggregate weight tags on internal nodes (AGG
+B+tree) \cite{olken95} and concurrent and external extensions are compared
+against the AB-tree \cite{zhao22}. Sampling performance is also compared against
+comparable static sampling indexes: the alias structure \cite{walker74} for WSS,
+the in-memory ISAM tree for IRS, and the alias-augmented B+tree \cite{afshani17}
+for WIRS. Note that all structures under test, with the exception of the
+external DE-IRS and external AB-tree, were contained entirely within system
+memory. All benchmarking code and data structures were implemented using C++17
+and compiled using gcc 11.3.0 at the \texttt{-O3} optimization level. The
+extension framework itself, excluding the shard implementations and utility
+headers, consisted of a header-only library of about 1200 SLOC.
diff --git a/chapters/sigmod23/extensions.tex b/chapters/sigmod23/extensions.tex
new file mode 100644
index 0000000..6c242e9
--- /dev/null
+++ b/chapters/sigmod23/extensions.tex
@@ -0,0 +1,57 @@
+\captionsetup[subfloat]{justification=centering}
+\section{Extensions}
+\label{sec:discussion}
+In this section, various extensions of the framework are considered.
+Specifically, the applicability of the framework to external or distributed
+data structures is discussed, as well as the use of the framework to add
+automatic support for concurrent updates and sampling to extended SSIs.
+
+\Paragraph{Larger-than-Memory Data.} This framework can be applied to external
+static sampling structures with minimal modification. As a proof-of-concept,
+the IRS structure was extended with support for shards containing external ISAM
+trees. This structure supports storing a configurable number of shards in
+memory, and the rest on disk, making it well suited for operating in
+memory-constrained environments. The on-disk shards contain standard ISAM
+trees, with $8\text{KiB}$ page-aligned nodes. The external version of the
+index only supports tombstone-based deletes, as tagging would require random
+writes. In principle a hybrid approach to deletes is possible, where a delete
+first searches the in-memory data for the record to be deleted, tagging it if
+found. If the record is not found, then a tombstone could be inserted. As the
+data size grows, though, and the preponderance of data is found on disk, this
+approach would largely revert to the standard tombstone approach in practice.
+External settings make the framework even more attractive, in terms of
+performance characteristics, due to the different cost model. In external data
+structures, performance is typically measured in terms of the number of IO
+operations, meaning that much of the overhead introduced by the framework for
+tasks like querying the mutable buffer, building auxiliary structures, extra
+random number generations due to the shard alias structure, and the like,
+becomes far less significant.
+
+Because the framework maintains immutability of shards, it is also well suited for
+use on top of distributed file-systems or with other distributed data
+abstractions like RDDs in Apache Spark~\cite{rdd}. Each shard can be
+encapsulated within an immutable file in HDFS or an RDD in Spark. A centralized
+control node or driver program can manage the mutable buffer, flushing it into
+a new file or RDD when it is full, merging with existing files or RDDs using
+the same reconstruction scheme already discussed for the framework. This setup
+allows for datasets exceeding the capacity of a single node to be supported. As
+an example, XDB~\cite{li19} features an RDD-based distributed sampling
+structure that could be supported by this framework.
+
+\Paragraph{Concurrency.} The immutability of the majority of the structures
+within the index makes for a straightforward concurrency implementation.
+Concurrency control on the buffer is made trivial by the fact that it is a
+simple, unsorted array. The rest of the structure is never updated (aside from
+possible delete tagging), and so concurrency becomes a simple matter of
+delaying the freeing of memory used by internal structures until all the
+threads accessing them have exited, rather than freeing it immediately on
+merge completion. A very basic
+concurrency implementation can be achieved by using the tombstone delete
+policy, and a reference counting scheme to control the deletion of the shards
+following reconstructions. Multiple insert buffers can be used to improve
+insertion throughput, as this will allow inserts to proceed in parallel with
+merges, ultimately allowing concurrency to scale up to the point of being
+bottlenecked by memory bandwidth and available storage. This proof-of-concept
+implementation is based on a simplified version of an approach proposed by
+Golan-Gueta et al. for concurrent log-structured data stores
+\cite{golan-gueta15}.
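+
+As an illustration of the reference-counting scheme (the names below are
+hypothetical, not those of the actual implementation), shards can be held
+through \texttt{std::shared\_ptr}: a reconstruction builds and atomically
+publishes a new version of the level structure, while readers pin the version
+they started with, so old shards are freed only after the last reader
+releases them.
+\begin{verbatim}
+#include <memory>
+#include <utility>
+#include <vector>
+
+struct Shard { /* immutable SSI plus auxiliary structures */ };
+
+// A snapshot of the level structure. A reconstruction builds a new
+// Version and publishes it atomically; shards are freed only when the
+// last reader holding a reference to them releases its shared_ptr.
+struct Version {
+    std::vector<std::vector<std::shared_ptr<Shard>>> levels;
+};
+
+class ExtendedIndex {
+    std::shared_ptr<const Version> current_;
+public:
+    ExtendedIndex() : current_(std::make_shared<Version>()) {}
+
+    // Readers pin the version they will sample from.
+    std::shared_ptr<const Version> pin() const {
+        return std::atomic_load(&current_);
+    }
+    // The merge thread swaps in the post-reconstruction version.
+    void publish(std::shared_ptr<const Version> next) {
+        std::atomic_store(&current_, std::move(next));
+    }
+};
+\end{verbatim}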
+
diff --git a/chapters/sigmod23/framework.tex b/chapters/sigmod23/framework.tex
new file mode 100644
index 0000000..32a32e1
--- /dev/null
+++ b/chapters/sigmod23/framework.tex
@@ -0,0 +1,573 @@
+\section{Dynamic Sampling Index Framework}
+\label{sec:framework}
+
+This work is an attempt to design a solution to independent sampling
+that achieves \emph{both} efficient updates and near-constant cost per
+sample. As the goal is to tackle the problem in a generalized fashion,
+rather than design problem-specific data structures for used as the basis
+of an index, a framework is created that allows for already
+existing static data structures to be used as the basis for a sampling
+index, by automatically adding support for data updates using a modified
+version of the Bentley-Saxe method.
+
+Unfortunately, Bentley-Saxe as described in Section~\ref{ssec:bsm} cannot be
+directly applied to sampling problems. The concept of decomposability is not
+cleanly applicable to sampling, because the distribution of records in the
+result set, rather than the records themselves, must be matched following the
+result merge. Efficiently controlling the distribution requires each sub-query
+to access information external to the structure against which it is being
+processed, a contingency unaccounted for by Bentley-Saxe. Further, the process
+of reconstruction used in Bentley-Saxe provides poor worst-case complexity
+bounds~\cite{saxe79}, and attempts to modify the procedure to provide better
+worst-case performance are complex and have worse performance in the common
+case~\cite{overmars81}. Despite these limitations, this chapter will argue that
+the core principles of the Bentley-Saxe method can be profitably applied to
+sampling indexes, once a system for controlling result set distributions and a
+more effective reconstruction scheme have been devised. The solution to
+the former will be discussed in Section~\ref{ssec:sample}. For the latter,
+inspiration is drawn from the literature on the LSM tree.
+
+The LSM tree~\cite{oneil96} is a data structure proposed to optimize
+write throughput in disk-based storage engines. It consists of a memory
+table of bounded size, used to buffer recent changes, and a hierarchy
+of external levels containing indexes of exponentially increasing
+size. When the memory table has reached capacity, it is emptied into the
+external levels. Random writes are avoided by treating the data within
+the external levels as immutable; all writes go through the memory
+table. This introduces write amplification but maximizes sequential
+writes, which is important for maintaining high throughput in disk-based
+systems. The LSM tree is associated with a broad and well studied design
+space~\cite{dayan17,dayan18,dayan22,balmau19,dayan18-1} containing
+trade-offs between three key performance metrics: read performance, write
+performance, and auxiliary memory usage. The challenges
+faced in reconstructing predominantly in-memory indexes are quite
+ different from those which the LSM tree is intended
+to address, having little to do with disk-based systems and sequential IO
+operations. But, the LSM tree possesses a rich design space for managing
+the periodic reconstruction of data structures in a manner that is both
+more practical and more flexible than that of Bentley-Saxe. By borrowing
+from this design space, this preexisting body of work can be leveraged,
+and many of Bentley-Saxe's limitations addressed.
+
+\captionsetup[subfloat]{justification=centering}
+
+\begin{figure*}
+ \centering
+ \subfloat[Leveling]{\includegraphics[width=.75\textwidth]{img/sigmod23/merge-leveling} \label{fig:leveling}}\\
+ \subfloat[Tiering]{\includegraphics[width=.75\textwidth]{img/sigmod23/merge-tiering} \label{fig:tiering}}
+
+ \caption{\textbf{A graphical overview of the sampling framework and its insert procedure.} A
+ mutable buffer (MB) sits atop two levels (L0, L1) containing shards (pairs
+ of SSIs and auxiliary structures [A]) using the leveling
+ (Figure~\ref{fig:leveling}) and tiering (Figure~\ref{fig:tiering}) layout
+ policies. Records are represented as black/colored squares, and grey
+ squares represent unused capacity. An insertion requiring a multi-level
+ reconstruction is illustrated.} \label{fig:framework}
+
+\end{figure*}
+
+
+\subsection{Framework Overview}
+The goal of this chapter is to build a general framework that extends most SSIs
+with efficient support for updates by splitting the index into small data structures
+to reduce reconstruction costs, and then distributing the sampling process over these
+smaller structures.
+The framework is designed to work efficiently with any SSI, so
+long as it has the following properties,
+\begin{enumerate}
+ \item The underlying full query $Q$ supported by the SSI from whose results
+ samples are drawn satisfies the following property:
+ for any dataset $D = \cup_{i = 1}^{n}D_i$
+ where $D_i \cap D_j = \emptyset$, $Q(D) = \cup_{i = 1}^{n}Q(D_i)$.
+ \item \emph{(Optional)} The SSI supports efficient point-lookups.
+ \item \emph{(Optional)} The SSI is capable of efficiently reporting the total weight of all records
+ returned by the underlying full query.
+\end{enumerate}
+
+The first property applies to the query being sampled from, and is essential
+for the correctness of sample sets reported by extended sampling
+indexes.\footnote{ This condition is stricter than the definition of a
+decomposable search problem in the Bentley-Saxe method, which allows for
+\emph{any} constant-time merge operation, not just union.
+However, this condition is satisfied by many common types of database
+query, such as predicate-based filtering queries.} The latter two properties
+are optional, but reduce deletion and sampling costs respectively. Should the
+SSI fail to support point-lookups, an auxiliary hash table can be attached to
+the data structures.
+Should it fail to support query result weight reporting, rejection
+sampling can be used in place of the more efficient scheme discussed in
+Section~\ref{ssec:sample}. The analysis of this framework will generally
+assume that all three conditions are satisfied.
+
+Given an SSI with these properties, a dynamic extension can be produced as
+shown in Figure~\ref{fig:framework}. The extended index consists of disjoint
+shards containing an instance of the SSI being extended, and optional auxiliary
+data structures. The auxiliary structures allow acceleration of certain
+operations that are required by the framework, but which the SSI being extended
+does not itself support efficiently. Examples of possible auxiliary structures
+include hash tables, Bloom filters~\cite{bloom70}, and range
+filters~\cite{zhang18,siqiang20}. The shards are arranged into levels of
+increasing record capacity, with either one shard, or up to a fixed maximum
+number of shards, per level. The decision to place one or many shards per level
+is called the \emph{layout policy}. The policy names are borrowed from the
+literature on the LSM tree, with the former called \emph{leveling} and the
+latter called \emph{tiering}.
+
+To avoid a reconstruction on every insert, an unsorted array of fixed capacity
+($N_b$), called the \emph{mutable buffer}, is used to buffer updates. Because it is
+unsorted, it is kept small to maintain reasonably efficient sampling
+and point-lookup performance. All updates are performed by appending new
+records to the tail of this buffer.
+If a record currently within the index is
+to be updated to a new value, it must first be deleted, and then a record with
+the new value inserted. This ensures that old versions of records are properly
+filtered from query results.
+
+When the buffer is full, it is flushed to make room for new records. The
+flushing procedure is based on the layout policy in use. When using leveling
+(Figure~\ref{fig:leveling}) a new SSI is constructed using both the records in
+$L_0$ and those in the buffer. This is used to create a new shard, which
+replaces the one previously in $L_0$. When using tiering
+(Figure~\ref{fig:tiering}) a new shard is built using only the records from the
+buffer, and placed into $L_0$ without altering the existing shards. Each level
+has a record capacity of $N_b \cdot s^{i+1}$, controlled by a configurable
+parameter, $s$, called the scale factor. Records are organized in one large
+shard under leveling, or in $s$ shards of $N_b \cdot s^i$ capacity each under
+tiering. When a level reaches its capacity, it must be emptied to make room for
+the records flushed into it. This is accomplished by moving its records down to
+the next level of the index. Under leveling, this requires constructing a new
+shard containing all records from both the source and target levels, and
+placing this shard into the target, leaving the source empty. Under tiering,
+the shards in the source level are combined into a single new shard that is
+placed into the target level. Should the target be full, it is first emptied by
+applying the same procedure. New empty levels
+are dynamically added as necessary to accommodate these reconstructions.
+Note that shard reconstructions are not necessarily performed using
+merging, though merging can be used as an optimization of the reconstruction
+procedure where such an algorithm exists. In general, reconstruction requires
+only pooling the records of the shards being combined and then applying the SSI's
+standard construction algorithm to this set of records.
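+
+The following C++ sketch illustrates the insert and flush procedure under both
+layout policies. It is a simplified model for illustration only: a shard is
+reduced to a sorted array of keys, construction and reconstruction are modeled
+as sorting the pooled records, and buffer sampling and deletes are omitted.
+\begin{verbatim}
+#include <algorithm>
+#include <cstddef>
+#include <utility>
+#include <vector>
+
+using Shard = std::vector<long>;   // stand-in for an SSI instance
+
+struct ExtendedIndex {
+    std::size_t nb = 12000;        // mutable buffer capacity N_b
+    std::size_t s = 6;             // scale factor
+    bool tiering = true;           // layout policy
+    std::vector<long> buffer;      // unsorted mutable buffer
+    std::vector<std::vector<Shard>> levels;
+
+    static Shard build(std::vector<long> recs) {       // C_c / C_r
+        std::sort(recs.begin(), recs.end());
+        return recs;
+    }
+    static std::vector<long> pool(const std::vector<Shard> &shards) {
+        std::vector<long> out;
+        for (const auto &sh : shards)
+            out.insert(out.end(), sh.begin(), sh.end());
+        return out;
+    }
+    std::size_t capacity(std::size_t i) const {        // N_b * s^(i+1)
+        std::size_t c = nb;
+        for (std::size_t j = 0; j <= i; j++) c *= s;
+        return c;
+    }
+    std::size_t records_on(std::size_t i) const {
+        std::size_t n = 0;
+        for (const auto &sh : levels[i]) n += sh.size();
+        return n;
+    }
+
+    void insert(long key) {        // O(1) append, flush when full
+        buffer.push_back(key);
+        if (buffer.size() >= nb) {
+            emplace(0, build(buffer));
+            buffer.clear();
+        }
+    }
+
+    // Place a shard into level i, first emptying the level into the
+    // one below it if it cannot absorb the incoming records.
+    void emplace(std::size_t i, Shard shard) {
+        if (i == levels.size()) levels.emplace_back();
+        bool full = tiering
+            ? levels[i].size() >= s
+            : records_on(i) + shard.size() > capacity(i);
+        if (full) {
+            emplace(i + 1, build(pool(levels[i])));
+            levels[i].clear();
+        }
+        if (tiering || levels[i].empty()) {
+            levels[i].push_back(std::move(shard));
+        } else {
+            // Leveling: one shard per level, rebuilt to include the
+            // incoming records.
+            std::vector<long> recs = pool(levels[i]);
+            recs.insert(recs.end(), shard.begin(), shard.end());
+            levels[i] = { build(recs) };
+        }
+    }
+};
+\end{verbatim}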
+
+\begin{table}[t]
+\caption{Frequently Used Notation}
+\centering
+
+\begin{tabular}{|p{2.5cm} p{5cm}|}
+ \hline
+ \textbf{Variable} & \textbf{Description} \\ \hline
+ $N_b$ & Capacity of the mutable buffer \\ \hline
+ $s$ & Scale factor \\ \hline
+ $C_c(n)$ & SSI initial construction cost \\ \hline
+ $C_r(n)$ & SSI reconstruction cost \\ \hline
+ $L(n)$ & SSI point-lookup cost \\ \hline
+ $P(n)$ & SSI sampling pre-processing cost \\ \hline
+ $S(n)$ & SSI per-sample sampling cost \\ \hline
+ $W(n)$ & Shard weight determination cost \\ \hline
+ $R(n)$ & Shard rejection check cost \\ \hline
+ $\delta$ & Maximum delete proportion \\ \hline
+ %$\rho$ & Maximum rejection rate \\ \hline
+\end{tabular}
+\label{tab:nomen}
+
+\end{table}
+
+Table~\ref{tab:nomen} lists frequently used notation for the various parameters
+of the framework, which will be used in the coming analysis of the costs and
+trade-offs associated with operations within the framework's design space. The
+remainder of this section will discuss the performance characteristics of
+insertion into this structure (Section~\ref{ssec:insert}), how it can be used
+to correctly answer sampling queries (Section~\ref{ssec:sample}), and efficient
+approaches for supporting deletes (Section~\ref{ssec:delete}). Finally, it will
+close with a detailed discussion of the trade-offs within the framework's
+design space (Section~\ref{ssec:design-space}).
+
+
+\subsection{Insertion}
+\label{ssec:insert}
+The framework supports inserting new records by first appending them to the end
+of the mutable buffer. When it is full, the buffer is flushed into a sequence
+of levels containing shards of increasing capacity, using a procedure
+determined by the layout policy as discussed in Section~\ref{sec:framework}.
+This method allows for the cost of repeated shard reconstruction to be
+effectively amortized.
+
+Let the cost of constructing the SSI from an arbitrary set of $n$ records be
+$C_c(n)$ and the cost of reconstructing the SSI given two or more shards
+containing $n$ records in total be $C_r(n)$. The cost of an insert is composed
+of three parts: appending to the mutable buffer, constructing a new
+shard from the buffered records during a flush, and the total cost of
+reconstructing shards containing the record over the lifetime of the index. The
+cost of appending to the mutable buffer is constant, and the cost of constructing a
+shard from the buffer can be amortized across the records participating in the
+buffer flush, giving $\nicefrac{C_c(N_b)}{N_b}$. These costs are paid exactly once for
+each record. To derive an expression for the cost of repeated reconstruction,
+first note that each record will participate in at most $s$ reconstructions on
+a given level, resulting in a worst-case amortized cost of $O\left(s\cdot
+\nicefrac{C_r(n)}{n}\right)$ paid per level. The index itself will contain at most
+$\log_s n$ levels. Thus, over the lifetime of the index a given record
+will pay $O\left(s\cdot \nicefrac{C_r(n)}{n}\log_s n\right)$ cost in repeated
+reconstruction.
+
+Combining these results, the total amortized insertion cost is
+\begin{equation}
+O\left(\frac{C_c(N_b)}{N_b} + s \cdot \frac{C_r(n)}{n} \log_s n\right)
+\end{equation}
+This can be simplified by noting that $s$ is constant, and that $N_b \ll n$ and also
+a constant. By neglecting these terms, the amortized insertion cost of the
+framework is,
+\begin{equation}
+O\left(\frac{C_r(n)}{n}\log_s n\right)
+\end{equation}
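+As a concrete instance, consider a hypothetical SSI whose reconstruction can be
+performed by merging its input shards in linear time, so that $C_r(n) \in O(n)$.
+In that case, the amortized insertion cost above reduces to $O\left(\log_s
+n\right)$.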
+
+
+\subsection{Sampling}
+\label{ssec:sample}
+
+\begin{figure}
+ \centering
+ \includegraphics[width=\textwidth]{img/sigmod23/sampling}
+ \caption{\textbf{Overview of the multiple-shard sampling query process} for
+ Example~\ref{ex:sample} with $k=1000$. First, (1) the normalized weights of
+  the shards are determined, then (2) these weights are used to construct an
+  alias structure. Next, (3) the alias structure is queried $k$ times to
+  determine per-shard sample sizes, and then (4) sampling is performed.
+ Finally, (5) any rejected samples are retried starting from the alias
+ structure, and the process is repeated until the desired number of samples
+ has been retrieved.}
+ \label{fig:sample}
+
+\end{figure}
+
+For many SSIs, sampling queries are completed in two stages. Some preliminary
+processing is done to identify the range of records from which to sample, and then
+samples are drawn from that range. For example, IRS over a sorted list of
+records can be performed by first identifying the upper and lower bounds of the
+query range in the list, and then sampling records by randomly generating
+indexes within those bounds. The general cost of a sampling query can be
+modeled as $P(n) + k S(n)$, where $P(n)$ is the cost of preprocessing, $k$ is
+the number of samples drawn, and $S(n)$ is the cost of sampling a single
+record.
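+
+As an illustration of this two-stage pattern, the following sketch implements
+IRS over a sorted array: the preprocessing step performs two binary searches to
+locate the query bounds, and each sample costs one random index generation. The
+type and function names are illustrative only.
+\begin{verbatim}
+// Two-stage IRS over a sorted array: preprocess() pays P(n) once,
+// sample() pays S(n) per drawn record.
+#include <algorithm>
+#include <optional>
+#include <random>
+#include <utility>
+#include <vector>
+
+struct SortedShard {
+    std::vector<long> keys;                 // kept in sorted order
+    std::mt19937_64 rng{std::random_device{}()};
+
+    // Preprocessing: find the index range covering [lo, hi]. O(log n).
+    std::pair<std::size_t, std::size_t> preprocess(long lo, long hi) const {
+        auto b = std::lower_bound(keys.begin(), keys.end(), lo);
+        auto e = std::upper_bound(keys.begin(), keys.end(), hi);
+        return {static_cast<std::size_t>(b - keys.begin()),
+                static_cast<std::size_t>(e - keys.begin())};
+    }
+
+    // Per-sample work: draw one record uniformly from the range. O(1).
+    std::optional<long> sample(std::pair<std::size_t, std::size_t> range) {
+        auto [begin, end] = range;
+        if (begin >= end) return std::nullopt;  // empty query range
+        std::uniform_int_distribution<std::size_t> d(begin, end - 1);
+        return keys[d(rng)];
+    }
+};
+\end{verbatim}
+Drawing $k$ samples then requires one call to the preprocessing routine and $k$
+calls to the per-sample routine, matching the $P(n) + kS(n)$ cost model.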
+
+When sampling from multiple shards, the situation grows more complex. For each
+sample, the shard to select the record from must first be decided. Consider an
+arbitrary sampling query $X(D, k)$ asking for a sample set of size $k$ against
+dataset $D$. The framework splits $D$ across $m$ disjoint shards, such that $D
+= \bigcup_{i=1}^m D_i$ and $D_i \cap D_j = \emptyset$ for all $i \neq j$. The
+framework must ensure that $X(D, k)$ and $\bigcup_{i=1}^m X(D_i, k_i)$ follow
+the same distribution, by selecting appropriate values for the $k_i$s. If care
+is not taken to balance the number of samples drawn from a shard with the total
+weight of the shard under $X$, then bias can be introduced into the sample
+set's distribution. The selection of $k_i$s can be viewed as an instance of WSS,
+and solved using the alias method.
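+
+The following sketch shows one way this shard-selection step could be
+implemented, using Vose's formulation of the alias method to build the
+structure over the shard weights and then drawing $k$ shard indices. The code
+is illustrative and not tied to any particular alias structure implementation.
+\begin{verbatim}
+// Alias structure over shard weights (Vose's method), used to assign
+// each of the k requested samples to a shard.
+#include <cstddef>
+#include <numeric>
+#include <random>
+#include <vector>
+
+struct Alias {
+    std::vector<double> prob;
+    std::vector<int> alias;
+    std::mt19937_64 rng{std::random_device{}()};
+
+    explicit Alias(const std::vector<double> &weights) {
+        int n = static_cast<int>(weights.size());
+        prob.assign(n, 0.0);
+        alias.assign(n, 0);
+        double total = std::accumulate(weights.begin(), weights.end(), 0.0);
+        std::vector<double> p(n);
+        for (int i = 0; i < n; i++) p[i] = weights[i] * n / total;
+        std::vector<int> small, large;
+        for (int i = 0; i < n; i++) (p[i] < 1.0 ? small : large).push_back(i);
+        while (!small.empty() && !large.empty()) {
+            int s = small.back(); small.pop_back();
+            int l = large.back(); large.pop_back();
+            prob[s] = p[s];
+            alias[s] = l;
+            p[l] = (p[l] + p[s]) - 1.0;
+            (p[l] < 1.0 ? small : large).push_back(l);
+        }
+        while (!large.empty()) { prob[large.back()] = 1.0; large.pop_back(); }
+        while (!small.empty()) { prob[small.back()] = 1.0; small.pop_back(); }
+    }
+
+    int draw() {                            // O(1) per draw
+        std::uniform_int_distribution<int> idx(0,
+            static_cast<int>(prob.size()) - 1);
+        std::uniform_real_distribution<double> u(0.0, 1.0);
+        int i = idx(rng);
+        return u(rng) < prob[i] ? i : alias[i];
+    }
+};
+
+// Determine the per-shard sample counts k_i for a query of size k.
+std::vector<int> assign_samples(const std::vector<double> &shard_weights,
+                                int k) {
+    Alias a(shard_weights);
+    std::vector<int> k_i(shard_weights.size(), 0);
+    for (int j = 0; j < k; j++) k_i[a.draw()]++;
+    return k_i;
+}
+\end{verbatim}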
+
+When sampling using the framework, first the weight of each shard under the
+sampling query is determined and a \emph{shard alias structure} built over
+these weights. Then, for each sample, the shard alias is used to
+determine the shard from which to draw the sample. Let $W(n)$ be the cost of
+determining this total weight for a single shard under the query. The initial setup
+cost, prior to drawing any samples, will be $O\left([W(n) + P(n)]\log_s
+n\right)$, as the preliminary sampling work must be performed for each shard,
+the shard weights determined, and the alias structure constructed. In
+many cases, however, the preliminary work will also determine the total weight,
+and so the relevant operation need only be applied once to accomplish both
+tasks.
+
+To ensure that all records appear in the sample set with the appropriate
+probability, the mutable buffer itself must also be a valid target for
+sampling. There are two generally applicable techniques for this, both of
+which are supported by the framework. The query being sampled
+from can be directly executed against the buffer and the result set used to
+build a temporary SSI, which can be sampled from. Alternatively, rejection
+sampling can be used to sample directly from the buffer, without executing the
+query. In this case, the total weight of the buffer is used for its entry in
+the shard alias structure. This can result in the buffer being
+over-represented in the shard selection process, and so any rejections during
+buffer sampling must be retried starting from shard selection. These same
+considerations apply to rejection sampling used against shards, as well.
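+
+A minimal sketch of the rejection-sampling approach for the buffer, assuming an
+IRS query over a key range, is shown below; the record layout and function
+names are hypothetical. A rejected draw returns nothing, and the caller retries
+from shard selection.
+\begin{verbatim}
+// Rejection sampling against the unsorted mutable buffer for an IRS
+// query on [lo, hi]. A rejection sends control back to shard selection.
+#include <cstddef>
+#include <optional>
+#include <random>
+#include <vector>
+
+struct BufferRecord { long key; bool deleted; };
+
+std::optional<long> sample_buffer(const std::vector<BufferRecord> &buffer,
+                                  long lo, long hi, std::mt19937_64 &rng) {
+    if (buffer.empty()) return std::nullopt;
+    std::uniform_int_distribution<std::size_t> d(0, buffer.size() - 1);
+    const BufferRecord &r = buffer[d(rng)];
+    if (r.key < lo || r.key > hi || r.deleted)
+        return std::nullopt;     // reject: retry from shard selection
+    return r.key;
+}
+\end{verbatim}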
+
+
+\begin{example}
+ \label{ex:sample}
+ Consider executing a WSS query, with $k=1000$, across three shards
+ containing integer keys with unit weight. $S_1$ contains only the
+ key $-2$, $S_2$ contains all integers on $[1,100]$, and $S_3$
+ contains all integers on $[101, 200]$. These structures are shown
+ in Figure~\ref{fig:sample}. Sampling is performed by first
+ determining the normalized weights for each shard: $w_1 = 0.005$,
+ $w_2 = 0.4975$, $w_3 = 0.4975$, which are then used to construct a
+ shard alias structure. The shard alias structure is then queried
+ $k$ times, resulting in a distribution of $k_i$s that is
+ commensurate with the relative weights of each shard. Finally,
+ each shard is queried in turn to draw the appropriate number
+ of samples.
+\end{example}
+
+
+Assuming that rejection sampling is used on the mutable buffer, the worst-case
+time complexity for drawing $k$ samples from an index containing $n$ elements
+with a sampling cost of $S(n)$ is,
+\begin{equation}
+ \label{eq:sample-cost}
+ O\left(\left[W(n) + P(n)\right]\log_s n + kS(n)\right)
+\end{equation}
+
+%If instead a temporary SSI is constructed, the cost of sampling
+%becomes: $O\left(N_b + C_c(N_b) + (W(n) + P(n))\log_s n + kS(n)\right)$.
+
+\begin{figure}
+ \centering
+ \subfloat[Tombstone Rejection Check]{\includegraphics[width=.75\textwidth]{img/sigmod23/delete-tombstone} \label{fig:delete-tombstone}}\\
+ \subfloat[Tagging Rejection Check]{\includegraphics[width=.75\textwidth]{img/sigmod23/delete-tagging} \label{fig:delete-tag}}
+
+ \caption{\textbf{Overview of the rejection check procedure for deleted records.} First,
+ a record is sampled (1).
+ When using the tombstone delete policy
+ (Figure~\ref{fig:delete-tombstone}), the rejection check starts by (2) querying
+    the Bloom filter of the mutable buffer. The filter indicates the record is
+ not present, so (3) the filter on $L_0$ is queried next. This filter
+ returns a false positive, so (4) a point-lookup is executed against $L_0$.
+ The lookup fails to find a tombstone, so the search continues and (5) the
+ filter on $L_1$ is checked, which reports that the tombstone is present.
+ This time, it is not a false positive, and so (6) a lookup against $L_1$
+ (7) locates the tombstone. The record is thus rejected. When using the
+ tagging policy (Figure~\ref{fig:delete-tag}), (1) the record is sampled and
+ (2) checked directly for the delete tag. It is set, so the record is
+ immediately rejected.}
+
+ \label{fig:delete}
+
+\end{figure}
+
+
+\subsection{Deletion}
+\label{ssec:delete}
+
+Because the shards are static, records cannot be arbitrarily removed from them.
+This requires that deletes be supported in some other way, with the ultimate
+goal of preventing deleted records from appearing in sampling query
+result sets. This can be realized in two ways: locating the record and marking
+it, or inserting a new record which indicates that an existing record should be
+treated as deleted. The framework supports both of these techniques, the
+selection of which is called the \emph{delete policy}. The former policy is
+called \emph{tagging} and the latter \emph{tombstone}.
+
+Tagging a record is straightforward. Point-lookups are performed against each
+shard in the index, as well as the buffer, for the record to be deleted. When
+it is found, a bit in a header attached to the record is set. When sampling,
+any records selected with this bit set are automatically rejected. Tombstones
+represent a lazy strategy for deleting records. When a record is deleted using
+tombstones, a new record with identical key and value, but with a ``tombstone''
+bit set, is inserted into the index. A record's presence can be checked by
+performing a point-lookup. If a tombstone with the same key and value exists
+above the record in the index, then it should be rejected when sampled.
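+
+A brief sketch of the two policies is given below. The record header layout and
+helper names are hypothetical; a real implementation would route the tagging
+point-lookup and the tombstone insert through the buffer and shards as
+described above.
+\begin{verbatim}
+// Sketch of the two delete policies over a simplified record layout.
+#include <cstdint>
+
+struct Record {
+    long key;
+    long value;
+    std::uint8_t header;          // bit 0: delete tag, bit 1: tombstone
+};
+
+constexpr std::uint8_t DELETE_TAG = 0x1;
+constexpr std::uint8_t TOMBSTONE  = 0x2;
+
+// Tagging: locate the record (buffer or shard) via point-lookup and set
+// its tag bit; later rejection checks are then O(1).
+bool delete_by_tagging(Record *rec /* point-lookup result */) {
+    if (rec == nullptr) return false;      // record not present
+    rec->header |= DELETE_TAG;
+    return true;
+}
+
+// Tombstones: no lookup is needed; a new record with the tombstone bit
+// set is simply inserted, and sampled records are checked against it.
+Record make_tombstone(long key, long value) {
+    return Record{key, value, TOMBSTONE};
+}
+\end{verbatim}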
+
+Two important aspects of performance are pertinent when discussing deletes: the
+cost of the delete operation, and the cost of verifying the presence of a
+sampled record. The choice of delete policy represents a trade-off between
+these two costs. Beyond this simple trade-off, the delete policy also has other
+implications that can affect its applicability to certain types of SSI. Most
+notably, tombstones do not require any in-place updating of records, whereas
+tagging does. This means that using tombstones is the only way to ensure total
+immutability of the data within shards, which avoids random writes and eases
+concurrency control. The tombstone delete policy, then, is particularly
+appealing in external and concurrent contexts.
+
+\Paragraph{Deletion Cost.} The cost of a delete under the tombstone policy is
+the same as an ordinary insert. Tagging, by contrast, requires a point-lookup
+of the record to be deleted, and so is more expensive. Assuming a point-lookup
+operation with cost $L(n)$, a tagged delete must search each level in the
+index, as well as the buffer, requiring $O\left(N_b + L(n)\log_s n\right)$
+time.
+
+\Paragraph{Rejection Check Costs.} In addition to the cost of the delete
+itself, the delete policy affects the cost of determining if a given record has
+been deleted. This is called the \emph{rejection check cost}, $R(n)$. When
+using tagging, the information necessary to make the rejection decision is
+local to the sampled record, and so $R(n) \in O(1)$. However, when using tombstones
+it is not; a point-lookup must be performed to search for a given record's
+corresponding tombstone. This look-up must examine the buffer, and each shard
+within the index. This results in a rejection check cost of $R(n) \in O\left(N_b +
+L(n) \log_s n\right)$. The rejection check process for the two delete policies is
+summarized in Figure~\ref{fig:delete}.
+
+Two factors contribute to the tombstone rejection check cost: the size of the
+buffer, and the cost of performing a point-lookup against the shards. The
+latter cost can be controlled using the framework's ability to associate
+auxiliary structures with shards. For SSIs which do not support efficient
+point-lookups, a hash table can be added to map key-value pairs to their
+location within the SSI. This allows for constant-time rejection checks, even
+in situations where the index would not otherwise support them. However, the
+storage cost of this intervention is high, and in situations where the SSI does
+support efficient point-lookups, it is not necessary. Further performance
+improvements can be achieved by noting that the probability of a given record
+having an associated tombstone in any particular shard is relatively small.
+This means that many point-lookups will be executed against shards that do not
+contain the tombstone being searched for. In this case, these unnecessary
+lookups can be partially avoided using Bloom filters~\cite{bloom70} for
+tombstones. By inserting tombstones into these filters during reconstruction,
+point-lookups against some shards which do not contain the tombstone being
+searched for can be bypassed. Filters can be attached to the buffer as well,
+which may be even more significant due to the linear cost of scanning it. As
+the goal is a reduction of rejection check costs, these filters need only be
+populated with tombstones. In a later section, techniques for bounding the
+number of tombstones on a given level are discussed, which will allow for the
+memory usage of these filters to be tightly controlled while still ensuring
+precise bounds on filter error.
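+
+The rejection check logic under the tombstone policy, including the Bloom
+filter short-circuit, is sketched below. The \texttt{BloomFilter} and
+\texttt{Shard} types are simplified placeholders rather than the framework's
+actual interfaces.
+\begin{verbatim}
+// Tombstone rejection check: scan the buffer, then consult each newer
+// shard's Bloom filter before paying for a point-lookup.
+#include <vector>
+
+struct Record { long key; long value; bool tombstone; };
+
+struct BloomFilter {
+    // Placeholder: a real filter returns false only when the tombstone
+    // is definitely absent; always returning true never skips work.
+    bool may_contain(long, long) const { return true; }
+};
+
+struct Shard {
+    std::vector<Record> records;     // stand-in for the static SSI
+    BloomFilter tombstone_filter;    // populated with tombstones only
+
+    bool find_tombstone(long key, long value) const {   // cost L(n)
+        for (const Record &r : records)
+            if (r.tombstone && r.key == key && r.value == value)
+                return true;
+        return false;
+    }
+};
+
+// True if the sampled record should be rejected as deleted. Only the
+// buffer and shards newer than the record's source need to be checked.
+bool is_deleted(const Record &rec,
+                const std::vector<Record> &buffer,
+                const std::vector<const Shard*> &newer_shards) {
+    for (const Record &r : buffer)                       // O(N_b) scan
+        if (r.tombstone && r.key == rec.key && r.value == rec.value)
+            return true;
+
+    for (const Shard *s : newer_shards)
+        if (s->tombstone_filter.may_contain(rec.key, rec.value) &&
+            s->find_tombstone(rec.key, rec.value))       // skip on miss
+            return true;
+
+    return false;
+}
+\end{verbatim}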
+
+\Paragraph{Sampling with Deletes.} The addition of deletes to the framework
+alters the analysis of sampling costs. A record that has been deleted cannot
+be present in the sample set, and therefore the presence of each sampled record
+must be verified. If a record has been deleted, it must be rejected. When
+retrying samples rejected due to delete, the process must restart from shard
+selection, as deleted records may be counted in the weight totals used to
+construct that structure. This increases the cost of sampling to,
+\begin{equation}
+\label{eq:sampling-cost}
+  O\left([W(n) + P(n)]\log_s n + \frac{k}{1 - \mathbf{Pr}[\text{rejection}]} \cdot \left[S(n) + R(n)\right]\right)
+\end{equation}
+where $R(n)$ is the cost of checking if a sampled record has been deleted, and
+$\nicefrac{k}{1 -\mathbf{Pr}[\text{rejection}]}$ is the expected number of sampling
+attempts required to obtain $k$ samples, given a fixed rejection probability.
+The rejection probability itself is a function of the workload, and is not
+bounded by the framework without further intervention.
+
+\Paragraph{Bounding the Rejection Probability.} Rejections during sampling
+constitute wasted memory accesses and random number generations, and so steps
+should be taken to minimize their frequency. The probability of a rejection is
+directly related to the number of deleted records, which is itself a function
+of workload and dataset. This means that, without building counter-measures
+into the framework, tight bounds on sampling performance cannot be provided in
+the presence of deleted records. It is therefore critical that the framework
+support some method for bounding the number of deleted records within the
+index.
+
+While the static nature of shards prevents the direct removal of records at the
+moment they are deleted, it doesn't prevent the removal of records during
+reconstruction. When using tagging, all tagged records encountered during
+reconstruction can be removed. When using tombstones, however, the removal
+process is non-trivial. In principle, a rejection check could be performed for
+each record encountered during reconstruction, but this would increase
+reconstruction costs and introduce a new problem of tracking tombstones
+associated with records that have been removed. Instead, a lazier approach can
+be used: delaying removal until a tombstone and its associated record
+participate in the same shard reconstruction. This delay allows both the record
+and its tombstone to be removed at the same time, an approach called
+\emph{tombstone cancellation}. In general, this can be implemented using an
+extra linear scan of the input shards before reconstruction to identify
+tombstones and associated records for cancellation, but potential optimizations
+exist for many SSIs, allowing it to be performed during the reconstruction
+itself at no extra cost.
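+
+A simple (if conservative) realization of tombstone cancellation using an extra
+scan is sketched below. For clarity, it assumes that a given key-value pair is
+not re-inserted after being deleted; a full implementation would match each
+tombstone to the single record it cancels.
+\begin{verbatim}
+// Tombstone cancellation via an extra pre-pass: collect the tombstones
+// in the input shards, then drop them and their matching records while
+// pooling the records for reconstruction.
+#include <set>
+#include <utility>
+#include <vector>
+
+struct Record { long key; long value; bool tombstone; };
+
+std::vector<Record>
+pool_with_cancellation(const std::vector<std::vector<Record>> &inputs) {
+    // Pass 1: gather the (key, value) pairs that have tombstones.
+    std::set<std::pair<long, long>> cancelled;
+    for (const auto &shard : inputs)
+        for (const Record &r : shard)
+            if (r.tombstone) cancelled.insert({r.key, r.value});
+
+    // Pass 2: drop both the tombstones and the records they cancel.
+    std::vector<Record> out;
+    for (const auto &shard : inputs)
+        for (const Record &r : shard)
+            if (cancelled.count({r.key, r.value}) == 0)
+                out.push_back(r);
+    return out;   // ready for the SSI's standard construction algorithm
+}
+\end{verbatim}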
+
+The removal of deleted records passively during reconstruction is not enough to
+bound the number of deleted records within the index. It is not difficult to
+envision pathological scenarios where deletes result in unbounded rejection
+rates, even with this mitigation in place. However, the dropping of deleted
+records does provide a useful property: any specific deleted record will
+eventually be removed from the index after a finite number of reconstructions.
+Using this fact, a bound on the number of deleted records can be enforced. A
+new parameter, $\delta$, is defined, representing the maximum proportion of
+deleted records within the index. Each level, and the buffer, tracks the number
+of deleted records it contains by counting its tagged records or tombstones.
+Following each buffer flush, the proportion of deleted records is checked
+against $\delta$. If any level is found to exceed it, then a proactive
+reconstruction is triggered, pushing its shards down into the next level. The
+process is repeated until all levels respect the bound, allowing the number of
+deleted records to be precisely controlled, which, by extension, bounds the
+rejection rate. This process is called \emph{compaction}.
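+
+The bound-enforcement pass that follows each buffer flush can be sketched as
+below. The per-level statistics and the compaction helper are simplified
+placeholders; in particular, a real compaction would also cancel tombstones and
+drop tagged records, as described above.
+\begin{verbatim}
+// After each flush, proactively compact any level whose proportion of
+// deleted records (tagged records or tombstones) exceeds delta.
+#include <cstddef>
+#include <vector>
+
+struct LevelStats {
+    std::size_t records;   // total records in the level's shards
+    std::size_t deleted;   // tagged records or tombstones in the level
+};
+
+// Placeholder for the reconstruction pushing level i's shards into i+1.
+static void compact_into_next(std::vector<LevelStats> &levels, std::size_t i) {
+    if (i + 1 == levels.size()) levels.push_back({0, 0});
+    levels[i + 1].records += levels[i].records;
+    levels[i + 1].deleted += levels[i].deleted;
+    levels[i] = {0, 0};
+}
+
+void enforce_delete_bound(std::vector<LevelStats> &levels, double delta) {
+    // Walking the levels top-down handles cascades: a compaction that
+    // pushes the next level over the bound is caught when reached.
+    for (std::size_t i = 0; i < levels.size(); i++) {
+        if (levels[i].records == 0) continue;
+        double proportion =
+            static_cast<double>(levels[i].deleted) / levels[i].records;
+        if (proportion > delta)
+            compact_into_next(levels, i);
+    }
+}
+\end{verbatim}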
+
+Assuming every record is equally likely to be sampled, this new bound can be
+applied to the analysis of sampling costs. The probability of a record being
+rejected is then bounded by $\mathbf{Pr}[\text{rejection}] \leq \delta$. Applying this result to
+Equation~\ref{eq:sampling-cost} yields,
+\begin{equation}
+%\label{eq:sampling-cost-del}
+  O\left([W(n) + P(n)]\log_s n + \frac{k}{1 - \delta} \cdot \left[S(n) + R(n)\right]\right)
+\end{equation}
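+For example, with $\delta = 0.05$ the expected number of sampling attempts per
+returned sample is at most $\nicefrac{1}{1-0.05} \approx 1.05$, so enforcing
+the bound limits the overhead of delete-related rejections to roughly five
+percent of the per-sample cost.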
+
+Asymptotically, this proactive compaction does not alter the analysis of
+insertion costs. Each record is still written at most $s$ times on each level,
+there are at most $\log_s n$ levels, and the buffer insertion and SSI
+construction costs are all unchanged. As a result, the amortized
+insertion cost remains the same.
+
+This compaction strategy is based upon tombstone and record counts, and the
+bounds assume that every record is equally likely to be sampled. For certain
+sampling problems (such as WSS), there are other conditions that must be
+considered to provide a bound on the rejection rate. To account for these
+situations in a general fashion, the framework supports problem-specific
+compaction triggers that can be tailored to the SSI being used. These allow
+compactions to be triggered based on other properties, such as rejection rate
+of a level, weight of deleted records, and the like.
+
+
+\subsection{Trade-offs on Framework Design Space}
+\label{ssec:design-space}
+The framework has several tunable parameters, allowing it to be tailored for
+specific applications. This design space contains trade-offs among three major
+performance characteristics: update cost, sampling cost, and auxiliary memory
+usage. The two most significant decisions when implementing this framework are
+the selection of the layout and delete policies. The asymptotic analysis of the
+previous sections obscures some of the differences between these policies, but
+they do have significant practical performance implications.
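+
+Viewed from an implementation perspective, these decisions amount to a small
+configuration surface, sketched below with illustrative names; the framework
+itself does not prescribe this exact structure.
+\begin{verbatim}
+// The framework's principal tuning knobs gathered into one record.
+#include <cstddef>
+
+enum class LayoutPolicy { LEVELING, TIERING };  // 1 vs. up to s shards/level
+enum class DeletePolicy { TOMBSTONE, TAGGING }; // how deletes are recorded
+
+struct FrameworkConfig {
+    LayoutPolicy layout;           // update cost vs. sampling cost
+    DeletePolicy deletes;          // delete cost vs. rejection-check cost
+    std::size_t  buffer_capacity;  // N_b: mutable buffer size
+    std::size_t  scale_factor;     // s: capacity growth between levels
+    double       max_delete_prop;  // delta: compaction trigger threshold
+    bool         tombstone_blooms; // attach Bloom filters for tombstones
+};
+\end{verbatim}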
+
+\Paragraph{Layout Policy.} The choice of layout policy represents a clear
+trade-off between update and sampling performance. Leveling
+results in fewer shards of larger size, whereas tiering results in a larger
+number of smaller shards. As a result, leveling reduces the costs associated
+with point-lookups and sampling query preprocessing by a constant factor,
+compared to tiering. However, it results in more write amplification: a given
+record may be involved in up to $s$ reconstructions on a single level, as
+opposed to the single reconstruction per level under tiering.
+
+\Paragraph{Delete Policy.} There is a trade-off between delete performance and
+sampling performance that exists in the choice of delete policy. Tagging
+requires a point-lookup when performing a delete, which is more expensive than
+the insert required by tombstones. However, it also allows constant-time
+rejection checks, unlike tombstones which require a point-lookup of each
+sampled record. In situations where deletes are common and write-throughput is
+critical, tombstones may be more useful. Tombstones are also ideal in
+situations where immutability is required, or random writes must be avoided.
+Generally speaking, however, tagging is superior when using SSIs that support
+it, because sampling rejection checks will usually be more common than deletes.
+
+\Paragraph{Mutable Buffer Capacity and Scale Factor.} The mutable buffer
+capacity and scale factor both influence the number of levels within the index,
+and by extension the number of distinct shards. Sampling and point-lookups have
+better performance with fewer shards. Smaller shards are also faster to
+reconstruct, although the same adjustments that reduce shard size also result
+in a larger number of reconstructions, so the trade-off here is less clear.
+
+The scale factor has an interesting interaction with the layout policy: when
+using leveling, the scale factor directly controls the amount of write
+amplification per level. Larger scale factors mean more time is spent
+reconstructing shards on a level, reducing update performance. Tiering does not
+have this problem and should see its update performance benefit directly from a
+larger scale factor, as this reduces the number of reconstructions.
+
+The buffer capacity also influences the number of levels, but is more
+significant in its effects on point-lookup performance: a lookup must perform a
+linear scan of the buffer. Likewise, the unstructured nature of the buffer will
+also degrade sampling performance, irrespective of which
+buffer sampling technique is used. As a result, although a large buffer will
+reduce the number of shards, it will also hurt sampling and delete (under
+tagging) performance. It is important to minimize the cost of these buffer
+scans, and so it is preferable to keep the buffer small, ideally small enough
+to fit within the CPU's L2 cache. The number of shards within the index is,
+then, better controlled by changing the scale factor, rather than the buffer
+capacity. Using a smaller buffer will result in more compactions and shard
+reconstructions; however, the empirical evaluation in Section~\ref{ssec:ds-exp}
+demonstrates that this is not a serious performance problem when a scale factor
+is chosen appropriately. When the shards are in memory, frequent small
+reconstructions do not have a significant performance penalty compared to less
+frequent, larger ones.
+
+\Paragraph{Auxiliary Structures.} The framework's support for arbitrary
+auxiliary data structures allows for memory to be traded in exchange for
+insertion or sampling performance. The use of Bloom filters for accelerating
+tombstone rejection checks has already been discussed, but many other options
+exist. Bloom filters could also be used to accelerate point-lookups for delete
+tagging, though such filters would require much more memory than tombstone-only
+ones to be effective. An auxiliary hash table could be used to accelerate
+point-lookups, or range filters such as SuRF~\cite{zhang18} or
+Rosetta~\cite{siqiang20} could be added to accelerate the pre-processing of
+range-restricted queries such as IRS or WIRS.
diff --git a/chapters/sigmod23/introduction.tex b/chapters/sigmod23/introduction.tex
new file mode 100644
index 0000000..0155c7d
--- /dev/null
+++ b/chapters/sigmod23/introduction.tex
@@ -0,0 +1,20 @@
+\section{Introduction} \label{sec:intro}
+
+As a first attempt at realizing a dynamic extension framework, one of the
+non-decomposable search problems discussed in the previous chapter was
+considered: independent range sampling, along with a number of other
+independent sampling problems. These sorts of queries are important in a
+variety of contexts, including approximate query processing
+(AQP)~\cite{blinkdb,quickr,verdict,cohen23}, interactive data
+exploration~\cite{sps,xie21}, financial audit sampling~\cite{olken-thesis}, and
+feature selection for machine learning~\cite{ml-sampling}. However, they are
+not well served using existing techniques, which tend to sacrifice statistical
+independence for performance, or vice versa. In this chapter, a solution for
+independent sampling is presented that manages to achieve both statistical
+independence and good performance, by designing a Bentley-Saxe-inspired
+framework for introducing update support to efficient static sampling data
+structures. It seeks to demonstrate the viability of Bentley-Saxe as the basis
+for adding update support to data structures, and to show that the
+limitations of the decomposable search problem abstraction can be overcome
+through alternative query processing techniques to preserve good
+performance.
diff --git a/chapters/sigmod23/relatedwork.tex b/chapters/sigmod23/relatedwork.tex
new file mode 100644
index 0000000..600cd0d
--- /dev/null
+++ b/chapters/sigmod23/relatedwork.tex
@@ -0,0 +1,33 @@
+\section{Related Work}
+\label{sec:related}
+
+The general IQS problem was first proposed by Hu, Qiao, and Tao~\cite{hu14} and
+has since been the subject of extensive research
+\cite{irsra,afshani17,xie21,aumuller20}. These papers involve the use of
+specialized indexes to assist in drawing samples efficiently from the result
+sets of specific types of query, and are largely focused on in-memory settings.
+A recent survey by Tao~\cite{tao22} acknowledged that dynamization remains a major
+challenge for efficient sampling indexes. There do exist specific examples of
+sampling indexes~\cite{hu14} designed to support dynamic updates, but they are
+specialized, and impractical due to their
+implementation complexity and high constant-factors in their cost functions. A
+static index for spatial independent range sampling~\cite{xie21} has been
+proposed with a dynamic extension similar to the one proposed in this paper, but the method was not
+generalized, and its design space was not explored. There are also
+weight-updatable implementations of the alias structure \cite{hagerup93,
+matias03, allendorf23} that function under various assumptions about the weight
+distribution. These are of limited utility in a database context as they do not
+support direct insertion or deletion of entries. Efforts have also been made to
+improve tree-traversal based sampling approaches. Notably, the AB-tree
+\cite{zhao22} extends tree-sampling with support for concurrent updates, which
+has been a historical pain point.
+
+The Bentley-Saxe method was first proposed by Saxe and Bentley~\cite{saxe79}.
+Overmars and van Leeuwen extended this framework to provide better worst-case
+bounds~\cite{overmars81}, but their approach hurts common-case performance by
+splitting reconstructions into small pieces and executing these pieces each
+time a record is inserted. Though not commonly used in database systems, the
+method has been applied to address specialized problems, such as the creation
+of dynamic metric indexing structures~\cite{naidan14}, analysis of
+trajectories~\cite{custers19}, and genetic sequence search
+indexes~\cite{almodaresi23}.