\section{Introduction} \label{sec:intro} 

As a first attempt at realizing a dynamic extension framework, this chapter
considers one of the non-decomposable search problems discussed in the previous
chapter, independent range sampling, along with a number of other
independent sampling problems. These sorts of queries are important in a
variety of contexts, including approximate query processing
(AQP)~\cite{blinkdb,quickr,verdict,cohen23}, interactive data
exploration~\cite{sps,xie21}, financial audit sampling~\cite{olken-thesis}, and
feature selection for machine learning~\cite{ml-sampling}. However, they are
not well served by existing techniques, which tend to sacrifice statistical
independence for performance, or vice versa. This chapter presents a solution
for independent sampling that achieves both statistical independence and good
performance by using a Bentley-Saxe-inspired framework to add update support
to efficient static sampling data
structures. The chapter seeks to demonstrate the viability of Bentley-Saxe as
the basis for adding update support to data structures, and to show that the
limitations of the decomposable search problem abstraction can be overcome
through alternative query processing techniques that preserve good
performance.