1 files changed, 29 insertions, 0 deletions
diff --git a/chapters/sigmod23/abstract.tex b/chapters/sigmod23/abstract.tex
new file mode 100644
index 0000000..3ff0c08
--- /dev/null
+++ b/chapters/sigmod23/abstract.tex
@@ -0,0 +1,29 @@
+\begin{abstract}
+
+    The execution of analytical queries on massive datasets presents challenges
+    due to long response times and high computational costs. As a result, the
+    analysis of representative samples of data has emerged as an attractive
+    alternative; this avoids the cost of processing queries against the entire
+    dataset, while still producing statistically valid results. Unfortunately,
+    the sampling techniques in common use sacrifice either sample quality or
+    performance, and so are poorly suited for this task. However, it is
+    possible to build high quality sample sets efficiently with the assistance
+    of indexes. This introduces a new challenge: real-world data is subject to
+    continuous update, and so the indexes must be kept up to date. This is
+    difficult, because existing sampling indexes present a dichotomy; efficient
+    sampling indexes are difficult to update, while easily updatable indexes
+    have poor sampling performance. This paper seeks to address this gap by
+    proposing a general and practical framework for extending most sampling
+    indexes with efficient update support, based on splitting indexes into
+    smaller shards, combined with a systematic approach to the periodic
+    reconstruction. The framework's design space is examined, with an eye
+    towards exploring trade-offs between update performance, sampling
+    performance, and memory usage. Three existing static sampling indexes are
+    extended using this framework to support updates, and the generalization of
+    the framework to concurrent operations and larger-than-memory data is
+    discussed. Through a comprehensive suite of benchmarks, the extended
+    indexes are shown to match or exceed the update throughput of
+    state-of-the-art dynamic baselines, while presenting significant
+    improvements in sampling latency.
+
+\end{abstract}