From 5ffc53e69e956054fdefd1fe193e00eee705dcab Mon Sep 17 00:00:00 2001
From: Douglas Rumbaugh
Date: Mon, 12 May 2025 19:59:26 -0400
Subject: Updates
---
 chapters/sigmod23/framework.tex | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)
(limited to 'chapters/sigmod23/framework.tex')
diff --git a/chapters/sigmod23/framework.tex b/chapters/sigmod23/framework.tex
index 89f15c3..0f3fac8 100644
--- a/chapters/sigmod23/framework.tex
+++ b/chapters/sigmod23/framework.tex
@@ -232,12 +232,13 @@ or are naturally determined as part of the pre-processing, and thus the
$W(n)$ term can be merged into $P(n)$.
\subsection{Supporting Deletes}
+\label{ssec:sampling-deletes}
As discussed in Section~\ref{ssec:background-deletes}, the Bentley-Saxe
method can support deleting records through the use of either weak
-deletes, or a secondary ghost structure, assume certain properties are
+deletes, or a secondary ghost structure, assuming certain properties are
satisfied by either the search problem or data structure. Unfortunately,
-neither approach can work as a "drop-in" solution in the context of
+neither approach can work as a ``drop-in'' solution in the context of
sampling problems, because of the way that deleted records interact
with the sampling process itself. Sampling problems, as formalized here,
are neither invertible nor deletion decomposable. In this section,
@@ -258,9 +259,9 @@ the structure with a tombstone bit set in the header. This mechanism is
used to support \emph{ghost structure} based deletes.
\end{enumerate}
-Broadly speaking, for sampling problems, tombstone deletes cause a number
-of problems because \emph{sampling problems are not invertible}. However,
-this limitation can be worked around during the query process if desired.
+Broadly speaking, for sampling problems, tombstone deletes cause a
+number of problems because \emph{sampling problems are not invertible}.
+This limitation can be worked around during the query process if desired.
Tagging is much more natural for these search problems. However, the
flexibility of selecting either option is desirable because of their
different performance characteristics.
@@ -527,8 +528,8 @@ unwieldy and are targeted at tuning the worst case at the expense of the
common case. We will take a different approach to adding configurability
to our dynamization system.
-Though it has thus far gone unmentioned, readers familiar with LSM Trees
-may have noted the astonishing similarity between decomposition-based
+Though it has thus far gone unmentioned, some readers may have
+noted the astonishing similarity between decomposition-based
dynamization techniques and a data structure called the Log-structured
Merge-tree. First proposed by O'Neil in the mid '90s~\cite{oneil96},
the LSM Tree was designed to optimize write throughput for external data
@@ -541,7 +542,7 @@ layered, external structures, to reduce the cost of reconstruction.
In more recent times, the LSM Tree has seen significant development and
been used as the basis for key-value stores like RocksDB~\cite{dong21}
-and LevelDB~\cite{leveldb}. This work as produced an incredibly large
+and LevelDB~\cite{leveldb}. This work has produced an incredibly large
and well-explored parameterization of the reconstruction procedures of
LSM Trees, a good summary of which can be found in this recent tutorial
paper~\cite{sarkar23}. Examples of this design space exploration include:
@@ -701,7 +702,7 @@ levels below it, which may require further
reconstructions to occur to make room.
The manner in which these reconstructions proceed depends on the
selected layout policy:
\begin{itemize}
-\item[\textbf{Leveling}] When a buffer flush occurs in the leveling
+\item \textbf{Leveling.} When a buffer flush occurs in the leveling
policy, the system scans the existing levels to find the first level
which has sufficient empty space to store the contents of the level
above it. More formally, if the number of records in level $i$ is $N_i$, then
@@ -711,8 +712,8 @@ empty level is added and $i$ is set to the index of this new level. Then,
a reconstruction is executed containing all of the records in levels
$i$ and $i - 1$ (where $i=-1$ indicates the temporary shard built from
the buffer). Following this reconstruction, all levels $j < i$ are shifted
-by one level.
+by one level to $j + 1$.
-\item[\textbf{Tiering}] When using tiering, the system will locate
+\item \textbf{Tiering.} When using tiering, the system will locate
the first level, $i$, containing fewer than $s$ shards. If no such
level exists, then a new empty level is added and $i$ is set to the
index of that level. Then, for each level $j < i$, a reconstruction
-- cgit v1.2.3
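
To make the two layout policies described in the final hunk concrete, the following is a minimal C++ sketch of the flush logic, written against a simplified in-memory representation. It is illustrative only: the type and function names are invented, the leveling capacity test and the placement of the buffered records after the shift are assumptions (those details fall outside the visible context lines of the patch), and the reconstruction step is reduced to concatenating record vectors.

#include <cstddef>
#include <utility>
#include <vector>

struct Record { long key; long value; };

// One immutable instance of the underlying static structure, produced by a
// reconstruction over a set of records.
struct Shard {
    std::vector<Record> records;
};

// Stand-in for a reconstruction: in the real framework this would build the
// static sampling structure; here it only gathers the records.
static Shard reconstruct(std::vector<Record> recs) {
    Shard s;
    s.records = std::move(recs);
    return s;
}

struct Level {
    std::vector<Shard> shards;   // leveling: at most one shard; tiering: up to s
    std::size_t record_count() const {
        std::size_t n = 0;
        for (const auto &sh : shards) n += sh.records.size();
        return n;
    }
};

class Dynamized {
public:
    Dynamized(std::size_t buffer_cap, std::size_t scale_factor)
        : buffer_cap_(buffer_cap), s_(scale_factor) {}

    // Leveling: find the first level i with room for the records of the level
    // above it (level "-1" being the buffer), rebuild it from levels i and
    // i-1, shift every remaining level j < i down to j + 1, and (assumed)
    // place the buffered records in the vacated level 0.
    void flush_leveling(std::vector<Record> buffer) {
        std::size_t i = 0;
        while (i < levels_.size() && !has_room(i)) i++;
        if (i == levels_.size()) levels_.emplace_back();   // add a new empty level

        std::vector<Record> merged = collect(levels_[i]);
        std::vector<Record> above = (i == 0) ? buffer : collect(levels_[i - 1]);
        merged.insert(merged.end(), above.begin(), above.end());
        levels_[i].shards.clear();
        levels_[i].shards.push_back(reconstruct(std::move(merged)));

        if (i > 0) {
            for (std::size_t j = i - 1; j > 0; j--)        // shift level j-1 into j
                levels_[j] = std::move(levels_[j - 1]);
            levels_[0].shards.clear();
            levels_[0].shards.push_back(reconstruct(std::move(buffer)));
        }
    }

    // Tiering: find the first level i holding fewer than s shards (adding a
    // new level if none exists); then, for each level j < i, rebuild its
    // shards into a single shard placed on level j + 1. The buffered records
    // become a new shard on level 0.
    void flush_tiering(std::vector<Record> buffer) {
        std::size_t i = 0;
        while (i < levels_.size() && levels_[i].shards.size() >= s_) i++;
        if (i == levels_.size()) levels_.emplace_back();

        for (std::size_t j = i; j > 0; j--) {
            levels_[j].shards.push_back(reconstruct(collect(levels_[j - 1])));
            levels_[j - 1].shards.clear();
        }
        levels_[0].shards.push_back(reconstruct(std::move(buffer)));
    }

private:
    // Assumed capacity rule: level i holds at most buffer_cap * s^(i+1)
    // records; the exact bound used by the framework is not shown above.
    bool has_room(std::size_t i) const {
        std::size_t cap = buffer_cap_;
        for (std::size_t k = 0; k <= i; k++) cap *= s_;
        std::size_t incoming = (i == 0) ? buffer_cap_ : levels_[i - 1].record_count();
        return levels_[i].record_count() + incoming <= cap;
    }

    static std::vector<Record> collect(const Level &lvl) {
        std::vector<Record> out;
        for (const auto &sh : lvl.shards)
            out.insert(out.end(), sh.records.begin(), sh.records.end());
        return out;
    }

    std::size_t buffer_cap_;
    std::size_t s_;
    std::vector<Level> levels_;
};

Roughly speaking, leveling keeps one shard per level (fewer shards for queries to visit, more rewriting per flush), while tiering accumulates up to $s$ shards per level before merging them down (less rewriting per flush, more shards to query).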
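
The delete mechanisms discussed in the ``Supporting Deletes'' hunk earlier in the patch (tombstones, i.e. ghost records inserted with a tombstone bit set in the header, and tagging, i.e. marking the targeted record in place) might be represented as in the sketch below. The field and function names are invented for illustration, and the linear scans stand in for whatever lookup the real structures provide; the intent is only to show how records deleted by either mechanism can be rejected when a sample is drawn, which is the query-time workaround the text refers to.

#include <vector>

struct Record {
    long key;
    long value;
    bool tombstone = false;   // header bit: a "ghost" record cancelling a prior insert
    bool deleted   = false;   // header bit: set in place when the record is delete-tagged
};

struct Structure {
    std::vector<Record> records;

    // Tagging: locate the record and set its delete bit in place.
    // Returns false if no matching live record is present.
    bool delete_by_tagging(long key, long value) {
        for (auto &r : records) {
            if (!r.tombstone && !r.deleted && r.key == key && r.value == value) {
                r.deleted = true;
                return true;
            }
        }
        return false;
    }

    // Tombstones: rather than touching the original record, a matching record
    // with the tombstone bit set is inserted (normally into the mutable
    // buffer, from which it migrates down like any other record).
    static Record make_tombstone(long key, long value) {
        return Record{key, value, /*tombstone=*/true, /*deleted=*/false};
    }

    // Rejection check used while sampling: a drawn record is discarded if it
    // is itself a tombstone, has been tagged, or is cancelled by a tombstone
    // elsewhere in the structure.
    bool alive(const Record &r) const {
        if (r.tombstone || r.deleted) return false;
        for (const auto &t : records)
            if (t.tombstone && t.key == r.key && t.value == r.value) return false;
        return true;
    }
};

In practice a sampling query would retry until it has drawn the required number of alive records; the relative cost of the tombstone existence check versus the in-place tag check is one source of the differing performance characteristics mentioned in the text.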