author     Douglas Rumbaugh <dbr4@psu.edu>   2025-05-12 19:59:26 -0400
committer  Douglas Rumbaugh <dbr4@psu.edu>   2025-05-12 19:59:26 -0400
commit     5ffc53e69e956054fdefd1fe193e00eee705dcab (patch)
tree       74fd32db95211d0be067d22919e65ac959e4fa46 /chapters/sigmod23/framework.tex
parent     901a04fd8ec9a07b7bd195517a6d9e89da3ecab6 (diff)
Updates
Diffstat (limited to 'chapters/sigmod23/framework.tex')
-rw-r--r--  chapters/sigmod23/framework.tex  23
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/chapters/sigmod23/framework.tex b/chapters/sigmod23/framework.tex
index 89f15c3..0f3fac8 100644
--- a/chapters/sigmod23/framework.tex
+++ b/chapters/sigmod23/framework.tex
@@ -232,12 +232,13 @@ or are naturally determined as part of the pre-processing, and thus the
$W(n)$ term can be merged into $P(n)$.
\subsection{Supporting Deletes}
+\label{ssec:sampling-deletes}
As discussed in Section~\ref{ssec:background-deletes}, the Bentley-Saxe
method can support deleting records through the use of either weak
-deletes, or a secondary ghost structure, assume certain properties are
+deletes, or a secondary ghost structure, assuming certain properties are
satisfied by either the search problem or the data structure. Unfortunately,
-neither approach can work as a "drop-in" solution in the context of
+neither approach can work as a ``drop-in'' solution in the context of
sampling problems, because of the way that deleted records interact with
the sampling process itself. Sampling problems, as formalized here,
are neither invertible nor deletion decomposable. In this section,
@@ -258,9 +259,9 @@ the structure with a tombstone bit set in the header. This mechanism is
used to support \emph{ghost structure} based deletes.
\end{enumerate}
-Broadly speaking, for sampling problems, tombstone deletes cause a number
-of problems because \emph{sampling problems are not invertible}. However,
-this limitation can be worked around during the query process if desired.
+Broadly speaking, tombstone deletes are problematic for sampling
+because \emph{sampling problems are not invertible}. This limitation
+can be worked around during the query process if desired.
Tagging is much more natural for these search problems. However, the
flexibility of selecting either option is desirable because of their
different performance characteristics.
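
To make the two delete mechanisms concrete, the following is a minimal C++ sketch, assuming a simple key-value record with a small header word and a flat vector standing in for a static sampling structure; the names here (Record, Shard, tombstone_delete, tagged_delete) are illustrative, not the framework's actual interface.

#include <cstdint>
#include <vector>

// Sketch only: a record with a small header word. Bit 0 marks a
// tombstone (ghost-structure deletes); bit 1 marks a delete tag.
struct Record {
    uint64_t key;
    uint64_t value;
    uint32_t header = 0;

    bool is_tombstone() const { return header & 1u; }
    void set_tombstone()      { header |= 1u; }
    bool is_tagged() const    { return header & 2u; }
    void set_delete_tag()     { header |= 2u; }
};

// A flat vector stands in for a static sampling structure.
struct Shard {
    std::vector<Record> records;

    // Tombstone delete: append a matching record with the tombstone
    // bit set; the record/tombstone pair cancels out during a later
    // reconstruction, and queries must reject sampled records that
    // have a matching tombstone.
    void tombstone_delete(uint64_t key, uint64_t value) {
        Record t{key, value};
        t.set_tombstone();
        records.push_back(t);
    }

    // Tagging delete: locate the record in place and mark it, so the
    // sampling routine can skip tagged records immediately.
    bool tagged_delete(uint64_t key, uint64_t value) {
        for (auto &r : records) {
            if (r.key == key && r.value == value &&
                !r.is_tombstone() && !r.is_tagged()) {
                r.set_delete_tag();
                return true;
            }
        }
        return false;
    }
};

Under tagging, a sampled record can be checked for its tag bit directly; under tombstones, a sampled record must instead be checked against the structure for a matching tombstone, which is one reason tagging is the more natural fit noted above.
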
@@ -527,8 +528,8 @@ unwieldy and are targeted at tuning the worst case at the expense of the
common case. We will take a different approach to adding configurability
to our dynamization system.
-Though it has thus far gone unmentioned, readers familiar with LSM Trees
-may have noted the astonishing similarity between decomposition-based
+Though it has thus far gone unmentioned, some readers may have
+noted the astonishing similarity between decomposition-based
dynamization techniques and a data structure called the Log-structured
Merge-tree. First proposed by O'Neil in the mid '90s\cite{oneil96},
the LSM Tree was designed to optimize write throughput for external data
@@ -541,7 +542,7 @@ layered, external structures, to reduce the cost of reconstruction.
In more recent times, the LSM Tree has seen significant development and
been used as the basis for key-value stores like RocksDB~\cite{dong21}
-and LevelDB~\cite{leveldb}. This work as produced an incredibly large
+and LevelDB~\cite{leveldb}. This work has produced an incredibly large
and well-explored parameterization of the reconstruction procedures of
LSM Trees, a good summary of which can be found in this recent tutorial
paper~\cite{sarkar23}. Examples of this design space exploration include:
@@ -701,7 +702,7 @@ levels below it, which may require further reconstructions to occur to
make room. The manner in which these reconstructions proceed follows the
selection of layout policy:
\begin{itemize}
-\item[\textbf{Leveling}] When a buffer flush occurs in the leveling
+\item \textbf{Leveling.} When a buffer flush occurs in the leveling
policy, the system scans the existing levels to find the first level
which has sufficient empty space to store the contents of the level above
it. More formally, if the number of records in level $i$ is $N_i$, then
@@ -711,8 +712,8 @@ empty level is added and $i$ is set to the index of this new level. Then,
a reconstruction is executed containing all of the records in levels $i$
and $i - 1$ (where level $-1$ indicates the temporary shard built from the
buffer). Following this reconstruction, all levels $j < i$ are shifted
-by one level.
-\item[\textbf{Tiering}] When using tiering, the system will locate
+by one level to $j + 1$.
+\item \textbf{Tiering.} When using tiering, the system will locate
the first level, $i$, containing fewer than $s$ shards. If no such
level exists, then a new empty level is added and $i$ is set to the
index of that level. Then, for each level $j < i$, a reconstruction
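
The tiering item above is cut off by the hunk boundary, so the cascade in the sketch below follows the standard tiering formulation. This is a minimal C++ sketch of both layout policies, assuming a scale factor s, shards reduced to bare record counts, and a level capacity of buffer_cap * s^(i+1); every name here (Structure, flush_leveling, flush_tiering, level_cap) is illustrative rather than the framework's actual API.

#include <cstddef>
#include <numeric>
#include <vector>

using Shard = std::size_t;            // a shard, reduced to its record count
using Level = std::vector<Shard>;

struct Structure {
    std::vector<Level> levels;
    std::size_t buffer_cap;           // records per buffer flush
    std::size_t s;                    // scale factor

    // Assumed capacity schedule: level i holds buffer_cap * s^(i+1) records.
    std::size_t level_cap(std::size_t i) const {
        std::size_t c = buffer_cap;
        for (std::size_t k = 0; k <= i; ++k) c *= s;
        return c;
    }

    static std::size_t records(const Level &l) {
        return std::accumulate(l.begin(), l.end(), std::size_t{0});
    }

    // Leveling: find the first level with room for the contents of the
    // level above it (the buffer, when i == 0), merge downward, then
    // shift every level j < i to j + 1 and seat the buffer at level 0.
    void flush_leveling(Shard buffer_shard) {
        std::size_t i = 0;
        while (i < levels.size() &&
               records(levels[i]) +
                   (i == 0 ? buffer_shard : records(levels[i - 1])) >
                   level_cap(i)) {
            ++i;
        }
        if (i == levels.size()) levels.emplace_back();  // add empty level
        Shard merged = records(levels[i]) +
                       (i == 0 ? buffer_shard : records(levels[i - 1]));
        levels[i] = {merged};                           // reconstruction
        for (std::size_t j = i; j > 1; --j)             // shift j < i deeper
            levels[j - 1] = levels[j - 2];
        if (i > 0) levels[0] = {buffer_shard};
    }

    // Tiering: find the first level with fewer than s shards; each full
    // level above it is reconstructed into a single shard appended to
    // the level beneath it, and the buffer shard lands on level 0.
    void flush_tiering(Shard buffer_shard) {
        std::size_t i = 0;
        while (i < levels.size() && levels[i].size() >= s) ++i;
        if (i == levels.size()) levels.emplace_back();  // add empty level
        for (std::size_t j = i; j > 0; --j) {
            levels[j].push_back(records(levels[j - 1]));
            levels[j - 1].clear();
        }
        levels[0].push_back(buffer_shard);
    }
};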