Julia updates v2

author: Douglas Rumbaugh <dbr4@psu.edu> 2025-06-08 15:04:00 -0400
committer: Douglas Rumbaugh <dbr4@psu.edu> 2025-06-08 15:04:00 -0400
commit: 33bc7e620276f4269ee5f1820e5477135e020b3f (patch)
tree: 03a7bb2ccbf7f1d2943871a69bca18006270bd20 /chapters/sigmod23
parent: 50adf588694170699adfa75cd2d1763263085165 (diff)
download: dissertation-33bc7e620276f4269ee5f1820e5477135e020b3f.tar.gz
2 files changed, 7 insertions, 7 deletions
diff --git a/chapters/sigmod23/background.tex b/chapters/sigmod23/background.tex
index 88f2585..42a52de 100644
--- a/chapters/sigmod23/background.tex
+++ b/chapters/sigmod23/background.tex
@@ -104,7 +104,7 @@ sampling} (WIRS),
     positive weights $w: D\to \mathbb{R}^+$. Given a query
     interval $q = [x, y]$ and an integer $k$, an independent range sampling
     query returns $k$ independent samples from $D \cap q$ with each 
-    point having a probability of $\nicefrac{w(d)}{\sum_{p \in D \cap q}w(p)}$
+    point having a probability of $\frac{w(d)}{\sum_{p \in D \cap q}w(p)}$
     of being sampled.
 \end{definition}
 
@@ -118,7 +118,7 @@ SQL's \texttt{TABLESAMPLE} operator~\cite{postgres-doc}. However, the
 algorithms used to implement this operator have significant limitations
 and do not allow users to maintain statistical independence of the results
 without also running the query to be sampled from in full. Thus, users must
-choose between independece and performance.
+choose between independence and performance.
 
 To maintain statistical independence, Bernoulli sampling is used. This
 technique requires iterating over every record in the result set of the
@@ -198,7 +198,7 @@ call static sampling indices (SSIs) in this chapter,\footnote{
   am retaining the term SSI in this chapter for consistency with the
   original paper, but understand that in the terminology established in
   Chapter~\ref{chap:background}, SSIs are data structures, not indices.
-},
+}
 that are capable of answering sampling queries more efficiently than
 Olken's method relative to the overall data size.  An example of such
 a structure is used in Walker's alias method \cite{walker74,vose91}.
diff --git a/chapters/sigmod23/framework.tex b/chapters/sigmod23/framework.tex
index d51c2cb..b3a8215 100644
--- a/chapters/sigmod23/framework.tex
+++ b/chapters/sigmod23/framework.tex
@@ -532,7 +532,7 @@ rates, buffering, sub-partitioning of structures to allow finer-grained
 reconstruction~\cite{dayan22}, and approaches for allocating resources to
 auxiliary structures attached to the main ones for accelerating certain
 types of query~\cite{dayan18-1, zhu21, monkey}. This work is discussed
-in greater depth in Chapter~\ref{chap:related-work}
+in greater depth in Chapter~\ref{chap:related-work}.
 
 Many of the elements within the LSM Tree design space are based upon the
 specifics of the data structure itself, and are not applicable to our
@@ -561,7 +561,7 @@ the case of sampling this isn't a serious problem. The implications of
 this will be discussed in Section~\ref{ssec:sampling-cost-funcs}. The
 size of this buffer, $N_B$ is a user-specified constant. Block capacities
 are defined in terms of multiples of $N_B$, such that each buffer flush
-corresponds to an insert in the traditioanl Bentley-Saxe method. Thus,
+corresponds to an insert in the traditional Bentley-Saxe method. Thus,
 rather than the $i$th block containing $2^i$ records, it contains $N_B
 \cdot 2^i$ records. We call this unsorted array the \emph{mutable buffer}.
 
@@ -750,8 +750,8 @@ operations must be used, the the cost becomes $I_a(n) \in
 \Paragraph{Delete.} The framework supports both tombstone and tagged
 deletes, each with different performance. Using tombstones, the cost
 of a delete is identical to that of an insert. When using tagging, the
-cost of a delete is the same as cost of doing a point lookup, as the
-"delete" itself is simply setting a bit in the header of the record,
+cost of a delete is the same as the cost of a point lookup, because the
+"delete" itself only sets a bit in the header of the record,
 once it has been located. There will be $\Theta(\log_s n)$ total shards
 in the structure, each with a look-up cost of $L(n)$ using either the
 SSI's native point-lookup, or an auxiliary hash table, and the lookup