Updates

author: Douglas Rumbaugh <dbr4@psu.edu> 2025-05-13 17:29:40 -0400
committer: Douglas Rumbaugh <dbr4@psu.edu> 2025-05-13 17:29:40 -0400
commit: 40bff24fc2e2da57f382e4f49a5ffb7c826bbcfb (patch)
tree: c00441b058255de08a32d227ce7af46bf11d8eb8 /chapters/sigmod23/background.tex
parent: 5ffc53e69e956054fdefd1fe193e00eee705dcab (diff)
download: dissertation-40bff24fc2e2da57f382e4f49a5ffb7c826bbcfb.tar.gz
1 files changed, 6 insertions, 6 deletions
diff --git a/chapters/sigmod23/background.tex b/chapters/sigmod23/background.tex
index b4ccbf1..af3b80a 100644
--- a/chapters/sigmod23/background.tex
+++ b/chapters/sigmod23/background.tex
@@ -37,12 +37,12 @@ have \emph{statistical independence} and for the distribution of records
 in the sample set to match the distribution of source data set. This
 requires that the sampling of a record does not affect the probability of
 any other record being sampled in the future. Such sample sets are said
-to be drawn i.i.d (idendepently and identically distributed). Throughout
+to be drawn i.i.d (independently and identically distributed). Throughout
 this chapter, the term "independent" will be used to describe both
 statistical independence, and identical distribution.
 
 Independence of sample sets is important because many useful statistical
-results are derived from assumping that the condition holds. For example,
+results are derived from assuming that the condition holds. For example,
 it is a requirement for the application of statistical tools such as
 the Central Limit Theorem~\cite{bulmer79}, which is the basis for many
 concentration bounds.  A failure to maintain independence in sampling
@@ -54,7 +54,7 @@ sampling} (IQS)~\cite{hu14}. In IQS, a sample set is constructed from a
 specified number of records in the result set of a database query. In
 this context, it isn't enough to ensure that individual records are
 sampled independently; the sample sets from repeated queries must also be
-indepedent. This precludes, for example, caching and returning the same
+independent. This precludes, for example, caching and returning the same
 sample set to multiple repetitions of the same query. This inter-query
 independence provides a variety of useful properties, such as fairness
 and representativeness of query results~\cite{tao22}.
@@ -194,7 +194,7 @@ call static sampling indices (SSIs) in this chapter,\footnote{
   is based, which was published prior to our realization that a strong
   distinction between an index and a data structure would be useful. I
   am retaining the term SSI in this chapter for consistency with the
-  original paper, but understand that in the termonology established in
+  original paper, but understand that in the terminology established in
   Chapter~\ref{chap:background}, SSIs are data structures, not indices.
 },
 that are capable of answering sampling queries more efficiently than
@@ -216,7 +216,7 @@ per sample.  Thus, a WSS query can be answered in $\Theta(k)$ time,
 assuming the structure has already been built. Unfortunately, the alias
 structure cannot be efficiently updated, as inserting new records would
 change the relative weights of \emph{all} the records, and require fully
-repartitioning the structure.
+re-partitioning the structure.
 
 While the alias method only applies to WSS, other sampling problems can
 be solved by using the alias method within the context of a larger data
@@ -245,7 +245,7 @@ the alias structure with support for weight updates over a fixed set of
 elements~\cite{hagerup93,matias03,allendorf23}. These approaches do not
 allow the insertion or removal of new records, however, only in-place
 weight updates. While in principle they could be constructed over the
-entire domain of possible records, with the weights of non-existant
+entire domain of possible records, with the weights of non-existent
 records set to $0$, this is hardly practical. Thus, these structures are
 not suited for the database sampling applications that are of interest to
 us in this chapter.