Diffstat (limited to 'chapters/beyond-dsp.tex')
 chapters/beyond-dsp.tex | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/chapters/beyond-dsp.tex b/chapters/beyond-dsp.tex
index 74afdd2..5655b8c 100644
--- a/chapters/beyond-dsp.tex
+++ b/chapters/beyond-dsp.tex
@@ -1664,10 +1664,10 @@ compaction is triggered.
We configured our dynamized structure to use $s=8$, $N_B=12000$, $\delta
= .05$, $f = 16$, and the tiering layout policy. We compared our method
-(\textbf{DE-IRS}) to Olken's method~\cite{olken89} on a B+Tree with
+(\textbf{DE-IRS}) to Olken's method~\cite{olken89} on a B+tree with
aggregate weight counts (\textbf{AGG B+Tree}), as well as our bespoke
sampling solution from the previous chapter (\textbf{Bespoke}) and a
-single static instance of the ISAM Tree (\textbf{ISAM}). Because IRS
+single static instance of the ISAM tree (\textbf{ISAM}). Because IRS
is neither INV nor DDSP, the standard Bentley-Saxe Method has no way to
support deletes for it, and was not tested. All of our tested sampling
queries had a controlled selectivity of $\sigma = 0.01\%$ and $k=1000$.
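For concreteness, the configuration above amounts to a handful of tunable parameters. The sketch below is one hypothetical way to record them; it is purely illustrative and not the framework's actual interface, every identifier is an assumption, and the role of f is not restated here.

#include <cstddef>

// Illustrative sketch only: a hypothetical container for the parameter
// values reported in the text. None of these names come from the
// evaluated framework, and the meaning of f is not specified here.
enum class LayoutPolicy { Leveling, Tiering };

struct DynamizedIRSConfig {
    std::size_t  scale_factor        = 8;      // s
    std::size_t  buffer_capacity     = 12000;  // N_B, records in the mutable buffer
    double       max_delete_fraction = 0.05;   // delta
    std::size_t  f                   = 16;     // f (assumed tuning knob)
    LayoutPolicy layout              = LayoutPolicy::Tiering;
};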
@@ -1692,7 +1692,7 @@ the dynamic baseline.
Finally, Figure~\ref{fig:irs-space} shows the space usage of the
data structures, less the storage required for the raw data. The two
dynamized solutions require \emph{significantly} less storage than the
-dynamic B+Tree, which must leave empty spaces in its nodes for inserts.
+dynamic B+tree, which must leave empty spaces in its nodes for inserts.
This is a significant advantage of static data structures: they can pack
data much more tightly and require less storage. Dynamization, at least
in this case, doesn't add a significant amount of overhead over a single
@@ -1701,7 +1701,7 @@ instance of the static structure.
\subsection{$k$-NN Search}
\label{ssec:dyn-knn-exp}
Next, we'll consider answering high dimensional exact $k$-NN queries
-using a static Vantage Point Tree (VPTree)~\cite{vptree}. This is a
+using a static vantage point tree (VPTree)~\cite{vptree}. This is a
binary search tree with internal nodes that partition records based
on their distance to a selected point, called the vantage point. All
of the points within a fixed distance of the vantage point are covered
@@ -1746,10 +1746,10 @@ standard DDSP, we compare with the Bentley-Saxe Method (\textbf{BSM})\footnote{
be deleted in $\Theta(1)$ time, rather than requiring an inefficient
point-lookup directly on the VPTree.
} and a dynamic data structure for the same search problem called an
-M-Tree~\cite{mtree,mtree-impl} (\textbf{MTree}), which is an example of a so-called
+M-tree~\cite{mtree,mtree-impl} (\textbf{MTree}), which is an example of a so-called
"ball tree" structure that partitions high dimensional space using nodes
representing spheres, which are merged and split to maintain balance in
-a manner not unlike a B+Tree. We also consider a static instance of a
+a manner not unlike a B+tree. We also consider a static instance of a
VPTree built over the same set of records (\textbf{VPTree}). We used
L2 distance as our metric, which is defined for vectors of $d$
dimensions as
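% Standard Euclidean (L2) definition, reproduced here for reference; this
% is an assumption about the display, which falls outside this hunk.
\begin{equation*}
\| x - y \|_2 = \sqrt{\sum_{i=1}^{d} \left(x_i - y_i\right)^2}
\end{equation*}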
@@ -1784,7 +1784,7 @@ which are biased towards better insertion performance. Both dynamized
structures also outperform the dynamic baseline. Finally, as is becoming
a trend, Figure~\ref{fig:knn-space} shows that the storage requirements
of the static data structures, dynamized or not, are significantly less
-than M-Tree. M-Tree, like a B+Tree, requires leaving empty slots in its
+than those of the M-tree. The M-tree, like a B+tree, requires leaving empty slots in its
nodes to support insertion, and this results in a large amount of wasted
space.
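To make the vantage-point partitioning described above concrete, the following is a minimal sketch of the textbook construction: pick a vantage point, compute the median distance from it to the remaining points, and send the near half to one child and the far half to the other. It is only an illustration under those standard assumptions, not the VPTree implementation that was benchmarked.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <memory>
#include <vector>

using Point = std::vector<double>;

// Euclidean (L2) distance between two points of equal dimension.
static double l2(const Point &a, const Point &b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); i++) {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

struct VPNode {
    Point vantage;                    // the selected vantage point
    double radius = 0.0;              // median distance to the vantage point
    std::unique_ptr<VPNode> inside;   // near half: distance <= radius
    std::unique_ptr<VPNode> outside;  // far half: distance >= radius
};

// Recursively choose a vantage point (naively, the first point) and split
// the remaining points at their median distance from it.
static std::unique_ptr<VPNode> build_vptree(std::vector<Point> pts) {
    if (pts.empty()) return nullptr;

    auto node = std::make_unique<VPNode>();
    node->vantage = pts.front();
    std::vector<Point> rest(pts.begin() + 1, pts.end());
    if (rest.empty()) return node;

    auto mid = rest.begin() + rest.size() / 2;
    std::nth_element(rest.begin(), mid, rest.end(),
                     [&](const Point &a, const Point &b) {
                         return l2(a, node->vantage) < l2(b, node->vantage);
                     });
    node->radius  = l2(*mid, node->vantage);
    node->inside  = build_vptree(std::vector<Point>(rest.begin(), mid));
    node->outside = build_vptree(std::vector<Point>(mid, rest.end()));
    return node;
}

A $k$-NN search over such a tree only needs to descend both children of a node when the query's current search ball straddles that node's radius; otherwise one side can be pruned entirely, which is what makes the static structure effective.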
@@ -1810,7 +1810,7 @@ We apply our framework to create dynamized versions of two static learned
indices: Triespline~\cite{plex} (\textbf{DE-TS}) and PGM~\cite{pgm}
(\textbf{DE-PGM}), and compare with a standard Bentley-Saxe dynamization of
Triespline (\textbf{BSM-TS}). Our dynamic baselines are ALEX~\cite{alex},
-which is dynamic learned index based on a B+Tree like structure, and
+which is a dynamic learned index based on a B+tree-like structure, and
PGM (\textbf{PGM}), which provides support for a dynamic version based
on Bentley-Saxe dynamization (which is why we have not included a BSM
version of PGM in our testing).
@@ -1885,7 +1885,7 @@ support does in its own update-optimized configuration.\footnote{
these data structures. All of the dynamic options require significantly
more space than the static Triespline, but ALEX requires the most by a
very large margin. This is in keeping with the previous experiments, which
-all included similarly B+Tree-like structures that required significant
+all included similarly B+tree-like structures that required significant
additional storage space compared to static structures as part of their
update support.
@@ -1966,7 +1966,7 @@ this test.
In this benchmark, we used a single thread to insert records
into the structure at a constant rate, while we deployed a variable
number of additional threads that continuously issued sampling queries
-against the structure. We used an AGG B+Tree as our baseline. Note
+against the structure. We used an AGG B+tree as our baseline. Note
that, to accurately maintain the aggregate weight counts as records
are inserted, it is necessary that each operation obtain a lock on
the root node of the tree~\cite{zhao22}. This makes this situation
@@ -1974,7 +1974,7 @@ a good use-case for the automatic concurrency support provided by our
framework. Figure~\ref{fig:irs-concurrency} shows the results of this
benchmark for various numbers of concurrent query threads. As can be seen,
our framework supports a stable update throughput up to 32 query threads,
-whereas the AGG B+Tree suffers from contention for the mutex and sees
+whereas the AGG B+tree suffers from contention for the mutex and sees
its performance degrade as the number of threads increases.
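As a rough illustration of the contention described above: every insert into the aggregate B+tree must adjust the weight totals stored along its root-to-leaf path, so the root's counter is written by every update and read by every weighted sample. The sketch below models that behavior with a single latch; it is an assumption-laden illustration, not the benchmarked AGG B+tree implementation.

#include <mutex>
#include <vector>

// Illustrative model only. Each node stores the total weight of its
// subtree; inserting a record must bump that total on every node along
// the root-to-leaf path, so the root latch is taken by every operation.
struct AggNode {
    double subtree_weight = 0.0;
    std::vector<AggNode*> children;   // empty at the leaves
};

struct AggBPlusTree {
    AggNode root;
    std::mutex root_latch;   // the contention point in the benchmark

    // path is the root-to-leaf path the new record descends (starting at
    // &root); weight is the record's sampling weight.
    void insert_weight(const std::vector<AggNode*> &path, double weight) {
        std::lock_guard<std::mutex> guard(root_latch);
        for (AggNode *node : path) {
            node->subtree_weight += weight;   // includes the root itself
        }
    }

    // Weighted sampling reads the same totals, so it must also hold the
    // latch to see a consistent set of counts.
    double total_weight() {
        std::lock_guard<std::mutex> guard(root_latch);
        return root.subtree_weight;
    }
};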
\begin{figure}