author     Douglas Rumbaugh <dbr4@psu.edu>  2025-05-13 17:29:40 -0400
committer  Douglas Rumbaugh <dbr4@psu.edu>  2025-05-13 17:29:40 -0400
commit     40bff24fc2e2da57f382e4f49a5ffb7c826bbcfb
tree       c00441b058255de08a32d227ce7af46bf11d8eb8
parent     5ffc53e69e956054fdefd1fe193e00eee705dcab
Updates
Diffstat (limited to 'chapters/sigmod23/exp-baseline.tex')
 chapters/sigmod23/exp-baseline.tex | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/chapters/sigmod23/exp-baseline.tex b/chapters/sigmod23/exp-baseline.tex
index da62766..5585c36 100644
--- a/chapters/sigmod23/exp-baseline.tex
+++ b/chapters/sigmod23/exp-baseline.tex
@@ -5,7 +5,7 @@ Olken's method on an aggregate B+Tree. We also examine the query performance
of a single instance of the SSI in question to establish how much query
performance is lost in the dynamization. Unless otherwise specified,
IRS and WIRS queries are run with a selectivity of $0.1\%$. Additionally,
-the \texttt{OSM} dataset was downsampled to 500 million records, except
+the \texttt{OSM} dataset was down-sampled to 500 million records, except
for scalability tests. The synthetic uniform and Zipfian datasets were
generated with 1 billion records. As with the previous section, all
benchmarks began by warming up the structure with $10\%$ of the total
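As context for the selectivity figure above, an IRS (independent range sampling) query draws records uniformly at random from within a key range; a selectivity of $0.1\%$ means the range covers 0.1% of the dataset. The following is a minimal Python sketch of the query semantics only, not the chapter's implementation; the function name and interface are illustrative.

```python
import bisect
import random

def irs_sample(sorted_keys, lo, hi, k, rng=random):
    """Independent range sampling: draw k records uniformly at random,
    with replacement, from those whose keys fall in [lo, hi)."""
    left = bisect.bisect_left(sorted_keys, lo)
    right = bisect.bisect_left(sorted_keys, hi)
    if left >= right:
        return []  # empty range: nothing to sample
    return [sorted_keys[rng.randrange(left, right)] for _ in range(k)]

# A 0.1% selectivity query over 1,000 keys targets a 1-key range.
keys = list(range(1000))
print(irs_sample(keys, 100, 101, 5))  # -> [100, 100, 100, 100, 100]
```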
@@ -50,13 +50,13 @@ resulting in better performance.
\end{figure*}
In Figures~\ref{fig:wirs-insert} and \ref{fig:wirs-sample} we examine
-the performance of \texttt{DE-WIRS} compared to \texttt{AGG B+TreE} and an
+the performance of \texttt{DE-WIRS} compared to \texttt{AGG B+tree} and an
alias-augmented B+Tree. We see the same basic set of patterns in this
case as we did with WSS. In terms of insertion performance, \texttt{AGG
B+Tree} defeats our dynamized index on the \texttt{twitter} dataset but
loses on the others. We can see that the alias-augmented
B+Tree is much more expensive to build than an alias structure, and
-so its insertion performance advantage is erroded somewhat compared to
+so its insertion performance advantage is eroded somewhat compared to
the dynamic structure. For queries we see that the \texttt{AGG B+Tree}
performs similarly for WIRS sampling as it did for WSS sampling, but the
alias-augmented B+Tree structure is quite a bit slower at WIRS than the
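The alias structure discussed in this hunk is Walker's alias method: it pays an O(n) construction cost up front in exchange for O(1) weighted samples, which is the trade-off that erodes the alias-augmented B+Tree's insertion advantage. A minimal Python sketch of the technique, assuming Vose's construction (illustrative only, not the chapter's implementation):

```python
import random

def build_alias(weights):
    """Vose's O(n) construction of Walker's alias table."""
    n = len(weights)
    total = sum(weights)
    prob = [w * n / total for w in weights]  # scaled to mean 1.0
    alias = [0] * n
    small = [i for i, p in enumerate(prob) if p < 1.0]
    large = [i for i, p in enumerate(prob) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        alias[s] = l                      # overflow from slot s goes to l
        prob[l] -= (1.0 - prob[s])        # l donates mass to fill slot s
        (small if prob[l] < 1.0 else large).append(l)
    while large:
        prob[large.pop()] = 1.0           # absorb float round-off
    while small:
        prob[small.pop()] = 1.0
    return prob, alias

def alias_sample(prob, alias, rng=random):
    """O(1) weighted sample: pick a slot, then accept it or its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

For WIRS, a structure along these lines is embedded per node of the augmented tree, which is why its construction cost dominates updates.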
@@ -82,7 +82,7 @@ being introduced by the dynamization.
We next considered IRS queries. Figures~\ref{fig:irs-insert1} and
\ref{fig:irs-sample1} show the results of our testing of single-threaded
\texttt{DE-IRS} running in-memory against the in-memory ISAM Tree and
-\texttt{AGG B+treE}. The ISAM tree structure can be efficiently bulk-loaded,
+\texttt{AGG B+tree}. The ISAM tree structure can be efficiently bulk-loaded,
which results in a much faster construction time than the alias structure
or alias-augmented B+tree. This gives it a significant update performance
advantage, and we see in Figure~\ref{fig:irs-insert1} that \texttt{DE-IRS}
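The bulk-loading advantage mentioned in this hunk comes from the ISAM tree being static: sorted input can be turned into a complete tree in one bottom-up O(n) pass, with no rebalancing. The sketch below illustrates the idea under assumed simplifications (a tiny fanout, arrays of keys rather than pages); names are illustrative, not the chapter's code.

```python
import bisect

FANOUT = 4  # tiny fanout so the example tree has several levels

def bulk_load_isam(sorted_keys, fanout=FANOUT):
    """Build a static ISAM tree bottom-up in one O(n) pass over sorted
    data: each internal level keeps the first key of each node below."""
    levels = [list(sorted_keys)]
    while len(levels[-1]) > fanout:
        below = levels[-1]
        levels.append([below[i] for i in range(0, len(below), fanout)])
    return levels  # levels[0] = leaf entries, levels[-1] = root

def isam_find(levels, key, fanout=FANOUT):
    """Walk from the root down to the leaf node that could hold key."""
    idx = 0  # node index within the current level
    for depth in range(len(levels) - 1, 0, -1):
        node = levels[depth][idx * fanout:(idx + 1) * fanout]
        pos = max(bisect.bisect_right(node, key) - 1, 0)
        idx = idx * fanout + pos  # child node index one level down
    leaf = levels[0][idx * fanout:(idx + 1) * fanout]
    return key in leaf
```

Because every node is written exactly once and already full, construction is far cheaper than inserting the same records one at a time into a B+Tree.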
@@ -96,7 +96,7 @@ the performance differences.
We also consider the scalability of inserts, queries, and deletes, of
\texttt{DE-IRS} compared to \texttt{AGG B+tree} across a wide range of
data sizes. Figure~\ref{fig:irs-insert-s} shows that \texttt{DE-IRS}'s
-insertion performance scales similarly with datasize as the baseline, and
+insertion performance scales similarly with data size as the baseline, and
Figure~\ref{fig:irs-sample-s} tells a similar story for query performance.
Figure~\ref{fig:irs-delete-s} compares the delete performance of the
two structures, where \texttt{DE-IRS} is configured to use tagging. As
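Tagging, as used in the delete comparison above, handles a delete by locating the record in place and marking it, rather than removing it; samples that land on a tagged record are rejected and retried. A toy Python sketch of this rejection scheme (class and method names are hypothetical, not the DE-IRS code):

```python
import random

class TaggedSampler:
    """A static run supporting deletes by tagging: deletes set a
    visibility bit, and sampling rejects tagged records."""
    def __init__(self, sorted_keys):
        self.keys = list(sorted_keys)
        self.dead = [False] * len(self.keys)
        self.pos = {k: i for i, k in enumerate(self.keys)}

    def delete(self, key):
        self.dead[self.pos[key]] = True  # tag, don't remove

    def sample(self, rng=random):
        while True:  # rejection loop: retry until a live record is hit
            i = rng.randrange(len(self.keys))
            if not self.dead[i]:
                return self.keys[i]
```

The cost of a delete is thus a point lookup, while the cost of rejections grows with the fraction of tagged records, which is why compacting deleted records away during reconstruction matters.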
@@ -110,7 +110,7 @@ the B+tree is superior to \texttt{DE-IRS} because of the cost of the
preliminary processing that our dynamized structure must do to begin
to answer queries. However, as the sample set size increases, this cost
increasingly begins to pay off, with \texttt{DE-IRS} quickly defeating
-the dynamic structure in averge per-sample latency. One other interesting
+the dynamic structure in average per-sample latency. One other interesting
note is the performance of the static ISAM tree, which begins on-par with
the B+Tree, but also sees an improvement as the sample set size increases.
This is because of cache effects. During the initial tree traversal, both