author     Douglas Rumbaugh <dbr4@psu.edu>    2025-05-30 21:31:31 -0400
committer  Douglas Rumbaugh <dbr4@psu.edu>    2025-05-30 21:31:31 -0400
commit     3df3d11f71073419ea05fd66bc77c0d9474ca4ce (patch)
tree       216a977bcee6f7a8b220dd7fbe48843d39878cd4 /chapters/dynamization.tex
parent     6bbc26424eae2d8069de716e7c685a4188d923b9 (diff)
Updates
Diffstat (limited to 'chapters/dynamization.tex')
-rw-r--r--  chapters/dynamization.tex  74
1 file changed, 66 insertions(+), 8 deletions(-)
diff --git a/chapters/dynamization.tex b/chapters/dynamization.tex
index 2301537..fce0d9f 100644
--- a/chapters/dynamization.tex
+++ b/chapters/dynamization.tex
@@ -366,11 +366,12 @@ for brevity and streamline some of the original notation (full details
can be found in~\cite{overmars83}), but this technique ultimately
results in a data structure with the following performance characteristics,
\begin{align*}
-\text{Amortized Insertion Cost:}&\quad \Theta\left(\frac{B(n)}{n} + B\left(\frac{n}{f(n)}\right)\right) \\
-\text{Worst-case Query Cost:}& \quad \Theta\left(f(n) \cdot \mathscr{Q}\left(\frac{n}{f(n)}\right)\right) \\
+\text{Amortized Insertion Cost:}&\quad I_A(n) \in \Theta\left(\frac{B(n)}{n} + B\left(\frac{n}{f(n)}\right)\right) \\
+\text{Worst-case Insertion Cost:}&\quad I(n) \in \Theta\left(B(n)\right) \\
+\text{Worst-case Query Cost:}& \quad \mathscr{Q}(n) \in \Theta\left(f(n) \cdot \mathscr{Q}_S\left(\frac{n}{f(n)}\right)\right) \\
\end{align*}
where $B(n)$ is the cost of statically building $\mathcal{I}$, and
-$\mathscr{Q}(n)$ is the cost of answering $F$ using $\mathcal{I}$.
+$\mathscr{Q}_S(n)$ is the cost of answering $F$ using $\mathcal{I}$.
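+
+For a concrete (and purely illustrative) instantiation, consider
+applying this method, with $f(n) = \sqrt{n}$, to a sorted array, for
+which $B(n) \in \Theta(n \log n)$ and $\mathscr{Q}_S(n) \in
+\Theta(\log n)$. This yields
+\begin{align*}
+I_A(n) &\in \Theta\left(\log n + \sqrt{n}\log\sqrt{n}\right) = \Theta\left(\sqrt{n} \log n\right) \\
+\mathscr{Q}(n) &\in \Theta\left(\sqrt{n}\log\sqrt{n}\right) = \Theta\left(\sqrt{n} \log n\right) \\
+\end{align*}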
@@ -431,9 +432,9 @@ structure in the same way that incrementing the binary number by $1$ does.
By applying BSM to a data structure, a dynamized structure can be created
with the following performance characteristics,
\begin{align*}
-\text{Amortized Insertion Cost:}&\quad \Theta\left(\left(\frac{B(n)}{n}\cdot \log_2 n\right)\right) \\
-\text{Worst Case Insertion Cost:}&\quad \Theta\left(B(n)\right) \\
-\text{Worst-case Query Cost:}& \quad \Theta\left(\log_2 n\cdot \mathscr{Q}\left(n\right)\right) \\
+\text{Amortized Insertion Cost:}&\quad I_A(n) \in \Theta\left(\frac{B(n)}{n}\cdot \log_2 n\right) \\
+\text{Worst-case Insertion Cost:}&\quad I(n) \in \Theta\left(B(n)\right) \\
+\text{Worst-case Query Cost:}& \quad \mathscr{Q}(n) \in \Theta\left(\log_2 n\cdot \mathscr{Q}_S\left(n\right)\right) \\
\end{align*}
This is a particularly attractive result because, for example, a data
structure having $B(n) \in \Theta(n)$ will have an amortized insertion
@@ -459,10 +460,66 @@ individually. More formally, for any query running in $\mathscr{Q}_S(n) \in
cost of answering a decomposable search problem from a BSM dynamization
is $\Theta\left(\mathscr{Q}_S(n)\right)$.~\cite{saxe79}
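+
+To make the mechanism concrete, the following minimal sketch (in
+Python, and not drawn from the original sources) dynamizes a static
+sorted array using this method. Membership testing is a decomposable
+search problem, so a query probes every non-empty block and combines
+the results with a logical or; the $\mathtt{build}$ step is modeled
+here by sorting,
+\begin{verbatim}
+import bisect
+
+class BentleySaxe:
+    def __init__(self):
+        # levels[i] is None or a sorted list of exactly 2**i records
+        self.levels = []
+
+    def insert(self, record):
+        carry = [record]
+        i = 0
+        # Merge full levels upward, exactly as incrementing a binary
+        # counter by one carries through a run of set bits.
+        while i < len(self.levels) and self.levels[i] is not None:
+            carry = sorted(carry + self.levels[i])  # the "build" step
+            self.levels[i] = None
+            i += 1
+        if i == len(self.levels):
+            self.levels.append(None)
+        self.levels[i] = carry
+
+    def contains(self, key):
+        # Query every non-empty block; combine results with "or"
+        for level in self.levels:
+            if level is None:
+                continue
+            j = bisect.bisect_left(level, key)
+            if j < len(level) and level[j] == key:
+                return True
+        return False
+\end{verbatim}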
-\subsection{The Mixed Method}
+
\subsection{Merge Decomposable Search Problems}
+When a reconstruction is performed using these techniques, the inputs
+to that reconstruction are not arbitrary collections of records, but
+rather multiple data structures. While, in the fully general case, the
+new structure is built by first unbuilding all of the input structures
+and then building over the resulting set of records, many data
+structures admit more efficient \emph{merging}. Consider a data
+structure that supports construction via merging,
+$\mathtt{merge}(\mathcal{I}_0, \ldots, \mathcal{I}_k)$, in $B_M(n, k)$
+time, where $n = \sum_{i=0}^k |\mathcal{I}_i|$. A search problem for
+which such a data structure exists is called a \emph{merge decomposable
+search problem} (MDSP)~\cite{merge-dsp}.
+
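+As a (hypothetical) example of such a structure, a sorted array over
+$n$ records can be built from $k$ sorted blocks by a $k$-way merge in
+$B_M(n, k) \in O(n \log k)$ time, rather than paying the full $B(n)
+\in O(n \log n)$ sorting cost. A minimal sketch in Python,
+\begin{verbatim}
+import heapq
+
+def merge(*blocks):
+    # k-way merge of k sorted blocks: O(n log k) time, where n is
+    # the total number of records across all input blocks
+    return list(heapq.merge(*blocks))
+
+# e.g., merge([1, 4, 9], [2, 3, 7], [5, 6, 8])
+\end{verbatim}
+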
+Note that in~\cite{merge-dsp}, Overmars considers a \emph{very}
+specific definition, in which the data structure is built in two
+stages: an initial sorting phase requiring $O(n \log n)$ time, followed
+by a construction phase requiring $O(n)$ time. Overmars's proposed
+mechanism for leveraging this property is to include with each block a
+linked list storing the records in sorted order (presumably to account
+for structures whose records must be sorted during construction, but
+are not necessarily kept that way). During reconstructions, these
+sorted lists can first be merged, and the data structure then built
+from the resulting merged list. Using this approach, even accounting
+for the cost of merging the lists, he is able to prove that the
+amortized insertion cost is lower than it would be if the $O(n \log n)$
+construction cost were paid for each reconstruction.~\cite{merge-dsp}
+
+While Overmars's definition of MDSP does capture a large number of
+mergeable data structures (including all of the mergeable structures
+considered in this work), we modify his definition to consider a
+broader class of problems. We will use the term to refer to any search
+problem with a data structure that can be merged more efficiently than
+it can be built from an unsorted set of records. More formally,
+\begin{definition}[Merge Decomposable Search Problem~\cite{merge-dsp}]
+    \label{def:mdsp}
+    A search problem $F: (\mathcal{D}, \mathcal{Q}) \to \mathcal{R}$
+    is \emph{merge decomposable} if and only if there exists a data
+    structure $\mathcal{I}$, capable of solving $F$, that is
+    constructible by merging $k$ instances of $\mathcal{I}$ with cost
+    $B_M(n, k)$ such that $B_M(n, \log n) \leq B(n)$.
+\end{definition}
+
+The use of $k = \log n$ in this definition comes from the Bentley-Saxe
+method's upper limit on the number of data structures. In the worst
+case, there will be $\log n$ structures to merge, and so, to gain any
+benefit from the merge routine, merging $\log n$ structures must be
+less expensive than building the new structure using the standard
+$\mathtt{unbuild}$ and $\mathtt{build}$ mechanism. The availability of
+an efficient merge operation is of little use in the equal block
+method, which does not perform data structure merges, and so it is not
+considered in the above definition.\footnote{
+    In the equal block method, all reconstructions are due either to
+    inserting a record, in which case the reconstruction consists of
+    adding a single record to a structure rather than merging two
+    structures, or to repartitioning, which occurs when $f(n)$
+    increases sufficiently that the existing structures must be made
+    \emph{smaller}. In either case, no merging is done.
+}
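+
+Returning to the sorted array sketched above as a (hypothetical)
+example: a $k$-way merge gives $B_M(n, k) \in O(n \log k)$, and so
+$B_M(n, \log n) \in O(n \log \log n)$, which is asymptotically smaller
+than $B(n) \in O(n \log n)$, satisfying Definition~\ref{def:mdsp}.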
+
\subsection{Delete Support}
\label{ssec:dyn-deletes}
@@ -786,11 +843,12 @@ an insert results in a violation: $s$ is updated to be exactly $f(n)$, all
existing blocks are unbuilt, and then the records are evenly redistributed
into the $s$ blocks.~\cite{overmars-art-of-dyn}
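+
+A minimal sketch of this re-partitioning step (in Python, with
+hypothetical $\mathtt{build}$, $\mathtt{unbuild}$, and $\mathtt{f}$
+helpers standing in for the structure's actual routines and the chosen
+block-count function),
+\begin{verbatim}
+import math
+
+def repartition(blocks, build, unbuild, f):
+    # unbuild every existing block into a flat list of records
+    records = [r for b in blocks for r in unbuild(b)]
+    n = len(records)
+    s = f(n)  # the new block count, set to exactly f(n)
+    size = math.ceil(n / s)
+    # redistribute the records evenly (up to rounding) across
+    # s freshly built blocks
+    return [build(records[i * size:(i + 1) * size])
+            for i in range(s)]
+\end{verbatim}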
-
\subsection{Worst-Case Optimal Techniques}
\label{ssec:bsm-worst-optimal}
+
+
\section{Limitations of Classical Dynamization Techniques}
\label{sec:bsm-limits}