Diffstat (limited to 'chapters/dynamization.tex')

 chapters/dynamization.tex | 74
 1 file changed, 66 insertions(+), 8 deletions(-)
diff --git a/chapters/dynamization.tex b/chapters/dynamization.tex
index 2301537..fce0d9f 100644
--- a/chapters/dynamization.tex
+++ b/chapters/dynamization.tex
@@ -366,11 +366,12 @@
 for brevity and streamline some of the original notation (full details
 can be found in~\cite{overmars83}), but this technique ultimately results
 in a data structure with the following performance characteristics,
 \begin{align*}
-\text{Amortized Insertion Cost:}&\quad \Theta\left(\frac{B(n)}{n} + B\left(\frac{n}{f(n)}\right)\right) \\
-\text{Worst-case Query Cost:}& \quad \Theta\left(f(n) \cdot \mathscr{Q}\left(\frac{n}{f(n)}\right)\right) \\
+\text{Amortized Insertion Cost:}&\quad I_A(n) \in \Theta\left(\frac{B(n)}{n} + B\left(\frac{n}{f(n)}\right)\right) \\
+\text{Worst-case Insertion Cost:}&\quad I(n) \in \Theta\left(B(n)\right) \\
+\text{Worst-case Query Cost:}&\quad \mathscr{Q}(n) \in \Theta\left(f(n) \cdot \mathscr{Q}_S\left(\frac{n}{f(n)}\right)\right) \\
 \end{align*}
 where $B(n)$ is the cost of statically building $\mathcal{I}$, and
-$\mathscr{Q}(n)$ is the cost of answering $F$ using $\mathcal{I}$.
+$\mathscr{Q}_S(n)$ is the cost of answering $F$ using $\mathcal{I}$.
 
 %TODO: example?
 
@@ -431,9 +432,9 @@
 structure in the same way that incrementing the binary number by $1$
 does. By applying BSM to a data structure, a dynamized structure can be
 created with the following performance characteristics,
 \begin{align*}
-\text{Amortized Insertion Cost:}&\quad \Theta\left(\left(\frac{B(n)}{n}\cdot \log_2 n\right)\right) \\
-\text{Worst Case Insertion Cost:}&\quad \Theta\left(B(n)\right) \\
-\text{Worst-case Query Cost:}& \quad \Theta\left(\log_2 n\cdot \mathscr{Q}\left(n\right)\right) \\
+\text{Amortized Insertion Cost:}&\quad I_A(n) \in \Theta\left(\frac{B(n)}{n}\cdot \log_2 n\right) \\
+\text{Worst-case Insertion Cost:}&\quad I(n) \in \Theta\left(B(n)\right) \\
+\text{Worst-case Query Cost:}&\quad \mathscr{Q}(n) \in \Theta\left(\log_2 n\cdot \mathscr{Q}_S\left(n\right)\right) \\
 \end{align*}
 This is a particularly attractive result because, for example, a data
 structure having $B(n) \in \Theta(n)$ will have an amortized insertion
 
@@ -459,10 +460,66 @@
 individually. More formally, for any query running in $\mathscr{Q}(n) \in
 \Omega\left(n^\epsilon\right)$ time for some $\epsilon > 0$, the
 cost of answering a decomposable search problem from a BSM dynamization
 is $\Theta\left(\mathscr{Q}(n)\right)$.~\cite{saxe79}
 
-\subsection{The Mixed Method}
+\subsection{Merge Decomposable Search Problems}
+
+When a reconstruction is performed using these techniques, the inputs to
+that reconstruction are not random collections of records, but rather
+multiple data structures. While, in the fully general case, these new
+structures are built by first unbuilding all of the input structures and
+then building a new one over the resulting set of records, many data
+structures admit more efficient \emph{merging}. Consider a data structure
+that supports construction via merging, $\mathtt{merge}(\mathscr{I}_0,
+\ldots, \mathscr{I}_k)$, in $B_M(n, k)$ time, where $n = \sum_{i=0}^k
+|\mathscr{I}_i|$. A search problem for which such a data structure exists
+is called a \emph{merge decomposable search problem} (MDSP)~\cite{merge-dsp}.
+
+Note that in~\cite{merge-dsp}, Overmars considers a \emph{very} specific
+definition in which the data structure is built in two stages: an initial
+sorting phase, requiring $O(n \log n)$ time, and then a construction
+phase, requiring $O(n)$ time. Overmars's proposed mechanism for leveraging
+this property is to include with each block a linked list storing the
+records in sorted order (presumably to account for structures whose
+records must be sorted, but aren't necessarily kept that way). During
+reconstructions, these sorted lists can first be merged, and the data
+structure then built from the resulting merged list. Using this approach,
+even accounting for the merging of the lists, he is able to prove that
+the amortized insertion cost is lower than it would be if the full
+$O(n \log n)$ construction cost were paid for each reconstruction.~\cite{merge-dsp}
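+
+As an illustrative sketch, suppose a hypothetical structure of this kind
+has a sort-based construction cost of $B(n) \in \Theta(n \log n)$ and
+supports a heap-based $k$-way merge of its sorted lists in $B_M(n, k) \in
+\Theta(n \log k)$ time. Merging the at most $\log_2 n$ blocks that the
+Bentley-Saxe method produces then costs
+\begin{equation*}
+B_M\left(n, \log_2 n\right) \in \Theta\left(n \log \log n\right),
+\end{equation*}
+which grows strictly more slowly than the $\Theta(n \log n)$ cost of
+rebuilding the structure from an unsorted set of records.
+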
+While Overmars's definition of MDSP does capture a large number of
+mergeable data structures (including all of the mergeable structures
+considered in this work), we modify his definition to consider a broader
+class of problems. We will use the term to refer to any search problem
+with a data structure that can be merged more efficiently than built
+from an unsorted set of records. More formally,
+\begin{definition}[Merge Decomposable Search Problem~\cite{merge-dsp}]
+    \label{def:mdsp}
+    A search problem $F: (\mathcal{D}, \mathcal{Q}) \to \mathcal{R}$
+    is \emph{merge decomposable} if and only if there exists a data
+    structure, $\mathcal{I}$, capable of solving $F$, that is
+    constructible by merging $k$ instances of $\mathcal{I}$ with cost
+    $B_M(n, k)$ such that $B_M(n, \log n) \leq B(n)$.
+\end{definition}
+
+The use of $k = \log n$ in this definition comes from the Bentley-Saxe
+method's upper limit on the number of data structures. In the worst case,
+there will be $\log n$ structures to merge, and so, to gain any benefit
+from the merge routine, merging $\log n$ structures must be less
+expensive than building the new structure using the standard
+$\mathtt{unbuild}$ and $\mathtt{build}$ mechanism. The availability of
+an efficient merge operation isn't of much use in the equal block method,
+which doesn't perform data structure merges, and so it isn't considered
+in the above definition.\footnote{
+    In the equal block method, all reconstructions are due either to
+    inserting a record, in which case the reconstruction consists of
+    adding a single record to a structure, not merging two structures,
+    or to repartitioning, which occurs when $f(n)$ increases sufficiently
+    that the existing structures must be made \emph{smaller}, and so,
+    again, no merging is done.
+}
 
 \subsection{Delete Support}
 \label{ssec:dyn-deletes}
 
@@ -786,11 +843,12 @@
 an insert results in a violation: $s$ is updated to be exactly $f(n)$,
 all existing blocks are unbuilt, and then the records are evenly
 redistributed into the $s$ blocks.~\cite{overmars-art-of-dyn}
-
 \subsection{Worst-Case Optimal Techniques}
 \label{ssec:bsm-worst-optimal}
+
+
 \section{Limitations of Classical Dynamization Techniques}
 \label{sec:bsm-limits}
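To make the binary-counter insertion scheme concrete, the following is a
minimal runnable sketch of a Bentley-Saxe dynamization, assuming a
caller-supplied static build routine (cost B(n)) and a per-block query
routine (cost Q_S(n)). The BentleySaxe and contains names, and the use of
a plain sorted list as the static structure, are illustrative choices
rather than anything prescribed by the chapter.

```python
# A minimal sketch of the Bentley-Saxe binary-counter insertion scheme.
# `build` and `query` stand in for the static structure's construction
# and search routines; all names here are illustrative.
import bisect
from typing import Callable, List, Optional, Tuple

class BentleySaxe:
    def __init__(self, build: Callable, query: Callable):
        self.build = build    # static construction, cost B(n)
        self.query = query    # query against one static block, cost Q_S(n)
        # blocks[i] is None or (records, structure), with |records| = 2^i
        self.blocks: List[Optional[Tuple[list, object]]] = []

    def insert(self, rec) -> None:
        # Mirrors incrementing a binary number by 1: full levels "carry"
        # their records upward until the first empty level is reached.
        carry = [rec]
        i = 0
        while i < len(self.blocks) and self.blocks[i] is not None:
            carry.extend(self.blocks[i][0])  # unbuild level i into the carry
            self.blocks[i] = None
            i += 1
        if i == len(self.blocks):
            self.blocks.append(None)
        # One reconstruction per insert, over a block of 2^i records.
        self.blocks[i] = (carry, self.build(carry))

    def answer(self, q, combine: Callable):
        # Decomposability: query each of the O(log n) blocks independently,
        # then combine the partial results.
        live = [b for b in self.blocks if b is not None]
        parts = [self.query(structure, q) for _, structure in live]
        return combine(parts)

# Usage: a dynamized membership test over sorted arrays.
def contains(sorted_recs: list, q) -> bool:
    i = bisect.bisect_left(sorted_recs, q)
    return i < len(sorted_recs) and sorted_recs[i] == q

ds = BentleySaxe(build=sorted, query=contains)
for x in [5, 2, 9, 1]:
    ds.insert(x)
print(ds.answer(9, any), ds.answer(3, any))  # -> True False
```

Each insertion triggers at most one build, over a block whose size doubles
with its level; summing these geometrically growing reconstructions over n
insertions is what yields the (B(n)/n) * log n amortized insertion bound
quoted above.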