Diffstat (limited to 'chapters/sigmod23/framework.tex')
-rw-r--r--   chapters/sigmod23/framework.tex   18
1 file changed, 9 insertions, 9 deletions
diff --git a/chapters/sigmod23/framework.tex b/chapters/sigmod23/framework.tex
index 218c290..b413802 100644
--- a/chapters/sigmod23/framework.tex
+++ b/chapters/sigmod23/framework.tex
@@ -252,7 +252,7 @@ of the major limitations of the ghost structure approach for handling
deletes is that there is not a principled method for removing deleted
records from the decomposed structure. The standard approach is to set
an arbitrary number of delete records, and rebuild the entire structure when
-this threshold is crossed~\cite{saxe79}. Mixing the "ghost" records into
+this threshold is crossed~\cite{saxe79}. Mixing the ``ghost'' records into
the same structures as the original records allows for deleted records
to naturally be cleaned up over time as they meet their tombstones during
reconstructions using a technique called tombstone cancellation. This
@@ -280,7 +280,7 @@ mechanism.
The cost of using a tombstone delete in a Bentley-Saxe dynamization
is the same as a simple insert,
\begin{equation*}
-\mathscr{D}(n)_A \in \Theta\left(\frac{B(n)}{n} \log_2 (n)\right)
+D_A(n) \in \Theta\left(\frac{B(n)}{n} \log_2 (n)\right)
\end{equation*}
with the worst-case cost being $\Theta(B(n))$. Note that there is
also a minor performance effect resulting from deleted records appearing
@@ -309,7 +309,7 @@ on a Bentley-Saxe decomposition of that SSI
will require, at worst, executing a point-lookup on each block, with a
total cost of
\begin{equation*}
-\mathscr{D}(n) \in \Theta\left( L(n) \log_2 (n)\right)
+D(n) \in \Theta\left( L(n) \log_2 (n)\right)
\end{equation*}

If the SSI being considered does \emph{not} support an efficient
@@ -391,7 +391,7 @@ a natural way of controlling the number of deleted records within the
structure, and thereby bounding the rejection rate. During reconstruction,
we have the opportunity to remove deleted records. This will cause the
record counts associated with each block of the structure to gradually
-drift out of alignment with the "perfect" powers of two associated with
+drift out of alignment with the ``perfect'' powers of two associated with
the Bentley-Saxe method, however. In the theoretical literature on this
topic, the solution to this problem is to periodically re-partition all
of the records to re-align the block sizes~\cite{merge-dsp, saxe79}. This
@@ -450,7 +450,7 @@ deleted records involved in the reconstruction will be dropped.
Tombstones may require multiple cascading rounds of compaction to occur,
because a tombstone record will only cancel when it encounters the record
that it deletes. However, because tombstones always follow the record they
-delete in insertion order, and will therefore always be "above" that
+delete in insertion order, and will therefore always be ``above'' that
record in the structure, each reconstruction will move every tombstone
involved closer to the record it deletes, ensuring that eventually the
bound will be satisfied.
@@ -526,7 +526,7 @@ and LevelDB~\cite{leveldb}.
This work has produced an incredibly large and well explored parametrization
of the reconstruction procedures of LSM trees, a good summary of which can be
found in this recent tutorial paper~\cite{sarkar23}. Examples of this design
-space exploration include: different ways to organize each "level"
+space exploration include: different ways to organize each ``level''
of the tree~\cite{dayan19, dostoevsky, autumn}, different growth rates,
buffering, sub-partitioning of structures to allow finer-grained
reconstruction~\cite{dayan22}, and approaches for allocating resources to
@@ -739,19 +739,19 @@ Assuming that $N_B \ll n$, the first two terms of this expression are constant.
Dropping them and amortizing the result over $n$ records gives us the
amortized insertion cost,
\begin{equation*}
-I_a(n) \in \Theta\left(\frac{B_M(n)}{n}\log_s(n)\right)
+I_A(n) \in \Theta\left(\frac{B_M(n)}{n}\log_s(n)\right)
\end{equation*}
If the SSI being considered does not support a more efficient
construction procedure from other instances of the same SSI,
and the general Bentley-Saxe \texttt{unbuild} and \texttt{build}
-operations must be used, the the cost becomes $I_a(n) \in
+operations must be used, the cost becomes $I_A(n) \in
\Theta\left(\frac{B(n)}{n}\log_s(n)\right)$ instead.

\Paragraph{Delete.} The framework supports both tombstone and tagged
deletes, each with different performance. Using tombstones, the cost
of a delete is identical to that of an insert. When using tagging, the
cost of a delete is the same as the cost of a point lookup, because the
-"delete" itself only sets a bit in the header of the record,
+``delete'' itself only sets a bit in the header of the record,
once it has been located. There will be $\Theta(\log_s n)$ total shards
in the structure, each with a look-up cost of $L(n)$ using either the
SSI's native point-lookup, or an auxiliary hash table, and the lookup