| field | value | date |
|---|---|---|
| author | Douglas B. Rumbaugh <doug@douglasrumbaugh.com> | 2025-06-01 13:15:52 -0400 |
| committer | Douglas B. Rumbaugh <doug@douglasrumbaugh.com> | 2025-06-01 13:15:52 -0400 |
| commit | cd3447f1cad16972e8a659ec6e84764c5b8b2745 (patch) | |
| tree | 5a50b6e8a99646e326b2c41714f50e4f7dee64d0 /chapters/sigmod23 | |
| parent | 6354e60f106a89f5bf807082561ed5efd9be0f4f (diff) | |
| download | dissertation-cd3447f1cad16972e8a659ec6e84764c5b8b2745.tar.gz | |
Julia updates
Diffstat (limited to 'chapters/sigmod23')
| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | chapters/sigmod23/background.tex | 18 |
| -rw-r--r-- | chapters/sigmod23/examples.tex | 2 |
| -rw-r--r-- | chapters/sigmod23/exp-baseline.tex | 2 |
| -rw-r--r-- | chapters/sigmod23/exp-parameter-space.tex | 12 |
| -rw-r--r-- | chapters/sigmod23/experiment.tex | 2 |
| -rw-r--r-- | chapters/sigmod23/extensions.tex | 4 |
| -rw-r--r-- | chapters/sigmod23/framework.tex | 134 |
7 files changed, 90 insertions, 84 deletions
diff --git a/chapters/sigmod23/background.tex b/chapters/sigmod23/background.tex
index af3b80a..d600c27 100644
--- a/chapters/sigmod23/background.tex
+++ b/chapters/sigmod23/background.tex
@@ -19,16 +19,16 @@
 is used to indicate the selection of either a single sample or a sample
 set; the specific usage should be clear from context. In each of the
 problems considered, sampling can be performed either
-with replacement or without replacement. Sampling with replacement
+with-replacement or without-replacement. Sampling with-replacement
 means that a record that has been included in the sample set for a given
 sampling query is "replaced" into the dataset and allowed to be sampled
-again. Sampling without replacement does not "replace" the record,
+again. Sampling without-replacement does not "replace" the record,
 and so each individual record can only be included within the a sample
 set once for a given query. The data structures that will be discussed
-support sampling with replacement, and sampling without replacement can
-be implemented using a constant number of with replacement sampling
+support sampling with-replacement, and sampling without-replacement can
+be implemented using a constant number of with-replacement sampling
 operations, followed by a deduplication step~\cite{hu15}, so this chapter
-will focus exclusive on the with replacement case.
+will focus exclusively on the with-replacement case.
 
 \subsection{Independent Sampling Problem}
 
@@ -115,8 +115,10 @@
 of problems that will be directly addressed within this chapter.
 
 Relational database systems often have native support for IQS using
 SQL's \texttt{TABLESAMPLE} operator~\cite{postgress-doc}. However, the
-algorithms used to implement this operator have significant limitations:
-users much choose between statistical independence or performance.
+algorithms used to implement this operator have significant limitations
+and do not allow users to maintain statistical independence of the results
+without also running the query to be sampled from in full. Thus, users must
+choose between independence and performance.
 
 To maintain statistical independence, Bernoulli sampling is used. This
 technique requires iterating over every record in the result set of the
@@ -240,7 +242,7 @@
 Tao~\cite{tao22}.
 
 There also exist specialized data structures with support for both
 efficient sampling and updates~\cite{hu14}, but these structures have
 poor constant factors and are very complex, rendering them of little
-practical utility. Additionally, efforts have been made to extended
+practical utility. Additionally, efforts have been made to extend
 the alias structure with support for weight updates over a fixed set
 of elements~\cite{hagerup93,matias03,allendorf23}. These approaches do
 not allow the insertion or removal of new records, however, only in-place
diff --git a/chapters/sigmod23/examples.tex b/chapters/sigmod23/examples.tex
index 38df04d..4e7f9ac 100644
--- a/chapters/sigmod23/examples.tex
+++ b/chapters/sigmod23/examples.tex
@@ -25,7 +25,7 @@
 number of shards involved in a reconstruction using either layout policy
 is $\Theta(1)$ using our framework, this means that we can perform
 reconstructions in $B_M(n) \in \Theta(n)$ time, including tombstone
 cancellation. The total weight of the structure can also be calculated
-at no time when it is constructed, allows $W(n) \in \Theta(1)$ time
+at no additional time cost when it is constructed, allowing $W(n) \in \Theta(1)$ time
 as well.
 
 Point lookups over the sorted data can be done using a binary search
 in $L(n) \in \Theta(\log_2 n)$ time, and sampling queries require no
 pre-processing, so $P(n) \in \Theta(1)$. The mutable buffer can be
diff --git a/chapters/sigmod23/exp-baseline.tex b/chapters/sigmod23/exp-baseline.tex
index 5585c36..d0e1ce0 100644
--- a/chapters/sigmod23/exp-baseline.tex
+++ b/chapters/sigmod23/exp-baseline.tex
@@ -73,7 +73,7 @@
 being introduced by the dynamization.
 
 \subfloat[Insertion Throughput vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-irs-insert} \label{fig:irs-insert1}}
 \subfloat[Sampling Latency vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-irs-sample} \label{fig:irs-sample1}} \\
- \subfloat[Delete Scalability vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-sc-irs-delete} \label{fig:irs-delete}}
+ \subfloat[Delete Scalability vs. Baselines]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-sc-irs-delete} \label{fig:irs-delete-s}}
 \subfloat[Sampling Latency vs. Sample Size]{\includegraphics[width=.5\textwidth]{img/sigmod23/plot/fig-bs-irs-samplesize} \label{fig:irs-samplesize}}
 
 \caption{Framework Comparison to Baselines for IRS}
diff --git a/chapters/sigmod23/exp-parameter-space.tex b/chapters/sigmod23/exp-parameter-space.tex
index 9583312..1e51d8c 100644
--- a/chapters/sigmod23/exp-parameter-space.tex
+++ b/chapters/sigmod23/exp-parameter-space.tex
@@ -2,11 +2,11 @@
 \label{ssec:ds-exp}
 
 Our proposed framework has a large design space, which we briefly
-described in Section~\ref{ssec:design-space}. The contents of this
-space will be described in much more detail in Chapter~\ref{chap:design-space},
-but as part of this work we did perform an experimental examination of our
-framework to compare insertion throughput and query latency over various
-points within the space.
+described in Section~\ref{ssec:sampling-design-space}. The
+contents of this space will be described in much more detail in
+Chapter~\ref{chap:design-space}, but as part of this work we did perform
+an experimental examination of our framework to compare insertion
+throughput and query latency over various points within the space.
 
 We examined this design space by considering \texttt{DE-WSS} specifically,
 using a random sample of $500,000,000$ records from the \texttt{OSM}
@@ -48,7 +48,7 @@
 performance, with tiering outperforming leveling for both delete
 policies. The next largest effect was the delete policy selection,
 with tombstone deletes outperforming tagged deletes in insertion
 performance. This result aligns with the asymptotic analysis of the two
-approaches in Section~\ref{sampling-deletes}. It is interesting to note
+approaches in Section~\ref{ssec:sampling-deletes}. It is interesting to note
 however that the effect of layout policy was more significant in these
 particular tests,\footnote{
 Although the largest performance gap in absolute terms was between
diff --git a/chapters/sigmod23/experiment.tex b/chapters/sigmod23/experiment.tex
index 727284a..1eb704c 100644
--- a/chapters/sigmod23/experiment.tex
+++ b/chapters/sigmod23/experiment.tex
@@ -53,7 +53,7 @@
 uninteresting key distributions.
 
 \Paragraph{Structures Compared.} As a basis of comparison, we tested
 both our dynamized SSI implementations, and existing dynamic baselines,
-for each sampling problem considered. Specifically, we consider a the
+for each sampling problem considered. Specifically, we consider the
 following dynamized structures,
 \begin{itemize}
diff --git a/chapters/sigmod23/extensions.tex b/chapters/sigmod23/extensions.tex
index 3a3cba3..3304b76 100644
--- a/chapters/sigmod23/extensions.tex
+++ b/chapters/sigmod23/extensions.tex
@@ -56,7 +56,7 @@
 structure using in XDB~\cite{li19}.
 
 Because our dynamization technique is built on top of static data
 structures, a limited form of concurrency support is straightforward to
-implement. To that end, created a proof-of-concept dynamization of an
+implement. To that end, we created a proof-of-concept dynamization of an
 ISAM Tree for IRS based on a simplified version of a general concurrency
 controlled scheme for log-structured data stores~\cite{golan-gueta15}.
@@ -79,7 +79,7 @@
 accessing them have finished.
 
 The buffer itself is an unsorted array, so a query can capture a
 consistent and static version by storing the tail pointer at the
 time the query begins. New inserts can be performed concurrently by doing
-a fetch-and-and on the tail. By using multiple buffers, inserts and
+a fetch-and-add on the tail. By using multiple buffers, inserts and
 reconstructions can proceed, to some extent, in parallel, which helps
 to hide some of the insertion tail latency due to blocking on
 reconstructions during a buffer flush.
diff --git a/chapters/sigmod23/framework.tex b/chapters/sigmod23/framework.tex
index 256d127..804194b 100644
--- a/chapters/sigmod23/framework.tex
+++ b/chapters/sigmod23/framework.tex
@@ -50,6 +50,7 @@
 on the query being sampled from.
 
 Based on these observations, we can define the decomposability
 conditions for a query sampling problem,
 \begin{definition}[Decomposable Sampling Problem]
+ \label{def:decomp-sampling}
 A query sampling problem, $X: (F, \mathcal{D}, \mathcal{Q},
 \mathbb{Z}^+ \to \mathcal{R}$) is decomposable if and only if the
 following conditions are met for all $q \in \mathcal{Q},
@@ -78,12 +79,14 @@
 These two conditions warrant further explanation. The first condition
 is simply a redefinition of the standard decomposability criteria to
 consider matching the distribution, rather than the exact records in
 $R$, as the correctness condition for the merge process. The second condition
-handles a necessary property of the underlying search problem being
-sampled from. Note that this condition is \emph{stricter} than normal
-decomposability for $F$, and essentially requires that the query being
-sampled from return a set of records, rather than an aggregate value or
-some other result that cannot be meaningfully sampled from. This condition
-is satisfied by predicate-filtering style database queries, among others.
+addresses the search problem from which results are to be sampled. Not all
+search problems admit sampling of this sort--for example, an aggregation
+query that returns a single result. This condition essentially requires
+that the search problem being sampled from return a set of records, rather
+than an aggregate value or some other result that cannot be meaningfully
+sampled from. This condition is satisfied by predicate-filtering style
+database queries, among others. However, it should be noted that this
+condition is \emph{stricter} than normal decomposability.
 
 With these definitions in mind, let's turn to solving these query sampling
 problems. First, we note that many SSIs have a sampling procedure that
@@ -120,7 +123,7 @@
 down-sampling combination operator. Secondly, this formulation
 fails to avoid a per-sample dependence on $n$, even in the case
 where $S(n) \in \Theta(1)$. This gets even worse when considering
 rejections that may occur as a result of deleted records. Recall from
-Section~\ref{ssec:background-deletes} that deletion can be supported
+Section~\ref{ssec:dyn-deletes} that deletion can be supported
 using weak deletes or a shadow structure in a Bentley-Saxe dynamization.
 Using either approach, it isn't possible to avoid deleted records in
 advance when sampling, and so these will need to be rejected and retried.
@@ -208,9 +211,8 @@
 or are naturally determined as part of the pre-processing, and thus the
 $W(n)$ term can be merged into $P(n)$.
 
 \subsection{Supporting Deletes}
-\ref{ssec:sampling-deletes}
-
-As discussed in Section~\ref{ssec:background-deletes}, the Bentley-Saxe
+\label{ssec:sampling-deletes}
+As discussed in Section~\ref{ssec:dyn-deletes}, the Bentley-Saxe
 method can support deleting records through the use of either weak
 deletes, or a secondary ghost structure, assuming certain properties are
 satisfied by either the search problem or data structure. Unfortunately,
@@ -222,13 +224,14 @@
 we'll discuss our mechanisms for supporting deletes, as well as how these
 can be handled during sampling while maintaining correctness.
 
 Because both deletion policies have their advantages under certain
-contexts, we decided to support both. Specifically, we propose two
-mechanisms for deletes, which are
+contexts, we decided to support both. We require that each record contain
+a small header, which is used to store visibility metadata. Given this,
+we propose two mechanisms for deletes,
 \begin{enumerate}
 \item \textbf{Tagged Deletes.} Each record in the structure includes a
-header with a visibility bit set. On delete, the structure is searched
-for the record, and the bit is set in indicate that it has been deleted.
+visibility bit in its header. On delete, the structure is searched
+for the record, and the bit is set to indicate that it has been deleted.
 This mechanism is used to support \emph{weak deletes}.
 \item \textbf{Tombstone Deletes.} On delete, a new record is inserted into
 the structure with a tombstone bit set in the header. This mechanism is
@@ -252,8 +255,9 @@
 arbitrary number of delete records, and rebuild the entire structure when
 this threshold is crossed~\cite{saxe79}. Mixing the "ghost" records into
 the same structures as the original records allows for deleted records
 to naturally be cleaned up over time as they meet their tombstones during
-reconstructions. This is an important consequence that will be discussed
-in more detail in Section~\ref{ssec-sampling-delete-bounding}.
+reconstructions using a technique called tombstone cancellation. This
+technique, and its important consequences related to sampling, will be
+discussed in Section~\ref{sssec:sampling-rejection-bound}.
 
 There are two relevant aspects of performance that the two mechanisms
 trade-off between: the cost of performing the delete, and the cost of
@@ -368,7 +372,7 @@
 This performance cost seems catastrophically bad, considering it
 must be paid per sample, but there are ways to mitigate it. We
 will discuss these mitigations in more detail later, during our
 discussion of the implementation of these results in
-Section~\ref{sec:sampling-implementation}.
+Section~\ref{ssec:sampling-framework}.
 
 \subsubsection{Bounding Rejection Probability}
 
@@ -392,8 +396,7 @@
 the Bentley-Saxe method, however. In the theoretical literature on this
 topic, the solution to this problem is to periodically re-partition all
 of the records to re-align the block sizes~\cite{merge-dsp, saxe79}.
 This approach could also be easily applied here, if desired, though we
-do not in our implementations, for reasons that will be discussed in
-Section~\ref{sec:sampling-implementation}.
+do not in our implementations.
 
 The process of removing these deleted records during reconstructions
 is different for the two mechanisms. Tagged deletes are straightforward,
@@ -411,16 +414,16 @@
 care with ordering semantics, tombstones and their associated records
 can be sorted into adjacent spots, allowing them to be efficiently
 dropped during reconstruction without any extra overhead.
 
-While the dropping of deleted records during reconstruction helps, it is
-not sufficient on its own to ensure a particular bound on the number of
-deleted records within the structure. Pathological scenarios resulting in
-unbounded rejection rates, even in the presence of this mitigation, are
-possible. For example, tagging alone will never trigger reconstructions,
-and so it would be possible to delete every single record within the
-structure without triggering a reconstruction, or records could be deleted
-in the reverse order that they were inserted using tombstones. In either
-case, a passive system of dropping records naturally during reconstruction
-is not sufficient.
+While the dropping of deleted records during reconstruction helps,
+it is not sufficient on its own to ensure a particular bound on the
+number of deleted records within the structure. Pathological scenarios
+resulting in unbounded rejection rates, even in the presence of this
+mitigation, are possible. For example, tagging alone will never trigger
+reconstructions, and so it would be possible to delete every single
+record within the structure without triggering a reconstruction. Or,
+when using tombstones, records could be deleted in the reverse order
+that they were inserted. In either case, a passive system of dropping
+records naturally during reconstruction is not sufficient.
 
 Fortunately, this passive system can be used as the basis for a
 system that does provide a bound. This is because it guarantees,
@@ -490,6 +493,7 @@
 be taken to obtain a sample set of size $k$.
 
 \subsection{Performance Tuning and Configuration}
+\label{ssec:sampling-design-space}
 
 The final of the desiderata referenced earlier in this chapter for our
 dynamized sampling indices is having tunable performance. The base
@@ -508,7 +512,7 @@
 Though it has thus far gone unmentioned, some readers may have noted
 the astonishing similarity between decomposition-based dynamization
 techniques, and a data structure called the Log-structured
 Merge-tree. First proposed by O'Neil in the mid '90s\cite{oneil96},
-the LSM Tree was designed to optimize write throughout for external data
+the LSM Tree was designed to optimize write throughput for external data
 structures. It accomplished this task by buffer inserted records in a
 small in-memory AVL Tree, and then flushing this buffer to disk when it
 filled up. The flush process itself would fully rebuild the on-disk
@@ -518,22 +522,23 @@
 layered, external structures, to reduce the cost of reconstruction.
 
 In more recent times, the LSM Tree has seen significant development
 and been used as the basis for key-value stores like RocksDB~\cite{dong21}
-and LevelDB~\cite{leveldb}. This work has produced an incredibly large
-and well explored parametrization of the reconstruction procedures of
-LSM Trees, a good summary of which can be bound in this recent tutorial
-paper~\cite{sarkar23}. Examples of this design space exploration include:
-different ways to organize each "level" of the tree~\cite{dayan19,
-dostoevsky, autumn}, different growth rates, buffering, sub-partitioning
-of structures to allow finer-grained reconstruction~\cite{dayan22}, and
-approaches for allocating resources to auxiliary structures attached to
-the main ones for accelerating certain types of query~\cite{dayan18-1,
-zhu21, monkey}.
+and LevelDB~\cite{leveldb}. This work has produced an incredibly
+large and well explored parametrization of the reconstruction
+procedures of LSM Trees, a good summary of which can be found in
+this recent tutorial paper~\cite{sarkar23}. Examples of this design
+space exploration include: different ways to organize each "level"
+of the tree~\cite{dayan19, dostoevsky, autumn}, different growth
+rates, buffering, sub-partitioning of structures to allow finer-grained
+reconstruction~\cite{dayan22}, and approaches for allocating resources to
+auxiliary structures attached to the main ones for accelerating certain
+types of query~\cite{dayan18-1, zhu21, monkey}. This work is discussed
+in greater depth in Chapter~\ref{chap:related-work}.
 
 Many of the elements within the LSM Tree design space are based upon the
-specifics of the data structure itself, and are not generally applicable.
-However, some of the higher-level concepts can be imported and applied in
-the context of dynamization. Specifically, we have decided to import the
-following four elements for use in our dynamization technique,
+specifics of the data structure itself, and are not applicable to our
+use case. However, some of the higher-level concepts can be imported and
+applied in the context of dynamization. Specifically, we have decided to
+import the following four elements for use in our dynamization technique,
 \begin{itemize}
 \item A small dynamic buffer into which new records are inserted
 \item A variable growth rate, called as \emph{scale factor}
@@ -554,11 +559,11 @@
 we are dynamizing may not exist. This introduces some query cost, as
 queries must be answered from these unsorted records as well, but in
 the case of sampling this isn't a serious problem. The implications of
 this will be discussed in Section~\ref{ssec:sampling-cost-funcs}. The
-size of this buffer, $N_B$ is a user-specified constant, and all block
-capacities are multiplied by it. In the Bentley-Saxe method, the $i$th
-block contains $2^i$ records. In our scheme, with buffering, this becomes
-$N_B \cdot 2^i$ records in the $i$th block. We call this unsorted array
-the \emph{mutable buffer}.
+size of this buffer, $N_B$, is a user-specified constant. Block capacities
+are defined in terms of multiples of $N_B$, such that each buffer flush
+corresponds to an insert in the traditional Bentley-Saxe method. Thus,
+rather than the $i$th block containing $2^i$ records, it contains $N_B
+\cdot 2^i$ records. We call this unsorted array the \emph{mutable buffer}.
 
 \Paragraph{Scale Factor.} In the Bentley-Saxe method, each block is
 twice as large as the block the precedes it There is, however, no reason
@@ -593,19 +598,19 @@
 we can build them over tombstones. This approach can greatly improve the
 sampling performance of the structure when tombstone deletes are used.
 
 \Paragraph{Layout Policy.} The Bentley-Saxe method considers blocks
-individually, without any other organization beyond increasing size. In
-contrast, LSM Trees have multiple layers of structural organization. The
-top level structure is a level, upon which record capacity restrictions
-are applied. These levels are then partitioned into individual structures,
-which can be further organized by key range. Because our intention is to
-support general data structures, which may or may not be easily partition
-by a key, we will not consider the finest grain of partitioning. However,
-we can borrow the concept of levels, and lay out shards in these levels
-according to different strategies.
+individually, without any other organization beyond increasing
+size. In contrast, LSM Trees have multiple layers of structural
+organization. Record capacity restrictions are enforced on structures
+called \emph{levels}, which are partitioned into individual data
+structures, and then further organized into non-overlapping key ranges.
+Because our intention is to support general data structures, which may
+or may not be easily partitioned by a key, we will not consider the finest
+grain of partitioning. However, we can borrow the concept of levels,
+and lay out shards in these levels according to different strategies.
 
 Specifically, we consider two layout policies. First, we can allow a
 single shard per level, a policy called \emph{Leveling}. This approach
-is traditionally read optimized, as it generally results in fewer shards
+is traditionally read-optimized, as it generally results in fewer shards
 within the overall structure for a given scale factor. Under leveling,
 the $i$th level has a capacity of $N_B \cdot s^{i+1}$ records. We can
 also allow multiple shards per level, resulting in a write-optimized
@@ -628,12 +633,10 @@
 The requirements that the framework places upon SSIs are rather
 modest. The sampling problem being considered must be a decomposable
 sampling problem (Definition \ref{def:decomp-sampling}) and the SSI must
 support the \texttt{build} and \texttt{unbuild} operations. Optionally,
-if the SSI supports point lookups or if the SSI can be constructed
-from multiple instances of the SSI more efficiently than its normal
-static construction, these two operations can be leveraged by the
-framework. However, these are not requirements, as the framework provides
-facilities to work around their absence.
-
+if the SSI supports point lookups or if the SSI is merge decomposable,
+then these two operations can be leveraged by the framework. However,
+these are not requirements, as the framework provides facilities to work
+around their absence.
 
 \captionsetup[subfloat]{justification=centering}
 \begin{figure*}
@@ -669,6 +672,7 @@
 these delete mechanisms, each record contains an attached header with
 bits to indicate its tombstone or delete status.
 
 \subsection{Supported Operations and Cost Functions}
+\label{ssec:sampling-cost-funcs}
 
 \Paragraph{Insert.} Inserting a record into the dynamization involves
 appending it to the mutable buffer, which requires $\Theta(1)$ time. When
 the buffer reaches its capacity, it must be flushed into the structure
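The framework.tex changes above describe the buffered, leveled layout: a mutable buffer of $N_B$ records, a scale factor $s$, and a capacity of $N_B \cdot s^{i+1}$ records for level $i$ under leveling. Below is a minimal C++ sketch of that capacity arithmetic. The names (`LayoutPolicy`, `level_shape`) are hypothetical and not the framework's API, and the tiering shard size is an assumption (the hunk is cut off before the tiering capacity is stated).

```cpp
#include <cstdint>

// Hypothetical names for illustration; not the framework's actual API.
enum class LayoutPolicy { Leveling, Tiering };

struct LevelShape {
    std::uint64_t shard_count;     // shards allowed on the level
    std::uint64_t shard_capacity;  // records per shard
};

// Shape of level i for buffer size nb and scale factor s.
// Leveling: one shard of nb * s^(i+1) records, as stated in the text.
// Tiering: assumed here to be s shards of nb * s^i records each, so the
// per-level total matches; this detail is not given in the diff.
inline LevelShape level_shape(unsigned i, std::uint64_t nb, std::uint64_t s,
                              LayoutPolicy policy) {
    std::uint64_t base = nb;
    for (unsigned j = 0; j < i; ++j) {
        base *= s;                      // nb * s^i
    }
    if (policy == LayoutPolicy::Leveling) {
        return {1, base * s};           // nb * s^(i+1)
    }
    return {s, base};                   // s shards of nb * s^i
}
```

For example, with $N_B = 1000$ and $s = 4$, level 0 holds one shard of 4,000 records under leveling, and (under the stated assumption) four shards of 1,000 records under tiering.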
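The delete-mechanism changes (tagged vs. tombstone deletes, with a per-record header holding visibility metadata) can be illustrated with a short sketch. This is not the framework's actual interface: the `point_lookup` and `insert` methods, the field names, and the record layout are assumptions, and tombstone cancellation during reconstruction is omitted.

```cpp
#include <cstdint>

// Hypothetical record layout; field and method names are illustrative.
struct RecordHeader {
    std::uint8_t tombstone : 1;  // set on the record inserted by a tombstone delete
    std::uint8_t deleted   : 1;  // set in place by a tagged (weak) delete
};

template <typename K, typename V>
struct Record {
    RecordHeader header{};
    K key{};
    V value{};
};

// Tagged delete: search the structure for the record and flip its
// visibility bit. Assumes the dynamized structure exposes a point lookup
// that returns a pointer to the stored record, or nullptr if absent.
template <typename Structure, typename K>
bool tagged_delete(Structure &s, const K &key) {
    if (auto *rec = s.point_lookup(key)) {
        rec->header.deleted = 1;
        return true;
    }
    return false;
}

// Tombstone delete: insert a new record with the tombstone bit set; it is
// cancelled against the original when both meet during a reconstruction.
template <typename Structure, typename K, typename V>
void tombstone_delete(Structure &s, const K &key, const V &value) {
    Record<K, V> ts;
    ts.header.tombstone = 1;
    ts.key = key;
    ts.value = value;
    s.insert(ts);
}
```

Either way, a sampling query must check these header bits and reject any sampled record that is deleted or shadowed by a tombstone, which is why the text bounds the fraction of such records in the structure.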
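The extensions.tex hunk corrects "fetch-and-and" to "fetch-and-add" for concurrent buffer inserts: writers claim slots with an atomic fetch-and-add on the tail, and a query snapshots the tail when it begins to obtain a static view. The sketch below uses a hypothetical buffer type and assumes queries only begin once in-flight appends are visible; the multi-buffer and reference-counting machinery described in the text is omitted.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical mutable-buffer sketch; not the framework's actual class.
template <typename Rec>
class MutableBuffer {
public:
    explicit MutableBuffer(std::size_t capacity) : slots_(capacity), tail_(0) {}

    // Claim a slot with fetch-and-add; returns false when the buffer is
    // full and must be flushed into the shard structure first.
    bool append(const Rec &r) {
        std::size_t idx = tail_.fetch_add(1, std::memory_order_acq_rel);
        if (idx >= slots_.size()) {
            return false;  // caller triggers a flush or switches buffers
        }
        slots_[idx] = r;
        return true;
    }

    // A query records the tail once at its start and reads only
    // slots_[0, snapshot), giving it a consistent, static prefix.
    std::size_t snapshot_tail() const {
        return std::min(tail_.load(std::memory_order_acquire), slots_.size());
    }

    const Rec &at(std::size_t i) const { return slots_[i]; }

private:
    std::vector<Rec> slots_;
    std::atomic<std::size_t> tail_;
};
```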