| author | Douglas Rumbaugh <dbr4@psu.edu> | 2025-05-06 17:41:03 -0400 |
|---|---|---|
| committer | Douglas Rumbaugh <dbr4@psu.edu> | 2025-05-06 17:41:03 -0400 |
| commit | f1fcf8426764b2e8fc8de08a6d74968d2fbc1b27 | |
| tree | 9adb6a0901e2a416d2f4aaa419ca80d2ece4414e /chapters/future-work.tex | |
| parent | ac1244fced7e6c6ba93d4292dd9a18ce293236eb | |
Updates to chapter 3
Diffstat (limited to 'chapters/future-work.tex')
| -rw-r--r-- | chapters/future-work.tex | 176 |
1 file changed, 2 insertions(+), 174 deletions(-)
diff --git a/chapters/future-work.tex b/chapters/future-work.tex
index d4ddd52..0c766dd 100644
--- a/chapters/future-work.tex
+++ b/chapters/future-work.tex
@@ -1,174 +1,2 @@

\chapter{Proposed Work}
\label{chap:proposed}

The previous two chapters described work that has already been completed; however, a number of research tasks remain to be done as part of this project. Update support is only one of the important features that an index requires of its data structure. In this chapter, the remaining research problems are discussed briefly, to lay out a set of criteria for project completion.

\section{Concurrency Support}

Database management systems are designed to hide the latency of IO operations, and one of the techniques they use to do so is a high degree of concurrency. As a result, any data structure used to build a database index must also support concurrent updates and queries. The sampling extension framework described in Chapter~\ref{chap:sampling} had basic concurrency support, but work is ongoing to integrate a superior system into the framework of Chapter~\ref{chap:framework}.

Because the framework is based on the Bentley-Saxe method, it has a number of desirable properties that make concurrency management simpler. With the exception of the buffer, the vast majority of the data resides in static data structures. When using tombstones, these static structures become fully immutable. This turns concurrency control into a resource management problem and suggests a simple multi-version concurrency control scheme. Each version of the structure, defined as the state between two reconstructions, is tagged with an epoch number. A query, then, reads only a single epoch, which is preserved in storage until all queries accessing it have terminated. Because the mutable buffer is append-only, a consistent view of it can be obtained by storing the tail of the log at the start of query execution.
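As an illustration of this epoch-plus-buffer-tail scheme, a minimal C++ sketch might look as follows. All names and types here are hypothetical, chosen for the example; they are not the framework's actual API, and reclamation of old epochs is only hinted at in comments.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical sketch: a fixed snapshot of the index is the pair
// (epoch number, buffer tail index).
struct Snapshot {
    std::size_t epoch;        // identifies one version of the static shards
    std::size_t buffer_tail;  // number of buffered records visible to the query
};

class VersionedIndex {
public:
    // At query start: read the current epoch and buffer tail. Because the
    // buffer is append-only, the query simply ignores records past
    // buffer_tail, giving a consistent cut without locking.
    Snapshot begin_query() const {
        return Snapshot{active_epoch.load(std::memory_order_acquire),
                        buffer_tail.load(std::memory_order_acquire)};
    }

    // Concurrent append-only insert: the tail only ever advances.
    void insert() { buffer_tail.fetch_add(1, std::memory_order_release); }

    // After a reconstruction installs a new shard set, advance the epoch.
    // (Reclaiming the old epoch would wait until its queries terminate.)
    void advance_epoch() {
        active_epoch.fetch_add(1, std::memory_order_acq_rel);
    }

private:
    std::atomic<std::size_t> active_epoch{0};
    std::atomic<std::size_t> buffer_tail{0};
};
```

The key point of the sketch is that neither field of the snapshot can be invalidated by concurrent activity: the epoch's shards are immutable, and the buffer tail marks a prefix of an append-only log.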
Thus, a fixed snapshot of the index can be represented as a two-tuple containing the epoch number and the buffer tail index.

The major limitation of the Chapter~\ref{chap:sampling} system was its handling of buffer expansion. While the mutable buffer itself is an unsorted array, and thus supports concurrent inserts using a simple fetch-and-add operation, the real hurdle to insert performance is managing reconstruction. During a reconstruction, the buffer is full and cannot support any new inserts. Because active queries may be using the buffer, it cannot be immediately flushed, and so inserts are blocked. Because of this, it is necessary to use multiple buffers to sustain insertions. When a buffer is filled, a background thread performs the reconstruction, and a new buffer is added so that insertion can continue while that reconstruction occurs. In Chapter~\ref{chap:sampling}, the solution used was limited by its restriction to only two buffers (and, as a result, a maximum of two active epochs at any point in time). Any sustained insertion workload would quickly fill the pair of buffers and then be forced to block until one of the buffers could be emptied. This emptying was contingent \emph{both} on all queries using the buffer finishing \emph{and} on the reconstruction using that buffer finishing. As a result, the block on inserts could be long (multiple seconds, or even minutes for particularly large reconstructions) and indeterminate (a given index could be involved in a very long-running query, and the buffer would be blocked until the query completed).

Thus, a more effective concurrency solution would need to support dynamically adding mutable buffers as needed to maintain insertion throughput.
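The buffer chain just described might be sketched as below. This is an illustrative single-threaded skeleton, not the framework's implementation: the fetch-and-add insert path is as in the text, but the growth path would need additional synchronization (and would hand the filled buffer to a background reconstruction thread) in a real concurrent setting.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <vector>

// An unsorted buffer supporting inserts via a single fetch-and-add.
struct Buffer {
    explicit Buffer(std::size_t cap) : capacity(cap), data(cap) {}

    // Returns false when the buffer is full; the caller must then grow
    // the chain rather than block.
    bool try_insert(int rec) {
        std::size_t slot = tail.fetch_add(1, std::memory_order_acq_rel);
        if (slot >= capacity) return false;  // buffer full
        data[slot] = rec;
        return true;
    }

    const std::size_t capacity;
    std::atomic<std::size_t> tail{0};
    std::vector<int> data;
};

// A chain of buffers: when the active buffer fills, a fresh one is
// appended so inserts never wait on a reconstruction draining old buffers.
class BufferChain {
public:
    explicit BufferChain(std::size_t cap) : cap_(cap) {
        buffers_.push_back(std::make_unique<Buffer>(cap_));
    }

    void insert(int rec) {
        if (!buffers_.back()->try_insert(rec)) {
            // Full: in the complete design, schedule a background
            // reconstruction for the filled buffer, then grow the chain.
            buffers_.push_back(std::make_unique<Buffer>(cap_));
            buffers_.back()->try_insert(rec);  // fresh buffer: cannot fail
        }
    }

    std::size_t buffer_count() const { return buffers_.size(); }

private:
    std::size_t cap_;
    std::vector<std::unique_ptr<Buffer>> buffers_;
};
```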
This would allow insertion throughput to be maintained so long as memory for more buffer space is available.\footnote{For the in-memory indexes considered thus far, it isn't clear that running out of memory for buffers is a recoverable error in all cases. The system would require the same amount of memory for storing records (technically more, considering index overhead) in a shard as it does in the buffer. In the case of an external storage system, the calculus would be different, of course.} It would also ensure that a long-running query could only block insertion if there is insufficient memory to create a new buffer or to run a reconstruction. However, as the number of buffered records grows, there is the potential for query performance to suffer, which leads to another important aspect of an effective concurrency control scheme.

\subsection{Tail Latency Control}

The concurrency control scheme discussed thus far maintains insertion throughput by allowing an unbounded portion of the new data to remain buffered in an unsorted fashion. Over time, this buffered data is moved into data structures in the background, as the system performs merges (which are moved off of the critical path for most operations). While this scheme allows for fast inserts, it has the potential to degrade query performance: the more buffered data there is, the more a query must fall back on its inefficient scan-based buffer path, as opposed to using the data structure.

Unfortunately, reconstructions can be incredibly lengthy (recall that the worst-case scenario involves rebuilding a static structure over all of the records; this is, thankfully, quite rare). This implies that it may be necessary in certain circumstances to throttle insertions to maintain certain levels of query performance.
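One simple form such throttling could take is a delay applied to inserts that grows with the fraction of data sitting in unsorted buffers. The sketch below is purely illustrative; the thresholds and delay curve are invented for the example and are not part of the framework.

```cpp
#include <cstddef>

// Illustrative back-pressure policy: return a delay (in microseconds) to
// apply to the next insert, based on the fraction of all records that are
// currently buffered rather than in static shards. Thresholds are made up.
std::size_t throttle_delay_us(std::size_t buffered, std::size_t total) {
    if (total == 0) return 0;
    double frac = static_cast<double>(buffered) / static_cast<double>(total);
    if (frac < 0.10) return 0;             // little buffered data: no delay
    if (frac < 0.25) return 10;            // mild back-pressure
    double scaled = (frac - 0.25) / 0.75;  // 0 at 25% buffered, 1 at 100%
    return 10 + static_cast<std::size_t>(scaled * 990.0);  // up to ~1 ms
}
```

A policy of this shape keeps inserts free when queries are unaffected, while bounding how far the buffered fraction can run ahead of background reconstruction.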
Additionally, it may be worth preemptively performing large reconstructions during periods of low utilization, similar to systems like Silk, which is designed to mitigate tail latency spikes in LSM-tree based systems~\cite{balmau19}.

Large reconstructions may also have a negative effect on query performance due to system resource utilization. Reconstructions can consume a large amount of memory bandwidth, which must be shared with queries. The effect of parallel reconstruction on query performance will need to be assessed, and strategies for mitigating it, whether scheduling-based or resource-throttling, considered if necessary.

\section{Fine-Grained Online Performance Tuning}

The framework has a large number of configurable parameters, and introducing concurrency control will add even more. The parameter sweeps in Section~\ref{ssec:ds-exp} show that there are trade-offs between read and write performance across this space. Unfortunately, the current framework applies these configuration parameters globally and does not allow them to be changed after the index is constructed. It seems apparent that better performance might be obtained by adjusting this approach.

First, there is nothing preventing these parameters from being configured on a per-level basis: different levels could use different layout policies (for example, tiering on higher levels and leveling on lower ones), different scale factors, and so on. More index-specific tuning, such as controlling the memory budget for auxiliary structures, could also be considered.

This fine-grained tuning will open up an even broader design space, which has the benefit of improving the configurability of the system, but the disadvantage of making configuration more difficult.
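The per-level configuration described above might be represented as follows. The structure and field names are hypothetical, not the framework's actual configuration interface; a tuner could adjust these values between reconstructions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-level configuration: instead of one global policy, each
// level carries its own layout policy, scale factor, and memory budget for
// auxiliary structures.
enum class Layout { Leveling, Tiering };

struct LevelConfig {
    Layout layout = Layout::Leveling;
    std::size_t scale_factor = 8;      // growth factor to the next level
    std::size_t aux_memory_bytes = 0;  // budget for auxiliary structures
};

// Example assignment: tiering on the smaller, write-hot upper levels and
// leveling on the large lower levels, as suggested in the text.
std::vector<LevelConfig> default_configs(std::size_t levels) {
    std::vector<LevelConfig> cfg(levels);
    for (std::size_t i = 0; i < levels; ++i) {
        cfg[i].layout = (i < levels / 2) ? Layout::Tiering : Layout::Leveling;
        cfg[i].scale_factor = 8;
    }
    return cfg;
}
```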
Additionally, it does nothing to address the problem of workload drift: a configuration may be optimal now, but will it remain effective in the future as the read/write mix of the workload changes? Both of these challenges can be addressed using dynamic tuning.

The idea is that the framework could be augmented with some workload and performance statistics tracking. Based on these numbers, the framework could decide during reconstruction to adjust the configuration of one or more levels in an online fashion, leaning more towards read or write performance, or dialing back memory budgets as the system's memory usage increases. Buffer-related parameters could be tweaked in real time as well: if insertion throughput is high, it might be worth temporarily increasing the buffer size, rather than spawning multiple smaller buffers.

A system like this would allow for more consistent performance in the face of changing workloads, and would also increase the ease of use of the framework by removing the burden of configuration from the user.

\section{Alternative Data Partitioning Schemes}

One problem with Bentley-Saxe or LSM-tree derived systems is temporary memory usage spikes. When performing a reconstruction, the system needs enough storage for the shards involved in the reconstruction, and also for the newly constructed shard. This is made worse in the face of multi-version concurrency, where multiple older versions of shards may be retained in memory at once. It is well known that, in the worst case, such a system may temporarily require double its current memory usage~\cite{dayan22}.

One approach to addressing this problem in LSM-tree based systems is to adjust the compaction granularity~\cite{dayan22}. In the terminology associated with this framework, the idea is to further sub-divide each shard into smaller chunks, partitioned based on keys.
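A sketch of this key-based sub-division follows; the names are illustrative and the framework does not currently implement such partitioning. Under this scheme, a reconstruction would select one source partition and merge it only with the partitions below it whose key ranges overlap.

```cpp
#include <cstddef>
#include <vector>

// Illustrative key-range partition of a shard: each partition covers a
// contiguous range of keys.
struct Partition {
    int min_key;
    int max_key;
};

// Return the indices of partitions in `lower` whose key ranges overlap the
// chosen source partition; only these would participate in the merge,
// bounding the memory and time cost of the reconstruction.
std::vector<std::size_t> overlapping(const Partition &src,
                                     const std::vector<Partition> &lower) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < lower.size(); ++i) {
        if (lower[i].min_key <= src.max_key &&
            src.min_key <= lower[i].max_key) {
            out.push_back(i);
        }
    }
    return out;
}
```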
That way, when a reconstruction is triggered, rather than reconstructing an entire shard, these smaller partitions can be used instead. One of the partitions in the source shard can be selected and then merged with the partitions in the next level down that have overlapping key ranges. The amount of memory required for reconstruction (and also the reconstruction time cost) can then be controlled by adjusting these partitions.

Unfortunately, while this approach works incredibly well for LSM-tree based systems, which store one-dimensional data in sorted arrays, it encounters problems in the context of a general index. It isn't clear how to effectively partition multi-dimensional data in the same way. Additionally, in the general case, each partition would need to contain its own instance of the index, because the framework supports data structures that do not themselves support effective partitioning in the way that a simple sorted array would. These challenges will need to be overcome to devise effective, general schemes for data partitioning that address the problems of reconstruction size and memory usage.

+\chapter{Future Work}
+\label{chap:future}