From 5ffc53e69e956054fdefd1fe193e00eee705dcab Mon Sep 17 00:00:00 2001
From: Douglas Rumbaugh
Date: Mon, 12 May 2025 19:59:26 -0400
Subject: Updates

---
 chapters/sigmod23/extensions.tex | 106 +++++++++++++++++++++++----------------
 1 file changed, 62 insertions(+), 44 deletions(-)

(limited to 'chapters/sigmod23/extensions.tex')

diff --git a/chapters/sigmod23/extensions.tex b/chapters/sigmod23/extensions.tex
index 2752b0f..06d55a5 100644
--- a/chapters/sigmod23/extensions.tex
+++ b/chapters/sigmod23/extensions.tex
@@ -1,20 +1,31 @@
 \captionsetup[subfloat]{justification=centering}
-\section{Extensions to the Framework}
+\section{Extensions}
 \label{sec:discussion}
-While this chapter has thus far discussed single-threaded, in-memory data
-structures, the framework as proposed can be easily extended to support
-other use-cases. In this section, we discuss extending this framework
-to support concurrency and external data structures.
+While this chapter has thus far discussed single-threaded, in-memory
+data structures, our technique can be easily extended to support other
+use-cases. In this section, we will discuss extensions to support
+concurrency and external data structures.

+\subsection{External Data Structures}
+\label{ssec:ext-external}

-\Paragraph{Larger-than-Memory Data.} Our dynamization techniques,
-as discussed thus far, can easily accomodate external data structures
+Our dynamization techniques can easily accommodate external data structures
 as well as in-memory ones. To demonstrate this, we have implemented a
 dynamized version of an external ISAM tree for use in answering IRS
 queries. The mutable buffer remains an unsorted array in memory; however,
-the shards themselves can either \emph{either} an in-memory ISAM tree
-or an external one. Our system allows for a user-configurable number of
-shards and the rest on disk, for performance tuning purposes.
+the shards themselves can be \emph{either} an in-memory ISAM tree or an
+external one.
Our system allows a user-configurable number of shards
+to reside in memory, with the rest on disk. The smallest few shards,
+which sustain the most reconstructions, are kept in memory for
+performance, while the majority of the data is stored on disk, capturing
+some of the benefits of both storage types.\footnote{
+    In traditional LSM Trees, which are an external data structure,
+    only the memtable resides in memory. We have decided to break with
+    this model because, for query performance reasons, the mutable
+    buffer must remain small. By placing a few levels in memory, the
+    performance effects of frequent buffer flushes can be mitigated. This
+    isn't strictly necessary, however.
+}

 The on-disk shards are built from standard ISAM trees using $8$ KiB
 page-aligned internal and leaf nodes. To avoid random writes, we only
@@ -25,43 +36,50 @@ when it is not located. However, because of the geometric growth rate
 of the shards, at any given time the majority of the data will be on
 disk anyway, so this would only provide a marginal improvement.

-Our implementation does not include a buffer manager, for simplicty. The
-external interface requires passing in page-aligned buffers.

+\subsection{Distributed Data Structures}
+Many distributed data processing systems are built on immutable
+abstractions, such as Apache Spark's resilient distributed dataset
+(RDD)~\cite{rdd} or the Hadoop file system's (HDFS) append-only
+files~\cite{hadoop}. Each shard can be encapsulated within an HDFS
+file or a Spark RDD, and a centralized control node can manage the
+mutable buffer. Flushing this buffer would create a new file/RDD, and
+reconstructions could likewise be performed by creating new immutable
+structures through the merging of existing ones, using the same basic
+scheme as has already been discussed in this chapter. Using these tools,
+SSIs over datasets that exceed the capacity of a single node could be
+supported.
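To make the centralized-buffer scheme concrete, the following is a minimal C++ sketch, not part of the dissertation's implementation: immutable sorted runs stand in for HDFS files or Spark RDDs, a flush turns the control node's mutable buffer into a new run, and a reconstruction merges existing runs into a new immutable one. The names `Shard`, `flush_buffer`, and `reconstruct` are all hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <memory>
#include <vector>

// Hypothetical sketch: shards are immutable sorted runs, standing in
// for HDFS files or Spark RDDs.
using Shard = const std::vector<int>;

std::vector<int> buffer;                     // mutable buffer (control node)
std::vector<std::shared_ptr<Shard>> shards;  // immutable shard set

// Flushing the buffer creates a new immutable shard (file/RDD).
void flush_buffer() {
    std::vector<int> run = buffer;
    std::sort(run.begin(), run.end());
    shards.push_back(std::make_shared<Shard>(std::move(run)));
    buffer.clear();
}

// A reconstruction merges existing immutable shards into a new
// immutable shard, never modifying its inputs.
std::shared_ptr<Shard> reconstruct(Shard &a, Shard &b) {
    std::vector<int> out;
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(out));
    return std::make_shared<Shard>(std::move(out));
}
```

Because both flushes and reconstructions only ever create new immutable objects, this maps directly onto append-only files or RDD transformations, with the control node sequencing the operations.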
Such distributed SSIs already exist, such as the RDD-based sampling
+structure used in XDB~\cite{li19}.

+\subsection{Concurrency}
+\label{ssec:ext-concurrency}
+Because our dynamization technique is built on top of static data
+structures, a limited form of concurrency support is straightforward to
+implement. To that end, we created a proof-of-concept dynamization of an
+ISAM tree for IRS based on a simplified version of a general concurrency
+control scheme for log-structured data stores~\cite{golan-gueta15}.

-\Paragraph{Applications to distributed data structures.}
-Many distributed file-systems are built on immutable abstracted, such
-Apache Spark's resilient distributed dataset (RDD)~\cite{rdd} or Hadoop's
-immutable

+First, we restrict ourselves to tombstone deletes. This ensures that
+all the static data structures within our dynamization are also immutable.
+When using tagging, the deleted flags on records in these structures could
+be dynamically updated, leading to possible synchronization issues. While
+this problem is not fundamentally unsolvable, and could be addressed
+through the use of a timestamp in the header of each record, we decided
+to implement our concurrency scheme on the assumption of full shard
+immutability, for simplicity.

+Given this immutability, we can construct a simple versioning system over
+the entire structure. Reconstructions can be performed in the background
+and then ``activated'' atomically by using a simple compare-and-swap of
+a pointer to the entire structure. Reference counting can then be used
+to automatically free old versions of the structure when all queries
+accessing them have finished.

-Because the framework maintains immutability of shards, it is also well suited for
-use on top of distributed file-systems or with other distributed data
-abstractions like RDDs in Apache Spark~\cite{rdd}. Each shard can be
-encapsulated within an immutable file in HDFS or an RDD in Spark.
A centralized
-control node or driver program can manage the mutable buffer, flushing it into
-a new file or RDD when it is full, merging with existing files or RDDs using
-the same reconstruction scheme already discussed for the framework. This setup
-allows for datasets exceeding the capacity of a single node to be supported. As
-an example, XDB~\cite{li19} features an RDD-based distributed sampling
-structure that could be supported by this framework.
-
-\Paragraph{Concurrency.} The immutability of the majority of the structures
-within the index makes for a straightforward concurrency implementation.
-Concurrency control on the buffer is made trivial by the fact it is a simple,
-unsorted array. The rest of the structure is never updated (aside from possible
-delete tagging), and so concurrency becomes a simple matter of delaying the
-freeing of memory used by internal structures until all the threads accessing
-them have exited, rather than immediately on merge completion. A very basic
-concurrency implementation can be achieved by using the tombstone delete
-policy, and a reference counting scheme to control the deletion of the shards
-following reconstructions. Multiple insert buffers can be used to improve
-insertion throughput, as this will allow inserts to proceed in parallel with
-merges, ultimately allowing concurrency to scale up to the point of being
-bottlenecked by memory bandwidth and available storage. This proof-of-concept
-implementation is based on a simplified version of an approach proposed by
-Golan-Gueta et al. for concurrent log-structured data stores
-\cite{golan-gueta15}.
-
+The buffer itself is an unsorted array, so a query can capture a
+consistent and static version by storing the tail pointer at the time
+the query begins. New inserts can be performed concurrently by doing
+a fetch-and-add on the tail.
By using multiple buffers, inserts and
+reconstructions can proceed in parallel to some extent, hiding some of
+the insertion tail latency caused by blocking on reconstructions during
+a buffer flush.
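The concurrency scheme described above can be sketched in a few lines of C++. This is an illustration under stated assumptions, not the actual implementation: `std::shared_ptr` reference counts reclaim old versions, `std::atomic_load`/`std::atomic_store` perform the atomic pointer swap that activates a new version, and `fetch_add` claims buffer slots for concurrent inserts. The `Version` and `Dynamized` names are hypothetical.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical sketch: a "version" is an immutable set of shards.
struct Version {
    std::vector<int> shards;  // stand-in for immutable shard structures
};

class Dynamized {
    std::shared_ptr<const Version> current_ =
        std::make_shared<const Version>();
    std::vector<int> buffer_ = std::vector<int>(1024);  // mutable buffer
    std::atomic<std::size_t> tail_{0};

public:
    // Inserts claim a slot with fetch-and-add; no locking needed.
    void insert(int rec) {
        std::size_t slot = tail_.fetch_add(1);
        buffer_[slot] = rec;
    }

    // A query pins a consistent view: the structure pointer plus the
    // buffer tail as they were when the query began.
    std::pair<std::shared_ptr<const Version>, std::size_t> snapshot() const {
        return {std::atomic_load(&current_), tail_.load()};
    }

    // A background reconstruction builds `next` off to the side, then
    // activates it with an atomic pointer swap. The old version is freed
    // automatically once the last query's shared_ptr is dropped.
    void activate(std::shared_ptr<const Version> next) {
        std::atomic_store(&current_, std::move(next));
    }
};
```

Note that a query holding a snapshot keeps its version alive regardless of how many activations happen after it starts, which is exactly the deferred-freeing behavior the tombstone-only, fully immutable design makes possible.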