\captionsetup[subfloat]{justification=centering} \section{Extensions to the Framework} \label{sec:discussion} While this chapter has thus far discussed single-threaded, in-memory data structures, the framework as proposed can be easily extended to support other use-cases. In this section, we discuss extending this framework to support concurrency and external data structures. \Paragraph{Larger-than-Memory Data.} Our dynamization techniques, as discussed thus far, can easily accommodate external data structures as well as in-memory ones. To demonstrate this, we have implemented a dynamized version of an external ISAM tree for use in answering IRS queries. The mutable buffer remains an unsorted array in memory; however, each shard can be \emph{either} an in-memory ISAM tree or an external one. For performance tuning purposes, our system allows a user-configurable number of shards to reside in memory, with the rest on disk. The on-disk shards are standard ISAM trees using $8$ KiB page-aligned internal and leaf nodes. To avoid random writes, we only support tombstone-based deletes. Theoretically, it should be possible to implement a hybrid approach, in which a delete first searches the in-memory shards for the record and tags it if found, inserting a tombstone only when it is not located. However, because of the geometric growth rate of the shards, the majority of the data will reside on disk at any given time, so this would provide only a marginal improvement. For simplicity, our implementation does not include a buffer manager; the external interface requires passing in page-aligned buffers.
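The hybrid delete policy described above could be sketched as follows. This is an illustrative sketch only, not part of our implementation: the \texttt{Record}, \texttt{MemShard}, and \texttt{Dynamized} types are hypothetical stand-ins, and the in-memory ISAM tree is simplified to a sorted array.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical record type with a delete-tag bit. The tag is only
// meaningful for in-memory shards; on-disk shards rely on tombstones.
struct Record {
    int64_t key;
    bool deleted;
};

// Stand-in for an in-memory ISAM shard: a sorted array of records.
struct MemShard {
    std::vector<Record> data;  // sorted by key

    // Tag the record as deleted if present; returns true on success.
    bool tag_delete(int64_t key) {
        auto it = std::lower_bound(
            data.begin(), data.end(), key,
            [](const Record &r, int64_t k) { return r.key < k; });
        if (it != data.end() && it->key == key) {
            it->deleted = true;
            return true;
        }
        return false;
    }
};

// Hybrid delete: tag in memory when possible, otherwise fall back
// to appending a tombstone to the mutable buffer.
struct Dynamized {
    std::vector<MemShard> mem_shards;  // small, in-memory levels
    std::vector<Record> buffer;        // unsorted mutable buffer

    void erase(int64_t key) {
        for (auto &s : mem_shards) {
            if (s.tag_delete(key)) return;  // no tombstone needed
        }
        buffer.push_back({key, true});  // tombstone for on-disk shards
    }
};
```

Because only the in-memory shards are probed, the delete path never issues a random disk read; the cost of a miss is a single buffer append, which is why the marginal benefit noted above comes at essentially no extra I/O.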
\Paragraph{Applications to Distributed Data Structures.} Many distributed data systems are built on immutable abstractions, such as Apache Spark's resilient distributed datasets (RDDs)~\cite{rdd} or Hadoop's immutable files in HDFS. Because the framework maintains the immutability of shards, it is well suited for use on top of such systems: each shard can be encapsulated within an immutable file in HDFS or an RDD in Spark. A centralized control node or driver program can manage the mutable buffer, flushing it into a new file or RDD when it is full and merging with existing files or RDDs using the same reconstruction scheme already discussed for the framework. This setup allows datasets exceeding the capacity of a single node to be supported. As an example, XDB~\cite{li19} features an RDD-based distributed sampling structure that could be supported by this framework. \Paragraph{Concurrency.} The immutability of the majority of the structures within the index makes for a straightforward concurrency implementation. Concurrency control on the buffer is trivial because it is a simple, unsorted array. The rest of the structure is never updated in place (aside from possible delete tagging), so concurrency reduces to delaying the freeing of memory used by internal structures until all threads accessing them have exited, rather than freeing it immediately on merge completion. A basic concurrency implementation can therefore be achieved by using the tombstone delete policy together with a reference-counting scheme to control the deletion of shards following reconstructions. Multiple insert buffers can be used to improve insertion throughput, as this allows inserts to proceed in parallel with merges, ultimately allowing concurrency to scale up to the point of being bottlenecked by memory bandwidth and available storage.
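A minimal sketch of the reference-counting scheme might look as follows. The \texttt{ConcurrentIndex} and \texttt{Version} names are hypothetical, and \texttt{std::shared\_ptr} reference counts stand in for an explicit counting mechanism: a query pins an immutable snapshot of the shard list, and the shards of a superseded version are freed only after the last query pinning them exits.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

// A hypothetical immutable shard; never modified after construction.
struct Shard {
    std::vector<int64_t> keys;
};

// An immutable snapshot of the structure: the list of live shards.
using Version = std::vector<std::shared_ptr<const Shard>>;

struct ConcurrentIndex {
    std::shared_ptr<const Version> current = std::make_shared<Version>();
    std::mutex swap_lock;

    // Queries pin the current version. The shared_ptr copy keeps every
    // shard in this version alive for the duration of the query.
    std::shared_ptr<const Version> pin() {
        std::lock_guard<std::mutex> g(swap_lock);
        return current;
    }

    // A reconstruction installs a new version atomically. Old shards
    // are reclaimed lazily, when the last in-flight query that pinned
    // them releases its reference -- not immediately on merge completion.
    void install(Version next) {
        auto v = std::make_shared<const Version>(std::move(next));
        std::lock_guard<std::mutex> g(swap_lock);
        current = std::move(v);
    }
};
```

Because queries only ever copy a pointer under a short critical section, readers never block merges; the only serialized operation is the version swap itself, which is consistent with the observation above that throughput ultimately scales until memory bandwidth and storage become the bottleneck.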
This proof-of-concept implementation is based on a simplified version of an approach proposed by Golan-Gueta et al. for concurrent log-structured data stores \cite{golan-gueta15}.