\chapter{Introduction}
\label{chap:intro}

One of the major challenges facing current data systems is the processing of complex and varied analytical queries over vast data sets. One commonly used technique for accelerating these queries is the application of data structures to create indexes, which are the basis for specialized database systems and data processing libraries. Unfortunately, the development of these indexes is difficult because of the requirements placed on them by data processing systems. Data is frequently subject to updates, yet a large number of potentially useful data structures are static. Further, many large-scale data processing systems are highly concurrent, which raises the barrier to entry even further. The process for developing data structures that satisfy these requirements is arduous.

To demonstrate this difficulty, consider the recent example of the evolution of learned indexes. These are data structures designed to efficiently solve a simple problem: single-dimensional range queries over sorted data. They seek to reduce both the size of the structure and its lookup times by replacing a traditional data structure with a learned model capable of predicting, to within a bounded error, the location in storage of a record matching a given key value. This concept was first proposed by Kraska et al. in 2017, when they published a paper on the first learned index, RMI~\cite{RMI}. This index succeeded in showing that a learned model can be both faster and smaller than a conventional range index, but the proposed solution did not support updates. The first (non-concurrently) updatable learned index, ALEX, took a year and a half to appear~\cite{ALEX}.
Over the course of the subsequent three years, several learned indexes were proposed with concurrency support~\cite{10.1145/3332466.3374547,10.14778/3489496.3489512}, but a recent performance study~\cite{10.14778/3551793.3551848} showed that these were still generally inferior to ART-OLC~\cite{10.1145/2933349.2933352}, a traditional index. This same study did, however, demonstrate that a new design, ALEX+, was able to outperform ART-OLC under certain circumstances. Even with this result, learned indexes are not generally considered production-ready, because they suffer from significant performance regressions under certain workloads and are highly sensitive to the distribution of keys~\cite{10.14778/3551793.3551848}. Despite the demonstrable advantages of the technique and over half a decade of development, learned indexes still have not reached a generally usable state.\footnote{In Chapter~\ref{chap:framework}, we apply our proposed technique to existing static learned indexes to produce an effective dynamic index.}

This work proposes a strategy for addressing this problem by providing a framework for automatically introducing support for concurrent updates (including both inserts and deletes) to many static data structures. With this framework, a wide range of static, or otherwise impractical, data structures will be made practically useful in data systems. Based on a classical, theoretical framework called the Bentley-Saxe Method~\cite{saxe79}, the proposed system will provide a library that can automatically extend many data structures with support for concurrent updates, as well as a tunable design space that allows the user to make trade-offs between read performance, write performance, and storage usage. The framework will address a number of limitations present in the original technique, significantly broadening its applicability and practicality.
It will also provide a workload-adaptive, online tuning system that can automatically adjust the tuning parameters of the data structure in the face of changing workloads.

This framework is based on splitting the data structure into several smaller pieces, which are periodically reconstructed to support updates. A systematic partitioning and reconstruction approach is used to provide specific guarantees on amortized insertion performance and worst-case query performance. The underlying Bentley-Saxe method is extended using a novel query abstraction to broaden its applicability, and the partitioning and reconstruction processes are adjusted to improve performance and introduce configurability.

Specifically, the proposed work will address the following points:
\begin{enumerate}
    \item The proposal of a theoretical framework for analyzing queries and data structures that extends existing theoretical approaches and allows for more data structures to be dynamized.
    \item The design of a system based upon this theoretical framework for automatically dynamizing static data structures in a performant and configurable manner.
    \item The extension of this system with support for concurrent operations, and the use of concurrency to provide more effective worst-case performance guarantees.
\end{enumerate}

The rest of this document is structured as follows. First, Chapter~\ref{chap:background} introduces relevant background information, including the importance of data structures and indexes in database systems, the concept of a search problem, and techniques for designing updatable data structures. Next, in Chapter~\ref{chap:sampling}, the application of the Bentley-Saxe method to a number of sampling data structures is presented. The extension of these structures introduces a number of challenges which must be addressed, resulting in significant modification of the underlying technique.
Then, Chapter~\ref{chap:framework} generalizes the modifications developed for the sampling framework into a broader framework. Chapter~\ref{chap:proposed} discusses the work that remains to be completed as part of this project, and Chapter~\ref{chap:conclusion} concludes the work.