\chapter{Summary and Future Work}
\label{chap:conclusion}

One of the perennial problems in database systems is the design of new indices to support new data types and search problems. While numerous data structures exist that could serve as the basis for such indices, there is a mismatch between the feature set required of an index and that provided by a data structure, and implementing the missing features requires a significant amount of effort. To circumvent this problem, there have been past efforts at creating systems that automate some, or all, of the index design process in certain contexts.

These existing efforts fall short of a truly general solution to the problem of automatic index generation. Automatic index composition assumes a particular search problem and a set of data structure primitives, and then composes those primitives into a custom structure that is optimized for a particular workload. Generalized index templates assume a solution structure, and attempt to solve a search problem within that structure. In both cases, the core methodology of the approach restricts the types of problems to which it can be applied. Thus, neither is a viable approach to creating indices for arbitrary search problems.

We propose a system based on a third technique: automatic feature extension. Starting with an existing data structure for the search problem of interest, various general techniques can be used to automatically add the features that the structure is missing, resulting in an index. A special case of this approach is well studied in the theoretical literature: dynamization. Dynamization seeks to automatically add support for inserts, and sometimes deletes, to a static data structure for a search problem that satisfies certain constraints.
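
To make the phrase ``certain constraints'' concrete, the following is a sketch of the classical requirement from the dynamization literature, decomposability. The notation ($F$, $A$, $B$, $q$, and the merge operator $\diamond$) is chosen here for illustration and is not necessarily the notation used in earlier chapters. A search problem $F$ is decomposable if, for any two disjoint sets of records $A$ and $B$ and any query $q$,
\[
F(A \cup B, q) = F(A, q) \diamond F(B, q),
\]
where $\diamond$ is an efficiently computable binary operator for merging partial results. Under this assumption, a query can be answered by running it independently against each of the static structures that make up a dynamized index and then merging the partial results.
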
Dynamization has a number of limitations that prevent it from standing on its own as a solution to this problem, and so this work has concentrated on overcoming these shortcomings. By introducing new classifications of search problems, along with mechanisms to support solving them over a dynamized structure, we extended the applicability of dynamization techniques to a broader set of data structures and search problems, and increased the number of search problems for which deletes can be efficiently supported. We considered the design space of the similarly structured LSM Tree data structure, and borrowed certain applicable elements to introduce a configurable design space that allows for trade-offs between insertion and query performance. We then devised a system for controlling the worst-case insertion performance of dynamized structures, leveraging concurrency to match the lowest existing worst-case bound in the theoretical literature, and then parallelism to beat it. Through this effort, we have resolved what we saw as the most significant barriers to the use of dynamization in the context of database indexing.

\section{Future Work}

While this is a significant step forward, considerable work remains before the ultimate goal of a general, automatic index generation framework is reached. We have resolved a number of existing problems to make dynamization viable in the context of database systems, and have expanded the scope of dynamization to include concurrency, but a database index requires more features than update support. In particular, our framework must also support the following additional features:

\begin{enumerate}
\item \textbf{Automatic tuning of insertion rejection rate.} \\
The tail latency control system discussed in Chapter~\ref{chap:tail-latency} is based upon setting a rejection rate parameter for inserts, which must be tuned for the data structure being dynamized.
The current version treats this as a user-specified constant, but ideally this parameter would be determined automatically based on the observed performance of the framework. In particular, we noted in Chapter~\ref{chap:tail-latency} that fixing it to a single value is sub-optimal for some data structures, and that there are opportunities to adjust it dynamically based on the actual rate of inserts into the system to achieve better throughput. The design of a system for automatic rejection rate tuning is an important next step for the framework.

\item \textbf{Support for external storage.} \\
Although our implementation of the sampling framework discussed in Chapter~\ref{chap:sampling} included a version that used an external data structure, the general framework discussed in the subsequent chapters was considered for in-memory structures only. We will need to extend it with support for external structures, and evaluate whether our proposed techniques still function effectively in this context.

\item \textbf{Crash recovery.} \\
It is critical for a database index to support crash recovery, so that it can be restored to a state consistent with the rest of the database in the event of a system fault. Because our dynamized indices are append-only, and can be viewed as a log of sorts, a naive form of crash recovery is straightforward: all operations can be logged and then replayed in the event of a crash. This is highly inefficient, however, and so a better scheme must be devised.

\item \textbf{Distributed systems support.} \\
The append-only and decomposed nature of dynamized indices makes them seem a natural fit for a distributed systems context. This was briefly discussed in Section~\ref{ssec:ext-distributed}. While not required for all, or even most, applications, support for automatically distributing an index over multiple nodes in a distributed system would be desirable.
\end{enumerate}

Once the full set of necessary index features is supported by the framework, we plan to integrate the system into a database to allow user-defined indexing. To accommodate this, it will also be necessary to devise a mechanism that allows the query optimizer to use these arbitrary, user-defined indices when generating query plans.