Initial commit

author: Douglas Rumbaugh <dbr4@psu.edu> 2025-04-27 17:36:57 -0400
committer: Douglas Rumbaugh <dbr4@psu.edu> 2025-04-27 17:36:57 -0400
commit: 5e4ad2777acc4c2420514e39fb98b7cf2e200996 (patch)
tree: 276c075048e85426436db8babf0ca1f37e9fdba2 /chapters/abstract.tex
download: dissertation-5e4ad2777acc4c2420514e39fb98b7cf2e200996.tar.gz
1 files changed, 42 insertions, 0 deletions
diff --git a/chapters/abstract.tex b/chapters/abstract.tex
new file mode 100644
index 0000000..5ddfd37
--- /dev/null
+++ b/chapters/abstract.tex
@@ -0,0 +1,42 @@
+Modern data systems must cope with a wider variety of data than ever
+before, and as a result a large number of highly specialized data
+management systems, such as vector and graph databases, have
+proliferated. These systems are built upon specialized data structures for
+a particular query, or class of queries, and as a result have a very
+specific range of efficacy. Beyond this, they are difficult to develop
+because of the requirements that they place upon the data structures at
+their core, including requiring support for concurrent updates. As a
+result, a large number of potentially useful data structures are excluded
+from use in such systems, or at the very least require a large amount of
+development time to be made useful.
+
+This work seeks to address this difficulty by introducing a framework for
+automatic data structure dynamization. Given a static data structure and
+an associated query that satisfy certain requirements, the proposed
+framework will automatically add support for concurrent updates, with
+minimal modification to the data structure itself. It is based on a
+body of theoretical work on dynamization, often called the ``Bentley-Saxe
+Method'', which partitions data into a number of small data structures
+and periodically rebuilds these as records are inserted or deleted, in
+a manner that maintains asymptotic bounds on worst-case query time,
+as well as amortized insertion time. These techniques, as they currently
+exist, are of limited use because they perform poorly in practice
+and lack support for concurrency. Nonetheless, they serve as a solid
+theoretical base upon which a novel system can be built to address
+these concerns.
+
+To develop this framework, sampling queries (which are not well served
+by existing dynamic data structures) are first considered. The results
+of this analysis are then generalized to produce a framework for
+single-threaded dynamization that is applicable to a large number
+of possible data structures and query types, and the general framework
+is evaluated across a number of them. These
+dynamized static structures are shown to match or exceed
+existing specialized dynamic structures in both update and query
+performance.
+
+Finally, this general framework is expanded with support for concurrent
+operations (inserts and queries), and the use of scheduling and
+parallelism is studied to provide worst-case insertion guarantees,
+as well as a rich trade-off space between query and insertion performance.
+
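
Note on the Bentley-Saxe method mentioned in the abstract: the partition-and-rebuild idea can be illustrated with a minimal, single-threaded, insert-only sketch. This is not the dissertation's framework. The names `SortedArray`, `BentleySaxe`, and `level_sizes` are hypothetical, a sorted array stands in for the static structure, and a membership test stands in for the query (it is decomposable, since the answer is the OR of the answers over each partition). Deletes and concurrency are deliberately left out.

```python
from bisect import bisect_left


class SortedArray:
    """Hypothetical static structure: built once, never modified in place."""

    def __init__(self, records):
        self.records = sorted(records)

    def contains(self, key):
        i = bisect_left(self.records, key)
        return i < len(self.records) and self.records[i] == key


class BentleySaxe:
    """Insert-only sketch; level i is either empty or holds 2**i records."""

    def __init__(self):
        self.levels = []  # levels[i] is None or a SortedArray of 2**i records

    def insert(self, record):
        carry = [record]
        i = 0
        # Like binary addition: absorb each full level into the carry
        # until an empty level is found, then rebuild once at that level.
        while i < len(self.levels) and self.levels[i] is not None:
            carry.extend(self.levels[i].records)
            self.levels[i] = None
            i += 1
        if i == len(self.levels):
            self.levels.append(None)
        self.levels[i] = SortedArray(carry)

    def contains(self, key):
        # Decomposable query: ask every non-empty level and combine with OR.
        return any(lvl is not None and lvl.contains(key) for lvl in self.levels)

    def level_sizes(self):
        return [0 if lvl is None else len(lvl.records) for lvl in self.levels]
```

In this sketch the occupied levels mirror the binary representation of the record count, so a query touches at most a logarithmic number of static structures, and each record takes part in at most a logarithmic number of rebuilds.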