path: root/chapters/introduction.tex
author Douglas Rumbaugh <dbr4@psu.edu> 2025-05-12 19:59:26 -0400
committer Douglas Rumbaugh <dbr4@psu.edu> 2025-05-12 19:59:26 -0400
commit 5ffc53e69e956054fdefd1fe193e00eee705dcab (patch)
tree 74fd32db95211d0be067d22919e65ac959e4fa46 /chapters/introduction.tex
parent 901a04fd8ec9a07b7bd195517a6d9e89da3ecab6 (diff)
Updates
Diffstat (limited to 'chapters/introduction.tex')
-rw-r--r-- chapters/introduction.tex 42
1 files changed, 42 insertions, 0 deletions
diff --git a/chapters/introduction.tex b/chapters/introduction.tex
index a5d9740..7084867 100644
--- a/chapters/introduction.tex
+++ b/chapters/introduction.tex
@@ -1,6 +1,48 @@
\chapter{Introduction}
\label{chap:intro}
+Modern relational database management systems (RDBMS) are founded
+upon a set-based representation of data~\cite{codd70}. This model is
+very flexible and can be used to represent data of a wide variety of
+types, from standard tabular information, to vectors, to graphs, and
+more. However, this flexibility comes at a significant cost in query
+performance: the most basic data access operation is a linear scan of
+the table.
+
+To work around this limitation, RDBMS support the creation of special
+data structures called indices, which can be used to accelerate
+particular types of query, and feature sophisticated query planning and
+optimization systems that can identify opportunities to utilize these
+indices~\cite{cowbook}. This approach works well for the particular types
+of queries for which an index has been designed and integrated into
+the database. Unfortunately, many RDBMS support only a very limited
+set of indices, which accelerate single-dimensional range queries and
+point lookups~\cite{mysql-btree-hash, cowbook}.
+
+This limitation is significant, because one of the major challenges
+currently facing data systems is the processing of complex analytical
+queries of varying types over large sets of data. These queries and
+data types are nominally supported by relational databases, but they
+are not well addressed by existing indexing techniques and, as a result,
+perform poorly. This has led to the development of a variety
+of specialized systems for particular types of query, such as spatial
+systems~\cite{postgis-doc}, vector databases~\cite{pinecone-db}, and
+graph databases~\cite{neptune, neo4j}.
+
+
+
+
+
+
+however the cost of this flexibility is
+
+Modern relational database systems are based upon the fundamental data
+
+
+highly optimized for addressing
+particular types of search problems, such as point lookups and range
+queries.
+
One of the major challenges facing current data systems is the processing
of complex and varied analytical queries over vast data sets. One commonly
used technique for accelerating these queries is the application of data