| author | Douglas Rumbaugh <dbr4@psu.edu> | 2025-05-12 19:59:26 -0400 |
|---|---|---|
| committer | Douglas Rumbaugh <dbr4@psu.edu> | 2025-05-12 19:59:26 -0400 |
| commit | 5ffc53e69e956054fdefd1fe193e00eee705dcab (patch) | |
| tree | 74fd32db95211d0be067d22919e65ac959e4fa46 /chapters/introduction.tex | |
| parent | 901a04fd8ec9a07b7bd195517a6d9e89da3ecab6 (diff) | |
Updates
Diffstat (limited to 'chapters/introduction.tex')
| -rw-r--r-- | chapters/introduction.tex | 42 |
1 files changed, 42 insertions, 0 deletions
diff --git a/chapters/introduction.tex b/chapters/introduction.tex
index a5d9740..7084867 100644
--- a/chapters/introduction.tex
+++ b/chapters/introduction.tex
@@ -1,6 +1,48 @@
 \chapter{Introduction}
 \label{chap:intro}
 
+Modern relational database management systems (RDBMS) are founded
+upon a set-based representation of data~\cite{codd70}. This model is
+very flexible and can be used to represent data of a wide variety of
+types, from standard tabular information, to vectors, to graphs, and
+more. However, this flexibility comes at a significant cost in terms of
+its ability to answer queries: the most basic data access operation is
+a linear table scan.
+
+To work around this limitation, RDBMS support the creation of special
+data structures called indices, which can be used to accelerate
+particular types of query, and feature sophisticated query planning and
+optimization systems that can identify opportunities to utilize these
+indices~\cite{cowbook}. This approach works well for particular types
+of queries for which an index has been designed and integrated into
+the database. Unfortunately, many RDBMS only support a very limited
+set of indices for accelerating single dimensional range queries and
+point-lookups~\cite{mysql-btree-hash, cowbook}.
+
+This situation is unfortunate, because one of the major challenges
+currently facing data systems is the processing of complex analytical
+queries of varying types over large sets of data. These queries and
+data types are supported, nominally, by a relational database, but
+are not well addressed by existing indexing techniques and as a result
+have horrible performance. This has led to the development of a variety
+of specialized systems for particular types of query, such as spatial
+systems~\cite{postgis-doc}, vector databases~\cite{pinecone-db}, and
+graph databases~\cite{neptune, neo4j}.
+
+
+
+
+
+
+however the cost of this flexibility is
+
+Modern relational database systems are based upon the fundamental data
+
+
+highly optimized for addressing
+particular types of search problems, such as point lookups and range
+queries.
+
 One of the major challenges facing current data systems is the processing
 of complex and varied analytical queries over vast data sets. One commonly
 used technique for accelerating these queries is the application of data