\chapter{Exploring the Design Space}
\label{chap:design-space}
\section{Introduction}
In the previous two chapters, we introduced an LSM-tree-inspired design
space into the Bentley-Saxe method to allow for more flexibility in
performance tuning. However, aside from some general comments
about how these parameters relate to insertion and
query performance, and some limited experimental evaluation, we have not
performed a systematic analysis of this space, its capabilities, and its
limitations. We rectify this situation in this chapter with
both a detailed mathematical analysis of the design parameter space
and experiments demonstrating that these trade-offs exist in practice.
\subsection{Why bother?}
Before examining the design space we have introduced in more detail, it's
worth taking some time to motivate the endeavor. There is a large
body of theoretical work in the area of data structure dynamization
and, to the best of our knowledge, none of it proposes
a design space of the sort we describe here. Despite this,
some papers which \emph{use} these techniques have incorporated similar
design elements into their own implementations~\cite{pgm}, with some
even going so far as to (inaccurately) describe these elements as part
of the Bentley-Saxe method~\cite{almodaresi23}.
This situation is best understood, we think, in terms of the ultimate
goals of the respective lines of work. In the classical literature on
dynamization, the focus is mostly on proving theoretical asymptotic
bounds about the techniques. In this context, the LSM tree design space
is of limited utility, because its tuning parameters adjust constant
factors only, and thus don't play a major role in asymptotics. Where
the theoretical literature does introduce configurability, such as
with the equal block method~\cite{overmars-art-of-dyn} or more
complex schemes that nest the equal block method \emph{inside}
of a binary decomposition~\cite{overmars81}, the intention is
to produce asymptotically relevant trade-offs between insert,
query, and delete performance for deletion decomposable search
problems~\cite[pg. 117]{overmars83}. This is why the equal block method
is described in terms of a function, rather than a constant value,
to enable it to appear in the asymptotics.
On the other hand, in practical scenarios, constant-factor tuning of performance
can be very relevant. We've already shown in Sections~\ref{ssec:ds-exp}
and \ref{ssec:dyn-ds-exp} how tuning parameters, particularly the
number of shards per level, can have measurable real-world effects on the
performance characteristics of dynamized structures, and in fact sometimes
this tuning is \emph{necessary} to enable reasonable performance. It's
quite telling that the two most direct implementations of the Bentley-Saxe
method that we have identified in the literature are both in the context
of metric indices~\cite{naidan14,bkdtree}, a class of data structure
and search problem for which we saw very good performance from standard
Bentley-Saxe in Section~\ref{ssec:dyn-knn-exp}. The other experiments
in Chapter~\ref{chap:framework} show that, for other types of problem,
the technique does not fare quite so well.
\section{Asymptotic Analysis}
\label{sec:design-asymp}
Before deriving
the cost functions of dynamized structures within our
proposed design space, we should make a few comments about the assumptions
and techniques used in our analysis. As this design space
involves adjusting constants, we will retain the design-space-related
constants within our asymptotic expressions. Additionally, we will
perform the analysis for a simple decomposable search problem. Deletes
will be entirely neglected, and we won't make any assumptions about
mergeability. We will also neglect the buffer size, $N_B$, during this
analysis. Buffering isn't fundamental to the techniques we are examining
in this chapter, and including it would increase the complexity of the
analysis without contributing any useful insights.\footnote{
The contribution of the buffer size is simply to replace each of the
individual records considered in the analysis with batches of $N_B$
records. The same patterns hold.
}
\subsection{Generalized Bentley-Saxe Method}
As a first step, we will derive a modified version of the Bentley-Saxe
method that has been adjusted to support arbitrary scale factors and
buffering. There's nothing fundamental to the technique that prevents
such modifications, and it's likely that they have not been analyzed
in this way before, simply due to a lack of interest in constant factors in
theoretical asymptotic analysis. During our analysis, we'll intentionally
leave these constant factors in place.
When generalizing the Bentley-Saxe method for arbitrary scale factors, we
decided to maintain the core concept of the binary decomposition. One interesting
mathematical property of a Bentley-Saxe dynamization is that the internal
layout of its levels exactly matches the binary representation of the record
count contained within the index. For example, a dynamization containing
$n=20$ records will have $4$ records in level $2$ and $16$ records in level $4$,
with all other levels being empty. If we represent a full level with a $1$
and an empty level with a $0$, and write the levels from largest to smallest,
then we'd have $10100$, which is $20$ in base $2$.
\begin{algorithm}
\caption{The Generalized BSM Layout Policy}
\label{alg:design-bsm}
\KwIn{$r$: set of records to be inserted, $\mathscr{I}$: a dynamized structure, $n$: number of records in $\mathscr{I}$}
\BlankLine
\Comment{Find the first non-full level}
$target \gets -1$ \;
\For{$i=0\ldots \log_s n$} {
\If {$|\mathscr{I}_i| < N_B (s - 1)\cdot s^i$} {
$target \gets i$ \;
break \;
}
}
\BlankLine
\Comment{If the structure is full, we need to grow it}
\If {$target = -1$} {
$target \gets 1 + (\log_s n)$ \;
}
\BlankLine
\Comment{Build the new structure}
$\mathscr{I}_{target} \gets \text{build}(\text{unbuild}(\mathscr{I}_0) \cup \ldots \cup \text{unbuild}(\mathscr{I}_{target}) \cup r)$ \;
\BlankLine
\Comment{Empty the levels used to build the new shard}
\For{$i=0\ldots target-1$} {
$\mathscr{I}_i \gets \emptyset$ \;
}
\end{algorithm}
Our generalization, then, is to represent the data as an $s$-ary
decomposition, where the scale factor represents the base of the
representation. To accomplish this, we set the capacity of level $i$ to
be $N_B (s - 1) \cdot s^i$, where $N_B$ is the size of the buffer. The
resulting structure will have at most $\log_s n$ shards, and the
corresponding layout policy is described in Algorithm~\ref{alg:design-bsm}.
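As a concrete illustration of this correspondence (using $N_B = 1$ for
simplicity), consider a scale factor of $s = 3$. The level capacities are
then $2, 6, 18, \ldots$ records, and a structure containing $n = 20$ records
is laid out according to the base-3 representation $20 = 202_3$: level $0$
holds $2$ records, level $1$ is empty, and level $2$ holds $18$ records.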
Unfortunately, the approach used by Bentley and Saxe to calculate the
amortized insertion cost of the BSM does not generalize to larger bases,
and so we will need to derive this result differently.
\begin{theorem}
The amortized insertion cost for generalized BSM with a scale factor of
$s$ is $\Theta\left(\frac{B(n)}{n} \cdot s\log_s n\right)$.
\end{theorem}
\begin{proof}
In order to calculate the amortized insertion cost, we will first
determine the average number of times that a record is involved in a
reconstruction, and then amortize those reconstructions over the records
in the structure.
If we consider only the first level of the structure, it's clear that
the reconstruction counts associated with the records in that level
will follow the pattern $1, 2, 3, \ldots, s-1$ when the level is full.
Thus, the total number of reconstructions associated with records on level
$i=0$ is the sum of that sequence, or
\begin{equation*}
W(0) = \sum_{j=1}^{s-1} j = \frac{1}{2}\left(s^2 - s\right)
\end{equation*}
Considering the next level, $i=1$, each reconstruction involving this
level will copy down the entirety of the structure above it, adding
one more write per record, as well as one write for the new record.
More specifically, continuing the above example, the first ``batch'' of records in
level $i=1$ will have the write counts $1, 2, 3, \ldots, s$;
the second ``batch'' will increment all of the existing write
counts by one and then introduce another copy of the $1, 2, 3, \ldots, s$
writes; and so on.
Thus, each new ``batch'' written to level $i$ will introduce $W(i-1) + 1$
writes from the previous level into level $i$, as well as rewriting all
of the records currently on level $i$.
The net result of this is that the number of writes on level $i$ is given
by the following recurrence relation (combined with the $W(0)$ base case),
\begin{equation*}
W(i) = sW(i-1) + \frac{1}{2}\left(s-1\right)^2 \cdot s^i
\end{equation*}
which can be solved to give the following closed-form expression,
\begin{equation*}
W(i) = s^i \cdot \left(\frac{1}{2} (s-1) \cdot (s(i+1) - i)\right)
\end{equation*}
which provides the total number of reconstructions that records in
level $i$ of the structure have participated in. As each record
is involved in a different number of reconstructions, we'll consider the
average number by dividing $W(i)$ by the number of records in level $i$.
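As a quick check, this closed form recovers the base case,
$W(0) = \frac{1}{2}(s-1)\cdot s = \frac{1}{2}(s^2 - s)$, and substituting
it into the recurrence gives
\begin{equation*}
s W(i-1) + \frac{1}{2}(s-1)^2 s^i = s^i \cdot \frac{1}{2}(s-1)\left(\left(s\,i - i + 1\right) + \left(s - 1\right)\right) = s^i \cdot \frac{1}{2}(s-1)\left(s(i+1) - i\right)
\end{equation*}
as required.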
From here, the proof proceeds in the standard way for this sort of
analysis. The worst-case cost of a reconstruction is $B(n)$, and there
are $\log_s(n)$ total levels, so the total reconstruction cost associated
with a record can be upper-bounded by $B(n) \cdot
\frac{W(\log_s(n))}{n}$, and this cost is then amortized over the $n$
insertions necessary to get the record into the last level. We'll also
condense the multiplicative constants and drop the additive ones to more
clearly represent the relationship we're looking to show. This results
in an amortized insertion cost of,
\begin{equation*}
\frac{B(n)}{n} \cdot s \log_s n
\end{equation*}
\end{proof}
\begin{theorem}
The worst-case insertion cost for generalized BSM with a scale factor
of $s$ is $\Theta(B(n))$.
\end{theorem}
\begin{proof}
The Bentley-Saxe method finds the smallest non-full block and performs
a reconstruction including all of the records from that block, as well
as all blocks smaller than it, and the new records to be added. The
worst case, then, will occur when all of the existing blocks in the
structure are full, and a new, larger, block must be added.
In this case, the reconstruction will involve every record currently
in the dynamized structure, and will thus have a cost of $I(n) \in
\Theta(B(n))$.
\end{proof}
\begin{theorem}
The worst-case query cost for generalized BSM for a decomposable
search problem with cost $\mathscr{Q}_S(n)$ is $O(\log_s(n) \cdot
\mathscr{Q}_S(n))$.
\end{theorem}
\begin{proof}
The worst-case scenario for queries in BSM occurs when every existing
level is full. In this case, there will be $\log_s n$ levels that must
be queried, with the $i$th level containing $(s - 1) \cdot s^i$ records.
Thus, the total cost of querying the structure will be,
\begin{equation}
\mathscr{Q}(n) = \sum_{i=0}^{\log_s n} \mathscr{Q}_S\left((s - 1) \cdot s^i\right)
\end{equation}
The number of records per shard will be upper bounded by $O(n)$, so
\begin{equation}
\mathscr{Q}(n) \in O\left(\sum_{i=0}^{\log_s n} \mathscr{Q}_S(n)\right)
\in O\left(\log_s n \cdot \mathscr{Q}_S(n)\right)
\end{equation}
\end{proof}
\begin{theorem}
The best-case query cost for generalized BSM for a decomposable
search problem with a cost of $\mathscr{Q}_S(n)$ is $\mathscr{Q}(n)
\in \Theta(\mathscr{Q}_S(n))$.
\end{theorem}
\begin{proof}
The best-case scenario for queries in BSM occurs when a new level is
added, which results in every record in the dynamization being compacted
into a single shard. In this case, there is only a single data
structure in the dynamization, and so the query cost over the dynamized
structure is identical to the query cost of a single static instance of
the structure. Thus, the best-case query cost in BSM is,
\begin{equation*}
\mathscr{Q}_B(n) \in \Theta \left( 1 \cdot \mathscr{Q}_S(n) \right) \in \Theta\left(\mathscr{Q}_S(n)\right)
\end{equation*}
\end{proof}
\subsection{Leveling}
\begin{algorithm}
\caption{The Leveling Policy}
\label{alg:design-leveling}
\KwIn{$r$: set of records to be inserted, $\mathscr{I}$: a dynamized structure, $n$: number of records in $\mathscr{I}$}
\BlankLine
\Comment{Find the first non-full level}
$target \gets -1$ \;
\For{$i=0\ldots \log_s n$} {
\If {$|\mathscr{I}_i| < N_B \cdot s^{i+1}$} {
$target \gets i$ \;
break \;
}
}
\BlankLine
\Comment{If the target is $0$, then just merge the buffer into it}
\If{$target = 0$} {
$\mathscr{I}_0 \gets \text{build}(\text{unbuild}(\mathscr{I}_0) \cup r)$ \;
\Return
}
\BlankLine
\Comment{If the structure is full, we need to grow it}
\If {$target = -1$} {
$target \gets 1 + (\log_s n)$ \;
}
\BlankLine
\Comment{Perform the reconstruction}
$\mathscr{I}_{target} \gets \text{build}(\text{unbuild}(\mathscr{I}_{target}) \cup \text{unbuild}(\mathscr{I}_{target - 1}))$ \;
\BlankLine
\Comment{Shift the remaining levels down to free up $\mathscr{I}_0$}
\For{$i=target-1 \ldots 1$} {
$\mathscr{I}_i \gets \mathscr{I}_{i-1}$ \;
}
\BlankLine
\Comment{Flush the buffer in $\mathscr{I}_0$}
$\mathscr{I}_0 \gets \text{build}(r)$ \;
\Return \;
\end{algorithm}
Our leveling layout policy is described in
Algorithm~\ref{alg:design-leveling}. Each level contains a single
structure with a capacity of $N_B\cdot s^{i+1}$ records. When a
reconstruction occurs, the smallest level, $i$, with space to contain the
records from level $i-1$, in addition to the records currently within
it, is located. Then, a new structure is built at level $i$ containing
all of the records in levels $i$ and $i-1$, and the structure at level
$i-1$ is deleted. Finally, all levels $j < (i - 1)$ are shifted to level
$j+1$. This process clears space in level $0$ to contain the buffer flush.
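For example (an illustrative configuration rather than one drawn from our
experiments), with $N_B = 1000$ and $s = 4$, level $0$ holds a single
structure of up to $4000$ records, level $1$ up to $16000$, level $2$ up
to $64000$, and so on; each buffer flush of $1000$ records is built into
a fresh structure at level $0$.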
\begin{theorem}
The amortized insertion cost of leveling with a scale factor of $s$ is
\begin{equation*}
I_A(n) \in \Theta\left(\frac{B(n)}{n} \cdot \frac{1}{2}(s+1)\log_s n\right)
\end{equation*}
\end{theorem}
\begin{proof}
Similarly to generalized BSM, the records in each level will be rewritten
up to $s$ times before they move down to the next level. Thus, the
amortized insertion cost for leveling can be found by determining how
many times a record is expected to be rewritten on a single level, and
how many levels there are in the structure.
On any given level, the total number of writes required to fill the level
is given by the expression,
\begin{equation*}
B(s + (s - 1) + (s - 2) + \ldots + 1)
\end{equation*}
where $B$ is the number of records added to the level during each
reconstruction (i.e., $N_B$ for level $0$ and $N_B \cdot s^{i}$ for
level $i > 0$).
This is because the first batch of records entering the level will be
rewritten during each of the $s$ rebuilds of the level before its
records are merged into the level below. The next batch will be rewritten
one fewer time, and so on. Thus, the total number of writes is,
\begin{equation*}
B\sum_{i=0}^{s-1} (s - i) = B\left(s^2 - \sum_{i=0}^{s-1} i\right) = B\left(s^2 - \frac{(s-1)s}{2}\right)
\end{equation*}
which can be simplified to get,
\begin{equation*}
\frac{1}{2}s(s+1)\cdot B
\end{equation*}
writes occurring on each level.\footnote{
This write count is not cumulative over the entire structure. It only
accounts for the number of writes occurring on this specific level.
}
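As a concrete check, with $s = 3$ the three batches that fill a level are
rewritten $3$, $2$, and $1$ times respectively, for a total of
$6B = \frac{1}{2}\cdot 3\cdot(3+1)\cdot B$ writes on that level.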
To obtain the total number of times a record is rewritten, we calculate
the average number of rewrites per record on each level (dividing the
$\frac{1}{2}s(s+1)B_i$ writes on level $i$, where $B_i$ is the batch size
of that level, by the $s B_i$ records in a full level) and sum this over
all of the levels.
\begin{equation*}
\sum_{i=0}^{\log_s n} \frac{\frac{1}{2}B_i s (s+1)}{s B_i} = \frac{1}{2} \sum_{i=0}^{\log_s n} (s + 1) = \frac{1}{2} (s+1) \log_s n
\end{equation*}
To calculate the amortized insertion cost, we multiply this write amplification
by the cost of rebuilding the structures and divide by the total number
of records. We'll condense the constant factor into a single $s$, as this best
expresses the nature of the relationship we're looking for,
\begin{equation*}
I_A(n) \in \Theta\left(\frac{B(n)}{n}\cdot s \log_s n\right)
\end{equation*}
\end{proof}
\begin{theorem}
The worst-case insertion cost for leveling with a scale factor of $s$ is
\begin{equation*}
\Theta\left(B\left(\frac{s-1}{s} \cdot n\right)\right)
\end{equation*}
\end{theorem}
\begin{proof}
Unlike in BSM, where the worst case reconstruction involves all of the
records within the structure, in leveling it only includes the records
in the last two levels. In particular, the worst case behavior occurs
when the last level is one reconstruction away from its capacity, and the
level above it is full. In this case, the reconstruction will involve the
full capacity of the last level, or $N_B \cdot s^{\log_s n +1}$ records.
We can relate this to $n$ by finding the ratio of elements contained in
the last level of the structure to the entire structure. This is given
by,
\begin{equation*}
\frac{N_B \cdot s^{\log_s n + 1}}{\sum_{i=0}^{\log_s n} N_B \cdot s^{i + 1}} = \frac{(s - 1)n}{sn - 1}
\end{equation*}
This fraction can be simplified by noting that the $1$ subtracted in
the denominator is negligible and dropping it, allowing the $n$ to be
canceled and giving a ratio of $\frac{s-1}{s}$. Thus the worst-case reconstruction
will involve $\frac{s - 1}{s} \cdot n$ records, with all of the other levels
simply shifting down at no cost, resulting in a worst-case insertion cost
of,
\begin{equation*}
I(n) \in \Theta\left(B\left(\frac{s-1}{s} \cdot n\right)\right)
\end{equation*}
\end{proof}
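For instance, with $s = 2$ the worst-case reconstruction touches roughly
half of the records in the structure, while with $s = 8$ it touches roughly
$\frac{7}{8}$ of them; larger scale factors therefore push leveling's
worst-case reconstruction cost closer to that of BSM, which involves every
record.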
\begin{theorem}
The worst-case query cost for leveling for a decomposable search
problem with cost $\mathscr{Q}_S(n)$ is
\begin{equation*}
O\left(\mathscr{Q}_S(n) \cdot \log_s n \right)
\end{equation*}
\end{theorem}
\begin{proof}
The worst-case scenario for leveling is right before the structure gains
a new level, at which point there will be $\log_s n$ data structures
each with $O(n)$ records. Thus the worst-case cost will be the cost
of querying each of these structures,
\begin{equation*}
O\left(\mathscr{Q}_S(n) \cdot \log_s n \right)
\end{equation*}
\end{proof}
\begin{theorem}
The best-case query cost for leveling for a decomposable search
problem with cost $\mathscr{Q}_S(n)$ is
\begin{equation*}
\mathscr{Q}_B(n) \in O(\mathscr{Q}_S(n) \cdot \log_s n)
\end{equation*}
\end{theorem}
\begin{proof}
Unlike BSM, leveling will never have empty levels. The policy ensures
that there is always a data structure on every level. As a result, the
best-case query still must query $\log_s n$ structures, and so has a
best-case cost of,
\begin{equation*}
\mathscr{Q}_B(n) \in O\left(\mathscr{Q}_S(n) \cdot \log_s n\right)
\end{equation*}
\end{proof}
\subsection{Tiering}
\begin{algorithm}
\caption{The Tiering Policy}
\label{alg:design-tiering}
\KwIn{$r$: set of records to be inserted, $\mathscr{L}_0 \ldots \mathscr{L}_{\log_s n}$: the levels of $\mathscr{I}$, $n$: the number of records in $\mathscr{I}$}
\BlankLine
\Comment{Find the first non-full level}
$target \gets -1$ \;
\For{$i=0\ldots \log_s n$} {
\If {$|\mathscr{L}_i| < s$} {
$target \gets i$ \;
break \;
}
}
\BlankLine
\Comment{If the structure is full, we need to grow it}
\If {$target = -1$} {
$target \gets 1 + (\log_s n)$ \;
}
\BlankLine
\Comment{Walk the structure backwards, applying reconstructions}
\For {$i \gets target \ldots 1$} {
$\mathscr{L}_i \gets \mathscr{L}_i \cup \text{build}(\text{unbuild}(\mathscr{L}_{i-1, 0}) \cup \ldots \cup \text{unbuild}(\mathscr{L}_{i-1, s-1}))$ \;
}
\BlankLine
\Comment{Add the buffered records to $\mathscr{L}_0$}
$\mathscr{L}_0 \gets \mathscr{L}_0 \cup \text{build}(r)$ \;
\Return \;
\end{algorithm}
Our tiering layout policy is described in Algorithm~\ref{alg:design-tiering}. In
this policy, each level contains up to $s$ shards, each with a capacity of
$N_B\cdot s^i$ records. When a reconstruction occurs, the first level
with fewer than $s$ shards is selected as the target, $t$. Then, for
every level $i < t$, all of the shards in level $i$ are merged into a
single shard using a reconstruction and placed in level $i+1$. These
reconstructions are performed backwards, starting with level $t-1$ and moving
back up towards level $0$. Finally, the shard created by the buffer flush is
placed in level $0$.
\begin{theorem}
The amortized insertion cost of tiering with a scale factor of $s$ is,
\begin{equation*}
I_A(n) \in \Theta\left(\frac{B(n)}{n} \cdot \log_s n \right)
\end{equation*}
\end{theorem}
\begin{proof}
For tiering, each record is written \emph{exactly} once per
level. As a result, each record will be involved in exactly $\log_s n$
reconstructions over the lifetime of the structure. Each of these
reconstructions has a worst-case cost of $B(n)$, and thus the amortized
insertion cost must be,
\begin{equation*}
I_A(n) \in \Theta\left(\frac{B(n)}{n} \cdot \log_s n\right)
\end{equation*}
\end{proof}
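This per-level write behavior is simple enough to check empirically. The
following is a minimal Python sketch (not part of our framework; the function
name, unit-sized records, and single-record buffer are purely illustrative
assumptions) that simulates the tiering policy of
Algorithm~\ref{alg:design-tiering} and reports the average number of times
each record is written, which should track the number of levels, $\log_s n$.
\begin{verbatim}
# Count total record writes under the tiering layout policy (illustrative only).
def tiering_write_amplification(n_inserts, s, buffer_size=1):
    levels = []   # levels[i] is a list of shard sizes (record counts)
    writes = 0    # total records written by flushes and reconstructions

    def flush(batch):
        nonlocal writes
        # Find the first level with fewer than s shards, growing if needed.
        target = next((i for i, lvl in enumerate(levels)
                       if len(lvl) < s), len(levels))
        if target == len(levels):
            levels.append([])
        # Walk back towards level 0, merging each full level down one level.
        for i in range(target, 0, -1):
            merged = sum(levels[i - 1])
            writes += merged          # every merged record is rewritten once
            levels[i].append(merged)
            levels[i - 1] = []
        # Build the buffered records into a new shard on level 0.
        levels[0].append(batch)
        writes += batch

    for _ in range(n_inserts // buffer_size):
        flush(buffer_size)
    return writes / n_inserts         # average writes per record

# With s = 4 and 4**8 records, each record should be written roughly once
# per level, i.e., about log_4(4**8) = 8 times on average.
print(tiering_write_amplification(4 ** 8, s=4))
\end{verbatim}
The simulation tracks only shard sizes rather than individual records,
which is all that is needed to count writes.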
\begin{theorem}
The worst-case insertion cost of tiering with a scale factor of $s$ is,
\begin{equation*}
I(n) \in \Theta\left(B(n)\right)
\end{equation*}
\end{theorem}
\begin{proof}
The worst-case reconstruction in tiering occurs when every level is full
and involves performing a reconstruction on each level. More formally,
the total cost of this cascade of reconstructions will be,
\begin{equation*}
I(n) \in \Theta\left(\sum_{i=0}^{\log_s n} B(s^i)\right)
\end{equation*}
The largest term in this sum is $B(n)$, and for build costs that are at
least linear in the number of records (as is the case for the structures
considered in this work), the sum is dominated by that term up to constant
factors, giving $I(n) \in \Theta(B(n))$.
\end{proof}
\begin{theorem}
The worst-case query cost for tiering for a decomposable search
problem with cost $\mathscr{Q}_S(n)$ is
\begin{equation*}
\mathscr{Q}(n) \in O( \mathscr{Q}_S(n) \cdot s \log_s n)
\end{equation*}
\end{theorem}
\begin{proof}
As with the previous two policies, the worst-case query occurs when the
structure is completely full. In the case of tiering, this means that there
will be $\log_s n$ levels, each containing $s$ shards of size bounded
by $O(n)$. Thus, there will be $s \log_s n$ structures to query, and the
query cost must be,
\begin{equation*}
\mathscr{Q}(n) \in O \left(\mathscr{Q}_S(n) \cdot s \log_s n \right)
\end{equation*}
\end{proof}
\begin{theorem}
The best-case query cost for tiering for a decomposable search problem
with cost $\mathscr{Q}_S(n)$ is $O(\mathscr{Q}_S(n) \cdot \log_s n)$.
\end{theorem}
\begin{proof}
The tiering policy ensures that there are no internal empty levels, and
as a result the best-case scenario for tiering occurs when each level is
populated by exactly one shard. In this case, there will only be $\log_s n$
shards to query, resulting in,
\begin{equation*}
\mathscr{Q}_B(n) \in O\left(\mathscr{Q}_S(n) \cdot \log_s n \right)
\end{equation*}
best-case query cost.
\end{proof}
\section{General Observations}
The asymptotic results from the previous section are summarized in
Table~\ref{tab:policy-comp}. When the scale factor is accounted for
in the analysis, we can see that possible trade-offs begin to manifest
within the space. We've seen some of these in action directly in
the experimental sections of previous chapters.
Most notably, we can directly see in these cost functions the reason why
tiering and leveling experience opposite effects as the scale factor
changes. In both policies, increasing the scale factor increases the
base of the logarithm governing the height, and so in the absence of
the additional constants in the analysis, it would superficially appear
as though both policies should see the same effects. But, with other
constants retained, we can see that this is in fact not the case. For
tiering, increasing the scale factor does reduce the number of levels,
but it also increases the number of shards per level. Because the level
reduction appears only in the base of the logarithm, while the shard count
increase is linear in $s$, the shard count effect dominates and query
performance degrades as the scale factor increases. Leveling, however,
does not include this linear term and sees only a reduction in height.
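Concretely (ignoring the buffer), for $n = 2^{20}$ records leveling maintains
roughly $\log_2 n = 20$ shards at $s = 2$ but only about $7$ at $s = 8$,
whereas tiering maintains up to $2 \cdot 20 = 40$ shards at $s = 2$ and up to
roughly $8 \cdot 7 = 56$ at $s = 8$ in the worst case.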
When considering insertion, we see a similar situation in reverse. For
both leveling and tiering, increasing the scale factor reduces the size of
the logarithmic term. Tiering has no other terms at play, so its insertion
performance improves. Leveling, however, also has a linear dependency on
the scale factor, because increasing the scale factor increases its write
amplification; this is why leveling's insertion performance degrades as
the scale factor increases. The generalized Bentley-Saxe method follows
the same general trends as leveling for both worst-case query cost and
amortized insertion cost.
Of note as well is the fact that leveling has slightly better worst-case
insertion performance. This is because leveling only ever reconstructs
one level at a time, with the other levels simply shifting around in
constant time. Bentley-Saxe and tiering have strictly worse worst-case
insertion cost, as their worst-case reconstructions involve all of the
levels. In the Bentley-Saxe method, this worst-case cost manifests as
a single, large reconstruction; in tiering, it involves $\log_s n$
reconstructions, one per level.
\begin{table*}
\centering
\small
\renewcommand{\arraystretch}{1.6}
\begin{tabular}{|l l l l|}
\hline
& \textbf{Gen. BSM} & \textbf{Leveling} & \textbf{Tiering} \\ \hline
$\mathscr{Q}(n)$ &$O\left(\log_s n \cdot \mathscr{Q}_S(n)\right)$ & $O\left(\log_s n \cdot \mathscr{Q}_S(n)\right)$ & $O\left(s \log_s n \cdot \mathscr{Q}_S(n)\right)$\\ \hline
$\mathscr{Q}_B(n)$ & $\Theta(\mathscr{Q}_S(n))$ & $O(\log_s n \cdot \mathscr{Q}_S(n))$ & $O(\log_s n \cdot \mathscr{Q}_S(n))$ \\ \hline
$I(n)$ & $\Theta(B(n))$ & $\Theta\left(B\left(\frac{s-1}{s} \cdot n\right)\right)$ & $ \Theta\left(\sum_{i=0}^{\log_s n} B(s^i)\right)$ \\ \hline
$I_A(n)$ & $\Theta\left(\frac{B(n)}{n} s\log_s n\right)$ & $\Theta\left(\frac{B(n)}{n} s\log_s n\right)$& $\Theta\left(\frac{B(n)}{n} \log_s n\right)$ \\ \hline
\end{tabular}
\caption{Comparison of cost functions for various layout policies for DSPs}
\label{tab:policy-comp}
\end{table*}
\section{Experimental Evaluation}
In the previous sections, we mathematically proved various claims about
the performance characteristics of our three layout policies to assess
the trade-offs that exist within the design space. While this analysis is
useful, the effects we are examining are at the level of constant factors,
and so it would be useful to perform experimental testing to validate
that these claimed performance characteristics manifest in practice. In
this section, we will do just that, running various benchmarks to explore
the real-world performance implications of the configuration parameter
space of our framework.
\subsection{Asymptotic Insertion Performance}
We'll begin by validating our results for the insertion performance
characteristics of the three layout policies. For this test, we
consider two data structures: the ISAM tree and the VPTree. The ISAM
tree is merge-decomposable using a sorted-array merge, with
a build cost of $B_M(n) \in \Theta(n \log k)$, where $k$ is the number
of structures being merged. The VPTree, by contrast, is \emph{not}
merge-decomposable, and is built in $B(n) \in \Theta(n \log n)$ time. We
use the $200,000,000$ record SOSD \texttt{OSM} dataset~\cite{sosd-datasets} for
ISAM testing, and the $1,000,000$ record, $300$-dimensional Spanish
Billion Words (\texttt{SBW}) dataset~\cite{sbw} for VPTree testing.
For our first experiment, we will examine the latency distribution
for inserts into our structures. We tested the three layout policies,
using a common scale factor of $s=2$. This scale factor was selected
to minimize its influence on the results (we've seen before in
Sections~\ref{ssec:ds-exp} and \ref{ssec:dyn-ds-exp} that scale factor
affects leveling and tiering in opposite ways) and isolate the influence
of the layout policy alone to as great a degree as possible. We used a
buffer size of $N_B=12000$ for the ISAM tree structure, and $N_B=1000$
for the VPTree.
We generated this distribution by inserting $30\%$ of the records from
the set to ``warm up'' the dynamized structure, and then measuring the
insertion latency for each individual insert for the remaining $70\%$
of the data. Note that, due to timer resolution issues at nanosecond
scales, the specific latency values associated with the faster end of
the insertion distribution are not precise. However, it is our intention
to examine the latency distribution, not the values themselves, and so
this is not a significant limitation for our analysis.
\begin{figure}
\centering
\subfloat[ISAM Tree Insertion Latencies]{\includegraphics[width=.5\textwidth]{img/design-space/isam-insert-dist.pdf} \label{fig:design-isam-ins-dist}}
\subfloat[VPTree Insertion Latencies]{\includegraphics[width=.5\textwidth]{img/design-space/vptree-insert-dist.pdf} \label{fig:design-vptree-ins-dist}} \\
\caption{Insertion Latency Distributions for Layout Policies}
\label{fig:design-policy-ins-latency}
\end{figure}
The resulting distributions are shown in
Figure~\ref{fig:design-policy-ins-latency}. These distributions are
represented using a ``reversed'' CDF with log scaling on both axes. This
representation has proven very useful for interpreting the latency
distributions that we see in evaluating dynamization, but it is slightly
unusual, and so we've included a guide to interpreting these charts
in Appendix~\ref{append:rcdf}.
The first notable point is that, for both the ISAM
tree in Figure~\ref{fig:design-isam-ins-dist} and VPTree in
Figure~\ref{fig:design-vptree-ins-dist}, the Leveling policy results in a
measurably lower worst-case insertion latency. This result is in line with
our theoretical analysis in Section~\ref{sec:design-asymp}. However, there
is a major deviation from the theory in the worst-case performance of
Tiering and BSM. Both of these should have similar worst-case latencies,
as the worst-case reconstruction in both cases involves every record
in the structure. Yet, we see tiering consistently performing better,
particularly for the ISAM tree.
The reason for this has to do with the way that the records are
partitioned in these worst-case reconstructions. In tiering with a scale
factor of $s=2$, the worst-case reconstruction consists of $\Theta(\log_2
n)$ distinct reconstructions, each involving exactly $2$ structures. BSM,
on the other hand, uses exactly $1$ reconstruction involving
$\Theta(\log_2 n)$ structures. This explains why the ISAM tree performs
much better under tiering than BSM, as its actual reconstruction cost
function is $\Theta(n \log_2 k)$. For tiering, this results in $\Theta(n)$
cost in the worst case, whereas BSM incurs $\Theta(n \log_2 \log_2 n)$,
as many more distinct structures must be merged in the reconstruction,
and it is thus asymptotically worse off. The VPTree sees
less of a difference because it is \emph{not} merge-decomposable, and so
the number of structures involved in a reconstruction matters less.
Having the records more partitioned still hurts performance, most likely
due to cache effects, but less so than in the MDSP case.
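As a rough illustration, with $s = 2$ and $n = 2^{24}$ records, the tiering
worst case performs about $24$ two-way merges, with each record participating
in exactly one of them, for $\Theta(n)$ total work; the BSM worst case instead
performs a single merge of roughly $24$ sorted runs, costing about
$n \log_2 24 \approx 4.6\,n$.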
\begin{figure}
\centering
\subfloat[ISAM Tree]{\includegraphics[width=.5\textwidth]{img/design-space/isam-tput.pdf} \label{fig:design-isam-tput}}
\subfloat[VPTree]{\includegraphics[width=.5\textwidth]{img/design-space/vptree-tput.pdf} \label{fig:design-vptree-tput}} \\
\caption{Insertion Throughput for Layout Policies}
\label{fig:design-ins-tput}
\end{figure}
Next, in Figure~\ref{fig:design-ins-tput}, we show the overall insertion
throughput for the three policies for both ISAM Tree and VPTree. This
result should correlate with the amortized insertion costs for each
policy derived in Section~\ref{sec:design-asymp}. At a scale factor of
$s=2$, all three policies have similar insertion performance. This makes
sense, as both leveling and Bentley-Saxe experience write amplification
proportional to the scale factor, and at $s=2$ this isn't significantly
larger than tiering's write amplification, particularly compared
to the other factors influencing insertion performance, such as
reconstruction time. However, for larger scale factors, tiering shows
\emph{significantly} higher insertion throughput, and Leveling and
Bentley-Saxe show greatly degraded performance due to the large amount
of additional write amplification. These results are in line
with the mathematical analysis of the previous section.
\subsection{General Insert vs. Query Trends}
For our next experiment, we will consider the trade-offs between insertion
and query performance that exist within this design space. We benchmarked
each layout policy for a range of scale factors, measuring both their
respective insertion throughputs and query latencies for both ISAM Tree
and VPTree.
\begin{figure}
\centering
\subfloat[ISAM Tree Range Count]{\includegraphics[width=.5\textwidth]{img/design-space/isam-parm-sweep.pdf} \label{fig:design-isam-tradeoff}}
\subfloat[VPTree $k$-NN]{\includegraphics[width=.5\textwidth]{img/design-space/knn-parm-sweep.pdf} \label{fig:design-knn-tradeoff}} \\
\caption{Insertion Throughput vs. Query Latency for varying scale factors}
\label{fig:design-tradeoff}
\end{figure}
Figure~\ref{fig:design-isam-tradeoff} shows the trade-off curve between
insertion throughput and query latency for range count queries executed
against a dynamized ISAM tree. This test was run with a dataset
of 500 million uniform integer keys and a selectivity of $\sigma =
0.0000001$; the scale factor associated with each point is annotated on
the plot. These results show that there is a very direct relationship
between scale factor, layout policy, and insertion throughput. Leveling
almost universally has lower insertion throughput but also lower
query latency than tiering does, though at scale factor $s=2$ they are
fairly similar. Tiering gains insertion throughput at the cost of query
performance as the scale factor increases, although the rate at which
the insertion performance improves decreases for larger scale factors,
and the rate at which query performance declines increases dramatically.
One interesting note is that leveling sees very little improvement in
query latency as the scale factor is increased. This is due to the fact
that, asymptotically, the scale factor only affects leveling's query
performance by increasing the base of a logarithm. Thus, small increases
in scale factor have very little effect. However, leveling's insertion
performance degrades linearly with the scale factor, and this is well
demonstrated in the plot.
The story is a bit clearer in Figure~\ref{fig:design-knn-tradeoff}. The
VPTree has a much greater construction time, both asymptotically and
in absolute terms, and its average query latency is also significantly
greater. As a result, configuration changes produce much larger shifts
in performance and present us with a far clearer
trade-off space. The same general trends hold as for the ISAM tree, just
amplified. Leveling has better query performance than tiering, and sees
improved query performance and degraded insertion performance as the
scale factor increases. Tiering has better insertion performance and worse
query performance than leveling, and sees improved insertion and worsening
query performance as the scale factor is increased. The Bentley-Saxe
method shows similar trends to leveling.
In general, the Bentley-Saxe method appears to follow a very similar
trend to that of leveling, albeit with even more dramatic performance
degradation as the scale factor is increased and slightly better query
performance across the board. It seems to be a strictly worse
alternative to leveling in all but its best-case query cost, and we will
omit it from our tests moving forward.
\subsection{Buffer Size}
\begin{figure}
\centering
\subfloat[ISAM Tree Range Count]{\includegraphics[width=.5\textwidth]{img/design-space/isam-bs-sweep.pdf} \label{fig:buffer-isam-tradeoff}}
\subfloat[VPTree $k$-NN]{\includegraphics[width=.5\textwidth]{img/design-space/knn-bs-sweep.pdf} \label{fig:buffer-knn-tradeoff}} \\
\caption{Insertion Throughput vs. Query Latency for varying buffer sizes}
\label{fig:buffer-size}
\end{figure}
In the previous section, we considered the effect of various scale
factors on the trade-off between insertion and query performance. Our
framework also supports varying buffer sizes, and so we will examine this
next. Figure~\ref{fig:buffer-size} shows the same insertion throughput
vs. query latency curves for fixed layout policy and scale factor
configurations at varying buffer sizes.
Unlike with the scale factor, there is a significant difference in the
behavior of the two tested structures under buffer size variation. For
the ISAM tree, shown in Figure~\ref{fig:buffer-isam-tradeoff}, we see that
all layout policies follow a similar pattern. Increasing the buffer size
increases insertion throughput for little to no additional query cost up
to a certain point, after which query performance degrades substantially.
This isn't terribly surprising: growing the buffer size will increase
the number of records on each level, and therefore decrease the number
of shards, while at the same time reducing the number of reconstructions
that must be performed. However, the query must be answered against the
buffer too, and once the buffer gets sufficiently large, this increased
cost will exceed any query latency benefit from decreased shard count.
We see this pattern fairly clearly in all tested configurations; however,
BSM sees the least benefit from an increased buffer size in terms of
insertion performance.
VPTree is another story, shown in Figure~\ref{fig:buffer-knn-tradeoff}.
This plot is far more chaotic; in fact there aren't any particularly
strong patterns to draw from it. This is likely due to the fact that the
time scales associated with the VPTree in terms of both reconstruction
and query latency are significantly larger, and so the relatively small
constant associated with adjusting the buffer size doesn't have as strong
an influence on performance as it does for the ISAM tree.
\subsection{Query Size Effects}
One potentially interesting aspect of decomposition-based dynamization
techniques is that, asymptotically, the additional cost added by
decomposing the data structure vanishes for sufficiently expensive
queries. Bentley and Saxe proved that for query costs of the form
$\mathscr{Q}_S(n) \in \Omega(n^\epsilon)$ for $\epsilon > 0$, the
overall query cost is unaffected (asymptotically) by the decomposition.
This would seem to suggest that, as the cost of the query over a single
shard increases, the effectiveness of our design space for tuning query
performance should diminish. Because our tuning space consists
of adjusting the number of shards within the structure, as the
effect of decomposition on the query cost shrinks, we should see all
configurations approach a similar query performance.
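To see why, consider a single-shard query cost of the form
$\mathscr{Q}_S(n) \in \Theta(n^\epsilon)$ with $\epsilon > 0$. Summing this
cost over the levels of the decomposition gives
\begin{equation*}
\sum_{i=0}^{\log_s n} \left(s^i\right)^\epsilon = \frac{s^{\epsilon(\log_s n + 1)} - 1}{s^\epsilon - 1} \in \Theta\left(n^\epsilon\right)
\end{equation*}
for any constant $s$, so the cost of querying the largest shard dominates
and the decomposition contributes only a constant factor.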
In order to evaluate this effect, we tested the query latency of range
queries of varying selectivity against various configurations of our
framework to see at what points the query latencies begin to converge. We
also tested $k$-NN queries with varying values of $k$.
\begin{figure}
\centering
\subfloat[ISAM Tree Range Count]{\includegraphics[width=.5\textwidth]{img/design-space/selectivity-sweep.pdf} \label{fig:design-isam-sel}}
\subfloat[VPTree $k$-NN]{\includegraphics[width=.5\textwidth]{img/design-space/selectivity-sweep-knn.pdf} \label{fig:design-knn-sel}} \\
\caption{Query Result Size Effect Analysis}
\label{fig:design-query-sze}
\end{figure}
Interestingly, for the range of selectivities tested for range counts, the
overall query latency failed to converge, and there remains a consistent,
albeit slight, stratification amongst the tested policies, as shown in
Figure~\ref{fig:design-isam-sel}. As the selectivity continues to rise
above those shown in the chart, the relative ordering of the policies
remains the same, but the relative differences between them begin to
shrink. This result makes sense given the asymptotics--there is still
\emph{some} overhead associated with the decomposition, but as the cost
of the query approaches linear, it makes up an increasingly irrelevant
portion of the run time.
The $k$-NN results in Figure~\ref{fig:design-knn-sel} show a slightly
different story. This is also not surprising, because $k$-NN is a
$C(n)$-decomposable problem, and the cost of result combination grows
with $k$. Thus, larger $k$ values will \emph{increase} the effect that
the decomposition has on the query run time, unlike the
range count queries, where the total cost of the combination is constant.
% \section{Asymptotically Relevant Trade-offs}
% Thus far, we have considered a configuration system that trades in
% constant factors only. In general asymptotic analysis, all possible
% configurations of our framework in this scheme collapse to the same basic
% cost functions when the constants are removed. While we have demonstrated
% that, in practice, the effects of this configuration are measurable, there
% do exist techniques in the classical literature that provide asymptotically
% relevant trade-offs, such as the equal block method~\cite{maurer80} and
% the mixed method~\cite[pp. 117-118]{overmars83}. These techniques have
% cost functions that are derived from arbitrary, positive, monotonically
% increasing functions of $n$ that govern various ways in which the data
% structure is partitioned, and changing the selection of function allows
% for "tuning" the performance. However, to the best of our knowledge,
% these techniques have never been implemented, and no useful guidance in
% the literature exists for selecting these functions.
% However, it is useful to consider the general approach of these
% techniques. They accomplish asymptotically relevant trade-offs by tying
% the decomposition of the data structure directly to a function of $n$,
% the number of records, in a user-configurable way. We can import a similar
% concept into our already existing configuration framework for dynamization
% to enable similar trade-offs, by replacing the constant scale factor,
% $s$, with some function $s(n)$. However, we must take extreme care when
% doing this to select a function that doesn't catastrophically impair
% query performance.
% Recall that, generally speaking, our dynamization technique requires
% multiplying the cost function for the data structure being dynamized by
% the number of shards that the data structure has been decomposed into. For
% search problems that are solvable in sub-polynomial time, this results in
% a worst-case query cost of,
% \begin{equation}
% \mathscr{Q}(n) \in O(S(n) \cdot \mathscr{Q}_S(n))
% \end{equation}
% where $S(n)$ is the number of shards and, for our framework, is $S(n) \in
% O(s \log_s n)$. The user can adjust $s$, but this tuning does not have
% asymptotically relevant consequences. Unfortunately, there is not much
% room, practically, for adjustment. If, for example, we were to allow the
% user to specify $S(n) \in \Theta(n)$, rather than $\Theta(\log n)$, then
% query performance would be greatly impaired. We need a function that is
% sub-linear to ensure useful performance.
% To accomplish this, we proposed adding a second scaling factor, $k$, such
% that the number of records on level $i$ is given by,
% \begin{equation}
% \label{eqn:design-k-expr}
% N_B \cdot \left(s \log_2^k(n)\right)^{i}
% \end{equation}
% with $k=0$ being equivalent to the configuration space we have discussed
% thus far. The addition of $k$ allows for the dependency of the number of
% shards on $n$ to be slightly biased upwards or downwards, in a way that
% \emph{does} show up in the asymptotic analysis for inserts and queries,
% but also ensures sub-polynomial additional query cost.
% In particular, we prove the following asymptotic properties of this
% configuration.
% \begin{theorem}
% The worst-case query latency of a dynamization scheme where the
% capacity of each level is provided by Equation~\ref{eqn:design-k-expr} is
% \begin{equation}
% \mathscr{Q}(n) \in O\left(\left(\frac{\log n}{\log (k \log n))}\right) \cdot \mathscr{Q}_S(n)\right)
% \end{equation}
% \end{theorem}
% \begin{proof}
% The number of levels within the structure is given by $\log_s (n)$,
% where $s$ is the scale factor. The addition of $k$ to the parametrization
% replaces this scale factor with $s \log^k n$, and so we have
% \begin{equation*}
% \log_{s \log^k n}n = \frac{\log n}{\log\left(s \log^k n\right)} = \frac{\log n}{\log s + \log\left(k \log n\right)} \in O\left(\frac{\log n}{\log (k \log n)}\right)
% \end{equation*}
% by the application of various logarithm rules and change-of-base formula.
% The cost of a query against a decomposed structure is $O(S(n) \cdot \mathscr{Q}_S(n))$, and
% there are $\Theta(1)$ shards per level. Thus, the worst case query cost is
% \begin{equation*}
% \mathscr{Q}(n) \in O\left(\left(\frac{\log n}{\log (k \log n))}\right) \cdot \mathscr{Q}_S(n)\right)
% \end{equation*}
% \end{proof}
% \begin{theorem}
% The amortized insertion cost of a dynamization scheme where the capacity of
% each level is provided by Equation~\ref{eqn:design-k-expr} is,
% \begin{equation*}
% I_A(n) \in \Theta\left(\frac{B(n)}{n} \cdot \frac{\log n}{\log ( k \log n)}\right)
% \end{equation*}
% \end{theorem}
% \begin{proof}
% \end{proof}
% \subsection{Evaluation}
% In this section, we'll access the effect that modifying $k$ in our
% new parameter space has on the insertion and query performance of our
% dynamization framework.
\section{Conclusion}
In this chapter, we considered the proposed design space for our
dynamization framework both mathematically and experimentally, and derived
some general principles for configuration within the space. We generalized
the Bentley-Saxe method to support scale factors and buffering, but
found that the result was strictly worse than leveling in all but its
best-case query performance. We also showed that there does exist a
trade-off, mediated by the scale factor, between insertion performance and
query performance for the tiering layout policy. Unfortunately, the
leveling layout policy does not have a particularly useful trade-off
in this area, because its cost in insertion performance grows far faster
than any query performance benefit, due to the way the two effects scale
in the cost functions for the method.
Broadly speaking, we can draw a few general conclusions. The
leveling layout policy is better than tiering for query latency in
all configurations, but worse in insertion performance. Leveling also
has the best insertion tail latency by a small margin,
owing to the way it performs reconstructions. Tiering, however,
has significantly better insertion performance and can be configured
with query performance similar to leveling's. These results are
aligned with the smaller-scale parameter testing done in the previous
chapters, which landed on tiering as a good general solution for most
cases. Tiering also has the advantage of meaningful tuning through scale
factor adjustment.
|