How well does a GNU dbm database scale?
While working with linked lists to mimic Btrieve keys for QRetail, I idly wondered: how well does a Berkeley or GNU database scale? I’m not talking about a little database with a few thousand keys in it; I’m thinking up in the hundreds of millions. For example, a database of the citizens of a large country or group of countries; or a year’s worth of call detail records for a large telephone company with eight million subscribers, each of whom could make any number of calls each day; or a year’s worth of credit card transactions for a major credit supplier.
The first problem I had was getting a dataset with the required quantity of records. I hit on the idea of using a database of numbers converted to English words:
Number (key) | In English (value) |
---|---|
0 | zero |
1 | one |
2 | two |
3 | three |
(over 123 million rows not shown) | |
123456788 | one hundred twenty-three million, four hundred fifty-six thousand, seven hundred eighty-eight |
123456789 | one hundred twenty-three million, four hundred fifty-six thousand, seven hundred eighty-nine |
123456790 | one hundred twenty-three million, four hundred fifty-six thousand, seven hundred ninety |
(and so on) | |
I wrote a Perl program to generate the input file:
```perl
#!/usr/bin/perl -w
use strict;
use integer;

my @number = qw(
    undef one two three four five six seven eight nine
    ten eleven twelve thirteen fourteen fifteen sixteen
    seventeen eighteen nineteen
);
my @tens = qw(x x twenty thirty forty fifty sixty seventy eighty ninety);
my @multiplier = (undef, 'thousand,', 'million,', 'billion,');

for (my $N = 0; $N <= ($ARGV[0] ? $ARGV[0] : 1000000); $N++) {
    my $text = '';

    # Handle zero as a special case
    print("store 0 \"zero\"\n"), next if $N == 0;

    # Process the number in groups of three, right-to-left
    my ($Q, $exponent, $j) = ($N, 0, 0);
    while ($Q) {
        my $group = $Q % 1000;

        # Add "thousand" or "million" if needed
        # ($j tells us if we need a trailing comma)
        $text = ($j ? $multiplier[$exponent]
                    : substr($multiplier[$exponent], 0, -1)) . " $text"
            if $exponent && $group;

        # Determine the words for the last two digits of the group
        $j = $group % 100;
        $text = ($j >= 20
                    ? $tens[$j / 10] . ($j % 10 ? '-' . $number[$j % 10] : '')
                    : $number[$j]) . " $text"
            if $j;

        # Determine the hundreds
        $text = ($number[$group / 100] . ' hundred') . " $text" if $group > 99;

        # Move to the next group of thousands
        $Q /= 1000;
        $exponent++;
        $j = $group;    # (use $j to pass this group's value to the next iteration)
    }

    print "store $N \"", substr($text, 0, -1), "\"\n";
}
```
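For readers who don’t speak Perl, here is a rough Python sketch of the same algorithm (a hypothetical port, not the program used for the runs below): process the number in groups of three digits, right to left, appending a scale word per group, with a comma after the scale word only when a non-zero lower group follows.

```python
ONES = ("", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen")
TENS = ("", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety")
SCALES = ("", "thousand", "million", "billion")

def group_words(g):
    """English words for a three-digit group, 1..999."""
    words = []
    if g >= 100:
        words.append(ONES[g // 100] + " hundred")
    t = g % 100
    if t >= 20:
        words.append(TENS[t // 10] + ("-" + ONES[t % 10] if t % 10 else ""))
    elif t:
        words.append(ONES[t])
    return " ".join(words)

def in_english(n):
    """Convert a non-negative integer (below 10**12) to English words."""
    if n == 0:
        return "zero"
    text = ""
    exponent = 0
    lower_group = 0          # the group handled in the previous (lower) iteration
    while n:
        group = n % 1000
        if group:
            if exponent:
                # Comma only when a non-zero lower group follows the scale word
                scale = SCALES[exponent] + ("," if lower_group else "")
                text = scale + (" " + text if text else "")
            text = group_words(group) + (" " + text if text else "")
        n //= 1000
        exponent += 1
        lower_group = group
    return text
```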
That done, it was a simple (but initially futile) task to generate the database:
```
[brian@raven numbers]$ RECORDS=125000000; time ./numbers.pl -vmax=$RECORDS | pv -ls$RECORDS | gdbmtool --newdb numbers.db
74.5M 12:14:23 [1.69k/s] [======================================>            ] 59%
gdbmtool: interrupt
```
As can be seen from the above, I cancelled the run after 12 hours and only 74 million records. It turned out the old disc on which I was running the program was nearing its end of life: smartctl showed numerous errors when I ran it to check the drive.
I replaced the drive with a 240 GB SSD (which really should be running in sparrow, but I haven’t yet made the time to set up the latest Fedora on it) and ran the program again:
```
[brian@raven numbers]$ RECORDS=125000000; time ./numbers.pl -vmax=$RECORDS | pv -ls$RECORDS | gdbmtool --newdb numbers.db
112.1M 10:13:36 [ 537/s] [===========================================================>  ] 89% ETA 2:33:08
gdbmtool: interrupt
```
This time I let the program run for only ten hours before I cancelled it. (The program was running in an ssh session from my laptop. I needed to take the laptop somewhere, and because I wasn’t running screen on raven, I lost both ssh and raven’s shell when I disconnected the laptop from my network.)
At the time I cancelled the program, it had inserted 112,063,312 records over 10.25 hours, for an overall average of 3,044 records/second. The file size was 18.5 GiB (19,864,502,272 bytes).
Two things are notable from the above:
- At 112 million records, the GNU database was handling less than 1,000 new records a second. That’s way down from the average of 175,000/second at the outset.
- The ETA is very optimistic. With 12.9 million records to go and only 537 new records/second, the true ETA is more like 6 hours and 40 minutes, and that doesn’t account for the distinct possibility that later inserts will be even slower.
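A quick back-of-the-envelope check of that second point, using only the figures reported at the moment the run was cancelled:

```python
# Figures from the cancelled run above
total = 125_000_000    # target record count
done = 112_063_312     # records inserted when the run was cancelled
rate = 537             # records/second reported by pv at that point

remaining = total - done
eta_hours = remaining / rate / 3600
print(f"{remaining:,} records remaining -> about {eta_hours:.1f} hours at {rate}/s")
```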
Another test I ran started with 131,072 (2^17) records and doubled from there. Here are the results:
Records | Elapsed Time | Average records/second |
---|---|---|
131,072 (2^17) | 0:00:01 | 131,072 |
262,144 (2^18) | 0:00:02 | 131,072 |
524,288 (2^19) | 0:00:03 | 174,763 |
1,048,576 (2^20) | 0:00:06 | 174,763 |
2,097,152 (2^21) | 0:00:13 | 161,319 |
4,194,304 (2^22) | 0:00:31 | 135,300 |
8,388,608 (2^23) | 0:01:19 | 106,183 |
16,777,216 (2^24) | 0:07:31 | 37,200 |
33,554,432 (2^25) | 0:12:55 | 43,296 |
67,108,864 (2^26) | 4:33:02 | 4,096 |
112,063,312 (2^26+) | 10:13:36 | 3,044 |
Aside from a curious anomaly at 33.5 million records, where the average number of records/second actually went up (possibly because less new space had to be allocated to the file and less effort spent reorganizing the hash buckets — GNU dbm uses extensible hashing rather than a B-tree), the more records in the file, the slower the inserts became. This is especially true for the jump from 33.5 to 67.1 million records: although the number of records only doubled, the elapsed time jumped by more than an order of magnitude (21 times longer for just twice as many records).
Finally, I ran the program again on raven, this time using screen so I could disconnect as needed. Here is the result:
```
[brian@raven numbers]$ RECORDS=125000000; time ./numbers.pl -vmax=$RECORDS | pv -ls$RECORDS | gdbmtool --newdb numbers.db
125M 33:12:02 [1.05k/s] [============================================================>] 100%

real    1992m4.133s
user    227m6.004s
sys     608m43.781s
```
The final file size was 20.6 GiB (22,154,166,272 bytes).
Under screen I noticed the program ran more slowly than expected, possibly because the system lowers a program’s priority when screen is disconnected. I later ran the program on the console and saw a 53% improvement in the number of adds/second:
```
[brian@raven numbers]$ RECORDS=125000000; time ./numbers.pl -vmax=$RECORDS | pv -ls$RECORDS | gdbmtool --newdb numbers.db
125M 21:43:12 [1.60k/s] [============================================================>] 100%

real    1303m15.037s
user    217m51.004s
sys     579m21.597s
```