Storing a list of files on a filesystem in sqlite

On occasion I wish to have a comprehensive list of the files on a filesystem, then stash that list somewhere for fast access. On Friday 5 October I did this and stored the list in a sqlite database.

Here’s the code. Note the use of Perl to do the actual load–it’s far faster than building an INSERT ... VALUES statement and running it. All the more so since recent versions of sqlite have a fairly hard limit of 500 VALUE clauses on an INSERT statement.

cd /var/tmp
rm -f allfiles.db
find / -type f | egrep --text -v '/(bin|dev|lib|lib64|proc|run|sbin|sys)/' | 
  sed 's/"/""/g' | perl -MDBI -MDBD::SQLite -ne '
    BEGIN { $| = 1;
        $dbh = DBI->connect("DBI:SQLite:database=allfiles.db", {AutoCommit => 1});
        $dbh->do("CREATE TABLE allfiles (id INTEGER PRIMARY KEY, filepath TEXT)");
        $sth = $dbh->prepare("INSERT INTO allfiles VALUES (NULL, ?)") or die $dbh->errstr;
        $dbh->begin_work;
    }
    $sth->execute($_);
    print "* $file_count files\r" if (++$file_count % 500) == 0;
    END { $dbh->commit; $dbh->disconnect; print "\n"; }'

FILE_COUNT="$(echo 'SELECT COUNT(*) FROM allfiles;' | sqlite3 allfiles.db)"
echo "Database contains $FILE_COUNT file names"

Using this, it’s possible to very quickly get a random file name:

FILE_COUNT="$(echo 'SELECT COUNT(*) FROM allfiles;' | sqlite3 allfiles.db)"
N=$((${RANDOM}${RANDOM}${RANDOM} % FILE_COUNT))
FILENAME="$(echo "SELECT filepath FROM allfiles WHERE id=$N;" | sqlite3 allfiles.db)"