Idea: categorizing commands in /usr/bin by package

The idea

In [thought experiment: QRetail directory layout on Linux], I noted:

/usr/bin/qretail/bin is not standard. The usual thing to do is simply throw all the executables and shell scripts into /usr/bin; that way it’s not necessary to update the PATH.

That got me thinking: what if all commands in /usr/bin and /bin (which on my system is a symlink to /usr/bin) were put in subdirectories by category? The obvious way to categorize them would be by the name of the package they came from.

A test script

So I wrote a script that does the following:

  • Lists all the RPM packages on the system
  • If a package has files under /bin or /usr/bin, checks whether each of those files exists on my current system
  • If the command is on my system, creates a subdirectory in /r/bin-test/usr,bin with the name of the package (usr,bin is a stand-in for the actual path /usr/bin)
  • Creates an empty file with the same name as the one in /usr/bin
  • If the package has only one executable file, and it has exactly the same name as the package itself, moves it to /r/bin-test/usr,bin to reduce redundancy (e.g., usr,bin/cmark/cmark becomes simply usr,bin/cmark)

Here’s the script:

#!/bin/bash
rm -rf usr,bin
mkdir usr,bin

# Create a file with the list of all packages on the system
if [ ! -f x-rpms.text ]; then rpm -qa | sort >x-rpms.text; fi

# Create usr,bin with programs grouped by package
RPM_COUNT=0
for RPM in $(<x-rpms.text)
do
    # DN is the leading run of letters, digits, '+' and '_' in the package
    # name, so it normally ends at the first '-'. This groups similar
    # packages together (e.g. all 'perl-*' packages go into 'perl').
    # Note: the three-argument match() is a gawk extension.
    DN="$(echo "$RPM" | awk 'match($0, /^([[:alnum:]+_]+)/, a){print a[1]}')"
    [ -n "$DN" ] || continue
    echo "$RPM"
    rpm -ql "$RPM" | grep -e '^/bin/' -e '^/usr/bin/' | while IFS= read -r FILE     # -e '^/sbin/' -e '^/usr/sbin/'
    do
        # Skip files that are not present and executable on this system
        [ -x "$FILE" ] || continue
        [ -d "usr,bin/$DN" ] || mkdir "usr,bin/$DN"
        BASE_FN="$(basename "$FILE")"
        DEST="usr,bin/$DN/$BASE_FN"
        # Skip names already recorded (-h also catches dangling symlinks)
        { [ -e "$DEST" ] || [ -h "$DEST" ]; } && continue
        if [ -h "$FILE" ]
        then
            # Reproduce symlinks so the listing shows what they point to
            RL_FN="$(readlink "$FILE")"
            echo "  $FILE -> $RL_FN"
            ln -s "$RL_FN" "$DEST"
        else
            # Empty placeholder with the same timestamp as the real file
            echo "  $FILE"
            touch -r "$FILE" "$DEST"
        fi
    done
    [ -d "usr,bin/$DN" ] || continue
    #[ $((RPM_COUNT++)) -gt 50 ] && break
done

# Search for directories that contain only one file with the same name as the directory
for DIR in usr,bin/*
do
    [ -d "$DIR" ] || continue
    FILE_COUNT="$(ls "$DIR" | wc -l | cut -f1 -d' ')"
    [ "$FILE_COUNT" -gt 1 ] && continue
    FILE="$(ls "$DIR")"
    [ "$FILE" = "$(basename "$DIR")" ] || continue
    echo "Moving single $FILE to usr,bin"
    # Move via a temporary name, since the file and its directory share a name
    mv "$DIR/$FILE" "usr,bin/_$FILE"
    rmdir "$DIR"
    mv "usr,bin/_$FILE" "usr,bin/$FILE"
done

Results

With this approach, my /usr/bin directory would go from almost 3,800 scattered files to just under 550 directories representing the packages those files are in, plus about 190 programs whose names are exactly the same as their package.
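Counts like these can be tallied from the generated tree. The following is only a sketch: the function name count_layout is made up for illustration, and it assumes the usr,bin layout produced by the script above, where package directories and single-command entries sit side by side.

```shell
# Hypothetical helper: count package directories and top-level command
# entries in a generated tree such as usr,bin.
count_layout() {
    root=$1
    dirs=0
    files=0
    for entry in "$root"/*; do
        if [ -d "$entry" ] && [ ! -h "$entry" ]; then
            # A real directory: one package with its commands inside
            dirs=$((dirs + 1))
        elif [ -e "$entry" ] || [ -h "$entry" ]; then
            # A file or symlink: a command named after its package
            files=$((files + 1))
        fi
    done
    printf 'directories=%s files=%s\n' "$dirs" "$files"
}
```

Running count_layout usr,bin after the script finishes prints both totals on one line.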

As an example, consider the following seemingly unrelated files:

-rwxr-xr-x. 1 root root 200128 Aug  2  2016 bccmd
-rwxr-xr-x. 1 root root  36928 Aug  2  2016 bluemoon
-rwxr-xr-x. 1 root root 145552 Aug  2  2016 bluetoothctl
-rwxr-xr-x. 1 root root 612856 Aug  2  2016 btmon
-rwxr-xr-x. 1 root root 148752 Aug  2  2016 ciptool
-rwxr-xr-x. 1 root root 247168 Aug  2  2016 gatttool
-rwxr-xr-x. 1 root root 144288 Aug  2  2016 hciattach
-rwxr-xr-x. 1 root root 199880 Aug  2  2016 hciconfig
-rwxr-xr-x. 1 root root 417912 Aug  2  2016 hcidump
-rwxr-xr-x. 1 root root 149832 Aug  2  2016 hcitool
-rwxr-xr-x. 1 root root  15400 Aug  2  2016 hex2hcd
-rwxr-xr-x. 1 root root 100608 Aug  2  2016 l2ping
-rwxr-xr-x. 1 root root 117344 Aug  2  2016 l2test
-rwxr-xr-x. 1 root root  95672 Aug  2  2016 mpris-proxy
-rwxr-xr-x. 1 root root 148656 Aug  2  2016 rctest
-rwxr-xr-x. 1 root root 109520 Aug  2  2016 rfcomm
-rwxr-xr-x. 1 root root 214072 Aug  2  2016 sdptool

Now they would all be in /usr/bin/bluez.

The PATH variable

One downside to this approach would be a huge PATH variable, since it would need to include all 550+ directories in /usr/bin. Or maybe not. One way to make this viable would be to extend the PATH syntax to allow something like:

PATH=/usr/bin:/usr/bin/@

The ‘@’ would be a hint to the shell to search the immediate subdirectories under /usr/bin for a match to the command.
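To make the proposed lookup concrete, here is a minimal sketch of how a shell might resolve a command under this rule. No existing shell implements /@; the function name find_cmd and its two-argument interface are assumptions made for illustration, and empty PATH entries are simply skipped.

```shell
# Hypothetical resolver: find_cmd NAME PATH_SPEC
# An entry ending in "/@" means "search the immediate subdirectories".
find_cmd() {
    name=$1
    rest=$2:
    while [ -n "$rest" ]; do
        entry=${rest%%:*}      # text up to the first ':'
        rest=${rest#*:}        # everything after it
        case $entry in
            '')
                continue       # empty entries are ignored in this sketch
                ;;
            */@)
                # Proposed extension: look exactly one level below the base
                for candidate in "${entry%/@}"/*/"$name"; do
                    if [ -f "$candidate" ] && [ -x "$candidate" ]; then
                        printf '%s\n' "$candidate"
                        return 0
                    fi
                done
                ;;
            *)
                # Ordinary entry: the command must be directly inside it
                if [ -f "$entry/$name" ] && [ -x "$entry/$name" ]; then
                    printf '%s\n' "$entry/$name"
                    return 0
                fi
                ;;
        esac
    done
    return 1
}
```

With the layout described earlier, find_cmd hcitool /usr/bin:/usr/bin/@ would print /usr/bin/bluez/hcitool. If two packages shipped a command with the same name, this sketch would return whichever subdirectory the glob lists first, so a real implementation would need a rule for such collisions.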

The catch is that every shell would have to be updated to handle /@ in its PATH. In addition to bash, Linux has tcsh and zsh, as well as Midnight Commander.
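For comparison, the brute-force alternative that needs no shell changes is to list every subdirectory explicitly. A sketch of how such a PATH could be generated is below; the function name build_path is hypothetical, and with 550+ directories the resulting value would be the huge variable described above.

```shell
# Hypothetical helper: build_path BASE
# Prints BASE followed by each of its immediate subdirectories,
# joined with ':' in the usual PATH format.
build_path() {
    base=$1
    result=$base
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue      # glob matched nothing
        result=$result:${dir%/}        # drop the trailing slash
    done
    printf '%s\n' "$result"
}
```

Something like PATH="$(build_path /usr/bin)" would then work in any existing shell, at the cost of a very long variable that has to be regenerated whenever a package adds a new subdirectory.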