New script: update-python3-docs

Python has lately become the language of choice for system administration tasks, edging out Perl, which is now considered passé. Figuring it would be a good idea to pick up some Python skills, I first decided to get a dump of the available on-system documentation. I much prefer having documentation like this in a local collection of text files, because I can use my jgrep utility to rapidly search them for keywords.

Because the Python modules are included in the collection, and I expect I’ll be adding Python modules to my system over time, I wrote a script that can find new and updated files and store them while keeping old files in place.

#!/bin/bash
if [ ! -r /etc/shadow ]; then echo "Sorry, must run as root"; exit 1; fi

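# Install PAGE.new.text as PAGE.text if the page is new ("+") or its size
# has changed ("*"); otherwise discard the freshly generated copy.
function check_for_update {
    local PAGE=$1
    local SIZE_CURR SIZE_NEW
    if [ -f "$PAGE.text" ]
    then
      SIZE_CURR=$(stat -c %s "$PAGE.text")
      SIZE_NEW=$(stat -c %s "$PAGE.new.text")
      if [ "$SIZE_NEW" != "$SIZE_CURR" ]
      then
        mv -f "$PAGE.new.text" "$PAGE.text"
        echo -n "*"
      else
        rm -f "$PAGE.new.text"
      fi
    else
      mv "$PAGE.new.text" "$PAGE.text"
      echo -n "+"
    fi
}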

# First update the main Python man page
cd /home/open/pythondocs-3.4.3-local || exit 1
echo -n " >> man page: python3"
export JMAN_VIEWER='save_to_file'
[ -f python3.text ] && mv python3.text python3.prev.text
jman python3 &>/dev/null
mv python3.1.text python3.new.text
[ -f python3.prev.text ] && mv python3.prev.text python3.text
check_for_update python3
echo

# Create a working 'pydoc' for Python 3
echo "#!/usr/bin/python3
import pydoc
if __name__ == '__main__':
    pydoc.cli()" >py3doc
chmod +x py3doc

# Update the pages for keywords, modules, and topics
# Note that 'py3doc antigravity' opens a web browser to display xkcd 353
for TOPIC in keywords topics modules
do
    echo -n " >> $TOPIC:"
    mkdir -p "$TOPIC"
    cd "$TOPIC" || exit 1
    ../py3doc "$TOPIC" | sed '1,3d; /^Enter any module name/,$d; s/[[:space:]]\+/\n/g' |
      sort | while read -r PAGE
    do
      [ "$PAGE" ] || continue
      echo -n " $PAGE"
      ../py3doc "$PAGE" >"$PAGE.new.text" 2>/dev/null
      check_for_update "$PAGE"
    done
    echo
    cd ..
done
rm -f py3doc
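
The check_for_update function treats a page as changed only when its size differs, so an edit that leaves the byte count unchanged would go unnoticed. Here is a minimal sketch of a stricter variant, assuming a content comparison with cmp is acceptable. The name check_for_update_cmp is hypothetical and not part of the script above.

```shell
# Hypothetical variant of check_for_update: compares file contents with
# cmp instead of comparing sizes. Not part of the original script.
check_for_update_cmp() {
    local page=$1
    if [ -f "$page.text" ]; then
        if cmp -s "$page.new.text" "$page.text"; then
            rm -f "$page.new.text"              # identical: discard the new copy
        else
            mv -f "$page.new.text" "$page.text"
            printf '*'                          # existing page was updated
        fi
    else
        mv "$page.new.text" "$page.text"
        printf '+'                              # page is new
    fi
}
```

It is meant as a drop-in replacement: calling check_for_update_cmp python3 in place of check_for_update python3 should print the same "+" and "*" markers.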

Documentation from the Python web site

In addition to the above, I downloaded a complete set of documentation from the Python web site, specifically for my older Python 3.4.3 version:

wget "https://docs.python.org/ftp/python/doc/3.4.3/python-3.4.3-docs-text.tar.bz2"

That set is much more complete, having 1.2 million words in 458 files. The set I generated with the script above has only 89,500 words in 187 files. However, the local modules documentation, which was not included in the set I downloaded from the web and which I excluded from the word count for the local docset, has 737,000 words in 378 files.

Ergo, combining the web documentation with the local modules means I have 837 files to read through, with a total of 1,934,872 words. (I rather doubt I’ll be reading it all!)
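
File and word totals of this kind can be gathered with find and wc. The following is a rough sketch, not necessarily how the figures above were produced. The function name count_docs and its filename-pattern argument are assumptions made for illustration.

```shell
# count_docs DIR PATTERN: print how many files under DIR match PATTERN
# and how many whitespace-separated words they contain in total.
# Hypothetical helper; both the name and the arguments are illustrative.
count_docs() {
    local dir=$1 pattern=$2
    local files words
    files=$(find "$dir" -type f -name "$pattern" | wc -l)
    words=$(find "$dir" -type f -name "$pattern" -exec cat {} + | wc -w)
    # $(( )) strips the padding some wc implementations add
    echo "$((files)) files, $((words)) words"
}
```

For example, count_docs modules '*.text' would report on the generated modules directory. The downloaded set may use a different file extension, in which case the pattern needs adjusting.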