New Python Markdown extension proc_summary

Adds a customizable section that summarizes how the document was created. By default the summary is placed at the end of the document; to place it elsewhere, include a {PROCESSING_SUMMARY} marker (or a custom marker) in the source file.

Syntax

The format of the summary section is controlled by the summary layout, which is set by one of the following items; the first one found is used:

  • An environment variable MDX_PROC_SUMMARY_LAYOUT
  • A meta-data keyword proc_summary_layout or ProcSummaryLayout, if the meta extension is enabled
  • A configuration option named proc_summary_layout
  • The built-in default layout:
      %h%N%V{%N%J{File %F (%T) processed}%K{Created} on %R}

The layout consists of text interspersed with any of the following formatting codes. The codes were chosen so as not to conflict with existing date and time codes.

| Code | Meaning | Notes |
|------|---------|-------|
| %V{} | <div> tag with class “processing-summary”. | 2 |
| %F | Source document’s file name, without the path. | 1 |
| %P | Source document’s directory, if known (P=Path). | 1, 3 |
| %T{time-format} | Source document’s modification date and time. | 1, 4 |
| %R{time-format} | Time that markdown was run to create the output (R=Run). | 4 |
| %Q | List of processors involved in creating the document. | 5 |
| %E | Time spent rendering the document, in seconds (E=Elapsed). | |
| %h | Horizontal rule, typically rendered as <hr/>\n | 6 |
| %v | Version of Python Markdown used to create the document. | 6 |
| %L | Line break, typically rendered as <br/>\n | |
| %N | Newline, always rendered as \n | |
| %J{} | Include text between { and } only if the input is from a file. | |
| %K{} | Include text between { and } only if the input is from stdin. | |
| %C{} | Text between { and } is formatted as an HTML comment. | |
| %% | Percent sign | |

Notes:

  1. markdown_py, markdown.markdown, and markdown.markdownFromFile do not record how the text reached the processor, whether from stdin, from a file, or as a string passed to the object instance. If you’re processing a file and want to use the %F, %P or %T format codes, you have to give the file name to the processor. You can use one of the following; the first one found is used:

    • Put the file name into an environment variable named MDX_PROC_SUMMARY_FILE
    • If the meta extension is enabled, you can put the file name into a proc_summary_file or ProcSummaryFile metadata keyword in the source file.
    • Pass the file name in the proc_summary_file configuration option.

    If none of the above are set, stdin is assumed, meaning %K parts of the layout are used and %J parts are dropped.

  2. For the %V{} code, if %N (newline) is the first item after the ‘{’, a newline is also inserted before </div>.

  3. The %P code includes a trailing ‘/’ (or the separator appropriate to the operating system) if it has a value, so to show both the file path and name use %P%F.

  4. For the %T and %R codes, the {time-format} is optional. See Time and date formatting below for details.

  5. See the example for full details on how %Q is rendered.

  6. In contrast to most of the other codes, the %h and %v codes are lower case.

The default layout gives output similar to the following:

<hr/>
<div class="processing-summary">
    File file.txt (1 January 2018 12:34) processed on 3 February 2018 23:45
</div>
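
The way the block codes nest can be illustrated with a small stand-alone expander. This is only a sketch, not the extension’s own code: the names expand and _matching_brace are made up for illustration, only %V, %J, %K, %C and the brace-less codes are handled, and the values for %F, %T and %R are passed in as ready-made strings (so {time-format} arguments are not supported). The renderings of %h, %L and %N follow the table above literally, so whitespace may differ from what the extension actually emits.

```python
DEFAULT_LAYOUT = "%h%N%V{%N%J{File %F (%T) processed}%K{Created} on %R}"


def _matching_brace(text, start):
    """Return the index of the '}' that closes the '{' at text[start]."""
    depth = 0
    for pos in range(start, len(text)):
        if text[pos] == "{":
            depth += 1
        elif text[pos] == "}":
            depth -= 1
            if depth == 0:
                return pos
    raise ValueError("unbalanced braces in layout")


def expand(layout, values, from_file):
    """Expand a subset of the layout codes (illustrative sketch only).

    values maps single-letter codes such as 'F', 'T' and 'R' to their text;
    from_file selects between the %J{} (file) and %K{} (stdin) parts.
    """
    simple = {"N": "\n", "L": "<br/>\n", "h": "<hr/>\n", "%": "%"}
    simple.update(values)
    out = []
    i = 0
    while i < len(layout):
        if layout[i] != "%" or i + 1 == len(layout):
            out.append(layout[i])  # ordinary text is copied through
            i += 1
            continue
        code = layout[i + 1]
        i += 2
        if code in "JKCV" and layout[i:i + 1] == "{":
            end = _matching_brace(layout, i)
            raw = layout[i + 1:end]
            inner = expand(raw, values, from_file)  # blocks may nest
            i = end + 1
            if code == "J":
                out.append(inner if from_file else "")
            elif code == "K":
                out.append("" if from_file else inner)
            elif code == "C":
                out.append("<!-- " + inner + " -->")
            else:  # %V{}: wrap in the summary <div>
                out.append('<div class="processing-summary">' + inner)
                if raw.startswith("%N"):  # Note 2: newline before </div>
                    out.append("\n")
                out.append("</div>")
        else:
            # Unknown codes are left untouched in this sketch
            out.append(simple.get(code, "%" + code))
    return "".join(out)


sample = {"F": "file.txt", "T": "1 January 2018 12:34",
          "R": "3 February 2018 23:45"}
print(expand(DEFAULT_LAYOUT, sample, from_file=True))
```

Calling expand with from_file=False drops the %J{} part and keeps the %K{} part instead, which is the stdin behaviour described in Note 1.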

Special layout strings

The following strings have special meaning for the summary layout. While they can be used anywhere, they are intended for use in the MDX_PROC_SUMMARY_LAYOUT environment variable to override layouts defined in a configuration file or meta-data keyword, in case you need to do this but don’t have the ability to (or don’t want to) edit a configuration file or the source document.

| Value | Meaning |
|-------|---------|
| __DEFAULT__ | Use the default layout built into the extension |
| __CONFIG__ | Use the layout as defined in a config entry or file (falls back to default if not set) |
| __NONE__ | Don’t set up a summary at all. Same as setting the marker to __NONE__. |
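
As a rough illustration of how these values interact with the lookup order, the sketch below picks a layout from three optional sources. The function name effective_layout and its parameters are hypothetical and are not part of the extension; returning None stands in for “no summary”.

```python
DEFAULT_LAYOUT = "%h%N%V{%N%J{File %F (%T) processed}%K{Created} on %R}"


def effective_layout(env_layout=None, meta_layout=None, config_layout=None):
    """Pick the layout to use; None means no summary (illustrative only)."""
    # First one found wins: environment, then meta-data, then config
    chosen = env_layout or meta_layout or config_layout or DEFAULT_LAYOUT
    if chosen == "__NONE__":
        return None
    if chosen == "__DEFAULT__":
        return DEFAULT_LAYOUT
    if chosen == "__CONFIG__":
        # Skip the meta-data layout and fall back to the default if unset
        return config_layout or DEFAULT_LAYOUT
    return chosen


# The environment variable forces the config layout over the meta-data one
print(effective_layout("__CONFIG__", "meta: %F", "config: %F"))
```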

Summary placement

By default the summary is added at the end of the file, or just before </body> if it exists. This can be changed by placing a marker within the document. The default marker is:

{PROCESSING_SUMMARY}

The marker can be changed by one of the following; the first one found is used:

  • An environment variable named MDX_PROC_SUMMARY_MARKER
  • A meta-data keyword proc_summary_marker or ProcSummaryMarker, if the meta extension is enabled
  • A configuration option named proc_summary_marker

Setting the marker to __NONE__ in a meta-data keyword or environment variable skips the summary section in the output. If there is a summary marker in the file, it will be removed.
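
The placement rules can be sketched as a plain string operation. This is an assumption-laden illustration rather than the extension’s code: place_summary is a made-up name, only the first occurrence of the marker is replaced, and with __NONE__ the default marker is the one stripped from the output.

```python
DEFAULT_MARKER = "{PROCESSING_SUMMARY}"


def place_summary(html, summary, marker=DEFAULT_MARKER):
    """Insert summary into html following the placement rules (sketch)."""
    if marker == "__NONE__":
        # No summary at all; remove a leftover default marker (assumption)
        return html.replace(DEFAULT_MARKER, "")
    if marker in html:
        # An explicit marker takes priority over the default position
        return html.replace(marker, summary, 1)
    body_end = html.lower().rfind("</body>")
    if body_end != -1:
        # Just before </body> when the document has one
        return html[:body_end] + summary + html[body_end:]
    return html + summary  # otherwise at the very end


print(place_summary("<p>Intro</p>{PROCESSING_SUMMARY}<p>Rest</p>", "<hr/>"))
```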

Time and date formatting

The default format for %T and %R consists of the month’s full name (for example, “January”), the day of the month, the year, and hours and minutes in 24-hour time format. The exact ordering of the elements is locale dependent. If you wish to use a different format, use the following format codes to create your own, and put them between { and } following the %T or %R code.

With the exception of %D, which is unique to this module, these codes are the strftime format codes documented for Python’s ‘time’ and ‘datetime’ modules.

| Code | Meaning |
|------|---------|
| %a | Weekday as locale’s abbreviated name. |
| %A | Weekday as locale’s full name. |
| %w | Weekday as a decimal number, where 0 is Sunday and 6 is Saturday. |
| %d | Day of the month as a zero-padded decimal number. |
| %D | Day of the month as a decimal number, not zero-padded. |
| %b | Month as locale’s abbreviated name. |
| %B | Month as locale’s full name. |
| %m | Month as a zero-padded decimal number. |
| %y | Year without century as a zero-padded decimal number. |
| %Y | Year with century as a decimal number. |
| %H | Hour (24-hour clock) as a zero-padded decimal number. |
| %I | Hour (12-hour clock) as a zero-padded decimal number. |
| %p | Locale’s equivalent of either AM or PM. |
| %M | Minute as a zero-padded decimal number. |
| %S | Second as a zero-padded decimal number. |
| %f | Microsecond as a decimal number. |
| %z | UTC offset in the form +HHMM or -HHMM (empty string if the object is naive). |
| %Z | Time zone name (empty string if the object is naive). |
| %j | Day of the year as a zero-padded decimal number. |
| %U | Week number of the year (Sunday as the first day of the week) as a zero-padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0. |
| %W | Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0. |
| %c | Locale’s appropriate date and time representation. |
| %x | Locale’s appropriate date representation. |
| %X | Locale’s appropriate time representation. |
| %% | A literal “%” character. |
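
Since %D is the only code not taken from the standard library, one plausible way to support it is to substitute the day number before handing the rest of the format to strftime. The sketch below shows that idea; format_time is a hypothetical helper, and the extension may well implement %D differently.

```python
import re
from datetime import datetime


def format_time(moment, fmt):
    """Format moment with strftime codes plus %D (day, not zero-padded)."""
    def swap(match):
        # Replace only %D; every other code (including %%) is passed through
        if match.group(1) == "D":
            return str(moment.day)
        return match.group(0)

    # Scanning code by code keeps "%%D" (a literal percent sign followed
    # by the letter D) from being mistaken for the %D code.
    return moment.strftime(re.sub(r"%(.)", swap, fmt))


stamp = datetime(2018, 2, 3, 23, 45, 7)
print(format_time(stamp, "%Y-%m-%D %H:%M:%S"))
```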

Example

Here is a very verbose summary layout. The string is broken into several lines for readability; however, newlines are removed from the layout before processing starts. (Use %N in the string to indicate newlines that appear only in the rendered HTML, or %L (“line break”) for a line break rendered as <br/>.)

%h%N%V{%N
  <strong>Processing summary:</strong>%L
  %J{File %P%F (%T{%Y-%m-%d %H:%M:%S.%f})%L Converted to HTML}
  %K{Created}
  by Python Markdown %v on %R{%B %D, %Y at %H:%M:%S}%L
  <em>Processing time: %E seconds</em>%L
  <strong>Processors:</strong>%L
  %Q
}%N%h

Here’s an example that uses the above string:

# Create a test markdown file
md_file = '/tmp/test.md'
with open(md_file, 'w') as fp:
    fp.write('Testing')

# Convert it to HTML
import sys
sys.path.append('/home/neepawa/python')
from markdown import markdownFromFile
from markdown.extensions.extra import ExtraExtension
from md_ext.proc_summary import ProcSummaryExtension

verbose_layout="""%h%N%V{%N
  <strong>Processing summary:</strong>%L
  %J{File %P%F (%T{%Y-%m-%d %H:%M:%S.%f})%L Converted to HTML}
  %K{Created}
  by Python Markdown %v on %R{%B %D, %Y at %H:%M:%S}%L
  <em>Processing time: %E seconds</em>%L
  <strong>Processors:</strong>%L
  %Q
}%N%h"""

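# markdownFromFile writes the HTML to stdout when no output is given and
# returns None, so its return value is not printed.
markdownFromFile(input=md_file, extensions=[
  ExtraExtension(),
  ProcSummaryExtension(layout=verbose_layout, file=md_file)
])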

The output from the example is:

<p>Testing</p><hr/>
<div class="processing-summary">
      <strong>Processing summary:</strong><br/>
      File /tmp/test.md (2018-11-22 03:01:53.688316)<br/>
     Converted to HTML    by Python Markdown 2.6.7 on November 22, 2018 at 13:01:53<br/>
      <em>Processing time: 0.15 seconds</em><br/>
      <strong>Processors:</strong><br/>
      <div class="processor-list">
        <div class="phase">
            <span class="phase-name">Pre-processors:</span>
            <span class="processor-names">normalize_whitespace, fenced_code_block,
            html_block, footnote, abbr, reference</span>
        </div>
        <div class="phase">
            <span class="phase-name">BlockParsers <span class="block-count">(2 blocks)</span>:</span>
            <span class="processor-names">proc_summary, markdown_block, empty,
            indent, defindent, code, table, hashheader, setextheader, hr, olist,
            ulist, deflist, quote, paragraph</span>
        </div>
        <div class="phase">
            <span class="phase-name">Tree processors:</span>
            <span class="processor-names">footnote, inline, prettify,
            attr_list</span>
        </div>
        <div class="phase">
            <span class="phase-name">Post-processors:</span>
            <span class="processor-names">raw_html, amp_substitute, footnote,
            proc_summary, unescape</span>
        </div>
    </div>
</div>
<hr/>

And it renders as:

Testing


Processing summary:
File /tmp/test.md (2018-11-21 13:01:53.688316)
Converted to HTML by Python Markdown 3.0.1 on November 21, 2018 at 13:01:53
Processing time: 0.15 seconds
Processors:
Pre-processors: normalize_whitespace, fenced_code_block, html_block, footnote, abbr, reference
BlockParsers (2 blocks): proc_summary, markdown_block, empty, indent, defindent, code, table, hashheader, setextheader, hr, olist, ulist, deflist, quote, paragraph
Tree processors: footnote, inline, prettify, attr_list
Post-processors: raw_html, amp_substitute, footnote, proc_summary, unescape

Note that the processor list is formatted with four phases, with a list of processors for each phase. In the HTML document, the line length of the processors list defaults to 80 for readability, and can be changed using one of the following; the first one found is used:

  • The value in an environment variable named MDX_PROC_SUMMARY_QWIDTH
  • A proc_summary_qwidth or ProcSummaryQWidth metadata keyword, if the meta extension is enabled
  • A proc_summary_qwidth configuration option
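
The effect of the line width on the processor names can be approximated with Python’s standard textwrap module. This is a sketch of the wrapping idea only; wrap_processors is a made-up name and the extension’s own wrapping and indentation may differ.

```python
import textwrap


def wrap_processors(names, qwidth=80):
    """Join processor names with commas and wrap the text at qwidth columns."""
    return textwrap.fill(", ".join(names), width=qwidth)


pre_processors = ["normalize_whitespace", "fenced_code_block", "html_block",
                  "footnote", "abbr", "reference"]
print(wrap_processors(pre_processors, qwidth=40))
```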

Styling

As can be seen above, most summary elements have class names that can be referenced in a stylesheet to define their appearance.

Usage

See Extensions for general extension usage. Use proc_summary as the name of the extension.

This extension accepts the following configuration options. An option set in an environment variable overrides both a config option and a meta-data keyword, while a meta-data keyword overrides the config option.

| Option | Meta-data | Environment variable | Usage |
|--------|-----------|----------------------|-------|
| layout | ProcSummaryLayout | MDX_PROC_SUMMARY_LAYOUT | Summary layout (format) |
| file | ProcSummaryFile | MDX_PROC_SUMMARY_FILE | File name of source document |
| marker | ProcSummaryMarker | MDX_PROC_SUMMARY_MARKER | Where in the document to place the summary |
| qwidth | ProcSummaryQWidth | MDX_PROC_SUMMARY_QWIDTH | HTML line width of “Processors” line for layout code %Q |
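
The override order can be summarized in a few lines of Python. The sketch below is not taken from the extension: resolve_option is a hypothetical helper, and it assumes meta-data arrives in the form the meta extension produces (lower-cased keys mapped to lists of lines).

```python
import os

# Option name -> (meta-data keyword, environment variable), from the table
OPTIONS = {
    "layout": ("ProcSummaryLayout", "MDX_PROC_SUMMARY_LAYOUT"),
    "file": ("ProcSummaryFile", "MDX_PROC_SUMMARY_FILE"),
    "marker": ("ProcSummaryMarker", "MDX_PROC_SUMMARY_MARKER"),
    "qwidth": ("ProcSummaryQWidth", "MDX_PROC_SUMMARY_QWIDTH"),
}


def resolve_option(option, config, meta, environ=None, default=None):
    """Return the value for option: environment, then meta-data, then config."""
    environ = os.environ if environ is None else environ
    meta_name, env_name = OPTIONS[option]
    if environ.get(env_name):
        return environ[env_name]
    # Accept both spellings of the keyword, e.g. ProcSummaryLayout and
    # proc_summary_layout (keys assumed to be lower-cased already)
    for key in (meta_name.lower(), "proc_summary_" + option):
        if key in meta:
            return "\n".join(meta[key])
    if option in config:
        return config[option]
    return default


meta = {"procsummarymarker": ["[[SUMMARY]]"]}
print(resolve_option("marker", {"marker": "{HERE}"}, meta, environ={}))
```

Passing environ explicitly, as in the last line, keeps the example independent of whatever variables happen to be set in the current shell.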