Notes for blocks and links in CommonMark

Contents

General-purpose code to identify CommonMark blocks: list, blockquote, and code.

Blocks that can and cannot interrupt a paragraph

  • An empty blockquote line such as ‘>’ or ‘> >’ is considered blank
  • A completely empty line will collapse a set of blockquotes
  • The following block types can interrupt a paragraph but not a fenced code block or a table, so they don’t need a blank line above them to be processed:
    • Thematic breaks
    • ATX headings (leading ‘#’)
    • List markers
    • Blockquotes
    • Code fences
    • HTML blocks (except type 7–partial open tag)
    • Comments (allows one to comment out lines in a paragraph)
      • We may want to let comment lines comment out table lines
  • Next up is the “lazy paragraph continuation line”: if ! prev_line_was_blank then the line is added to the paragraph, regardless of marker or indentation
  • Can’t interrupt paragraphs, and so require a blank line before:
    • Setext (underline) headings
    • Indented code block
    • Link reference definition
    • Metadata block
    • Table
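
The interruption rules above can be sketched as a small classifier. This is only a rough sketch: the function name and the regular expressions are assumptions made for illustration, they approximate the CommonMark block starts rather than reproduce cmark's logic, and HTML blocks and comments are left out.

```python
import re

# Simplified, hypothetical patterns for block starts that may interrupt a paragraph.
# Order matters: "* * *" must be tested as a thematic break before a bullet list.
# Not modeled: a line of dashes directly under paragraph text is a setext underline.
INTERRUPTERS = [
    ("thematic_break", re.compile(r"^ {0,3}([-*_])(?:[ \t]*\1){2,}[ \t]*$")),
    ("atx_heading", re.compile(r"^ {0,3}#{1,6}(?:[ \t]|$)")),
    ("code_fence", re.compile(r"^ {0,3}(?:`{3,}|~{3,})")),
    ("blockquote", re.compile(r"^ {0,3}>")),
    ("bullet_list", re.compile(r"^ {0,3}[-+*][ \t]+\S")),
    # Inside a paragraph, an ordered list only starts with the number 1.
    ("ordered_list", re.compile(r"^ {0,3}1[.)][ \t]+\S")),
]


def interrupts_paragraph(line):
    """Return the name of the interrupting block type, or None.

    None means the line is treated as a lazy paragraph continuation line.
    """
    for name, pattern in INTERRUPTERS:
        if pattern.match(line):
            return name
    return None
```

A line such as `2. second` returns None here, so it would be folded into the open paragraph, while `1. first` would open a list.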

List blocks

  1. A list starts with a list marker
    • Unordered lists have markers ‘*’, ‘+’, or ‘-’ followed by 1 to 4 spaces.
    • For ordered lists, the marker is a number followed by ‘.’ or ‘)’, followed by 1 to 4 spaces
    • If there are more than 4 spaces after the marker, the line is interpreted as a code block instead.
  2. A list item has an m_margin (marker margin) and a t_margin (text margin):
    • For an unordered list, the m_margin is at the marker
    • For an ordered list, the m_margin is at the first digit of the marker
    • The t_margin is at the first non-blank character following the marker
      –> The m_margin and t_margin are recomputed with every new list item
  3. Subsequent markers for a list must be of the same marker type and appear at or after the m_margin and before the t_margin.
  4. Paragraphs in list items can be lazy: they can be continued anywhere on the next non-blank line and don’t have to align to the t_margin
  5. The current list item continues if a new paragraph (that is, following one or more blank lines) starts at the same column as the current t_margin
  6. A sublist is started when:
    • A marker appears at the t_margin or within three spaces after it
    • With respect to the above, a line starting pretty much anywhere when there is no blank line before it is interpreted as the continuation of the current paragraph. A list marker constitutes an exception to this concept:
    • An unordered list marker appearing in a continuation line at the current left margin or within three spaces after it will always start an unordered list. An ordered list marker appearing in a continuation line at the current left margin or within three spaces after it will start a new list only if the marker is ‘1.’ or ‘1)’. This is true even in the following case:
      ```
      1. Year 2017.
        1. Month of March
      ```
    • Putting a blank line before the sublist will cause markdown to identify any list marker, including an arbitrary number for an ordered list.
  7. A change of marker type at the m_margin closes the current list and opens a new one.
  8. A list ends when:
    • a list marker appears before the list’s t_margin, or
    • a blockquote marker appears before the list’s t_margin, or
    • a new paragraph following a blank line has a left margin lower than the list’s t_margin
  9. (Note that what might appear to be a paragraph block could actually be a lazy continuation of a paragraph started on the previous line, in which case the list is not closed.) Markdown closes all the list blocks that have a t_margin greater than the left margin of the new block, along with any blockquote and code blocks they have. This can completely collapse a set of nested lists if the block is to the left of the top-most t_margin.
    –> This can cause a situation where a line indented by a tab and spaces can be interpreted as a code block if it falls to the right of the parent list’s m_margin but to the left of the current list’s t_margin, but shifting the line to the right and aligning it with a t_margin will turn it into a paragraph!
    (I note that MultiMarkdown and cmark disagree on list indentation.)
  10. Caution: a blank line in the middle of a deeply nested list can cause Markdown to interpret a following deeply indented line as a code block and not as a new paragraph for the current list item.
  11. Text starting four characters from the t_margin constitutes a code block only if there is a blank line before it, otherwise it’s seen as a lazy paragraph continuation line.
  12. A blockquote character ‘>’ appearing at or within three spaces following the current t_margin starts a blockquote block. Normal rules for lazy and aligned paragraphs apply until the next list marker is found or the current list is closed.
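
The margin bookkeeping described in this list might be sketched as follows. The helper names `list_margins` and `close_lists`, the `base` parameter (the parent's t_margin), and the use of 0-based columns are all assumptions made for illustration; tabs are not handled.

```python
import re

# Hypothetical pattern: up to 3 spaces of indent, a bullet or a number followed
# by '.' or ')', then the spaces that separate the marker from the text.
LIST_MARKER = re.compile(r"^( {0,3})([-+*]|\d{1,9}[.)])( +)(?=\S)")


def list_margins(line, base=0):
    """Return (m_margin, t_margin) as 0-based columns, or None if no marker.

    `base` is the t_margin of the enclosing list item (0 at the top level).
    """
    if line[:base].strip():
        return None  # text sits to the left of the enclosing item's content
    m = LIST_MARKER.match(line[base:])
    if m is None:
        return None
    indent, marker, gap = m.groups()
    m_margin = base + len(indent)
    # With more than 4 spaces after the marker the text is read as an indented
    # code block, so the content column sits one space after the marker.
    width = len(gap) if len(gap) <= 4 else 1
    return m_margin, m_margin + len(marker) + width


def close_lists(open_t_margins, new_left_margin):
    """Keep only the open lists whose t_margin is not right of the new block."""
    return [t for t in open_t_margins if t <= new_left_margin]
```

For example, `list_margins("10.  item")` gives `(0, 5)`, and `close_lists([2, 4, 6], 0)` gives `[]`, which corresponds to a new block at the left edge collapsing the whole nest of lists.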

Blockquotes

Paragraph

In this paragraph, the next line starts with
1. *cmark* will recognize this as the start of
an ordered list, see the following lines as lazy
paragraph continuation lines, and add them all
to the first item of the ordered list.

In this paragraph, the next line starts with 1. cmark will recognize this as the start of an ordered list, see the following lines as lazy paragraph continuation lines, and add them all to the first item of the ordered list.

Full blockquote style

In this example, cmark sees the blockquote marker on the second line and recognizes the line as a non-lazy continuation of the paragraph. Then it sees the opening list item marker and makes a list out of it.

> In this paragraph, the next line starts with
> 1. *cmark* will recognize this as the start of
> an ordered list, see the following lines as lazy
> paragraph continuation lines, and add them all
> to the first item of the ordered list.

In this paragraph, the next line starts with 1. cmark will recognize this as the start of an ordered list, see the following lines as lazy paragraph continuation lines, and add them all to the first item of the ordered list.

Lazy blockquote style

In this example, cmark sees the opening list item before it recognizes the line as a lazy paragraph continuation line of the blockquote. It closes the blockquote and starts a list. This implies cmark processes potential list markers before it processes lazy paragraph continuation lines.

> In this paragraph, the next line starts with
1. *cmark* will recognize this as the start of
an ordered list, see the following lines as lazy
paragraph continuation lines, and add them all
to the first item of the ordered list.

In this paragraph, the next line starts with 1. cmark will recognize this as the start of an ordered list, see the following lines as lazy paragraph continuation lines, and add them all to the first item of the ordered list.
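
The ordering observed in the lazy example can be sketched as a decision function. This is a guess at the order of checks implied by the behaviour above, not cmark's actual code; the function name, the return labels, and the simplified patterns are hypothetical.

```python
import re

# Simplified, hypothetical patterns.
QUOTE_START = re.compile(r"^ {0,3}>")
# A list can interrupt a paragraph with any bullet, or with the number 1.
LIST_START = re.compile(r"^ {0,3}(?:[-+*]|1[.)])[ \t]+\S")


def next_line_action(line, in_quote_paragraph):
    """Decide what a line does when a blockquote is currently open."""
    if QUOTE_START.match(line):
        return "continue_quote"          # explicit '>' keeps the quote open
    if LIST_START.match(line):
        return "close_quote_open_list"   # list markers are tested first
    if in_quote_paragraph and line.strip():
        return "lazy_continuation"       # only then is laziness considered
    return "close_quote"
```

With a quoted paragraph open, `1. *cmark* will ...` yields `close_quote_open_list`, while `an ordered list, see ...` yields `lazy_continuation`.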

Indented code blocks within a blockquote

One needs at least 5 spaces in a quote block after an empty line at the current blocking level to open a code block – an empty space after the ‘>’ and four spaces (or a tab) to open the block.
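
A minimal sketch of that check for a single level of blockquote, using spaces only. The function name is hypothetical, `prev_line_was_blank` is assumed to be tracked by the caller, and tab expansion is deliberately not modeled.

```python
import re

# '>' with up to 3 leading spaces, plus the one optional space that belongs
# to the marker itself.
QUOTE_PREFIX = re.compile(r"^ {0,3}> ?")


def quoted_code_line(line, prev_line_was_blank):
    """Return the code text if `line` opens indented code in a quote, else None."""
    if not prev_line_was_blank:
        return None  # without a blank line it is a lazy continuation instead
    m = QUOTE_PREFIX.match(line)
    if m is None:
        return None
    rest = line[m.end():]
    # Four more spaces after the marker's own space are needed for code.
    if rest.startswith("    ") and rest.strip():
        return rest[4:]
    return None
```

So `>` followed by five spaces and `code` yields `"code"`, while four spaces after the `>` leave only three columns of indentation and yield None.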

Immediately nested blockquote

The following will open three nested blockquotes in one go:

  > > > This is a third level of blockquote
  >
  > This is first level, because there is only one '>'

It’s up to the parser to account for the current level of blockquote nesting and open or close levels as new ‘>’ lines are encountered.
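
One way a parser might count the nesting level of a line is to strip blockquote markers one at a time. The sketch below assumes a hypothetical helper name and a simplified marker pattern.

```python
import re

# One blockquote marker: up to 3 spaces, '>', and one optional following space.
QUOTE_MARKER = re.compile(r" {0,3}> ?")


def split_blockquote(line):
    """Return (depth, rest): the count of leading '>' markers and the rest."""
    depth = 0
    pos = 0
    while True:
        m = QUOTE_MARKER.match(line, pos)
        if m is None:
            break
        depth += 1
        pos = m.end()
    return depth, line[pos:]
```

The caller would compare `depth` with the number of blockquotes currently open and open the missing levels. Closing levels needs more care, because a line with fewer markers may still be a lazy continuation of an open paragraph.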

Closing a blockquote

In cmark,

  • A completely blank line or the introduction of a list will close all levels of open blockquote. Always.
  • A line of ‘>’ markers, regardless of number, followed by nothing starts a new paragraph with the next line, or provides a “blank” line for opening a code block at whatever level of indentation is indicated by the following blockquote entry.
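
The two kinds of "blank" line can be told apart with a sketch like the one below. The function name and return labels are hypothetical, and the list-marker case from the first bullet is not covered.

```python
import re

# Only blockquote markers, optionally separated and followed by whitespace.
BLANK_QUOTE = re.compile(r"(?: {0,3}> ?)+[ \t]*")


def classify_quote_line(line):
    """Classify a line seen while one or more blockquotes are open."""
    if line.strip() == "":
        return "close_all"     # completely blank: every quote level closes
    if BLANK_QUOTE.fullmatch(line):
        return "quote_blank"   # markers only: a blank line inside the quote
    return "content"
```

Both `>` and `> >` come back as `quote_blank`, while an empty string comes back as `close_all`.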