Filling Markdown-style list items with hanging indentation in Emacs

I'd like Emacs to wrap Markdown-style list items with hanging indentation. In fact, even in fundamental-mode, this works quite well. Consider these three list items:

* Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

- Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

+ Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.

When filled using M-q (fill-paragraph), the result is:

* Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
  eiusmod tempor incididunt ut labore et dolore magna aliqua.

- Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
  nisi ut aliquip ex ea commodo consequat.

+ Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur.

Note that the first two are filled as I'd like, but the last is not. I'm having trouble understanding why the behavior is different in the last case, and how to modify it appropriately.

The comments in the source code for fill-paragraph state that the following steps are tried:

  1. Fill the region if it is active when called interactively.
  2. Try fill-paragraph-function.
  3. Try our syntax-aware filling code.
  4. If it all fails, default to the good ol' text paragraph filling.

There is no active region when I press M-q to fill those list items, so step 1 should be skipped. I have not defined a fill-paragraph-function (I verified that fill-paragraph-function is nil in fundamental-mode).
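For reference, both preconditions can be checked from Lisp as well, for example with M-: (eval-expression). This is only a sketch: it uses a temporary buffer rather than the buffer being edited, and it assumes no global customization has changed the default of fill-paragraph-function.

```elisp
;; Evaluate in a scratch buffer or via M-: (eval-expression).
(with-temp-buffer
  (fundamental-mode)
  (list
   ;; nil means step 2 has nothing to call.
   fill-paragraph-function
   ;; nil means there is no active region, so step 1 is skipped.
   (use-region-p)))
```

If both elements of the returned list are nil, steps 1 and 2 do not apply.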

This means we get to step three. In fundamental-mode, all three list markers have the same syntax class (as verified with describe-char):

character: * (displayed as *) (codepoint 42, #o52, #x2a)
   syntax: _    which means: symbol

character: - (displayed as -) (codepoint 45, #o55, #x2d)
   syntax: _    which means: symbol

character: + (displayed as +) (codepoint 43, #o53, #x2b)
   syntax: _    which means: symbol

Therefore, I do not understand why this step would treat these paragraphs any differently. Looking at the source, it seems this step only tries to handle comments, so I believe step three is skipped in the cases above.
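The syntax classes reported by describe-char can also be checked programmatically with char-syntax. A small sketch, assuming the standard syntax table that fundamental-mode uses:

```elisp
;; char-syntax returns the syntax class designator character
;; for each list marker; ?_ is the designator for "symbol".
(with-temp-buffer
  (fundamental-mode)
  (mapcar (lambda (c) (string (char-syntax c)))
          '(?* ?- ?+)))
;; → ("_" "_" "_")
```

All three markers come back as symbol constituents, so the syntax table does not explain the difference.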

So, it seems we move to step four, which makes use of paragraph-start. I tried setting paragraph-start to the following (which allows whitespace before the markers):

"\f\\|[ \t]*$\\|^[ \t]*[*+-] "

This did not work either, so I am at a loss for how to make this work.

Do I need to write a custom fill-paragraph function, or is there a simpler approach?


A simple solution is to use adaptive-fill-regexp. This variable is a regular expression which matches text at the beginning of a line which constitutes indentation. In the cases above, we'd like the list markers to be counted towards indentation.

In the first two cases, these markers are already in the default value of adaptive-fill-regexp, as defined in fill.el:

"[ \t]*\\([-–!|#%;>*·•‣⁃◦]+[ \t]*\\)*"

Note that although a lot of different bullet characters are in the character set in the first group, the plus sign (+) is not. For the three cases above, the following does the trick:

"[ \t]*\\([*+-][ \t]*\\)*"
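One way to apply this is to set the variable buffer-locally from a mode hook. The following is a minimal sketch: the names my/list-fill-regexp and my/list-fill-setup are made up for illustration, and text-mode-hook is only an assumption about where the setting is wanted.

```elisp
;; Hypothetical names; adjust the hook to the mode you actually use.
(defvar my/list-fill-regexp "[ \t]*\\([*+-][ \t]*\\)*"
  "Line-prefix regexp treating *, + and - list markers as indentation.")

(defun my/list-fill-setup ()
  "Make M-q use hanging indentation for *, + and - list items."
  ;; setq-local keeps the change confined to the current buffer.
  (setq-local adaptive-fill-regexp my/list-fill-regexp))

(add-hook 'text-mode-hook #'my/list-fill-setup)

;; Quick check: the regexp now consumes the "+ " marker (2 characters).
(progn
  (string-match my/list-fill-regexp "+ Duis aute irure")
  (match-end 0))
;; → 2
```

With the default value, the same check on a line starting with "+ " gives 0, because the plus sign is not in the character set. That is why the third item was filled without hanging indentation.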