
Use compact lists when converting from docx to markdown

I'm using pandoc on Windows to convert from a .docx file to a .md file.

The flags I'm using are the following:

pandoc --wrap none --to markdown_github --output fms.md "FMS.docx"

When I view the output markdown file, it has blank lines separating the list items. The documentation calls this a loose list, such as the one below.

- one

- two

- three

I want to use a compact list for the output such as the one below.

- one
- two
- three

Is there a flag to make pandoc output a compact list?

If not, how can I use a filter to achieve the desired output?

styfle asked Oct 24 '25 18:10



1 Answer

There is no flag to achieve this, but there is a simple solution using pandoc's filter functionality. Internally, each list item is represented as a list of blocks; a list is compact if every item consists only of Plain blocks. If every item consists of a single paragraph, it is sufficient to change the type of that block from Para (for paragraph) to Plain.
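To make the Para/Plain distinction concrete, here is a small Python sketch of how the two list shapes look in pandoc's JSON representation of the document. The helper names (`para`, `plain`, `is_compact`) are made up for illustration, and `is_compact` only encodes the rule as described above; pandoc's own writers may apply a slightly different check.

```python
# Illustrative only: pandoc's JSON AST for a bullet list, written as Python dicts.
# The helper names below are hypothetical and not part of any pandoc API.


def para(text):
    """A paragraph block holding a single word."""
    return {"t": "Para", "c": [{"t": "Str", "c": text}]}


def plain(text):
    """A plain (non-paragraph) block holding a single word."""
    return {"t": "Plain", "c": [{"t": "Str", "c": text}]}


# Each list item is itself a list of blocks.
loose_list = {"t": "BulletList", "c": [[para("one")], [para("two")]]}
compact_list = {"t": "BulletList", "c": [[plain("one")], [plain("two")]]}


def is_compact(bullet_list):
    """True if every block of every item is a Plain block (rule as stated above)."""
    return all(
        block["t"] == "Plain"
        for item in bullet_list["c"]
        for block in item
    )


print(is_compact(loose_list))    # False
print(is_compact(compact_list))  # True
```

Turning the first shape into the second is all the conversion needs to do.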

The Lua program below does just that. Save it as the-filter.lua and run pandoc -t markdown --lua-filter the-filter.lua your-document.docx (requires pandoc 2.1 or later):

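local List = require 'pandoc.List'

-- Turn an item that holds a single Para into one that holds a Plain;
-- items with any other content are returned unchanged.
function compactifyItem (blocks)
  return (#blocks == 1 and blocks[1].t == 'Para')
    and {pandoc.Plain(blocks[1].content)}
    or blocks
end

-- Apply compactifyItem to every item of a bullet or ordered list.
function compactifyList (l)
  l.content = List.map(l.content, compactifyItem)
  return l
end

-- Run compactifyList on all bullet and ordered lists in the document.
return {{
    BulletList = compactifyList,
    OrderedList = compactifyList
}}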

If one prefers Haskell over Lua, it's also possible to use the filter below with pandoc -t markdown --filter the-filter.hs your-document.docx:

import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter compactifyList

-- | Compactify the items of bullet and ordered lists; leave other blocks alone.
compactifyList :: Block -> Block
compactifyList blk = case blk of
  (BulletList items)         -> BulletList $ map compactifyItem items
  (OrderedList attrbs items) -> OrderedList attrbs $ map compactifyItem items
  _                          -> blk

-- | Replace a lone paragraph with a Plain block holding the same inlines.
compactifyItem :: [Block] -> [Block]
compactifyItem [Para bs] = [Plain bs]
compactifyItem item      = item

The same would also be possible using a Python filter in case neither Lua nor Haskell is an option. See pandoc's filters page for details.
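As a rough sketch of what such a Python filter could look like, the script below works directly on pandoc's JSON AST using only the standard library, so it does not depend on a helper package. The file name the-filter.py and the function names are assumptions made for this example; the logic mirrors the Lua and Haskell versions above.

```python
#!/usr/bin/env python3
"""Pandoc JSON filter sketch: make single-paragraph list items compact."""
import json
import sys


def compactify_item(blocks):
    """Turn an item holding a single Para into one holding a Plain."""
    if len(blocks) == 1 and blocks[0].get("t") == "Para":
        return [{"t": "Plain", "c": blocks[0]["c"]}]
    return blocks


def walk(node):
    """Recursively rewrite BulletList and OrderedList nodes of the JSON AST."""
    if isinstance(node, list):
        return [walk(child) for child in node]
    if isinstance(node, dict):
        # Process children first so nested lists are handled as well.
        node = {key: walk(value) for key, value in node.items()}
        if node.get("t") == "BulletList":
            node["c"] = [compactify_item(item) for item in node["c"]]
        elif node.get("t") == "OrderedList":
            # An ordered list carries its attributes before its items.
            attrs, items = node["c"]
            node["c"] = [attrs, [compactify_item(item) for item in items]]
        return node
    return node


def main():
    # Pandoc sends the document as JSON on stdin and reads the result from stdout.
    doc = json.load(sys.stdin)
    json.dump(walk(doc), sys.stdout)


if __name__ == "__main__":
    main()
```

Assuming the script is saved as the-filter.py, it would be used like the Haskell version: pandoc -t markdown --filter the-filter.py your-document.docx. Items that contain more than one block, such as a paragraph followed by a nested list, are left untouched.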

tarleb answered Oct 26 '25 14:10
