I generate a Reveal.js presentation from Pandoc markdown. It works something like this:
Headings of level 1 (# heading 1) and 2 (## heading 2) start a new slide, as does a horizontal rule (---).
One can create a two-column layout using the following syntax (which creates <div>s with the respective classes):
# My cool slide
:::columns
::::::column
This is column 1
::::::
::::::column
This is column 2
::::::
:::
But this is quite tedious. I'd rather use three plus signs (+++) to define a two column layout, like this:
# My cool slide
This is column 1
+++
This is column 2
# Another slide
But with no columns
I think it should be easy to convert this (+++) into the result expected by Pandoc (:::columns...).
I tried using the split method first:
markdown.split(/^(##? .*)|(---)$/) do |slide|
  # Do another regex for the slide content that looks for a `+++`:
  # - If it finds one, replace it with the `:::columns` (etc.) syntax
  # - If it finds none, just leave it be
end.join # Glue everything together again
But I'm quite confused how this works.
1st iteration:
$1 is "# My cool slide"
slide is ""
2nd iteration:
$1 is "# My cool slide"
slide is "# My cool slide"
3rd iteration:
$1 is "# Another slide"
slide is "\n\nThis is column 1\n\n+++\n\nThis is column 2\n\n"
4th iteration:
$1 is "# Another slide"
slide is "# Another slide"
5th iteration:
$1 is nil
slide is "\n\nBut with no columns"
What is happening here?
Not sure if regexing your way out is a good solution. I'd break it up into logical chunks where you have a lot more freedom to do what is necessary (write a parser), then convert it into a different format:
# slides.rb
md = <<~MD
# My cool slide
This is column 1
+++
This is column 2
# Another slide
But with no columns
MD
enum = md.split("\n").each
slides = {}

# break it up into slides
loop do
  slide = []
  line = enum.next
  loop do
    slide << line # collect lines
    # peek raises StopIteration at the end of the input, which `loop` rescues
    break if enum.peek.match?(/\A##? /) # until the next level 1 or 2 heading
    line = enum.next
  end
  heading, *body = slide # assuming a single-line heading

  # break slides into columns
  # join("\n") if you want to keep newlines
  columns = body.join.split("+++")
  slides[heading] = columns
end

# p slides
# => {"# My cool slide" => ["This is column 1", "This is column 2"], "# Another slide" => ["But with no columns"]}

# join it together
slides.each do |heading, columns|
  puts heading
  puts
  if columns.size > 1
    puts ":::columns"
    columns.each do |col|
      puts "::::::column"
      puts col
      puts "::::::"
    end
    puts ":::"
  else
    puts columns
  end
  puts
end
Test:
$ ruby slides.rb
# My cool slide
:::columns
::::::column
This is column 1
::::::
::::::column
This is column 2
::::::
:::
# Another slide
But with no columns
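The same "break it into slides, then into columns" idea can be sketched more compactly with Enumerable#slice_before. This is only a sketch under a few assumptions: to_pandoc_columns is a hypothetical helper name, every slide starts with a level 1 or level 2 heading, and a column separator is a line containing nothing but +++.

```ruby
# Hypothetical helper: converts "+++"-separated slide bodies into the
# fenced-div syntax (:::columns / ::::::column) shown in the question.
def to_pandoc_columns(markdown)
  # Start a new chunk of lines at every level 1 or level 2 heading.
  slides = markdown.lines(chomp: true).slice_before { |line| line.match?(/\A##? /) }

  slides.map { |heading, *body|
    # Split the body wherever a line consists solely of "+++".
    columns = body.join("\n").split(/^\+\+\+$/).map(&:strip)

    if columns.size > 1
      wrapped = columns.map { |col| "::::::column\n#{col}\n::::::" }
      [heading, "", ":::columns", *wrapped, ":::"].join("\n")
    else
      # No separator found: keep the slide as it was.
      [heading, "", *columns].join("\n")
    end
  }.join("\n\n") + "\n"
end
```

Calling puts to_pandoc_columns(md) with the md heredoc from slides.rb should print the same structure as the test run above. Unlike the hash-based version, this keeps slides in an array, so two slides with the same heading do not overwrite each other.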
I came up with a rather complicated regular expression to match slides (capturing the heading or separator in one group and the content of the slide in another group).
The pattern, with the x and m flags:
( # Capturing group n°0: begin, heading or slide separator.
  (?:
    \A # Begin of text (for the first slide).
    |
    ^\#{1,2}?[^\#\r\n]+\r?\n # A heading of level 1 or 2.
    | # or
    ^-{3,}\r?\n # A horizontal ruler.
  )
)
( # Capturing group n°1: The content of the slide.
  (?: # A line of content.
    ^ # Match begin of line.
    (?!\#{1,2}[^\#]|-{3,}) # Not followed by a heading or horizontal line.
    [^\r\n]*(?:\r?\n|\z) # The line content, new line or end of text.
  )+
)
See it in action here: https://regex101.com/r/MkTwXs/2
markdown = <<~END_OF_MARKDOWN
The first slide could start without a heading ;-)
+++

# My cool slide
This is column 1 with a link:
[Go to the last slide](#the-end)
+++
This is column 2 followed by
### A title of level 3
Some text, with list items:
- Item 1
- Item 2
- Sub-item
- Last item
# Another slide
But with no columns
## Another slide of level 2 because this is what you wanted
And here comes the content of slide 3, in the first column
+++
Then the content in the second column.
And `+++` or `---` should not break anything.
---
A slide without a header but with some CSS:
```css
body {
font-family: Arial, sans-serif;
}
```
---
<a id="the-end"></a>

+++
Thanks for your attention!
Any questions?
END_OF_MARKDOWN
# The regular expression to match slides.
slide_pattern = %r{
  ( # Capturing group n°0: begin, heading or slide separator.
    (?:
      \A # Begin of text (for the first slide).
      |
      ^\#{1,2}?[^\#\r\n]+\r?\n # A heading of level 1 or 2.
      | # or
      ^-{3,}\r?\n # A horizontal ruler.
    )
  )
  ( # Capturing group n°1: The content of the slide.
    (?: # A line of content.
      ^ # Match begin of line.
      (?!\#{1,2}[^\#]|-{3,}) # Not followed by a heading or horizontal line.
      [^\r\n]*(?:\r?\n|\z) # The line content, new line or end of text.
    )+
  )
}mx

# Get all the slide matches.
slides = markdown.scan(slide_pattern)

# Convert each slide match (heading/separator + content) into a string.
slides.map! { |slide_match|
  # Take the content and split it with the column separator.
  columns = slide_match[1].split(/^\+{3,}$/m)
  if columns.length > 1
    # Wrap each column into a child div with the `column` class.
    columns.map! { |column|
      # Trim the column content before wrapping it.
      column.gsub!(/\A(?:\r?\n)+|(?:\r?\n)+\z/, '')
      "\n::::::column\n#{column}\n::::::\n"
    }
    # Return the heading or separator and all the columns in the parent div.
    slide_match[0] + "\n:::columns\n#{columns.join}\n:::\n\n"
  else
    # No columns found, so return the heading or separator and the content.
    slide_match[0] + slide_match[1]
  end
}

puts slides.join
The output:
:::columns
::::::column
The first slide could start without a heading ;-)
::::::
::::::column

::::::
:::
# My cool slide
:::columns
::::::column
This is column 1 with a link:
[Go to the last slide](#the-end)
::::::
::::::column
This is column 2 followed by
### A title of level 3
Some text, with list items:
- Item 1
- Item 2
- Sub-item
- Last item
::::::
:::
# Another slide
But with no columns
## Another slide of level 2 because this is what you wanted
:::columns
::::::column
And here comes the content of slide 3, in the first column
::::::
::::::column
Then the content in the second column.
And `+++` or `---` should not break anything.
::::::
:::
---
A slide without a header but with some CSS:
```css
body {
font-family: Arial, sans-serif;
}
```
---
:::columns
::::::column
<a id="the-end"></a>

::::::
::::::column
Thanks for your attention!
Any questions?
::::::
:::
I would avoid using a regular expression to handle what you want to do. Instead, try to implement a Markdown extension on a proper parser. Why? Because of cases like this one:
A slide with some plain text:
```
What will happen if we have `+++` or `---` below?
---
Probably break everything!
+++
No?
```
The column separator +++ is inside a block of plain text and it
should not be detected as a column separator :-/
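To make the problem concrete, here is a rough sketch of what a text-based approach would have to do to survive that case: walk the content line by line and remember whether it is currently inside a fenced code block. The helper name split_columns is hypothetical, and the sketch only recognises ``` and ~~~ fences; indented code blocks, HTML blocks and other constructs would still trip it up, which is exactly why a real parser is the safer route.

```ruby
# Hypothetical helper: split slide content into columns at "+++" lines,
# ignoring any "+++" that sits inside a fenced code block.
def split_columns(content)
  columns = [[]]
  fence = nil # the marker that opened the current fence, or nil outside one

  content.each_line(chomp: true) do |line|
    marker = line[/\A\s*(`{3,}|~{3,})/, 1]

    if fence.nil? && marker
      fence = marker # opening fence, possibly followed by a language name
    elsif fence && marker && marker[0] == fence[0] &&
          marker.length >= fence.length && line.strip == marker
      fence = nil # closing fence of the same kind
    elsif fence.nil? && line.match?(/\A\+{3,}\s*\z/)
      columns << [] # a real column separator: start the next column
      next
    end

    columns.last << line
  end

  columns.map { |lines| lines.join("\n").strip }
end
```

The same state tracking would be needed when looking for the --- slide separators, since they can appear inside code blocks too.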
As you pointed out in your final comment, creating a Pandoc filter will be safe and easier to implement. The AST (abstract syntax tree) is the best way to manipulate the document and change it before the final output.
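As an illustration of the AST route, the sketch below works on Pandoc's JSON representation of the document rather than on the raw text. It rests on assumptions that should be checked against a real pandoc -t json dump: that a +++ line surrounded by blank lines is parsed as a paragraph containing the single string +++, and that slides begin at level 1 or 2 headers or at horizontal rules. The function names (column_break?, slide_start?, div, columnize) are made up for this example.

```ruby
require "json"

# True for a paragraph whose only content is the string "+++".
def column_break?(block)
  block["t"] == "Para" && block["c"] == [{ "t" => "Str", "c" => "+++" }]
end

# True for blocks that start a new slide: level 1/2 headers and horizontal rules.
def slide_start?(block)
  (block["t"] == "Header" && block["c"][0] <= 2) || block["t"] == "HorizontalRule"
end

# Build a Div block carrying a single class around the given child blocks.
def div(css_class, blocks)
  { "t" => "Div", "c" => [["", [css_class], []], blocks] }
end

# Rewrite the top-level blocks: within each slide, runs of blocks separated
# by a "+++" paragraph are wrapped in column Divs inside a columns Div.
def columnize(blocks)
  blocks.slice_before { |block| slide_start?(block) }.flat_map do |slide|
    head, body = slide_start?(slide.first) ? [[slide.first], slide.drop(1)] : [[], slide]

    columns = [[]]
    body.each { |block| column_break?(block) ? columns << [] : columns.last << block }

    if columns.size > 1
      head + [div("columns", columns.map { |col| div("column", col) })]
    else
      slide # no separator in this slide: leave it untouched
    end
  end
end
```

To use it as a filter, a small script could read the document with JSON.parse($stdin.read), replace its "blocks" entry with columnize(doc["blocks"]), print it back with JSON.generate, and be passed to pandoc via --filter. Because code blocks are separate CodeBlock nodes in the AST, a +++ inside one is never mistaken for a separator.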