We discuss elm-markdown's approach to extensibility, the Markdown specification, and some advanced uses.
October 11, 2021

dillonkearns/elm-markdown's Core Tools for Extensibility


Hello, Jeroen.
Hello, Dillon.
Well, it's been a while since this library was initially released, but I think this is
a pretty good time to finally talk about elm-markdown, my Markdown parsing package, dillonkearns/elm-markdown.
We just had a big milestone with Google Summer of Code and completed a lot of the core parsing features.
So I think this is a good time to discuss it.
Very cool.
All right.
And maybe for the sake of people who don't know what Markdown is, can you explain what
Markdown is?
So I think of Markdown as you're trying not to write low level HTML directly.
You're trying to write something that's a little bit more high level.
So instead of writing H1s and H2s, you just put a hash space heading name, and then you
write text and it becomes part of a paragraph.
You put something between stars and it becomes italic.
You put something between double stars and it becomes bold.
And a lot of developers will be familiar with this from working in readmes in GitHub or
documentation for Elm package documentation.
Sometimes writing in certain chat applications will support that.
So it's basically like a higher level way to write things.
And it turns out to be very handy for documentation tasks like that.
If it's a very technical team, you might use it for marketing type things.
If it's less technical, you might have a rich text editor that under the hood is actually
creating strings of Markdown text.
Or sometimes people use different rich text formats.
A lot of these CMS content management systems are using their own sort of custom formats
that are actually just JSON data structures under the hood rather than Markdown.
But Markdown is very common these days.
Yeah, I kind of view it like Microsoft Word or Google Docs, but way easier to write.
You don't need any special application.
It's just a text file and then it renders quite nicely.
And it's designed, to a certain extent, to look like what it renders to.
So it brings attention to, you know, if you have two hashes, two number signs, and that's
an H2 level two heading, then it sort of stands out like a level two heading.
And if you have, you know, lists with dash list item dash list item two dash list item
three, then it looks like a list and then it's going to render as bullet points and
you can do a numbered list and you can write one for each item in the numbered list, but
it's going to turn it into a numbered list rather than those explicit numbers.
So it's a more high level way to express things that sort of looks like the markup that you're
describing and you can sort of read it as raw text or you can render it to something
and display it, and it's also designed to be very lenient about how it parses things
to behave like humans think.
So like, well, for example, if you don't properly close a tag, then it's just going to render
that as raw text.
It's not going to give a parsing error.
And so there are some very complicated rules to the markdown parser.
Actually it might seem simple because as a human you write it and it's simple, but there
are some strange cases where, you know, like if you write a paragraph and it has a number
in the middle of the text and you start the next line with, you know, two period for some
I can't remember, there's like an example that they show in the markdown spec.
The markdown spec is gigantic, but the point is things that might be clear to a human with
context are not necessarily clear to a parser.
And markdown is designed to be friendly to humans more so than friendly to parsers.
It's not like a form.
And you know, so it feels very lightweight when you're writing it as a human and it feels
like it sort of understands what you mean because there are all these rules and there
are actually some ambiguities in the spec.
There's like a tool called Babelmark that you can input markdown and it spits out all
these different render outputs from different parsing libraries because there are actually
some places in the markdown spec where it is not explicitly defined.
You know, where there's wiggle room, which is strange.
In fact, a little bit of history about the Markdown spec: originally John Gruber,
who's a blogger who writes a lot about Apple products, created not a specification,
but this concept of Markdown, and used it for his own blog, Daring Fireball.
I'm guessing because he was lazy.
Yeah, and he wanted a nice way to produce content for his blog in a lightweight way.
And you know, he's a technical person and wanted to be able to just pull
up an editor and write some posts.
And then people started adopting it.
But then there were all these different interpretations of it.
And he didn't want there to be a single specification.
So some like Jeff Atwood and some other folks were sort of pushing for creating a specification
for it.
And John Gruber actually did not want to make a specification.
He was opposed to it.
And in fact, that's why the spec is called CommonMark and not
the Markdown spec, because John Gruber created Markdown.
He didn't want there to be a specification for Markdown.
He wanted it to be sort of something that you can interpret how you want to and there's
not a single way to interpret it, which is not how we tend to think of it today, because
CommonMark sort of took over the concept of Markdown in people's minds.
But it's called common mark because John Gruber said, hey, if you're going to create a specification,
give it a different name.
Yeah, I'm guessing it's because he wanted everyone to be able to write Markdown
the way they wanted to, and then render it the way they wanted to.
But then the thing is, a lot of people adopted it, and then tooling needed to be created
for it, and they needed to do it in a way that is common.
And that's actually a very good segue to dillonkearns/elm-markdown, because the
goal of that library was to solve some of those problems, not by creating a new
specification or by not having a specification, but rather by providing some ways to
render in a custom way, to extend the types of things that you can render through this
formal Markdown specification.
So that was really the motivation for creating dillonkearns/elm-markdown.
So as you say, you have different tooling, you have syntax highlighting in VS code, in
GitHub editors, you have different contexts that are rendering markdown in their own way.
So having some sort of standardization is really nice.
Typically these days we think of markdown as GitHub flavored markdown, which is an
add-on to CommonMark.
It's like a superset of the CommonMark specification.
Just to be clear, GitHub flavored markdown is the name of the thing.
So there is a specification document, it's very long, called GitHub flavored markdown,
and the first 90, 95% of the document is just the CommonMark specification document.
And then the remaining 5 to 10% of the document is a couple of additions for the GitHub specification.
For example, auto links, to do items, a few small things like that.
To do items?
Yeah, sure.
Which I make use of quite a bit.
I really like that feature.
So that's not part of CommonMark, that's part of GitHub flavored markdown.
So yeah, it's quite nice to be able to edit something in GitHub, see the syntax highlighting
there, edit it in VS Code, edit it in Ulysses or different markdown editing tools.
I use a macOS tool called Bear Notes, which displays your raw markdown, but visually it
will make headings look bigger and bold look bold, but you see the star characters and
the hash characters in line.
So it's just a really nice way to edit text and display text, and it's portable, you can
copy it and paste it into different places.
So to me, as you say, having that universal specification is really nice.
But then that leaves you with the problem of, well, what if you want to go beyond what
the specification was designed for and extend it?
So that was the problem that I was trying to tackle with Dillon Kern's Elm Markdown.
So the core tools that help with that problem are, number one, the custom renderers allow
you to render to any output format.
You can render to Elm UI, you can render to HTML, you can render to Elm CSS, you can render
to a string with like ANSI terminal coloring.
You can even get more nuanced than that, which we can get into.
The second thing is you have HTML handlers.
So this is really key.
So the Markdown specification...
Which one?
Well, CommonMark and GitHub flavored Markdown allow for HTML to be included in the Markdown.
So sometimes you'll see this where people will insert an image tag into a review or
things like that, right?
Or comments.
That's right.
So comments, which usually will just be rendered directly as comments into the rendered HTML.
And so my vision for Elm Markdown was that I don't believe that it's a great idea in
general to put low level HTML into your Markdown.
Because imagine that you have like a marketing person is editing some content for a post
or a landing page.
They're probably doing it with like a rich text editor function, but maybe it's outputting
it into Markdown.
You really shouldn't be putting low level HTML.
And if you do, it should be abstracted somehow, right?
So that's just too low level.
You shouldn't directly put the HTML.
What do you mean with low level HTML?
Do you mean like images?
Do you mean like paragraphs?
So let's say you're putting a button in there and the button has some styling, right?
So if you put like, where should that live?
Well, if you put a button that says like, click now to sign up for this workshop, you
know, and you put in some styling, you put in some classes or some inline styles or whatever.
Well, then when you want to do another button, are you copy pasting that somewhere else?
It just, it seems like the wrong level of abstraction for Markdown because Markdown
is intended to be a high level abstraction.
You don't put H1s and put classes in your H1s.
And that's sort of the point.
The point is you just say this is a level one heading, this is a level two heading.
It's more, it's more markup than styling, right?
And it's designed for that separation and to be high level.
In my view, at least that's the vision of dillonkearns/elm-markdown, right?
And so what should you do instead?
Well, to me it should be sort of like just in a high level way using HTML, almost like
you would with like a web component or something to encapsulate this as a button and not represent
the low level details of how to do that, but say what it is.
It is a button and then leave it to the renderer to do that.
It could be a web component.
In the case of dillonkearns/elm-markdown, it's custom HTML handlers.
So you kind of want to separate the contents of the Markdown from the display of it, which
is going to be the website's job.
Because you don't have like functions to extract these abstractions in Markdown.
It's just markup.
So you should be as declarative and terse as possible to just say, this is a button
and be as minimal in describing it as possible and leave the details of how to present that
elsewhere so that you can properly abstract it.
So that's sort of what HTML handlers are for.
So for that example, if you wanted to render a button, in your dillonkearns/elm-markdown renderer,
you would say, when I see an HTML tag called signup button, I want these attributes.
So you can say, I need a message attribute or whatever information you want to pass along.
And then you define something that renders to your view type given that input data of
the message.
And it gives you the parsed children.
So if you have Markdown within that HTML button, you could have like a list.
Well, you probably wouldn't have a list of things, but you could have something in bold
inside of the button.
You get that in your renderer and you can just pass that through and display that.
So that's sort of like the philosophy of how dillonkearns/elm-markdown tries to be extensible,
just like John Gruber was trying to solve that problem by saying, hey, implement the
spec however you want to.
dillonkearns/elm-markdown says, no, like write high level HTML tags.
And then in your specific implementation, you decide what to do with those HTML tags.
If you're displaying it in something else, you could use a web component to display it.
You could use a React library to display it.
It doesn't matter, but just give me the high level information and let the implementation
decide how to present that.
So you follow the GitHub flavored Markdown specification.
And then for anything else that you want to be custom or if you want to add some sort
of extensibility to the document, then you do it through HTML tags, matchers and renderers.
Is that right?
Exactly right.
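The HTML handler idea described above can be sketched in a few lines. This is a minimal illustration in Python, not the library's Elm API: the `HtmlTag` type, the `handler` decorator, the `signup-button` tag name, and the `cta` class are all hypothetical names chosen for the example. It shows a renderer declaring which tag it handles and which attributes it requires, then receiving the already-rendered children.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class HtmlTag:
    name: str
    attributes: dict[str, str]
    children: list[str]  # already-rendered children, simplified to strings

# Hypothetical registry: tag name -> (required attribute names, render function).
HANDLERS: dict[str, tuple[list[str], Callable[..., str]]] = {}

def handler(name, required):
    """Register a render function for a high-level tag."""
    def register(fn):
        HANDLERS[name] = (required, fn)
        return fn
    return register

@handler("signup-button", required=["message"])
def signup_button(children, message):
    # All styling decisions live here in the renderer, not in the markdown source.
    inner = "".join(children)
    return f'<button class="cta" title="{message}">{inner}</button>'

def render_tag(tag: HtmlTag) -> str:
    if tag.name not in HANDLERS:
        raise ValueError(f"no handler for <{tag.name}>")
    required, fn = HANDLERS[tag.name]
    missing = [a for a in required if a not in tag.attributes]
    if missing:
        raise ValueError(f"<{tag.name}> is missing attributes: {missing}")
    return fn(tag.children, **{a: tag.attributes[a] for a in required})
```

The markdown author only writes the tag and its message; a different site could register a different function for the same tag without touching the content.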
So I said there were three key pieces of providing extensibility.
The first one I mentioned was being able to customize what you render to.
If you want to render to Elm UI or Elm CSS or Elm HTML or a string, you can do all of
those things.
The second one being the HTML handlers we just talked about.
And the third one is, well, Elm is very nice, I think, for working with data.
That's one of the places where Elm really shines is transforming data, working with
nicely typed data.
And well, you parse Markdown and what do you get?
You get data.
It's the Markdown blocks, the abstract syntax tree.
I feel like we need to step back a tiny bit because you talk about parsing and rendering.
So what is this library?
Is it a parser?
Is it a renderer?
Is it something else?
Good question.
Because you have a lot of parts to this library.
I think it's nice to explain how it's split up.
That's a great, great idea.
So is it a parser?
Is it a renderer?
Those are...
Can you use them separately also?
You can.
You can just run a parser if that's all you want for whatever reason.
If you don't want to use the renderer, then you can just...
In fact, I'll give a concrete example of that.
So on my Incremental Elm site, I use the Markdown parser, dillonkearns/elm-markdown,
to parse all of my notes and get backreferences.
We've discussed this before.
So any notes that link to each other...
So if you're on a given note page, if you're on the page, Jeroen's Hierarchy of Constraints,
for example, the most famous note, then you can see other notes that link back to that note.
And how does it do that?
Well, it parses the Markdown and it has nothing to do with rendering.
I'm just extracting data.
So I can run the Markdown parser and I can go look for anything that's a link in there.
dillonkearns/elm-markdown also provides an API for folding over Markdown blocks because
you can have nested Markdown blocks and nested Markdown inlines.
What do we use folding for?
Well, so if you want to look for all links...
Okay, so it's to collect things, for instance?
So in this case, we're looking for all these links.
Well, you could have a paragraph, but you could also have...
There are these types of container blocks.
So you could have a list, which is a container block, and then you could have a nested list
within that list, which is another container block.
And then you could have a link within that nested list.
Well, you don't want to go through and manually traverse that.
You could, but it would be a lot of work.
So the library provides a few helpers to do that traversal so you don't have to do it
manually. You just say, hey, go through all of the Markdown blocks and pass them to
me, and I want to fold them up into this result.
And so it provides that type of helper.
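To make the folding idea concrete, here is a small sketch in Python rather than Elm. The `Node` type and the `fold` and `collect_links` functions are assumptions made up for this illustration and are not the library's own types or helpers. The point is that one generic traversal handles the nesting, so collecting links from a list inside a list needs no special code.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                      # e.g. "list", "item", "paragraph", "link", "text"
    value: str = ""                # link destination or text content
    children: list["Node"] = field(default_factory=list)

def fold(step, acc, nodes):
    """Visit every node depth-first, threading an accumulator through `step`."""
    for node in nodes:
        acc = step(node, acc)
        acc = fold(step, acc, node.children)
    return acc

def collect_links(nodes):
    """Gather every link destination, however deeply it is nested."""
    return fold(lambda n, acc: acc + [n.value] if n.kind == "link" else acc, [], nodes)

doc = [
    Node("paragraph", children=[Node("link", "/intro")]),
    Node("list", children=[
        Node("item", children=[
            Node("list", children=[
                Node("item", children=[Node("link", "/hierarchy-of-constraints")]),
            ]),
        ]),
    ]),
]
print(collect_links(doc))  # ['/intro', '/hierarchy-of-constraints']
```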
It's kind of like Elm Review.
You traverse through an AST, and while that is pretty simple in theory, there's a lot
of things you need to know.
What does this node contain?
Which children should it visit?
So it makes the visitor pretty complex.
So your fold function makes that easy.
It makes it easy.
It makes it so you're not going to make a mistake implementing that and you don't have
to worry about the little details.
So yeah, and Elm is great for traversing these data structures and transforming things and
extracting data.
It just really shines because the types are so explicit and it's so nice to map over and
transform things.
So I really enjoy using these helpers to parse the AST and then extract information from it.
So that's one use case that you can use it for.
Also you can transform the AST.
You can transform the blocks.
As I was saying before, the three different areas that dillonkearns/elm-markdown provides
you with ways to sort of extend the way you're presenting things.
This is the third way, is sort of transforming the Markdown blocks as data.
Because you can, let's say you wanted to take any links that are HTTP and turn them into HTTPS.
Or you could even validate it.
You could turn it into results and then if you're using Elm pages, you could use that
to give a build error because you can sort of take these data sources in Elm pages and
turn a result into something that's going to give you a build error.
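Both options mentioned here, rewriting links and validating them, can be sketched over the same tree. The following is a Python illustration under assumed names (`Node`, `walk`, `upgrade_http`, `insecure_links`), not the library's Elm API. The first function rewrites the parsed data; the second only reports problems, which a build step could then turn into an error.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Node:
    kind: str
    value: str = ""
    children: tuple["Node", ...] = ()

def walk(transform, node):
    """Rebuild the tree bottom-up, applying `transform` to every node."""
    new_children = tuple(walk(transform, child) for child in node.children)
    return transform(replace(node, children=new_children))

def upgrade_http(node):
    """Transformation: rewrite http:// link destinations to https://."""
    if node.kind == "link" and node.value.startswith("http://"):
        return replace(node, value="https://" + node.value[len("http://"):])
    return node

def insecure_links(node):
    """Validation variant: report offending links instead of rewriting them."""
    found = [node.value] if node.kind == "link" and node.value.startswith("http://") else []
    for child in node.children:
        found += insecure_links(child)
    return found
```

A caller could treat a non-empty result from `insecure_links` as a failure and stop the build, or call `walk(upgrade_http, doc)` and render the cleaned tree.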
You could also, I imagine, find script tags and remove them if you allowed parsing them
because I think you have an option for that.
Or you could, if you were really interested in my hierarchy of constraints article, you
could make that bold everywhere you see it.
That's right.
And you could, there are all sorts of things you could do.
You could take level two headings.
If something has multiple level one headings, which would be an accessibility problem, then
you could clean that up in a transformation step.
You could also add slugs to those.
You could, yeah, absolutely.
Or IDs, I guess.
Yep, exactly.
You can add slugs to headings.
You can extract a table of contents.
If you wanted to, you could add a transformation step that actually takes the AST, finds all
the headings, and actually inserts markdown blocks for presenting a table of contents.
But also, if you wanted to do it another way, which actually I tend to go this way, I would
just do a step to extract the headings from the markdown blocks, the parsed markdown blocks,
and then I would take that as a data structure and render that as its own thing, rather than
just prepending a table of contents markdown block structure at the front of the markdown document.
I would just extract the table of contents data and then render it, pass that data to
some Elm code and render it.
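The "extract the headings as data, then render them separately" approach might look like this. It is a Python sketch with a deliberately flattened block representation, tuples of `(kind, level, text)`, which is an assumption for the example and not how the library models blocks. The `slugify` rule is likewise just one plausible choice.

```python
import re

def slugify(text):
    """Lowercase the text and join runs of letters and digits with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def table_of_contents(blocks):
    """blocks: list of (kind, level, text) tuples; returns (level, text, slug) entries."""
    return [
        (level, text, slugify(text))
        for kind, level, text in blocks
        if kind == "heading"
    ]
```

The resulting list is plain data, so it can be rendered as a sidebar, a dropdown, or anything else, independently of the markdown body.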
But there are so many possibilities.
I just really think that Elm shines in this regard.
That's one of my favorite features of dillonkearns/elm-markdown.
There's one interesting thing you said quite a while ago.
You said that markdown is designed not to be able to have parsing errors.
But that is not the case in Elm Markdown.
That's true.
That is actually a future goal.
One of the goals that I have is to clarify the HTML handling in a more formal specification.
So this is...
Oh no, you're going to make a new specification?
Well, so, okay.
Essentially the goal of dillonkearns/elm-markdown is to be a 100% compliant markdown parser,
GitHub flavored markdown parser, essentially, possibly with an option to toggle between
CommonMark versus GitHub flavored markdown.
I'm not sure yet, but 100% compliant with the specification except for HTML handling
because the HTML handling, it's intentionally deviating.
So HTML handling in raw markdown is actually very strange.
You can write an opening tag.
You could write an opening tag like <div style="color: red">.
And then on the next line, you could write some text.
This is with the color red.
And then never write a closing tag.
And it's just going to make that next line red.
And then lines after that will not be red.
Wait, only the next line?
It always throws me off.
I can link to an example in the show notes. I'll
link to a Babelmark example and you'll probably see wide variation in how
this is actually handled.
But I think, yeah, it's essentially going to take the next line and it's going to say,
oh, I didn't find a closing HTML tag, so I'm just going to close it after the next line,
something like that.
But if there's a closing one, like 10 or a hundred lines later, then...
Actually it won't.
I don't think it does.
It will still only be the next line?
And then I think it will consider that closing HTML tag to be a new opening HTML tag.
So you could even write a closing tag like </div> as an opening HTML tag.
Yeah, it's very strange.
I'll include some examples in the show notes.
So the new tag would be of type slash div or something?
I think it would turn it into an opening tag if there's no...
Yeah, but what would be the tag slash div or something?
It would be an opening div tag.
If you just write a closing div tag, I think it will treat it as an opening.
It's very strange.
It's nonsensical, really.
There's no way you will predict correctly what it's going to do.
And if you did, then you might want to get that checked out.
So yeah, it's very arbitrary.
I don't know.
I don't understand the default specification for HTML handling.
But so dillonkearns/elm-markdown doesn't intend to just deal with HTML in the regular way.
So it uses HTML as a way of saying, hey, this is like HTML is valid markdown, but we're
going to treat HTML as a special thing that behaves more like you would expect, where
you have an opening tag, it's going to look until it finds a closing tag.
And then when you do that closing tag, everything between those two is part of that tag, as
you would expect.
That's like an intentional divergence from the specification for dillonkearns/elm-markdown.
But the only place it intends to diverge is with regards to the HTML handling.
So if it doesn't find the closing tag, then you get a parse failure, right?
So that's one thing I'm trying to decide on.
I think there are a few directions that could go.
But what I'm thinking about is you can handle it as a fallback.
So it just treats it as text, as if it was the opening div.
If it doesn't find a closing div, it just says, I'm just going to pretend I didn't see
that opening div.
I'm going to pretend it was just the raw letters, open, left bracket, div, close, and then everything
that came after it, I'll just render as I normally would if there was no opening div.
Which is usually the case with Markdown.
Like when it looks like something Markdownish, but isn't properly formatted, then it's just
plain text.
Which is nice because you might have some non validated user input that you need to
display somehow.
And if it looks a little broken, at least it displays something.
And you might also be like typing in your editor and seeing previews as you type and
you're not quite done.
And you don't want the whole page to just disappear.
You want to be able to see as you're typing so you can preview it.
So those are a couple of reasons that that can actually be like a nice design.
Now another possibility would be to have, I've been considering a possible design to
like catch errors like that and then give you the possibility to treat it like a fallback
or say, hey, take all these warnings and turn them into errors.
So you could either just say, I just want to display something, show me the fallbacks,
don't give me any errors and never parse with any errors.
It will never parse with errors.
Or just give me the errors and I want to deal with them.
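The two modes being weighed here, fall back to text or report an error, can be shown with a toy parser. This Python sketch only recognizes attribute-less tags and is in no way the library's parser. The `strict` flag is a hypothetical name for the "turn warnings into errors" option, which the discussion describes as a design under consideration.

```python
import re

OPEN_TAG = re.compile(r"<([a-z][a-z0-9-]*)>")

def parse_tag(source, strict=False):
    """Return ("html", name, inner) when the first tag is closed, else fall back to text."""
    match = OPEN_TAG.search(source)
    if match is None:
        return ("text", source)
    name = match.group(1)
    close = f"</{name}>"
    end = source.find(close, match.end())
    if end == -1:
        if strict:
            raise ValueError(f"expected closing tag {close}")
        # Lenient mode: pretend the opening tag was just plain characters.
        return ("text", source)
    return ("html", name, source[match.end():end])
```

A live preview would call this with the default lenient mode so the page never disappears mid-edit, while a build pipeline might pass `strict=True` to catch the mistake.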
When you do have a parsing error, do you still get whatever was parsed before somehow available?
Or do you just get an error saying, hey, I expected a closing HTML tag here?
I guess there are different ways you could deal with it.
You could potentially have like a special, when you parse your markdown blocks, you could
potentially parse it into something where you could treat each individual block where
a block could be an error.
It could be a block with an error and then the renderer could be given that information.
So you could render it with red around it and render it with the fallback or so.
These are some areas that I'm not a hundred percent clear on yet, but the goal
is to at least have a way, whatever other decisions are made, to be able to just
render something and know you're going to get something on the screen without errors,
with fallbacks at least.
So right now it does give you an error if you have an unclosed HTML tag.
So you do need to deal with the error case when you parse it, you might have an error
in your result.
It kind of feels weird when you know that Markdown can fail, but it makes a lot of sense.
It's sort of in between these two worlds of like markdown can't fail.
Elm likes to be explicit and know about failures and deal with them.
You've handled everything.
So it's a tricky balance.
I mean, like do you even have like an option to say if a link isn't closed, fail?
Right now the way it deals with that is just with doing the default markdown behavior,
which is part of the specification.
So if you go through, like there are about 1400 tests in this test suite, which are all
of the examples from the CommonMark or GitHub flavored markdown specification documents.
It runs all of them and checks the output and there's a certain HTML output that's the expected one.
So the standard renderer, when it gives you the standard markdown output format, it checks
that against the entire set of all the examples from the specification and those corner cases
around unmatched inlines like emphasis and strong, the bold and italic and that sort
of thing, or unclosed links, they're all explicitly there in the test cases and it handles it
like standard markdown.
Now it could in the future be that it handles it that way, but it gives you the opportunity
to see where these things that probably are a mistake happened.
So you could deal with them or show a warning or present it with an error around it or something
like that.
That's a direction I'm considering, but for now it's just standard markdown parsing.
Yeah, makes sense.
But I do feel like I would never want to publish a blog post where something looks like a link,
but it isn't.
Unless I really want to show, hey, this is how you do a broken link or something.
Now that said, I'm sure there's a markdown linting tool that does that and it's not a
bad idea to just use those too.
So sometimes that can be a reasonable solution, but it is extremely powerful to be able to
do these types of validations in Elm, especially in tandem with Elm pages.
It can be really cool to have very highly customizable validations that you can turn
into build errors.
So we mentioned that we can extend markdown by use of HTML tags.
Do you have any plans of allowing additional parsing features like James Carlson's Elm
Markdown version parses something like math expressions?
Yeah, it parses math expressions and like a poetry syntax.
So I've thought about this idea of extensibility, like we discussed, is very much at the heart
of the elm-markdown philosophy, dillonkearns/elm-markdown.
Now I think that having these HTML handlers is a pretty good tool for extending presentation.
You can do a lot with that.
And I do like, kind of like we talked about that John Gruber's vision of markdown was
sort of, hey, here's this convention I'm using, but I don't want to nail it down into a convention.
I want people to have the freedom to experiment and build their own things.
And then the CommonMark specification was sort of trying to say, well, let's standardize.
This is like something that a lot of people are using.
Then we can build tooling around it and use it in a sort of tooling agnostic way.
Well, you sort of start to lose that benefit if you start diverging from the specification.
If you start doing custom parsing, it kind of goes against those benefits.
So it's not obvious which direction would make sense in that regard.
That said, I do really like being able, like personally, I really like using Wiki links,
which is like this convention that people sometimes use where you put double square
brackets around something and it will, and then you type the name of a note and that
will link to a note, which is different than the standard markdown link syntax, which is
like the name is in square brackets and then the URL comes after that in parentheses.
So the Wiki link, it's used in a lot of sort of second brain, you know, Zettelkasten type
systems to sort of interconnect notes so they can refer to each other.
And I have like some tooling in VS Code, like this Foam ecosystem.
It's a play on like Roam Research, but it's trying to create a similar system in your
VS code using extensions.
And it leverages that syntax.
And I would love to be able to parse Wiki links.
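One way to get wiki links without extending the parser at all is to rewrite them into standard link syntax before parsing. This is a Python sketch of that workaround and not something the library provides. The `/notes/` URL prefix and the slug rule are assumptions for the example, and a plain text substitution like this would also rewrite double brackets inside code spans, which a real solution would need to avoid.

```python
import re

WIKI_LINK = re.compile(r"\[\[([^\[\]]+)\]\]")

def slugify(title):
    """Lowercase the title and join runs of letters and digits with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def expand_wiki_links(text):
    """Rewrite [[Note Name]] into [Note Name](/notes/note-name) before parsing."""
    return WIKI_LINK.sub(
        lambda m: f"[{m.group(1)}](/notes/{slugify(m.group(1))})",
        text,
    )
```

The output is ordinary markdown, so every standard tool downstream still sees a normal link.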
So I've considered like maybe it would make sense to have some way of extending like inline
parsing, but I'm not really sure.
It also complexifies the AST a lot, right?
In the parser.
In the parser.
If you need another inline block or you call it inlines, inline elements.
But like, do you try the parser that you got as an argument at every place?
As a fallback, as a first item, it gets hard.
It's providing a very low level way to hook into the parser, which maybe you like break
something in an irreparable way and you know, Hey, whose fault is that?
Is that the user's fault or the library?
It's your fault at that point, right?
It's yours.
So you have to take responsibility for the user having a good experience, right?
And so where does that start and end?
And if you're providing foot guns or things that become foot guns, you have to really
think hard before you make that call.
So I think, yeah, I'm still on the fence about that.
I guess one thing you can do is try to make the code base really nice so that people can
copy paste your library and extend it.
I mean, why not?
Yeah, it's true.
It's true.
But it's a lot of work.
Well, I think the code base is, I mean, it's, I'm very proud of the test suite in particular,
the automation around that.
That was like probably the biggest, like one of the biggest investments I made was just
making a really solid test suite.
And that runs the entire end to end specification and a lot of additional tests.
Well, it was nice to have a specification in the first place, right?
As a test suite.
That's right.
And yeah, and a lot of people have contributed.
So a big thanks to everybody who's contributed in Hacktoberfest and just in general, there've
been a lot of people really generously contributing some really cool features.
So this is very much a community project.
I try to make the technical foundations and the
automation around testing and all that solid.
And I try to create like a clear, solid vision.
And I think that's like, you know, something I'm really trying to give this project and,
but so many contributions have come from the community.
It's been really great.
So we had, I mentioned earlier, we had a Google Summer of Code.
We finished a lot of the core parsing features.
So the really big headline was so Jin Yi who was working on this project for Google Summer
of Code this year did an amazing job pushing through the nested list parsing, which was
the really big item.
So that's, that's huge.
We've also got some, I'll link to some videos.
I've got some videos where I do some pairing with some contributors to the project.
So lots of great community contributions.
So I think at this point, the only part of the parser that is not implemented,
besides some minor edge cases and bugs, which, you know, it's a giant spec.
I'm sure it's probably never going to be a hundred percent compliant in
every way.
You go to like this Babelmark parser where you can see output from all these different
markdown parsers and there are always little places where things diverge.
But the main thing that's not implemented at this point is auto links, which is
a GitHub flavored markdown feature that will take a bare URL starting with https:// and
turn that into a link.
So that's sort of the last big remaining feature.
You mean you don't have to put it in brackets at all?
Oh, yeah.
And Stephen Redikoff, I hope I'm pronouncing his name correctly, but he has a work in progress
pull request on that.
So hopefully we can get that in at some point, but yeah, otherwise it's pretty compliant.
And then the last big piece is getting an explicit specification for the HTML handling.
But at this point it's a mature markdown parser and those are sort of the big upcoming
So maybe let's talk about the alternatives that we have in the Elm community.
The main one that I think most people know is the elm-explorations/markdown package.
Do you want to explain what the differences are with your version?
So the obvious one is that dillonkearns/elm-markdown is a pure Elm parser.
It's written entirely with the elm/parser library, except for the inline parsing,
like the links and italics and bold, because that is not
a traditional parser of the kind the elm/parser library is designed to work with.
It actually goes from the end of the inline text to the beginning of the inline text and
does a pass adding characters to these lists of delimiters and handling fallbacks
if certain closing elements aren't found, and there are very specific precedence rules.
It's not a traditional parser.
It's very odd.
Seems complex.
It's very complex and it's just not what elm/parser is designed to do.
elm/parser is designed to just go through character by character and fail if something
is not as expected, which is not what markdown inlines do.
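To make the "never fails, falls back to literal text" idea concrete, here is a minimal sketch in Python. It is not the library's Elm code and it is heavily simplified: it scans left to right rather than from the end, handles only single-asterisk emphasis, and ignores the precedence and flanking rules the specification defines. The function name is made up for illustration.

```python
from typing import List, Optional


def render_inline(text: str) -> str:
    """Render single-asterisk emphasis; unmatched asterisks stay literal."""
    out: List[str] = []            # output pieces produced so far
    opener: Optional[int] = None   # index in `out` of a pending "*", if any

    for ch in text:
        if ch != "*":
            out.append(ch)
        elif opener is None:
            # Possible opening delimiter: emit it as literal text for now.
            opener = len(out)
            out.append("*")
        else:
            # Closing delimiter found: rewrite the pending "*" as emphasis.
            inner = "".join(out[opener + 1:])
            del out[opener:]
            out.append("<em>" + inner + "</em>")
            opener = None

    # An opener that was never closed is already in `out` as literal text,
    # so there is no failure case: every input produces some output.
    return "".join(out)


print(render_inline("a *b* c"))
print(render_inline("2 * 3"))
```

The point of the sketch is the shape of the control flow: a delimiter is provisionally treated as text and only upgraded once its partner shows up, which is the opposite of a parser that stops at the first unexpected character.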
Yeah, but it is mostly because Markdown in a way has no spec in the sense that it can't
So you have to fall back to something.
Well, I would say it has a specification for exactly that for the way in which it does
not fail.
The specific ways in which it does not fail to parse.
So yeah, elm/parser is designed to find failures and say, hey, this isn't valid.
Now, Matt Griffith did some really cool work creating fault-tolerant
parsing for his markup library, elm-markup, which is a custom syntax, different from
markdown, and designed to be extensible.
We can link to a talk he gave at Oslo Elm Day.
So he was able to build a fault-tolerant parser that's able to handle
failures gracefully.
So yeah, dillonkearns/elm-markdown is written entirely in Elm, whereas elm-explorations/markdown
is not.
It literally just takes the marked.js library and copies
it into some kernel code.
It was, I think, meant as a sort of temporary thing.
But yeah, that is what powers the Elm Package documentation website.
And it, you know, it does a good job.
It's fast.
It's faster than dillonkearns/elm-markdown.
Although actually Matt Griffith has this great elm-optimize-level-2
tool on NPM that we've talked about,
and he uses dillonkearns/elm-markdown as one of the benchmarks.
And he's optimized it a lot, which is pretty cool.
Do you know if it's faster than the JavaScript version with those optimizations?
I don't think so.
But I'm not sure.
But honestly, it's not really something you're going to notice.
Blinking your eye takes more time than parsing something.
So unless you're parsing a Markdown version of War and Peace, it's probably
not going to be a problem for you.
Although it surprises me that you'd make that argument, since you're the
author of elm-pages and you care about performance.
Well, okay.
It's a very good point, and I do care about performance a lot.
But this is a rabbit hole that we should try not to go too far down.
But I think the proper way to do it, which we actually talked about in our
elm-pages 2.0 episode, is to run the Markdown parser and serialize the AST using the
DataSource.distill functionality in elm-pages, which means you actually don't run elm-markdown
at all in the client.
It's already parsed.
So that helps sort of bypass that problem.
Now, longer term, I think it would be really interesting to be able
to just output static Markdown parts of the page in a way that doesn't require JavaScript.
It just is there as HTML and the Elm virtual DOM is able to adopt it.
That's something I'm exploring, but the broader point being those are the ways that I see
optimizing performance of these things.
This sounds like a rabbit hole.
So let's get away from it.
Most definitely a rabbit hole.
But yeah, so elm-explorations/markdown does not parse into an intermediate AST, which
to me is like one of my favorite features of being able to use a Markdown parser in
So that's one difference.
Another difference is how elm-explorations/markdown handles syntax highlighting.
It's very convenient and inconvenient at the same time.
If you want to include syntax highlighting, you can include Highlight.js (hljs) in the
global scope of the page, and it will pick that up and use that to syntax highlight code
snippets in your Markdown.
But if you don't want to use Highlight.js, if you want to use something else, you can't.
If you want to have more control over it, or to do what I do with
elm-pages, where I use a DataSource to run a tool called Shiki, which uses full VS Code grammars to get
syntax highlighting, and put that onto the page, you can't.
It's limiting.
So pulling in the globally scoped Highlight.js and requiring
it to be on the page is really sort of a hack.
I think it was intended as a short-term solution.
I think dillonkearns/elm-markdown provides a more flexible option there.
But if you just need something that you don't really need much customization on, you don't
want to think about it, you don't want to worry about custom HTML handlers or extending
it, you just want to put out some Markdown on the page, then it's totally reasonable
to just use elm-explorations/markdown.
In that case, it might be overkill to use dillonkearns/elm-markdown if you don't need
any of those features.
So I have wanted to use Markdown for a few years already.
You know where I'm going, right?
I wrote elm-review-documentation, which tries to find problems in your Elm documentation,
which is written in Markdown.
I just made a release.
So I did use elm-markdown.
So tell me about that.
So the issue with elm-markdown is it parses the Markdown just fine.
No issues there as far as I can see.
But it doesn't tell me where the thing I'm looking at is.
Because elm-review needs to know the location of this link so that it can put squiggly
lines under it or fix it, change it.
And that was a problem for me.
So in the end, what I did, following someone else's suggestion, lue-bird, L-U-E bird, was
to use a parser, but a custom one, and then run it iteratively until it finds all the
links and all the sections and stuff like that.
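As a rough illustration of what "finding links along with their location" involves, here is a small Python sketch. It is not the parser used in elm-review-documentation; it is an assumption-laden stand-in that uses a regular expression, handles only inline links of the form [text](url) on a single line, and would also match the tail of an image. The function name and the shape of the returned records are hypothetical.

```python
import re
from typing import Dict, List

# Assumption: only inline links like [text](url); reference-style links,
# nested brackets, and links that span several lines are ignored.
LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]*)\)")


def find_links(markdown: str) -> List[Dict]:
    """Return every inline link together with its source range."""
    links = []
    for row, line in enumerate(markdown.split("\n"), start=1):
        for match in LINK.finditer(line):
            links.append({
                "text": match.group(1),
                "url": match.group(2),
                # Rows and columns are 1-based; the end column is exclusive.
                "start": {"row": row, "column": match.start() + 1},
                "end": {"row": row, "column": match.end() + 1},
            })
    return links


doc = "# Title\nSee [the docs](https://example.com) here."
for link in find_links(doc):
    print(link["url"], link["start"], link["end"])
```

A range like this is what a linter needs in order to underline the link or rewrite it in place, and it is exactly the piece of information that an AST without positions cannot provide.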
So yeah, I'm watching.
That's a lot of work.
It was more work than I expected.
I started from lue-bird's solution.
And then I still had to change quite a lot of things, unfortunately.
So why don't we have access to the range, to the position?
I see.
How can we make that happen?
It would be great to make it more usable with elm-review, yeah.
It's a good question.
I guess one of the challenges is that you could build a markdown block structure that
didn't come from parsing anything.
So that's one challenge.
What do you do there?
And then another challenge is just it kind of clutters up the data types in a way that
makes it a little more inconvenient to use for users for most of their use cases.
So we should discuss it more.
It would be interesting.
And one thing I would definitely like to do is have the AST carry more information,
because that can give you some really cool possibilities.
For example, there's a markdown-based slide tool that I really like called
Deckset, and it gives you a few features that take advantage of places where you
can put metadata within markdown.
So for example, you can do a code snippet as a code fence with triple
backticks.
That gives you a multi-line code snippet.
After the triple backticks, you can put the name of the language that you're
going to syntax highlight with.
So you could put Elm or JS.
So dillonkearns/elm-markdown gives you access to that language, but it
sort of throws away the rest of that line, and it could actually extract that information as
part of the AST.
And that gives you the opportunity to play around with that.
For example, in Deckset, they use that to mark lines that you highlight.
So you could put two dash three and it'll highlight lines two and three.
So at the moment with elm-markdown, you would parse the language as Elm colon two, colon
three or something?
I think it might parse the language as Elm and then discard the rest.
I need to check on that.
Because formally the specification says something about that.
But yeah, things like that can provide some interesting possibilities for extensibility.
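Here is a small Python sketch of what extracting that extra metadata could look like. The format is an assumption made up for illustration (a language followed by space-separated line numbers or ranges such as "elm 2-3"); it is neither Deckset's actual syntax nor anything elm-markdown does today. The CommonMark specification only says that the first word of the info string is typically used as the language and leaves the rest unspecified.

```python
from typing import List, Optional, Tuple


def parse_info_string(info: str) -> Tuple[Optional[str], List[int]]:
    """Split a code fence info string into a language and highlighted lines."""
    words = info.split()
    if not words:
        return None, []

    language = words[0]
    highlighted: List[int] = []
    for word in words[1:]:
        first, dash, last = word.partition("-")
        if not first.isdigit():
            continue  # metadata this sketch does not understand is ignored
        if not dash:
            highlighted.append(int(first))          # a single line, e.g. "5"
        elif last.isdigit():
            highlighted.extend(range(int(first), int(last) + 1))  # e.g. "2-3"
    return language, highlighted


print(parse_info_string("elm 2-3"))
print(parse_info_string("js"))
```

If the AST exposed the whole info string rather than only the language, a transformation like this could live in user code instead of in the parser.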
Now, adding information about the actual whitespace behind things and that sort of
thing, having a concrete syntax tree that would
allow you to parse and then output exactly the same text that was parsed, is probably not in scope, because that's just
not what it's designed for.
But I don't know, possibly having some regions that tell you what point in the file things
came from could be interesting.
One use case for this would be like writing a markdown linter.
The thing is, there are already plenty of markdown linters out there.
But it is, you know, I mean, you never know the possibilities that open up.
And this is maybe a topic for another conversation, but I'm
really fascinated by the emergent dynamics when you create an ecosystem where
people have tools to build off of.
So I think elm-review is a great case study of this: we see all these really
cool innovations because nobody needs to ask your permission.
Nobody needs your help to build it.
There's documentation, there's a nice API and people go build things and then people
install it and publish it.
They can see the errors and fixes in VS Code and in their GitHub Actions and everything.
And things can really thrive because people can just
build things with this nice language, nicely typed information, and a nice platform
for doing that.
So you never know: if you create these types of tools, if elm-markdown had more
capabilities for doing that sort of thing, maybe we'd have a flourishing ecosystem.
I certainly have had a lot of fun doing sort of unexpected transformations like parsing
out back references in my notes.
And there's so much more you could do.
If you look at unified, there's this whole ecosystem.
It's actually hard to keep straight in your head what the different splits of the
tool are, as with many things in the JavaScript ecosystem.
Things are split out into these micro composable pieces where you're like, okay,
which part is responsible for what again?
But anyway, there's this ecosystem of Markdown parsing and transforming
plugins in the React ecosystem, remark and unified.
And I'll put a link to it in the show notes.
And you know, it's really cool.
Like you have all these plugins that people publish.
And I think it would be really cool to see that sort of thing with transformations for
dillonkearns/elm-markdown blocks, where you could have people publishing packages
for gathering up a table of contents.
I have that in the examples, and I use it all the time, but it would
be nice to package it up so people can use it, and to create a little ecosystem of
cool Markdown transformations, because Elm is very good for that, you know?
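A table of contents is a good example of how small such a transformation can be once the document is plain data. The Python sketch below uses a made-up block shape (dictionaries with "type", "level" and "text" keys) and a deliberately naive slug function; neither reflects elm-markdown's actual types.

```python
from typing import Dict, List

# Hypothetical block shape, used only for this sketch:
#   {"type": "heading", "level": 2, "text": "Getting Started"}
#   {"type": "paragraph", "text": "..."}


def slugify(text: str) -> str:
    """Naive anchor: lowercase words joined by dashes."""
    return "-".join(text.lower().split())


def table_of_contents(blocks: List[Dict]) -> List[Dict]:
    """Keep only the headings, adding an anchor to link to each one."""
    return [
        {"level": block["level"], "text": block["text"], "anchor": slugify(block["text"])}
        for block in blocks
        if block["type"] == "heading"
    ]


blocks = [
    {"type": "heading", "level": 1, "text": "Guide"},
    {"type": "paragraph", "text": "Some introduction."},
    {"type": "heading", "level": 2, "text": "Getting Started"},
]
for entry in table_of_contents(blocks):
    print("  " * (entry["level"] - 1) + entry["text"] + " -> #" + entry["anchor"])
```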
So yeah.
So is this the right place to segue into elm-markdown-transforms?
Yeah, I think so.
So yeah, that's a really cool package that our friend Philip built.
Philip and I had a lot of conversations about these ideas that
he had for a different way to approach the renderer in elm-markdown.
We figured it would make sense for him to explore it in a separate package, but
it's sort of proven to be a very good approach that I would love to fold back into the core
So that's something we're exploring right now.
So what does it do?
What's the goal of that package at the moment?
That package.
So, and maybe we should talk first a little bit more about the renderer functionality
because I think we haven't fully covered that.
So how the renderer works is you give it a big record that says how to handle
a strikethrough, a paragraph, a block quote, emphasis and strong, which
are italics and bold.
And you just give it a function that says, here are the
rendered children of this thing, how do you want to render a paragraph?
And you could render it with elm/html.
You could render it with elm-css.
You could render a paragraph with elm-ui.
You could render a paragraph as plain text: oh, a paragraph,
you just put a new line before and after, for example, whatever it might be.
Am I right in understanding that it's pretty much just a fold?
The fold that we spoke about before.
But made in a way that is more easily extensible, or where you already have default options
for that.
The elm-markdown package ships with a default HTML renderer that you can just use to get
the standard HTML output.
So if you don't want to customize your renderer at all and you want the standard behavior, where
lists render to just a list element, and they have a certain class on them if
the starting number is a certain thing, and that sort of thing, then you can just do that.
So if you didn't have this render function, then people would have to do a fold with a
big case expression or two, one for the blocks and one for the inlines, I'm guessing.
They'd do some big recursive call traversing through the blocks. Yeah, it would not
be fun.
But yeah, it's essentially, it gives you the ability to render out to any custom format
and any data type.
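The "record of functions plus a fold" idea can be sketched in a few lines of Python. This is a language-agnostic illustration, not elm-markdown's API: the tuple-based AST, the field names, and the two example renderers are all assumptions, and the HTML renderer does no escaping.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, TypeVar

View = TypeVar("View")

# Hypothetical AST for this sketch: ("text", "hi") for leaves, and
# ("emphasis" | "strong" | "paragraph", [children]) for everything else.


@dataclass
class Renderer(Generic[View]):
    """One function per node kind; each receives already-rendered children."""
    text: Callable[[str], View]
    emphasis: Callable[[List[View]], View]
    strong: Callable[[List[View]], View]
    paragraph: Callable[[List[View]], View]


def render(renderer, node):
    """The fold: render the children first, then hand them to the renderer."""
    kind, payload = node
    if kind == "text":
        return renderer.text(payload)
    children = [render(renderer, child) for child in payload]
    return getattr(renderer, kind)(children)  # look up the field named after the node kind


html = Renderer(
    text=lambda s: s,
    emphasis=lambda cs: "<em>" + "".join(cs) + "</em>",
    strong=lambda cs: "<strong>" + "".join(cs) + "</strong>",
    paragraph=lambda cs: "<p>" + "".join(cs) + "</p>",
)

plain = Renderer(
    text=lambda s: s,
    emphasis="".join,
    strong="".join,
    paragraph=lambda cs: "\n" + "".join(cs) + "\n",  # a new line before and after
)

doc = ("paragraph", [("text", "Hello "), ("strong", [("text", "world")])])
print(render(html, doc))
print(repr(render(plain, doc)))
```

The traversal is written once, and swapping the record changes the output type without touching it, which is what makes rendering to HTML, plain text, or something more exotic equally easy.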
And I mentioned these different view types like HTML and elm-ui
elements, but you can get way more sophisticated than that.
So if you wanted to render to a Result, so you could have a failure at any point,
you could do that.
You could also render to a function. Why would you render to a function?
Well, if you wanted to display something interactive.
So B. Burdette has this really cool example of a little Scheme-like Lisp
evaluator that he built.
And he built a little implementation of these markdown cells where you can put
a little cell using an HTML tag, cell, and you can put data into it and you
can interpret Lisp expressions within them in your view.
And it does that because it renders to a function, and you can pass in an interactive
model and it can update that model.
So you can have stateful things that you render, so the sky's the limit.
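To show what "rendering to a function" means, here is a tiny Python sketch in which every node renders to a function from a model to a string instead of to a string directly. The counter tag, the dictionary model, and all the names are hypothetical; this is not the Scheme example mentioned above, just the same idea at the smallest possible scale.

```python
from typing import Callable, Dict, List

Model = Dict[str, int]
View = Callable[[Model], str]   # each node renders to a function of the model


def text(content: str) -> View:
    return lambda model: content


def counter(name: str) -> View:
    # Hypothetical custom tag, e.g. <counter name="clicks"/>: its output
    # depends on the model that is passed in later, at view time.
    return lambda model: str(model.get(name, 0))


def paragraph(children: List[View]) -> View:
    return lambda model: "<p>" + "".join(child(model) for child in children) + "</p>"


# "Rendering" happens once and produces a function...
view = paragraph([text("Clicks so far: "), counter("clicks")])

# ...which can then be applied to whatever the current model is.
print(view({"clicks": 0}))
print(view({"clicks": 3}))
```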
This was actually my first introduction to elm-markdown.
I didn't even know elm-markdown for anything else when I was introduced to this.
So yeah, my first introduction to elm-markdown was, hey, you can take Markdown and you can
make interactive things with it, which is not the main purpose, but it does
highlight the potential of the library.
To me, that's what's really interesting.
Like I said at the top, the three interesting
things, I think, are: you can parse to the AST and manipulate it in Elm, which is awesome,
and fold over it and all that; you can customize renderers; and you can add these HTML handlers.
And the customizable renderer output is amazing when you consider that you can render to any
type, and that includes functions that depend on your model or that take in a dictionary
that has state, so it can insert things into it and update things and remove things.
That also includes, and I think we mentioned this a little bit in the elm-pages 2.0 episode,
rendering to a DataSource.
And if you render to a DataSource, the sky's
the limit.
You could have a build error if there's an invalid link, right?
So you can do validation-type things and turn that into a build error with a DataSource in
elm-pages. You could take a link and use a
DataSource to follow that link and make sure it's valid.
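The validation idea can be sketched without any build tooling by rendering to a success-or-error value instead of to a view. In the Python below a result is modelled as a ("ok", value) or ("err", message) tuple, and the rule for what counts as a valid link is an arbitrary assumption for the example; a real setup would decide that differently, for instance by actually following the link.

```python
from typing import List, Tuple

# Assumption: a result is ("ok", rendered_html) or ("err", error_message).
Result = Tuple[str, str]


def render_text(content: str) -> Result:
    return ("ok", content)


def render_link(url: str, label: str) -> Result:
    # Hypothetical rule: only absolute https links and site-relative paths pass.
    if not url.startswith(("https://", "/")):
        return ("err", "Invalid link: " + url)
    return ("ok", '<a href="' + url + '">' + label + "</a>")


def combine(results: List[Result]) -> Result:
    """Succeed only if every child succeeded; otherwise return the first error."""
    parts = []
    for tag, value in results:
        if tag == "err":
            return (tag, value)
        parts.append(value)
    return ("ok", "".join(parts))


print(combine([render_text("See "), render_link("https://elm-lang.org", "Elm")]))
print(combine([render_text("See "), render_link("htp:/oops", "this")]))
```

Because the error travels through the same fold that produces the view, a single bad link anywhere in the document turns the whole render into a failure that a build step can report.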
You could also do a single pass for everything like extracting data and rendering.
I built elm-markdown
as a yak shave while building elm-pages.
I wanted a tool where I can put something in
GitHub and it's going to present it in a reasonable way.
And I can share a link that says, edit this page on GitHub.
And people can do that and see the preview, and it's familiar Markdown syntax, but I want
it to be extensible.
I want to be able to validate it.
I want to be able to use all my Elm-fu on it.
So I think of it as setting up the machinery for
validating and rendering your views, and you're just doing that with Elm.
So you can do all sorts of stuff with that.
So, so that's renderers.
So Philip's library, elm-markdown-transforms, provides a way to traverse over
things and map them.
I'm still working on distilling it down to the
core of what it is.
It's one of those things that really hurts your brain, but it's incredible what you can
do with it, because you have much more ability to map
things and transform them in multiple passes.
And if you try, with just vanilla dillonkearns/elm-markdown, to render things
to a DataSource, it's not so easy.
It would be a lot of work, and this removes some steps from that.
So I'm working on integrating the concepts from this library into the core API, because
I think it's a really good approach.
It's basically a nicer renderer.
It's the same concept, except instead of a big record of functions for how to do each
thing, you render to a custom type.
So yeah, it's definitely worth checking out what Philip has done there.
I'll link to it.
It hasn't been updated yet for the latest elm-markdown and elm-pages, but I actually
have a local fork of that.
Maybe I'll just make a pull request to his package and we can get that shipped
Dillon promises you that it will be released before this episode is released, right Dillon?
Yeah, absolutely.
Well, so, you know, so much of this stuff, like there's so much you can do.
And part of it is like when it comes to just like dealing with the Markdown blocks and
transforming them and extracting table of contents and stuff, it's pretty straightforward
in a sense.
But then there are all these subtleties of the parsing specification and
the way the renderer is designed and how you transform things, so I feel like
I'm constantly tweaking and experimenting in my own little
public repos, in a little local fork, in order to try
these different approaches, because there's so much you can do with
There are so many ways you can approach this design, but I'm always sort of iterating on
trying to find a way to, to integrate some of these more powerful things into the core
in a way that's intuitive.
So all in all, elm-markdown was a fun little hobby project in order to have something
nice for elm-pages.
Is that right?
Yes, it was.
I built it.
I built the initial prototype on a flight to India, and flights to India are very
long from California.
So yeah, I built the initial proof of concept on a flight while I was working on some
elm-pages stuff, and I just really wanted something that operated this way, and it was absolutely
a yak shave.
Which is part of why I tried to build it in a way that was hopefully
smooth and easy for community members to contribute to, and it's been really cool to see that
shape up, because I knew that there were too many features that I wanted to build between
elm-pages and elm-markdown to build every single part of the parser myself.
So it's been really cool to see that come together.
And yeah, it's like I was saying earlier, things get really exciting.
If you build one feature, it's like you give someone a feature and they'll
be happy for a day.
But if you teach people how to build a feature, then they'll be happy for a lifetime.
But what do you do in the meantime?
If you teach someone, you don't have to fish anymore.
Or do you?
Oh, there's always plenty of work to do.
I don't think that's going to be a problem.
Same here.
More features to do.
Always more features.
All right.
So where can people go if they want to get started with elm-markdown or to contribute to
elm-markdown?
Good question.
So, well, I wrote a blog post announcing it.
I don't know when, a year, a year and a half ago, something like that.
Two years ago.
Oh my God.
Time flies.
What is it called?
Extensible Markdown Parsing in Pure Elm.
So I actually gave it a reread this morning.
It's a pretty short post, but I think it summarizes the goals pretty well.
So check that out.
Check out the repo.
Check out the examples.
And we'll leave some links to some of the really cool things people have built with
So Philip built a very cool interactive sort of like, you know, click to edit like graphics
Like it's sort of changing the color and shape of these objects on the screen.
And you can click to apply or deselect these transformations and see how it changes what's
on the screen.
So that's a cool thing he built with elm-markdown on his website.
The Scheme Interpreter example is like a very cool use case for it.
I'll link to some of the stuff that I've done like in my incremental site.
I've been having a ton of fun using it to parse things and validate things and map things
into crazy transformations.
So yeah, check those things out.
Join the Markdown channel in the Elm Slack.
Let us know what you're doing.
I'd love to hear about it and feel free to ask questions there.
And if anybody is interested in sharing some of the things that they're working on in their
own personal projects, their own yak shaves, their own cool Elm experiments, Mario Rogic
has shared a link for a Google form that he created for the Elm Online Meetup.
So it's been really cool to see the great talks coming out of that meetup.
Or is it a meetdown?
It is a meetdown.
It's a meetdown.
It is a meetdown.
So if you're not familiar with it yet, we'll drop a link to the website for the Elm online
Check it out.
There's great stuff happening there.
And it would be great to see more people submitting talks.
There have already been a ton of great ones recently and they're getting posted on YouTube
So yeah, check that out.
Submit something cool.
And Jeroen, until next time.
Until next time.