Writing an elm-review Rule

We walk through how to write an elm-review rule from scratch. Also, how to test your rules, and how to write an automated fix.
April 26, 2021

Getting started

Some repos to look at for inspiration

The elm-review package docs are very through and worth reading


Hello, Jeroen.
Hello, Dillon.
And once again, we're back for more Elm Review.
I was building an Elm Review rule
for the first time recently,
and I was thinking it would be nice to do an episode
just doing a deep dive on not Elm Review
as a consumer of rules,
but writing your own Elm Review rules.
So I noticed that for your Elm Review rule,
which is called Elm Review HTML to Elm, which is very cool.
So if you haven't heard about it before,
what it does is it looks at your Elm code.
It looks for debug to do calls
in which you have pasted HTML strings,
so the less than HTML,
bigger than with spans and attributes and stuff like that.
And then it transforms it to Elm code, Elm View code,
Elm tailwind modules, which we talked about before,
or just regular Elm HTML or Elm CSS.
I don't know what you support at the moment.
Yeah, what it does is it,
I think I have checks for Elm CSS or regular Elm HTML.
And I mean, it was a really cool experience writing the rule
because I was able to very easily hook into a few places
and say,
like, you know, then I've literally just got this Elm data structure
that represents all of the imports
and what it's the imports aliased as.
So if it's import HTML.attributes as adder exposing div,
then I actually take that
and I use that import alias for the auto generated HTML.
So if you have that import,
it's not exposing div, it would be exposing class.
So if you have import HTML.attributes as adder exposing class,
when you have HTML with a class attribute,
it's going to use the unqualified name class.
And when you do, you know, ID,
it's going to do and use that alias.
So like, it's really cool because, you know,
writing an Elm review rule,
you can go to a few places
and just kind of collect the data you need.
And it was so easy to, you know,
just take the kind of code generation that I wrote for basically,
like, originally I wrote this online generator
that you paste in some HTML
and it generates Elm code for that.
And then you can configure your import aliases
and exposing and all that.
But it was so easy to just basically hook into the AST
and give me the imports just as an Elm data structure.
And then I just kind of use that as configuration for my role.
So it was a really good experience.
And particularly, it was very comforting
to be able to just write a unit test
and iterate on it until it's green.
I actually wonder whether you should call those unit tests
because there's so many checks done under the hood.
That's a good point.
Yeah, it's more of a sort of integration test, isn't it?
But it's written using just plain old Elm test, which is awesome.
So I guess that qualifies as unit test, maybe?
It's a potato, potato.
What is Elm program test?
Is that an integration test in a way?
I think of it as an integration test.
All right, then I think that's close to it at least.
So for Elm review rules, what concepts do we need to define
and lay out for writing an Elm review rule?
First of all, you're writing Elm code
using jfm angles slash Elm review.
So that's an Elm package, and it's also an NPM executable CLI.
So you use the Elm package, you define a review rule,
and then basically that gives you a few places
where you can describe which parts,
where you want to match and mark a rule.
And it also allows you to hook into these visitors
and get context about a module or a project.
Is that like a fair highlight or something?
It's in a way oversimplified, but I guess it works, yeah.
Are there any key pieces missing?
Yeah, yeah.
So just first of all, if you try to write an Elm review rule,
you should know about the fact that the CLI gives you
two commands, which are Elm review new package,
new dash package, and Elm review new dash rule,
which will help you get set up.
Especially the new package is useful,
but if you add a rule to your configuration
at your work project, then new rule is already plenty useful.
Okay, so before we talk about ASTs and those details,
let's say you're starting a new,
you want to write an Elm review rule at work.
You want to check that a particular convention is followed.
You want to check that, let's just take an example
that you don't want to use a color function
for your styling for Elm UI or for Elm CSS or something,
unless it's defined in a color palette module, let's just say.
Yeah, so you don't want to call CSS.RGB?
All right.
Yeah, so you want to enforce in your company
that that convention is followed with a review rule.
So you're saying your first step is
you install the Elm review NPM package,
and then you say Elm review new rule from the CLI,
and then that gives you what?
That gives you a folder.
Where does the folder go?
So it asks you for the rule name,
which is formatted as an Elm module name,
because in practice it will just be an Elm module.
So let's actually kind of pretend that we're building this
and sort of live pairing on this.
What are we going to call our rule?
It's like no direct color.
No custom color, maybe,
because you want to use a color from your palette.
Yeah, I'd go for something like no custom color
or no color outside of, yeah, no custom color is good.
Okay, and you have some guys,
so the package docs for Elm review are really helpful.
They're very thorough,
and I definitely recommend leveraging them
because there's a lot of good stuff there.
You've clearly put a lot of thought into that.
And you talk about naming conventions for that.
What's a quick high level advice for how to name something?
I usually try to always name things with no something
because often you want to forbid something.
So make it easy to understand
what you're trying to achieve just with the rule name.
If you tease some rule in the configuration,
they should have a clue as to what it should do.
So for instance, instead of saying something like
use color from color palette module,
it's clear to say no custom color.
So yeah, also the advice that you would give
may depend on situation.
So sometimes it will be use a color from this module
and in this case use color from this module.
When you multiply the advice that you would give,
that's a bit harder to get into the rule name.
So that's why no something is usually faster.
Right, and also like writing an Elm review rule
for something that applies sometimes but not always
is not a good idea, right?
You shouldn't do that.
You want to enforce,
because it's going to mark it as an error.
Or magic comment.
So you don't want to like,
there's no escape hatch like that.
You can't ignore a specific file.
But in general, that would be a smell
that you're not using it as intended, right?
Yeah, or it has to be an exception
that is built in the rule.
So for instance, calling color.rgb or css.rgb
is fine in this particular palette module.
You can encode that into the rule.
Yes, exactly.
Or maybe you say,
here's a function that is from palette
that's an escape hatch.
So you do palette.I'm a bad person
and I'm using a custom color.
And you say, okay, well, we're going to ignore
if they use that or something like that.
Yeah, so yeah, that's a good distinction.
You want to be able to, within the rule
in your rule, figure out if something is okay or not
and not have ambiguity.
So where does it create that folder
when you do elm review new rule?
So what it does, it creates a source file.
So it's source slash no custom color dot elm
and a test file.
So test slash no color module test dot elm.
It's in the current folder,
so you should move to the review folder
in your project before you do that.
Okay, cool.
So that's the convention you typically use,
is just to have your custom elm review rules
that you write for your project
living alongside your source code?
Yeah, it lives along the source code
for the review configuration.
It doesn't live next to your regular source files.
So the review folder is its own elm application.
So it has an elm JSON, it has source files,
and potentially test files, as we will show now.
So the dependencies of the review project
doesn't mix with the dependencies
of your regular project.
But in this case, wouldn't you have to add
your source folder as a dependency
of the review project,
because you've defined your review rule
in your source directory, in your src folder?
Review slash src.
Oh, review slash src.
Okay, got it.
Thank you for clarifying.
Okay, so that's the convention you use.
That seems like a nice convention.
Okay, so your convention that's sort of built into
the elm review CLI sort of scaffolding command,
elm review new rule,
it's assuming this convention of your local project
review rules that you define
as your local package source.
Yeah, it's kind of a convention,
but it's also just the more practical version.
So you already have a source slash review config
dot elm file, which is your configuration,
which is the main for elm review,
and you use rules from packages.
So just adding a new source file for a rule
is actually very easy from the design
point of view.
Yeah, that makes sense.
And it probably would potentially get weird
to clutter up actually having access
to your actual source project.
Oh, yeah.
The top level source,
because then you could be in this interesting gray area
where you can sort of scan the source code
using elm review to mark errors,
but you can also just use code
that you're scanning, which is a weird sort of mix, right?
Yeah, maybe you would go into meta programming or something.
But yeah, it is something that I thought about first
because you know you've got dependencies in your elm JSON,
you've got test dependencies.
I was thinking I should just add a new section
called review dependencies.
But it turns out any time that the compiler
adds a new dependency,
it removed all the review dependencies,
so I had to think of something else.
And this current design is actually pretty neat.
It's annoying that you have a folder,
but it's fine.
I think it makes sense.
Yeah, I think it's nice to find things in there.
Yeah, I like that.
Okay, so that's if you're doing
a sort of project specific rule,
which you could later extract out to a package
and publish it if you wanted to,
but I did it with a package
because it wasn't something I wanted for one project.
It was something I knew I wanted to publish as a package,
so I did elm review new package.
So okay, so I think we've sort of set up
how you would get started
and where your review rule lives and all that.
So you were starting to talk about ASTs.
So what do we need to know about ASTs
for writing our own elm review rule?
So what is an AST?
An AST stands for Abstract Syntax Tree.
So it is a data representation of your code.
So often it's a tree.
Well, it is always just a tree of nodes,
and each node describes a specific part of your code.
So for instance, if you have A plus B in your code,
that is an expression,
addition, so it's the application of the operator plus,
which has two nodes, a left one and a right one.
And the left one is the expression one,
which is a int expression.
I don't remember what it is exactly.
Yeah, expression.integer.
And everything you're talking about,
you're talking about a general concept
of an abstract syntax tree,
but for the context of elm review,
we're talking about the still4m slash elm syntax,
which is an elm package,
and we're talking about this AST data structure
is just an elm custom type.
So for example, you're talking about
an addition expression,
and you're talking about the integer expression type,
like a literal integer.
So the integer, if you have one, two, three,
a literal integer,
then that is,
if you go to the still4m slash elm syntax package,
go to the elm.syntax.expression module,
it defines a type called expression.
It's just type expression equals,
and then a bunch of things
that an elm expression can be.
And one of those things is integer,
which has an int.
So type expression equals integer int,
or et cetera, et cetera.
And so if we were looking at that part of the AST,
which is the number one, two, three,
that's what we would have.
So what an AST does is it gives you
what the code kind of means.
It doesn't necessarily give you how it is formatted.
In this case, we do get the positions
of every expression on every node,
but it's not always enough to reformat it as it was before.
I think we touched on that
in maybe our original elm format,
original elm review episode that we did,
where we had a concrete syntax tree
and an abstract syntax tree,
where an abstract syntax tree
sort of has the information
to represent what your program means,
whereas a concrete syntax tree
has the information to represent
what your program looks like on the file
in a more structured format.
So with elm review,
we have a more abstract representation
which is whitespace agnostic.
So I think,
so I now maintain the elm syntax package with Martin Stewart,
and I think we will get it more and more
to a concrete syntax tree.
Interesting. Cool.
Because, yeah,
we need the information of the position to do better fixes
and to do better reports,
and some of them are missing,
and the code is actually formatted.
Yeah, so one of the core concepts
that comes up a lot in elm review
is the concept of a range,
which is literally just,
like you can go to the module,
which we'll link to in the show notes,
which is just the module elm.syntax.range.
It exposes a record type alias
where a range is a start location
and an end location,
meaning the row and column that it starts at
and the row and column that it ends at.
And so when you parse,
like the AST that you get access to in an elm review rule
gives you a range,
and you can say what part you want to underline
when you're marking a problem.
So if we were doing,
if we said css.rgb,
we could underline css.rgb
and then the list of arguments
that are being applied to that function, for example.
And those would be different ranges
that we would have access to from that AST.
Okay, so...
So I think we covered the AST.
I think that covers AST.
Let's sort of continue our pairing session
on this no custom colors review rule
that we're writing
as an internal project specific review rule
in our project.
So we did elm review new rule.
What's next?
Like, what do we...
I think we might have
a special phantom builder compiler error
at this point if we try to run our rule
after we did elm review new rule, right?
Yeah, so what the rule starts with
is rule.newModuleRuleSchema,
then between quotes the name of the rule,
and then the context,
which we'll go into later,
which is not needed for this case, I think.
So it's just a unit.
And then you've got a common saying,
add your visitors here.
So that's where you will have to add your visitors.
And then a call to rule.fromModuleRuleSchema.
So that concludes the building of your rule.
Add this and this, visitors,
and then fromModuleRuleSchema,
and that gives you your rule.
Right, and all of the visitors
that you apply for your rule,
so you're basically just defining
a set of types of things you want to visit.
You say rule.with simple import visitor,
and now you can...
So this is a really important concept
for building elm review rules,
what module context is, or project context,
and understanding the concept of a visitor.
So basically,
if you were to go try to emulate
the functionality of elm review from scratch,
then you would probably create a little elm app
that is run by a Node.js script
to make a headless elm application,
and then you're going to read a bunch of files.
You're going to read all the elm files
in your elm source directories,
and you're going to pass those files in
so that you have them in elm,
and then you're going to use the elm syntax package
to parse all those files.
So then you would do a bunch of pattern matching
to figure out what's going on.
If you have no custom color rule
using that sort of hand built AST traversal,
then what we're doing is we're just going to
just load in all of the files,
parse the whole AST of every module,
and then we're just going to go through each
and mark a problem if we see css.rgb called,
and the module is not our palette module.
So we're going to detect that css.rgb was called.
Where do you go in the AST to find the information?
And that's where visitors help.
Yes, right, exactly.
Okay, so this is...
The reason I'm talking about what it would be like
to build something from scratch
just for our listeners is because, as I say often,
I think that to understand a particular design in an API,
it helps to understand why was this design done.
Otherwise, it just seems like magic.
So to understand why it's done,
it helps to talk about how would you build it from scratch,
and then you realize, oh my God, this is super convenient
that I don't have to do it from scratch,
and this is why the design is done this way.
So essentially what I'm trying to do is to show
that we would essentially have to build our own visitor.
So by doing it from scratch, you've got this AST,
which, what is it at the top level?
Is it like a bunch of modules that you've parsed?
You've got these modules which are then like
a list of top level declarations or something like that?
You mean the declaration list?
Yeah, that's what it would parse into, right?
Oh yeah, you would have, for each file,
a record which contains the module declaration,
a list of declarations.
Right, and then those declarations would have expressions in them, right?
Yeah, well, there are types, and there are functions,
and functions have expressions.
Right, so basically when we're writing our no custom colors rule by scratch
using our own hand built Node script and Elm code,
we need to find all those places where expressions occur.
So we need to go in and find all those places
and iterate over them.
But with Elm review, you just say rule.withExpressionVisitor.
And now what you've done is you've gotten access to every expression.
It's just going to get passed into your function,
and you can sort of do a pattern match on that
and say if it is a function application expression,
and which is a fancy name for a function call,
is this specific function I'm looking for,
in which case you're just destructuring the list,
which is like the module namespace.
In this case, it's just CSS as the first element,
and RGB as the second one.
And you check for that, and then you say,
if the module I'm in is palette, then that's fine.
And otherwise, now you need to do something.
But it depends on how you want to make the rule work.
So how do you imagine it to work?
I would imagine that if I call CSS.RGB,
let's say the rule takes a configuration option,
which is the name of my palette module.
And if I have a call to CSS.RGB
anywhere outside of that palette module,
then it marks that function application as an error.
As I said before, you just want to match for expressions.
You can go for a simpler expression,
which is the function of value.
So you don't really care that it's a function call.
You just want to match any CSS.RGB functions.
Because potentially someone reassigns that function
to another name, and then they use that name somewhere else.
I see.
So they could be calling a function,
but that function is an alias somewhere.
Yeah, in a way it's an alias.
In other words, when they assign it to an alias,
that's not a function application.
And then when they do the function application,
it's applying a different function.
So you're saying you would forbid an expression
that even refers to the function or value CSS.RGB.
That makes sense.
So that would be a more robust way to do it.
And this is sort of the mindset you have to have
when you're writing Elm review rules.
From hearing you talk about it and doing it a little myself,
you do have to think about these corner cases
to a certain extent, and you try to abstract that away
as much as is reasonable.
But to a certain extent, you're doing static analysis.
You sort of have to think about some of these details.
The false positives and all the false negatives.
You don't want the rule to report more things than necessary.
You don't want to annoy your user
without needing them to do anything.
And you want to catch everything
that you're supposed to catch.
But in this case, there's a simple elegant solution,
which is just you can't even refer to this function,
and then we don't have to get fancy
keeping track of context
if we needed to.
If for some reason we needed to check
that you can refer to this function,
but you can't invoke it,
we could do that by basically building up context
in one of our visitors, right?
So the next step, once we have this rule
that detects RGB functions,
is to not report an issue when you're in the palette module,
the other module that you gave as a configuration.
One is you can just ignore the rule for this module.
Like manually, you would say
rule.ignoreFile or something like that?
Yeah, ignoreErrorForFiles,
and then you would pass that in.
And you could do that in your rule definition.
You don't have to do that in your review config.
So that's the easy case
when you just don't want to report it for a single file.
If you do something more clever,
then you will need to use a context.
That makes sense.
What is your idea of a context?
What do you gather from it?
What I think of a context as
is one of those cool things
that I love seeing in the Elm world,
which is it's sort of whatever type you use there.
So if your context is a unit type,
that's what the type is.
If the data that you need to build up
is a list of places where something is invoked
and you need the range,
you need to grab the range
so you can mark an error in those places,
then that would be the context you need
to later report the error.
Or if you needed to keep track of,
like for example, for my HTML to Elm Elm review rule,
the context I need to get is the configuration
of your import aliases and exposed values
for HTML, HTML.attributes, that sort of thing.
Yeah, so in a way, it's all the information
that you need to gather
to better report an error
or to better not report an error
that you need to report errors or not.
Right, because there are sort of like,
it's actually possible to give an error
in a visitor directly, right?
Or you can update the context
and then check the context at the end
in a final step and report errors based on that, right?
Yeah, so what you're referring to
is the final evaluation.
You report errors after you're done
looking at the entire file.
So I want to report when this function
is not used in this module.
Well, you can't know that until you've looked
at the whole file.
So that's why you need a final evaluation.
Yeah, so in a way,
the context is very, very similar
to a model in the Elm architecture.
So you can kind of see
the function. So you get a new message,
which is the new node, the expression,
or the declaration, or any kind of node
of the AST that you see.
And then you return some errors,
which is the commands.
Yeah, right.
And or you return a updated context,
which is your model.
So, oh, I've seen an import.
OK, now I need to know that class has been imported
and is in scope.
So it's a model and commands,
but model is the context,
and commands are the errors that you report,
except that the order is the other way around.
Yeah, yeah, but it's still a tuple,
and one of the things is saying,
do something.
In this case, the only thing you can do
is report an error or a fix, I think.
The only things that can do fixes are errors.
You can either, right,
you can report an error,
which either is going to warn you about an error,
and it optionally may recommend a fix.
And the second thing in the tuple
is updating the state that you're tracking
as you run all your visitors.
So, yeah, it's really cool.
So basically, every visitor that you apply,
so you start your rule with your rule builder,
and then you give it your initial context,
which in the way you're framing it,
that's your init for your initial model.
You actually don't have any initial commands of errors
because that would just be weird
before you even start, like,
whoa, they're like,
you haven't even looked at my code.
Don't tell me there's an error.
That would be rude.
Well, by the time that this episode airs,
I will have released
a new version of the code.
If you pass in a bad configuration,
before looking at the code,
we can say,
hey, there's a problem with what you gave me.
This doesn't make sense.
This is not proper configuration.
That makes sense.
So you can validate the actual options
that are passed into your rule.
That's great.
Well, validate or parse it.
And then, so that's the initial way
that you start the builder.
You have chained function calls,
so then you apply your visitors.
So rule.withExpression, enter visitor,
and then that visitor,
they all have to be of the same module context type
or project context if you're doing a project visitor.
And so every single one you chain is going to be,
because that's your model type.
Just like the update function,
in your update function,
you have a particular model type.
In the subscriptions function,
you receive an argument of that model type.
So all those types snap together.
It's the same with these visitors.
They all have the same context type
that they either receive or update,
receive and update.
So we've gotten this far,
and we've talked about writing a test,
but we haven't really talked about what that looks like.
So let's continue our pairing session here,
and we're going to write a test
for our no custom color review rule.
So when we did elm review,
new rule on our command line,
that generated a test module called tests,
which is in our review slash tests folder.
Yeah, okay.
So then, and that gives us sort of like a starting point,
we get the elm package documentation
in the elm review package.
There's a review dot test module,
and it's got really good documentation
on just kind of the basics,
but the basic idea is,
you can give it elm source code as a string.
You just define a string.
So let's continue pairing on that.
So we would say like module navbar view.
Navbar module, exposing view,
and then we would say view,
and we would write some valid elm syntax
that uses CSS dot RGB,
and so we would expect there to be a failure there.
And so then we can use the elm review test helpers
to basically say,
in this case, I expect errors,
and then we describe the expected error message,
and then you give it under, which is just a string
of where you expect it to underline.
Yeah, where you would see these quickly lines
in your editor or in your terminal
when you run elm review.
Right, so in our case,
that would just be CSS dot RGB.
Actually, we probably would need to keep track
of the import context to check for import aliases or...
Yeah, but there is a nifty tool inside elm review
that makes it so you don't have to do it manually,
at the start, we did have to.
And that was just annoying and elm prone,
so I made it built in.
That makes sense as a sort of built in feature.
So how would we use that?
That's where it gets a bit more complex.
So you would need to use a module name, lookup table.
That's the name.
So for every expression or every type annotation,
you can ask it, what is the module name for this thing?
So it only works for a few kinds of expressions and types.
So the function of value, which we talked about before,
which represents CSS dot RGB,
you can ask it, what is the real module name for this node?
So maybe CSS refers to HTML dot custom dot CSS,
because it's been aliased as CSS,
or it represents CSS from the RT Feldman elm CSS package.
And the way that you get it is by calling
with module name lookup table.
So that is something that you will then have to store
in your context that we talked about before.
And the way you need to get it is a bit more complicated.
It's not that complicated,
but I'm not sure we want to get into this here.
We want to get into the actual application.
Cool, cool.
Okay, I see.
But that's something that you have to wire in
to your model, essentially, to your context,
and pull it in there and then look at something.
And then you pass it around, and it's easy to use.
That's cool.
Yeah, so these are the kinds of things
that having Elm Review as a platform allows you to do.
Whereas if we're writing our own from scratch,
we could traverse every single element in our AST,
look at every expression,
and look for a certain type of expression.
That's not too complex,
but the things that we lack there are,
for one, having this sort of lifecycle
that you describe sort of like an Elm application
with an init and update and a model and commands.
You're able to hook into those sort of lifecycle events
and build up that context
and maintain that state more easily.
And when you mark an error, you pass the range,
and then you get really nice error messages
that look a lot like Elm's error messages,
and it underlines the appropriate thing.
So those are some of the things that,
not that anyone would want to build it from scratch
because it's just way more convenient
but these are the kinds of things
that this visitor pattern gives you.
And it also allows you to be increasingly complex or simple
depending on what you need,
kind of like browser.element
versus browser.application.
Like if you wanted to define a module,
you can define a module visitor,
or you can define a project visitor.
Yeah, a module rule or a project rule.
Yes, and the module rule
is a module rule that will only look at one file at a time.
So it takes the AST for one file,
go through it, report some errors,
and then when it's done, you go to another file
and you forget all about the previous one.
So when you can use that
and that is sufficient for your needs,
that's much easier to use than project rule.
A project rule is kind of the same thing,
but when you go to another file,
and that way you can gather a lot of information
for the entire project,
and based on what you found,
you can report different things.
So I was kind of thinking like a cool rule,
which may be challenging to write,
but an interesting rule to explore
would be like a parse don't validate rule
where you, like for example,
want to catch would be
if I have a particular value that I check
to see if it equals nothing
or any other custom type variant,
and then I later,
I check that with an equals
if value equals nothing,
else I pass it on somewhere,
and then later in that function I pass it along to
if it then destructures it
and unwraps the just case
sort of a shotgun surgery that we talked about
in our parse don't validate episode
where you're checking a condition over and over.
Ideally you want to check it once.
So that would be interesting,
but it's difficult because it kind of requires
more of a context to be built up
of following the flow of
I checked this equals nothing,
I didn't else,
I went and followed the module
that it got called in, right?
I'm not entirely sure because I haven't
delved in it enough,
but it is not something that I find to be
very easy with Elm review.
That said, I have not really tried anything with it,
but I imagine that it will be a bit tricky.
So you can do this kind of rule, I think,
but to the extent to which you can do it
will be getting harder and harder.
So if you want to do something like
where it says if lists is empty of a value,
then you have a case where it is true
and inside of there you do, you call list.head,
then if it's narrowly,
if those calls are in the same function
or even just inline,
so then the list.head is in the branch
where you called lists is empty,
then I think it's doable.
Right, because in that case, it's a single expression.
And you can destructure that AST case expression
and get the condition or the value that it's doing the case on,
and then you can get the body of the branches
and you've got all the context you need there.
You don't need to continue following that context.
Yeah, it might be a bit trickier if you say
if list is empty or if this or if that
was a bit more blurry, like,
do you really need to parse don't validate?
But the tricky part is when you start using indirection.
So if in that branch you call another function
or a function from a different module,
that's where it becomes very tricky.
So I made it very easy to get information about functions.
So I made it very easy to get information about modules
but not the other way around.
So I made it easy to get information from modules that you import,
but I didn't make it easy to get information from modules that use you.
Yeah, that makes sense.
Right, the upstream dependencies you can inspect
but not the downstream.
Yeah, so I think I want to make that possible,
but to make that very nice, I need new information
that I'm working on but that is very tricky to get
Can you give me a little more of an idea of what,
so I'm guessing looking at the docs that you're talking about
with context from imported modules,
is that what you're talking about?
Yeah, so you were talking about project context.
So we talked about the context before.
So that is the model that you get from
for a single module visit.
So you go through expressions,
you go through declarations, et cetera.
If you are in a project role, you transform that to a project role.
You basically compile all the information that you got from this module,
and that's what we call a project context.
And then this project context, we merge them together
for all the imports that you have in the new module.
So if I import CSS or I import navbar,
then the project context for all of those
will merge together, and that's what I will use
to start my module context with.
So that's what I will init my module context with.
So that's how you can pass data around
from one module to another.
I see.
So there's this fold project context notion
that tells you how to give in,
what would that be, give in all of the modules that import me,
give me the context from all of those modules,
with context from imported modules?
Yeah, that is just the information from the imports
that this file imports.
So if I import navbar,
then I need to get information from navbar
to do my reports correctly,
my understanding of the context correctly.
How it is used will usually not matter too much,
so it's more important to have it this way than the other way.
It's the other way around too, but that is trickier,
because you will still need some information
from the imported modules.
Right, interesting.
Yeah, so I feel like this is a good sensibility to develop,
is what is easy to analyze and what's difficult to analyze,
and try to find rules, try to identify rules
that you can write that are going to be very easy to spot,
real problems, not like maybe a problem.
And so those are the rules ideally that you want to be writing.
In some cases, maybe writing a parse don't validate rule
is extremely difficult, but possible,
and really valuable as a community resource
that is worth a lot of time doing flow analysis
and all of that.
But often you probably want to find the low hanging fruit
of a rule, and in a certain sense,
if it's going to be easy to analyze it using a rule,
it may be going to be easy for people to understand
when they're violating that rule or not.
You mean when Elm Review reports an error?
Yeah, if you make it overly complex, or maybe that's...
Yeah, I try to make the rules very actionable.
So whenever you get an error,
you want to make sure you have a reasonable amount of time
and not have to refactor your whole project to comply.
Right, which having too much context
and too much dependency on context
might imply that you might be making something
that would be difficult to fix.
So for instance, if you...
There's the Elm Review unused package
with a lot of no unused rules.
So I try to make all of these very actionable.
So there's a few rules that tell you
when a function is not used.
So imagine you have a function that gets reported as unused
because it is used in a few functions,
but all of those are unused.
So the thing is, if you try to remove the first one,
then you get compiler errors.
So that is not actionable.
It is much easier to tell the user,
and then when all of those are removed,
then you can remove the one that reported at first.
So don't try to be too smart
and try to let the user do tiny steps, as you like to say.
That's a great point.
So you're kind of saying,
if a problem can be fixed kind of recursively,
like applied over and over
and you can point out a root problem,
it's okay for it to be a multi step problem
where you address an error
and then another error arises
because it uncovers a new issue
that's now at the leaf that was before a parent of a leaf.
Yeah, and also it's just much easier to write
as an Elm Review rule.
Yep, that makes sense.
So okay, that brings up another topic
and another cool feature of Elm Review.
One of the most fun features in my mind, which is fixes.
So you want to introduce what an Elm Review fix is?
Yeah, so Elm Review, when it reports an error,
it tells you sometimes,
hey, this rule is fixable.
If you run the Elm Review dash dash fix
or dash dash fix all,
I can give you suggestions as to how to fix it.
Right, and suggestions being an actual diff of,
here is your code before,
here's what it would look like after.
Does that sound good to you?
Yeah, so you can only add fixes to a problem.
So it always needs to be an error.
In practice, I don't see that.
I haven't seen that as an limitation.
But yeah, what a fix is basically
is just a list of edits to a module.
So it's very string based,
but it's basically take this range,
so from point A to point B, and remove it.
At point A, add this text,
and at this other range,
replace it by something else.
Right, which you would get a range from,
when you do a visitor, it gives you nodes,
which are just these AST elements
like expressions with a range, right?
So you can guess it or calculate it yourself.
You get that from your visitors.
Yeah, or just very simple computations.
Like merging two ranges,
because I want to remove this and that,
and that's it.
So if I did CSS.RGB 0.0.0,
and then I could imagine
we could have gotten some context
where we visited the color palette module,
and then determined that there's something
with that definition of RGB 0.0.0 called black,
and then we could suggest a fix then,
and we could say,
hey, this is defined in your color module,
so there's a problem.
You're not supposed to be using custom colors,
and here's a proposed fix.
Would you like me to replace this
or not?
That would be very useful.
Would it be able to add and import
in that process of doing that fix?
As long as it's in the same file, yeah.
You would get a big diff, though.
No, it would be fine, actually.
Would it reformat the file potentially?
So every time you apply a fix,
we run elm formats.
Right, so like if I had a record that I defined
as a record alias on a single line,
it might end up on multi line or vice versa
after applying that.
No, because we're not actually trying to...
So fixes do not work on the AST.
We do not take the AST
and transform it back to a string.
We take the raw string,
the raw source code,
and we apply the fixes on that.
So the only things that will get changed potentially
are the fixes that are then changed by elm formats.
I see.
So if something parses into a different AST,
it'll get reformatted.
Elm format will be run,
and whatever elm format produces based on that
will be what it results in.
But if the syntax tree didn't change
in that particular part of it with your fix,
it's not going to touch that part of the AST.
Yeah, unless the code was not elm formatted to start with.
That makes sense.
Cool. That's very cool.
I'm a bit sad that the Fix API is so crude
because it works with strings basically,
but it's a bit tricky to transform things into new ASTs
and replace ASTs because that can change the behavior
and mostly the formatting.
So the behavior that is actually normal
because that is something you might want to,
but the formatting is a bit annoying
to get it right.
So hopefully at some point I will have an idea
about how to improve this,
but it works okay, I guess.
I've done some contributions
to the IntelliJ Elm plugin as well,
and one of the things
in the contributing guidelines for that project
is that it is recommended
that you, just in general with IntelliJ plugin authoring,
don't manually try to tweak the AST of files,
but rather that you actually reparse it
and replace the AST from reparsing text.
So you can build up an AST,
but then stringify it and then reparse it.
And the reason is because basically the syntax parser
for the Elm language for the IntelliJ Elm plugin
will never be able to produce the same AST
that you manually created.
So even if it's like valid types for the AST types,
it may not be something that the parser could ever yield,
which would be bad that you would produce that.
So that's why they recommend that you don't manually
update the AST in IntelliJ plugins,
but that you actually transform it into a string
so that your AST be parsed by the plugin
to make sure it doesn't give you some weird special case AST
that would never actually occur.
So I think that that's the right approach.
Also, I think that it's reasonable to do code generation
with string templating in certain cases.
So I don't think it's necessarily a bad thing.
Sometimes that's perfectly okay.
Well, code generation I think is fine,
but if you pass in a bad range,
which shouldn't happen because you should just reuse
the ranges that the AST gives you, but still it's possible,
you can create parsing errors.
So if you have an if expression,
and somehow you remove the f from if,
well, you've got a parsing error.
So Elm Review will have several mechanisms
to work against that.
If you get such an error, it will tell you,
and it will fail the test.
And also the CLI, when it tries to apply a fix,
it will tell you that there's a problem.
But before you run it with fix, it will tell you,
hey, I think I can fix it.
And then when you try it, it says,
oh, I thought I could.
The most challenging part of this sort of static code stuff
that Elm has been generating valid white space
for AST transformations, it's such a pain.
It's like, I mean, like I built in IntelliJ Elm,
I added this feature where it can add or remove pipelines,
which is like a fairly complex feature.
But by far, the hardest part was preserving valid white space
because if you have like a case expression
or an if or anything anywhere in there,
you have to like do all this extremely complex static analysis
to make sure that you preserve the white space
to make it valid.
It's kind of a nightmare.
I've given, you've heard my rant on this,
but basically I wish that, I don't know,
like somehow in the Elm syntax,
I wish that like curly braces were valid there
at least for like tooling authors do code generation
without worrying about white space significance.
But that's my biggest complaint.
That's why I don't try to make anything too smart
because I don't think I can nail that
without a lot of research.
For the Elm review rule that I wrote
for transforming debug.todos with HTML into Elm code,
what I did there is I just indent things a whole bunch
because I'm like, hey, I'm just generating
all this code on one line
and it's possible that something goes wrong.
But if I just indent everything a bunch,
then it's going to be valid.
I'm guessing the diff will not look great on Elm review.
No, the diff looks fine
because it runs Elm format before.
I don't know why the fix
because Elm format has the property that it's a CLI
which means it's an async call
and Elm review does most of its things
in one pure Elm function.
I'll have to check.
It's a bit tricky to call Elm format.
Maybe I made the diff really bad after I made that change.
I'm not sure.
I'll have to take a look.
I didn't notice it being looking bad, but yeah.
One of the things about fixes is that a lot of people
enjoy having fixes for their rules
and sometimes they apply them without looking,
which is dangerous.
But as a rule author, it is a lot more tricky
to fix an issue than to report it.
Maybe 10 times as hard because then you will have to
gather a lot more context.
You will have to make sure you don't report an error twice.
You might understand why I mean that.
Let's take the example of CSS.RGB.
We said we reported functional value expressions.
Because we want to report even when you age it,
not only when you call it.
But when you want to fix it,
you want to look at the function call.
And then you will pattern match to see
is the function CSS.RGB,
in which case you can report it.
And then you will also need to look at the arguments.
Do they match with something that I've seen
by the way in a different module?
So you need a project context.
You also need to look at the imports
to see how palette has been imported.
And then you add a fix.
You can import this node for this function call
But you still keep the pattern that reports function or values
because you still don't want the function to be an alias.
The thing is, now you will report function calls to CSS.RGB twice.
Once in the function call and once for the function value
for the reference.
So you will need to add to your context
So that's a lot more complex than just reporting it,
which was very, very easy.
So basically, would you say that you want to catch problems
pretty consistently and robustly,
except for that advice that you gave about
it's okay if you have to keep reapplying it at the leaves
and work your way up, that's fine.
But you don't necessarily need to suggest a fix for everything.
So if you just have some case where you're like,
in this context, I can't consistently give an accurate,
precise, correct fix, then just say,
I'm not going to give you a fix.
I'm just going to report the error.
That would be fine?
Yeah, I would say that it's perfectly fine.
Because one, it's a lot of work,
and that's not necessarily what most of the value is,
but there are multiple ways to fix an issue.
And choosing one is making the choice of not suggesting
the other one, which could be the better solution.
Yeah, I would say as a user,
the impact of having some bad apples,
some poison pills would be pretty bad.
Like if I have some rules with fixes that are suggested
that aren't actually correct transformations
that change the meaning or whether my code compiles,
it would make me lose trust
in just applying automated fixes in general.
So I would say being extremely conservative
is a good approach here.
It's okay to just back out and say,
I'm not going to try to fix this.
I'm just going to report a problem.
That's the way to go.
Because ideally, we want to be able to just run
elmreview fixall without worrying about it.
So we should hold a high standard for fixes, basically.
Yeah, so I did write a lot about this issue of fixing
and one not to fix in the elmreview package documentation
under review.fix.
And as you said,
you really, really don't want to introduce compiler errors.
Because we can determine that.
But determining whether the project will still compile
is a lot trickier because we don't have the power
of the compiler.
I would say that I would rather have a compiler error
than a semantic change unintentionally from a fix.
Because if it's a compiler error,
then I know that something went wrong.
But if it's a silent change to behavior...
Well, it depends on whether you deem that it's always better.
I don't know when that would apply,
but for instance, if you would need to sanitize some HTML,
some string, then sure, always apply it.
It's fine.
Right, right, right.
The problem with compiler errors or introducing those issues
is that you need to imagine that the user will do...
I will run elmreview with fixall.
I will apply multiple fixes.
Imagine I have 200 errors.
All of them are fixable.
And I apply all of them, and I'm very happy
because at some point it tells me that there's no more errors.
And then I try to compile, and it doesn't compile.
Okay, when was this problem introduced?
Which fix introduced it?
You don't want the user to have to answer that question.
Right, right.
So if you don't think you can fix the issue entirely
don't propose a fix.
It's better to let the user do it,
even if it's annoying to them.
Yeah, that makes sense.
Well, did we miss anything that's important for people to know
when they're getting started writing their own elm review rules?
Definitely look at the docs because you wrote a lot.
I mean, it's like a tutorial practically.
It's pretty in depth.
Yeah, so there's a lot of information and guidelines
in the elm review packages,
that's thought into there.
So if we miss something, it's probably in there,
or you can ask about it in the elmslack.
One thing I would say is also,
there's a lot of examples in the review rule module documentation.
And actually, if you look at with module name lookup table,
which we talked about before,
there is an example that does what?
Forbids calling css.caller.
I actually saw that midway through our conversation.
I actually hadn't noticed that until we started going through it.
But it's, yeah.
Yeah, you said, let's spare on this.
And I was like,
I'm pretty sure I already have an example of this one.
Because this is one of the basic examples I would give
to new guarantees to your project
that the Elm Compiler can give.
One thing I would like to touch on before we finish
is that elm review is a great tool
to add a lot of new guarantees about your project,
a lot of niceties.
So if you want consistent ways of using code,
so all the colors should come from the palette module,
for instance, elm review is a great tool for that.
It's a great linting tool,
and we've seen linting tools in other ecosystems,
but that's not what makes doing this in Elm
with elm review interesting.
Yeah, I would agree.
The thing is, elm review is just one tool.
I love it a lot.
But it really is a new way of thinking about problems.
So it detects how you write code,
not what it necessarily means.
It can catch, and the compiler won't be able to catch.
But there's a lot of things that the compiler
will be able to catch, and elm review won't.
And a successful strategy to catch new problems
is to try and make use of all the tools at your disposal
to get those guarantees.
So I have an article about safe, unsafe operations in Elm,
which takes the example of a function
to which you pass in a reg X as a string,
which is an unsafe operation,
because reg Xs can fail, right?
But the rule will just say,
this is fine if the reg X is valid.
But if it's not valid, then it will report an error.
I lost my train of thought.
So you can do more than just looking at code style,
but you can actually use elm review as a tool
to give you stronger guarantees about your application
in tandem with opaque types, the elm compiler.
You want to use the compiler, you want to use a type system.
And it's kind of like, sometimes you need to do something
like two pronged attacks, so on both sides.
So you want to reduce what is possible with types.
On the other hand, you want to reduce
what is possible to write with elm review,
and maybe also imply code generation.
And when you limit what you can have as values
in the type system and what you can write with elm review,
at some point, maybe, if you're lucky with your problem,
you can get to a state where it is not possible anymore
to write something that will create a problem.
So look at the safe and safe operations article.
I think it's a good example of how you can use
both the type system and elm review to make sure
that all reg Xs in your code are valid.
I was trying to think of some opportunities
to potentially use elm review with elm pages.
And one thing I was thinking about is,
it would be kind of interesting if, for example,
if I know the static paths in your app,
and you wanted to say,
give me a link to this static path.
I mean, you could have a function
that points to a static path,
and you could validate that it exists.
Or if I had a route type for paths
that has the dynamic parts of a URL.
So if a blog route has a slug,
which is a record slug which takes a string,
I could check that the slugs that you're passing in
for a route are valid.
And that's an interesting one,
because it's actually using code generation in the hood.
You check that,
but it's not just one technique or another.
It's using all these together.
But then you also have to ask,
what is the user experience at the end of it all?
And sometimes, if you're checking something
through static analysis,
that means you're not going to get auto completion,
for example.
So maybe, Philip,
who we talked to for our last episode,
when people are listening to this,
instead of doing code generation for Elm Tailwind modules,
Philip could have done an Elm review rule
where you have Tailwind CSS classes that it points to.
I mean, he would have missed out on a lot of the benefits
that he got in terms of dead code elimination
and that sort of thing.
But if you wanted to do a plain Tailwind CSS
to basically validate Tailwind CSS classes,
you could build an API in tandem with Elm review
and requires it to be literals.
But that would have limitations,
not least of which would be
that you don't get auto completion,
so you have to think about the full experience, I guess.
Okay, well, do you have any pointers
for people getting started?
Are there any repos that you would recommend
taking a look at for inspiration?
Yeah, so the most popular package for Elm review
is Elm review unused.
Unfortunately, it's a bit complex
because the domain is pretty complex
and there's a lot of things to look at.
And since we do fixes,
they're a lot more complex
than they need to be otherwise.
So if you want really simple rules,
I would look at Elm review debug under my name.
You can also look at Elm review common,
which is also very accessible, actually.
And other things like I've got a repo
that I haven't published
Elm review simplification.
Those are very accessible.
They're context free.
Yeah, usually context free
or they have little context.
But you can also look through
the review rule documentation.
There's a lot of examples in there
with reasonable but simplified approaches to rules.
And I think just looking at all the possible visitors
and possible APIs that are provided for you,
will give you a better sense
of what can I build with this?
What can I detect?
And how can I best do that?
Do you have any advice
on how to look at a problem
when you're working with some code
and how to think about
what can Elm review help me to do here?
Like what general kinds of smells
are you looking for?
For example, what kinds of invalid states
should people look for?
If you find yourself with some code that says,
error, this should never happen,
then you want to ask yourself the question,
can I make impossible states impossible
with my data modeling?
What are the kind of cues like that
for Elm review, would you say?
Yeah, usually I would go,
I would look for the type system.
And when I reach the limits of using the type system,
I try to see,
can I just by looking at the code,
which is basically what set analysis does,
but at a much larger level,
can I determine just by reading it
whether this is a problem or not?
And are there a lot of false positives?
Would there be a lot of false negatives?
False positives are the problems that are there
that are okay to have in some cases,
but false positives are the ones that you want to avoid
as much as possible.
I think you said false positives both times.
So false positives are reporting a problem
when there isn't one,
and a false negative is not reporting a problem
when there is one.
Yeah, exactly.
Every time you hear false,
it's something negative.
If that helps.
Yeah, I try to determine,
could I by looking at the code,
determine whether there's a problem?
How many false positives would there be?
How many false negatives would there be?
Yeah, right.
When I try to write a rule,
I just start simple, very simple,
like as we said, do the dumbest thing
and then just iterate
until you get to something very sophisticated.
And maybe in the course of doing that,
I will determine, I will find out,
and that becomes very complex.
I would also try to run the rule on a large project.
So I try to run it on my work project sometimes,
and sometimes it finds a lot of false negatives
or false positives.
It doesn't find false negatives,
but sometimes I know that there are false negatives.
Yeah, right.
Yeah, just use tests,
use actual applications to look at.
And just to emphasize the point,
I think the reason why you're saying that false negatives
are such a big problem
is because they give you a false sense of security, right?
You're like, oh, I don't have this kind of problem
in my code base.
Whereas false positives would block a user
and they would get annoyed.
At least I, as the author of the tool,
I really want my user to have a great experience,
and I hope you can see that.
The CLI, the error messages,
it's really, it's been a great experience for me
writing and using Elm review rules.
It's a really cool platform.
Very happy to have it.
Well, I think that gives people a lot to chew on.
I'm really excited to see what emerges in the ecosystem
because I think we're starting to see
more and more Elm review packages being built,
more and more active authors
and the best is yet to come.
I would have said yes
if you didn't say the last part that way.
You got to sell it, Jeroen.
You got to sell the project.
Who will be the next superstar Elm review author?
Will it be you?
If it is you,
you're going to be the next superstar.
All right, and do let a friend know
about the Elm Radio podcast
and give us a rating,
Elm Radio podcast on Apple podcasts.
And well, as always, Jeroen, until next time.
Elm Radio