The Root Cause of False Positives

We explore false positives and negatives in static analysis tools, and how Elm helps us avoid them.
August 15, 2022


Hello Jeroen. Hello Dillon. I'm quite positive you're going to enjoy this episode. But I
could be wrong. Maybe it's a false positive. Maybe it's a false positive. That's a good
one. What are we talking about today? Today let's talk about false positives. Let's talk
about how Elm removes whole kinds of false positives, at least in the area that I care about: static analysis. But I think we can find ways that that applies to other things, like optimizations,
stuff like that. Oh, okay. I like this. So let's start with the definition. What is a
false positive? This one is always kind of tricky. Like which one is the positive and
which one is the negative? Yeah, because you got false positives, you got false negatives.
So yeah, a false positive, at least for a linter or for a tool like Elm review or a competitor, is when the tool reports a problem when it should not. Like it tells you, hey, there's a problem here, and actually there's no problem. That is a false positive. So a false positive means that we think there's something and there isn't. And then you've got false
negatives, which are the tool should report a problem, but it doesn't. And then you've
got true positives and true negatives, which are like real errors and real non-errors, which, yeah, maybe those names are weird. I actually don't know if there are names for those. Yeah, it seems reasonable. If you take a COVID test and it comes back
positive, it could be a false positive, which means the test said you had COVID, but you
don't have COVID. It could be a false negative, which means the test said you do not have
COVID, but you do have COVID, which in fact is quite common, which kind of makes you think
like, how much value is there to testing when, from what I understand, there's a fairly high rate of both of those? It just makes you question the whole thing: how can I make decisions based on this information, which I know is flawed quite often? So it's
a strange situation. The same dynamic applies in static analysis. If you can't rely on the
results you're getting, it makes you sort of lose faith in what the tool is telling
you. Yeah. Well, there's a difference between linters and medicine. There is. Although sometimes,
you know, you rely on the guarantees you make in your code for medical procedures and when
people's health is on the line. So there can be an overlap. Yeah, absolutely. So yeah,
what is, in your opinion, the root cause of false positives? Well, I did see a tweet this
morning that I was thinking maybe should have been marked with spoiler warning, at least
for Elm Radio co-hosts. I regret posting it. Let me try to forget I ever saw that and see
if I can answer the question without prior knowledge. Please forget the correct answer.
Which was a very good tweet, by the way. We will link to that tweet. I will explain it,
I guess. It's hard not to be biased by the nice tweet you wrote, but I would tend to
think of it as like maybe complexity. Like how complex is an answer. Like if you're looking
at a chess position and you want to say, is there a forced checkmate in this chess position?
It's easier to, you know, if you can confirm that there is, you know that there is. But
if you can't, maybe there is one, but it's so complex. There's just so much complexity
to the position that there are nearly infinite possibilities. So it's very hard for you to
say. Like if you look at an opening chess position, is there like a forced checkmate?
Like maybe, I don't know, if I had infinite capacity to process infinite lines, then maybe
I could answer that. But it's just too complex for me to say yes or no.
Yeah, it's really easy to figure out whether there's a forced checkmate if you have two pieces on the board and they can each only do one move. Then it's really easy: you've got two, three, four things to check, depending on how things work. And that's
it. But if you got 10, 20 pieces on the board and they can all move in so many directions,
yeah, things become a lot harder. You have a lot more checks to do. You have a lot more
scenarios to evaluate.
Complexity is in my opinion, for sure, a cause of false positives.
Yeah. And if you can rule things out, if you can eliminate variables, that helps. As you said, that's a very good example: if you just have four pieces on the board in a chess game, can you answer that question? At that point, you can answer it with confidence. And sometimes it's easier to prove things by an existence proof. An existence proof is easy, I mean, if you can find the example, but proving absence is difficult.
You could prove existence like, do black birds exist? Well, I can look out my window and
see a crow and say, yes, there are black birds, but then do...
Wrong, it's a dinosaur.
Well, then perhaps. And I mean, yeah, it does get quite philosophical. Like what can you
know? In a way, this is sort of like the term epistemology comes to mind, which is just
sort of what is knowable. And I often found it frustrating in philosophy classes when
epistemology comes up. I always found that topic extremely frustrating in a philosophical
context because it's kind of a dead end because you say what can be known? And at the end
of the day, you basically get to Descartes' conclusion, I think, therefore I am, which is to say that is the only thing that I can prove. The only thing that's knowable, because my senses can mislead me. Have I ever seen something, come to the wrong conclusion about what it was, and later discovered that I was wrong? So how can we trust anything
that we think we know? What is knowable? What is the nature of knowing and what is knowable
in the universe? And it's just to me, it's just not a satisfying subject in philosophy
because it's just like, well, nothing except that we exist. And so, okay, done. Like what
more can we do with that topic except go around in circles and we don't really get anywhere.
But with code, what is knowable? With static analysis, what is knowable? That's kind of
a more satisfying question because it's a more constrained area where we're looking
at. It's not just like questioning, can we trust these axioms about the universe? We
can sort of just think about it in terms of what can we know about this code? So yeah,
you want to give us your answer to that question. What is the root cause of false positives?
Well, the one that I came up with, and maybe we can figure out something even deeper than
that, but what I came down to is missing information just in general. Some of that through complexity,
some of that through other things, other means. But basically the way that I imagine it is
if you were an omniscient being or an omniscient tool, you had knowledge of everything, you
knew everything that happened in your program at runtime, at compile time, and you knew
what the developer's intent was, then you would always be right in what you report and
you would never miss anything. You would know, well, this is never used or this never happens
or this is not a problem or this is a problem. If you had all the information in the world,
then I think you would never be wrong. Therefore, the problem is missing information and you
can get that through different means. So for instance, for a static analysis tool, one
of the bigger problems is dynamic code. So knowing what a value is at a certain point
is hard to figure out. Some tools can do it quite well. TypeScript does it quite well
and to some extent, much better than Elm. Elm knows the types, but it doesn't know the
values, nor does it care about them. Languages that try to do proofs, like Isabelle or TLA+,
I think they know a lot more about what's going on in the program, but they're also
pretty complex and I don't know how they work. I don't know what their limitations are. Dynamic
things are complex, for instance. So when you have missing information, what do you
do? Because that's going to happen, right?
And perhaps you want to use the precautionary principle and say, I don't know if there's
a problem, so I will say, I can't prove that everything is okay, therefore, I'm going to
say there's potentially a problem because I can't prove that there's not a problem.
I guess this or the other approach I can think of is you could say, well, I didn't find a
problem, so no problems, right? That's sort of like what TypeScript does, right? It says,
I will tell you if I can guarantee that there is a problem rather than I will tell you if
I cannot guarantee there are no problems.
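The two policies can be sketched roughly like this (a toy illustration, not any real tool's API; `analyze` is a made-up stand-in for whatever checking a real linter does):

```javascript
// Toy stand-in for a static check: it can prove some snippets safe,
// prove some unsafe, and has to give up on the rest.
function analyze(code) {
  if (code === "1 + 1") return "safe";
  if (code === "null.foo") return "unsafe";
  return "unknown"; // missing information
}

// Precautionary policy: report unless proven safe (risks false positives).
const precautionary = (code) => analyze(code) !== "safe";

// Optimistic policy (roughly TypeScript's stance): report only when a
// problem is proven (risks false negatives).
const optimistic = (code) => analyze(code) === "unsafe";

// The two policies only disagree on the "unknown" case:
precautionary("someDynamicCall()"); // true  -> possibly a false positive
optimistic("someDynamicCall()");    // false -> possibly a false negative
```

On proven-safe and proven-unsafe code the policies agree; missing information is exactly where they diverge.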
Yeah. So to me, that is making a presumption. So I might be wrong about the correct word,
but a presumption is when you accept something as true on the basis of probabilities. Like
it's very likely, I'm missing some information. I don't know whether this is true or not,
but I think it's going to be more likely to be true than false. So the example that I'd
like to take is an ESLint rule that is called array-callback-return, which basically says that when you call an array method like map and you pass it a function, that function should always return something. Right. So you don't accidentally have a list of undefined values because, say you're writing TypeScript, you don't write return and so it's returning void, which I think has happened to everybody that goes between JavaScript and Elm: you forget a return and you're like, why is this value all nulls or undefined?
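The forgotten-return mistake looks like this in plain JavaScript (a minimal made-up example):

```javascript
const prices = [1, 2, 3];

// Intended: double each price. But the arrow function has a braced body
// with no `return`, so the callback returns undefined for every element.
const broken = prices.map((p) => {
  p * 2; // computed, then thrown away
});

// What was meant:
const fixed = prices.map((p) => p * 2);

console.log(broken); // [undefined, undefined, undefined]
console.log(fixed);  // [2, 4, 6]
```

This is the bug the array-callback-return rule exists to catch.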
Yeah, undefined. Yeah. So it's a very useful rule to have. But the thing is when you analyze
JavaScript code, there is something that is missing and that is type information. So TypeScript
could potentially help here. But basically, when you see something like array.map, array being a variable or something, it looks like an array and the map method of an array. And therefore you're going to presume, well, it's pretty much for sure the array map method. So I'm going to report any problems that are found in the function that is
passed to it, but that might be wrong. So when you have missing information, you're going to make presumptions, and when they turn out to be wrong, that's
when you have a false positive or false negative. Because what we could also do is say, do some
more analysis, like: is this really an array? Can we find somewhere where it is declared
where we can clearly see that this is an array? And if we see that it's an array, then we
report a problem. And if we don't see that it's an array, if we don't know, then we don't
report anything. And that removes all the false positives that we have, but that creates
a lot of false negatives. So whenever you need to make presumptions, you're going to
have the choice to lean more towards a false positive or false negative, but you're going
to have to choose or do more analysis, which can be complex and maybe you're not going
to be able to figure out the answer.
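A sketch of how that presumption can go wrong. Here a hand-rolled object (entirely made up for illustration) happens to have a method named map whose callback is meant purely for side effects, so a rule that presumes "map means Array.prototype.map" would flag the callback even though the code is fine:

```javascript
const seen = [];

// Not an array: a little event-bus-like object whose `map` method just
// runs the callback over its entries for side effects.
const logger = {
  entries: ["started", "ready"],
  map(fn) {
    this.entries.forEach(fn);
  },
};

logger.map((entry) => {
  seen.push(entry); // intentionally returns nothing; flagging this
                    // callback would be a false positive
});

console.log(seen); // ["started", "ready"]
```

Without type information, the linter cannot tell this `map` apart from the array method; it has to pick which way to be wrong.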
Yeah. And it strikes me that with this doing-more-analysis piece, there are maybe two different categories here: either something is knowable but takes a lot of work to know, potentially to the degree where it's essentially unknowable because it's infinite. Like is there a forced
checkmate from the first move on a chessboard? Like technically knowable, but practically
unknowable or even if you knew, maybe the sequence of moves that would lead you there
is so large that it's not usable information.
You mean you can't compute it?
Or if you compute it, it's like, yeah, here's a list of size 30 trillion of all the possible
responses to these different lines that gives you guaranteed ways to respond with a forced
checkmate and it's like, okay, well I can't really use that. So even if it's technically
knowable, essentially, as the amount of analysis and information needed to deduce something approaches infinity, it starts to resemble being unknowable. But there's unknowable, and then there's knowable with work. Those are two different things.
So like an example of something that is literally unknowable would be like, if you take eval
into account, if you run eval, what can this code do? Well, if it's from user input, that
user input is undefined. You don't know what the bounds are; there are no bounds on it.
So well, it's not undefined, it's a string.
Right. Or who knows?
Maybe it's the string "undefined".
Yeah, it's not known. And therefore there's not enough information there to analyze certain
things about that. Whereas there are certain scenarios where you often talk about code
flow analysis and maybe it's like a massive amount of work. Maybe you need some postdoc
programming language researchers to assemble a team to solve this problem, but it's technically
knowable and you could do that or it's just a huge amount of work.
If you, listener, want to do that, please contact me.
And Elm is a very interesting space for these problems because as you've sort of been hinting
at, it is more knowable because it's more constrained. Like in our chess analogy, it's more akin
to the chessboard that just has a handful of pieces rather than the starting chess position.
Yeah. So you're now hinting at something interesting: why is Elm more knowable, more analyzable, than other languages like JavaScript? And to me, there are multiple aspects to that.
One of which is that there's a compiler. Doesn't seem like a big thing, but it actually is
because potentially when you analyze your JavaScript code, it's a bunch of gibberish,
like saying A equals A, or one equals two, or... well, actually not. Yeah. But basically what you can have is code that looks like code but doesn't mean anything: it references undefined variables, or it has ill-defined semantics.
It's syntactically valid, but not well defined code.
Yeah, exactly. And those are all things that a compiler checks for. So when you know that
these are checked for you by the compiler, by a compiler, then you can start to rely
on them. And that is quite important actually. So JavaScript doesn't have a compiler, but
what it does have is a linter. So what ends up happening for a language like JavaScript is that you have a lot of ESLint rules doing the same work that a compiler would do. So you have a rule for undefined references, you have a rule for reporting duplicate declarations, stuff like that. And once you have those, then other rules can kind of depend on those semantic issues not being there. But they can only kind of rely on those, because people can disable the ESLint errors, or people can just not enable those rules. And the fact that we need those rules is a reason why it's important to use the recommended configuration for ESLint or other tools. If the tool doesn't ship with it, then those rules are not going to be enforced and other rules will not be able to depend on them. And that is something that we don't need with Elm, because the compiler checks for so many things that Elm review rules get that certainty for free. They can just rely on the things that the compiler checks for, and that's enough for pretty much anything.
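As a concrete sketch, a minimal ESLint configuration enabling some of these compiler-like checks might look like this (no-undef, no-redeclare, and no-dupe-keys are real ESLint core rules; the file layout shown is just one common legacy-config setup):

```javascript
// .eslintrc.js (sketch): compiler-like checks that other rules lean on
const config = {
  rules: {
    "no-undef": "error",     // references to undeclared variables
    "no-redeclare": "error", // duplicate declarations
    "no-dupe-keys": "error", // duplicate object keys
  },
};

// Guarded so the sketch also runs outside a CommonJS module context.
if (typeof module !== "undefined") {
  module.exports = config;
}
```

In Elm, every one of these is a compile error, so no rule author ever has to ask whether they are switched on.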
Right. Rules might have like a snowball effect where by applying different rules and applying
fixes to those rules, you can eliminate more dead code because making one piece of dead
code go away makes another piece of dead code go away and there's this snowball effect.
But as you say, the language guarantees are enough that you're not depending on, I need
this guarantee in order to make my checks, therefore you have to turn on these rules
as prerequisites. I mean, you could imagine scenarios like that, but I guess you haven't
encountered them yet.
Yeah, I haven't yet. But yeah, for instance, if you were trying to evaluate an expression
and you saw a reference to a variable and you didn't have the guarantee because you
were in JavaScript that that variable was actually declared anywhere, then potentially
you would enter a weird state or you would crash because, oh, well, I expected this to
be in the scope somewhere. So it's really nice not to have to be defensive about those
things. So therefore a compiler really helps with all those things.
Right. I could imagine like some rules around divide by zero or NaN or something
like that. And you could say, well, there are certain entry points where you could get
a number from a port from a JSON decoder. And at those terminal points, maybe you have
an Elm review rule that checks that you need to unwrap them into safe types that can't be NaN. And then that rule could be a prerequisite for another rule that, assuming all of the number inputs that you're using are not NaN to begin with, checks that you're dividing them in a way that guards against divide by zero and things like that. And you're going to have well-defined values.
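That layering could be sketched like this in JavaScript terms (all the names here are made up for illustration; in Elm the boundary would be a port or JSON decoder unwrapping into a safe type):

```javascript
// Hypothetical "prerequisite rule" idea: validate numbers at the boundary
// (e.g. where a port or JSON decoder hands you data), so downstream code
// can assume values are never NaN.
function toSafeNumber(raw) {
  const n = Number(raw);
  if (Number.isNaN(n)) throw new Error("not a usable number: " + raw);
  return n;
}

// Downstream, a second check only has to worry about division by zero,
// because NaN was already ruled out at the entry point.
function safeDivide(a, b) {
  if (b === 0) return null; // explicit "no result" instead of Infinity/NaN
  return a / b;
}

safeDivide(toSafeNumber("10"), toSafeNumber("4")); // 2.5
```

The second function's guarantee only holds if the first one really runs at every entry point, which is exactly the kind of dependency between rules being discussed.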
Yeah, I could imagine that as well. At work, we have a rule that detects unused CSS classes.
So what we do is we take our CSS files and we extract all the classes from those and
we turn them into an Elm file that our Elm review configuration then uses. And then we
just go through the entire files and find out the ones that are used and report the
ones that are left. But to be able to tell that, we also have another rule that checks
for any usages of the class function that are too dynamic, that are too hard for Elm
review to tell. So they kind of depend on each other. I actually don't remember whether we merged them into one rule. But as long as you don't make anything depend on the other one... like imagine you have a fix that you want to apply; that should probably not depend on information that has not been validated before. And because in Elm review fixes take priority over errors without fixes, at least in fix mode, that can be kind of dangerous. But yeah, at least the number of guarantees that we have,
the number of presumptions that we need to make in Elm, or at least in Elm review, is a
lot lower than what you would do in ESLint. So this hasn't been a problem really, so far,
in my experience, it could be, but I'm guessing it would be for things that are a lot more
precise than what we're currently doing.
And this seems like what you're talking about with checking for class names that are too
dynamic for you to basically effectively analyze. Because you could imagine pulling on that
thread more and more and saying, well, what if it's just an inline concatenation between
two string values? Could we just check those two literal concatenated string values? And
is that literal enough for us to use? And you say, okay, well, now that we're checking
for concatenated string literals, why don't we add something that says, well, what if
it's a string constant that's concatenated to another one? And maybe that's quite useful
because you want to be a little more dynamic with your class name. So and then you say,
well, what if we want to add a number to it? And then we want to be able to do arithmetic on those numbers, or we want to be able to map over a list of numbers and then check
those values. And eventually, you're just building like a pre-compilation evaluator that's actually evaluating your program before compile time. And you certainly can do those things. But you're intentionally choosing a strategy there to preemptively report things, to put a constraint in the rule. I mean, I guess another way to look at it is that rather than treating it as a false positive, like, hey, this could actually be valid, you're saying: here's a rule that adds an additional constraint to your code. And then it's not a false positive, it's a true positive: this is not okay, you have to follow this constraint where you only use string literals for class names.
Yeah, yeah, as you said, like, if you want to figure out a lot more things, basically,
at some point, you're building an interpreter, as I see it, which would definitely be valuable
to be able to infer a lot more things.
And in Elm, you can do a lot in that regard.
I think you can do a lot. Yeah, because it's just pure functions, right? None of them have side effects, and that makes things a lot easier. But it would still be a lot of work and would make the tool a lot slower, I think. And for the rule that
doesn't report false positives but reports new constraints that it wants to enforce,
you're absolutely right. Where I would say that it switches from a false positive to
a constraint is in the error message. Like if the error message actually explains like,
hey, this is not a problem in the sense that it's not going to cause your code to crash
or behave weirdly. But for the sake of this other Elm review rule that makes sure that
we don't have any unused CSS classes, we require that the argument to class
is a static string, a string literal. And that's what we did. So if you explain the
problem and if you explain the benefits, then people accept it. Now, I haven't heard anyone
complain about this, so I'm very happy about that. But if you have like a one-liner, like in most static analysis tools, that's going to be hard to explain. Like, what is the problem? How do I move forward? Why is this a real problem? Yeah, people want to understand the problems that you're reporting.
So Elm review does, I mean, obviously abstract some things away from the user, in this case a review rule author. For example, you do provide this lookup table, the module name lookup table, which is sort of going down this path of being able to provide more information. To me, it's a very interesting path in Elm review. And I'm curious, are there any other examples where you do some amount of additional analysis of the code, where you process some information for the rule author? Not a full-on pre-interpreter, but doing a little more analysis.
Are there other examples of that in Elm review? And are there things on your mind that you
think might be appropriate for Elm review to expose?
Yeah. So just to explain the lookup table that you mentioned, the module name lookup
table is just basically a dictionary that says at this location in the source code,
this is a reference to this value, which comes from this import from this module. Because
in the abstract syntax tree, when you say A.b, you reference the b function or b type of some module A, and which module that actually is depends on the imports.
Right, because Html.text could just be text, or it could be import Html as H, and then it's H.text.
Yeah, or import Html.Styled as Html, stuff like that. So the module name lookup
table is there to make it much easier for you to figure out what is the real original
module that this value comes from, which we didn't have at the beginning.
People probably invented that from scratch or a sort of imperfect version of that, I
would imagine.
Yeah, basically people were like, do I see an import to HTML? If so, what is the alias?
And also, does it expose the text function explicitly or using exposing (..)? And basically,
people did it like that way, which in practice is good enough. I don't think you're gonna
have a lot of false positives, but it's a lot of work. And you could have some false
negatives potentially. So yeah, this was something that I really wanted to add to Elm Review
and I got it in there. And now I don't think about this sort of problem, which is really
nice. But it did require a few iterations to get right.
Yeah, it's basically something that you pass into the context, right?
Yeah, the way that you initialize your context, basically your model for going through the
AST, you say, hey, I'm interested in having the lookup table because I think that's gonna
be useful. Please compute it for me and give it to me.
And then people can use it.
Which is largely like a performance optimization, right? If it's not needed, then you don't
need to compute it.
Yes, exactly. Currently, if any rule requires it, then I compute it. I think I want to be smarter in the future, where I only compute it if any of the rules that I'm about to run actually need it. Because basically, the fix mode is quite slow. And I think I'm
going to need to be able to cut up the review phase and running one rule at a time and be
able to stop whenever I find like a fix.
Elm is very interesting in that regard because certain times you need to compute certain
things upfront in a sort of framework design because the user can't just invoke a method
that mutates some dictionary somewhere, one that, when it doesn't have that value, goes and performs a side effect, puts it in there, and memoizes it. So you sort
of architect things differently in Elm.
Yeah, and also because we can't memoize it. So either we say, well, we're going to compute
it once and then people will use it zero to n times, or we're going to compute it lazily
and then it will be computed as many times as people require it, which is unfortunate.
In practice, it probably works out okay a lot of the time, especially in the context
of a browser application. Maybe for CLI applications, it's a little bit different.
For performance heavy tools, yeah.
So are there more cases where you've considered adding these types of things that provide
more sort of, you know, rather than just the abstract syntax tree, the syntax tree with
a little processing, with a little extra analysis performed for you that you can access through
the Elm review platform?
Yeah, I actually just added one this morning, this weekend, basically the module documentation.
So that's the {-| ... -} comment that you have at the beginning of your file before the imports. That is the module documentation. And currently elm-syntax, the AST library
that we use doesn't have a way to store that as the documentation of the module. It's just
among the comments. So what I had to do in a bunch of rules is to go through the comments,
find the module documentation, which I just learned this weekend that there was room for
false positives, because ports also have that problem: the documentation for a port is also not attached to the port, whereas documentation for a function or for a type is properly attached.
So yeah, basically it was possible to confuse the documentation for a port with the module documentation.
So yeah, that's, it's not super tricky, but it's not nice to have to compute it everywhere.
And in all of my implementations it was potentially broken. So I just made it, I added a new visitor
or a new context creator function to be able to have that information right away, basically.
And so I'm going to publish that in the next version. And the other big one is type information.
Yeah. So it is really surprising that Elm review works so well for a typed language,
considering we don't have type information. There are two ways that we can do that. One
of them is by invoking the compiler, which has a few problems, notably that you can't
invoke the compiler in tests. So I'd probably have to write a separate testing framework
for Elm review rules where it would create files, run the compiler thousands of times,
because I have thousands of tests. And also the whole review process is one giant pure
function currently. And if I had to ask for the type information, then I would have to
break out of that somehow, especially in the fix all mode, it would be very messy in practice.
So the other method is to do the type inference ourselves, which I've tried a few times so
far, and only gotten so far.
It's a somewhat challenging problem, I would imagine.
Yeah. I think you need to know how things work, how the algorithm that Elm uses works, so that you get the same results, because there are some edge cases where you can have some differences. But basically you need to know the theory well in order to do a nice
implementation. And I've never understood that algorithm properly.
Yeah. It's a huge task.
I know someone who's working on this on and off. I don't want to put pressure on them.
Someone who has a very common name, it would seem.
Maybe yes.
We know you're listening.
Yeah. But yeah, that would be really nice. A few applications of that would be, for instance,
the no missing type annotation rule that could generate the missing type annotations.
That would be so nice.
Yeah. So we already have that in the editors. So we know that could work well. It doesn't
always give the nicest error, the nicest type annotations. It could still be helpful.
But that would unlock more information. You said that removing false positives, it comes
down to needing more information. If you had unlimited information, you could remove all
false positives. So what are the areas that you could remove false positives with that
extra information?
Well, it's not necessarily false positives. It's false positives and false negatives because
you would be able to know more and therefore you would be able to report more. In Elm Review,
there are basically no false positives. So I'm not sure it would help with much. I know
of one location where it could potentially help, where we do have a false positive that people report sometimes, which is the no unused custom type constructor args rule. It's a mouthful.
Basically, you can create a custom type where you say type A = A Int, type Id = Id Int, for instance. And then you never extract that identifier, that Int value. So the rule reports it as not used. But potentially you could use that in a comparison, like: is this ID the same one that this one has? If you use it that way and never extract the ID in another way, there's a false positive. So type information could potentially tell us: hey, in this comparison, is there a usage of this type? If so, don't report that type. So that's a false positive that we could remove. And then it's mostly going
to be about false negatives because there's a bunch of rules that we can't write with
that type of information. And well, I don't have that many in mind, but a few, like for
instance, the one that I really want and that some people want is reporting unused record
fields. That can get quite tricky to do right. Basically, we can do it, it's
just going to have a lot of false negatives. So as I said, like you can either lean towards
false negatives or false positives when you don't have information. Right. So basically
what we can do is, well, if we see that a function takes an extensible record as an argument and some of those fields are not used, then we can remove those. And I actually already have a prototype of that working. But if you pass that argument to a function, for instance, you have a list of some records and you pass that to List.map, well, now you need to figure out the type of that mapper function that you pass to List.map, because if that one uses some of the fields, then those fields are used, and if it doesn't, then they're not used. But if you don't know the type, well, you don't know which fields are used and which ones are unused. So therefore,
if we want to be safe and not report false positives, we're just going to say, well, it
looks like it could use anything. So we're not going to report anything. And that's the
same thing for a model. Like you pass your model, which is usually a record with plenty
of fields, you pass a model to some function that is a lambda that is hard to evaluate.
Therefore, we can't tell anything about it. So we stop. So having type information here
would be very helpful, because we could analyze the type of those functions and we
could see, well, it seems to be using this field, this field, and that's it.
Yeah. It seems like that would unlock a lot of possibilities, not to mention fixes that
could, you know, I mean, code generation fixes, all sorts of ideas you could find there.
Yeah. I can imagine we will still have plenty of false negatives, but I think we would
not have false positives. But yeah, again, it depends on how conservative we want to be
about things being used or unused, because we could go either way. We could potentially
have a configuration for the rule that says: try to be more aggressive now, just for a
while. And then you go check the reported errors, and maybe you can remove a few things,
maybe you don't. But yeah.
But yeah, in general, we want to be very conservative and not report any false positives because
those are super annoying.
Yeah. So it seems like, I'm not sure if this falls into the same groups you've mentioned
of choosing to err towards false positives or err towards false negatives. But when we're
talking about ways to work with less information, you don't have as much information as you
need to be 100% sure of something that you're checking for. Well, like if we look at the
chess example again, you know, what do you do in that situation? If you can concretely
determine it, then it's easy enough. If you can't, then you end up... you know, what
do you do for an opening chess move? You tend to rely on strategies and heuristics. So a
heuristic for, you know, determining whether a chess move is good is: you want your pawns
to be supporting each other. You want to try to take the opponent's queen if you can,
trading it for your knight. That might turn out to be a move that leads to you being
checkmated in the next move. But that's a heuristic where you can say, well, let's just
kind of generally assume that this is going to tend to be a good thing. And so
now, going back to Elm review rules, in the context of Elm review, these heuristics are
telling you things about your code that might give you unreliable results. Because
essentially what a heuristic is, is it's measuring something that is not directly what
you care about. Like in a chess game, you care about checkmate. That's the only thing
you care about, maybe along with the number of moves until you checkmate. But in this
heuristic of trying to take the opponent's queen if you can, you have a stand-in goal
that's easier to determine. That stand-in might be flawed in some cases; it might not
yield the result you want, and might actually lead to you getting checkmated.
Yeah. So yeah, in chess, I think computers are powerful enough to basically compute every
possible move in a game, or close to... no, no, probably not.
They're actually not. They actually rely a lot on heuristics to prune the tree, because
it's an exponentially growing tree. So it's approaching infinite. So computers can't
deal with that, so they do have to use heuristics.
Yes, they do use heuristics and do prune things. Yeah. But let's imagine they could
compute every case. Then basically the computer has perfect information. Right. So whatever
move it's going to try, it's going to work. If it's slightly limited, which in this case
it is, then you can improve the logic by saying, well, this is obviously a bad move. Right.
And you can remove some complexity. You can now rely on those presumptions.
It's going to be a presumption, though. Yeah. Right. Exactly. So when that turns out to
be wrong, you're going to have worse results than expected. But when those are true, then
you get some nice results.
Right. So is that acceptable to have that in an Elm review rule or do you try to avoid
that? To have presumptions? Yeah, to have heuristics. Because if it's a rule, it's telling
you it's an error. There's no way to disable
it. And in some cases you might say, well, actually in this case it's okay. Like a code
smell, like, well, it's a code smell if you have a function that's over a certain number
of lines, but maybe in this particular instance, it's fine.
Yeah. In Elm review, I would, well, in general I would say it depends on the criticality
of the issue and how much you want to enforce it. For instance, the unused CSS classes rule
is basically going to report false positives by saying, yeah, you should use a literal.
But as we said, it's going to be more of a constraint than a false positive, depending
on how you frame it. Right. Yeah. So those opinionated rules are fine if you opt in to
those, I think. You need to have the whole team accept this rule, in my opinion, like all
of the rules. But in general, Elm review doesn't allow ignoring issues. So that's why at
least all of the rules that I wrote tend to lean towards false negatives rather than
false positives. Right. Instead of heuristics.
Using heuristics, like basically using presumptions. I see. It's like: well, I don't know,
so I'm going to take the route that I know will lead to people not getting false positives.
You can view it as a simple heuristic in a way, I think. So basically a heuristic is how
you choose to put some things into the false positive category or into the false negative
category. That heuristic is what determines that. Yeah, I'd say so.
And I think that Elm review really has this stance to go towards false negatives more
than other tools because in those other tools you can disable the errors when you have false
positives. And that also impacts how people write those rules or when they choose to write
and enable those rules. Because I know if I don't have disable comments, I know that
if I report false positives, it's going to be very annoying. And I know that if some
rule that reports like a code smell, which is not always bad when it reports an error
and shouldn't, well, people are going to be blocked. So if I have a way to tell them like,
please write the code this way in order to not have this false positive, then that's
acceptable I think. If I don't, then I'm just not going to write the rule. Right. And not
writing a rule is basically 100% false negatives. Right, right. Although you could argue
that 100% false negatives feels very different from 99% or 1% false negatives, because
you know you just can't rely on something with 100% false negatives, whereas with 1%
false negatives, you don't know whether you can rely on it or not.
But other tools like ESLint have a lot more rules that have the potential for
false positives, and they're considered okay because you can disable them. So I really
think that having the ability to disable errors impacts the way that we choose which rules
to write. Yeah. And as you say, it depends on the criticality of the issue if it is a
constraint that you really depend on for something that you're doing, then it's going to change
the calculus there. Yeah, if it's to report an issue that you know for sure will crash
your application, but it might be wrong, then yeah, it is probably something you want to
enforce at the cost of being a bit annoying sometimes. So people will have to add
disable comments or rewrite the code in a way that the linter will understand that this
is not a problem. But yeah, I haven't found any critical problems like that for Elm Review
so far, I think. So yeah.
So you often mention that code flow analysis is sort of the thing that makes a lot of rules
not worth writing. And I wonder... So here's the original tweet that we were talking about
earlier where you kind of talked about missing information being the root cause. So you said,
missing information is the root cause of false positives slash negatives in linters. Add more
information to find more problems and be less wrong at the same time. How? One, the linter
should provide more information to rule authors. And two, languages should restrict dynamic
features. So one, the linter should provide more information to rule authors. Like what?
Like is there information that Elm Review could provide to rule authors to help them
with code flow analysis in addition to the module lookup table we discussed? Like comparing
references, seeing if something refers to the same value.
Yeah, for instance, having aliases. And I'm definitely thinking about ways to make analysis
easier, which is in a way providing information that would be hard to compute otherwise. Also,
there's just simply plenty of information that you sometimes can't get. Not so much
with Elm Review anymore, but, like, for instance, only recently I added the function to give
you the file path of a module to analyze. Because I thought people might do some weird
things with it. That's something that I was quite scared about, like people misusing the
tool at the beginning. In practice, not so much. So now I make that available and people
do use that for some applications. I don't have any in my head anymore. But so yeah,
give all the information that you can. And then, yeah, make it possible to analyze code
in a simpler way: give type inference, give the real module name, and, yeah, provide
code flow analysis tools. I know that ESLint has something like that, which I never understood.
So I don't know how that would work. I've also thought about being able to figure out
like, is this value an alias to the other function? And that could be interesting. That
could catch more things. Definitely.
For the performance question, I could imagine, I don't know if this would be a fruitful direction
at all, but I could imagine a design where you sort of have, actually very much like
the store pattern that Martin was telling us about in our store pattern episode. Essentially,
you know, the store pattern you have your, I can't remember what he called it now, but
your query of these are the things I depend on for this page. This is the data I need.
You could sort of have that as a sort of subscription that says, this is what I need, which as we
discussed in the store pattern episode, as more information comes online in the store
pattern, it could be getting it with HTTP requests. Then you can do follow-up requests,
because it's sort of a subscription that gets called whenever that data changes. And then it
just keeps going until the information you say you need matches the information that
you have already or is a subset of it.
So I can imagine something like that where you sort of have like a subscription to like,
here's some computationally expensive data I need that you're not just going to go analyze
constantly and then you have these sort of remote data or maybe values or whatever that
you're waiting on. And then you can sort of take all those together once you have them
all filled in and then you can continue your analysis. So that could be really interesting
to like provide some primitives for doing that sort of thing.
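As a purely hypothetical sketch of what such a primitive could look like in Elm (none of these names exist in elm-review today; this is just the shape of the idea):

```elm
-- Hypothetical: an expensive, lazily computed value behaves like
-- remote data — either not there yet, or ready.
type Requested a
    = Pending
    | Ready a



-- A rule could then "subscribe" to, say, the pre-evaluated value of a
-- specific expression node with a combinator along the lines of:
--
--     withPreEvaluated :
--         Node Expression
--         -> (Requested String -> List (Error {}))
--         -> ModuleRuleSchema state context
--         -> ModuleRuleSchema state context
--
-- where the framework memoizes the computation and re-runs the visitor
-- once the value flips from Pending to Ready.
```

The `Node Expression`, `Error {}`, and `ModuleRuleSchema` names mirror elm-review's real types, but `Requested` and `withPreEvaluated` are invented for this sketch.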
I think, the way that I understand it, that's already what Elm Review does to some
extent, because we say, like, I request the module name lookup table, therefore please
compute it. And the framework could do a better job at computing only what is necessary. And
then when it looks at the next file, compute again only what is necessary and so on and
so on. That I definitely want to have. And I think that's kind of the same idea like
this module depends on the lookup table for this module. So whenever you get to the next
module you compute it again for that module, etc.
Yeah, it is a similar pattern. I think the main difference would be in the case of a
module Elm Review knows what module it's looking at. And so it can fill in that bit of context
to say, okay, it's requesting the module lookup table and it's in this module so I can compute
it for this specific module. But if it's something more nuanced, like I want to pre-evaluate this
string, for example, then it doesn't know which strings to pre-evaluate based on some implicit
context of the process it's running. So in that case, that sort of store pattern style
could work where you can give it that information. You can say, hey, here's the node I'm looking
at, and I would like to wait until you can finish analyzing, like pre-computing this
string value, please. And then you wait until it's no longer a Maybe, and then get it back.
And that could allow you to lazily compute and memoize some of these more expensive values
with specific context where the user can say, I want it for this node. So anyway, like seems
like an interesting path to explore.
Yeah, it could be interesting. Yeah. In this case, it would definitely help to be able
to say: please compute this now and store it in the store directly, just by mutation.
That would definitely make things easier.
And I guess it's maybe a little bit of a chicken and egg problem to know which of these things
would open up interesting possibilities because when you offer this information to review
authors, review rule authors, then they do interesting things with it. And then when
they do interesting things with that, it builds and snowballs and it sparks people's imaginations.
And so it's sort of hard to know which ones to explore before you've seen what people
do with them.
Yeah. Yeah. Well, I have my own opinions about things that could be interesting or ideas,
not opinions. But yeah, I've been surprised by what people came up with. For instance,
you made the Elm review HTML to Elm.
Yes, that's right. Yeah.
Based on what Martin Stewart made. Credit to him and to you, obviously, but the idea was
from Martin.
His idea to use Elm review fixes as a code generation tool is 100% credit to him. And
I used a bunch of his code for that.
So yeah, that one I did not expect. And yeah, that's a pretty cool avenue to explore. Definitely.
I also know that some people would like to be able to generate modules to create files
on disk based on the same idea. So like, that could be interesting.
Yeah. So the type information is like the big one on your wish list right now.
Yeah. And also performance for fixes, and performance for Elm review in general, because
in my opinion, it's too slow. But maybe that's just me being a parent to the tool. Like,
ah, at work, it takes like a whole minute to run on our code base, which, like, yeah,
that's too slow. Like, if even I want to go scroll on Twitter while the review is ongoing,
like, it's too long.
Yeah. Right.
But I do wonder like, what kinds of use cases could people come up with if there was more
information? Like, I wonder if some sort of dependent typed kind of techniques could emerge
if people had more tools for doing code flow analysis or, you know, just more information
at their fingertips. Because like what Elm can do with all the information it has about
your code, both because it's a compiler and has computed all this information and because
the constraints of the Elm language, all the things, all the guarantees it has based on
how you have to write your code for it to be valid. There are just so many cool things
that it can do. And if you start like looking at the compiler code, you're thinking of all
these possibilities. Like I know I do with Elm pages. I'm like, oh my God, if I was a
compiler, there are so many cool things I could do with the information I would have.
Yeah. So a compiler is basically a static analysis tool, just like a linter, right?
Right. It's a static analysis tool that the code must pass through in order to run, which
that's basically all it is. It's those two things.
And then it generates some files.
Right. Right. Also that, right.
That is the compiler part, but the rest is very important as well. And the thing is, the compiler
is a general purpose tool, right? So it's only going to be able to infer things that
the language tries to allow and to report things that it doesn't want to allow. But
then if you want to do something more precise that the language was not designed for, you
could potentially do that with a very powerful static analysis tool. So like, I don't know
much about dependent types, but being able to figure out at compile time that some number
is always smaller than five, you could potentially do it by adding constraints, just like a language
with dependent types would do. Maybe, I don't know enough, but you could definitely try
to do that and then report errors like, Hey, I am not smart enough to figure this out.
Please change the way that you work with your code. Kind of like proof languages, which
I think they accept plenty of things, but if it's too hard, then they ask the people
to rewrite their code in a way that they can understand.
Right. Which I mean, in a way, like, yeah, if you say non-empty list, you know, built from
cons or whatever, right? That's like a lazy approach to that, in a way, where you're saying:
I'm not going to do code flow analysis. You must prove it to me by actually passing a single
definite value, and then a list, which could be empty, I don't care. And so you've proven
it. That's like the shortcut to proving that. Or you could do code flow analysis, and you
could say, well, I can analyze your code paths and I can see that you're using this non-empty
type that promises to be non-empty, but maybe not through the compiler, but through Elm
review. And I see this one pinch point that I know this type will always go through, and
it adds something to the list. Therefore, you're good. Like that would be the deluxe version.
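The "shortcut proof" version is just a type whose shape makes emptiness impossible, similar to community non-empty list packages (sketched here from scratch):

```elm
-- A list that provably has at least one element: the first element is
-- stored outside the (possibly empty) tail, so emptiness can't be
-- represented at all.
type NonEmpty a
    = NonEmpty a (List a)


fromCons : a -> List a -> NonEmpty a
fromCons first rest =
    NonEmpty first rest


-- No Maybe needed: head cannot fail, by construction.
head : NonEmpty a -> a
head (NonEmpty first _) =
    first
```

The caller "proves" non-emptiness simply by being forced to supply that first element; no code flow analysis required.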
Yeah. But then some things are very hard to infer, because they use code from dependencies
that we don't have information about. So again, missing information. There is a request
to be able to analyze the code from dependencies before analyzing the project, and I think
that would be very valuable. If you do that, you can basically do whole program analysis,
except for the JavaScript parts. Maybe we would like to be able to analyze CSS and JavaScript
files as well, but I think that's getting a bit out of hand, at the moment at least. It
could be interesting, but maybe it's better to use two tools, like ESLint and Elm review,
and configure them in a way to give you all the same guarantees.
And you can always go the other way too, right? Like if you're wanting to analyze things
like your CSS, you can generate CSS from Elm, and then you have a more constrained place to
analyze it. With guarantees, you can always flip it on its
head. You can say, well, this is too unconstrained and hard to analyze, therefore I'm going
to constrain it. Like, to take something from an unconstrained environment to a constrained
environment is very, very hard. To take something from a constrained environment to an
unconstrained environment is very easy, relatively speaking.
I remember when I rewrote an Elm application to React, that was really easy. Whereas the
opposite would have been way harder, just like basically re implement everything. But
for Elm to React, there was a translation, which is much easier.
To take a lossless audio file and turn it into a compressed one is easy. To take a compressed
audio file or a compressed image and turn it into a lossless one, or to do the CSI
"enhance" thing, is a harder problem.
I don't know if you want to talk about side effects as well. That's interesting, but I
don't know how we are on time.
We could talk a little more and still be in our general time window.
Well, we can extend our episodes to be two hours long. That's fine as well. I mean, we
did have shorter episodes recently, so we need to compensate, right?
One area where you have a lot of false positives or false negatives in a lot of other languages
and other linters is with the presence of side effects. For instance, if we take the
no unused variables rule for Elm, where you say if you have A equals some function call,
and then this value A is never used.
In Elm review, we know, well, this function call has no side effects. We can remove the
entire declaration from the code, and then we can look at whether that function is used
or not used anywhere else.
But in a language with side effects, it's very hard to tell that. We know we can remove
const A equals, we can remove that part, but we don't know if we can remove the function
call because it might have side effects, right?
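Sketching that contrast in Elm: because the expressions involved are pure, a rule like NoUnused.Variables (from the elm-review-unused package) can delete the whole binding, call included:

```elm
total : List Int -> Int
total numbers =
    let
        -- `doubledSum` is never used. Since the expression is pure,
        -- removing the entire binding — including the calls to
        -- List.sum and List.map — cannot change the program's behavior.
        doubledSum =
            List.sum (List.map (\n -> n * 2) numbers)
    in
    List.length numbers
```

In a language with side effects, only the `const doubledSum =` part could be removed safely; the call on the right-hand side might be doing something (logging, network, mutation) that the program depends on.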
And that is going to be true for any language, as far as I know, that is not a pure functional
language, or at least where the function is not annotated in some way as being pure.
So being able to rely on the fact that functions have no side effects, that actually allows
us to do some very cool things, like dead code elimination, a very powerful one,
as we've seen.
I think removing dead code in Elm using Elm review is something that a lot of people love,
and I definitely do. And that is very hard to do if you have side effects. And yeah,
then you got things like moving code around where you have one function call after another
one. And if you want to optimize the code or make it nicer to read, then potentially
you have to invert the order of those function calls. Well, is that safe to do? Well, we
don't know. Unless we have no side effects, then we know we can do it.
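A tiny Elm illustration of why reordering is safe under purity:

```elm
area : Float -> Float -> Float
area width height =
    let
        -- Both bindings are pure, so evaluating `h` before `w` (or
        -- swapping these two definitions) cannot change the result.
        -- If either could log, do I/O, or mutate state, swapping them
        -- could change the program's observable behavior.
        w =
            max 0 width

        h =
            max 0 height
    in
    w * h
```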
So we could still do that analysis. Does this function have a side effect? Does this one
also have a side effect? Do they impact each other? Do they depend on each other? And that's
a lot of work. That's really a big amount of work to do, like a lot of interpretation
and a lot of analysis. And potentially at the end, you still don't know the answer.
So you're still going to have to make a presumption like, yeah, I think this is going to... We
don't know. So we're just going to assume that it has a side effect and that it needs
to stay this way.
Right. Yeah. It's the poison pill. Things can be very easily tainted. And it's the unconstrained
versus constrained environments. And if you can take, as we've talked about in the past,
if you take pure functional Elm code, you can do more complex things under the hood
while preserving those guarantees, like persisting data in Lamdera, for example. So it's pretty
compelling how you can still preserve those guarantees and do more complex things when
you have that purity. For example, you could even imagine doing some of these kind of costly
computations in Elm review. Like instead of doing this sort of Elm store pattern style,
you could imagine doing some sort of hacks under the hood, like a sort of Elm review
compiler that could...
Oh, I have thought about doing that. It's definitely on my mind, but so far I've never
attempted it, because I wanted... For type inference, I think that's going to be slow. Evan
said that it's going to be slow in a language where you don't have mutation. So I'm thinking
about altering that at compile time to make it much faster. We don't have type inference yet,
so I will wait for that to happen.
Interesting. Oh, that's cool. Yeah. Yeah. So I could imagine like...
But I don't know if that will have any surprising effects. That's going to be interesting to
figure out.
Well, it's definitely an ambitious path to go down, but it would open up a lot of interesting
possibilities. But yeah, you could certainly like, I could imagine you saying here's essentially
a magic function that gives you some expensive computational result and under the hood, swap
it out to do some optimizations and make it more efficient and not call it if it's not
needed and that sort of thing.
Yeah, potentially. But yeah, I would definitely not write package code that would depend
on this. It would just be an improvement that people would not notice.
Yeah, exactly.
In terms of performance, under the hood optimization, that's the only way that I would accept doing
something like that.
Yes, I agree. Exactly. Yeah. But as long as you can preserve the semantics and expectations
of how it's going to behave, you can swap it out for however you achieve that under
the hood.
Yeah. But it would be kind of tricky to test, because you could not use Elm test for this.
All of these guarantees that we've talked about, things that we can rely on that makes
analysis easier, it applies to linters, but it also applies to code optimizers. For instance,
elm-optimize-level-2 knows that it can move some functions or some operations around
as long as they don't depend on each other because they know, well, this function has
no side effect, this function has no side effect, so they can move things. They can
do a lot of these things because they know that the compiler wrote the code in a specific
way, that the original code was in a specific shape, that things are valid, that semantics
match, that types match as they were in the code. So using all of these guarantees that the
compiler, the type checker, the language design give you, you can do a lot of powerful
things. But as soon as you're missing one of those, well, some optimization
ideas, some linter rules that you would want to write, they crumble; you can't do
them anymore. Or they require a lot more analysis, which, as we've seen, can be hard. So
yeah, that's the part about what I was saying: languages should remove dynamic features, or
features that are hard to analyze, like side effects and dynamic values. Those are hard, and
therefore, if we can remove those, if we can make them more static, well, that helps static
analysis tools. And that is something that I don't think a lot of other languages value
fully enough, right? I just wish people knew that more.
What I'm taking away from this is basically like move the goalposts. Like instead of trying
to solve a hard problem, define the problem in a way that makes it easier, right? So like
we talked about with static analysis: like, if you think, oh, I have to do all this
code flow analysis to figure out what the class name is, make the problem
easier for yourself by making more assumptions, having more constraints. So you can do that
in a language and you can do that in a static analysis rule and any sort of static analysis
context you can move the goalposts, make the problem easier for yourself.
Yeah. I wrote a blog post called Safe Unsafe Operations in Elm, which is basically
the same idea. The idea is, we want to make something like Regex.fromLiteral, where we can
basically have a function that doesn't return a Maybe Regex, but a Regex, and Elm review
then says, well, this is okay. We know that at compile time this works, because this looks
like a valid regex. So this is fine. And whenever you pass in a dynamic value, we move the
goalposts by saying: please don't write it this way, we don't understand it. And you can
do it that way, or you can make the analysis more complex; both work. But as long as at
some point you can give the guarantee, then everyone's happy. Otherwise, you can fall back
on Regex.fromString, which returns a Maybe Regex.
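The shape of that helper, based on the blog post's idea (`Regex.fromString` and `Regex.never` are the real elm/regex functions; `fromLiteral` is the helper an elm-review rule would watch for):

```elm
module SafeRegex exposing (fromLiteral)

import Regex exposing (Regex)


-- Returns a plain Regex instead of a Maybe Regex. An elm-review rule
-- then guarantees at review time that every call site passes a valid
-- string *literal*, and reports an error for dynamic or invalid input.
fromLiteral : String -> Regex
fromLiteral string =
    Regex.fromString string
        -- Unreachable when the review rule holds: the literal has
        -- already been checked to be a valid regex.
        |> Maybe.withDefault Regex.never
```

The `Maybe` never actually surfaces at runtime; the review rule has moved the check to analysis time.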
Well, are there any other things people should look at? Any, any blog posts, any conference
talks, perhaps soon to be released?
Yeah. So a lot of what I said today was explained, hopefully better than today, in a
talk that I gave at Lambda Days in mid July. It's called Static Analysis Tools Love
Pure FP. I'm pretty sure it's going to be released after this episode, so hopefully I
haven't spoiled too much, some parts of it at least. But I think it was a good talk.
I'm very pleased with it, at least.
I'm excited to watch it. Yeah. Keep an eye on our Twitter account; we'll tweet a link
to it. We'll try to update the show notes too, though they may be immutable in your
podcast client.
Yeah. They often are right.
Yeah, I think so. But yeah, keep an eye on our Twitter. And Jeroen, until next time.
Until next time.