
Performance in Elm

We talk about performance tuning Elm applications.
August 16, 2021
#37

Transcript

[00:00:00]
Hello Jeroen.
[00:00:01]
Hello Dillon.
[00:00:02]
And what are we talking about today?
[00:00:03]
Today we're talking about performance, which we've never talked about before really, I
[00:00:08]
think.
[00:00:09]
Not so much.
[00:00:10]
And so you've been sort of pretty deep in some performance analysis stuff for some Elm
[00:00:20]
Review rules, right?
[00:00:22]
Yeah.
[00:00:23]
So I recently published Elm Review Performance, which has for now only one rule, which is
[00:00:29]
about detecting tail call optimizations or lack thereof.
[00:00:35]
So do you know what tail call optimization is?
[00:00:37]
I do know what tail call optimization is, but it still somewhat confuses me what the
[00:00:47]
actual performance implications are.
[00:00:49]
I understand that if it's not tail call optimized, it's pushing a stack frame for every
[00:00:55]
recursive call, and that tail call optimization can eliminate those stack frames.
[00:01:00]
But I don't understand what are the performance characteristics of like adding that stack
[00:01:06]
frame for a recursive call versus just having the memory as it goes through a while loop
[00:01:13]
or whatever.
[00:01:14]
So that's sort of hard to wrap my brain around.
[00:01:16]
Also, sometimes it's difficult to understand how do you transform something that's not
[00:01:21]
tail call optimized to something that's tail call optimized.
[00:01:24]
Those two things confuse me.
[00:01:26]
Yeah.
[00:01:27]
So maybe let's start with the basics.
[00:01:30]
First of all, what is a recursive function?
[00:01:32]
So recursive function is a function that calls itself.
[00:01:36]
So for instance, list.length is a recursive function.
[00:01:42]
I thought you were going to say a recursive function is a recursive function.
[00:01:46]
And to understand a recursive function, you have to understand what a recursive function
[00:01:49]
is.
[00:01:50]
I totally missed my joke there.
[00:01:52]
Yeah, come on.
[00:01:53]
I'm going to try to understand.
[00:01:55]
No, I usually never do the recursive jokes because when I try to do the recursive jokes,
[00:02:03]
that's when I try to do the recursive jokes and that's when I try to do the recursive.
[00:02:07]
Yeah.
[00:02:08]
You don't want to get too stuck on that.
[00:02:09]
Yeah.
[00:02:10]
And that's where tail call optimization comes in.
[00:02:12]
Actually, it doesn't.
[00:02:13]
Yeah.
[00:02:14]
You can't stop if you don't have a base case.
[00:02:17]
So a recursive function is one that calls itself.
[00:02:20]
So list.length.
[00:02:21]
I actually don't know if it's implemented in JavaScript or not, but let's imagine it
[00:02:25]
is in Elm.
[00:02:26]
So how would you implement that?
[00:02:28]
You would do a case of the list.
[00:02:32]
So case list of, and if it's empty, then return zero.
[00:02:37]
And otherwise you return one plus list.length of the rest.
[00:02:42]
So list.length calls itself.
[00:02:45]
So you would add one for every element.
[00:02:48]
So that is recursive.
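The function Jeroen just described, sketched in JavaScript since that's what Elm compiles to (the names are illustrative, and a JS array with `slice` stands in for Elm's linked lists):

```javascript
// A plain recursive length: the `1 +` runs AFTER the recursive call
// returns, so every call keeps its own stack frame alive until the
// whole list has been walked.
function length(list) {
  if (list.length === 0) {
    return 0; // empty list: base case
  }
  return 1 + length(list.slice(1)); // work remains after the call
}
```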
[00:02:50]
A tail call recursive function is one where it's the same thing, but it's just more optimized.
[00:02:57]
So as you said, what happens when you do a function call in JavaScript or in a browser
[00:03:05]
or an engine?
[00:03:06]
When you do a function call, it basically adds the current position and the new position
[00:03:12]
to a stack, the functions call stack.
[00:03:16]
Yes.
[00:03:17]
I'm not going to use the technical terms because I'm not aware of those.
[00:03:21]
But yeah, basically there's a stack.
[00:03:23]
So that's once you return from a function, you know where in the code the engine needs
[00:03:29]
to go back.
[00:03:31]
Right.
[00:03:32]
It's basically like a go to instruction that tells it at a low level, like when you're
[00:03:40]
done, I'm going to store the return value of calling this function in this memory location
[00:03:46]
and I'm going to, and then jump to this code.
[00:03:49]
So that's like generally what's happening there.
[00:03:52]
Yeah, absolutely.
[00:03:54]
So that is pretty cheap, but it does have a cost.
[00:03:58]
Yeah.
[00:03:59]
Right.
[00:04:00]
So if you're recursing 10 times, then you're probably not going to notice it.
[00:04:05]
Yeah.
[00:04:06]
But if you're doing list.length on a list of size 10 and it's not tail call optimized,
[00:04:11]
that's probably okay.
[00:04:12]
Yeah.
[00:04:13]
But if you're doing a thousand or 10,000 or a hundred thousand calls or lists of that
[00:04:18]
size, then it becomes noticeable.
[00:04:21]
And you also have a different problem, which is that the call stack has a limit.
[00:04:27]
Yep.
[00:04:28]
Which when you go past that, that is called a stack overflow.
[00:04:32]
Right.
[00:04:33]
And when a function can trigger a stack overflow, that's where we talk about stack safety.
[00:04:38]
It's not stack safe because it can trigger a stack overflow if it's called with too large
[00:04:44]
of an input or whatever.
[00:04:46]
So in Elm, we say that everything is safe, but you can still trigger stack overflows.
[00:04:52]
So if you actually want to know whether a recursive function is tail call optimized,
[00:04:57]
what you can do is try to write a test where it recurses more than about 10,000 times.
[00:05:05]
And if that doesn't create a problem, if that doesn't crash, then it's sufficiently optimized.
[00:05:13]
And if it crashes, well, it isn't.
[00:05:15]
And you have a runtime error.
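That check can be sketched in JavaScript (the `recurse` function is made up for illustration; in Elm you would write an elm-test case that calls your function on a large input):

```javascript
// Stack-safety check: recurse a million times and see whether the
// engine throws a stack overflow (a RangeError in V8).
function recurse(n) {
  if (n === 0) return 0;
  return 1 + recurse(n - 1); // not in tail position: frames pile up
}

let stackSafe = true;
try {
  recurse(1000000);
} catch (e) {
  stackSafe = false; // "Maximum call stack size exceeded"
}
```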
[00:05:18]
And you can also run your Elm review performance rule to identify these areas that might not
[00:05:27]
be stack safe, which is really neat.
[00:05:30]
So what is the Elm review rule doing to determine whether something is tail call optimized or not?
[00:05:38]
I remember you saying when you got this working, you'd been thinking about it for a long time.
[00:05:43]
Yeah, almost two years.
[00:05:45]
And then you finally realized how to do it.
[00:05:47]
And you're like, oh, it was actually really simple to implement once I realized what I needed to do.
[00:05:52]
So before I explain that, I think it's nice to explain how a tail call optimized function works.
[00:05:58]
Right.
[00:05:59]
And compared to recursive function.
[00:06:01]
Maybe we should point out that this is an optimization that the Elm compiler just has built in.
[00:06:07]
Yes.
[00:06:08]
That it's turning our code into JavaScript code so it can go ahead and say, oh, you know what?
[00:06:14]
I could rewrite this in a way that doesn't actually do recursive calls under the hood,
[00:06:19]
but I know it's going to give the same result.
[00:06:21]
So that's what we're talking about here.
[00:06:22]
Yeah.
[00:06:23]
So a recursive function, a plain one, when it calls itself, it adds to the call stack.
[00:06:30]
An optimized one doesn't add to the call stack, or less so.
[00:06:35]
More about that later.
[00:06:37]
Right.
[00:06:38]
But it's literally a while loop in the compiled Elm code.
[00:06:41]
Yes.
[00:06:42]
So instead of calling itself, what it does is create a while loop, with the current
[00:06:51]
arguments defined as variables outside of the while loop.
[00:06:54]
And the while loop just updates those arguments to be whatever would be the next arguments,
[00:07:00]
the arguments of the next function call.
[00:07:02]
Right.
[00:07:03]
So list.length would be a while loop with a count defined outside the while loop.
[00:07:10]
And in the while loop, it would increment.
[00:07:14]
And the list to be analyzed would be reduced at every step.
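Roughly the shape the compiler emits, sketched in JavaScript (names and the array stand-in for Elm's linked lists are illustrative):

```javascript
// A tail-recursive length after the compiler's rewrite: the arguments
// become variables that a while loop keeps updating, so no new stack
// frames are ever pushed.
function length(list) {
  let count = 0; // the accumulator, defined outside the loop
  while (list.length !== 0) {
    count = count + 1;    // next value of the count argument
    list = list.slice(1); // next value of the list argument
  }
  return count;
}
```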
[00:07:19]
Is that clear enough?
[00:07:20]
Yes.
[00:07:21]
Okay.
[00:07:22]
So is that able to allocate less memory?
[00:07:26]
Because is it keeping the memory of each scope around for the recursive calls if it's not tail call optimized?
[00:07:34]
So the way I understand it, for every stack frame, you need to allocate a new variable.
[00:07:39]
Right.
[00:07:40]
Or new variables.
[00:07:41]
Yes.
[00:07:42]
Right.
[00:07:43]
So if you don't have to allocate any new ones, then yeah, you have a lot less to allocate.
[00:07:49]
Right.
[00:07:50]
So that also takes time.
[00:07:52]
So this is time that is saved.
[00:07:55]
And you also don't have to push onto the call stack and change the position.
[00:08:00]
Popping off the call stack and everything.
[00:08:02]
Yeah.
[00:08:03]
Yeah.
[00:08:04]
Popping on, popping off.
[00:08:05]
Yeah.
[00:08:06]
Right.
[00:08:07]
Right.
[00:08:08]
So very tiny things, but when you do 10,000 of them, it matters.
[00:08:12]
Right.
[00:08:13]
So, and we should, you know, stepping back a little to like the why here now, and I think
[00:08:19]
this is, I think you and I both feel the same way here that we're not necessarily experts
[00:08:26]
by any means on performance.
[00:08:29]
But one thing that we can say with confidence is that you should measure performance to
[00:08:34]
identify bottlenecks and not assume that number one, don't assume that something needs to
[00:08:41]
be optimized unless you know it needs to be optimized because you actually benchmarked
[00:08:46]
it.
[00:08:47]
Number two, don't assume that a particular change is in fact going to yield better performance
[00:08:52]
because it's going to do surprising things.
[00:08:55]
Also Elm compiles into JavaScript.
[00:08:59]
That's one layer of indirection.
[00:09:00]
So Elm compiles to JavaScript.
[00:09:03]
You don't know exactly what JavaScript it compiles to.
[00:09:06]
You have some sort of vague sense that if you're calling a function in Elm, it's probably going to
[00:09:11]
compile to something that's calling a function in JavaScript, but you don't know exactly what it's
[00:09:16]
compiling to and what the performance characteristics of different code will be.
[00:09:21]
You can look at it though, because the compiled code is pretty readable compared to what you
[00:09:25]
might expect.
[00:09:26]
Yes, you can look at it, but also then that JavaScript code is being run by V8 or whatever
[00:09:33]
JavaScript engine and these JavaScript engines do all sorts of really nuanced optimizations
[00:09:41]
with the just in time compilation.
[00:09:44]
I mean things like tail call optimizations, I don't know if any JavaScript engines have
[00:09:49]
that built in now.
[00:09:51]
I know that Evan added tail call optimizations to Elm because at the time the JavaScript
[00:09:58]
engines didn't have that.
[00:09:59]
Yeah.
[00:10:00]
I think it's there in a few.
[00:10:02]
Maybe only one, but not all of them.
[00:10:05]
Okay, yeah.
[00:10:06]
But the point being that don't try to predict what's going to perform well because all sorts
[00:10:11]
of unexpected things are going to perform very well or very poorly counter to your intuition.
[00:10:17]
So just assume that you don't know whether something is going to perform well or not
[00:10:22]
unless you benchmark it and assume that an optimization you try to make without measuring
[00:10:28]
is not necessarily going to improve things.
[00:10:31]
Yeah, tail call optimization is usually still a good one, but if you need to change how
[00:10:36]
the function works to make it work, then yeah, benchmark it.
[00:10:41]
We'll talk about benchmarking later.
[00:10:44]
Yes.
[00:10:46]
So to wrap up tail call optimization to pop the stack again and take our call frame back
[00:10:52]
up there.
[00:10:53]
So in order to optimize, in order to turn something into a tail call optimized invocation,
[00:11:01]
what do you do to do that?
[00:11:03]
So as you said before, the compiler already does it for you.
[00:11:07]
It optimizes functions for you, but only if the function has a certain shape.
[00:11:12]
Basically it wants the return value to be in specific positions.
[00:11:17]
So the return value can be in several positions.
[00:11:20]
And I say return value, I mean the recursive call.
[00:11:23]
So it needs to be at the end of a branch.
[00:11:26]
So if you do a recursive call inside an if-else, for instance, you could do it
[00:11:31]
in both branches, the if or the else, but you cannot do it in the condition.
[00:11:36]
You can do it in the branches of a case expression.
[00:11:40]
You can do it in the in of a let expression.
[00:11:43]
So let blah, blah, blah in, in there you can do the recursive call.
[00:11:46]
And that's pretty much it actually.
[00:11:48]
You can compose them together, so you can do a recursive call in an if inside of a case
[00:11:54]
branch and that works.
[00:11:56]
And that's it.
[00:11:57]
Like basically it's the recursive call needs to be the last operation that happens.
[00:12:02]
Right.
[00:12:03]
So if you were to do a recursive call plus recursive call, suddenly you're not in the
[00:12:12]
simple case, you know, possibilities that you laid out of case expressions, if expressions,
[00:12:18]
let bindings.
[00:12:20]
It's now a plus expression.
[00:12:21]
A plus expression.
[00:12:22]
Yeah.
[00:12:23]
Yeah.
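The difference between the two shapes, sketched in JavaScript (both functions are made up for illustration):

```javascript
// Tail position versus not. Only `last` can be rewritten as a loop:
// its recursive call is the very last thing that happens.
function last(list) {
  if (list.length === 1) return list[0];
  return last(list.slice(1)); // tail call: nothing left to do after
}

function sum(list) {
  if (list.length === 0) return 0;
  // NOT a tail call: the outer expression is a `+`, and the addition
  // has to run after the recursive call returns.
  return list[0] + sum(list.slice(1));
}
```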
[00:12:24]
So one thing that is, for instance, a bit odd is when you call the function recursively
[00:12:30]
using a pipe, like you do list |> List.length.
[00:12:35]
Right.
[00:12:36]
Oh yeah.
[00:12:37]
List, pipe, List.length.
[00:12:40]
Unfortunately, that is not considered a function call by the Elm compiler.
[00:12:44]
So it doesn't look like what it expects, therefore it doesn't optimize it.
[00:12:48]
So it really is the ifs, the cases, and the lets.
[00:12:52]
Right.
[00:12:53]
Which just happens to be because of the architecture of the Elm compiler and
[00:12:58]
the different passes that it does in compilation.
[00:13:01]
It could see that pipe as a function call, but it doesn't.
[00:13:05]
So that's just a thing to know if you're trying to do a tail call optimization.
[00:13:09]
And again, like you should benchmark before you assume that you need that, especially
[00:13:16]
if it's going to make the code more complex.
[00:13:20]
You shouldn't just assume that everything has to be tail call optimized.
[00:13:24]
But it's a good idea.
[00:13:27]
I mean, it's one area that you could get a runtime exception.
[00:13:31]
Right.
[00:13:32]
So that's a reason to prefer tail call optimizations in itself.
[00:13:36]
Yeah.
[00:13:37]
That's like the biggest reason to do it because the performance gets better, but not that
[00:13:43]
much better.
[00:13:44]
Like a few percent.
[00:13:45]
Right.
[00:13:46]
And if you're, if you're designing a package that is doing, you know, using large data
[00:13:51]
sets in a lot of use cases, then you want to do all these micro optimizations to squeeze
[00:13:57]
a little bit of performance.
[00:13:59]
Of course with, with benchmarking to guide where those performance opportunities are,
[00:14:04]
but for application code, you very much want to avoid runtime exceptions.
[00:14:08]
Yeah.
[00:14:09]
So the Elm compiler does this optimization.
[00:14:12]
I know that other compilers, other languages do other kinds of optimizations, which are
[00:14:16]
more powerful, more useful.
[00:14:19]
I'm not exactly sure what they are, but I've been told that there are.
[00:14:22]
Yeah.
[00:14:23]
And that Elm doesn't actually do general tail call optimization of whole
[00:14:28]
functions.
[00:14:29]
It only optimizes self calls, self recursion, which is only a subset.
[00:14:35]
But yeah.
[00:14:36]
Yeah.
[00:14:37]
I don't know much more than that.
[00:14:38]
So what does my Elm Review rule do?
[00:14:42]
So the only thing it does is check whether you have recursive calls in other places than
[00:14:48]
the ones that I mentioned in if, in case.
[00:14:50]
Right.
[00:14:51]
And stuff like that.
[00:14:52]
And that's all it does.
[00:14:53]
Yeah.
[00:14:54]
And yeah, I've been thinking about this rule for a year and a half, two years, maybe like
[00:14:58]
when I started working on Elm review basically.
[00:15:02]
And I was thinking it was actually a bit complicated, but then when I understood that
[00:15:09]
it would just work this way, it was like, yeah, this is really simple.
[00:15:13]
This is a very easy Elm Review rule to write.
[00:15:17]
I feel like that's a very common process in like, I mean, whether it's API design or any
[00:15:24]
sort of like engineering, like that you think really hard in order to find the simple solution,
[00:15:29]
which then when you tell anyone about it, they're like, oh, that seems pretty simple.
[00:15:33]
You're like, yeah, but it was so hard to figure that out.
[00:15:36]
But I explained it to you in the simple way.
[00:15:40]
It took me years to get to that understanding.
[00:15:44]
It's like the Mark Twain, like forgive me for the long letter, but I didn't have time
[00:15:49]
to write you a short one.
[00:15:51]
Yeah.
[00:15:52]
Okay.
[00:15:53]
So is there anything else people should know about tail call optimizations in Elm?
[00:15:59]
So you asked before, how do you optimize recursive function?
[00:16:04]
So in some cases it's like very simple.
[00:16:08]
Sometimes it's removing a pipe.
[00:16:09]
I've made a pull request to elm-community/list-extra, where they had
[00:16:20]
a function that was not tail call recursive because they used a pipe, like a left pipe.
[00:16:28]
Just removing that pipe and adding parens, and that was it.
[00:16:34]
And that was like a 7% increase in performance.
[00:16:38]
So yeah.
[00:16:39]
That's cool.
[00:16:40]
Okay.
[00:16:41]
Well, that's a good concrete number.
[00:16:42]
Nice.
[00:16:43]
Oh, another thing that I realized after I wrote my blog post and I edited it when I
[00:16:48]
made the announcement was that you have recursive functions, you have non recursive functions
[00:16:53]
and you have partially recursive functions.
[00:16:56]
So it's not actually that the function is tail recursive, it's that each individual call is.
[00:17:06]
So if you do a recursive call inside one of those allowed places, that gets optimized.
[00:17:12]
If you do another one where it's not allowed, well that one won't get optimized, but it
[00:17:16]
doesn't deoptimize the other one.
[00:17:18]
You will have a while loop in one place and a recursive call, a plain one in a place where
[00:17:25]
it was not allowed.
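A sketch, in JavaScript, of what such a partially optimized function could compile to (the function and its shape are made up for illustration):

```javascript
// Counting all elements, recursing into nested arrays. The self call
// in tail position becomes another loop iteration, while the self call
// inside the `+` stays a real recursive call on the stack.
function count(list) {
  let acc = 0;
  while (true) {
    if (list.length === 0) return acc;
    if (Array.isArray(list[0])) {
      acc = acc + count(list[0]); // non-tail call: real recursion
      list = list.slice(1);
    } else {
      acc = acc + 1;
      list = list.slice(1); // tail call in the source: loop again
    }
  }
}
```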
[00:17:26]
So yeah, I didn't know, but you can have partially recursive functions.
[00:17:32]
Partially tail call optimized.
[00:17:33]
Yes.
[00:17:34]
Right.
[00:17:35]
Yeah.
[00:17:36]
That's really cool.
[00:17:37]
I didn't realize that until I read your blog post either, which we will link to in the
[00:17:41]
show notes.
[00:17:42]
Also, Evan wrote a very good article to explain how it works.
[00:17:52]
Not exactly when it gets applied, which I do better in my article, but it does explain
[00:17:57]
the reasoning for it.
[00:18:00]
Yes.
[00:18:01]
And how to transform something from a non tail call optimized form into one that is.
[00:18:09]
Basically you need to sometimes make state explicit in a way that it can be passed down
[00:18:16]
recursively.
[00:18:17]
Yeah.
[00:18:18]
So for instance, how would you optimize list.length?
[00:18:20]
Right.
[00:18:21]
So I think the key is that you need to keep track of the running length
[00:18:31]
that you have so far.
[00:18:33]
Because previously we said list.length is basically...
[00:18:36]
One plus.
[00:18:37]
Plus.
[00:18:38]
Plus operation.
[00:18:39]
Plus operation.
[00:18:40]
Right.
[00:18:41]
Yes.
[00:18:42]
So you would need to add an extra argument to get the length so far.
[00:18:49]
And then you would need a list.length.help function that would start with the length
[00:18:55]
so far as zero.
[00:18:56]
Yeah.
[00:18:57]
Usually you split them up into a function, a public function and a helper function.
[00:19:03]
Right.
[00:19:04]
So the public one bootstraps it with some default value, some initial starting value.
[00:19:10]
And then that's sort of like what the stack frame would have been carrying around, that
[00:19:18]
state of the running length.
[00:19:20]
But you're making that an explicit part of the quote unquote state of this recursive
[00:19:26]
call.
[00:19:27]
So you make that explicit and now you can do list...
[00:19:30]
What would it be then?
[00:19:31]
The length of the rest of the list?
[00:19:35]
And is there a plus one in the function invocation?
[00:19:38]
Or where does that plus one go now?
[00:19:40]
Yeah.
[00:19:41]
So I'm going to say it out loud in code.
[00:19:45]
So you would have length which takes a list and that equals list.helper the list zero.
[00:19:55]
And list.helper would...
[00:19:56]
So it takes the list and the results so far.
[00:20:01]
Yes.
[00:20:02]
So you do a case of on that list.
[00:20:04]
If it's empty, you return the results so far.
[00:20:07]
So the accumulated result.
[00:20:10]
And if it's not empty, then you call the function itself with the rest, but the results so far,
[00:20:17]
you add plus one to that.
[00:20:18]
So you do list.helper rest of list and then in parens results so far plus one.
[00:20:24]
Right.
[00:20:25]
So you can do addition in the arguments.
[00:20:27]
Yes.
[00:20:28]
And that's not an issue.
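What Jeroen just spelled out, sketched in JavaScript (in Elm this would be a public `length` delegating to a `lengthHelp`; names are illustrative):

```javascript
// The accumulator transformation: the public function bootstraps a
// helper with the count-so-far at zero. The helper only ever recurses
// in tail position, because the + happens in the argument, before the
// call, not after it.
function length(list) {
  return lengthHelp(list, 0);
}

function lengthHelp(list, resultSoFar) {
  if (list.length === 0) {
    return resultSoFar; // base case: return the accumulated result
  }
  return lengthHelp(list.slice(1), resultSoFar + 1); // tail call
}
```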
[00:20:29]
But if you turn the recursive function call expression into a plus expression, that deoptimizes
[00:20:37]
that invocation.
[00:20:39]
Yeah.
[00:20:40]
So this is a fairly simple example.
[00:20:44]
Sometimes it's a lot more complex.
[00:20:45]
Like if you need to do recursive...
[00:20:49]
Like when you need to do recursive applications on a tree, like you need to call the recursive
[00:20:57]
function for what's on the left of the tree and what's on the right of the tree.
[00:21:01]
And then you need to combine them together.
[00:21:04]
And that is by definition an operation you do on the result.
[00:21:08]
So that doesn't work.
[00:21:09]
So what you can do is one technique at least is to emulate a stack.
[00:21:14]
So the stack of things that you still need to compute, you make that an argument.
[00:21:19]
Just like the results so far we had before.
[00:21:21]
You make the stack an argument.
[00:21:23]
And maybe also the results so far as a second argument.
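The explicit-stack technique for a tree, sketched in JavaScript (the node shape `{ value, left, right }` with `null` for empty subtrees is an assumption for illustration):

```javascript
// Summing a binary tree tail recursively: instead of two recursive
// calls (left and right) combined with +, keep an explicit stack of
// nodes still to visit plus a running total as arguments.
function sumTree(root) {
  return sumHelp([root], 0);
}

function sumHelp(stack, acc) {
  if (stack.length === 0) return acc; // nothing left to visit
  const node = stack.pop();
  if (node === null) return sumHelp(stack, acc); // empty subtree
  stack.push(node.left, node.right); // remember the children
  return sumHelp(stack, acc + node.value); // tail call
}
```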
[00:21:26]
It feels similar to doing a list fold a little bit.
[00:21:29]
That you have the accumulator and that has all the state that you need.
[00:21:34]
And List.foldl is tail call optimized in that way.
[00:21:39]
The difference is that list.fold, you already know beforehand all the elements of the list
[00:21:45]
that you're looping over.
[00:21:48]
Which, with a recursive function, might not be the case.
[00:21:52]
Yes.
[00:21:53]
Cool.
[00:21:54]
And I think that's a good thing to know when you're writing Elm code.
[00:22:01]
I gave my rant about how you should benchmark before you assume something is a problem.
[00:22:08]
And to me, the way I think about this stuff is you want to be aware of as many of these
[00:22:14]
sorts of things as possible.
[00:22:16]
So that when you need to improve performance, when you know that there's a bottleneck somewhere,
[00:22:25]
you know what to look for.
[00:22:26]
For opportunities for what to optimize.
[00:22:29]
So you can look at something and you realize you have this burned into your brain that
[00:22:35]
if you do recursive call plus recursive call, oh, that's not tail call optimized.
[00:22:41]
Just having these patterns and being aware of them is very helpful, I think.
[00:22:46]
You'd still write it like you just said.
[00:22:50]
Recursive call plus recursive call.
[00:22:53]
And that's fine.
[00:22:55]
In most cases.
[00:22:56]
And then when you really want to optimize, that's when you transform it in a way that
[00:23:00]
is better for performance.
[00:23:02]
But you need to benchmark it.
[00:23:04]
I'm actually very bad at doing that.
[00:23:08]
It's not easy.
[00:23:09]
Yeah.
[00:23:10]
For this one, I would be like, I'm not sure it's going to help performance.
[00:23:14]
But I do care about stack safety.
[00:23:15]
So that's something already.
[00:23:17]
Right.
[00:23:18]
So we should, we should talk about ways to benchmark in Elm.
[00:23:23]
So one of my favorite ways to benchmark is just to run Lighthouse.
[00:23:28]
Well that's true too.
[00:23:30]
I mean, to a certain extent, like, so I think we shouldn't assume that performance is a
[00:23:36]
problem unless we benchmark and see it is.
[00:23:39]
But we shouldn't assume that performance is good without benchmarking it either.
[00:23:44]
For example, most of us are going to be using high-powered machines that are more powerful
[00:23:50]
than most of our users have.
[00:23:52]
Maybe internet connections that are faster than most users have.
[00:23:56]
You know, a lot of users are going to be accessing our sites on mobile devices, which have far
[00:24:00]
slower internet connections.
[00:24:02]
They don't have 32 gigabytes of RAM.
[00:24:05]
They don't have 32 gigabytes of RAM and 32 cores on their machine.
[00:24:09]
So wait, you have 32 cores?
[00:24:11]
No, I don't have 32 cores.
[00:24:13]
I wish.
[00:24:15]
But they, it takes a lot more to even just like run JavaScript at all and you know, do
[00:24:23]
the run the just in time compilation and all of these fancy things that we're running.
[00:24:28]
So running Lighthouse can give you a little bit of a clue because it will throttle, you
[00:24:33]
know, it'll throttle the network to simulate using a mobile device, and I think
[00:24:39]
it also simulates degraded CPU performance for the mobile Lighthouse run.
[00:24:44]
Yeah, I think as well the CPU, I think so.
[00:24:46]
Yeah.
[00:24:47]
Which I have no clue how they do that.
[00:24:49]
Yeah, I know.
[00:24:51]
Just put a sleep between each instruction.
[00:24:55]
You know, I mean, so much effort has gone into these tools, you know, both for like
[00:25:01]
benchmarking and discovering performance issues and for avoiding them with these like very
[00:25:08]
sophisticated optimizations in like these JavaScript engines.
[00:25:12]
But it's really worth using Lighthouse and you know, the Chrome performance tab to analyze
[00:25:20]
these things.
[00:25:21]
So that's really helpful.
[00:25:24]
So, like, pull up the performance tab.
[00:25:27]
You can hit the record button and just run some JavaScript code.
[00:25:32]
If there's something that you suspect might be slow, then you can hit the record button,
[00:25:38]
do that slow operation and then hit the stop button.
[00:25:41]
And then you can see this like bottom up view and you're probably going to find a few readable
[00:25:46]
Elm function names that will point you to some of your bottlenecks in performance.
[00:25:51]
Yeah.
[00:25:52]
So do you have any good resources on how to use that performance tab?
[00:25:56]
Because it is a very nice tool.
[00:26:00]
It has a lot of options, but I'm not sure I would know exactly what everything means.
[00:26:06]
So do you have any good resources on that?
[00:26:08]
Yes.
[00:26:09]
So Ju Liu, is it Ju Liu?
[00:26:12]
Is that how it's pronounced?
[00:26:14]
Arkham.
[00:26:15]
Arkham.
[00:26:16]
We all know Ju.
[00:26:17]
Ju's great.
[00:26:18]
Check out, Ju wrote a blog post called Performant Elm.
[00:26:22]
It's a two part series and it basically walks you through this.
[00:26:25]
And that was actually how I discovered this technique and it like steps you through exactly
[00:26:31]
how to do it.
[00:26:32]
And it's really helpful.
[00:26:33]
So we'll leave a link in the show notes and definitely check that out.
[00:26:37]
It's a lot easier than you would imagine.
[00:26:39]
So it's a great idea to run that and try to look for bottlenecks.
[00:26:44]
Also when you run a Lighthouse audit on your site, it now identifies if you have like long
[00:26:52]
running tasks.
[00:26:54]
So when we're writing Elm code, we're writing
[00:27:04]
something that compiles to JavaScript, which runs in a browser.
[00:27:08]
And so I think it's important to have some general understanding of performance in JavaScript
[00:27:14]
and performance in browsers.
[00:27:16]
And so one really important thing to understand, if you understand one thing about JavaScript
[00:27:23]
performance, it should be that it's single threaded, right?
[00:27:26]
There's one thread of execution.
[00:27:29]
And if you block it, everything grinds to a halt, including scrolling and input events
[00:27:34]
and you're blocking user interaction, which is very frustrating.
[00:27:39]
Which means that you cannot click on buttons.
[00:27:41]
You cannot type anything.
[00:27:43]
Exactly.
[00:27:44]
All of these, even like built in browser animations of clicking buttons and things will just freeze
[00:27:49]
up and lock up.
[00:27:50]
And so there's like the first input delay Lighthouse metric, which helps identify this
[00:27:57]
kind of issue.
[00:27:58]
And this is starting to become the way that Google ranks sites now.
[00:28:03]
This is becoming a core metric that will actually bump you down in the search results if you
[00:28:10]
have issues with these metrics.
[00:28:12]
So it's important for SEO as well.
[00:28:16]
So if you understand one thing about JavaScript performance, it should be that there's a single
[00:28:20]
thread of execution and don't block that thread.
[00:28:23]
If you're trying to get 60 frame per second animations, that means that you have about
[00:28:29]
16 milliseconds to perform a blocking operation.
[00:28:35]
So now if you're performing an HTTP request, that's non blocking.
[00:28:40]
That's another thing about JavaScript that there are these non blocking IO operations
[00:28:45]
like performing HTTP requests.
[00:28:48]
And if your HTTP request takes five seconds, that's okay because you're not blocking the
[00:28:53]
main thread.
[00:28:54]
You're queuing that work and it's going to come back and run your single threaded JavaScript
[00:29:00]
when it's done with the HTTP request.
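The single-threaded, non-blocking model can be seen with a small JavaScript sketch (a zero-delay timer stands in for an HTTP response here):

```javascript
// Nothing blocks: the callback is queued and only runs once the single
// thread has finished all the synchronous work in front of it.
const order = [];

setTimeout(() => {
  order.push("response handled"); // runs later, when the thread is free
}, 0);

order.push("main thread keeps running"); // runs first, synchronously
```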
[00:29:04]
But for the actual processing of things, that's single threaded and you've got 16 seconds
[00:29:11]
if you want 60 frames per second.
[00:29:14]
16 milliseconds.
[00:29:15]
Sorry, 16 milliseconds.
[00:29:17]
Don't take 16 seconds to do anything.
[00:29:19]
That would be really bad.
[00:29:21]
And you've got about 50 milliseconds just in general if you want to not be blocking
[00:29:28]
user interactions and having a clunky experience there.
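Those budgets in numbers, with a crude JavaScript timing sketch (`work` is a hypothetical function; in the browser you'd reach for `performance.now()` and the Performance tab instead):

```javascript
// At 60 frames per second, each frame gets 1000 / 60 ≈ 16.7 ms of main
// thread time; roughly 50 ms of blocking is where interactions start
// to feel clunky.
const frameBudgetMs = 1000 / 60;
const interactionBudgetMs = 50;

// Crudely check whether a piece of work fits inside one frame.
function fitsInFrame(work) {
  const start = Date.now();
  work();
  return Date.now() - start <= frameBudgetMs;
}
```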
[00:29:32]
Yeah.
[00:29:33]
But it's a bit hard to know when you reach that budget.
[00:29:36]
So basically, do as little work as possible.
[00:29:40]
Yeah.
[00:29:41]
Yeah.
[00:29:42]
Try to be as minimal as possible and benchmark, right?
[00:29:47]
Run Lighthouse.
[00:29:48]
If you run Lighthouse, it will tell you if you have long running blocking tasks that
[00:29:52]
take 50 milliseconds or more.
[00:29:54]
And it'll actually point you to where that happens.
[00:29:57]
So that's a great technique.
[00:30:00]
And so another thing, a lot of these performance improvements I find come down to most of the
[00:30:07]
time it's not these micro optimizations.
[00:30:10]
More often it's architectural or algorithmic improvements.
[00:30:14]
If you're doing unnecessary work, try to do less work.
[00:30:17]
If you're holding lots of stuff in memory, try to hold less stuff in memory.
[00:30:21]
If you're constantly transforming things between different data structures, turning
[00:30:26]
something from an Elm list to an array to a dict and back, or if you're doing
[00:30:34]
indexed access into large lists in Elm, or things like that, those are things to look
[00:30:38]
for.
[00:30:39]
I mean, those are usually fine when the collection size is pretty small.
[00:30:47]
You can do loads of those, no issue.
[00:30:50]
But if it's on a list of a thousand elements, you will start noticing it.
[00:30:56]
And especially if you want to reach that 60 FPS magical number, you should avoid doing
[00:31:02]
this in the view function too much or in every update or things that are recurring.
[00:31:08]
That's a great point.
[00:31:11]
Elm's virtual DOM will avoid any unnecessary work as much as it can.
[00:31:17]
If you have not received a new message, you will not update the model.
[00:31:22]
And if you don't update the model, you don't call the view again.
[00:31:27]
If between two frames no message happens, well, you're good.
[00:31:31]
You have nothing to re render.
[00:31:33]
You have no computation to do.
[00:31:34]
But if you have a timer that fires every millisecond...
[00:31:37]
Then you're calling update every millisecond.
[00:31:42]
And you have 16 messages to handle in one frame.
[00:31:47]
So everything needs to be quite fast.
[00:31:50]
I do think that the virtual DOM, or elm/browser, is calling the view only once per
[00:31:57]
animation frame.
[00:32:00]
So the view can be 16 times as slow as the updates.
[00:32:06]
But still, you don't want to do too many messages if you can avoid it.
[00:32:14]
Right.
[00:32:15]
And there's this push and pull of you want to optimize performance and you want your
[00:32:21]
code to be nice to work with.
[00:32:23]
And those two things are sometimes at odds with each other, which is why you really need
[00:32:28]
to benchmark.
[00:32:30]
So, for example, for me one of the most important design principles
[00:32:36]
in an Elm application is to derive state, not store state.
[00:32:41]
If you have state trickling through and being derived, then you don't have places where
[00:32:47]
things can get stale and out of date and bugs that can come from that.
[00:32:52]
Yeah, you want to store source state but not derived state.
[00:32:55]
Well, you want to derive state from the things in the model is what I mean. You know, we're saying
[00:33:02]
the same thing with different words here.
[00:33:04]
But yeah, basically: don't duplicate information that can be derived from your state.
[00:33:10]
Don't store it in multiple places, because that creates places for bugs to happen and for things
[00:33:14]
to go stale.
[00:33:15]
And it just makes Elm better to work with and makes your code less bug prone and easier
[00:33:19]
to maintain.
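As a sketch of that principle (the field names here are made up for illustration): storing a derived value means keeping it in sync by hand, while deriving it on demand cannot go stale.

```elm
-- Storing derived state: itemCount must be updated by hand
-- everywhere `items` changes, so it can go stale.
type alias ModelStoringDerived =
    { items : List String
    , itemCount : Int
    }


-- Deriving it instead: compute it from the source of truth
-- whenever it is needed.
type alias Model =
    { items : List String }


itemCount : Model -> Int
itemCount model =
    List.length model.items
```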
[00:33:20]
So if that isn't giving you performance problems, you don't want to memoize things
[00:33:26]
to avoid computations.
[00:33:29]
That should be your last resort.
[00:33:30]
In some cases, you may need to do that, but you want to avoid it if you can.
[00:33:35]
And this is one place that Html.Lazy can really help out, right? It could
[00:33:41]
take care of memoizing to a certain extent for you.
[00:33:45]
Do you want to introduce what Html.Lazy does?
[00:33:47]
Yeah, sure.
[00:33:48]
So as I said previously, Elm will not call the view function if the model didn't change.
[00:33:54]
So it does that at the root of the application.
[00:33:57]
But you can also kind of re implement that logic just by sprinkling your view code with
[00:34:03]
Html.Lazy.
[00:34:04]
So basically, there's Html.Lazy.lazy, which is a function
[00:34:10]
which takes another function, a view function.
[00:34:12]
So something that returns Html and takes one argument. Two arguments if
[00:34:18]
you use lazy2, three if you use lazy3, etc.
[00:34:22]
And if the arguments to that function did not change, then it will skip the work.
[00:34:27]
It will skip computing the function that you passed to lazy.
[00:34:32]
So that is very nice, because if you have something that will rarely change, like a header, a footer,
[00:34:39]
or another part of the page which is pretty expensive to compute but rarely
[00:34:45]
changes, you just sprinkle lazy in there and you will avoid a lot of unnecessary work.
[00:34:51]
Yeah, because next time Elm renders the whole page, it won't recompute that part of the page.
[00:34:56]
Right?
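A minimal sketch of what that looks like (the view helper and field names are invented here for illustration):

```elm
module Main exposing (view)

import Html exposing (Html, div, header, text)
import Html.Lazy exposing (lazy)


-- A view helper that only depends on the user's name.
viewHeader : String -> Html msg
viewHeader userName =
    header [] [ text ("Welcome, " ++ userName) ]


view : { userName : String, counter : Int } -> Html msg
view model =
    div []
        [ -- lazy skips re-running viewHeader when model.userName
          -- is the same value as on the previous render.
          lazy viewHeader model.userName
        , text (String.fromInt model.counter)
        ]
```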
[00:34:57]
Yeah, you're basically proving to Elm, through the signature... it's almost like you're
[00:35:03]
taking the arguments that go through to some view helper function.
[00:35:07]
But instead of directly calling your view helper function with the arguments it
[00:35:11]
depends on, you're giving these lazy helpers that view helper function, which takes
[00:35:19]
an argument.
[00:35:20]
And then you're passing those actual arguments to lazy.
[00:35:24]
And it can say, oh, if this argument hasn't changed, I'm not going to
[00:35:29]
re-invoke this function, because you've basically proven to Elm that those are the only things
[00:35:34]
that can cause that function to render a different result since they're all pure functions.
[00:35:39]
Yeah.
[00:35:40]
So that is very powerful.
[00:35:41]
And that is very easy to add.
[00:35:43]
Yeah.
[00:35:44]
The only problem is that it is also very easy to mess up, in a way that is very
[00:35:51]
non-Elm-ish.
[00:35:52]
Yeah.
[00:35:53]
So what actually needs to happen for lazy to work well is that every argument needs to
[00:35:58]
have the same reference as in the last call.
[00:36:02]
Right.
[00:36:03]
So what Elm does under the hood is, for every argument that you pass to the function,
[00:36:10]
including the function itself, it does an equality check.
[00:36:13]
So is it the same function as before, using equal equal... or no, triple equals. Triple equals.
[00:36:19]
So if it's the same, then it looks at the next argument.
[00:36:23]
If that's the same, etc., etc.
[00:36:26]
And if one of them is not the same, then it will recompute the function.
[00:36:30]
I see.
[00:36:31]
So for primitives like strings and integers and booleans, and static references like custom
[00:36:38]
type constructors that have no arguments, that will always succeed if it's
[00:36:44]
the same value, like you pass in five and next time you call it and you pass in five,
[00:36:49]
that's fine.
[00:36:50]
But when you create more complex things like records or dictionaries, they actually need
[00:36:57]
to be the same reference.
[00:36:59]
And that's where it gets tricky.
[00:37:01]
I see.
[00:37:02]
So if you compute a value, if you compute a list, if you create one in a view function
[00:37:09]
and you pass that as an argument to the lazy function, then that is a new reference.
[00:37:16]
And that means that it will not be considered the same and therefore the function will be
[00:37:21]
reevaluated.
[00:37:22]
I see.
[00:37:23]
And that is very non Elm ish.
[00:37:26]
Yeah.
[00:37:27]
And can be very confusing.
[00:37:29]
So the worst case is that it's not actually optimized.
[00:37:35]
It's no worse than that, but it's not optimized.
[00:37:38]
And that was what you were trying to do.
[00:37:39]
So you're saying, if I had a list and then I ran a List.map on that, now
[00:37:47]
the reference is different.
[00:37:48]
Yeah.
[00:37:49]
Interesting.
[00:37:50]
But if I did model.myList, then it's just the same reference.
[00:37:55]
So it's just passing it through.
[00:37:57]
Yeah.
[00:37:58]
Unless obviously you change that function.
[00:38:02]
Right.
[00:38:03]
Right.
[00:38:04]
Usually things that come from the model are fine.
[00:38:06]
Okay.
[00:38:07]
Yeah, that makes sense.
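As a sketch of that pitfall (with hypothetical names): an argument computed inline gets a fresh reference on every render, while a model field keeps its reference until update actually replaces it.

```elm
module Example exposing (viewBusted, viewCached)

import Html exposing (Html, li, text, ul)
import Html.Lazy exposing (lazy)


viewItems : List String -> Html msg
viewItems items =
    ul [] (List.map (\item -> li [] [ text item ]) items)


-- Defeats lazy: List.map allocates a new list on every render,
-- so the argument's reference is never the same twice.
viewBusted : { items : List String } -> Html msg
viewBusted model =
    lazy viewItems (List.map String.toUpper model.items)


-- Works: model.items is the same reference until the field
-- itself is replaced in update.
viewCached : { items : List String } -> Html msg
viewCached model =
    lazy viewItems model.items
```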
[00:38:09]
And that's where usually people feel like they need to store derived state in the model
[00:38:14]
is because they want to pass that in as their argument to a lazy function.
[00:38:19]
Yeah.
[00:38:20]
So they do the computation in updates instead of in the view function.
[00:38:25]
Right.
[00:38:26]
So that makes the Elm code a lot less nice, but...
[00:38:30]
Interesting.
[00:38:31]
Sometimes you need to.
[00:38:32]
I wonder if you did something in every update call that said, remove this type of thing
[00:38:41]
from this list.
[00:38:43]
So model.myList equals...
[00:38:47]
And if you do a List.filter on myList to remove something from the list every single time, then that's
[00:38:56]
changing the reference every single time, I'm assuming.
[00:38:59]
We don't normally think about these things in Elm, which is I think why you're saying
[00:39:02]
it doesn't feel very Elm because we don't usually have to think about these things.
[00:39:06]
So you would almost have to avoid updating the reference in that case where the result
[00:39:13]
is the same or compute the result and say if the number of items that it results in
[00:39:19]
is the same, then don't update the reference.
[00:39:23]
So for instance, when you have a module and it uses another module which also has update
[00:39:29]
view and all those, you usually store the model of that in a record.
[00:39:36]
So you have model which has a field subcomponent, which is a model of the subcomponent.
[00:39:45]
So every time a message for that one comes, you will update model and override subcomponent
[00:39:52]
with the new value of subcomponent, even if it didn't change.
[00:39:56]
So that means that subcomponent didn't change, the reference didn't change, but model did.
[00:40:01]
So if you put model as is as an argument to a lazy function, then that one gets recomputed.
[00:40:10]
Right.
[00:40:11]
So in general, with lazy functions, it's probably a good idea to pick off the minimal set of
[00:40:17]
data that you need to pass through to avoid busting the cache, quote, unquote.
[00:40:23]
So you can do lazy Subcomponent.view, and pass it model.subcomponent.
[00:40:33]
That would be very good.
[00:40:35]
Yeah, nice.
[00:40:36]
That's really good to know.
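As a sketch of that shape (the Subcomponent module and field names are invented here), passing only the narrow field keeps the reference stable even when other model fields churn:

```elm
module Page exposing (view)

import Html exposing (Html)
import Html.Lazy exposing (lazy)
import Subcomponent -- hypothetical module with its own Model, Msg, and view


type alias Model =
    { subcomponent : Subcomponent.Model }


type Msg
    = SubcomponentMsg Subcomponent.Msg


-- Passing the whole model would bust the cache whenever *any*
-- field changes, because the outer record gets a new reference
-- on every update. model.subcomponent only changes reference
-- when a subcomponent message actually updates it.
view : Model -> Html Msg
view model =
    Html.map SubcomponentMsg
        (lazy Subcomponent.view model.subcomponent)
```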
[00:40:37]
Yeah.
[00:40:38]
Do you actually know how lazy works under the hood?
[00:40:42]
I don't.
[00:40:43]
Do you?
[00:40:44]
I looked it up because I was curious.
[00:40:46]
I'm guessing it's like in the virtual DOM code.
[00:40:50]
Oh man, you're better than me.
[00:40:53]
Yeah.
[00:40:54]
It is in the virtual DOM code, but where are the values stored?
[00:40:58]
The virtual DOM facts?
[00:41:01]
I don't know.
[00:41:04]
So I always thought like Elm has a global store of lazy functions and their last arguments,
[00:41:15]
which doesn't make sense because sometimes you call the same function with different
[00:41:19]
arguments in the same view, and those would then be very hard to keep track of.
[00:41:28]
And that's absolutely not how it works.
[00:41:30]
It's not a magical global thing.
[00:41:32]
It actually stores all the arguments and the function itself in the virtual DOM.
[00:41:40]
So when it renders a virtual DOM, it knows that this node is something that will be mapped.
[00:41:47]
This node is just plain old HTML.
[00:41:51]
This node is lazy node.
[00:41:53]
Yeah, right.
[00:41:54]
Right.
[00:41:55]
And basically what it does is if this node is lazy, check whether its value has already
[00:42:00]
been computed.
[00:42:01]
If it has a cache in it.
[00:42:03]
If it does, and all of the arguments that are passed, including the function are the
[00:42:08]
same as the one I'm now getting, so nothing changed, then I can just return that value.
[00:42:16]
And I don't have to recompute it again.
[00:42:18]
Right.
[00:42:19]
That makes sense.
[00:42:20]
And otherwise, if something changed or it never computed this before, it will compute it.
[00:42:26]
So it will actually call the lazy function during the diff of the virtual DOM, which
[00:42:33]
is way later than I expected.
[00:42:36]
Whereas for the other ones, it's when the view function gets called.
[00:42:40]
But here it's done only when the diff is happening.
[00:42:44]
Right.
[00:42:45]
Interesting.
[00:42:46]
That's really cool.
[00:42:47]
Yeah.
[00:42:48]
And then it stores that value in the virtual DOM node, the lazy node.
[00:42:52]
And it keeps the whole tree in its model or whatever.
[00:42:59]
Right.
[00:43:00]
So that's a lot cleaner than I expected.
[00:43:02]
Yeah.
[00:43:03]
It's really cool.
[00:43:04]
So do you have a sense of best practices for when and where to use lazy?
[00:43:14]
If there's, like you mentioned, a header or a footer, how much of a difference is that
[00:43:20]
going to make?
[00:43:21]
It's hard to say without benchmarking, right?
[00:43:23]
Yeah.
[00:43:24]
And it's also hard to benchmark that.
[00:43:26]
But do you wait until there's a performance problem to add lazy?
[00:43:30]
Or do you add lazy eagerly or lazily?
[00:43:33]
I guess is what I'm trying to ask.
[00:43:35]
I do it lazily.
[00:43:36]
The only way that it makes sense.
[00:43:41]
Yeah.
[00:43:42]
I rarely use it actually.
[00:43:45]
Maybe because I work mostly with Elm Review where I don't have access to lazy.
[00:43:52]
I use other tricks to do caching.
[00:43:55]
Yeah.
[00:43:56]
Right.
[00:43:57]
The hard way.
[00:43:58]
Yeah.
[00:43:59]
The hard and tricky ways.
[00:44:00]
Yeah.
[00:44:01]
But lazy I rarely use.
[00:44:02]
Where I would use it is more like when I know that I will need to do pretty expensive stuff
[00:44:08]
and things that will rarely change.
[00:44:11]
For instance, if you have a huge list of items and you want to sort them.
[00:44:17]
If the list doesn't change very often, then you can just sprinkle lazy in and then you will
[00:44:22]
only rarely redo the sort in the view.
[00:44:25]
Something that I see quite often on the topic is like, oh, where should I sort this?
[00:44:29]
Should I sort it in the updates?
[00:44:31]
Or should I sort it in the view?
[00:44:33]
If you do it in the view and you lazify it, then it's simpler.
[00:44:39]
It's free caching.
[00:44:40]
It's caching that you can't do incorrectly, which should always win.
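A sketch of that pattern (names invented): the sort stays in the view, the model stays simple, and lazy only re-runs the sort when the list reference changes.

```elm
module SortedList exposing (view)

import Html exposing (Html, li, text, ul)
import Html.Lazy exposing (lazy)


viewSorted : List String -> Html msg
viewSorted items =
    ul [] (List.map (\item -> li [] [ text item ]) (List.sort items))


-- No sorted copy stored in the model; the sort only actually
-- runs when model.items gets a new reference.
view : { model | items : List String } -> Html msg
view model =
    lazy viewSorted model.items
```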
[00:44:44]
Yeah.
[00:44:45]
Because as we know, there are two difficult problems in computer science.
[00:44:50]
Naming, caching, and off by one errors.
[00:44:53]
Yeah.
[00:44:54]
Exactly.
[00:44:55]
Yeah.
[00:44:56]
No.
[00:44:57]
That's the ideal.
[00:44:59]
You want your values to just trickle through your application without any possibility of
[00:45:04]
getting stale.
[00:45:05]
Yeah.
[00:45:06]
We didn't mention it before, but Html.Lazy is very quick, in the sense that the comparison
[00:45:12]
of all the arguments is very quick because it does triple equals.
[00:45:16]
By reference.
[00:45:17]
Yeah.
[00:45:18]
Which is very, very fast.
[00:45:19]
Right.
[00:45:20]
So even if it gets defeated, it's not a huge cost.
[00:45:24]
Yeah.
[00:45:25]
But you want to avoid it getting busted.
[00:45:28]
Right.
[00:45:29]
So there's not much maintenance cost or computational overhead to using lazy in a suboptimal spot.
[00:45:39]
Yeah.
[00:45:40]
I think at most it adds 10 comparisons.
[00:45:42]
Yeah.
[00:45:43]
Right.
[00:45:44]
Very quick comparison checks.
[00:45:46]
Equal.
[00:45:47]
Yeah.
[00:45:48]
That makes sense.
[00:45:49]
So speaking of comparison.
[00:45:51]
Actually it just might be.
[00:45:52]
Yeah.
[00:45:53]
I think it's one comparison with a list of 10 elements maximum.
[00:45:56]
Yeah.
[00:45:57]
That's what it does.
[00:45:58]
Yeah.
[00:45:59]
Very cool.
[00:46:00]
Yeah.
[00:46:01]
So speaking of comparison, Elm's equality is like, it's so nice to work with.
[00:46:05]
And when you go back to other languages that don't use deep equal by default, you're like,
[00:46:13]
why isn't this equal?
[00:46:16]
These are definitely the same.
[00:46:17]
And you're like, oh, that's right.
[00:46:19]
They're comparing references.
[00:46:20]
And you have to try so hard and use these hacks to check for equality properly.
[00:46:27]
And it's so nice to not think about that in Elm.
[00:46:29]
But that is another place to look for potential performance bottlenecks.
[00:46:33]
If you're doing equality over a large set of items, there is a little bit of a
[00:46:39]
performance cost if you're doing that equality a whole lot or over extremely complex data
[00:46:47]
structures, because when you do double equals in Elm, it is doing a deep
[00:46:52]
equal.
[00:46:53]
So that's something to be aware of.
[00:46:54]
So I don't know when I'll be done.
[00:46:56]
I am working on an Elm Review rule to detect broken usages of lazy.
[00:46:59]
That would be amazing.
[00:47:01]
And this is really like one of the root rules that I wanted to make for Elm review.
[00:47:06]
Like before I published it, this is really one.
[00:47:10]
I want this one because this is such a non Elm thing.
[00:47:13]
It should not be a problem for us.
[00:47:16]
And we should have a tool to detect it.
[00:47:18]
And that's why I'm making Elm review rule for it.
[00:47:21]
It's a tricky one.
[00:47:22]
It will probably not get it right all the time, but it should help at least.
[00:47:29]
Because you have to do like you were describing in our recent Elm review episode, you have
[00:47:35]
to do flow analysis type stuff to track where references are changing or coming from.
[00:47:42]
Yeah.
[00:47:43]
And the thing is you don't know what will change the reference.
[00:47:47]
For instance, if you call a function, will it change the reference?
[00:47:51]
identity doesn't change the reference, List.map does.
[00:47:55]
And you basically need to know what every function does.
[00:47:59]
If you really want to get it right... and for functions from dependencies, I don't have access to
[00:48:04]
the source code.
[00:48:05]
I could, but it would be a lot.
[00:48:08]
It would be very expensive to compute that.
[00:48:11]
Interesting.
[00:48:13]
I will try to get it as right as possible and with as few false positives as possible
[00:48:19]
as usual.
[00:48:21]
Very cool.
[00:48:22]
That's amazing.
[00:48:23]
So maybe it's published already by the time you're hearing this.
[00:48:26]
Oh, yeah, that's true.
[00:48:28]
I'm taking some time off today.
[00:48:29]
So don't be surprised if it's not here yet.
[00:48:34]
And I'm being lazy.
[00:48:35]
Yeah, you're either being lazy or you published a lazy package. Lazy, one way or the
[00:48:41]
other.
[00:48:42]
Yeah.
[00:48:43]
So, okay.
[00:48:44]
So other performance things to think about.
[00:48:48]
So I think, you know, again, like I was saying before, like go back to the basics, think
[00:48:55]
about the platform you're running on.
[00:48:56]
You can't think about performance in a vacuum and we're running code that compiles to JavaScript
[00:49:03]
in a browser.
[00:49:04]
So we have to understand, for example, bundle size is pretty important for initial load
[00:49:10]
performance and Elm has dead code elimination.
[00:49:14]
So I think it's important, even though Elm does this for us under the hood, if you understand
[00:49:22]
what it's doing a little bit, then it might help you take advantage of it more easily.
[00:49:29]
So like basically, I think Mario Rogic was describing it recently saying that it's actually
[00:49:36]
not dead code elimination.
[00:49:39]
It's live code inclusion, which I thought was a nice way to describe it.
[00:49:44]
So like what he means by that is that the Elm compiler lazily pulls in code as needed
[00:49:51]
that you reference.
[00:49:52]
So if code doesn't get referenced, the compiler isn't going to reach it and it's not going
[00:49:58]
to pull that in and compile it.
[00:50:00]
So you start with the main, you look at what it uses, you pull those in, look at what they
[00:50:06]
use, et cetera, et cetera.
[00:50:08]
And when you've reached all the functions that were used, you take all those, you put
[00:50:14]
them in the bundle and you forget all the rest.
[00:50:17]
Right.
[00:50:18]
And Elm doesn't care what you import.
[00:50:21]
If you import something and don't use it, although if you're using Elm review unused,
[00:50:26]
then hopefully you don't have unused imports.
[00:50:29]
But even if you do, Elm doesn't care.
[00:50:30]
It cares about the functions that you invoke that are reachable.
[00:50:36]
So Elm does function level dead code elimination.
[00:50:39]
So if you structure things in a certain way, if you touch a giant record and you only use one
[00:50:46]
field in it, you've just pulled that whole giant record into your bundle.
[00:50:51]
Yeah.
[00:50:52]
And I think that's the reason why we tend not to have APIs that expose a record with
[00:50:59]
a toList, fromList, blah, blah, blah.
[00:51:01]
Because if you only use one of those, you still get the whole API.
[00:51:06]
Yes, exactly.
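As a sketch of the difference (this is not the real package's API, just an illustration): exposing each value at the top level lets the compiler include only what you reference, whereas a list or record of everything comes in as one unit.

```elm
module LanguageTag exposing (english, french, german)

-- Each tag is a separate top-level value: referencing `english`
-- pulls in only `english`.


english : String
english =
    "en"


french : String
french =
    "fr"


german : String
german =
    "de"


-- By contrast, exposing something like
--     all : List String
--     all = [ english, french, german, ... ]
-- means touching `all` pulls every tag into the bundle.
```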
[00:51:07]
Yeah, I did that with that in mind in this BCP 47 language tag package I published, which
[00:51:15]
is just basically like a way to use these different language codes with a little bit
[00:51:21]
of type safety, helping you be confident that you're getting the codes correct for languages
[00:51:27]
and countries.
[00:51:28]
And so I don't use any lists or records for it.
[00:51:32]
They're just individual values.
[00:51:34]
And so if there are thousands of codes and you refer to two, then that's what ends up
[00:51:40]
in your bundle.
[00:51:41]
And the fact that Elm has this live code inclusion is that we can put a lot of things in helper
[00:51:49]
packages.
[00:51:50]
Like a very common complaint about the JavaScript ecosystem is about, for instance, lodash,
[00:51:59]
which is very big, but also very useful.
[00:52:02]
But a lot of the functions you will not use in your project, like more than 90%.
[00:52:09]
And we have packages like list, dash extra, dict extra, et cetera.
[00:52:15]
And we can put as many functions as we want in there.
[00:52:18]
It's mostly up to maintenance costs as to what should be in there.
[00:52:23]
But once you add the package and you use one function, well, you only get that one function;
[00:52:29]
you don't include all the rest.
[00:52:31]
And that frees us up as library authors to put in as many useful things that we want
[00:52:38]
without having to care about bundle size.
[00:52:41]
And that is very nice.
[00:52:43]
It's really nice.
[00:52:44]
And you see these lodash sub packages that are splitting out separate categories of lodash
[00:52:51]
functions.
[00:52:52]
And I mean, as a user.
[00:52:55]
Every function.
[00:52:56]
Oh, every single function is that?
[00:52:57]
Oh, my gosh.
[00:52:59]
As a user, you don't want the user to need to worry about that.
[00:53:03]
And that's one of the benefits that comes from Elm's purity is that importing a module
[00:53:11]
doesn't have any side effects, running a function doesn't even have any side effects.
[00:53:15]
So you don't need to worry about all of those complexities when you're doing your dead code
[00:53:22]
elimination or your live code inclusion.
[00:53:24]
You can still have side effects when you call a function if it explodes the call stack,
[00:53:29]
but that's about it.
[00:53:30]
Well, that's fair.
[00:53:32]
But at that point, we know how to solve that one.
[00:53:35]
We know how to solve it.
[00:53:36]
And even if the call stack blows up, then Elm can just relax and be like, all right,
[00:53:40]
my job here is done.
[00:53:46]
And we talked about having an awareness of the performance implications of these different
[00:53:55]
techniques and types of code, just so you know what to be on the lookout for if you're
[00:53:59]
looking to optimize something and to look for red flags.
[00:54:04]
I think it's also valuable to understand a little bit about Elm's data structures.
[00:54:10]
And actually, for most of the Elm data structures, like if you go to the docs for elm/core's Dict,
[00:54:18]
for example, then it tells you the complexity of these operations.
[00:54:24]
So it's O(log n) for insert, remove, and query operations in Dict.
[00:54:30]
I think that's really good to know.
[00:54:33]
What does that mean for people who didn't study computer science?
[00:54:36]
Right.
[00:54:37]
I mean, what it means is that there's a relationship between the performance cost of adding, removing,
[00:54:47]
or looking up in a dictionary and the number of entries in the
[00:54:54]
dictionary.
[00:54:55]
So the more items there are in the dictionary, the longer it's going to take to remove something.
[00:55:01]
If you say I want to remove something and there are a million things, it's going to
[00:55:04]
take longer than if there were 10 things.
[00:55:06]
Now, it's not linearly related to it where it's going to be a million times slower for
[00:55:13]
a million things than removing one item from a dictionary with one thing.
[00:55:19]
But it means that...
[00:55:20]
That would be the case if it was O of N.
[00:55:22]
That would be the case if it was O of N, exactly.
[00:55:24]
But it's O of log N. And the reason it's log N is because the worst case scenario is it's
[00:55:32]
branching down these branches of a tree so it doesn't have to traverse everything.
[00:55:37]
It can intelligently split the work.
[00:55:40]
And so log of N is way smaller than N for numbers like a million.
[00:55:46]
For numbers like five, obviously it's not going to make much of a difference.
[00:55:50]
But it's important to understand that and just be aware of it.
[00:55:53]
If you're dealing with very large input and you're trying to remove things frequently
[00:55:58]
or add things frequently to a dictionary, that's something you should be aware of.
[00:56:02]
Because I often think of key-value map data structures as having constant time lookup and
[00:56:09]
insertion and deletion.
[00:56:11]
So I think it's important to realize that that's not the case for a Dict.
[00:56:16]
In a way, in Elm, it feels a bit more expensive.
[00:56:19]
Like if you do Dict.get, because the code around it is more complex.
[00:56:24]
You need to do a case of, or you need to do a Maybe.map.
[00:56:27]
So it feels a lot more complex than record access or field access in JavaScript, for
[00:56:34]
instance.
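For instance, a Dict lookup is O(log n) and hands you back a Maybe to unwrap (a small made-up example):

```elm
module Scores exposing (scoreFor)

import Dict exposing (Dict)


scores : Dict String Int
scores =
    Dict.fromList [ ( "alice", 3 ), ( "bob", 5 ) ]


-- Dict.get returns a Maybe, so the missing-key case has to be
-- handled explicitly, unlike plain record field access.
scoreFor : String -> Int
scoreFor name =
    case Dict.get name scores of
        Just score ->
            score

        Nothing ->
            0
```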
[00:56:35]
Yeah.
[00:56:36]
So it's pretty interesting in JavaScript.
[00:56:39]
I recently learned that inserting and deleting fields in a JavaScript
[00:56:47]
object deoptimizes all of these just-in-time optimizations that the
[00:56:54]
JavaScript runtime does.
[00:56:55]
Yeah.
[00:56:56]
And also, again, the monomorphic.
[00:56:59]
Yeah.
[00:57:00]
There's this concept of a shape, and it basically calculates where to look for data
[00:57:07]
in an object.
[00:57:09]
But as soon as you change the shape by adding or removing things on an object,
[00:57:15]
it can no longer use that reference to quickly figure out where to look things up.
[00:57:19]
And it has to basically recalculate how to look up memory from an object.
[00:57:24]
And anyway, the performance characteristics of things are surprising.
[00:57:29]
So even a JavaScript object doesn't necessarily just have O(1) lookup time.
[00:57:35]
And a JavaScript map is more optimized for that use case.
[00:57:39]
Yeah.
[00:57:40]
We're going to link to an article called What's Up with Monomorphism?
[00:57:43]
Yeah.
[00:57:44]
Cool.
[00:57:45]
Which is a bit complex, but it explains it well.
[00:57:48]
Yeah.
[00:57:49]
And one thing that is pretty nice that Elm does is that it tries to keep it optimized
[00:57:54]
in this regard.
[00:57:56]
Right.
[00:57:57]
Which is not the case in JavaScript when you write it yourself.
[00:58:02]
Right.
[00:58:03]
Because from Elm 0.18 to 0.19, it actually removed the ability to do a record update
[00:58:10]
that changed the shape.
[00:58:11]
Yeah.
[00:58:12]
So Elm Optimize Level 2 is another really great tool that people should check out.
[00:58:17]
And Matt Griffith and Simon Twopp, is that his last name or is that his handle?
[00:58:23]
Either way, they created this really cool tool.
[00:58:26]
And it's a post processing tool that takes the Elm output and it optimizes it using some
[00:58:34]
clever optimization opportunities that Robin found who did a lot of cool optimizations
[00:58:40]
on some of these core data structures like List and Dict.
[00:58:43]
So anyway, that is a nice little free to use tool.
[00:58:49]
You just run it and your performance improves because it tries to optimize the code better
[00:58:54]
for JavaScript engines.
[00:58:56]
Yeah.
[00:58:57]
It will take a few seconds.
[00:58:58]
Yeah, it usually takes around four seconds for me to run.
[00:59:02]
Oh, yeah.
[00:59:03]
So instead of doing elm make --optimize, you do elm-optimize-level-2 --optimize.
[00:59:09]
It takes a couple seconds longer because it does call elm make --optimize and then does some
[00:59:16]
code transformations.
[00:59:17]
So it's additional work, but it can be very worthwhile.
[00:59:20]
Yeah.
[00:59:21]
But benchmark it.
[00:59:23]
Yeah.
[00:59:24]
And we actually didn't mention Elm Explorations benchmark.
[00:59:28]
Yes.
[00:59:29]
So there's this Elm Explorations package called Benchmark where you can basically create programs
[00:59:36]
or yeah, programs which run functions over and over and over again and compare them.
[00:59:44]
So if you want to know if one function is faster when it's written one way or written
[00:59:50]
another way, then you write them both and then you run a benchmark which compares them
[00:59:57]
and runs them a bunch of times and tells you which one is faster, along with how likely it is
[01:00:04]
that the results are trustworthy.
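A minimal sketch of such a comparison with elm-explorations/benchmark (the two implementations here are just placeholders):

```elm
module Benchmarks exposing (main)

import Benchmark exposing (Benchmark)
import Benchmark.Runner exposing (BenchmarkProgram, program)


main : BenchmarkProgram
main =
    program suite


-- Runs both thunks over and over and reports which is faster,
-- along with a measure of how trustworthy the result is.
suite : Benchmark
suite =
    let
        numbers =
            List.range 0 999
    in
    Benchmark.compare "summing a list"
        "List.foldl (+) 0"
        (\_ -> List.foldl (+) 0 numbers)
        "List.sum"
        (\_ -> List.sum numbers)
```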
[01:00:07]
And it's something that is pretty nice to use.
[01:00:10]
It does make the browser slow, which is annoying.
[01:00:14]
I actually don't even know whether you're supposed to not touch your computer when you're running
[01:00:18]
them.
[01:00:19]
We should mention it makes the browser slow for the little benchmarking app, not for your actual app.
[01:00:24]
Yeah.
[01:00:26]
I will link to a pull request which I made to one of the extra packages, where I made a benchmark for
[01:00:34]
an optimization about tail call optimization, and you can look at the results there.
[01:00:40]
One thing I haven't found a good way to do yet... so, Elm Benchmark is great for comparing
[01:00:49]
two different implementations and for playing around.
[01:00:52]
You can copy an implementation to a parallel module and try some optimizations and figure
[01:01:00]
out if they actually help or hurt.
[01:01:02]
It's really good for that.
[01:01:04]
But for measuring performance over time, it doesn't help you there.
[01:01:08]
I haven't figured out a good process for that because for me, building a markdown parser
[01:01:15]
and some things like this, these are the super performance intensive things where sometimes
[01:01:20]
you do have to sacrifice code maintainability for performance.
[01:01:26]
You need to go for performance because these parts are so performance critical.
[01:01:30]
And sometimes you need to add a feature and that will hurt performance.
[01:01:33]
Right.
[01:01:34]
However you change it.
[01:01:36]
Yeah.
[01:01:37]
You need to go that extra effort in a way that often in regular browser applications
[01:01:43]
you don't need to.
[01:01:44]
I would love to have a nice way to track things over time.
[01:01:48]
I guess I really should use this.
[01:01:51]
I know that, so I think it's WebPagetest, if I'm not mistaken, webpagetest.org.
[01:01:58]
I believe you can set that up.
[01:02:00]
And I think maybe Google has a similar thing that you can use through web.dev or something.
[01:02:06]
But I think that it has the ability to track performance over time.
[01:02:10]
I think there are some Netlify plugins that help with this too and maybe some GitHub actions.
[01:02:16]
I'm not sure.
[01:02:17]
I'll add some links to some things in the show notes.
[01:02:20]
Then you can read them and send the results to Dillon.
[01:02:25]
There you go.
[01:02:26]
Yes, that's right.
[01:02:28]
I really should try this out.
[01:02:30]
But I think like webpagetest.org, I think that's a thing that people do is kind of track
[01:02:35]
their Lighthouse scores over time so you can catch any performance degradation.
[01:02:42]
So that could be a helpful thing.
[01:02:44]
That could potentially be something we could use when using maybe benchmarking things.
[01:02:49]
We could create our own little web page that just exercises something and check its performance
[01:02:54]
over time.
[01:02:55]
That would be very expensive in compute, in CPU time.
[01:03:01]
By the way, so I said when you need to add features, sometimes performance will be degraded.
[01:03:07]
I think I see a lot, especially in the JavaScript ecosystem.
[01:03:13]
There's a new tool which is 10 times as fast as other similar projects.
[01:03:19]
This is React but 10 times faster.
[01:03:22]
When it's new, it's often because they don't have the same feature parity.
[01:03:27]
So like, yeah, you are faster, but you don't handle that one.
[01:03:31]
Are you hinting at esbuild?
[01:03:33]
I am not thinking of esbuild at all.
[01:03:35]
I actually don't have anything in particular in mind.
[01:03:39]
Very often, when you implement the same features that other tools have,
[01:03:46]
you will incur big performance costs.
[01:03:49]
So I'm always very wary about things that say, oh, we're so fast.
[01:03:54]
We're so much faster.
[01:03:55]
But you're also kind of new.
[01:03:58]
So beware, beware.
[01:04:01]
Interesting.
[01:04:02]
But if you have the same parity, feature parity, great, awesome.
[01:04:06]
Go for it.
[01:04:07]
Oh, no, yeah, where I often see it is like for linters.
[01:04:13]
There's RSLint, which is ESLint written in Rust.
[01:04:20]
That's what all the cool kids are doing these days.
[01:04:21]
They're rewriting JavaScript tools in Rust or Go, which is probably a good idea.
[01:04:27]
Which is probably a good idea, yeah.
[01:04:29]
And it's so much faster than ESLint.
[01:04:32]
It's great.
[01:04:33]
But it doesn't have custom built rules.
[01:04:35]
So it doesn't have all the rules built by the community.
[01:04:38]
It doesn't have rules that you can build yourself, which is a huge feature in ESLint.
[01:04:45]
And you can't compare the two, in my opinion, without that.
[01:04:50]
And it will be a problem for them to support custom rules in a compiled tool.
[01:04:56]
So especially in that instance, I'm like, yeah, no, you're fast, sure.
[01:05:02]
But no, you can't say that.
[01:05:05]
Performance is just so good, though.
[01:05:06]
When you have amazing performance, the experience is incomparable.
[01:05:13]
It just feels so much nicer to use it.
[01:05:15]
So performance is a worthy endeavor, but it is also a never ending challenge.
[01:05:23]
But go for that low hanging fruit, sprinkle in some lazy, run Lighthouse, see what happens.
[01:05:31]
Spend your time focusing on the bottlenecks, not doing random optimizations that make your
[01:05:35]
code hard to maintain, but don't actually impact your critical path for performance.
[01:05:41]
Also serve things up with a CDN.
[01:05:44]
Get that time to first byte down, get the first load performance improved.
[01:05:50]
Get all the low hanging fruit that you can.
[01:05:52]
A lot of the performance will come from web techniques and Elm techniques.
[01:05:59]
Probably the biggest ones.
[01:06:00]
Exactly.
[01:06:01]
Yeah.
[01:06:02]
And that's the other reason you always benchmark.
[01:06:06]
If you make something 100 times faster and it's taking a nanosecond, then that's great.
[01:06:15]
But if you made something 1% faster and it was taking two seconds, then that would have
[01:06:23]
paid for itself much better.
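The arithmetic behind that comparison can be sketched directly, using the hypothetical numbers from the discussion:

```javascript
// Absolute time saved matters more than the speedup factor:
// savedSeconds = before - before / speedup
const saved = (before, speedup) => before - before / speedup;

const microWin = saved(1e-9, 100); // 100x faster on a 1 ns operation
const macroWin = saved(2, 1.01);   // 1% faster on a 2 s operation

console.log(microWin); // ~9.9e-10 seconds saved
console.log(macroWin); // ~0.0198 seconds saved, tens of millions of times more
```

So the unglamorous 1% win on the slow path dwarfs the dramatic-sounding 100x win on the fast path, which is exactly why you benchmark the bottleneck first.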
[01:06:25]
So sometimes just adding a little preload directive to preload a font and do that initial
[01:06:32]
handshake, or making sure that you're running Terser or some sort of minifier on your Elm
[01:06:41]
output, can go a long way.
[01:06:42]
We'll share a link to the instructions for how to do that in the show notes, both with
[01:06:47]
Elm Optimize Level 2 and with Vanilla Elm.
[01:06:52]
It's not only about Elm performance.
[01:06:53]
Elm runs in the browser.
[01:06:54]
So you've got to think about Elm performance, but also the platform it's running in.
[01:07:00]
I feel like we need a "The More You Know" sparkling sound whenever we give our public
[01:07:05]
service announcements.
[01:07:07]
All right.
[01:07:10]
Well, I think we've covered performance.
[01:07:13]
All of performance.
[01:07:14]
We also have a whole episode on Lighthouse.
[01:07:19]
That's true.
[01:07:20]
Yes.
[01:07:21]
You forgot about it?
[01:07:22]
My bad.
[01:07:23]
It was a long time ago.
[01:07:25]
Yeah, yeah, that's true.
[01:07:27]
There are a lot of good resources out there on optimizing those details, too.
[01:07:31]
I'll drop a link to a couple of talks about that as well.
[01:07:34]
Jake and Surma, the Google dev rel guys, have some really cool talks where they go
[01:07:41]
into some of these details.
[01:07:42]
So I'll drop a link to a few of those talks.
[01:07:44]
All right.
[01:07:45]
We covered everything now.
[01:07:46]
There you go.
[01:07:47]
Your apps will never have performance issues ever again.
[01:07:51]
Well, until next time.
[01:07:53]
Until next time.