
Optimizing Performance with Robin Hansen

We talk about Robin's work optimizing Elm, and the opportunities that remain for Elm performance improvements.
January 31, 2022
#49

Transcript

[00:00:00]
Hello, Jeroen.
[00:00:01]
Hello, Dillon.
[00:00:02]
And once again, we're back with Robin Hansen.
[00:00:07]
Thanks so much for coming back on, Robin.
[00:00:09]
Thanks for having me on.
[00:00:10]
What's it been, a year?
[00:00:11]
Yeah, I haven't talked to you all year long.
[00:00:13]
It's really exciting to sit down with you.
[00:00:16]
And Jeroen, I have a feeling you're going to be itching to ask a bunch of performance
[00:00:21]
related questions because today we're talking about performance with Robin, the Elm Performance
[00:00:27]
Guy and Jeroen, the guy who's trying to dethrone Robin as the one who's optimized performance
[00:00:34]
most in Elm.
[00:00:35]
Yeah, I'm looking forward to it.
[00:00:37]
I let you win one game and now you're all confident.
[00:00:44]
You will never win again.
[00:00:47]
Never.
[00:00:48]
Yeah, so yeah, this is really exciting.
[00:00:53]
I think it's kind of an exciting time for performance stuff in Elm.
[00:00:57]
I think maybe these things have been happening in back channels right now, but I think we
[00:01:03]
might be seeing some performance improvements in Elm Optimize Level 2, which we've talked
[00:01:08]
about in previous episodes.
[00:01:09]
It's just a sort of post-processor that goes in and tweaks the Elm compiler's output to do
[00:01:15]
some performance optimizations.
[00:01:18]
So I think we've got some exciting stuff coming.
[00:01:19]
So I'm curious, before we get into some of these details about these performance optimizations
[00:01:25]
and everything, you've got a long history of doing performance work in Elm, working
[00:01:30]
on these data structures and benchmarking things.
[00:01:35]
Why do you do it?
[00:01:36]
Like, why do you care about Elm performance?
[00:01:40]
Okay, so there are two answers to this question.
[00:01:44]
The first one is like what people want to hear.
[00:01:47]
And the second answer to this question is the truth.
[00:01:50]
I think what people want to hear is that performance is really, really important.
[00:01:56]
I think the worst thing that can happen to Elm is that someone sits down,
[00:02:03]
writes a production app, and then it's laggy.
[00:02:07]
And for a language with a relatively small following, like Elm, where people might not
[00:02:14]
know how to fix a laggy application, that would be bad for the reputation of the language
[00:02:21]
and further adoption of the language.
[00:02:23]
So performance should not be your primary concern when doing the stuff that Elm is good
[00:02:30]
at.
[00:02:31]
Because most of the time, optimizing for performance is simply not going to matter for the sort
[00:02:36]
of applications that you typically do with Elm.
[00:02:40]
But if you do get a performance problem, I think that would be very bad for Elm.
[00:02:45]
And so I've been working on performance things simply because I don't want people to have
[00:02:51]
a performance problem.
[00:02:52]
Wait, now, is that the truth?
[00:02:55]
Or is that what people want to hear?
[00:02:57]
The truth.
[00:03:01]
That is a true answer.
[00:03:03]
But really what got me into this is fixing performance things or improving performance
[00:03:10]
problems is a relatively simple and fun activity.
[00:03:14]
Because if you do it correctly, no one is going to notice anything.
[00:03:19]
And so you don't have to go through a lot of API design discussions.
[00:03:24]
There's a lot less things to consider.
[00:03:26]
So it's a relatively easy thing to get into.
[00:03:29]
And it's also a relatively easy thing to measure the improvements of.
[00:03:35]
And of course, if you can improve something...
[00:03:38]
And you can probably attest to this, Jeroen.
[00:03:40]
If you make something 10 times faster or 50 times faster, it feels kind of good.
[00:03:46]
Kind of.
[00:03:47]
Slightly.
[00:03:48]
Slightly good.
[00:03:49]
It's a hell of a drive.
[00:03:50]
It's super exciting.
[00:03:54]
So it's fun.
[00:03:56]
But it's also important.
[00:03:59]
I think to avoid that.
[00:04:01]
To avoid people having a bad experience with Elm.
[00:04:04]
Although in most cases, people won't have one.
[00:04:08]
Right.
[00:04:09]
So Elm is a pretty high level language.
[00:04:11]
Like you were describing, if people get painted into a corner and there's a performance issue,
[00:04:17]
they might not have much they can do about it with Elm because it's pretty high level.
[00:04:22]
It doesn't give you a lot of control over expressing low-level things that would affect performance
[00:04:27]
in a way that a language like Rust would, maybe.
[00:04:30]
But at the same time, on the other side of the coin, because it's this high level, very
[00:04:35]
declarative and pure language, does that give you the opportunity to do more with performance
[00:04:42]
because it's more constrained?
[00:04:44]
Both yes and no.
[00:04:45]
Like so in Elm you have...
[00:04:49]
Well, for the HTML library specifically, you have the Html.Lazy namespace, which provides
[00:04:57]
functions that allow you to avoid computation in the cases where nothing has changed.
[00:05:03]
And the reason why that is a good optimization when you can apply it and the reason it works
[00:05:09]
and is very, very fast is because of Elm's purity.
[00:05:15]
So you can do the same things in React, but it requires that you have made sure that everything
[00:05:22]
is pure.
[00:05:24]
And when you do need such an optimization in React, I think you are going to have a
[00:05:30]
problem applying that optimization because things aren't pure by default.
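To make the purity point concrete, here is a minimal JavaScript sketch (not Elm's actual Html.lazy implementation) of reference-equality caching, and why it is only sound when functions are pure and data is never mutated. The `lazy` helper and `view` function are illustrative names.

```javascript
// Sketch: a lazy-style cache that skips recomputation when it sees the
// same argument reference again. Only sound if `view` is pure and the
// model is never mutated.
function lazy(view) {
  let lastArg, lastResult, called = false;
  return function (arg) {
    // Reference equality is enough under immutability: same reference
    // implies same value, so the cached result is still valid.
    if (called && arg === lastArg) return lastResult;
    lastArg = arg;
    lastResult = view(arg);
    called = true;
    return lastResult;
  };
}

let renders = 0;
const view = (model) => { renders++; return `count: ${model.count}`; };
const lazyView = lazy(view);

const model = { count: 1 };
lazyView(model);      // computes
lazyView(model);      // cache hit: same reference, view not called again
console.log(renders); // 1

// In an impure setting this breaks: mutating the model changes the value
// but not the reference, so the stale cached result is returned. This is
// exactly the hazard a React-style memoization has to worry about.
model.count = 2;
console.log(lazyView(model)); // still "count: 1"
```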
[00:05:35]
And so there are definitely certain things which are much, much easier in Elm because
[00:05:41]
of purity.
[00:05:42]
But on the other hand, there are things which are harder because of purity as well.
[00:05:49]
Like a dictionary maybe.
[00:05:51]
Yeah.
[00:05:52]
So that doesn't necessarily mean that data structures can't be faster in a pure language
[00:06:00]
compared to a language which allows you to use mutable data structures.
[00:06:05]
So one example of this is the dictionary implementation in Clojure, the HashMap implementation in
[00:06:14]
Clojure more specifically.
[00:06:16]
It turns out that when reading from a Clojure HashMap, admittedly when you have a HashMap
[00:06:23]
consisting of maybe like five or six million entries.
[00:06:28]
Kind of big.
[00:06:29]
Which you do hardly ever.
[00:06:32]
But in the case you have such a big dictionary, it turns out that Clojure can actually be
[00:06:36]
faster for reading from said dictionary simply because of the tree structure which makes
[00:06:42]
it more cache friendly than your typical mutable HashMap, which is one continuous array.
[00:06:49]
So it can be faster by doing things in a purer way.
[00:06:55]
But you will normally struggle to make it as fast as mutable alternatives because you
[00:07:01]
have to copy a lot of stuff around.
[00:07:05]
Do you need to have immutability under the hood in an immutable language?
[00:07:12]
Because I mean, Richard has been talking a lot about these types of optimizations in
[00:07:18]
this Roc language that he's been developing.
[00:07:21]
We'll link to a talk where he goes into some details on this.
[00:07:24]
But he uses some optimizations under the hood to perform mutation when possible in a way
[00:07:31]
where the user doesn't have the ability to mutate data.
[00:07:35]
But the compiler might see, well, the user won't notice that I've mutated something as
[00:07:40]
far as they're concerned.
[00:07:41]
They have the illusion of immutability.
[00:07:44]
And that's all we need.
[00:07:45]
So like, does that trade off apply to optimizing stuff in Elm?
[00:07:51]
Or for practical reasons, is that not a good approach?
[00:07:55]
Or for philosophical reasons, is that not the desired approach?
[00:07:59]
So if you can do it, then you can definitely get a lot of performance out of that.
[00:08:05]
And Roc has, at least from what I've seen, proven that you can have almost as fast code
[00:08:13]
written in a purely functional language as long as the compiler is able to utilize these
[00:08:19]
tricks under the hood.
[00:08:20]
And it's important to say that we don't really care about things actually being pure under
[00:08:26]
the hood, as long as you have the illusion of that being the case.
[00:08:30]
But currently in Elm, I don't think we make use of such optimizations.
[00:08:37]
No, that's kind of what I'm researching at the moment.
[00:08:41]
Like some of the optimizations that Roc does are kind of what I'm looking at at the moment.
[00:08:47]
There are some good results, but it's also limited in what you can and cannot
[00:08:51]
optimize.
[00:08:53]
And I think that Roc has much more solid foundations to do it at the moment.
[00:08:59]
Yeah, when it's baked into the core of what the compiler is attempting to do, then the
[00:09:03]
compiler can track information around where a mutation happens and optimize for that.
[00:09:10]
But another very important aspect is that Roc doesn't have to compile to JavaScript.
[00:09:17]
And so it has a lot more control over what it can and cannot do.
[00:09:22]
For good and bad, you know, compiling to JavaScript is a lot easier.
[00:09:28]
But you lose some control along the way.
[00:09:31]
One thing that I'm thinking of which Elm does do and which most functional languages do
[00:09:36]
is tail call optimization.
[00:09:39]
Now tail call optimization isn't done first and foremost for performance.
[00:09:44]
It's done for safety.
[00:09:46]
So for those who don't know, tail call optimization is when you have a recursive function call
[00:09:52]
where the result of the recursive call will be the result of the calling
[00:09:59]
function. That call will not actually be compiled down to a function calling itself
[00:10:06]
over and over.
[00:10:07]
It will be compiled down to a while loop.
[00:10:10]
And that is to avoid adding elements to the stack and eventually causing a stack overflow
[00:10:15]
exception.
[00:10:16]
That's the main use of it.
[00:10:18]
But because you avoid a lot of function calls, you also increase performance a lot.
[00:10:23]
So that's a case where the language only allows you to use functions and functions calling
[00:10:29]
functions.
[00:10:30]
But as long as we keep the illusion that that is what is happening, we don't really care
[00:10:35]
about how it's compiled.
[00:10:36]
And so compiling it down to a while loop is perfectly fine and faster and safer.
[00:10:41]
Yeah, so while loop plus mutations as well.
[00:10:44]
Otherwise it doesn't make much sense.
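As a concrete illustration of the transformation just described, here is a hedged JavaScript sketch of a tail-recursive function and the while-loop-plus-mutation form a compiler could turn it into (function names are illustrative, not actual compiler output):

```javascript
// A tail-recursive function: the recursive call is in tail position,
// so its result IS the result of the calling function.
function sumTo(n, acc) {
  if (n === 0) return acc;
  return sumTo(n - 1, acc + n);
}

// What tail call optimization conceptually compiles it to: a while loop
// that mutates its own parameters, so no stack frames pile up.
function sumToLoop(n, acc) {
  while (true) {
    if (n === 0) return acc;
    const nextN = n - 1;
    const nextAcc = acc + n;
    n = nextN;   // the mutation hidden behind the illusion of recursion
    acc = nextAcc;
  }
}

console.log(sumToLoop(100000, 0)); // 5000050000, with no stack overflow
// sumTo(100000, 0) would likely throw a RangeError in engines that do
// not perform tail call optimization, since each call adds a stack frame.
```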
[00:10:47]
Yeah, Jeroen, I think you've been trying to make more opportunities for tail call recursion
[00:10:53]
so that the Elm compiler isn't as limited in where it can apply that optimization, right?
[00:10:59]
Exactly.
[00:11:00]
Yeah.
[00:11:01]
And very promising results so far, but that's all I will say at the moment.
[00:11:04]
So that's sort of like, when I think about all this performance stuff, one of the things
[00:11:08]
that I think about is this idea of a compiler.
[00:11:12]
So like, for example, Svelte and the creator of Svelte, Rich Harris, talks a lot about
[00:11:17]
this idea of, he talks about a compiler for JavaScript and for JavaScript front end apps.
[00:11:24]
And the way he talks about it, he says, hey, we've got, instead of just writing interpreted
[00:11:30]
code, what if we had something that could be more intelligent and could understand how
[00:11:35]
to help us do what we're trying to achieve by understanding things better?
[00:11:41]
That's kind of how he talks about a compiler.
[00:11:44]
In Elm, it's almost like water to a fish.
[00:11:48]
Compiler is just such a ubiquitous concept in Elm that we almost don't think of it.
[00:11:55]
But what can the compiler do, knowing what it knows, to make our job easier?
[00:12:00]
So ideally, we shouldn't have to know this particular way of writing something is more
[00:12:08]
efficient than this other way, because the compiler can deduce that, especially with
[00:12:12]
like a pure language.
[00:12:14]
And so I find this to be like one of the really interesting things in Elm in particular is
[00:12:22]
how sophisticated can we get with the work that the compiler can take on to optimize
[00:12:27]
things intelligently for us?
[00:12:29]
That's a very good point.
[00:12:30]
And there are a bunch of things that the Elm compiler can do, knowing the semantics of
[00:12:35]
the language.
[00:12:36]
So currently, if you do a simple operation like checking two objects for equality: say
[00:12:43]
you were to do a value-based comparison, a value-based equality check of two objects
[00:12:49]
in JavaScript. That would be hard, I guess, to get something that works fast and is safe
[00:12:57]
from a stack overflow perspective, because doing that isn't baked into the language.
[00:13:04]
You'd have to write code making sure that all the contents of two objects are in fact exactly the same.
[00:13:09]
It also has to be unambiguous, like, do you check the prototype of the object?
[00:13:14]
Yes, exactly.
[00:13:15]
And so that is actually surprisingly difficult in JavaScript to get that working 100% of
[00:13:20]
every single case.
[00:13:22]
In Elm, it's very simple.
[00:13:24]
First of all, because it's baked in, but also because of not allowing mutation, the implementation
[00:13:31]
of equality checking can actually be a shallow comparison, because you know that two objects
[00:13:37]
who have the same identity are also equal.
[00:13:39]
And so you can skip a lot of the work necessary to check two objects for equality.
[00:13:45]
And so having a compiler that understands or which lays certain restrictions on how
[00:13:50]
you write code can in fact make certain things a lot easier and more performant when compiled
[00:13:59]
too.
[00:14:00]
So if you look at the output of the Elm compiler, the JavaScript it produces, if you look at
[00:14:06]
how equality is implemented, if you were to hand that over to a JavaScript developer and
[00:14:11]
say, does this perform a deep equality check?
[00:14:16]
They would say no, there are tons of issues with this.
[00:14:19]
But in the context of Elm, it works just fine, because it can rely on the fact that mutation
[00:14:25]
doesn't happen and these sorts of things.
[00:14:28]
Like having the same identity, if you have two objects with the same identity, that doesn't
[00:14:32]
necessarily mean that the object hasn't changed.
[00:14:35]
But in Elm that is in fact true.
[00:14:38]
So there's a bunch of stuff you can do knowing all the restrictions that Elm places on you.
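Here is an illustrative JavaScript sketch (not the actual compiled Elm code) of structural equality with the identity fast path described above; under guaranteed immutability, the `a === b` shortcut is sound because two values with the same identity can never have diverged:

```javascript
// Sketch of Elm-style structural equality. As plain JavaScript this is
// deliberately incomplete (prototypes, cycles, etc. are ignored) — which
// is exactly the point: it only works given Elm's guarantees.
function eq(a, b) {
  if (a === b) return true; // identity implies equality under immutability
  if (typeof a !== 'object' || a === null || b === null) return a === b;
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every((k) => eq(a[k], b[k])); // recurse into each field
}

const x = { name: 'elm', version: { major: 0, minor: 19 } };
const y = { name: 'elm', version: { major: 0, minor: 19 } };
console.log(eq(x, x)); // true, via the cheap reference fast path
console.log(eq(x, y)); // true, via the deep walk
```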
[00:14:43]
Yeah, it's really interesting.
[00:14:45]
This blog post series you wrote about successes and failures in optimizing Elm's runtime performance,
[00:14:52]
which we'll link to in the show notes, you talk a lot about essentially how there are
[00:14:57]
all these optimizations baked into v8, which really it's sort of like a heuristics based
[00:15:02]
optimization, right?
[00:15:04]
Because JavaScript is an interpreted language.
[00:15:08]
And then you have this sort of just in time compiler, which applies heuristics, which
[00:15:13]
can then get deoptimized.
[00:15:15]
That's why they're heuristics, because it's interpreting things as it goes and saying,
[00:15:19]
oh, hey, this will probably make it perform better.
[00:15:22]
And then it assumes that the shape of an object has these fields.
[00:15:27]
And then suddenly, boom, now there's a null in there that it didn't expect, or now something
[00:15:32]
is a string that was an int elsewhere.
[00:15:34]
And now it deoptimizes.
[00:15:35]
So it's doing these heuristics.
[00:15:37]
And as somebody doing these performance tunings in Elm compiler output, you're doing this
[00:15:46]
strange work of sort of trying to understand those heuristics and trying to activate the
[00:15:53]
heuristics in a way that they can predict Elm's behavior.
[00:15:56]
But you're not predicting it.
[00:15:58]
You know it because it's statically compiled code.
[00:16:02]
But you're trying to get this like just in time optimization to kick in in those places.
[00:16:08]
So it's like a weird it's a weird dance, isn't it?
[00:16:12]
Yeah, so like and that really boils down to the fact that the just in time compiler understands
[00:16:18]
JavaScript very, very well and has to account for all the sort of stuff you can do in
[00:16:23]
JavaScript.
[00:16:24]
And there are certain things that you can't do in Elm and certain things you can do, which
[00:16:28]
the JavaScript just in time compiler naturally has no knowledge about.
[00:16:32]
So really a lot of the stuff that I've done with this performance work is: Elm makes
[00:16:39]
it so that these things are always true.
[00:16:43]
How can I tell that to the JavaScript just in time compiler?
[00:16:47]
How can I make a JavaScript engine understand these things?
[00:16:51]
And that is sometimes very hard.
[00:16:57]
I actually have no clue how you would do that.
[00:16:59]
Is it just that you transform the code to something that is relatively simple, or
[00:17:04]
something like that?
[00:17:05]
Yeah, so one thing that the Elm compiler does today, which was
[00:17:13]
originally done to reduce asset size, but which has a very cool performance benefit,
[00:17:20]
is that when it compiles your entire Elm project into JavaScript, it compiles all
[00:17:26]
your Elm code and all the dependencies and the core library, the runtime, everything
[00:17:31]
into one single namespace.
[00:17:34]
And when you call functions, even if you don't run this through elm-optimize-level-2, if
[00:17:39]
you call single-arity functions, then there are two things that come out of this.
[00:17:46]
One is that you can see the function in scope.
[00:17:49]
And so it knows that the function cannot be null because it's right there.
[00:17:54]
And second of all, it knows that it's actually a function and not some crazy evaluated thing
[00:18:00]
that evaluated to a function.
[00:18:01]
So by having functions in the same scope, and readily available, the JavaScript engine
[00:18:08]
can infer a surprising amount of things about that function.
[00:18:11]
It doesn't have to look it up in the window or global, for instance, that would have a
[00:18:17]
performance cost.
[00:18:18]
And so the natural way to do namespacing in Elm is to create an object with certain fields
[00:18:24]
and that those fields point to functions, say.
[00:18:28]
But in Elm, you're just referencing a local function, a function that resides within local
[00:18:32]
scope.
[00:18:33]
So you know it's a function, you know it's not null, you don't have to look it up in an
[00:18:38]
object, which means you don't have to check is this an object, is the object referencing
[00:18:44]
actually there?
[00:18:45]
And if that property exists, is it null, right?
[00:18:48]
So there are a bunch of things that the compiler just doesn't have to deal with, because it
[00:18:52]
can see the function in the local scope.
[00:18:55]
And V8 understands that it makes it run faster than if you had to go through objects with
[00:19:02]
a lookup, for instance.
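A small JavaScript sketch of the contrast being described, with illustrative names (the mangled-style name only mimics the flavor of compiled Elm output, it is not real compiler output):

```javascript
// Style 1: a namespace object. At every call site the engine must load
// the object, check it isn't null, load the property, and check that the
// property is actually a callable function.
const MyModule = {
  add: function (a, b) { return a + b; }
};
console.log(MyModule.add(1, 2)); // 3 — property lookup on every call

// Style 2: everything compiled into one flat scope, the way the Elm
// compiler does it. The engine sees a plain local binding that is
// statically a function and can never be null or undefined.
function $author$project$Main$add(a, b) { // flat, mangled-style name
  return a + b;
}
console.log($author$project$Main$add(1, 2)); // 3 — direct call, no lookup
```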
[00:19:04]
Exactly.
[00:19:05]
So one thing that I've seen when asking the V8 engine to just tell me, what are the
[00:19:13]
steps you go through to, like, how do you optimize this plain regular JavaScript function
[00:19:18]
into assembly, then every time you do like an object lookup, it will produce this check,
[00:19:25]
which checks is this thing that I got from this object null.
[00:19:30]
And that will always happen because in JavaScript, you can always go into a REPL and then add
[00:19:36]
stuff which can change.
[00:19:38]
And so even though the just in time compiler can be reasonably certain at some point that
[00:19:43]
this thing isn't null, that doesn't mean it cannot be null later.
[00:19:47]
So it always has to like defensively add a bunch of checks.
[00:19:50]
Yeah.
[00:19:51]
And that's kind of annoying because we have all those guarantees and then we still have
[00:19:56]
to re-prove it again, kind of like going through paperwork for administration.
[00:20:02]
You have to send a sign up form, send it over and then do the same one again for another
[00:20:09]
service or something.
[00:20:11]
Yeah.
[00:20:12]
Yeah.
[00:20:13]
It is very bureaucratic, isn't it?
[00:20:14]
In a way.
[00:20:15]
At least it's faster.
[00:20:19]
Yeah.
[00:20:20]
So that makes me think about WebAssembly.
[00:20:23]
And of course, I mean, I think that WebAssembly can seem like maybe a silver bullet that
[00:20:30]
solves all the performance issues, right?
[00:20:33]
In people's minds.
[00:20:34]
And that's not necessarily true. It's not as simple as that.
[00:20:39]
What is WebAssembly?
[00:20:40]
What is this WebAssembly that you're talking about?
[00:20:43]
Should we define WebAssembly?
[00:20:45]
Yeah.
[00:20:46]
So it is essentially, correct me if I'm wrong, but it is something that gives you lower level
[00:20:53]
control rather than this like interpreted language of JavaScript that can run natively
[00:20:59]
in the browsers.
[00:21:00]
It's something that can be executed natively in the browsers.
[00:21:03]
It actually has, it's actually typed.
[00:21:05]
So you write these sort of essentially byte code instructions, right?
[00:21:10]
And you can have it as a compile target so you can compile Rust or whatever languages
[00:21:14]
to that compile target.
[00:21:16]
And it gives you more low level control over memory management.
[00:21:20]
It doesn't come with built in garbage collection, things like that.
[00:21:23]
But it gives you more nuanced control over performance and doesn't rely as much on these
[00:21:29]
heuristics for just in time optimizations.
[00:21:32]
Is that a fair summary?
[00:21:34]
It's pretty correct.
[00:21:37]
So it's a very low-level language, in the same way as Java bytecode and .NET bytecode.
[00:21:47]
In fact, it's very similar to those sort of things, which most developers don't look at
[00:21:52]
at all.
[00:21:53]
But the big difference between WebAssembly and something like Java bytecode
[00:21:59]
is that there are way fewer instructions and there are way fewer built in things.
[00:22:06]
Like it doesn't have a garbage collector.
[00:22:08]
That is one thing it just doesn't have.
[00:22:10]
It doesn't have strings or any sort of data structure.
[00:22:14]
All you get is this one huge continuous array, some instructions to look into that.
[00:22:22]
And you get functions and you get four types.
[00:22:26]
Five if you include functions.
[00:22:28]
And those four types are 32 bit integer, 64 bit integer, 32 bit float, and 64 bit float.
[00:22:35]
And that's really all you have to work with.
[00:22:39]
Regarding the point that people think that WebAssembly will come in and solve all our
[00:22:42]
performance problems, that's not really true.
[00:22:46]
Like if you have a compiler that spits out JavaScript that is very easy to optimize, and you
[00:22:53]
have a compiler that compiles into very performant WebAssembly, you can probably expect about
[00:23:00]
the same performance.
[00:23:02]
However, the thing about WebAssembly is that, since it's not JavaScript and
[00:23:07]
since you don't have to do a lot of crazy stuff to get good performance in WebAssembly,
[00:23:13]
there is no guesswork involved.
[00:23:15]
The compiler doesn't have to guess how do I compile this in the most optimal way.
[00:23:20]
It simply just, okay, these byte codes can be compiled directly into this.
[00:23:24]
And so it's much faster to compile and it doesn't have to guess how this should be compiled,
[00:23:30]
which means it doesn't get a lot of stuff wrong.
[00:23:35]
And the result of that is that you can expect to a much higher degree what the performance
[00:23:41]
of compiled WebAssembly will be compared to JavaScript.
[00:23:45]
Because in JavaScript, everything depends on what happens at runtime.
[00:23:49]
So if you have a very simple program: all it does is take an array of a thousand
[00:23:57]
elements and call the plus operation on them.
[00:24:02]
It's a very simple thing to write in JavaScript.
[00:24:05]
It's relatively simple to write in WebAssembly.
[00:24:07]
In WebAssembly, if that array contains integers or if it contains strings, it will be pretty
[00:24:16]
much the same performance if you implement it to support both.
[00:24:19]
You will get the same performance every time.
[00:24:22]
In JavaScript, if the just in time compiler only sees arrays of integers, you will get
[00:24:28]
very good performance.
[00:24:29]
But if it sees sometimes an array of integers and sometimes an array of strings, then you
[00:24:34]
will get worse performance than if it only sees integers.
[00:24:38]
It can't specialize the code as well.
[00:24:40]
So in WebAssembly, you can write code where I expect it to have this performance profile
[00:24:45]
and it will pretty much always have that.
[00:24:47]
Whereas in JavaScript, it all depends on what the just in time compiler sees when the program
[00:24:52]
is running.
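A small JavaScript sketch of the monomorphic-versus-polymorphic point; the results are identical either way, and the JIT-level difference is invisible in the output, which is exactly why it is hard to reason about:

```javascript
// The same reduce-with-plus function, fed arrays of one type vs mixed
// call sites. Results are always correct; the performance difference
// comes from what the just-in-time compiler observes at runtime.
function combineAll(xs) {
  let acc = xs[0];
  for (let i = 1; i < xs.length; i++) {
    acc = acc + xs[i]; // `+` means number addition OR string concatenation
  }
  return acc;
}

console.log(combineAll([1, 2, 3, 4]));    // 10
console.log(combineAll(['a', 'b', 'c'])); // "abc"
// If an engine like V8 only ever sees integer arrays here, it can
// specialize this loop to integer addition. Once it also sees strings,
// the compiled code must stay generic (or get deoptimized). WebAssembly
// has no such dependence on what happens at runtime.
```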
[00:24:53]
Yeah.
[00:24:54]
And you also remove all those checks that we mentioned like, is this indeed an integer?
[00:24:58]
Is this indeed a string?
[00:25:00]
Those won't have to be done in WebAssembly, but they're done under the hood in JavaScript
[00:25:04]
all the time.
[00:25:05]
That's true.
[00:25:06]
But so there was this blog post and I don't remember the name of it.
[00:25:12]
I can try to find out later and maybe we can add it to show notes.
[00:25:16]
But there was this blog post where somebody wrote, I think it was the Firefox team, which
[00:25:23]
rewrote the PDF reader, I think.
[00:25:27]
They rewrote it in WebAssembly and said, look, it's a hundred times faster or something because
[00:25:32]
the previous version was in JavaScript.
[00:25:34]
Well, that's promising.
[00:25:35]
Yeah.
[00:25:36]
A PDF viewer it was.
[00:25:38]
Yeah.
[00:25:39]
So it was the built in PDF viewer in Firefox.
[00:25:41]
They rewrote to WebAssembly and it was 50, 100 times faster, something along those lines.
[00:25:47]
And then there was a followup blog post to that where someone just changed the JavaScript
[00:25:55]
version and they got about the same performance.
[00:26:00]
But the thing is, so if you compile to WebAssembly, it is much easier for you to create WebAssembly,
[00:26:07]
which will give you the best performance.
[00:26:10]
Whereas in JavaScript, you have to not only know JavaScript very well, but you have to
[00:26:14]
know how the different JavaScript engines compile optimal code.
[00:26:19]
And so it's much harder to create JavaScript that compiles and optimizes
[00:26:25]
as well as WebAssembly, in theory, I guess.
[00:26:29]
So one thing that is pretty tricky with compiling to JavaScript and expecting good performance
[00:26:35]
is that you need to compare it to multiple implementations of engines.
[00:26:40]
So you need to run benchmarks on Chrome, on Firefox, on Safari, and they have very different
[00:26:47]
engines and therefore have very different results on benchmarks.
[00:26:51]
So if you change some code, sometimes you will have better performance on Chrome and
[00:26:56]
worse performance on Safari, for instance.
[00:27:00]
Would that also be the case with WebAssembly?
[00:27:03]
Would each browser have their own implementation of WebAssembly?
[00:27:07]
Well yes, they will.
[00:27:09]
But at the same time, there are only so many ways of compiling a WebAssembly program because
[00:27:14]
there are very few bytecodes and there are very few data structures.
[00:27:18]
And essentially, there aren't many ways that a single bytecode instruction can be compiled.
[00:27:25]
And so if you compile WebAssembly a specific way, you are likely to get the
[00:27:31]
best possible performance for that code.
[00:27:35]
And of course, the Firefox WebAssembly compiler could be a worse compiler than the Chrome
[00:27:41]
one, but at the very least, you're not relying on how good the compiler is at guessing how
[00:27:48]
it should optimize the code.
[00:27:50]
I'm guessing that will be true for the beginning, but maybe not later.
[00:27:55]
For instance, I'm guessing V8, or actually the engines for the different browsers,
[00:28:02]
they were not trying to be smart at the beginning, but then they noticed, oh, we can try to be
[00:28:07]
smart to improve performance.
[00:28:09]
And then they just piled improvement over improvement and made it very complex and unintuitive.
[00:28:16]
And I'm guessing maybe that could be true for WebAssembly as well, maybe not to the
[00:28:20]
same extent.
[00:28:21]
So I mean, that's always possible, right?
[00:28:24]
You always run the risk that Safari adds another WebAssembly specialized compiler, which does
[00:28:31]
runtime profiling to improve code.
[00:28:35]
That can of course happen.
[00:28:36]
But one thing that has happened a lot in my performance work is that...
[00:28:43]
So when I was implementing the array data structure for Elm, one thing that surprised
[00:28:49]
me was that, okay, I was going to implement array.map.
[00:28:55]
And in my mind, the Elm array, for those who don't know, is a tree structure that if you
[00:29:01]
have 32 elements or less, it's just a normal JavaScript array.
[00:29:05]
If you have more than 32 elements, it will become a tree where each level of the tree
[00:29:10]
has 32 elements.
[00:29:12]
And so it will grow...
[00:29:15]
So if you have 60 elements, then the Elm array will be one array with two elements.
[00:29:21]
Those elements point to arrays where the first array contains the first 32 elements and the
[00:29:27]
second array contains the next 28.
[00:29:32]
And as you add more elements, the tree grows.
[00:29:35]
That was probably not the best summary of how an Elm array works.
[00:29:38]
But the important thing for this particular story is to know that an array consists of
[00:29:44]
multiple JavaScript arrays under the hood.
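For illustration, here is a deliberately simplified JavaScript sketch of the idea (two levels only, not Elm's actual implementation): leaves are plain JS arrays of at most 32 elements, and a larger array becomes a node whose children are those leaves.

```javascript
// Simplified sketch of the Elm Array idea with branching factor 32.
const BRANCH = 32;

// Build a two-level tree from a flat list (enough for up to 32*32 items).
function fromList(items) {
  const leaves = [];
  for (let i = 0; i < items.length; i += BRANCH) {
    leaves.push(items.slice(i, i + BRANCH)); // each leaf is a JS array
  }
  return { length: items.length, root: leaves };
}

// Index into the tree: the high part of the index picks the leaf,
// the low part picks the slot within that leaf.
function get(arr, i) {
  return arr.root[Math.floor(i / BRANCH)][i % BRANCH];
}

// The 60-element example from the discussion: one root with two children,
// a 32-element leaf and a 28-element leaf.
const items = Array.from({ length: 60 }, (_, i) => i * 10);
const arr = fromList(items);
console.log(arr.root.length);         // 2
console.log(arr.root[1].length);      // 28
console.log(get(arr, 59));            // 590
```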
[00:29:47]
So when I was implementing array.map, the natural thing for me to do was to implement
[00:29:54]
that in terms of the built in JavaScript array.map instead of writing a for loop and kind of
[00:30:02]
like reimplementing array.map myself.
[00:30:07]
But it turned out that using the built in array.map for JavaScript arrays was very fast
[00:30:16]
in Chrome.
[00:30:17]
But compared to a for loop doing array.push, it was slower in Firefox.
[00:30:23]
In Firefox, writing the actual loop was way faster than using array.map.
[00:30:29]
And in WebAssembly, you wouldn't have such a difference.
[00:30:33]
If you were going to implement the array.map, you would do it pretty much the only way you
[00:30:38]
can in WebAssembly.
[00:30:40]
And even though the performance can be worse in one browser compared to another, there
[00:30:43]
wouldn't be...
[00:30:45]
You wouldn't do it...
[00:30:47]
You wouldn't get better performance by doing it in a less obvious way, I guess.
[00:30:52]
There aren't that many ways of doing the same thing.
[00:30:55]
And so you can just count on the most obvious thing also being the fastest thing.
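The two strategies being compared look roughly like this in JavaScript (illustrative sketch; both produce the same result, and, as described, which one was faster historically differed between Chrome and Firefox):

```javascript
// Strategy 1: delegate to the built-in Array.prototype.map.
function mapBuiltin(f, xs) {
  return xs.map(f);
}

// Strategy 2: a hand-written loop with push, reimplementing map.
function mapLoop(f, xs) {
  const out = [];
  for (let i = 0; i < xs.length; i++) {
    out.push(f(xs[i]));
  }
  return out;
}

const double = (x) => x * 2;
console.log(mapBuiltin(double, [1, 2, 3])); // [2, 4, 6]
console.log(mapLoop(double, [1, 2, 3]));    // [2, 4, 6]
```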
[00:31:02]
So what did you end up doing with the array.map?
[00:31:04]
How did you make that choice?
[00:31:08]
Well, really, since Elm is supposed to be used...
[00:31:13]
If I were doing this and I only cared about Chrome, then I would do whatever is fastest
[00:31:17]
for Chrome.
[00:31:18]
But because Elm can be used in a lot of scenarios, I had to do it the way which overall gave
[00:31:25]
the best result.
[00:31:27]
And if I remember correctly, the performance difference for Firefox was so big that I ended
[00:31:33]
up prioritizing what was fastest for Firefox because the difference in Chrome wasn't that
[00:31:38]
big.
[00:31:39]
So you kind of have to find one solution that works best when all browsers are considered.
[00:31:45]
Yeah.
[00:31:46]
So are you secretly hoping for Chrome to just win the competition used by everyone?
[00:31:54]
No, I think this is a slight departure from performance, but I think in the browser space
[00:32:02]
we're very well served with competition.
[00:32:06]
So I think the current...
[00:32:10]
I was sad to see Microsoft just adopt Chromium for their web browser, essentially, even though
[00:32:18]
I have no fond feelings towards Microsoft.
[00:32:22]
I think it's good with some competition in the browser space.
[00:32:26]
Of course, from a performance perspective, it would be nice if everything worked the
[00:32:30]
same way.
[00:32:31]
It would make my life a lot easier.
[00:32:35]
But I think for most people, it would be better with competition in the browser space.
[00:32:40]
Yeah.
[00:32:41]
So it seems like it comes down to control, like WebAssembly gives you more control over
[00:32:47]
performance.
[00:32:48]
Now, if you have more control over performance, that means it's not going to do an optimization
[00:32:54]
that you didn't build into it, which V8 or SpiderMonkey's just in time compilers are
[00:33:01]
going to do.
[00:33:02]
And to bring this back to Roc, one of the reasons why Roc can perform a lot of mutations,
[00:33:11]
which are safe to do in practice without losing purity, is because they have full control
[00:33:19]
of how the code compiles.
[00:33:21]
So in JavaScript, you have a garbage collector.
[00:33:25]
No matter what you do, you are going to create a language which on some level is garbage
[00:33:29]
collected.
[00:33:30]
So when you're compiling to WebAssembly or regular assembly, you don't have a garbage
[00:33:36]
collector, which gives you the freedom to implement memory management how you want to.
[00:33:40]
In Roc, one of the things they've done is that they use a reference counting sort of
[00:33:45]
garbage collection.
[00:33:46]
And while that, from a throughput perspective, is in general worse than a tracing garbage
[00:33:51]
collector, what it gives them is that they know when they have an object, they know exactly
[00:33:56]
how many references there are to that object.
[00:33:59]
And if the person who wants to change the object is also the only person who can observe
[00:34:05]
the object, doing a mutation is perfectly fine.
[00:34:09]
And so by using reference counting, they can actually get this performance optimization,
[00:34:13]
which is difficult to get in a garbage collected language.
[00:34:17]
And so that level of control, the problem with it is that you have to implement everything
[00:34:21]
yourself.
[00:34:22]
But the upside is that you can do a lot of things you wouldn't normally be able to do.
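The reference-counting trick can be sketched in plain JavaScript. This is an illustration of the idea, not Roc's implementation: when the caller holds the only reference to a value, mutating it in place is unobservable, so it's safe even in a pure language.

```javascript
// Hypothetical boxed value: { refcount, items }. setIndex "updates" index i.
function setIndex(value, index, item) {
  if (value.refcount === 1) {
    // No one else can observe this value, so mutate it in place.
    value.items[index] = item;
    return value;
  }
  // Shared value: copy, and assume the caller gives up its own reference.
  const copy = { refcount: 1, items: value.items.slice() };
  copy.items[index] = item;
  value.refcount -= 1;
  return copy;
}
```

The observable behavior is identical either way; the unique-reference case just skips the copy.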
[00:34:28]
So whatever the future holds for Elm, Brian Carroll has done some really cool experiments
[00:34:38]
prototyping WebAssembly output for Elm, which is sort of an early prototype.
[00:34:45]
We don't know if that would ever be production ready or if it's just a proof of concept,
[00:34:49]
but either way, it's very interesting work.
[00:34:51]
But whatever the future holds for Elm, I kind of wonder what, I mean, in particular, the
[00:34:56]
two of you, Robin and Jeroen, you've been digging into performance a lot.
[00:35:02]
Jeroen has been doing that as a passion project lately.
[00:35:05]
And I wonder, are we scratching the surface for performance stuff in Elm?
[00:35:11]
Or is there a lot more that we have left?
[00:35:14]
Because one of the really interesting parts of the Elm story to me is in the early days,
[00:35:19]
there was a blog post, I think, comparing performance between these different front
[00:35:24]
end frameworks.
[00:35:25]
And Elm was one of the top performers, right?
[00:35:28]
And that's very interesting when you have this very high level language, and you have
[00:35:33]
these things that, you know, I mean, if it's your cup of tea, things like immutability
[00:35:39]
are really exciting in terms of reducing the cognitive load of the developer being able
[00:35:43]
to easily trace what your code is doing.
[00:35:45]
And it seems like it would be a burden for performance, but then suddenly you're getting
[00:35:50]
better performance.
[00:35:51]
And that's one of the really fascinating things to me is how can you take these characteristics
[00:35:57]
of the Elm language and leverage them to actually be ahead of the pack with performance?
[00:36:02]
So where do you guys think we are with performance optimizations in Elm?
[00:36:07]
Because I'm seeing all these like blog posts that you're writing, Robin, and I'm seeing
[00:36:11]
Jeroen's messages about like his screenshots on Twitter with these large percentage improvements
[00:36:19]
on certain benchmarks.
[00:36:22]
So are those things going to keep happening for a while?
[00:36:26]
Or are we reaching the limit of how much we can optimize Elm's performance?
[00:36:31]
Go ahead Jeroen.
[00:36:33]
Yeah, we talked about this in private and Robin said, we probably did the easy stuff.
[00:36:40]
So what I'm doing, like I'm seeing a function and I see a way to improve it performance
[00:36:46]
wise.
[00:36:47]
It's mostly just about removing unnecessary work or duplicate work, which happens a lot
[00:36:55]
more often than expected.
[00:36:57]
Like if you loop over a list two times, then it's slower than looping over it once.
[00:37:04]
So I'd see it a lot in a few functions and that's just more about how you write those
[00:37:09]
functions.
[00:37:10]
So it's easy to optimize those.
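Jeroen's "loop over a list twice versus once" point can be shown with a tiny, hypothetical example: two passes (a map followed by a fold) versus one fused loop that computes the same result with a single walk and no intermediate array.

```javascript
// Two passes: each walks all n elements, and map allocates an intermediate array.
function incrementThenSumTwoPasses(xs) {
  return xs.map((x) => x + 1).reduce((acc, x) => acc + x, 0);
}

// Fused into one pass: same result, one walk, no intermediate allocation.
function incrementThenSumFused(xs) {
  let acc = 0;
  for (let i = 0; i < xs.length; i++) {
    acc += xs[i] + 1;
  }
  return acc;
}
```

Doing this fusion automatically in a compiler or optimizer is exactly the "lot more work" mentioned below, since the tool has to prove the two loops can be merged.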
[00:37:12]
On a more optimizer level, so a compiler or Elm optimize level two or any other tool to
[00:37:21]
make all those manual changes not necessary, that would be a lot more work.
[00:37:27]
So you could write an optimizer that says, well, here we are unnecessarily looping over
[00:37:34]
the list two times, and we could merge those into one or rewrite it using a while loop or something
[00:37:41]
like that.
[00:37:42]
But that's a lot more work.
[00:37:43]
You need some knowledge that you may or may not have about what every function does.
[00:37:51]
So yeah, it's more complex, and there's also the bundle size that we need to care about in
[00:37:57]
Elm, which is a trade off.
[00:37:59]
So from my point of view of what I've seen, I'm still touching things that feel pretty
[00:38:06]
easy.
[00:38:07]
So yeah, I don't know what's remaining.
[00:38:10]
But I'm starting to see other areas of exploration, and then the scientific papers become a bit
[00:38:18]
complex.
[00:38:19]
Let's put it that way.
[00:38:20]
Yeah.
[00:38:21]
That's when the postdocs start doing the optimizations.
[00:38:24]
Yeah.
[00:38:25]
And also since we're using a pure functional language, I don't know if it's the most researched
[00:38:31]
thing.
[00:38:32]
I'm sure a lot more people have researched how to improve the performance of C code than
[00:38:39]
Haskell code or Elm code for that matter.
[00:38:44]
So yeah.
[00:38:45]
I think there are two very interesting...
[00:38:48]
I think I'll go as far as to say that we have a lot of knowledge and a lot of ideas about
[00:38:55]
how we can make Elm code compile faster.
[00:39:00]
And there are certainly...
[00:39:03]
Compile to faster output.
[00:39:05]
Yes.
[00:39:06]
Right.
[00:39:07]
Because it compiles pretty darn fast.
[00:39:10]
I mean, any more improvements are welcome.
[00:39:13]
Yeah.
[00:39:14]
Thank you for that.
[00:39:15]
Yeah.
[00:39:16]
So I think there are several people who know of a lot of easy wins, I guess we can say.
[00:39:23]
Elm Optimize Level 2 does this thing where it's able to compile a lot of stuff into direct
[00:39:28]
function calls instead of going through currying helpers.
[00:39:32]
That happens today.
[00:39:34]
And from the benchmarks I've seen, that can easily increase performance by up to 20% in
[00:39:40]
some cases.
[00:39:41]
Of the overall program.
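A simplified sketch, modeled on the `F2`/`A2` helper pattern in Elm's compiled output, shows what "going through currying helpers" means and what the direct-call rewrite avoids. The exact shapes here are condensed for illustration.

```javascript
// F2 wraps a 2-argument function so it can also be called curried.
function F2(fn) {
  const wrapped = (a) => (b) => fn(a, b);
  wrapped.f = fn; // the underlying uncurried function
  wrapped.a = 2;  // recorded arity
  return wrapped;
}

// A2 is the call-site helper: it must branch on arity at runtime.
function A2(fn, a, b) {
  return fn.a === 2 ? fn.f(a, b) : fn(a)(b);
}

const add = F2((a, b) => a + b);

// What the optimizer does, in spirit: when the arity is provable,
// A2(add, 1, 2) is rewritten into the direct call add.f(1, 2).
const viaHelper = A2(add, 1, 2); // goes through the runtime arity check
const direct = add.f(1, 2);      // no check, just a plain function call
```

Removing that per-call branch across a whole program is where the reported speedups come from.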
[00:39:44]
Yeah.
[00:39:45]
And then there are things that I've written about in the series of blog posts that I wrote
[00:39:50]
before Christmas where updating a record can be made up to eight times faster in some cases.
[00:39:58]
Yeah.
[00:39:59]
Which is huge.
[00:40:00]
Especially for applications that are continually looping and updating.
[00:40:05]
I mean, like games, for example, if you're on every frame updating game state and records.
[00:40:11]
Yeah.
[00:40:12]
Yeah.
[00:40:13]
So in that way, we are scratching the surface, I think, in what we can add.
[00:40:18]
We know that there are a lot of gains that can be easily had to make Elm code run faster.
[00:40:25]
I know of several ways that the way Elm is compiled to JavaScript could be changed
[00:40:30]
in order to increase the runtime performance of Elm code.
[00:40:34]
However, a lot of those optimizations would increase the JavaScript bundle size.
[00:40:41]
Sometimes by a lot.
[00:40:44]
So there are a lot.
[00:40:46]
There are two things that kind of make performance work very difficult.
[00:40:51]
One of them is how much of a code size increase are we willing to accept in order to get optimal
[00:40:58]
performance?
[00:40:59]
And that is not going to be an easy thing to answer because that's always going to change
[00:41:03]
depending on what you do.
[00:41:05]
Like if you're writing a single page application, then as long as the characters the user types
[00:41:14]
on their keyboard arrive in a timely manner, performance isn't a concern.
[00:41:18]
And so asset size is probably the most important thing.
[00:41:22]
But for people writing games and physics engines and, you know, WebGL stuff, they would probably
[00:41:30]
accept a pretty big code size increase in order to get the most optimal performance.
[00:41:36]
And so that is a question which is very difficult to deal with when doing performance optimizations.
[00:41:42]
Right.
[00:41:43]
And the same goes for tooling: elm-review and elm-pages both do pretty heavy lifting in
[00:41:50]
a Node.js environment in your command line or your build step.
[00:41:55]
And if they can have big performance gains at the cost of, whatever, a 50% larger bundle size.
[00:42:02]
For a CLI app, that's an easy win.
[00:42:05]
For something that's running in your browser, that's probably not the right trade off.
[00:42:11]
Exactly.
[00:42:12]
And then, of course, another thing is the guesswork involved by the JavaScript just
[00:42:18]
in time compiler.
[00:42:19]
So there are certain things we could do, which, like in most languages, is the way to increase
[00:42:25]
performance like function inlining.
[00:42:29]
That would most likely increase the code size of the output.
[00:42:35]
But the thing is that the JavaScript just in time compiler already has inlining enabled.
[00:42:41]
So we could go through the hassle of creating a function inlining pass, but it wouldn't
[00:42:48]
necessarily give us better performance because the JavaScript engine might already do those
[00:42:53]
exact things.
[00:42:55]
And so that's one area where WebAssembly would be an easier thing to work with.
[00:43:01]
It wouldn't be easier because you'd have to implement a lot of stuff yourself.
[00:43:05]
But you would, to a much larger degree, understand if something was worth looking into because
[00:43:12]
it's a more predictable target.
[00:43:14]
Yeah.
[00:43:15]
So, like, a lot of the time that I've spent looking into performance has simply been,
[00:43:21]
so in theory, this should give better performance.
[00:43:25]
But in actuality, that may not be the case.
[00:43:29]
And so there are a bunch of experiments which I've done which sounds reasonable or sounds
[00:43:34]
completely unreasonable.
[00:43:36]
And I've been surprised by the result on more than one occasion.
[00:43:41]
So regarding bundle size, do you have a sense, Robin, because for anyone who doesn't know,
[00:43:46]
you've been working on Stabel.
[00:43:48]
It's a, what's it called, a stack language?
[00:43:52]
It's a stack based programming language, or stack oriented.
[00:43:56]
Stack oriented.
[00:43:57]
And it's really interesting, like, I know that one of the things that you wanted to
[00:44:03]
experiment with for that project was just outputting something to WebAssembly.
[00:44:07]
And so that's what it does.
[00:44:09]
And so you have a grasp of some of these real world applications of WebAssembly.
[00:44:16]
And how does it, how is it for bundle size?
[00:44:20]
Are WebAssembly output, is the bundle size larger, smaller, could go either way?
[00:44:27]
So it's difficult to know.
[00:44:29]
It depends on the language you want to compile to.
[00:44:32]
But I believe Brian Carroll posted some numbers on this.
[00:44:37]
Because in theory, a WebAssembly bytecode instruction takes potentially just a byte.
[00:44:46]
So doing plus one two is smaller than writing one plus two in JavaScript.
[00:44:54]
Because it's compiled very efficiently.
[00:44:56]
On the other hand, you have to reimplement garbage collection, strings, currying in the
[00:45:03]
case of Elm.
[00:45:04]
So it's not necessarily a clear win.
[00:45:08]
But I believe Brian Carroll has posted numbers on this sometime in the past.
[00:45:13]
And I believe with the garbage collection and with, admittedly not with all the semantics
[00:45:19]
of Elm in place.
[00:45:20]
But I think it had proof of concept garbage collector.
[00:45:24]
And I think a Hello World app or like the counter, the button counter example in Elm.
[00:45:31]
I think that compiled to, I'm taking this from memory so I could be very wrong.
[00:45:36]
But I believe it was something in the order of 12, 13 kilobytes before GZIP.
[00:45:42]
Oh, before GZIP.
[00:45:44]
Yeah.
[00:45:45]
So, and of course, the larger the application becomes, the more the comparison swings in favor
[00:45:51]
of the WebAssembly implementation.
[00:45:53]
So I believe, and also with my experiments with Stabel, I believe that asset size would
[00:46:00]
be the one clear win from WebAssembly.
[00:46:03]
Yeah.
[00:46:04]
I didn't know it was, that the instructions were so condensed.
[00:46:11]
So WebAssembly has two formats.
[00:46:13]
There's a text format, which you can handwrite,
[00:46:20]
but usually it's for viewing, debugging, sanity checking, that sort of stuff.
[00:46:26]
But the actual WebAssembly format is binary and it is very dense.
[00:46:31]
Like one of the things it does is that all integer literals are encoded using variable
[00:46:39]
sized encoding.
[00:46:41]
So even though you are representing a 32 bit integer, if the int literal is the number
[00:46:47]
10, it only takes up eight bits in the WebAssembly output.
[00:46:51]
So it's a very, very dense and optimized for size format.
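The variable-sized integer encoding Robin mentions is LEB128, which the WebAssembly binary format uses for integer literals: 7 bits of payload per byte, with the high bit set on every byte except the last. A small sketch of the unsigned variant:

```javascript
// Unsigned LEB128: emit 7 bits at a time, low bits first; the high bit
// of each byte says whether more bytes follow.
function encodeULEB128(n) {
  const bytes = [];
  do {
    let byte = n & 0x7f; // take the low 7 bits
    n >>>= 7;
    if (n !== 0) byte |= 0x80; // more bytes follow
    bytes.push(byte);
  } while (n !== 0);
  return bytes;
}

encodeULEB128(10);   // [0x0a]: the 32-bit literal 10 fits in a single byte
encodeULEB128(1000); // [0xe8, 0x07]: two bytes
```

So a 32-bit constant only pays for the bytes it actually needs, which is part of why the binary format is so dense.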
[00:46:56]
That's huge.
[00:46:57]
I mean, the tiny bundle size potential is huge.
[00:47:01]
Well, or tiny, I don't know, but it's, that could be just as interesting as any performance
[00:47:12]
gains there.
[00:47:13]
So that's, that is super interesting.
[00:47:15]
Yeah, it's super interesting.
[00:47:16]
But of course, like Brian Carroll has been working on this for years and I don't think
[00:47:21]
it's close to a production ready compiler, which kind of goes to show that, you know,
[00:47:27]
WebAssembly has a lot of potential benefits, but working with it is very difficult.
[00:47:33]
Well not difficult, but very time consuming.
[00:47:36]
And I think with the current state of the compiler, you would have to do a lot of work
[00:47:40]
to get anywhere close to what Brian Carroll has got running today.
[00:47:44]
Absolutely.
[00:47:45]
And one thing that's important to keep in mind
[00:47:50]
is that the Elm compiler is not an optimizing compiler.
[00:47:54]
Even though it type checks your code, it doesn't actually retain that information to the code
[00:48:00]
generation stage.
[00:48:02]
So there are a ton of things you would have to improve or complicate, I guess is a better
[00:48:08]
word.
[00:48:09]
You would have to add a ton of complication to the Elm compiler in order
[00:48:12]
to be able to output WebAssembly, and that is very likely to come at a cost to compiler
[00:48:19]
speed.
[00:48:20]
Yes, which Evan has painstakingly optimized, I think largely by just reducing the amount
[00:48:28]
of memory that's passed around and that would be additional memory that you're passing around.
[00:48:33]
So yeah, it would have a cost for performance.
[00:48:35]
So yeah, it's not that WebAssembly isn't interesting.
[00:48:40]
It is super interesting, but it's also, it's not easy.
[00:48:46]
And of course, JavaScript has a lot of faults, but it has a world class garbage collector
[00:48:52]
built in and it is pretty good at optimizing high level code.
[00:48:56]
So you wouldn't necessarily get better performance.
[00:49:00]
You would get a lot of complications in JavaScript interop.
[00:49:03]
You would probably get smaller asset sizes, but to get there would be a huge amount of
[00:49:09]
work.
[00:49:10]
But it's not a clear improvement over what we have today.
[00:49:16]
Yeah.
[00:49:17]
Well, one of the things that has always fascinated me is like when you can have a paradigm that
[00:49:24]
you just slightly changed the way you're working and it has huge implications.
[00:49:29]
Like for example, I always found it really interesting how you take Elixir and this web
[00:49:37]
framework Phoenix and simply by having this one property of immutability, which actually
[00:49:43]
it feels fairly similar to writing something like Ruby.
[00:49:47]
You can even rebind variables and under the hood it's using immutable data, but it can
[00:49:53]
feel very familiar for somebody who's used to writing Ruby.
[00:49:56]
But you take Ruby on Rails and Elixir Phoenix and suddenly you can get this incredible request
[00:50:03]
throughput because of the optimizations they can perform under the hood, largely with trivial
[00:50:09]
parallelization.
[00:50:12]
You have this immutability that you can rely on and suddenly this very challenging problem
[00:50:17]
of parallelization, which requires a lot of work, including by the application developer
[00:50:24]
to manage how to safely share memory.
[00:50:28]
Those problems suddenly all just go away.
[00:50:31]
And I think that there's similar potential in Elm.
[00:50:35]
This is big picture, long term, who knows what will happen.
[00:50:40]
But when I look at the big picture of trends of programming languages, everything becomes
[00:50:46]
a question of parallelization rather than brute performance.
[00:50:51]
So like CPUs aren't getting any faster.
[00:50:55]
For five, 10 years they haven't gotten any faster.
[00:50:59]
The clock speed is not improving because it would start to get to the temperature of the
[00:51:04]
surface of the sun just the way that the physics of increasing clock speed works.
[00:51:10]
But what you can do by getting more transistors on a chip is you can have more parallel processing,
[00:51:17]
but you can't do it at a faster clock speed.
[00:51:20]
That's just a limit that we hit a long time ago and that's not going to change.
[00:51:24]
Can't they just improve physics?
[00:51:27]
Maybe.
[00:51:28]
Maybe quantum computers.
[00:51:31]
So when we're on the topic of Elixir, Joe Armstrong, who is one of the creators of the
[00:51:36]
Erlang programming language, which Elixir compiles down to, said that like so Erlang
[00:51:43]
has this notion that writing parallel programs is very easy.
[00:51:48]
Part of that is immutability.
[00:51:49]
Part of that is isolated actor processes.
[00:51:53]
It's a super interesting language.
[00:51:54]
So if you haven't checked it out, do.
[00:51:56]
But he worked on a project where they had an Erlang program and then they swapped out
[00:52:02]
the hardware from like a four core CPU to a 64 core CPU.
[00:52:08]
And then the exact same program just ran, I believe it was 34 times faster or something.
[00:52:16]
And the product manager said, well, we got 64 cores.
[00:52:21]
Shouldn't it run even faster?
[00:52:23]
And his response was, well, if you were to take a C++ program and just swap out the CPU,
[00:52:29]
it would be zero times faster.
[00:52:34]
So it's yeah.
[00:52:35]
Wait, yeah.
[00:52:36]
Zero times faster or one time faster?
[00:52:41]
I don't know math.
[00:52:42]
It would be 1x the speed and a 0% performance increase.
[00:52:47]
All right.
[00:52:49]
I mean, you could say it just crashes and then it's just zero times faster.
[00:52:56]
That's also likely, I would say.
[00:52:59]
Yeah.
[00:53:00]
Yeah.
[00:53:01]
Yeah.
[00:53:02]
This is to me.
[00:53:03]
I mean, Jeroen and I had this sort of episode in the new year where we talked about what's
[00:53:08]
working for Elm.
[00:53:10]
And that was like one of the points that came up was, hey, we've got this language with
[00:53:15]
some really unique characteristics.
[00:53:16]
And how can we, instead of saying, oh, performance is really hard with immutability, how can
[00:53:22]
we say, well, but these things become easier and these things we have more opportunities.
[00:53:26]
I think parallelization is one of them.
[00:53:29]
And I don't know, looking 10 years down the road, are web apps going to be leveraging
[00:53:33]
parallelization more?
[00:53:34]
I don't know.
[00:53:35]
Maybe.
[00:53:36]
And I believe WebAssembly has primitives for delegating things in a parallel way.
[00:53:41]
So if I'm not mistaken.
[00:53:44]
So that could be an interesting space, long term, big picture.
[00:53:49]
And I think, I forget if it is in 0.19, it could be 0.18.
[00:53:57]
But I believe if you look into Elm core and look at the process namespace, then you will
[00:54:04]
get to that.
[00:54:05]
There will be a comment there in the documentation that refers to in the future, we might have
[00:54:12]
multiple actors or multiple mailboxes or something along those lines, which is a clear reference
[00:54:17]
to Erlang actors.
[00:54:19]
And so this aspect has actually been thought about by Evan for multiple years.
[00:54:30]
So yeah, that might be like one aspect we tap into.
[00:54:34]
And of course, when just to underline the point even more, one of the big things when
[00:54:39]
Clojure came out, Clojure was, well, it wasn't the first functional language that I learned.
[00:54:46]
It was the first immutable by default language.
[00:54:50]
And one of the big draws to Clojure was that because of immutability, concurrency is suddenly
[00:54:56]
super easy.
[00:54:58]
And so even though you have to pay the price of immutability, adding concurrency to a program
[00:55:03]
is so much easier that in a lot of cases, you actually get more correct and better performing programs.
[00:55:09]
Right.
[00:55:10]
And on the web, in a web browser, your code is single threaded.
[00:55:13]
So if you are doing work on the main thread, which, if you just open up an index.js and
[00:55:21]
load that and do some work, is what happens, then you are blocking the main thread. If a user
[00:55:27]
tries to scroll or tries to click a button, and there's an animation from a built in button
[00:55:32]
element on the page, that's blocked, because the render thread needs the opportunity to run.
[00:55:38]
And you're running on that same thread.
[00:55:40]
So you can use worker threads to do work, though you do need to send memory back
[00:55:48]
and forth.
[00:55:50]
But this is another potential space that could be very interesting for Elm because this sort
[00:55:55]
of Elm architecture is a very natural fit for performing the main work off of the main
[00:56:01]
thread, and then sending messages back to tell the main thread to update.
[00:56:06]
Who knows if anything like that will ever happen.
[00:56:08]
But these are the things that, again, it's like, Elm is a compiled language, and what can
[00:56:14]
we do to take advantage of that?
[00:56:17]
And so, from this whole conversation, I really do get the sense that whatever the future
[00:56:23]
holds, there's more opportunity.
[00:56:26]
And we're not even done picking off the low hanging fruit, but who knows, maybe
[00:56:32]
there's some big thing in the future that could even blow those out of the water.
[00:56:37]
So it'll be interesting to see what happens.
[00:56:39]
Exactly.
[00:56:40]
And there are multiple cases of this also.
[00:56:43]
Like we talked a little bit about Elixir.
[00:56:46]
I mentioned Clojure, right, like immutability by default enables concurrency, or easy concurrency.
[00:56:53]
There is also, like, one interesting thing: JavaScript itself.
[00:56:59]
One of the reasons node.js took off was because it has this event loop built in.
[00:57:06]
And so even though you can't perform computationally expensive things, because you will block the
[00:57:13]
thread, the Node runtime or the JavaScript runtime makes it very easy to do event
[00:57:22]
based programming.
[00:57:23]
And if you write servers that, you know, call a database, and then they just wait for the
[00:57:28]
results, node was really, really good at utilizing the one thread it has, which languages like
[00:57:35]
Java and .NET, which spawn threads, weren't that good at.
[00:57:40]
Same with Ruby, Ruby's had a lot of issues with blocking file I/O operations.
[00:57:46]
Yeah.
[00:57:47]
So really, the reason why node took off was because in practice, you managed to get servers
[00:57:53]
which could handle more load without careful engineering, right, just by
[00:57:59]
default, you could handle tons of requests, as long as those requests weren't doing anything
[00:58:04]
expensive.
[00:58:05]
And that's like JavaScript has a lot of flaws, but even JavaScript because of the limitations
[00:58:09]
it has, was able to outperform naive implementations in the server space, which is partly why it
[00:58:17]
took off.
[00:58:18]
So today, there are better alternatives, but back in 2009, or whatever it was,
[00:58:27]
it was very interesting how you could handle a lot of requests on a single node.js server
[00:58:32]
compared to a naive Java program, right, which is actually, I believe, why Ryan Dahl chose
[00:58:39]
JavaScript as the target language.
[00:58:41]
It wasn't originally his intent.
[00:58:43]
I can't remember, maybe it was Go or something else that he had in mind, but that event driven
[00:58:49]
architecture was just such a good fit for JavaScript that he went with that.
[00:58:53]
Yeah, I never actually understood whether it was part of JavaScript or just part of
[00:58:59]
the implementations of JavaScript, that it was limited to a single thread.
[00:59:02]
I mean, I think that's the semantics of JavaScript, basically, that anything you do runs on
[00:59:10]
a single thread, but then there's this concept of being able to queue up callbacks,
[00:59:17]
the callback queue and stuff.
[00:59:19]
Like I think the concept of like a callback queue and everything is baked into the semantics
[00:59:24]
of JavaScript.
[00:59:25]
And then the specifics of the things that can be done in a non blocking way are specific
[00:59:30]
to the node runtime or to the web runtime, like set timeout, for example, set timeout
[00:59:36]
is not part of JavaScript.
[00:59:38]
Set timeout is part of a runtime like the browser runtime or the nodejs runtime.
[00:59:42]
It doesn't exist independent of that.
[00:59:44]
But it uses the same mechanisms that you mentioned before that are built in.
[00:59:48]
Yes.
[00:59:49]
Part of the spec, I guess.
[00:59:51]
Yeah, exactly.
[00:59:52]
Those same semantics of a callback queue.
[00:59:55]
Yeah.
[00:59:56]
So yeah, so having languages which have limits, those limits can enable certain features that
[01:00:03]
can be very well suited to certain kinds of programs.
[01:00:06]
And Elm definitely, if there's one thing that Elm has a lot of, it's limits.
[01:00:10]
Right.
[01:00:11]
Exactly.
[01:00:12]
And those exact limits can be utilized to some pretty interesting results.
[01:00:16]
HTML lazy, which we talked about earlier is one example of that.
[01:00:20]
Doing the similar kind of optimization in React takes a lot more planning.
[01:00:26]
I guess like you need to know that you do not perform mutation in this component or
[01:00:31]
it will be slow or it will produce buggy behavior.
[01:00:35]
Whereas now it's very likely that you can just tap into that optimization.
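The idea behind `Html.Lazy` can be sketched in a few lines. This is an illustration of the caching principle, not Elm's kernel code: remember the last argument and result, and when the new argument is the same reference, skip the work entirely. That shortcut is only sound because Elm values are immutable, which is the limitation-as-feature point being made here.

```javascript
// Memoize a one-argument view on reference equality of its argument.
function lazy(view) {
  let lastArg, lastResult, called = false;
  return (arg) => {
    if (called && arg === lastArg) {
      // Same reference implies same value under immutability: reuse the result.
      return lastResult;
    }
    called = true;
    lastArg = arg;
    lastResult = view(arg);
    return lastResult;
  };
}

// Hypothetical usage, counting how often the underlying view actually runs.
let calls = 0;
const renderUser = (user) => { calls++; return `<li>${user.name}</li>`; };
const lazyRender = lazy(renderUser);

const user = { name: "Robin" };
lazyRender(user);
lazyRender(user); // same reference: cached, the view is not called again
```

In a language with mutation, `arg === lastArg` proves nothing, which is why the React equivalent takes more planning.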
[01:00:40]
I wouldn't say it is limited.
[01:00:43]
I would say it has limitations and those enable you to have no limits.
[01:00:49]
Oh, that's great.
[01:00:50]
Hey, that's another t shirt.
[01:00:52]
Love it.
[01:00:54]
So Robin, when you're sitting down to write Elm application code, I mean, I'm sure performance
[01:01:00]
is this thing that you can't help but think about no matter what you do.
[01:01:05]
But are you typically just focused on writing the application code or do you run into places
[01:01:11]
where as an Elm application developer, you find that you need to really think about performance
[01:01:16]
and tune performance?
[01:01:17]
Does that happen very often?
[01:01:18]
You are correct in that when I write Elm code, it's very difficult for me to not think about
[01:01:24]
this is suboptimal from a performance perspective.
[01:01:27]
Fortunately, that's something I've become better at ignoring as I've grown older.
[01:01:33]
So I would say that today I don't focus too much on performance normally.
[01:01:39]
Now it's like, if we have a performance problem, that's when I'm called in.
[01:01:48]
So like the recent elm-css improvements are a result of that.
[01:01:52]
This application is laggy.
[01:01:54]
You know Elm very well.
[01:01:55]
How can you improve the situation?
[01:01:57]
And we improved it by using HTML lazy.
[01:02:00]
And then I got home and thought about how we could have avoided needing that optimization in
[01:02:06]
the first place?
[01:02:07]
Like could we have changed the framework to not have needed HTML lazy in that case?
[01:02:11]
So that's how it works now.
[01:02:13]
But one thing that I have learned is that there are certain things which do improve
[01:02:17]
performance but which also at least I think improve readability of the code.
[01:02:24]
There are many cases where it's the opposite.
[01:02:26]
Like improving performance worsens code.
[01:02:30]
But I've found several things that improve performance and increase readability.
[01:02:36]
And usually this involves data structures.
[01:02:41]
Most often you can recognize a pattern, realize that this would be more efficient and more
[01:02:48]
readable by using the correct data structure.
[01:02:52]
And really in Elm we have this mantra, making impossible states impossible.
[01:02:57]
And in a lot of cases making impossible states impossible also improves performance.
[01:03:03]
Because there's less error handling and it's easier to get exactly what you want with safety
[01:03:08]
guarantees but also performance guarantees.
[01:03:11]
Less checks as well.
[01:03:14]
So one simple thing that I also use in other programming languages like Java and Kotlin
[01:03:22]
is whenever I see list.find or something similar to do like a give me the item with this key.
[01:03:34]
To me that's like this should be a dictionary.
[01:03:37]
Like why isn't this a dictionary?
[01:03:40]
Like sometimes using a dictionary would be worse overall but in many cases it just screams
[01:03:46]
associative lookup.
[01:03:47]
You have a dictionary for this.
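The smell and the fix can be shown side by side in JavaScript, where `Map` plays the role of Elm's `Dict`. The data here is hypothetical.

```javascript
const users = [
  { id: "a", name: "Ada" },
  { id: "b", name: "Brendan" },
];

// list.find style: O(n) per lookup, and the "look up by key" intent is implicit.
const byFind = users.find((u) => u.id === "b");

// Dictionary style: build the index once, then lookups are cheap and the
// intent, an associative lookup, is stated right in the data structure.
const byId = new Map(users.map((u) => [u.id, u]));
const byMap = byId.get("b");
```

As noted below, sometimes the dictionary is worse overall, for example when you only ever look up once, but when lookups repeat it buys both performance and clarity.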
[01:03:50]
Yeah, usually I see list.map and list.head and I'm thinking I should reach out for find.
[01:03:59]
And then maybe I should reach out for dicts.
[01:04:02]
But in the case that you mentioned, list.map plus list.head, that's a perfect example of using
[01:04:09]
a different data structure, which gives you both performance and clearer intent.
[01:04:14]
What are you trying to do?
[01:04:17]
So that's also a valid case.
[01:04:19]
Using dictionaries or sets, instead of manually or through some other means
[01:04:25]
deduplicating your data, usually also improves performance and makes it very clear what the
[01:04:30]
intent is.
[01:04:31]
And then using zippers or nonempty lists, same thing.
[01:04:36]
Retrieving the head of a nonempty list lets you avoid a case-of, which has performance
[01:04:42]
implications.
[01:04:43]
Now granted in many cases the performance improvement we're talking about is small and
[01:04:48]
insignificant.
[01:04:49]
But the true benefit is clearer code.
[01:04:54]
It's nice to realize that you can actually have both.
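A small sketch of the nonempty-list point, in the style of packages like mgold/elm-nonempty-list (the type here is written inline for illustration). With a plain List, getting the head forces a branch at every use site; with a nonempty list, the type itself guarantees a head exists:

```elm
module Nonempty exposing (Nonempty(..), head)


-- A minimal nonempty list: one guaranteed element plus a tail.
type Nonempty a
    = Nonempty a (List a)


-- With a plain List, every caller pays for a case-of and must
-- invent a default (or thread a Maybe through the code):
firstOrDefault : a -> List a -> a
firstOrDefault default list =
    case list of
        [] ->
            default

        x :: _ ->
            x


-- With a nonempty list, the head is a plain pattern match with
-- no empty branch: no Maybe, no default, no runtime check.
head : Nonempty a -> a
head (Nonempty first _) =
    first
```

The safety guarantee and the skipped branch come from the same place: the impossible state (an empty list) can no longer be represented.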
[01:04:57]
Right.
[01:04:58]
And you have to really consider the cost if you're doing performance optimization that
[01:05:02]
makes the code harder to reason about.
[01:05:05]
Also what's to prevent someone in the future from looking at that code and saying, oh,
[01:05:09]
this is kind of ugly, and then tweaking it and breaking the performance fix.
[01:05:13]
But if it's the most elegant way to express it, it's a lasting improvement that's good
[01:05:18]
for your code base.
[01:05:21]
That's also kind of like what motivated me to improve performance of Elm CSS.
[01:05:26]
Because where I work, a lot of the people who write Elm code are working on their first
[01:05:31]
Elm application.
[01:05:32]
They learned Elm because they were hired at Vy or on some other Bekk project.
[01:05:39]
And then we teach them Elm in a day or two and then we throw them out into the deep waters
[01:05:44]
of an Elm application.
[01:05:46]
Now figure it out.
[01:05:48]
Exactly.
[01:05:49]
And so there aren't many people that I work with on a day-to-day basis who have years of
[01:05:55]
Elm experience.
[01:05:56]
And so expecting them to not mess up code that involves HTML lazy is kind of a stretch.
[01:06:07]
So if we didn't need HTML lazy, it is less likely that performance
[01:06:13]
will degrade at some point.
[01:06:17]
Robin, how do you go about finding your next opportunity?
[01:06:21]
Is it like you were kind of describing with this Elm CSS case, scratching your own itch
[01:06:27]
where you're driving home from work and you're like, hmm, can we avoid doing an HTML lazy
[01:06:34]
there?
[01:06:35]
Is that usually where you find your next opportunities for improvements?
[01:06:41]
I would love to say yes, because that's the way it should be.
[01:06:47]
But that is only something I realized once I turned 32.
[01:06:54]
Before that, I was probably where Jeroen is now.
[01:06:58]
Like he has discovered that performance work is really fun.
[01:07:03]
And so he starts looking at, well, maybe I can make this faster.
[01:07:09]
And oh, I could.
[01:07:10]
Maybe I should make this faster.
[01:07:12]
And there's not necessarily anything wrong with that.
[01:07:18]
I don't mean to single you out.
[01:07:21]
No, I turn 32 in like three months.
[01:07:24]
Excellent.
[01:07:25]
Excellent.
[01:07:26]
Looking forward to it.
[01:07:29]
Prepare for wisdom.
[01:07:32]
But really, I did the same thing.
[01:07:33]
So the way I got into performance work was that I re-implemented Elm arrays for 0.18,
[01:07:40]
I think.
[01:07:43]
And the main reason for that was because Elm arrays were buggy.
[01:07:48]
They were written in JavaScript entirely.
[01:07:53]
And then there was like a very thin layer of Elm code to expose it to Elm.
[01:07:57]
And in certain cases it did have visible mutability, and it
[01:08:05]
could cause runtime exceptions.
[01:08:08]
It wasn't good.
[01:08:09]
It wasn't pretty.
[01:08:11]
So the main reason was to rewrite it in as much Elm code as possible to make it safer.
[01:08:19]
But for it to be acceptable, it had to have performance in at least the same ballpark
[01:08:28]
as what was already there.
[01:08:29]
And so that's how I got into performance work.
[01:08:32]
I was trying to make an Elm array replacement, which didn't come at the cost of a huge performance
[01:08:39]
decrease.
[01:08:40]
And like having a benchmark and seeing those numbers go up when you make changes became
[01:08:46]
addictive, and then I just started looking around the Elm core library, seeing what
[01:08:52]
else can I make faster.
[01:08:54]
But really, the most important performance improvements are the
[01:08:59]
ones where you notice a problem.
[01:09:02]
Because I realized that I've spent a lot of time fixing things which aren't an issue and
[01:09:08]
which aren't necessarily likely to be an issue.
[01:09:11]
Wait, are you saying that improving the performance of string.pad is not a big deal?
[01:09:23]
I'm just saying unless you have a performance problem, fixing a performance problem isn't
[01:09:29]
necessarily going to bring value to someone.
[01:09:33]
That's not to say that making something faster just for the sake of making it faster won't
[01:09:39]
be very useful somewhere down the line.
[01:09:45]
And if you enjoy optimizations, especially optimizations which don't make the code look
[01:09:50]
worse and harder to grasp, then there's no harm in it.
[01:09:54]
But if you want to be entirely certain that the work you do has meaning, then ideally
[01:10:00]
you should just come across something where you think this should be faster and then fix
[01:10:05]
that.
[01:10:06]
And I might add, fix that in a scientific way.
[01:10:13]
Don't just think that, oh, if I replace this list.find with dict.get, then it will be much
[01:10:19]
faster.
[01:10:20]
And while it's probably faster now, do measurements and be certain that you are in fact making
[01:10:27]
something better.
[01:10:28]
And in a noticeable way.
[01:10:30]
Yes, yes.
[01:10:32]
So like a thousand times improvement is cool on paper, but if it in practice doesn't change
[01:10:39]
anything, then not saying that you should stop doing what you're doing, you're doing
[01:10:45]
awesome stuff.
[01:10:46]
I'm currently working on something that I think has users, as well as improving performance.
[01:10:54]
So I'm very happy about that.
[01:10:56]
Okay, good, good.
[01:10:57]
But yeah, I remember that in some places I thought that List.append is faster than
[01:11:03]
plus plus and I started using it everywhere.
[01:11:07]
And then I ran a benchmark just on List.append versus plus plus.
[01:11:12]
Yeah, no difference.
[01:11:14]
So I did a lot of changes that were unnecessary and that didn't read much better.
[01:11:20]
So yeah, benchmark it.
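For measuring exactly that kind of question, a sketch with elm-explorations/benchmark could look like this (the list sizes are arbitrary, and results will vary by browser and input shape, which is rather the point):

```elm
module Main exposing (main)

-- A benchmark sketch using elm-explorations/benchmark,
-- pitting List.append directly against the (++) operator.

import Benchmark exposing (Benchmark, compare, describe)
import Benchmark.Runner exposing (BenchmarkProgram, program)


left : List Int
left =
    List.range 1 100


right : List Int
right =
    List.range 101 200


suite : Benchmark
suite =
    describe "concatenation"
        [ compare "two 100-element lists"
            "List.append"
            (\_ -> List.append left right)
            "(++)"
            (\_ -> left ++ right)
        ]


main : BenchmarkProgram
main =
    program suite
```

Compiling and opening this in a browser runs both operations repeatedly and reports the relative speed, which beats guessing before committing to a codebase-wide rewrite.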
[01:11:22]
And ultimately those things don't last.
[01:11:24]
You know, I mean, again, somebody could refactor it because something looks
[01:11:29]
ugly or it's a hack. Or if some code is using List.append and it's a little
[01:11:35]
bit awkward and they're like, why doesn't this use plus plus?
[01:11:39]
They're probably going to change it.
[01:11:41]
Maybe it changes which one's faster than the other.
[01:11:45]
So there's always a cost to making code uglier, right?
[01:11:50]
It's like a make it work.
[01:11:52]
Make it right.
[01:11:53]
Make it fast.
[01:11:54]
But that should be the last resort, if you need to.
[01:11:57]
And if you benchmark it and see there's a problem.
[01:11:59]
Yeah.
[01:12:00]
So don't do this at home, kids.
[01:12:05]
Only do it at work.
[01:12:12]
So yeah, performance work is a hobby.
[01:12:17]
It doesn't always bear fruit, but sometimes it does.
[01:12:19]
And that's great.
[01:12:20]
So it's fine, as long as you're not hurting anyone.
[01:12:25]
Yeah.
[01:12:26]
Well, we do know that we've gotten a lot of amazing performance improvements from your
[01:12:31]
work, Robin.
[01:12:32]
So thank you for your work.
[01:12:34]
Thank you for being on to talk about this with us.
[01:12:36]
And yeah, thanks so much for coming back on.
[01:12:38]
Oh, my pleasure.
[01:12:39]
If anybody wants to find out more, where should they follow you?
[01:12:44]
Where can they go to read more?
[01:12:47]
Any resources to leave people with?
[01:12:48]
I think the best way is to follow me on Twitter.
[01:12:55]
That's @robheghan.
[01:12:58]
Yeah, we'll drop a link in the show notes for people to do that.
[01:13:02]
Because when I do stuff that's related to work,
[01:13:09]
then I post on the Bekk blog.
[01:13:12]
And when I do stuff that's purely my own invention, I do it on my own dev.to account.
[01:13:18]
In either case, it ends up on Twitter.
[01:13:19]
So that's probably the best way to.
[01:13:21]
Perfect.
[01:13:22]
All right.
[01:13:23]
Thanks again, Robin.
[01:13:24]
Jeroen, until next time.
[01:13:26]
Until next time.