Security in Elm

We discuss what makes Elm's security vulnerability surface area smaller, and what frontend security considerations are important in Elm apps.
January 2, 2023


Hello Jeroen.
Hello Dillon.
I will admit, today's topic feels very vulnerable to me.
I'm feeling a little insecure about talking about this.
Well, you know what? I'm used to fixing vulnerable things.
You would never exploit my insecurities, would you Jeroen?
I didn't prepare puns, okay?
What are we talking about today?
I feel so much at a loss here.
We are talking about security and Elm.
Well, the Elm part I think most people know, so they could have guessed that.
But we're talking about security.
That's the only word that we have agreed upon.
So what comes next depends on what we want to talk about.
Just in general, Elm feels like a pretty secure language in the sense that
I don't ever wonder about, is my code secure?
Which is a surprisingly fresh feeling.
And maybe I should feel bad about it, but I kind of know I don't have to worry about it.
Well, we should explore that question a little bit.
Where do we feel a sense of security and where is it a false sense of security
that we have about our Elm code?
Where do we need to take precautions to make sure we're writing secure Elm code?
Also, security is a large topic.
So just to clarify, I think we're not talking about backend security here.
I think we're talking about frontend security.
Yeah, I think we're going to talk about security for whatever Elm is used for.
And Elm is used for frontend.
I guess with tools like Lamdera, it extends to the backend somewhat.
But yeah, I'm going to stick to talking about the frontend for now, at least.
Right. Yeah, to me, the entire concept of frontend security is really,
I think, pretty interesting and subtle.
Like in a way, it's pretty obvious when you think about backend security,
you think about Bobby Tables, you think about SQL injection attacks.
And DDoS attacks as well.
DDoS attacks, right.
Which are denial of service, distributed denial of service.
Right. And to me, those are pretty easy to wrap your head around, right?
Because you're like, well, the user makes a request.
I need some data to know what they're searching for that I'm going to use
in a database query to perform a search or to do a user lookup or something.
It's like, well, if I don't scrub my inputs, then they can just drop the database.
And that, like, yes, we don't want arbitrary code executed on our database.
Right? Yeah, exactly.
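The Bobby Tables idea can be sketched in a few lines of JavaScript. This is a hypothetical illustration (the function name and query are made up, and no real database is involved): building a query by string concatenation lets a quote in the input break out of the value and smuggle in a second statement.

```javascript
// Hypothetical sketch of unsanitized input: string concatenation lets
// special characters escape the "box" the value was supposed to stay in.
function buildQueryUnsafe(username) {
  return "SELECT * FROM users WHERE name = '" + username + "';";
}

const malicious = "x'; DROP TABLE users; --";
console.log(buildQueryUnsafe(malicious));
// The quote in the input terminates the string literal, so the result now
// contains a second statement that drops the table.
```

This is exactly why parameterized queries exist: the database driver keeps the value and the statement separate, so no character in the input can terminate the statement.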
But when you think about frontend security, the web is this strange place
where the user can open up dev tools in their browser,
and they can run JavaScript, right?
So making it secure, like in the backend, it's this protected, sacred space
where you're like, I run code, and when you give me data,
I need to make sure it doesn't cause code to run that I didn't intend to.
That's pretty obvious.
But the user can just run JavaScript code anytime they want
and do anything in their frontend application.
So why does frontend security matter?
Right. Well, I think it's always the same thing for the frontend and the backend.
It's basically you don't want people to be able to run arbitrary code
where they shouldn't, or rather where it would have an impact on someone else.
So if you run a request and send it to the server,
and it drops a database table, well, that will impact a lot of people.
So we don't want you to be able to do that, right?
For the frontend, you can run any arbitrary JavaScript code in your browser.
Like you open your console, as you said, and then you can do whatever.
You can mess up the internal state of the application,
and the application might crash, and that's kind of bad,
but you brought it upon yourself, right?
If it's only for yourself, who really cares?
You actively tried to mess with your machine.
So it's not something that we care about.
But if you're trying to execute some code on someone else's machine,
or on the server, or somewhere else where it might affect someone
who didn't want to be affected, then that's a security issue.
Right. Yes. That's a great point.
It's about certain trusted parties.
And when you're on a client page, a web page,
then the user is a trusted client,
and they can do whatever they want, and that's fine.
But, yeah, if you're at your bank's website,
and you're logged in, and you're browsing through your bank's chat forums
that let people post statements about how cool their bank is
and what fun financial things they're doing,
then if they can embed HTML in there
to put little emojis and bold things,
and you let them put whatever code they want,
and whoops, I let them embed a script tag,
now when Jeroen logs in to the bank's site,
and Dillon has put in a script tag that can access Jeroen's cookies,
that is an untrusted party running code on your machine.
So it's a little more subtle,
but it comes down to, as you're saying,
trusted parties and trusted environments.
So obviously a backend server is a trusted environment
that untrusted parties need strict restrictions
on how they can use that,
and arbitrary code is the opposite of strict restrictions.
Yeah, and when you say trusted parties,
trusted parties for specific actions as well.
Right, right.
You as a user are allowed to create other users maybe,
if you're an administrator, if you're not, then you're not allowed to,
and you're creating some kind of content on the website
that will be seen by other people,
and in a way you impact their application,
because they now see your content or your actions,
but that is allowed, right?
But injecting a script, not so much.
Right, totally.
Yeah, so it is all about trust and privileges,
because yeah, that's a great example.
You could be an admin user,
and the thing is, it does require trust to have an admin user,
and that trust can be abused.
So if the bank wanted to,
they could steal your data,
like they have it, and they have control,
like they could certainly inject a script tag on your site, right?
Like that's not even injecting,
that's just serving you HTML that does a malicious thing.
So the bottom line is, like, it is a trusted party.
We are trusting the bank, right?
Which is why, like, if you go to a phishing site
that's going to steal your data,
or you're trusting a site enough to put your credit card information into a form,
it's important that the page is who it says it is,
and is not contaminated by other untrusted parties pretending to be it.
But it does require you trusting them.
Like, we think that Amazon is probably not going to go
take our credit card information we submit there
and, like, make fraudulent payments, but, like, they could.
We're assuming they won't because we're trusting them,
and that seems like a pretty safe thing to put our trust in.
So that's the reason that front-end security matters
is because when you go to a website,
you are trusting that party.
And if code can be executed by other parties,
you might not trust that party.
So that contaminates the trust by allowing untrusted parties
to be executing code on your browser.
So, like, it's kind of an amazing thing to me,
that the browser is this, like, very open, transparent environment
where, like, you can set cookies and you can present data
and you can, like, submit forms with passwords and things.
And somehow, like, in this very transparent, open thing,
we're able to build an environment where we can sandbox things enough
and come up with these conventions where trusted parties,
like, you can submit a password and it's pretty secure.
That's kind of amazing.
So the web is like this fragile ecosystem built on these tools for trust
that are largely, like, tools that the browsers provide
through these standards.
But there's only so much the browser can do:
it provides these standards and these conventions for us,
but if we're not scrubbing our input properly,
then there's nothing it can do,
because it can only trust the HTML and code we give it.
But if we don't sanitize it, then we're giving them code
that is going to do bad things.
So front-end security, I think, largely comes down to sanitization.
Yeah, so when you say sanitization, what you mean is sanitizing the data
that is coming from the back-end, the responses that you get
from HTTP requests, like, I want to render this HTML.
Well, we need to sanitize it so that we don't inject scripts, mostly, right?
Yeah, well, let's talk about that.
There are different sanitization contexts.
And actually, this is one of the things that makes web security difficult
and front-end security difficult,
there are so many different languages and contexts.
So in a sense, frameworks can help with sanitization
because if you're doing React and you use a prop somewhere,
then it's going to sanitize it for you,
unless you say dangerouslySetInnerHTML,
and then it says, oh, I'm going to trust the context this comes from
and arbitrarily put HTML tags rather than escaping HTML characters, for example.
When you render HTML but not when using props or...
Okay, I think I got the idea at least, yeah.
In certain contexts.
So you can only do so much because you're trying to allow
programmatic things to be done, right?
So if you're like...
But you can at least say, well, if I'm setting an attribute,
this is the Bobby Tables thing, right?
We'll link to the Bobby Tables in the show notes
for anyone who hasn't seen that XKCD comic.
But basically, if you can get like a semicolon or whatever
into your SQL query, then you might be able to start a fresh SQL query
and escape out of that box that you were put in where it's like,
this box is for find user by username.
The user gets to provide their username,
but you break out of that box by using these meaningful characters.
So if you escape those meaningful characters,
you can't break out of that box because there's no way
to terminate a SQL statement or terminate the HTML attribute.
So that's how frameworks can help by escaping those things.
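The escaping being described can be sketched in a few lines. This is a minimal, hypothetical version of what a framework does for you, not any framework's actual implementation: the characters that are meaningful in HTML get replaced by entities, so user input can never close the "box" it was placed in.

```javascript
// Minimal sketch of HTML escaping: meaningful characters become entities,
// so injected markup renders as inert text instead of executing.
function escapeHtml(input) {
  return input
    .replace(/&/g, "&amp;")  // must come first, or we double-escape
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

escapeHtml('<script>alert(1)</script>');
// → "&lt;script&gt;alert(1)&lt;/script&gt;"
```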
Yeah, this is actually one of those things where I let the backend
do most of the security checks.
For instance, if you somehow have to send a SQL query to the backend
or something that will be transformed to create a SQL query,
you can put as many checks in the frontend code as you want,
like making sure that you can't delete a user
or that you don't craft any weird SQL queries.
But if the backend allows that, then at some point
that is going to happen, either because someone just did a curl request
to the backend, which means it doesn't go through the frontend.
So whatever security checks you have in the frontend don't matter at all,
or they ran some JavaScript to bypass all the security checks
that you did in the frontend, and you're at the same point again.
So the backend anyway has to make the necessary checks.
So if you don't want someone to delete a user or delete some content,
then you're going to have to do that at the backend.
You might do it in the frontend for a nicer user experience,
but it's not going to be necessary from a security point of view,
in my opinion.
Absolutely. I definitely agree that this is a very important point
that sometimes people write frontend validations because it's like,
well, you need to have frontend validations
because you need a good user experience.
So I'll check for these things.
But then I already validated this on the frontend,
so I don't need to validate it on the backend again.
That would be like duplicating things.
But of course, as you say, it is anything that's coming from the client
cannot be trusted by the backend.
So that's like another dimension of this is there's like the frontend
to backend security, which essentially you need to treat any client
as untrusted.
And again, to me, that's this sort of magical thing
that we've been able to create something where you can create trust
in this environment where you don't really know,
like the client could be anyone,
and somehow we can create trust within that context.
That's pretty interesting.
Yeah. Like the backend even, it can't really trust much of its code,
except when you have opaque types, right?
If the backend has opaque types, then you can absolutely trust that code.
But when it comes from the frontend, well, that's going through HTTP,
that's going through GraphQL or some other kind of protocol.
And therefore, it's raw data that has no inherent security
or validation on it, right?
Right. Exactly.
So you need to check it again, at which point you turn it
into an opaque type, blah, blah, blah, blah, blah.
You know the drill.
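The pattern Jeroen is describing can be sketched in JavaScript terms (the names here are hypothetical, and a plain object stands in for what an Elm opaque type would guarantee): raw client data is checked once at the trust boundary and wrapped, and downstream code only accepts the wrapped form.

```javascript
// Sketch of "check it once at the boundary": raw data from the client is
// validated and wrapped; only the wrapped value is passed onward.
function parseUsername(raw) {
  if (typeof raw !== "string" || !/^[a-z0-9_]{3,20}$/i.test(raw)) {
    return null; // reject anything that doesn't match the expected shape
  }
  return { validatedUsername: raw }; // stands in for an opaque type
}

parseUsername("jeroen_e");          // → { validatedUsername: "jeroen_e" }
parseUsername("x'; DROP TABLE --"); // → null
```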
Absolutely. Yes. Yeah.
And so then there's just the stuff that happens purely on the frontend.
And again, I think if my bank allows my post
on their message board, which is a bad idea,
they shouldn't have built that feature, but they did.
They let me post arbitrary HTML in there, and they don't scrub the inputs,
and they let me put script tags, and they let me modify the DOM
of the page, right?
Now they're allowing me to essentially open up the dev tools
in another user's browser that's looking at their bank
in a logged-in session, or maybe a logged-out session,
and I go and take advantage of that by essentially I can now execute
arbitrary code on another user's page through posting on that message board.
I add a script tag that says, okay, hide all of the stuff
on the original page, and then put in my HTML that is going to,
when they log in, it's going to do on submit, post it to my phishing site,
and now I've stolen their credentials when they try to log in.
So we're putting a lot of trust in, like, typing in a password
in an input field in a web page and then hitting enter
is kind of a move of extreme trust.
And so we've been sort of, we haven't said the term,
but we've kind of been talking about cross-site scripting attacks
or XSS attacks, and so that's why it's so important,
because we are trusting that, like, we're putting a lot of trust
in just, like, the HTML elements that are presented on that page.
And if we abuse that, like, if we allow a malicious user
to take control of another user's, like, web page,
like the HTML they see on their page and script tags they see on their page,
then all of that trust is now tainted and we can do malicious things
because they're trusting us as if we were that authoritative trusted source.
So that's why frontend security is really important.
We don't necessarily think about it a lot,
but it's an important thing to be aware of.
Absolutely. Yeah, the results can be quite disastrous, right?
So in my opinion, or at least in my understanding,
what it all comes down to is that you shouldn't trust
any arbitrary data from the back end either.
The back end should try to avoid saving arbitrary HTML
or arbitrary JavaScript code, because it will then be likely
to send it to the front end, but the front end should also try
to avoid rendering and executing that code.
And that's where multiple frameworks and languages have different strategies for this, right?
React has one that you mentioned where we sanitize things.
Most JavaScript frameworks do this or will tell you,
oh, you should always sanitize, never forget to sanitize.
Have you sanitized yet? Have you sanitized it once, twice, three times?
The charm.
And Elm has a different strategy for this, right?
Where you are not allowed to do this.
You're not allowed to render arbitrary HTML or run arbitrary JavaScript code
unless you, through some escape hatches, allow it to.
Let's dive into a few strategies for attackers
and how they inject things, right? Or how they do bad things?
Because that will also open up how we can defend against them.
So in the beginning, there was JavaScript.
I might be going forward a little bit like when browsers were like five years old,
but let's say in the beginning.
ECMAScript, perhaps?
Not yet, I think.
Didn't it start as ECMAScript?
Hmm. We're going to need to look this up.
I think ECMAScript was later and that was something,
like they couldn't call it JavaScript because Oracle had the trademark or something.
I'm not sure either.
So there was JavaScript.
And browsers, they added support for JavaScript, all of them.
And JavaScript is super useful.
We've talked about it before.
It has a few quirks.
We've talked about those before.
But basically in JavaScript, if you want to make a HTTP request,
it is pretty easy.
Well, somewhat easy.
You do new XMLHttpRequest, pass in some arguments.
And basically in one line of code, you can run arbitrary JavaScript code
or send arbitrary HTTP requests.
And that doesn't have any checks on it.
It does not have any sandboxing around it.
You just write that code, execute it, and boom, the HTTP request is going.
And that can happen in any piece of code.
Like if you write that in a line of JavaScript code, it runs.
If you run that in a getter method, it runs.
If you run that in a promise, it runs wherever.
And that is quite dangerous as well because that means that you can
potentially run this in prototypes or methods that have been overwritten
by prototypes.
But maybe I'm going too far ahead.
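Jeroen's point about effects hiding anywhere can be sketched like this. It's a hypothetical illustration: the `sendRequest` function here just records URLs, standing in for a real `new XMLHttpRequest(...)` call, but the shape of the problem is the same.

```javascript
// Sketch: in JavaScript, any expression can hide an effect.
// sendRequest is a stand-in for an actual XMLHttpRequest call.
const requestsSent = [];
function sendRequest(url) { requestsSent.push(url); }

const user = {
  get name() {
    sendRequest("https://attacker.example/steal"); // fires on property access
    return "innocent";
  },
};

const n = user.name; // merely reading a property triggered a "request"
console.log(requestsSent.length); // → 1
```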
Are you telling me that you can just write code and it just executes
side effects?
It's bad.
That's so strange.
Did I tell you it's bad?
It's bad.
So yeah, there's no control over those effects, right?
And that's why we call them side effects even though in a way they are
done on purpose.
They're not side effects.
They're purposeful effects.
But they're not managed effects like the way that we like to use them now.
Right, because we are performing effects but it's constrained where they
can occur and what can trigger them.
Yeah, exactly.
And so because you can do pretty much any kind of code, any kind of effects
when running JavaScript code, the strategy that Elm has taken to prevent
security issues is to prevent JavaScript from running at all, because that
makes it surprisingly much easier to control everything.
And for that Elm has chosen a few strategies.
So how do you prevent JavaScript from running?
Well, first of all, you don't have a direct interop.
You don't have FFI.
You can't just run arbitrary JavaScript functions inside Elm code.
So that has been a pain for a lot of people, especially when going from
Elm 0.18 to 0.19.
But it does mean that you don't have those security issues, and that is
quite cool.
So just to clarify, to put on our malicious hacker hat again for a second,
what is the malicious hacker trying to do in this case?
Like what's the attack vector they're trying to exploit?
Are we talking an NPM package that I npm install, like is-even, and it makes
an HTTP request?
Yeah, so you've got multiple ways of doing things.
So as we said before with XSS, cross-site scripting, where if you have
the backend returning some arbitrary JavaScript code that someone else,
someone malicious has entered, then executed that would trigger
HTTP requests which can now send your cookies and other important
information to the malicious attacker.
An attacker.
But yeah, you also have NPM as you said where if you have a JavaScript
application and you have installed a malicious package, well if you
execute that code, then it can run arbitrary JavaScript code, right?
So it can do that somewhat explicitly like, oh, you call that function
from that library and then it does some HTTP request or it can do it
in weird ways like it can change the prototype of core functions
from JavaScript.
So if you try to access window.Array.from or whatever, then now
suddenly it starts making HTTP requests in a very unexpected way.
And that is why you get a lot of these security issues about prototype
pollution: because now people can do weird things with the prototype,
meaning anything that looks even normal can do something weird.
And this is not theoretical.
Like there have definitely been like supply chain attacks where a
package that's used by millions of packages upstream, it's a dependency
in the node modules of this one dependency, either the author of that
package or maintainer of that package decides to put something malicious
in or somebody convinces the maintainer of an NPM package to give them
the keys because they're stepping down as maintainer, they're looking
for new maintainers or they maliciously take control of an NPM account
or whatever.
These things have happened many times, increasingly so.
And in some cases you've got a little bit of both, like where a package
doesn't do anything malicious, but it somehow has the capability of
changing the prototype of something because it can set an arbitrary value
at an arbitrary position. Lodash, for instance, I think had an issue
like that where, I don't know the details, but I imagine with the _.set
function, you can change the prototype of something.
So now if a package uses lodash and uses the set method and the path at
which you set something is supplied by the user, well now the user can
set an arbitrary value at a prototype and that could be some JavaScript
code running HTTP requests.
And that's where it becomes really hard to audit things.
Some people do that hard work, but yeah, you end up asking, does it
really matter?
Well, in some cases it doesn't, in some cases it does.
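The lodash-style issue can be sketched with a hypothetical deep-set helper (this is not lodash's code, just the same shape of bug): if the path comes from user input, `"__proto__"` in the path walks up to `Object.prototype`, and the write pollutes every object in the program.

```javascript
// Sketch of prototype pollution via an unsafe deep-set helper.
function unsafeSet(obj, path, value) {
  const keys = path.split(".");
  let cur = obj;
  for (let i = 0; i < keys.length - 1; i++) {
    if (cur[keys[i]] === undefined) cur[keys[i]] = {};
    cur = cur[keys[i]]; // "__proto__" walks up to Object.prototype
  }
  cur[keys[keys.length - 1]] = value;
}

// A user-supplied path can now pollute every object:
unsafeSet({}, "__proto__.polluted", "surprise");
console.log({}.polluted); // → "surprise", even on a brand-new object
```

Hardened versions of such helpers simply refuse `__proto__`, `constructor`, and `prototype` as path segments.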
Right. So most of these types of problems don't exist in Elm, not all
of them, but most of them.
Yeah. Not on their own, right?
Right. Because you can always do interop, you can do JavaScript interop
from Elm using user input.
You can install NPM packages in an Elm app.
Yeah, exactly. And if they do something weird, well then that impacts
your Elm code as well, but it's mostly because you installed an NPM
package which has this vulnerability.
So it's not because Elm is vulnerable in this aspect.
So yeah, the way that Elm prevents you from doing all these weird
things that lead to problematic results is it doesn't have FFI, and you
also can't just render plain HTML like you can with React where you
have dangerouslySetInnerHTML.
Right. And as a workaround, sometimes Elm users will use something like
an HTML parser package.
They'll take user input, they'll parse it, and then they'll render that.
And then they get something of the HTML type in Elm that they can render
on their page.
At least, though, in that case, there are a few attack vectors that are
closed off.
You cannot include script tags there.
You cannot include on-click handlers or these types of things and put
arbitrary JavaScript as a string as those attributes.
So some of these attack vectors for...
Essentially, you actually just can't put JavaScript in there.
So you can put HTML, and you actually can do malicious things with only
HTML tags if you're accepting untrusted HTML, but you can't execute
JavaScript from that.
So Elm shuts off that attack vector.
So to continue on the ways around that that you mentioned, so yeah, you
can parse the HTML and you can re-render it using plain Elm functions,
or you can use ports or web components to route that to JavaScript and
let it render the whole thing.
Potentially, and hopefully, with some sanitization.
So you mentioned that you couldn't render script tags.
Why not?
Well, I mean, partially, I think it's just the Elm language's philosophy
of pure functions, and it introduces a break in that mental model, a
leak in that model, if you have script tags, because now you say, well,
this is a pure function, but it renders a script tag that makes an HTTP request.
And we're back at square one.
So I think that, I mean, I think security is one motivation, but perhaps
it's almost a side effect, if you will, of that.
I would call it a controlled effect of not wanting side effects.
So the reason why, the technical reason why we can't have these script
tags is because Elm's virtual DOM, which is what the Elm slash HTML
package and Elm UI and Elm CSS all use, and the Elm virtual DOM is the
one responsible for creating the DOM nodes.
And basically, whenever you call one of the primitive functions, you
pass in the tag name, so that can be a div, that can be an a tag.
And if you try to pass in a script tag, then Elm will actually look at
that and see, that looks very much like a script tag.
Let me replace that by a p tag, so a paragraph.
And therefore, you now have a p tag with some JavaScript code inside of
it, but when it's rendered to the browser, the browser will not execute
it because it's not a script tag.
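The check Jeroen describes can be sketched in JavaScript. To be clear, this is a simplified illustration, not the actual Elm virtual DOM source: a tag name that looks like "script" is swapped for a harmless paragraph tag, so the code inside renders as inert text.

```javascript
// Simplified sketch of the tag-name check: anything that looks like a
// script tag is replaced by a "p", so the browser never executes it.
function safeTagName(tag) {
  return tag.trim().toLowerCase() === "script" ? "p" : tag;
}

safeTagName("script"); // → "p": the code inside renders as plain text
safeTagName("div");    // → "div"
```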
That's embarrassing.
You've got your script tag rendered in the DOM in a p tag?
Yeah, like you, attacker, are making a fool of yourself.
Look at the XMLHttpRequest call that you're trying to show to your users,
to your fellow users, right?
Kind of embarrassing.
And yeah, so Elm does quite a lot of these similar tricks where it tries
to prevent you from declaring and executing JavaScript code through
different ways that the browser would understand it.
So script tags are one.
Onclick, yeah.
Onclick or event handlers in general, it prevents those.
And basically the way that it does that is not through sanitizing the
JavaScript or the inputs.
It's by doing something a little bit easier performance-wise by disabling
the tag, like replacing the script tag by p tag, for instance.
So the only check is looking at the tag name or checking the attribute
name or the property name.
So for instance, if you have an event handler, like whenever you click
on something, you could have something that says console log or make
some HTTP request.
Well, the Elm virtual DOM will look at the name of the attributes.
And if it looks like something like onclick, then it will change that
to data-onclick.
And that way it is just data, and the browser doesn't try to execute
it; it doesn't look like an event handler from its point of view.
So yes, these are the kinds of small tricks, small checks that the Elm virtual
DOM does, and that makes it really hard to inject JavaScript.
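The attribute-name check can be sketched the same way as the tag-name check. Again, this is a simplified illustration rather than the actual Elm virtual DOM code: names that look like event handlers get a "data-" prefix, turning an executable hook into inert data.

```javascript
// Simplified sketch of the attribute check: handler-looking names are
// prefixed with "data-", so the browser treats them as plain data.
function safeAttributeName(name) {
  return /^on/i.test(name) ? "data-" + name : name;
}

safeAttributeName("onclick"); // → "data-onclick"
safeAttributeName("class");   // → "class"
```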
I know there are a few ways we can still do it.
And I know that some issues have been opened recently, but I think that
those will be solved through some changes to Elm virtual DOM.
You had some that were merged in recently as well, right?
In the last year or so.
Sometime this year.
Oh, actually, when is this episode released?
Sometime in 2022, I'd say.
So I made a blog post about that, that explained kind of what I'm talking
about now and the different vulnerabilities that were found and fixed.
So just to clarify this point: for injecting JavaScript, most of
those vectors are closed off in Elm, which is kind of a unique thing,
and which is really great.
It simplifies how you reason about these injection vulnerabilities.
One thing that I want to point out is that so the web does give you, again,
like these sort of trusted handshakes that somehow fit together, even though
there are all these points where we're giving a lot of trust to like, yeah,
you can enter your password and send it to this place and it'll probably
work out fine.
We're putting a lot of trust, right?
Another place that we put our trust is like, if you, for example, if you
inject an image tag, it will perform a get request.
So that's actually a way to inject a get request.
It's a bit subtle, but you put that in the page, the browser renders it,
and the browser performs a get request.
Yeah, which is what some people call a pixel tracker, I think.
Well, I would say that's a special case of it.
A pixel tracker is one way of abusing that.
So one of the reasons why, so it performs a get request, not a post request.
So this is one of the reasons why the HTTP method matters. It's actually
relevant on the backend side, but there's an interplay between backend
and frontend here.
So let's say you have a logout endpoint that accepts a get request.
Now you post an image pointing to slash logout, like image source equals
slash logout.
You somehow managed to inject that image source on a page, which there are
many ways to successfully do that.
And now you've logged out the user.
That's inconvenient.
Maybe you even have effectively locked them out of that account where they
can't enter their account.
That's not good.
So that's something that you have to understand, these handshakes and
protocols and how these different tools and conventions are giving trust and
authority by using them.
And so because you're essentially saying, like, I will allow, like, within my
web page, if you make a get request, I will perform that get request and send
along my cookies.
Now, usually the way it's set up, it's only going to send those cookies to the
same domain.
So usually that's not going to be an issue because you trust sending those
cookies to that same domain.
But you need to be aware that you're, by using these web standards, you are
putting trust by doing certain things.
So that's why, like, you need to be sure to not perform side effects when a get
request is done, more or less.
You can, okay, I can give this data.
I can do analytics and yeah.
There's a bit of a gray area, but you need to be careful about that.
You need to be aware of what you're essentially authorizing by choosing to
accept the get method by performing some action.
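The rule of thumb here can be sketched as a tiny helper (hypothetical names, not from any real framework): since the browser fires GETs on its own, for example for an injected image tag pointing at /logout, only non-GET methods should be allowed to perform state-changing actions.

```javascript
// Sketch: the browser issues GETs on its own (e.g. <img src="/logout">),
// so side-effecting actions should require a non-GET method.
const SAFE_METHODS = ["GET", "HEAD", "OPTIONS"];

function mayHaveSideEffects(method) {
  return !SAFE_METHODS.includes(method.toUpperCase());
}

mayHaveSideEffects("GET");  // → false: never log someone out on a GET
mayHaveSideEffects("POST"); // → true: fine for a real logout endpoint
```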
So we've talked about people injecting things into NPM packages, for instance.
We haven't talked about doing it through Elm packages.
But you can probably do similar things, right?
So if you want to make an HTTP request from an Elm package, there are
multiple ways to do it.
But the main one is to do it through the Elm slash HTTP package and to call
the HTTP functions and therefore return either a task or a command.
So already the nice thing about this in Elm is that you have a type that
tells you, hey, this is doing something.
This is an effect.
And these are pretty much the only things that you have to look at.
If you want to audit things carefully, make sure that no one is sending
weird requests over the wire.
Look at whatever is returning a task or a command or whatever that contains
one of these.
And you can only use them in the context of an update function.
If these are used in a view function, it doesn't matter.
They won't get executed.
That's something that differs from something like React: in Elm, even if
you call this function, it's not going to get executed.
But in the context of React, yeah, you might want to worry about this,
because calling this function will trigger the HTTP request.
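The difference being described can be sketched in JavaScript terms. This is a hypothetical shape, not Elm's internal representation: the function returns a description of the request instead of performing it, the way an Elm command does, and only a runtime that receives the description would execute anything.

```javascript
// Sketch of a managed effect: return a description of the request
// (just data), rather than firing the request when called.
function getUser(id) {
  return { type: "HTTP_GET", url: "/api/users/" + id };
}

const cmd = getUser(42);
// Nothing has happened yet; only an update loop / runtime that receives
// this description would actually execute the request.
console.log(cmd.url); // → "/api/users/42"
```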
And the other way that you can create HTTP requests is, like you said,
with HTML.
And this is something that is actually possible in Elm, but doesn't happen
in practice.
And I'll talk about that, why.
So someone can create some div or HTML elements and publish that as part
of an Elm package.
And if that one contains an image with a source that leads to a malicious
attacker's URL endpoint, then automatically we'll be sending data,
cookies potentially, to the malicious attacker.
Although cookies are...
Yeah, not cookies.
I think over HTTPS, cookies will never be sent cross-origin.
Ah, maybe, yeah.
You know what?
No, I think when you set cookies, you can set the SameSite policy.
So the default SameSite policy is lax, which sounds like it would
be relaxed, but it's actually fairly strict.
So you can do lax, strict, and the third option is none.
But only the none option, which is not the default, will allow cookies
to be sent in cross-origin HTTP requests.
So generally, you're not going to have cookies being sent to other domains.
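Concretely, this policy is just an attribute on the Set-Cookie header the backend sends. A minimal sketch (hypothetical helper; cookie name and value are made up), showing that only SameSite=None, combined with Secure, opts a cookie into cross-origin requests:

```javascript
// Sketch of building a Set-Cookie header; SameSite controls whether the
// cookie rides along on cross-origin requests (only "None" allows it).
function setCookieHeader(name, value, sameSite) {
  return `${name}=${value}; HttpOnly; Secure; SameSite=${sameSite}`;
}

setCookieHeader("session", "abc123", "Lax");
// → "session=abc123; HttpOnly; Secure; SameSite=Lax"
```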
I have heard of people using this technique to notice when the policy
was not set the right way.
So there is a pixel tracker or something that just sends requests to the
malicious attacker's servers.
And the only thing that it actually tells the attacker is that, hey,
this website here is vulnerable.
Yummy cookies.
It's not correctly protected.
If there's one thing hackers love, it's yummy cookies.
So yeah, this you can do in Elm.
So if you really want to do an audit, you need to check for tasks,
commands, and HTML.
Although for tasks and commands, you only have to care about whether
the package depends on Elm slash HTTP.
Otherwise, it doesn't matter.
There's nothing that I think people can do without that.
Yeah, or anything that influences the strings in the URLs of your HTTP requests.
So you have to look at, in general, the flow to these insecure
sort of endpoints.
What are the points in your application where you can do potentially
malicious things, HTTP requests, image tags, things like that?
How is data flowing to them?
So if you perform an HTTP request and it's a hard-coded URL and you can,
you know, add a query parameter from the user, that's very different than
if you're taking the entire URL from the user input.
That can potentially be more malicious.
So you have to consider, like, the flow of trusted and untrusted inputs
and trusted and untrusted code to these potential attack vectors.
Absolutely, yeah.
In practice, I found that to be quite rare, but yeah, absolutely.
And in Elm, there are fewer of these attack vectors and there's, like,
a cleaner, clearer flow to them.
So it's just there's less attack surface area and an easier way to analyze
how untrusted things may go to them.
You can't just say, well, replace the HTTP or XMLHttpRequest
implementation with this, and now the URL will always be this.
Right, exactly.
And Richard Feldman gave a talk.
He was at Oslo Elm Day, and I'll link to the talk.
I've mentioned it before, but he made a really nice point there.
He also talked about NPM package vulnerabilities and all that stuff.
But he was talking about how, in Elm, you know, why would you even want to install...
Richard's like, I trust Luke Westby, but why would I use
the elm-http-builder package?
Why would I use any third-party code to construct an HTTP request when,
if you look at the API, you can just copy-paste the parts of it that you want,
look at the code, make sure it looks good?
And then not only is it, like, secure stuff that you don't have to trust
and you don't have to think about if I'm updating the version,
did another maintainer take over and push commits in there or whatever,
but now you can say, well, this is the hard-coded URL for our API.
So the HTTP builder API, you can custom-tailor it for your needs
and make it safer where you're not even thinking about what URL does it go to
because it only goes to one URL, or it goes to here's a custom type
of the three different possible URLs it can go to, and you choose one.
So it's a tool for reasoning and constraining these things
so that you can analyze the flow even more easily
and not have to depend on third-party code.
Because for so much of this third-party code, you can
just vendor it: bring it into your own codebase and build it yourself.
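The idea Richard describes might look like this in application code. This is a hedged sketch with made-up endpoint names: because the module exposes no function that accepts a raw URL string, every request in the app can only go to one of these known destinations:

```elm
module Api exposing (Endpoint(..), get)

import Http

-- The only URLs this application can ever call.
type Endpoint
    = Users
    | Orders
    | Settings

toUrl : Endpoint -> String
toUrl endpoint =
    case endpoint of
        Users ->
            "https://api.example.com/users"

        Orders ->
            "https://api.example.com/orders"

        Settings ->
            "https://api.example.com/settings"

-- Callers pick an Endpoint; untrusted input can't smuggle in
-- an arbitrary URL because no such parameter exists.
get : Endpoint -> Http.Expect msg -> Cmd msg
get endpoint expect =
    Http.get { url = toUrl endpoint, expect = expect }
```

Auditing the flow of untrusted data then reduces to auditing this one small module.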
Yeah, and because of this, we tend to not have many packages.
We don't tend to have many dependencies in our applications.
And that makes it much easier to audit our code if we really want to,
because there's, I don't know, maybe a few dozen dependencies
for very large applications, and that's it.
And you only have to look at the ones that depend on HTML,
Elm Virtual DOM, and Elm HTTP.
And that reduces it down to maybe two or three at most.
And auditing, I think, is where this stuff you were talking about before
really comes into play, the prototype pollution and things like that,
where you can set global variables and have effects happen
in unexpected places that get triggered in strange places.
You just don't have to think about those things with Elm.
So auditing code is so much more straightforward.
Yeah, just like it's also very simple code, right?
It's Elm, there's no mutations, there are no global effects, blah, blah, blah.
But also because we have so few dependencies, right?
That's very different from the NPM ecosystem,
where people use thousands of NPM dependencies,
often without knowing it, because there are so many indirect dependencies.
And there's this one thing where we're lucky that the Elm community is quite small,
and that is security.
Because if there are 10 million JavaScript developers,
and like, I don't know how many we are.
I don't want to make it too low nor too high.
I don't know, how does that work?
Less than 10 million.
Yeah, less than 10 million developers.
Well, then it's not going to be very interesting for attackers
to create these malicious packages.
Like maybe one of them or two of them will do it.
But also the things that they're going to be able to do with it
is going to be very restricted compared to what you do in JavaScript.
And also, actually pulling it off is much harder
because you are much more constrained.
Oh, you want to make an HTTP request, then you have to go through
an update function to return something with a task or a command.
And that's going to be a lot more obvious.
Or you need to do something like a pixel tracker,
and then you're kind of limited in the kind of information
that you can send to the malicious attackers' servers.
And also, you don't have access to cookies in Elm in practice, right?
So there's almost no information that you can send.
You can't do document.cookie and reach into the cookie jar to get them, yeah.
And we should mention that document.cookie
allows you to get the cookies that are visible to JavaScript.
One practice that I really like is using HTTP-only cookies.
So they're not visible through document.cookie.
They're only visible, the server receives them as a request header
when you make an HTTP request to the same origin, to the URL you're on.
And it will not send any of the other cookies to any other places?
Right. Yes.
If you're using the default SameSite security policy
for your cookies, then it will not send them to other domains.
So that's a very secure way to do it.
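In header terms, with illustrative values: the server sets the cookie, the browser sends it back only as a request header, and page script can never read it:

```http
HTTP/1.1 200 OK
Set-Cookie: session=abc123; HttpOnly; Secure; SameSite=Lax

GET /profile HTTP/1.1
Cookie: session=abc123
```

Because of `HttpOnly`, reading `document.cookie` in the page will not include this cookie, so even injected JavaScript has nothing to exfiltrate.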
It does, well, we'll talk about this in the future,
but you sort of need to be able to have a server-side story
where you're deciding what data to send, right?
So that could be a Rails application where you're maybe grabbing,
like deciding that the user is logged in in a Rails application
that can read the cookies in the incoming request
and then say, okay, the user is logged in.
Here's the user-specific data that I'll send
that can be passed in as flags to the Elm application that's rendered.
Elm pages v3, you can do similar things because you can, in pure Elm,
look at the incoming server request cookies and do a logged in user session.
So, but those types of approaches where you have a server involved in the process
rather than a client-only application
that doesn't have an opportunity to look at HTTP cookies,
you can do some of these practices that are just easier to reason about
that you're doing in a secure way.
There's less to think about protecting the cookies from an attacker.
So we've talked a lot about making HTTP calls,
but I would generally say that we don't want anything unexpected
or malicious to happen in the context of security, right?
And something that unexpected that can happen is that your code crashes.
Like you start depending on a package and now everything crashes because of it.
And now your whole application is made unusable.
So a malicious attacker could want to do that
if they want to sabotage your application.
But that is much harder to do because it's Elm, right?
Like there's not many things that you can do that will cause runtime errors.
You still can do things like infinite loops or infinite recursions,
but it's going to be very restrictive in what you can do.
Yes, it's very restrictive.
I think one interesting exercise is to think about
where does Elm delegate directly to JavaScript
and where does it protect that or not?
So for example, you can create a regex in Elm
that creates a JavaScript regex under the hood.
JavaScript regexes have ReDoS vulnerabilities, regular expression denial of service,
meaning you can create a regex that's basically going to crash the page.
Do you mean they will crash it or do you mean that it will take so much time
that it will make the site unusable?
Exactly, exactly.
It'll grind it to a halt to the point where it stops responding.
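A hedged sketch of what's meant: with `elm/regex`, a pattern with nested quantifiers exhibits catastrophic backtracking, so even a short non-matching input can freeze the page:

```elm
import Regex

-- "(a+)+b" backtracks exponentially on inputs like "aaa...c":
-- every way of splitting the run of a's between the inner and
-- outer + is tried before the match fails. Around 30 a's is
-- already enough to lock up a browser tab for a long time.
evilPattern : Regex.Regex
evilPattern =
    Regex.fromString "(a+)+b"
        |> Maybe.withDefault Regex.never

-- Evaluating this expression is the part that hangs.
hangs : Bool
hangs =
    Regex.contains evilPattern (String.repeat 30 "a" ++ "c")
```

Note that `Regex.fromString` only rejects syntactically invalid patterns; it does nothing to guard against patterns like this one.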
That's only a problem for the server, right?
See that? It's a gray area.
Again, I think you really have to consider,
to me this is the mindset.
It's like thinking about the flow of untrusted inputs
to potential attack vectors and untrusted code to potential attack vectors.
So if you have a package that allows you to build regexes,
you could potentially say,
does an attacker really stand to gain from that?
Maybe it's not that big of a risk,
but maybe it's something you'd be aware of
depending on how important that would be.
If they can crash your cat GIF site,
then maybe you're like, well, there's not that much in it for them.
But if they can crash your site for submitting taxes
and they do that on tax day,
then maybe that's more important for you to be careful about.
What is the flow of untrusted input and untrusted code to attack vectors?
And what would happen if that untrusted data or untrusted code
did something with that attack vector?
So you have to think about that with regexes.
Now, if you have user input and you use that directly to build a regex,
and now that means the user can DoS themselves and crash the page,
you're like, okay, well, if the user is being malicious to themselves
and it causes them to crash their own page,
then maybe I'm fine with that.
Maybe they deserved it, you know?
But if the input is coming from another user,
so another user can cause somebody's page to crash,
maybe that's not good.
So you really have to think about the flow.
It's really like just being aware of these things and thinking them through.
There's no silver bullet for these things.
There is.
Yeah, you take that input and then you validate it by making the opaque type,
like regex that won't crash the user application.
Yeah, sure, sure.
And then you can execute that if you have successfully created that kind of regex.
And otherwise, you don't execute it, you return an error to the user.
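The opaque type pattern being described might be sketched like this (the module and names are made up): validation is the only way to construct the type, so any value of type `SafeRegex` is known-good by construction:

```elm
module SafeRegex exposing (SafeRegex, contains, fromUserInput)

import Regex

-- The constructor is NOT exposed (no `SafeRegex(..)` in the
-- exposing list), so code outside this module can only obtain
-- a SafeRegex via fromUserInput.
type SafeRegex
    = SafeRegex Regex.Regex

fromUserInput : String -> Result String SafeRegex
fromUserInput input =
    case Regex.fromString input of
        Nothing ->
            Err "That isn't a valid regular expression."

        Just regex ->
            -- A real validation step might also reject patterns
            -- prone to catastrophic backtracking, e.g. nested
            -- quantifiers; that check is elided here.
            Ok (SafeRegex regex)

contains : SafeRegex -> String -> Bool
contains (SafeRegex regex) haystack =
    Regex.contains regex haystack
```

Callers handle the `Err` case by showing the error to the user, and only ever run regexes that made it through the front door.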
Right, yeah.
So opaque types, man.
I agree.
You know what they always say, right?
Always bet on opaque types.
Yeah, as the saying goes, there are no silver bullets except opaque types, as we know.
Elm radio t-shirt coming soon.
It could be.
It could be.
Hey, listener, do you want it?
Let us know.
So are these the main attack vectors that Elm users should be aware of?
In my opinion, yeah.
Or at least in my, from my viewpoint, yeah, there's nothing more than I can think of.
You can have someone run an excessive number of computations through infinite recursions
or through regexes, if you want.
And that's one category of issues that you might want to look at.
Another one is rendering HTML that will trigger HTTP requests, and the other one
is making HTTP calls through dependencies that will send something to the attacker.
And that's pretty much it, in my opinion.
So like, even the concept of sanitizing doesn't really apply much in practice.
Yeah, most of the time.
I mean, I think with sanitization, just to kind of put a simple process
to it, I think it's like, what are the special characters that can break out of the box that
you're expecting the user input to be constrained to, right?
So if it's HTML, it's like the closing angle bracket for a tag and closing quotes,
and these sorts of characters can escape things.
So there are a handful of special characters that will break out of the intended context.
There are a finite number of them.
There is a way to escape them, to remove their special behavior or meaning, so that you know
that the user will be kept in that box.
That's what sanitization is.
The thing is, you do have to think about, what is the box I'm trying to keep them in?
So if like something belongs as a URL query parameter, and you're trying to keep them
in that box, now in that context, an ampersand will break them out of that box and allow
them to add more URL query parameters.
Is that a security issue?
It depends on your context.
Like if that query parameter can be an attack vector, the way that your API works and you're
trusting input that comes from a place that could allow a malicious user to exploit that,
then it is.
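For that particular box, Elm's standard tooling handles the escaping for you: `Url.Builder` percent-encodes query values, so an ampersand in user input stays inside its parameter instead of starting a new one. A small sketch:

```elm
import Url.Builder as Builder

-- User input that tries to smuggle in an extra parameter:
userInput : String
userInput =
    "cats&admin=true"

searchUrl : String
searchUrl =
    Builder.absolute [ "search" ] [ Builder.string "q" userInput ]

-- searchUrl == "/search?q=cats%26admin%3Dtrue"
-- The "&" and "=" are percent-encoded, so the server sees a
-- single "q" parameter rather than an injected "admin" one.
```

Building URLs by string concatenation instead of through `Url.Builder` is exactly where this box-escaping problem creeps back in.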
So you need to know, what box are you expecting?
Anytime you're taking untrusted input and you're like, well, this data can only do this
one thing, you have to make sure it doesn't go outside of that one thing you think it's
going to do.
Or if it does, it's not going to cause a huge exploit.
Or it can also just give a very bad user experience.
Because even if it's not an attack vector, it's also something that we should
care about.
So I think the thing with the web is that there are a lot of different sorts of encoding contexts.
Like there's a nice article that I'll link to that talks about a lot of these, what
they call output encoding contexts.
So they say HTML entities is one output encoding context, HTML attributes, URLs, and
in particular, HTML attributes that accept JavaScript, which isn't an attack vector in Elm.
But again, even within like the context of a URL, maybe you're trying to constrain where
they put that thing to be a query parameter.
So you just have to, I think you just have to think that through.
That to me, that's the thought process.
Anytime you're taking untrusted input, you need to do that thought process.
Well, so are there any resources we should point our listeners to?
I wrote a blog post about fixing these security issues in Elm Virtual DOM.
We could point to that.
If you're interested, look at the implementation of Elm Virtual DOM.
It's quite short and it can be quite interesting.
Otherwise you should probably learn about cross-site scripting issues.
So XSS, what they are and how they matter.
But yeah, they tend not to matter too much in Elm, in my opinion, my experience, which
is so freeing.
Like I don't have to care about it so much.
If you want to explore anything about security, I'm sure there's a lot of things to cover,
but not that I'm too knowledgeable about.
Do you have any other ones?
So there are a couple I want to mention.
So first of all, there's a little cheat sheet, a little article from OWASP about XSS prevention.
So I'll drop a link to that.
That's a nice little resource.
But Feross in the JS community did a Stanford course, CS 253 on web security, and it is online,
available for free on YouTube.
It's very long, but you can go and look at the specific lectures you might be interested in.
It's very well done.
It's a very good summary.
There are some modules on cross-site scripting and cross-site scripting defenses.
So I'll drop a link to all of that.
You can even just like peruse the slides and they're short and concise and explain these
potential attack vectors very nicely.
So yeah, I mean, I think the web is sort of a magical thing that has all of these conventions
that have a lot of meaning.
Like when do cookies get sent places?
And most of the time we sort of like maybe the defaults kind of work out pretty well
or certain backend frameworks or frontend frameworks make assumptions about these things
so we don't have to think about them as much because they're built in and taken care of
for us.
But I think it's good to be aware of some of these things just so you can bring your
attention to any places that you really need to be careful about.
Talking about Feross again.
So he made a company, and he has a very nice blog post, which he has
also given talks about, called What's Really Going On Inside Your Node Modules Folder.
And it's a very scary article about all the ways that npm packages can screw you over.
And it's really nice because reading through that, I'm like, all these things don't apply
to Elm.
Like mostly because Elm doesn't have any post-install scripts when you install a package.
Like adding a package is not an effectful thing in practice, or it's not anything that
can be an attack vector as far as I know.
So that one is really nice as well.
We have not talked about this as an attack vector, but it absolutely is.
And I think if someone wants to attack the NPM ecosystem, currently this is mostly the way
that they tend to do it.
But it will mostly target developers rather than users, but maybe a little bit of both,
depending on what they do.
And basically like send your data to the attacker or start...
Mining bitcoins on your machine.
Yeah, mine bitcoin.
One of them was kind of stupid.
Like they started mining bitcoin.
They just went all the way, like they took all your CPU to mine bitcoin.
And therefore people could notice that their fans were going.
They're like, wow, this is a totally normal NPM install step.
It's taking many hours with 100% CPU.
Oh no, the NPM install finished, but then like, yeah, the CPU was 100% all the time.
And they're like, well, something must be going on.
Oh, I have something that originated from a script that is doing something.
And people investigated it and like, okay, well, it's mining bitcoin.
I see.
So they could have gotten away with it.
Yeah, absolutely.
If they were a bit more sneaky, like just using 10% of their CPU,
like maybe they wouldn't notice it for a few months.
So yeah.
So I'm generally not worrying too much about security, as you might have noticed.
Right, right.
Because I fixed all the issues.
No, it is beautiful to have this like mental model that just, I mean, for me,
this is how I feel about so many things in Elm.
Like it reduces the surface area.
So it's so much easier for me to just focus on solving the problem at hand
instead of like all of the junk around it that I don't care about.
Just the fact of never worrying about sanitization.
But I would not underestimate attackers.
Like if anything, they're among the most resourceful kinds of people
that you can encounter.
Well, all it takes is one exploit to be a problem, right?
So they can throw the kitchen sink at it.
But yeah, security is one of those features that Elm has
and that I think we don't talk about enough.
Like Elm makes small bundles.
It is very fast.
It makes code very maintainable,
but it also has very good security features.
And that is very much not the case with other JavaScript frameworks.
And we tend not to mention that one too much.
And maybe it's because we don't care about it,
like in the sense that we don't think about security,
and therefore we don't mention it as a feature of Elm.
I don't know.
Yeah. Yeah. It's definitely one of the things that makes Elm delightful.
All right. Well, follow us on Twitter.
Give us a review in Apple Podcasts.
Let us know if you want that opaque type T-shirt.
Yeah. And Jeroen, until next time.