Aaron VonderHaar joins us to share the fundamentals and best practices for his high-level Elm testing package, elm-program-test.
February 15, 2021


Hello, Jeroen.
Hello, Dillon.
And again, today we're joined by a special guest, Aaron VonderHaar.
Welcome Aaron.
Hey, how's it going?
It's going well.
Thank you for joining us.
You want to say a quick hello and tell us a little bit about yourself?
I am Aaron VonderHaar.
I have been, I guess, a full stack developer my whole professional career and did a lot
of mobile iOS and Android development in the past and have been doing Elm, I guess it's
almost eight years ago now that I started writing elm-format, and was working at NoRedInk
for about four and a half years doing Elm there and now doing some Elm consulting work.
You're hired.
And we had a lot of fun discussing elm-format in our last episode and it's fun to have you
on to discuss another tool of yours.
So today we're going to be doing a deep dive on elm-program-test.
So let's start with the definition.
What is elm-program-test?
So elm-program-test came about as filling a gap for me in doing Elm development and
doing it in a test driven development style way of being able to write high level tests
that are written basically from the perspective of a user or an outside entity interacting
with your Elm program.
And it basically gives a nice high level API to write Elm tests in that fashion.
In practice, I found the main benefit that that gives you is the ability to have test
coverage that is resilient to refactorings and specifically like significant refactorings
if you need to restructure how your application works.
If you're converting, like, a multi-page app into a SPA, bigger things like that, or
even just like sophisticated new features or changing workflows for your users, the
tests that you write with elm-program-test tend to give you test coverage that allows
you to do safe refactorings that are much larger than refactorings you'd have coverage
for using normal unit tests.
You mentioned that you were writing Elm tests.
So this works with the regular elm-test binary, right?
Yep, exactly.
This is basically just a sophisticated set of helper functions that can be used with
normal Elm tests.
So just to give like a high level overview of what the actual wiring looks like,
you pass in an init, update, and view function to elm-program-test.
So that's how it works is it's actually simulating, initializing and going through the updates
and responding to events in the view.
Yep, exactly.
Essentially, elm-program-test was extracted from some helper functions that I started
adding throughout NoRedInk's test code.
And the very first version of that basically just looked like we were creating a model,
piping it through a bunch of update calls to the update function with different messages.
And eventually that got fancier where I realized, Oh, we can use the HTML testing features of
Elm test so that we can actually inspect the view that is based on the current state of
the model and make sure that the view has ways to interact that we expect to produce
the message that we want to feed through the update loop.
So essentially that's the complexity in elm-program-test: creating an API that essentially
speaks a different language than view, model, update, which is the technical
implementation of how Elm programs are written.
And elm-program-test lets you use functions like, Oh, click the button with this label
or pretend that this HTTP request you made came back with this error code, things like that.
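As a rough illustration of the wiring described here, the sketch below passes an init, update, and view to elm-program-test and then drives the program by clicking a button and checking the rendered view. The counter program, its button label, and the module name are hypothetical and exist only for this example; the `ProgramTest` functions are from the avh4/elm-program-test package.

```elm
module CounterTest exposing (all)

import Html exposing (Html)
import Html.Events exposing (onClick)
import ProgramTest
import Test exposing (Test, test)
import Test.Html.Selector exposing (text)


-- A tiny hypothetical program under test.


type Msg
    = Increment


init : Int
init =
    0


update : Msg -> Int -> Int
update msg model =
    case msg of
        Increment ->
            model + 1


view : Int -> Html Msg
view model =
    Html.div []
        [ Html.button [ onClick Increment ] [ Html.text "Add one" ]
        , Html.p [] [ Html.text ("Count: " ++ String.fromInt model) ]
        ]


all : Test
all =
    test "clicking the button twice shows a count of 2" <|
        \() ->
            -- Hand init, update, and view to elm-program-test...
            ProgramTest.createSandbox
                { init = init, update = update, view = view }
                |> ProgramTest.start ()
                -- ...then describe what the user does and sees.
                |> ProgramTest.clickButton "Add one"
                |> ProgramTest.clickButton "Add one"
                |> ProgramTest.expectViewHas [ text "Count: 2" ]
```

Note that the test never mentions `Increment` or the shape of the model, which is why it should survive a refactoring of either.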
So let's talk a little bit about some use cases and when somebody would use this compared
to a regular test.
So we covered test driven development and Elm test in general in our Elm test episode,
but this is something different.
This is a higher level form of testing.
And so when would you reach for this rather than a more vanilla unit test?
So something you two talked about in your previous episode was about the testing pyramid
and kind of the different layers of that.
I guess there's a couple of things. The main thing is that if you're happy with the unit
tests you have, elm-program-test isn't trying to stop you from writing unit tests that you're
already happy with.
Elm program test provides the level above that and in particular, if you look at end
to end tests using something like Cypress or Selenium web driver based testing, my hope
has been that Elm program test will allow you to write tests in the same language as
those where you're talking about what the user does, what they see on the screen.
But if you take the assumption that you trust the Elm compiler and the Elm packages to actually
do what their documentation says they do when they're compiled and running in a browser,
if you can trust Elm as a framework to do those things, then you can use elm-program-test
to write tests that cover those same high level features, but that run much faster
and are not flaky at all, completely deterministic.
Yeah, no, if you can exercise some functionality at a low unit level, then that's ideal.
But that doesn't always give you confidence.
So if you, you know, one of the rules of thumb that I try to follow is the more high level
my testing becomes, the less comprehensive it should be.
So if I'm noticing that I'm trying to exercise every combination possible in every edge case,
and it's in a high level test, then I'm thinking there's something that either needs to be
split out into a module that's testable or simply exists that way, but isn't being properly
unit tested to give me confidence because that's really not, it's going to be extremely
verbose to have to create a whole test scenario where you're describing clicking through things
and all these permutations.
Because you know, just mathematically, if you think about a higher level test, you're
putting more pieces together.
So there are more variables, there are more things operating together on that page.
And so if you try to exhaustively check them, now you've got a combinatoric explosion.
And it's just not going to be effective, because you can't do that.
But you can at the unit level and you want to.
So that's one thing I try to pay attention to.
But then, like, so why not just test everything in isolated unit tests, right?
And the reason for that is that you don't get the same confidence about things interacting
together and about the system functioning at a high level.
Like what is the user's experience when they come in and they click on "Buy product"?
Do you get money?
That's an important thing to know.
And a unit test might tell you that the credit card information is being validated correctly.
It strips out white space appropriately, formats it, gives the right client side validation.
That would be a great unit test.
But is it making the right API request and handling the response, showing error messages?
You want to know that when a user goes to do some important business flow, that you
want confidence that it's actually going to do that.
And unit tests don't necessarily give you that.
Yeah, that's exactly right.
And in Elm, being a statically typed and strongly typed language as well, the compiler gives
you a lot of protection for some types of refactorings.
But Elm program test is really at the level that you can't get type safety because it's
looking at like, oh, when you have this sequence of events happening in this order, something
goes wrong.
And especially like in Haskell, you can get a bit fancier with types.
But in Elm, you like definitely can't write a type for your update function that says
that, oh, once this message happens, you can never get these initialization messages anymore.
You only get this later class of messages.
So Elm program test allows you to kind of in a more natural language, I would say, than
writing unit tests allows you to get coverage of those things.
I don't know if you've ever ended up with tests where you're trying to like initialize
a model, send it through a bunch of update calls, and then check the state of the model
at the end.
You can definitely write those as unit tests.
But I find that it's often very verbose.
And you have to end up putting a whole bunch of like, fake data and things that are really
implementation details.
Whereas when you write a test of that nature, what you want to be thinking about is, okay,
what is the flow of events?
And you don't want to care about how to construct the record of the data that represents the
JSON that comes from this endpoint and all those lower level details.
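For contrast, here is a minimal sketch of the manual style being described: initialize a model, pipe it through update calls, and check the final model. The `Model`, `Msg`, and `update` below are hypothetical stand-ins, not code from any real application.

```elm
module ManualUpdateTest exposing (all)

import Expect
import Test exposing (Test, test)


type alias Model =
    { name : String, saved : Bool, error : Maybe String }


type Msg
    = NameChanged String
    | SaveClicked
    | SaveCompleted (Result String ())


init : Model
init =
    { name = "", saved = False, error = Nothing }


update : Msg -> Model -> Model
update msg model =
    case msg of
        NameChanged name ->
            { model | name = name }

        SaveClicked ->
            model

        SaveCompleted (Ok ()) ->
            { model | saved = True, error = Nothing }

        SaveCompleted (Err message) ->
            { model | error = Just message }


all : Test
all =
    test "saving a name marks the model as saved" <|
        \() ->
            -- The test has to know every message and every model field.
            init
                |> update (NameChanged "Ada")
                |> update SaveClicked
                |> update (SaveCompleted (Ok ()))
                |> Expect.equal { name = "Ada", saved = True, error = Nothing }
```

Renaming a message or restructuring the model breaks this test even if the user-visible behavior is unchanged, and nothing here checks that the view can actually produce these messages.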
So you say that those tests are supposed to be easy to read.
Have you looked at Cucumber?
Or are you inspired by Cucumber and those kinds of testing frameworks?
Like I think it's BDD?
Yeah, I've been really interested in those techniques.
There's a talk called something like test driven development the way it was meant to
be done, by Matt Wynne, and it's probably like 15 years old now.
So elm-program-test, I think, is a layer that's needed if you wanted to write tests in that style.
Because that BDD approach is really that you develop a domain specific, like application
specific, language to write these high level tests in, where you're talking about things that
are relevant to your business domain, like at NoRedInk, it would be like talking about
teachers and classes and assignments and creating new assignments.
Yeah, I think that's kind of just like a higher level DSL on top. Like, to implement
some of those features, in a BDD style test you might say "when the
user logs in", and then dot dot dot other stuff. To implement that "the user logs in"
step of your BDD testing, you would, I think, ideally use something like elm-program-test to
say, Okay, click the login button, then fill in the name of the current user, fill in the
password of the current user, click this other thing, then simulate the back end responding
to the POST request with an OK status.
So that's kind of implementation specific.
But the way I just described those steps is still high level, it doesn't depend on the
details of how the form validation is implemented, or how they like what UI framework they use
to create the form.
So elm-program-test is kind of high level for the technical side, but it's still lower level
than something like Cucumber tests.
I have an opinion about cucumber, which is that now I may be totally wrong, or, you know,
reasonable people may disagree, but I'll share my opinion for what it's worth.
So what you were just discussing there, Aaron, with elm-program-test being a high
level way to express a flow that's not necessarily as coupled to the actual technical implementation,
not being as likely to change if you change implementation details, right?
In my mind, that is exactly what I want.
And the Cucumber part of writing it in actual plain language, I don't like, and that's
what Cucumber is. For anyone who's not familiar with the Gherkin Cucumber syntax, the sort
of idea of, you know, one of the concepts of behavior driven development, BDD, is that
you actually have customers or, you know, product managers, that type of thing,
and they're writing cases, and it's not code, it's just text.
And you use regular expression, or various forms of parsing to do like language parsing
and say, when I log in as email address, and then you use some regex to capture that email
address and then log in.
And in my experience, that just creates a layer of indirection that doesn't make it
any more high level, it actually just makes it more confusing what's going on.
If you are not a programmer, and you're writing that, to me, it doesn't seem
like it's making it any easier just because it's in the English language, because it's still
a specific syntax that you have to write, it's just one that's been built with this
layer of indirection of regular expression, capturing and stuff.
But you still have to know how to formulate a valid one that the regular expression will
capture and what it's going to do with that.
And to me, that's not any easier than just writing high level instructions that
say, go to, you know, maybe you have to like learn what the pipe operator is, and then
you have to like learn a few of these things.
But then you just say, click this thing, do this thing, do this thing.
So you know, it's sort of like AppleScript went this route too, of like, if we write it
like English, it will be easy for people to write, but it's actually not because it creates
this layer of abstraction that just makes it harder to understand what it's actually
going to do and how it's going to interpret it.
Kind of sounds like graphical code, like Dark, for instance, where you connect
things visually. Yeah, but you have more rules. Like, do you know more
people who can write code that compiles or more people who can write English perfectly?
That's going to be interpreted by a specific set of code instructions, and if it doesn't
fit that format, it won't do what's intended.
And how do you debug that?
So it's just adding a layer of indirection.
So anyway, for what it's worth, I'm very happy with just the high level way of expressing
things with elm-program-test.
I think it's really good.
And I really enjoy being able to like, you know, write tests from the user's perspective.
I know a lot of French developers who would be better off writing code than English.
Well, just one thought on top of that is that I think in practice, especially
if you have a large application and you start using elm-program-test, ideally you'll
end up with a module or maybe some different modules of helper functions that build on
top of Elm program test, some application specific helpers like the login example that
we said, or maybe in the ed tech domain, creating an assignment, which might do a whole bunch
of steps like, Oh, click the whatever the button to go to the create assignment page,
fill out all this data, click Submit, simulate the response.
So you can still have some of those benefits of having even higher level concepts.
And elm-program-test is kind of like a support layer that does all of the generic stuff.
Basically all of the reusable things about writing high level tests, elm-program-test
does, so that as an app developer, you can write your app specific tests much more quickly
and efficiently and correctly, which I guess is a point we should get to at some
point: all the things that elm-program-test does behind the scenes.
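A helper module of the kind described here might look roughly like the sketch below. The button labels, field ids, credentials, and endpoint URL are all made up for illustration; only the `ProgramTest` functions come from the package.

```elm
module TestHelpers exposing (logIn)

import ProgramTest exposing (ProgramTest)


{-| A hypothetical application-specific step: "the user logs in".

It is written against any ProgramTest, so it can be dropped into the
middle of a longer pipeline of user actions.

-}
logIn : ProgramTest model msg effect -> ProgramTest model msg effect
logIn programTest =
    programTest
        |> ProgramTest.clickButton "Log in"
        -- fillIn takes the field id, the label text, and the new content.
        |> ProgramTest.fillIn "email" "Email" "teacher@example.com"
        |> ProgramTest.fillIn "password" "Password" "correct-horse"
        |> ProgramTest.clickButton "Submit"
        -- Pretend the back end answered the POST with a 200 response.
        |> ProgramTest.simulateHttpOk
            "POST"
            "https://example.com/api/session"
            """{"ok":true}"""
```

A test could then read something like `start |> logIn |> createAssignment |> ...`, with each step named in the language of the business domain while the generic work stays in elm-program-test.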
Right, right.
And to that point of doing it correctly, I think that that's a really key point that,
you know, having this encapsulated in this library, I'm sure you could, you know, call
the update function yourself from an Elm test, but you don't have the same level of confidence
that things are actually being wired up in a way that's equivalent to what's going to
happen when you do Browser.application or whatever type of program you're creating.
So there's a lot of value to having that encapsulated in something that you can trust to be equivalent
so that you know you're testing something that's realistic and going to reflect
reality.
And to hit on that just a little bit more, about the types of refactorings that
tests written with elm-program-test can help you with.
These are things like, oh, you have this set of messages and you want to change what the
messages are so that you can whatever centralize your logic, maybe in a certain way in your
update function and extract some data type specific functions that are then used rather
than having all that logic in your update function.
That's the type of refactoring that is very tedious to do if you don't have these high
level tests, because essentially what you have is a bunch of unit tests calling an update
function that are written in the language of that module that knows about its messages,
knows about its internal model.
And now you want all those unit tests to move to a different module that's specific to this
smaller data type doesn't know anything about messages.
So if you have the high level test coverage, you have some safety.
And if you don't have these higher level tests, you basically have to translate all your tests
and move them over, which is, I don't know, that's one of the things I fear the
most when I'm doing coding.
And one example is like form validation is actually something fairly common and also
something that you tend to do incompletely the first time you implement it like, oh,
we'll just need to validate a few things, we'll move on.
And then over time, you need to add more validations, maybe at some point, you end up extracting
a validation helper module or using some package to help you with validation.
Those again, are things that can completely change the flow of events.
Like maybe you used to validate things when the message happened, but now you're
switching to store the unvalidated data in your model and validating it when you send
the form and showing error messages in a different way.
Those are changes that ideally should be simple to make.
But if you're writing unit tests, they become extremely tedious, because those unit tests
that touch your update function or directly refer to messages are extremely brittle to
those kind of changes.
So if you were doing something like an optimistic update in the UI where you're interacting
with server responses, you would have to fill in a lot of the pieces to simulate that in
a plain old unit test.
But if you're driving it through elm-program-test, then you can say, click this button,
you know, enter this information, hit send, simulate a server response and make assertions
about the view while it's loading before the server response comes back.
So you can decouple it and not rely on wiring it in a very specific way that
you can't trust as much.
So yeah, there's a lot of value to that.
One thing I don't think we wrapped up yet, but you had mentioned earlier, Dillon, about
using unit tests to cover all the different edge cases and combinations.
And I would definitely agree with that.
I tend to start with maybe like a happy path program test.
Then you can jump down and do unit tests for all your edge cases.
That's cool.
So you like to do like a more outside in approach?
Yeah, yeah, exactly.
That's cool.
Because that's a really interesting topic. You know, in the sort of test
driven development world, there are a lot of conversations about, do you
write a unit test first, and then work your way up to fitting that unit into
the application, which is in a way, you know, delayed integration: you haven't fit
the piece into the whole.
So you spend time building up all these different cases and then fit it into the whole.
And you don't know if it's actually going to solve the problem that you set out
to solve when you built that unit.
So you don't know if it's going to fit. Fitting and integrating the piece into the
whole is the hard and risky part.
So the sooner you can do that, the better. The outside in school of thought is more that
you start with building something end to end.
And it starts with testing from the outside in from the user's perspective, and then builds
the unit as needed.
But it sort of uses more of a fake it till you make it approach, like you said, sort
of getting that happy path.
So that's more the approach that you like to take.
Well, that's interesting.
I don't know that I have a specific preference for either of those in Elm.
But I do tend to kind of write the high level test first, and then often I'll comment it
out or stash it or something and build those smaller pieces, then bring back the failing
high level test and plug the pieces in.
I think I would tend to do that most of the time.
However, there are cases if I'm not sure how things will be structured that I would write
the failing high level test, implement it, like do the fake it till you make it and just write
a really simple thing where it's like hard coded to always show the stuff that's needed
to make it pass, then refactor.
I think there is a pitfall in doing that approach with elm-program-test, that you can often skip
the refactoring step, or you can end up with working stuff that you've refactored, but
you haven't exactly extracted the coherent modules yet.
And you can end up with kind of a mess relying only on elm-program-test if you aren't disciplined
about looking for the smaller pieces that you're going to pull out and write lower level
tests for those pieces.
Because when you use a regular Elm test, you test an API and then by using that API,
you can kind of feel the pains with it.
But with elm-program-test, you don't feel those pains because those APIs are a technical
detail and you can forget it.
Is that what you meant?
That they are a technical detail.
Yeah, sort of.
I think that, tell me if this is related to what you're saying, in Elm I think
the preferred way and the way people are settling on is that if you have smaller pieces of your
Elm program, especially if they're related to view stuff or update logic, there's a lot
of different patterns you can use.
Like you may have a module that just has a function that returns HTML, or you might have
one that has a full update function that can produce commands or somewhere in between where
maybe it returns like some specific set of things that can result from its actions or
special functions to process the messages it can produce.
I think like that's a case where elm-program-test can be used to write unit tests for
modules like that that are not completely programs in the full architecture sense of
the word, but are also more complicated than just some functions.
However, elm-program-test doesn't let you interact directly with your module's API.
So, as you two talked about in your previous episode, writing tests in
test driven development, or test driven design, I almost said, is like a design tool.
And I think elm-program-test doesn't really give you those design tool benefits, because
yeah, I think you're not working directly with the API, but it does help you.
It gives you other benefits, the test coverage we talked about that lets you do bigger refactorings.
And it also helps you think about the user perspective more if you're really doing user
centered design.
And a process like that, it can really help get developers aligned with the kind of product
level thinking as well.
But for designing an API, Elm program test really doesn't provide those benefits of testing.
Right, it's almost like the, you know, user interface based testing in Elm program test
keeps you honest about making sure that you're designing something that's going to have a
nice experience from the user's perspective.
And doing unit level tests with Elm test helps keep you honest that the API you've designed
is going to be nice to work with because that's the direct thing that you're exercising and
getting feedback on.
So you sort of want a mix of those high level and low level tests to make sure those things
are both being designed nicely.
It's really nice that you can write both using elm-test, so that you don't have to write unit
tests in Elm and integration tests using JavaScript, using Cypress, for instance.
Even though that's an amazing tool, that's a different thing.
Yeah, which is really leveraging the strength of Elm, in that everything
in Elm is pure functions.
There's a virtual DOM, which is essentially a data structure.
The commands and subscriptions are essentially just data that represents what the runtime
is going to do.
So elm-program-test essentially is simulating or like reimplementing in Elm, in kind of a
test specific way, the runtime of Elm, where it's creating a model, it's maintaining
it, if there's external effects, it's keeping track of those,
remembering what things are in progress, what are waiting for responses, and it can render
the view from the current state at any time.
So yeah, all of that is possible and was relatively easy to implement.
I mean, it's a lot of work, but it was easy to implement compared to other languages.
Like actually, I've worked on similar test frameworks for Android, the Robolectric
framework at Pivotal Labs, and we started writing a similar framework for iOS, which I think
never got finished.
I've done similar things in the past and this is by far the easiest
to write and the most sophisticated of any similar attempts I've worked on.
That's cool.
It's like a pure Elm implementation of simulating the Elm architecture and the
Elm runtime.
Yep, exactly.
So let's get into a couple of things.
So first of all, I think we should make it clear, we've sort of implied
this but not said it explicitly.
You can simulate HTTP requests with elm-program-test.
And that's one of the really killer features because, you know, you would be hard pressed
to find a way to do that if you were sort of rolling your own thing.
It's really something you're going to have to do a lot of work to get that same
result without elm-program-test.
So there's an API where you can sort of expect certain HTTP requests to be performed.
You can say simulate responding with this HTTP response with this status code, and then
continue to make assertions in your view after that.
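To make that concrete, here is a minimal sketch of simulating an HTTP response. It assumes the program describes its side effects with its own `Effect` type, which the test translates with `withSimulatedEffects`; the program, the URL, and all the names in it are hypothetical.

```elm
module GreetingTest exposing (all)

import Html exposing (Html)
import Html.Events exposing (onClick)
import Http
import ProgramTest
import SimulatedEffect.Cmd
import SimulatedEffect.Http
import Test exposing (Test, test)
import Test.Html.Selector exposing (text)


type Effect
    = NoEffect
    | FetchGreeting


type Msg
    = LoadClicked
    | GotGreeting (Result Http.Error String)


type Model
    = Idle
    | Loading
    | Loaded String
    | Failed


init : () -> ( Model, Effect )
init () =
    ( Idle, NoEffect )


update : Msg -> Model -> ( Model, Effect )
update msg _ =
    case msg of
        LoadClicked ->
            ( Loading, FetchGreeting )

        GotGreeting (Ok greeting) ->
            ( Loaded greeting, NoEffect )

        GotGreeting (Err _) ->
            ( Failed, NoEffect )


view : Model -> Html Msg
view model =
    Html.div []
        [ Html.button [ onClick LoadClicked ] [ Html.text "Load greeting" ]
        , case model of
            Idle ->
                Html.text ""

            Loading ->
                Html.text "Loading..."

            Loaded greeting ->
                Html.text greeting

            Failed ->
                Html.text "Something went wrong"
        ]


-- Tells elm-program-test what each Effect means, using simulated
-- counterparts of the real Http functions.
simulateEffect : Effect -> ProgramTest.SimulatedEffect Msg
simulateEffect effect =
    case effect of
        NoEffect ->
            SimulatedEffect.Cmd.none

        FetchGreeting ->
            SimulatedEffect.Http.get
                { url = "https://example.com/greeting"
                , expect = SimulatedEffect.Http.expectString GotGreeting
                }


all : Test
all =
    test "shows the greeting once the server responds" <|
        \() ->
            ProgramTest.createElement
                { init = init, update = update, view = view }
                |> ProgramTest.withSimulatedEffects simulateEffect
                |> ProgramTest.start ()
                |> ProgramTest.clickButton "Load greeting"
                -- Assert on the view while the request is still pending.
                |> ProgramTest.ensureViewHas [ text "Loading..." ]
                |> ProgramTest.simulateHttpOk
                    "GET"
                    "https://example.com/greeting"
                    "Hello!"
                |> ProgramTest.expectViewHas [ text "Hello!" ]
```

In a real application there would also be a function turning `Effect` into a real `Cmd Msg` for production use; it is left out here to keep the sketch short.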
So just to compare what that would look like without elm-program-test, you could do things
like that.
But you would in the first place be like saying call update with this exact message with all
this data.
And then you would just tie it to the specific message type for your application.
Yes, exactly.
And you also wouldn't, like, it starts to get extremely tedious going that manual
route, which is what I started with before I extracted elm-program-test, in that, okay,
well, maybe you also want coverage to make sure that the message that produced the HTTP
request happened, because like, maybe you, whatever, disconnect the code that makes the
button appear that lets people initiate the request, or maybe the validation failed, and
the request wasn't even sent. Things like that, elm-program-test handles for you.
And then also, if your test fails, having a nice error message about what is happening
is another part that is a lot of tedious work to do on your own, which Elm program test
provides for you.
So if you say, Oh, simulate the response to this endpoint, and it was a timeout error,
if that request wasn't made, elm-program-test will say, Oh, that request was never made.
Here's the list of requests that were made.
In contrast, if you were just doing it manually sending the message, there's no guarantee.
Like, maybe the message was sent, maybe it wasn't, you haven't validated
it. All you know is like, maybe there was a typo in the URL, and it didn't match for that
So that's, I guess, something I take pride in, in Elm program test.
It's like, yeah, it's a centralized place where, and hopefully other people can
contribute to this, to have nice error messages for these cases that, like, ideally,
you want to have this, but you'd never have time to do that work on your own
if you didn't have this reusable package.
Instead of everybody individually building their own thing and taking their time to do
that, hopefully, you know, you've done the bulk of the work, but hopefully others
can then invest some of that time they would have spent rolling their own thing to improving
this community resource.
Yeah, so there's a whole bunch of stuff.
So elm-program-test, like you mentioned, can do simulations of HTTP responses, it can partially
do the simulation of time passing, like if you have tasks that have delays, it can simulate
that and you say like, Oh, advance time by this much, and it'll trigger all the
delayed tasks. You can simulate ports, both incoming and outgoing ports, and you can simulate
some of the browser and DOM API. That's an area I still need to improve quite a bit.
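The "advance time by this much" idea can be sketched as follows, again assuming the program uses its own `Effect` type. The notice program, its texts, and the 3000 ms delay are hypothetical; `advanceTime` takes a number of milliseconds.

```elm
module NoticeTest exposing (all)

import Html exposing (Html)
import Html.Events exposing (onClick)
import ProgramTest
import SimulatedEffect.Cmd
import SimulatedEffect.Process
import SimulatedEffect.Task
import Test exposing (Test, test)
import Test.Html.Selector exposing (text)


type Effect
    = NoEffect
    | HideNoticeAfter Float


type Msg
    = SaveClicked
    | NoticeExpired


type alias Model =
    { noticeVisible : Bool }


init : () -> ( Model, Effect )
init () =
    ( { noticeVisible = False }, NoEffect )


update : Msg -> Model -> ( Model, Effect )
update msg model =
    case msg of
        SaveClicked ->
            ( { model | noticeVisible = True }, HideNoticeAfter 3000 )

        NoticeExpired ->
            ( { model | noticeVisible = False }, NoEffect )


view : Model -> Html Msg
view model =
    Html.div []
        [ Html.button [ onClick SaveClicked ] [ Html.text "Save" ]
        , if model.noticeVisible then
            Html.p [] [ Html.text "All changes stored" ]

          else
            Html.text ""
        ]


-- The delayed task is described with the simulated Process and Task
-- modules so that the test controls when it fires.
simulateEffect : Effect -> ProgramTest.SimulatedEffect Msg
simulateEffect effect =
    case effect of
        NoEffect ->
            SimulatedEffect.Cmd.none

        HideNoticeAfter milliseconds ->
            SimulatedEffect.Process.sleep milliseconds
                |> SimulatedEffect.Task.perform (\() -> NoticeExpired)


all : Test
all =
    test "the notice disappears after three seconds" <|
        \() ->
            ProgramTest.createElement
                { init = init, update = update, view = view }
                |> ProgramTest.withSimulatedEffects simulateEffect
                |> ProgramTest.start ()
                |> ProgramTest.clickButton "Save"
                |> ProgramTest.ensureViewHas [ text "All changes stored" ]
                -- No real waiting happens; the simulated clock jumps ahead.
                |> ProgramTest.advanceTime 3000
                |> ProgramTest.expectViewHasNot [ text "All changes stored" ]
```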
So what is it that you can't simulate with Elm program test at the moment?
Let's see.
So specific things like browser focus, like focusing on particular IDs in the DOM. Viewport
scrolling is something that I haven't touched on.
There's some things related to time, I think, like the subscriptions for time, like
Time.every, aren't implemented, but I would like to implement.
I think that's the basics of it.
I don't know if there's any other kind of niche packages, I guess, like the File and
Bytes APIs are something that I haven't looked at yet.
But those are things that like, kind of the internals are set up where a lot of that stuff
could be implemented.
You just have to think about, okay, what's the API of different error conditions that
people may want to simulate in tests?
What's the internal model that would represent this?
What do we want the error messages to look like?
So those are definitely things I'd love to have contributions for if someone needs those
features for testing whatever their application is.
So there's nothing really blocking those things from existing.
It's just more time and money, or more time.
Yeah, it's free.
I'd say there's one, like some of the things like, okay, keyboard focus is something that
ideally you'd want to be able to simulate, but the logic that browsers use to maintain
that is so complicated.
Yeah, I don't know if that could ever be safely done.
Maybe, like, that's something I'd love to be able to simulate here.
But it's also of, like, limited use.
Like that's the point where maybe you just have to track things manually and say, okay,
well, here's the state of the DOM.
Here's what's going to happen.
Let's like hack together a test that gets things into that state.
Or at some point, you should probably just use a better tool, like Cypress, a tool
more suited to the task.
Yeah, I mean, it seems like elm-program-test is not intending
to be something that is actually performing HTTP requests, actually sending ports and
executing JavaScript.
And that's, you know, by design. So sometimes it gets a
little confusing, but people use different terms for this.
I think of it as like end to end testing versus integration testing, where, you know, I would,
I would consider Elm program test to be an integration testing framework.
It's not ever going to make an HTTP request, which is a feature or a bug depending on what
you're trying to do, right?
It is, you have to pick the appropriate testing level and you have to keep that in mind, but
obviously it's going to be faster.
It's going to be more deterministic if you're not actually making HTTP requests, right?
So you need to have an awareness of what the tool is suited for and use it for those effects.
So I wanted to make the API a little more concrete for people maybe.
So like when it comes to simulating HTTP effects, essentially you have this sort
of program test data type.
If you think about, if you think about the Elm language, it is not a dynamic language.
It is a static language.
It's not a language where you can go in and tweak the internals of something, reach in
and change global variables or, or that sort of thing.
So everything needs to be sort of injected rather than reached in and modified.
If it was a language like Ruby or JavaScript, a lot of these frameworks work by, you know,
monkey patching, you know, in Ruby you can like override existing classes and actually
reach in and modify them.
Elm doesn't work that way.
Monkey patching is not a thing in the Elm language.
But everything is pure functions.
And so the way that that works in Elm is you pass things in explicitly
and effects are actually just a type of data.
And so when you're simulating HTTP effects, what you do is you have this program test.
So an Elm test case is just a single expectation ultimately under
the hood.
And you know, it's just this one expectation type and elm-program-test sort of builds up
this program test value and you sort of chain on a sequence of things.
So it's inherently a sort of imperative flow that you're describing.
The user goes in and does this and does this and does this, but you're actually building
up a single expectation that describes that sequence of events.
So it's a single pipeline that you can keep chaining onto.
So you initialize your program test and give it your update function, your view
function, your init function, all that it needs to create its mini simulation
of the Elm runtime.
And then you can simulate HTTP requests and say you expect to get a particular HTTP request
and it's going to essentially mock that out.
As Elm programmers, we like to think about the data types.
So actually this program test type is pretty straightforward.
You can think of it as a result.
It's either in an error state or in a success state.
In the error state, there's a whole set of different kinds of errors.
Like, maybe we failed because we expected an HTTP request.
Here's the function that you were trying to call.
Here's the data of other requests that were in flight at the time.
Basically whatever's needed to produce a nice error message.
And in the success case, we've got the current model of your program.
And then we have some other state about the world, like what HTTP requests are in flight,
what delayed effects are scheduled in time?
What ports exist?
What outstanding browser requests have been triggered?
So it's really just a data structure like that.
And then the functions that you call, like click button: all it does is
render the view based on the current state of the model,
look for the button that you wanted to click, grab the message from that, and call the update function.
So I guess the update and view functions are also stored in program test.
But yeah, a lot of it you can kind of understand if you think about that
as the data structure of what's in there.
It's just a result with your program state, the functions for your program and information
about external effects that are in flight.
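To make that mental model concrete, here is a minimal sketch of a result-like type of that shape. It is an illustration only: the real ProgramTest type is opaque, and every name below (Active, Finished, Running, PendingRequest, Failure, andThen) is a hypothetical stand-in, not part of the package.

```elm
module ProgramTestSketch exposing (Failure(..), PendingRequest, ProgramTest(..), Running, andThen)

import Html exposing (Html)


-- Hypothetical, simplified model of the idea described above.
-- The real type in elm-program-test is opaque and tracks more state.


type ProgramTest model msg effect
    = Active (Running model msg effect)
    | Finished Failure


{-| The "success" side: the current model, the program's own functions,
and bookkeeping about effects that are still in flight.
-}
type alias Running model msg effect =
    { model : model
    , update : msg -> model -> ( model, effect )
    , view : model -> Html msg
    , pendingHttpRequests : List PendingRequest
    }


type alias PendingRequest =
    { method : String, url : String }


{-| The "error" side carries whatever is needed for a helpful message.
-}
type Failure
    = ExpectedHttpRequest PendingRequest (List PendingRequest)
    | ViewAssertionFailed String


{-| Run one step only if the test is still active. After a failure,
later steps are skipped and the first failure is kept.
-}
andThen :
    (Running model msg effect -> ProgramTest model msg effect)
    -> ProgramTest model msg effect
    -> ProgramTest model msg effect
andThen step programTest =
    case programTest of
        Active running ->
            step running

        Finished failure ->
            Finished failure
```

In this sketch a step like click button would be an andThen callback that renders the view, finds the button, and feeds the resulting message to update.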
And there's one convention that might be a helpful little tidbit
to talk about for people.
There's a convention of these ensure function calls and expect function calls.
So you can ensure that an HTTP request was made.
So there's program test dot ensure HTTP request, and then there's program test dot expect HTTP request.
You want to talk about that distinction?
This is one of the rough edges with the API here, because there's a school of testing
thought where you should have one expectation per test, and that should be the end of your test.
So those are the primary functions here, the functions that take your program test value
and return an expectation that basically ends your test.
However, when you're writing these high level things, a lot of times you want these intermediate assertions.
Like, oh, your first step in your test is to have the user log in.
Well, maybe you're not on a page that has the login form, so you need to fail there
with an error message.
So basically every expect function has a copy, an ensure function, that
does the same thing, but it returns a program test so that you can do an intermediate assertion,
continue your test and get to the end.
I wish there was a way to unify those into a single check, but I haven't come up with
a good API design approach to do that.
It almost seems like it could be a builder pattern where you
pick a keyword, ensure or expect, but only have the current behavior of ensure,
which means that you can keep chaining things on.
And then at the end, you say to expectation, and that takes the whole test scenario and
turns it into a single expectation, which is the same as what expect HTTP
request currently does versus ensure.
So you can write your tests in that way.
There's a program test.done, which is the final step.
But I didn't want that to be the recommended approach because I think the thought of like
having a single expectation is, in my opinion, the preferred way to think about your tests.
So I didn't want to like make that recommended way be the second class citizen in terms of
how the API was designed.
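As an illustration of the two styles, here is a minimal sketch using a tiny counter program. The program itself (Msg, update, view, the "Increment" label) is made up for the example; the ProgramTest functions used (createSandbox, start, clickButton, ensureViewHas, expectViewHas, done) are from elm-program-test as I understand the 3.x API.

```elm
module CounterTest exposing (all)

import Html exposing (Html)
import Html.Events exposing (onClick)
import ProgramTest exposing (ProgramTest)
import Test exposing (Test, describe, test)
import Test.Html.Selector exposing (text)


type Msg
    = Increment


update : Msg -> Int -> Int
update msg count =
    case msg of
        Increment ->
            count + 1


view : Int -> Html Msg
view count =
    Html.div []
        [ Html.button [ onClick Increment ] [ Html.text "Increment" ]
        , Html.text ("Count: " ++ String.fromInt count)
        ]


start : ProgramTest Int Msg ()
start =
    ProgramTest.createSandbox { init = 0, view = view, update = update }
        |> ProgramTest.start ()


all : Test
all =
    describe "counter"
        [ test "ends with a single expect" <|
            \() ->
                start
                    |> ProgramTest.clickButton "Increment"
                    -- ensure* returns a ProgramTest, so the chain continues
                    |> ProgramTest.ensureViewHas [ text "Count: 1" ]
                    |> ProgramTest.clickButton "Increment"
                    -- expect* returns an Expectation and ends the test
                    |> ProgramTest.expectViewHas [ text "Count: 2" ]
        , test "uses only ensure, then done" <|
            \() ->
                start
                    |> ProgramTest.clickButton "Increment"
                    |> ProgramTest.ensureViewHas [ text "Count: 1" ]
                    |> ProgramTest.done
        ]
```

The second test is the builder-like style mentioned above; the first is the style with one final expectation.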
But then, so like if you wanted to, oh, okay.
So there are also like simulate forms.
So there's ensure HTTP request, and that's making an expectation and continuing
on, which is kind of what you're recommending against.
You're saying maybe this is a smell, and if you have too many ensures, maybe think about
the way you're splitting up your test cases.
If you're asserting too much.
Yeah, I'm kind of on the fence about that because I think like we were saying earlier,
you don't want a ton of these high level tests to check every edge case.
You kind of want to go through your happy path, which is going to do a lot of things.
So you tend to want to assert things along the way.
Having intermediate assertions also helps you with debugging when something goes wrong.
You can kind of tell what step failed.
So yeah, that's something I wish I had a better answer for.
But unfortunately you kind of have to learn that thing that's specific to Elm program test.
If you have expect and ensure, they're basically the same thing that you use throughout your test.
I mean, in a way there are implicit expectations in certain things.
If you say click button with text, then you're asserting that there's a button with that
text and it will fail, right?
Yes, exactly.
You don't have to say expect.
So that in your book wouldn't count as like too many assertions in the test.
It's just...
In fact, yeah, those are things where Elm program test is helping you get a nicer error message
when something goes wrong.
So actually I think you could probably realistically use a rule saying that your
normal tests should never use the ensure functions.
But if you're writing application specific helper functions for your tests, like
a helper function that takes you through the login form, you can use ensure functions in
that helper function, but then your actual test will just call that login thing.
It's essentially a higher level version of the click button type helper.
And then your actual test only has expect at the end.
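A minimal sketch of such a helper, assuming a hypothetical login page: the field ids ("email", "password"), the labels, and the "Log in" button text are all assumptions about the application under test. The ProgramTest functions used (ensureViewHas, fillIn, clickButton) are from elm-program-test.

```elm
module LoginHelpers exposing (logIn)

import ProgramTest exposing (ProgramTest)
import Test.Html.Selector as Selector


{-| Application-specific step: fail early with a clear message if the
login form is not on the page, then fill it in and submit it.
The ids and labels below are assumptions about a hypothetical app.
-}
logIn : String -> String -> ProgramTest model msg effect -> ProgramTest model msg effect
logIn email password programTest =
    programTest
        |> ProgramTest.ensureViewHas [ Selector.text "Log in" ]
        |> ProgramTest.fillIn "email" "Email" email
        |> ProgramTest.fillIn "password" "Password" password
        |> ProgramTest.clickButton "Log in"
```

A test would then read something like `start |> logIn "a@example.com" "secret" |> ProgramTest.expectViewHas [ ... ]`, with the only visible expectation at the end.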
That's very cool.
Tell me if you want to work on that rule with me.
Because that could be an Elm review rule.
That sounds great.
Yeah, so just to reiterate what you're saying here: when you say click text,
there's an inherent expectation in clicking the text because you're asserting
that that text exists on the page. And you're saying that if that didn't
exist as a building block, you could create your own building block like that, where
you say log user in.
And if there's no login button, you could create the validation that that exists in
your own helpers and use ensure in that context.
Yep, exactly.
And for instance, the API of Elm program test actually enforces certain best practices.
It has a click button function that you can call, but it can only click things that
are actually buttons or that are divs that have accessibility attributes that indicate
they're buttons.
So if for whatever reason, your program is just a bunch of divs with on click handlers,
the built in click button function in Elm program test is not going to work.
I could make it work, but I've chosen not to.
But you can always write your own helper function that uses the lower level "find this DOM
element and do this with it" functions to build whatever helper functions
your app needs.
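For instance, a helper that clicks an element by a test-only attribute might be sketched like this. The attribute name "data-test-id" and the helper name are assumptions for illustration; simulateDomEvent is the lower level elm-program-test function as I understand it, and Query.find, Selector.attribute and Event.click come from the elm-explorations/test Html modules.

```elm
module DomHelpers exposing (clickByTestId)

import Html.Attributes
import ProgramTest exposing (ProgramTest)
import Test.Html.Event as Event
import Test.Html.Query as Query
import Test.Html.Selector as Selector


{-| Click whatever element carries the given test id, even if it is a
plain div with an onClick handler. "data-test-id" is a hypothetical
attribute name chosen for this sketch.
-}
clickByTestId : String -> ProgramTest model msg effect -> ProgramTest model msg effect
clickByTestId testId =
    ProgramTest.simulateDomEvent
        (Query.find
            [ Selector.attribute (Html.Attributes.attribute "data-test-id" testId) ]
        )
        Event.click
```

The trade-off is that a helper like this skips the accessibility checks that the built in click button performs.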
Yeah, that's an amazing feature.
So I guess we should maybe talk about some more of the rough edges.
So in program test, I think the big one, which is unfortunately the most
complicated thing, is if you're testing a program where you want to test these external
effects and simulate HTTP responses, or simulate interacting with your JavaScript ports, things
like that. There's a limitation in Elm at the moment where there's these data types
command and subscription, or sub, and theoretically those things are just
a piece of data that represents some effect that the Elm runtime is going to perform on
your behalf.
So in Elm program test, I need to take a command that your program produces and know: was there
an HTTP request made in that command? And likewise for all the various different commands.
So unfortunately, that's not possible in Elm at the moment.
So well, okay, so I know that Evan has had a concern.
And this is a similar reason to why the HTML type is not directly inspectable or destructurable
is that Evan's been conservative about allowing for those kinds of destructurings in an attempt
to prevent packages getting published that do extremely complicated or like not explicit
So I think it's about the explicitness of Elm.
So there's no internal technical reason they couldn't be inspectable.
And I think it's something that like, it's not even a decision that they should never
be inspectable.
But it's that in production code, I think specifically, Evan has wanted to avoid allowing
the use of packages that can like take a command and transform it into some other command,
which could allow for things like oh, sending all of your HTTP requests, if you're using
this caching package, it transforms all your HTTP requests to go to some other server and
redirect all your private information to a man in the middle attack or something like
So I think like, my understanding is it's something that Evan would be open to and does
make sense, but only in the context of testing, and he would not want an API that's usable
in production code to be able to inspect at that level.
And you couldn't do a test that says this HTTP request, so HTTP.get with a URL, and
then say expect equals to something else?
In general, no, because for HTTP specifically, there are functions in that HTTP request type.
And most commands in fact do have functions. Even your port commands, which are the simplest
ones, there's a function that takes the port value and turns it into a message.
Yeah, functions pretty much everywhere.
I guess functions can be equivalent by reference, but it gets to the area where it's like not
entirely reliable.
I believe Elm will, if you say function one double equals function two, if it's the exact
same reference, then it will equal true.
And otherwise, it will crash the program.
It'll give a runtime exception.
I think that's one of the things that they want to change in Elm 0.20.
Not getting those crashes when you compare functions, JSON values or regexes.
Yeah, maybe.
Well, the regex one has been fixed, but I think the idea maybe would be to somehow disallow
comparison of functions or something.
So we should talk a little bit about the effect pattern and how it relates to Elm program test.
Yeah, that's where we're going with that. To work around that, you
basically have to refactor your program first to define your own data type that represents
the effects you're going to produce.
Then you have in production, a function that turns those effects into commands.
And in the test side, you have basically a parallel function that does the exact same
thing but it just returns a different type, which is this simulated effect type that's
specific to Elm program test.
So like, HTTP.get is not inspectable.
But if you had your own version of simulating an effect, so you have, like,
you click a button, and that button fetches the latest to-do items.
So you have an effect, fetch to-dos.
So instead of that just being a command, HTTP.get /todos.json or something,
it's going to be an effect called fetch to-dos.
Your update function is going to be wrapped in a little helper: the
update function you write directly is going to be returning a (model, effect) pair.
And then you're going to have to translate that effect in your main production code into
HTTP.get /todos for the fetch to-dos effect type.
So it's just a custom type: type effect equals fetch to-dos and then all the other possible effects.
But then in the test one, you write a translator that turns that effect not into HTTP.get
but into simulated effect.HTTP.get.
So it's a drop in replacement except instead of importing HTTP, you're importing simulated
effect.HTTP as HTTP.
Otherwise it's the same API, but it gives you something that you can inspect in Elm program test.
And unfortunately, the HTTP package has a similar restriction where you can't inspect
the HTTP body type or the HTTP expect type that's used for parsing.
Maybe you can inspect body.
But anyway, there's kind of this chain of things where you can't if you have an HTTP
expect type that like represents the decoder and all of that, you can't actually use that
for anything directly.
So that's another thing that I have to have a parallel version that's used in the test.
So just to run through how I think about doing that in the least annoying way possible:
you have to look at the function that you're calling in the real world that
is producing the command.
So in your example, HTTP.get returns a command.
So your effect type should really just take the parameters to that.
So that's a little different than what you were saying. You were saying you could have a fetch
to-dos effect.
I think it's actually better to keep your effect type more generic, where it's just representing
the functions that are going to get called.
So I'd recommend that it doesn't reflect the message type.
It's more akin to the command type, but it's a level of indirection,
so you can translate it into a simulated or a real command.
So like get takes the URL as a string.
It takes the HTTP expect, which again, we have to fake.
So then to create your expect, there's like the JSON body or expect JSON or whatever.
So the parameters to that are what you'd stick in your effect type.
So then on the real side, you just call HTTP.get with the parameters and build it up.
And then the simulated version should look exactly the same, because there's a whole bunch
of modules like simulated effect.HTTP.
It has exactly the same API as the HTTP module.
It just returns simulated effects instead of commands.
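A minimal sketch of that shape, assuming a single generic constructor that mirrors the parameters of a string GET request. The names Effect, NoEffect, GetString, perform and simulate are hypothetical; Http.get, Http.expectString, SimulatedEffect.Cmd.none, SimulatedEffect.Http.get and SimulatedEffect.Http.expectString are the real and simulated library functions. Both interpreters sit in one module here only to show them side by side; in a real project the simulated one would usually live with the tests.

```elm
module EffectSketch exposing (Effect(..), perform, simulate)

import Http
import ProgramTest exposing (SimulatedEffect)
import SimulatedEffect.Cmd
import SimulatedEffect.Http


{-| The constructor payload mirrors the parameters of the function
that will eventually be called, rather than naming a specific request.
-}
type Effect msg
    = NoEffect
    | GetString
        { url : String
        , onResult : Result Http.Error String -> msg
        }


{-| Production interpreter: turn an effect into a real command.
-}
perform : Effect msg -> Cmd msg
perform effect =
    case effect of
        NoEffect ->
            Cmd.none

        GetString { url, onResult } ->
            Http.get
                { url = url
                , expect = Http.expectString onResult
                }


{-| Test interpreter: the same structure, but built from the
SimulatedEffect modules so that elm-program-test can inspect it.
-}
simulate : Effect msg -> SimulatedEffect msg
simulate effect =
    case effect of
        NoEffect ->
            SimulatedEffect.Cmd.none

        GetString { url, onResult } ->
            SimulatedEffect.Http.get
                { url = url
                , expect = SimulatedEffect.Http.expectString onResult
                }
```

On the test side, the simulate function would be handed to the program definition with `ProgramTest.withSimulatedEffects simulate`.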
So it's basically a bunch of boilerplate.
That's pretty annoying and takes up a whole lot of the Elm program test documentation.
But theoretically, it's possible to remove that limitation.
It's kind of like a detailed project.
If anyone out there is looking to help with this, ideally, Elm program test shouldn't
need any of this, it should just be able to inspect your commands and read the data without
you having to do anything to your program to be able to start using it.
That would be a game changer.
Yeah, actually, in Elm 0.18,
I prototyped a package that could do that and it needs kind of some integration with
the test runner itself to swap test only JavaScript to make that possible.
But ideally, it should happen.
It just hasn't had a chance to get implemented yet.
I'm wondering whether the effect pattern still has some merits if you're able to do that,
or if it's just useless boilerplate when we get to that point.
Yeah, I think like this is, it's a question that I've seen asked more on the Haskell side
of things.
Where in Haskell, there's a type called IO, which basically is similar to Elm's command,
where it's saying that this can have external effects of any type.
But then if you're really into the strong typing and letting your
types limit the scope of what functions can do, then you look at saying I want a function
that can't do everything, but it's allowed to talk to the database, let's
say, and maybe it's allowed to send log information, but it can't do anything
else. Like it can't read and write from the terminal console.
It can't, I don't know what other effects there are, it can't sleep the computer or call the
halt command, whatever.
So in that environment, there's a lot of conversation about the best way to model that.
And this effect approach is one style of that where you can have a data type that represents
just the limited set of things that are allowed, you can end up with a whole bunch of different
types that represent different contexts that are allowed.
So you could do a similar thing in Elm, which personally, I think has an occasional use
where you maybe have some complicated UI component that wants to ask for certain things,
like maybe it needs to trigger a focus event somewhere else in the DOM,
but you don't want to allow it to make HTTP requests.
So you could do something like that where that module defines a type of things that
the parent component should do on its behalf.
And that makes sense.
But I think as a general, like, I think it makes sense in specific cases where you have
a component that like very clearly needs that responsibility.
But as a general pattern, I think it just gets overly complicated.
And in Elm program test, I would avoid even using that pattern at all if it were possible
to do.
Yeah, that makes sense.
I mean, it could potentially have the same benefit of, you know, how type signatures
in Elm allow you to reason about what's going on.
Is this changing the whole model?
Is that even possible?
Or is it narrowed down into, oh, this can only change this one data type.
So if I'm looking at this one piece of the model changing, I can narrow my focus and
ignore these areas of code.
You know, you can use the effect pattern to do similar things.
But it's so, as you're saying, it's sort of so heavyweight to do that, that it might not
be worth the benefit because managed effects in Elm are already so cleanly isolated that
it's in a pretty good state as it is.
I've thought about using a pattern like this for a sort of plugin architecture for authors
creating packages for Elm pages, because if you're performing a static HTTP request or,
you know, things like that, if you set up a package that generates an RSS feed, it would
be nice to know, is it allowed to make HTTP requests or not and be able to explicitly
give it permission for what it can do.
There are other ways to achieve that effect, though.
Like you could just not perform any requests that it performs except a particular wrapped
type and you have to pass in the request it can perform.
So you could pass in a reference to HTTP.
And anyway, there are other ways to achieve that effect.
And one other use of that pattern is if you need to process those effects in different
ways in different contexts.
So an example would be, maybe you have some kind of admin tool that lets you configure
the forms.
And actually, we did this at NoRedInk, we had an assignment that students could do.
And normally it would like, you know, send data to the back end, here's the work they've
done, get data from other students.
But we also wanted a preview mode where teachers could play around with the assignment, and
also like simulate other students doing work that would like get sent to you and things
like that.
So that was a case where having an effect type as data was useful, because we can interpret
those effects in different ways. In the real mode, for the students doing the work, we would
send the HTTP requests. In the preview mode, we have a different interpreter that basically
uses a fake preview data structure.
And we can trigger different effects from an extra panel of buttons for the teacher
to explore.
So that's a case where you need this pattern, but also a relatively rare one.
Occasionally you'll be doing something where you need that capability, but not often.
That's cool.
I can imagine like wanting to have like security audit log or something of all of the particular
effects that have been performed.
And so you could have a loggable chain of those things and be able to trace them.
That's that's a really interesting thing to think about.
So okay, I want to loop back and talk a little bit more about this thing
you brought up with the effect pattern with Elm program test, where you're describing using
the effect type that you define as a sort of equivalent of a command.
So you've got type effect equals, and then you can have no effect, which would be
the equivalent of Cmd.none.
And then instead of get to-dos, that specific message to make that
HTTP request for getting to-dos, you're saying you could have a more general one.
And so you can have a get data effect that takes a URL as part of the payload
of that constructor, and it takes a decoder.
So there's an example that we'll link to in the show notes from the Elm program test examples
folder that sort of shows effects using HTTP simulations.
But how do you avoid having type variables in your effect type, or do you just have
to bite the bullet and have a get data for every specific data type that gets returned?
Yeah, because right, in one case, maybe you have one request where
you have a decoder that decodes users and in another request, you have a decoder that
decodes account information or, yeah, to-dos.
Yeah, so I need to make a good example.
The example we'll link to is kind of written as the simple, straightforward way to do things.
So there's not a ton of discussion, because the HTTP API in Elm is unfortunately a bit
tedious to work with.
There's a whole bunch of different types, and if you're making tasks
versus commands, there's the expect type, but then there's also
some other type that I'm forgetting the name of.
So anyway, there is a way to deal with that.
The way you end up doing it is the effect type would have, for instance, an HTTP
request constructor. It would have the payload that it needs, the headers,
the method, and then you have a decoder that decodes messages.
And then you also have a separate function that takes an HTTP error and returns a message.
And then you also have the body that takes a JSON value.
So you've basically hidden all of your request specific types behind JSON and behind message.
So that's tedious to call directly, but then you make a helper function that
does take the decoder of type a, the function of type a to message, or alternatively the
function of result HTTP error a to message, and then it builds the actual parameters to
the effect based on those. It would be composing the decoder with the function
that takes the a value and turns it into a message, and it ends up just storing the decoder
of message in the effect type.
So that's the trick you can use to essentially hide things. In Haskell,
you could deal with that with rank two types and things like that.
But in Elm, you have to have a helper function that has those extra parameters, but then
collapses them and returns something that doesn't care about those parameters anymore.
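A minimal sketch of that trick for a GET request with a JSON response. All names here (Effect, HttpGet, get, perform, unwrap) are hypothetical and this is not the package's own example; Http.get, Http.expectJson and Json.Decode.map are the standard library functions.

```elm
module HttpEffect exposing (Effect, get, perform)

import Http
import Json.Decode as Decode exposing (Decoder)


{-| The effect only ever mentions msg. The decoded type a is hidden
by storing a decoder that already produces a message.
-}
type Effect msg
    = HttpGet
        { url : String
        , decoder : Decoder msg
        , onError : Http.Error -> msg
        }


{-| Helper with the convenient signature: it takes the Decoder a and the
Result handler, then collapses them so that a no longer appears.
-}
get :
    { url : String
    , decoder : Decoder a
    , onResult : Result Http.Error a -> msg
    }
    -> Effect msg
get { url, decoder, onResult } =
    HttpGet
        { url = url
        , decoder = Decode.map (Ok >> onResult) decoder
        , onError = Err >> onResult
        }


{-| Production interpreter for the collapsed effect.
-}
perform : Effect msg -> Cmd msg
perform effect =
    case effect of
        HttpGet { url, decoder, onError } ->
            Http.get
                { url = url
                , expect = Http.expectJson (unwrap onError) decoder
                }


unwrap : (Http.Error -> msg) -> Result Http.Error msg -> msg
unwrap onError result =
    case result of
        Ok msg ->
            msg

        Err error ->
            onError error
```

Calling code keeps its own types, for example `get { url = "/users", decoder = usersDecoder, onResult = GotUsers }`, where usersDecoder and GotUsers would be defined by the application.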
Mm hmm.
Yeah, I have an example that I can link to.
I don't think there's an example of this in the Elm program test examples at the
moment, but I've got an example that I can link to that uses that pattern that you just described.
So yeah, that's sort of the conclusion I came to as well: essentially the trick
is that instead of the specific types you're decoding to, you're decoding
everything to a JSON decode value, and then you're having a function that takes that
JSON decode value and turns it into a message.
Hard to wrap your brain around, but yeah.
And finally, it's like way more tedious than I would like because the HTTP API just has
a kind of convoluted way of dealing with errors, depending on your exact call.
But again, this is something that would just completely go away once the work to
directly work with commands under test is done.
So yeah, again, if anyone wants to help, get in touch. It would be far easier to finish
that work than to figure out all the stuff we were just talking about, about how to deal
with the current API in your tests.
So we've gone pretty in depth on the effect pattern, why that's needed.
We've talked about simulating effects and the APIs for that.
We've talked about the testing pyramid, when to use unit tests versus high level tests.
Let's do a quick round of best practices and tips and tricks for using Elm
program test effectively.
So one thing we didn't touch on a lot, but which is a huge amount of work
behind the scenes in Elm program test, is the types of error messages that it can report about
your view.
So for instance, if you say click a button with this label, it will look for buttons
in a whole bunch of different ways.
Like it'll check for accessibility tags, it'll check for things that are actual buttons,
it'll check for a button that has an image with alt text in it of that text.
So I don't know if this is a best practice so much as just the best practice is to use
Elm program test.
It can help encourage you to write code in an accessible way.
Like another thing it will do is verify that you have a label for your checkbox.
If you want to click a checkbox and make sure that the label is hooked up in a way that
actually works, which there's like three different ways to do it, but you can also make mistakes
and have it not actually work in the browser.
Yeah, no, those are great tips.
I mean, I think those are good debugging tips for just how to understand if things aren't
wired up, what should you be learning more about or double checking that you've done
Do you have any opinion on how to select tags or elements?
So you say, select them by label, select them by text, but I know that a lot of people like
to use end-to-end test IDs.
So a specific attribute, for instance, at Humio, we call it end to end dash ID, and
we only use those for tests.
So I think like the goal of Elm program test is to avoid needing those things in a lot
of cases, specifically like with buttons.
Elm program test is smart enough to be able to search for the labels.
And there's like five different ways a button could get label text.
And Elm program test does that in a reusable way.
So I think often a lot of the reason that people add those IDs is because the testing
framework they're using isn't smart enough to find the thing by the label, all the different
ways that labels can be attached.
So it's just easier to start attaching IDs everywhere.
So if you're using Elm program test, I think a lot of those cases, you won't need those
IDs because Elm program test is smart enough to find what you mean by some user viewable
information that's on the page.
However, there are some limitations to that.
I think one reason that I heard also is that people don't want to have their test fail
when the text of the button changes.
You consider that as part of the spec from what I'm hearing?
I mean, I would say the approach I'd recommend in Elm program test if you wanted that type
of safety is to define some constants somewhere that have the text.
And then you can refer to that both in your test and in the real code.
Because you are sharing code between your...
It's just Elm code between the tests and the code.
I'd say the reason for that is my goal of wanting the tests to
read like something that a user could understand, or maybe someone who understands what HTTP requests
are, but they can read it and see like, oh yeah, we're clicking the go back button or
So that if you look at it, basically the scenario and the workflow that you're testing is
clearly visible and readable as opposed to being hidden behind button IDs.
However, if someone really wanted to have an ID-focused thing, they could make a
test module of helper functions that do that.
There's some lower level things in Elm program test where you could implement the
set of helper functions that you want for your application if the ones that Elm program
test provides enforce some preferences that you don't really want.
But I think in general, like I'd like to see more focus on thinking from the user perspective
and haven't in practice seen a lot of issues with text labels changing where it was hard
to replace.
You just go and change it or extract it into a constant if that's something that you
have happening a lot.
So we talked about these effect handlers that are taking your custom
type representing the effects in your application and turning it into a real command and a simulated
command or a simulated effect.
How do we keep those in sync?
Because so like one thing that I think can be helpful is, you know, kind of like I hinted
at earlier, import simulated effect.http as HTTP.
If you import it with that import alias, then the handler function in your test code and
your production code will look exactly the same.
Yeah, that's exactly right.
And all the simulated effect dot whatever modules in Elm program test, if you look in the
docs, it says, this is meant to be an exact parallel of this real module.
So that's really the key as we talked about briefly earlier, the constructors in your
custom effect type should basically, like, ideally just be the parameters that you're
directly going to pass to the function that that produces the command.
It almost seems like that could be an opportunity in people's like build or test setup scripts
to just, I don't know, have certain modules where you define your real effect handlers.
And then you could just derive it from a pretty simple like copy paste of that module and
then amend the imports of HTTP to import simulated effect dot HTTP as HTTP.
So that could be an interesting thing to explore too.
Yes, but we're starting to get to the level of effort where personally, if you're going
to invest in doing that work for your application,
I'd rather you got in touch with me and worked out a plan where we can actually get rid of
the need for this effect type completely.
That'd be so great.
I get the feeling that you want some contributions.
Yeah, well, there's just a whole handful of things where I can see the potential for
the API of Elm program test to be even nicer than it is and much easier to understand.
So yeah, there's a couple big projects that if folks are interested in, get in touch with
me although it would require a bit of commitment to kind of work through some design issues
and get this implemented.
So getting rid of the need for that effect wrapper is the big one.
Adding support for more commands that aren't currently represented as simulated effects,
like keyboard focus, the viewport scrolling, things like that.
If folks are interested in helping with that, I'd love to hear from them.
And it seems like you track these as issues in the GitHub.
Is that the place to look for them?
Yeah, partly.
Certainly if there's something that's missing, whether or not you are going to work on contributing
to it, open a GitHub issue if there isn't one already.
But also not everything is in there.
Some of the bigger projects I haven't tracked there yet, not because I don't want them there
just because I don't have enough clear information that I want to put down yet.
But yeah, another big one is that the test dot html module that's part of the Elm
test package has some missing features and isn't really composable in a nice way.
So there's some improvements there that could allow the simplification of some of the Elm
program test stuff.
Like currently, if you click a link, you have to both provide the text of the link and the
URL that it's supposed to be.
But theoretically, Elm program test should be able to get the URL
from the virtual DOM.
But it's just not possible the way that test dot html is implemented right now.
So that's another kind of big project.
But if anyone's excited about it, I'd love to chat about it.
And for small stuff, if you're interested in something smaller: if you have any improvements
to the documentation, or even just issues about some part of the documentation that
is unclear, file an issue about it or make a PR with improvements.
If anyone's been using Elm program test, or is going through the process of learning it
and is inspired to write a blog post, that would be really useful because I've put a
lot of work into the documentation, but also I wrote it so I don't really have that perspective
of someone coming to try to use it.
Some more examples of that would be great to have.
So let's give people some resources to get started here.
So you mentioned the docs that you wrote, which are very thorough and well worth a read.
Even if another blog post would be helpful, they've got a lot of really good information.
So check out the guidebook, which is in the show notes, and just the Elm documentation
for the package itself.
Also be sure to, you know, there's a lot going on in the HTML test assertion helpers in the
Elm test package.
So we've got a link to that in the show notes as well.
Definitely check out that and familiarize yourself with the API.
And any other resources?
Is the Elm test Slack channel a good place to ask questions or discuss that?
Yeah, I think that actually is a good place.
Is it the testing channel?
Oh, yeah, just called test.
It's called hashtag testing in the Elm Slack.
On the docs, I did want to mention, too, big thanks to some of my former NoRedInk
colleagues, specifically Katie Hughes and Michael Hadley and Brooke Angel, who helped
review the documentation, gave suggestions, and helped me improve that when I did the big
3.0 release.
Great stuff.
And Tessa also gave a really great talk about writing testable Elm.
She talked quite a bit about using Elm program test and some of the accessibility features
that you were discussing, too.
So that's worth a watch as well.
So cool.
Well, I mean, there's so much more we could get into, so many great details to talk about.
But this was a lot of fun.
Thank you so much for chatting with us, Aaron.
Yeah, thanks for having me.
And until next time, talk to you later.
Jeroen.
See you.