elm-program-test

Aaron VonderHaar joins us to share the fundamentals and best practices for his high-level Elm testing package, elm-program-test.

Published February 15, 2021

Episode #24

Aaron VonderHaar (github) (twitter)
elm-format episode
elm-program-test
elm-test
elm-test episode
BDD As It's Meant To Be Done video by Matt Wynne
BDD (Behavior-Driven Development)
Gherkin syntax
Outside in vs. inside out testing
ensure vs. expect functions
The Effect Pattern
Http simulation example from examples folder
An elm-program-test example using a generalized Effect type
elm-program-test GitHub issues for contributing
The Official elm-program-test Guidebook
The elm-test HTML querying API
#testing channel in Elm Slack
Writing Testable Elm keynote by Tessa Kelly

Transcript

And again, today we're joined by a special guest, Aaron VonderHaar.

Hey, how's it going?

It's going well.

Thank you for joining us.

You want to say a quick hello and tell us a little bit about yourself?

I am Aaron VonderHaar.

I have been, I guess, a full stack developer my whole professional career and did a lot

of mobile iOS and Android development in the past and have been doing Elm, I guess it's

almost eight years ago now that I started writing elm-format, and I was working at NoRedInk

for about four and a half years doing Elm there and now doing some Elm consulting work.

And we had a lot of fun discussing elm-format in our last episode and it's fun to have you

on to discuss another tool of yours.

So today we're going to be doing a deep dive on Elm program test.

So let's start with the definition.

What is Elm program test?

So Elm program test came about as filling a gap for me in doing Elm development and

doing it in a test driven development style way of being able to write high level tests

that are written basically from the perspective of a user or an outside entity interacting

with your Elm program.

And it basically gives a nice high level API to write Elm tests in that fashion.

In practice, I found the main benefit that that gives you is the ability to have test

coverage that is resilient to refactorings and specifically like significant refactorings

if you need to restructure how your application works.

If you're converting like a single page app into a spa app, bigger things like that, or

even just like sophisticated new features or changing workflows for your users, the

tests that you write with Elm program test tend to give you test coverage that allows

you to do safe refactorings that are much larger than refactorings you'd have coverage

for using normal unit tests.

You mentioned that you were writing Elm tests.

So this works with the regular Elm test binary, right?

This is basically just a sophisticated set of helper functions that can be used with

normal Elm tests.

So just to give like a high level overview of like, what the actual wiring looks like,

you pass in an init, update, and view function to Elm program test.

So the way it works is that it's actually simulating initializing, going through the updates,

and responding to events in the view.
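
As a rough sketch of that wiring, assuming a hypothetical counter program (the `Msg`, `init`, `update`, and `view` below are made up for illustration), a test using the package's `ProgramTest.createSandbox`, `ProgramTest.start`, `ProgramTest.clickButton`, and `ProgramTest.expectViewHas` might look like this:

```elm
module CounterTest exposing (all)

import Html exposing (Html)
import Html.Events exposing (onClick)
import ProgramTest exposing (ProgramTest)
import Test exposing (Test, test)
import Test.Html.Selector as Selector


-- A tiny hypothetical program under test.


type Msg
    = Increment


init : Int
init =
    0


update : Msg -> Int -> Int
update msg count =
    case msg of
        Increment ->
            count + 1


view : Int -> Html Msg
view count =
    Html.div []
        [ Html.button [ onClick Increment ] [ Html.text "Increment" ]
        , Html.p [] [ Html.text ("Count: " ++ String.fromInt count) ]
        ]


-- Hand init, update, and view to elm-program-test and start the simulation.
start : ProgramTest Int Msg ()
start =
    ProgramTest.createSandbox
        { init = init
        , update = update
        , view = view
        }
        |> ProgramTest.start ()


all : Test
all =
    test "clicking the button twice shows a count of 2" <|
        \() ->
            start
                |> ProgramTest.clickButton "Increment"
                |> ProgramTest.clickButton "Increment"
                |> ProgramTest.expectViewHas [ Selector.text "Count: 2" ]
```

The test never mentions `Increment` directly. It only refers to what a user would see and click, which is why it should keep passing if the messages are renamed or restructured.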

Essentially, Elm program test was extracted from some helper functions that I started

adding throughout NoRedInk's test code.

And the very first version of that basically just looked like we were creating a model,

piping it through a bunch of update calls to the update function with different messages.

And eventually that got fancier where I realized, Oh, we can use the HTML testing features of

Elm test so that we can actually inspect the view that is based on the current state of

the model and make sure that the view has ways to interact that we expect to produce

the message that we want to feed through the update loop.
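
For comparison, that earliest style (a model piped through a series of update calls) can be sketched with plain elm-test. The `Model`, `Msg`, and `update` here are hypothetical stand-ins, not code from NoRedInk:

```elm
module ManualUpdateTest exposing (all)

import Expect
import Test exposing (Test, test)


type Msg
    = Increment
    | Reset


type alias Model =
    { count : Int }


init : Model
init =
    { count = 0 }


update : Msg -> Model -> Model
update msg model =
    case msg of
        Increment ->
            { model | count = model.count + 1 }

        Reset ->
            { model | count = 0 }


all : Test
all =
    test "two increments after a reset leave the count at 2" <|
        \() ->
            -- Each step names a message directly, so the test is coupled
            -- to the Msg type and to the shape of the model.
            init
                |> update Increment
                |> update Reset
                |> update Increment
                |> update Increment
                |> .count
                |> Expect.equal 2
```

Nothing in this version checks that the view actually offers a way to produce `Increment` or `Reset`, which is the gap the HTML testing features fill.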

So essentially the complexity in Elm program test is creating an API that

speaks a different language than view, model, update, which is like the technical

implementation of how Elm programs are written.

And Elm program test lets you use functions like, Oh, click the button with this label

or pretend that this HTTP request you made came back with this error code, things like that.

So let's talk a little bit about some use cases and when somebody would use this compared

to a regular test.

So we covered test driven development and Elm test in general in our Elm test episode,

but this is something different.

This is a higher level form of testing.

And so when would you reach for this rather than a more vanilla unit test?

So something you two talked about in your previous episode was about the testing pyramid

and kind of the different layers of that.

I guess there's a couple of things. The main thing is that if you're happy with the unit

tests you have, Elm program test isn't trying to stop you from writing unit tests that you're

already happy with.

Elm program test provides the level above that and in particular, if you look at end

to end tests using something like Cypress or Selenium web driver based testing, my hope

has been that Elm program test will allow you to write tests in the same language as

those where you're talking about what the user does, what they see on the screen.

But if you take the assumption that you trust the Elm compiler and the Elm packages to actually

do what their documentation says they do when they're compiled and running in a browser,

if you can trust Elm as a framework to do those things, then you can use Elm program

test to write tests that cover those same high level features, but that run much faster,

are not flaky at all, and are completely deterministic.

Yeah, no, if you can exercise some functionality at a low unit level, then that's ideal.

But that doesn't always give you confidence.

So, you know, one of the rules of thumb that I try to follow is the more high level

my testing becomes, the less comprehensive it should be.

So if I'm noticing that I'm trying to exercise every combination possible in every edge case,

and it's in a high level test, then I'm thinking there's something that either needs to be

split out into a module that's testable or simply exists that way, but isn't being properly

unit tested to give me confidence because that's really not, it's going to be extremely

verbose to have to create a whole test scenario where you're describing clicking through things

and all these permutations.

Because you know, just mathematically, if you think about a higher level test, you're

putting more pieces together.

So there are more variables, there are more things operating together on that page.

And so if you try to exhaustively check them, now you've got a combinatoric explosion.

And it's just not going to be effective, because you can't do that.

But you can at the unit level and you want to.

So that's one thing I try to pay attention to.

But then, like, so why not just test everything in isolated unit tests, right?

And the reason for that is that you don't get the same confidence about things interacting

together and about the system functioning at a high level.

Like what is the user's experience when they come in and they click on "buy product"?

Do you get money?

That's an important thing to know.

And a unit test might tell you that the credit card information is being validated correctly.

It strips out white space appropriately, formats it, gives the right client side validation.

That would be a great unit test.

But is it making the right API request and handling the response, showing error messages?

When a user goes to do some important business flow, you

want confidence that it's actually going to do that.

And unit tests don't necessarily give you that.

Yeah, that's exactly right.

And in Elm, being a statically typed and strongly typed language as well, the compiler gives

you a lot of protection for some types of refactorings.

But Elm program test is really at the level that you can't get type safety because it's

looking at like, oh, when you have this sequence of events happening in this order, something

And especially like in Haskell, you can get a bit fancier with types.

But in Elm, you like definitely can't write a type for your update function that says

that, oh, once this message happens, you can never get these initialization messages anymore.

You only get this later class of messages.

So Elm program test allows you to kind of in a more natural language, I would say, than

writing unit tests allows you to get coverage of those things.

I don't know if you've ever ended up with tests where you're trying to like initialize

a model, send it through a bunch of update calls, and then check the state of the model

You can definitely write those as unit tests.

But I find that it's often very verbose.

And you have to end up putting a whole bunch of like, fake data and things that are really

implementation details.

Whereas when you write a test of that nature, what you want to be thinking about is, okay,

what is the flow of events?

And you don't want to care about how to construct the record of the data that represents the

JSON that comes from this endpoint and all those lower level details.

So you say that those tests are supposed to be easy to read.

Have you looked at Cucumber?

Or are you inspired by Cucumber and those kinds of testing frameworks?

Like I think it's BDD?

Yeah, I've been really interested in those techniques.

There's a talk called something like test driven development the way it was meant to

be done by Matt Wynne, and it's probably like 15 years old now.

So Elm program test, I think, is a layer that's needed if you wanted to write tests in that

Because that BDD approach is really that you develop a domain specific like application

specific language to write these high level tests in, where you're talking about things that

are relevant to your business domain, like at NoRedInk, it would be like talking about

teachers and classes and assignments and creating new assignments.

Yeah, I think that's kind of just like a higher level DSL on top. Like, to implement

some of those features, in a BDD style test, you might say, "when the

user logs in," and then dot dot dot, other stuff. To implement that "the user logs in"

step of your BDD testing, you would, I think, ideally use something like Elm program test to

say, Okay, click the login button, then fill in the name of the current user fill in the

password of the current user, click this other thing, then simulate the back end responding

to the post request with an okay status.

So that's kind of implementation specific.

But the way I just described those steps is still high level, it doesn't depend on the

details of how the form validation is implemented, or how they like what UI framework they use

to create the form.
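
A minimal sketch of such a login step as a reusable helper, built from the package's `ProgramTest.clickButton`, `ProgramTest.fillIn`, and `ProgramTest.simulateHttpOk`. The button labels, field ids, URL, and response body are all hypothetical and would need to match the application being tested, and the test's program must be set up with simulated effects for the HTTP step to work:

```elm
module TestHelpers exposing (logIn)

import ProgramTest exposing (ProgramTest)


{-| Hypothetical application-specific step: "the user logs in".

The labels, field ids, and endpoint below are assumptions for illustration.

-}
logIn : String -> String -> ProgramTest model msg effect -> ProgramTest model msg effect
logIn username password programTest =
    programTest
        |> ProgramTest.clickButton "Log in"
        |> ProgramTest.fillIn "username" "Username" username
        |> ProgramTest.fillIn "password" "Password" password
        |> ProgramTest.clickButton "Submit"
        -- Pretend the back end answered the POST request with a 200 OK.
        |> ProgramTest.simulateHttpOk "POST"
            "https://example.com/api/login"
            """{"ok":true}"""
```

A test can then pipe its started program through `logIn "teacher@example.com" "secret"` before continuing with the steps it actually cares about.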

So Elm program test is kind of high level for the technical side, but it's still lower level

than something like Cucumber tests.

I have an opinion about Cucumber, which is that, now, I may be totally wrong, or, you know,

reasonable people may disagree, but I'll share my opinion for what it's worth.

So what you were just discussing there, Aaron, with Elm program test being a high

level way to express a flow that's not necessarily as coupled to the actual technical implementation,

not being as likely to change if you change implementation details, right?

In my mind, that is exactly what I want.

And the Cucumber part of writing it in actual plain language, I don't like, and that's

what Cucumber is for. For anyone who's not familiar with the Gherkin Cucumber syntax, the sort

of idea of, you know, one of the concepts of behavior driven development, BDD, is that

you actually have customers or, you know, product managers, that type of thing,

and they're writing cases, and it's not code, it's just text.

And you use regular expressions, or various forms of parsing, to do like language parsing

and say, when I log in as email address, and then you use some regex to capture that email

address and then log in.

And in my experience, that just creates a layer of indirection that doesn't make it

any more high level, it actually just makes it more confusing what's going on.

If you are not a programmer, and you're writing that, to me, it doesn't seem

like it's making it any easier just because it's in the English language, because it's still

a specific syntax that you have to write, it's just one that's been built with this

layer of indirection of regular expression, capturing and stuff.

But you still have to know how to formulate a valid one that the regular expression will

capture and what it's going to do with that.

And to me, that's not any easier than just writing high level instructions that

say, go to, you know, maybe you have to like learn what the pipe operator is, and then

you have to like learn a few of these things.

But then you just say, click this thing, do this thing, do this thing.

So you know, it's sort of like AppleScript went this route too, of like, if we write it

like English, it will be easy for people to write, but it's actually not because it creates

this layer of abstraction that just makes it harder to understand what it's actually

going to do and how it's going to interpret it.

Kind of sounds like graphical code, like Dark, for instance, where you connect

things visually. Yeah, but you have more rules. Like, do you know more

people who can write code that compiles or more people who can write English perfectly

That's going to be interpreted by a specific set of code instructions, and if it doesn't

fit that format, it won't do what's intended.

And how do you debug that?

So it's just adding a layer of indirection.

So anyway, for what it's worth, I'm very happy with just the high level way of expressing

things with Elm program test.

I think it's really good.

And I really enjoy being able to like, you know, write tests from the user's perspective.

I know a lot of French developers who would be better off writing code than English.

Well, just one thought on top of that is that I think in practice, especially

if you have a large application and you start using Elm program test that ideally you'll

end up with a module or maybe some different modules of helper functions that build on

top of Elm program test, some application specific helpers like the login example that

we said, or maybe in the ed tech domain, creating an assignment, which might do a whole bunch

of steps like, Oh, click the whatever the button to go to the create assignment page,

fill out all this data, click Submit, simulate the response.

So you can still have some of those benefits of having even higher level concepts.

And Elm program test is kind of like a support layer that does all of the generic stuff,

basically all of the reusable things about writing high level tests, Elm program test

does so that as an app developer, you can write your app specific tests much more quickly

and efficiently and correctly, which I guess is a point we should get to at some

point of all the things that Elm program test does behind the scenes.

And to that point of doing it correctly, I think that that's a really key point that,

you know, having this encapsulated in this library, I'm sure you could, you know, call

the update function yourself from an Elm test, but you don't have the same level of confidence

that things are actually being wired up in a way that's equivalent to what's going to

happen when you do Browser.application or whatever type of program you're creating.

So there's a lot of value to having that encapsulated in something that you can trust to be equivalent

so that you know you're testing something that's realistic and going to reflect

And to hit on that just a little bit more, about the types of refactorings that these

tests written in Elm program test can help you with.

These are things like, oh, you have this set of messages and you want to change what the

messages are so that you can whatever centralize your logic, maybe in a certain way in your

update function and extract some data type specific functions that are then used rather

than having all that logic in your update function.

That's the type of refactoring that is very tedious to do if you don't have these high

level tests, because essentially what you have is a bunch of unit tests calling an update

function that are written in the language of that module that knows about its messages,

knows about its internal model.

And now you want all those unit tests to move to a different module that's specific to this

smaller data type doesn't know anything about messages.

So if you have the high level test coverage, you have some safety.

And if you don't have these higher level tests, you basically have to translate all your tests

and move them over, which is, I don't know, that's one of the things I fear the

most when I'm doing coding.

And one example is like form validation is actually something fairly common and also

something that you tend to do incompletely the first time you implement it like, oh,

we'll just need to validate a few things, we'll move on.

And then over time, you need to add more validations, maybe at some point, you end up extracting

a validation helper module or using some package to help you with validation.

Those again, are things that can completely change the flow of events.

Like maybe you used to validate things on when the message happened, but now you're

switching to store the unvalidated data in your model and validating it when you send

the form and showing error messages in a different way.

Those are changes that ideally should be simple to make.

But if you're writing unit tests, they become extremely tedious, because those unit tests

that touch your update function or directly refer to messages are extremely brittle to

those kind of changes.

So if you were doing something like an optimistic update in the UI where you're interacting

with server responses, you would have to fill in a lot of the pieces to simulate that in

a plain old unit test.

But if you're driving it through Elm program test, then you can say, click this button,

you know, enter this information, hit send, simulate a server response and make assertions

about the view while it's loading before the server response comes back.

So you can decouple it and not rely on wiring it in a very specific way that

you can't trust as much.

So yeah, there's a lot of value to that.

One thing I don't think we wrapped up yet, but you had mentioned earlier, Dillon, about

using unit tests to cover all the different edge cases and combinations.

And I would definitely agree with that.

I tend to start with maybe like a happy path program test.

Then you can jump down and do unit tests for all your edge cases.

So you like to do like a more outside in approach?

Yeah, yeah, exactly.

Because, so that's a really interesting topic that, you know, in the sort of test

driven development world, there are a lot of conversations about, do you

write a unit test first, and then work your way up to fitting that unit into

the application, which is in a way, you know, that's delayed integration, you haven't fit

the piece into the whole.

So you spend time building up all these different cases and then fit it into the whole.

And you don't know if it's actually going to solve the problem that you set out

to solve when you built that unit.

So you don't know if it's going to fit. Fitting and integrating the piece into the

whole is the hard and risky part.

So the sooner you can do that, the better. The outside in school of thought is more that you

start with building something end to end.

And it starts with testing from the outside in from the user's perspective, and then builds

the unit as needed.

But it sort of uses more of a fake it till you make it approach, like you said, sort

of getting that happy path.

So that's more the approach that you like to take.

Well, that's interesting.

I don't know that I have a specific preference for either of those in Elm.

But I do tend to kind of write the high level test first, and then often I'll comment it

out or stash it or something and build those smaller pieces, then bring back the failing

high level test and plug the pieces in.

I think I would tend to do that most of the time.

However, there are cases if I'm not sure how things will be structured that I would write

the failing high level test, implement it, like do the fake it till you make it and just write

a really simple thing where it's like hard coded to always show the stuff that's needed

to make it pass, then refactor.

I think there is a pitfall in doing that approach with Elm program test, in that you can often skip

the refactoring step, or you can end up with working stuff that you've refactored, but

you haven't exactly extracted the coherent modules yet.

And you can end up with kind of a mess relying only on Elm program test if you aren't disciplined

about looking for the smaller pieces that you're going to pull out and write lower level

tests for those pieces.

Because when you use a regular Elm test, you test an API and then by using that API,

you can kind of feel the pains with it.

But with Elm program test, you don't feel those pains because those APIs are technical

details and you can forget them.

Is that what you meant?

That they are a technical detail.

I think, tell me if this is related to what you're saying, like in Elm, I think

the preferred way and the way people are settling on is that if you have smaller pieces of your

Elm program, especially if they're related to view stuff or update logic, there's a lot

of different patterns you can use.

Like you may have a module that just has a function that returns HTML, or you might have

one that has a full update function that can produce commands or somewhere in between where

maybe it returns like some specific set of things that can result from its actions or

special functions to process the messages it can produce.

I think like that's a case where Elm program test can be used to write unit tests for

modules like that that are not completely programs in the full architecture sense of

the word, but are also more complicated than just some functions.

However, Elm program test doesn't let you interact directly with its API.

And as you two talked about in your previous episode, writing tests in

test driven development, or test driven design, I almost said, is like a design tool.

And I think Elm program test doesn't really give you those design tool benefits, because

yeah, I think you're not working directly with the API, but it does help you.

It gives you other benefits, the test coverage we talked about that lets you do bigger refactorings.

And it also helps you think about the user perspective more if you're really doing user

centered design.

And a process like that, it can really help get developers aligned with the kind of product

level thinking as well.

But for designing an API, Elm program test really doesn't provide those benefits of testing.

Right, it's almost like the, you know, user interface based testing in Elm program test

keeps you honest about making sure that you're designing something that's going to have a

nice experience from the user's perspective.

And doing unit level tests with Elm test helps keep you honest that the API you've designed

is going to be nice to work with because that's the direct thing that you're exercising and

getting feedback on.

So you sort of want a mix of those high level and low level tests to make sure those things

are both being designed nicely.

It's really nice that you can write both using Elm test, so that you don't have to write unit

tests in Elm and integration tests using JavaScript, using Cypress, for instance.

Even though that's an amazing tool, but that's a different thing.

Yeah, which is really leveraging the strength of Elm in that everything

in Elm is pure functions.

There's a virtual DOM, which is essentially a data structure.

The commands and subscriptions are essentially just data that represents what the runtime

is going to do.

So Elm program test essentially is simulating or like reimplementing in Elm, in kind of a

test specific way, the runtime of Elm where it's creating a model, it's maintaining

it, if there's external effects, it's like keeping track of those.

Remembering what things are in progress, what are waiting for responses, and can render

the view from the current state at any time.

So yeah, all of that is possible and, like, was relatively easy to implement.

I mean, it's a lot of work, but it was easy to implement compared to other languages.

Like actually, I've worked on similar test frameworks for Android, the Robolectric

framework at Pivotal Labs, we started writing a similar framework for iOS, which I think

never got finished.

I've like done similar things in the past and it's like, this is by far the easiest

to write and the most sophisticated of any similar attempts I've worked on.

It's like a pure Elm implementation of simulating the Elm architecture and the

So let's get into a couple of things.

So first of all, I think we should make it clear, we've sort of implied

this but not said it explicitly.

You can simulate HTTP requests with Elm program test.

And that's one of the really killer features because, you know, you would be hard pressed

to find a way to do that if you were sort of rolling your own thing.

It's really something you're going to have to do a lot of work to get that same

result otherwise, without Elm program test.

So there's an API where you can sort of expect certain HTTP requests to be performed.

You can say simulate responding with this HTTP response with this status code, and then

continue to make assertions in your view after that.
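
A sketch of what that can look like, assuming a small hypothetical program that returns its own `Effect` type instead of `Cmd` (the model, messages, and URL are made up for illustration). The package's `ProgramTest.withSimulatedEffects` tells the test how to interpret each effect, `ProgramTest.ensureViewHas` asserts mid-pipeline while the request is still pending, and `ProgramTest.simulateHttpOk` supplies the response:

```elm
module GreetingTest exposing (all)

import Html exposing (Html)
import Html.Events exposing (onClick)
import Http
import ProgramTest exposing (ProgramTest)
import SimulatedEffect.Cmd
import SimulatedEffect.Http
import Test exposing (Test, test)
import Test.Html.Selector as Selector


-- Hypothetical program under test. The model is just the status text shown.


type Msg
    = ClickedLoad
    | GotGreeting (Result Http.Error String)


{-| Plain data describing side effects, so the test can simulate them.
-}
type Effect
    = NoEffect
    | FetchGreeting


init : () -> ( String, Effect )
init _ =
    ( "Not loaded yet", NoEffect )


update : Msg -> String -> ( String, Effect )
update msg _ =
    case msg of
        ClickedLoad ->
            ( "Loading...", FetchGreeting )

        GotGreeting (Ok greeting) ->
            ( greeting, NoEffect )

        GotGreeting (Err _) ->
            ( "Something went wrong", NoEffect )


view : String -> Html Msg
view status =
    Html.div []
        [ Html.button [ onClick ClickedLoad ] [ Html.text "Load greeting" ]
        , Html.p [] [ Html.text status ]
        ]


-- Tell elm-program-test how to simulate each effect.
simulateEffects : Effect -> ProgramTest.SimulatedEffect Msg
simulateEffects effect =
    case effect of
        NoEffect ->
            SimulatedEffect.Cmd.none

        FetchGreeting ->
            SimulatedEffect.Http.get
                { url = "https://example.com/api/greeting"
                , expect = SimulatedEffect.Http.expectString GotGreeting
                }


start : ProgramTest String Msg Effect
start =
    ProgramTest.createElement { init = init, update = update, view = view }
        |> ProgramTest.withSimulatedEffects simulateEffects
        |> ProgramTest.start ()


all : Test
all =
    test "shows a loading state, then the greeting from the server" <|
        \() ->
            start
                |> ProgramTest.clickButton "Load greeting"
                |> ProgramTest.ensureViewHas [ Selector.text "Loading..." ]
                |> ProgramTest.simulateHttpOk "GET"
                    "https://example.com/api/greeting"
                    "Hello!"
                |> ProgramTest.expectViewHas [ Selector.text "Hello!" ]
```

In a real application the same `Effect` values would also be turned into actual `Cmd`s by a separate function used only in production. That function is omitted here.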

So just to compare what that would look like without Elm program test, you could do things

But you would in the first place be like saying call update with this exact message with all

And then you would just add it to the specific message type for your application.

And also, like, it starts to get extremely tedious going that manual

route, which is what I started with before I extracted Elm program test, is that, okay,

well, maybe you also want coverage to make sure that the message that produced the HTTP

requests happened, because like, maybe you whatever disconnect the code that makes the

button appear that lets people initiate the request, or maybe the validation failed, and

the request wasn't even sent, things like that, Elm program test handles for you.

And then also, if your test fails, having a nice error message about what is happening

is another part that is a lot of tedious work to do on your own, which Elm program test

provides for you.

So if you say, Oh, simulate the response to this endpoint, and it was a timeout error,

if that request wasn't made, Elm program test will say, Oh, that request was never made.

Here's the list of requests that were made.

In contrast, if you were just doing it manually sending the message, there's no guarantee

that, like, maybe the message was sent, maybe it wasn't, you haven't validated

it. All you know is, like, maybe there was a typo in the URL, and it didn't match for that

So that's, I guess, something I take pride in, in Elm program test.

It's like, yeah, it's a centralized place where, and hopefully other people can

contribute to this, to have nice error messages for these cases that, like, ideally,

you want to have this, but you'd never have time to do that work on your own

if you didn't have this reusable package.

Instead of everybody individually building their own thing and taking their time to do

that, hopefully, you know, you've done the bulk of the work, but hopefully others

can then invest some of that time they would have spent rolling their own thing to improving

this community resource.

Yeah, so there's a whole bunch of stuff.

So Elm program test, like you mentioned, can do simulations of HTTP responses, it can partially

do the simulation of time passing, like if you have tasks that have delays, it can simulate

that and you say like, Oh, advance time by this much, and it'll trigger all the

delayed tasks. You can simulate ports, both incoming and outgoing ports, and you can simulate

some of the browser and DOM API. That's an area I still need to improve quite a bit.
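
The time simulation can be sketched like this, assuming a hypothetical "Saved!" banner that hides itself after three seconds (the program and its `Effect` type are made up for illustration). It uses the package's `SimulatedEffect.Process.sleep`, `SimulatedEffect.Task.perform`, and `ProgramTest.advanceTime`, which takes a number of milliseconds:

```elm
module BannerTest exposing (all)

import Html exposing (Html)
import Html.Events exposing (onClick)
import ProgramTest exposing (ProgramTest)
import SimulatedEffect.Cmd
import SimulatedEffect.Process
import SimulatedEffect.Task
import Test exposing (Test, test)
import Test.Html.Selector as Selector


-- Hypothetical program: the model is whether the banner is visible.


type Msg
    = ClickedSave
    | BannerTimedOut


type Effect
    = NoEffect
    | HideBannerAfter Float


init : () -> ( Bool, Effect )
init _ =
    ( False, NoEffect )


update : Msg -> Bool -> ( Bool, Effect )
update msg _ =
    case msg of
        ClickedSave ->
            ( True, HideBannerAfter 3000 )

        BannerTimedOut ->
            ( False, NoEffect )


view : Bool -> Html Msg
view bannerVisible =
    Html.div []
        [ Html.button [ onClick ClickedSave ] [ Html.text "Save" ]
        , if bannerVisible then
            Html.p [] [ Html.text "Saved!" ]

          else
            Html.text ""
        ]


-- A delayed task, simulated without any real waiting.
simulateEffects : Effect -> ProgramTest.SimulatedEffect Msg
simulateEffects effect =
    case effect of
        NoEffect ->
            SimulatedEffect.Cmd.none

        HideBannerAfter milliseconds ->
            SimulatedEffect.Process.sleep milliseconds
                |> SimulatedEffect.Task.perform (\_ -> BannerTimedOut)


start : ProgramTest Bool Msg Effect
start =
    ProgramTest.createElement { init = init, update = update, view = view }
        |> ProgramTest.withSimulatedEffects simulateEffects
        |> ProgramTest.start ()


all : Test
all =
    test "the banner disappears once enough time has passed" <|
        \() ->
            start
                |> ProgramTest.clickButton "Save"
                |> ProgramTest.advanceTime 1000
                |> ProgramTest.ensureViewHas [ Selector.text "Saved!" ]
                |> ProgramTest.advanceTime 5000
                |> ProgramTest.expectViewHasNot [ Selector.text "Saved!" ]
```

The test runs instantly because no real clock is involved. The simulated clock only moves when `advanceTime` is called.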

So what is it that you can't simulate with Elm program test at the moment?

So specific things like browser focus, like focusing on particular IDs in the DOM. Viewport

scrolling is something that I haven't touched on.

There's some things related to time, I think, like the subscriptions for time, like

Time.every, aren't implemented, but I would like to implement.

I think that's the basics of it.

I don't know if there's any other kind of niche packages, I guess, like the File and

Bytes APIs are something that I haven't looked at yet.

But those are things that like, kind of the internals are set up where a lot of that stuff

could be implemented.

You just have to think about, okay, what's the API of different error conditions that

people may want to simulate in tests?

What's the internal model that would represent this?

What do we want the error messages to look like?

So those are definitely things I'd love to have contributions for if someone needs those

features for testing whatever their application is.

So there's nothing really blocking those things from existing.

It's just more time and money, or more time.

Yeah, it's free.

I'd say there's one, like some of the things like, okay, keyboard focus is something that

ideally you'd want to be able to simulate, but the logic that browsers use to maintain

that is so complicated.

Yeah, I don't know if that could ever be safely done.

Maybe, like, that's something I'd love to be able to simulate here.

But it's also of, like, limited use.

Like that's the point where maybe you just have to track things manually and say, okay,

well, here's the state of the DOM.

Here's what's going to happen.

Let's like hack together a test that gets things into that state.

Or at some point, you should probably just use a better tool, like Cypress, a tool

more suited to the task.

Yeah, I mean, it seems like Elm program test is not intending

to be something that is actually performing HTTP requests, actually sending ports and

executing JavaScript.

And that's like, you know, by design, which, so sometimes, you know, it gets a

little confusing, but people use different terms for this.

I think of it as like end to end testing versus integration testing, where, you know, I

would consider Elm program test to be an integration testing framework.

It's not ever going to make an HTTP request, which is a feature or a bug depending on what

you're trying to do, right?

You have to pick the appropriate testing level and you have to keep that in mind, but

obviously it's going to be faster.

It's going to be more deterministic if you're not actually making HTTP requests, right?

So you need to have an awareness of what the tool is suited for and use it for those effects.

So, like, I wanted to make the API a little more concrete for people maybe.

So when it comes to simulating HTTP effects, essentially you have this sort

of program test data type.

If you think about the Elm language, it is not a dynamic language.

It is a static language.

It's not a language where you can go in and tweak the internals of something, reach in

and change global variables or, or that sort of thing.

So everything needs to be sort of injected rather than reached in and modified.

If it was a language like Ruby or JavaScript, a lot of these frameworks work by, you know,

monkey patching, you know, in Ruby you can like override existing classes and actually

reach in and modify them.

Elm doesn't work that way.

Monkey patching is not a thing in Elm language.

And so, but everything is pure functions.

And so the way that works in Elm is you pass things in explicitly

and effects are actually just a type of data.

And so when you're simulating HTTP effects, what you do is you have this program test.

So an Elm test case is just a single expectation ultimately under the hood.

And you know, it's just this one expectation type and a program test sort of builds up

this program test and you sort of chain on a sequence of things.

So it's inherently a sort of imperative flow that you're describing.

The user goes in and does this and does this and does this, but you're actually building

up a single expectation that describes that sequence of events.

So it's a single pipeline and you can chain on.

So you sort of initialize your program test and give it your update function, your view

function, your init function, all that it needs to sort of create its mini simulation

of the Elm runtime.

And then you can simulate HTTP requests and say you expect to get a particular HTTP request

and it's going to essentially mock that out.

Let me just like, as Elm programmers, we like to think about the data types.

So actually this program test type is pretty straightforward.

You can think of it as a result.

It's like either in an error state or in a success state.

In the error state, there's kind of a whole different set of different kinds of errors.

Like it's maybe we failed because we expected an HTTP request.

Here's the function that you were trying to call.

Here's this, here's the data of other requests that were in flight at the time.

Basically whatever's needed to produce a nice error message.

And in the success case, we've got the current model of your program.

And then we have some other state about the world of like what HTTP requests are in flight,

what delayed effects are you like scheduled in time?

What ports exist?

What outstanding like browser requests have been triggered?

So it's, it's really just a data structure like that.

And then the functions that you call like click button, all it does is it's going to

render the view based on the current state of the model.

Look for the button that you wanted to click, grab the message from that, call the update

So I guess the update and view functions are also stored in program test.

But yeah, if you like a lot of it, you can kind of understand if you think about that

as the data structure of what's in there.

It's just a result with your program state, the functions for your program and information

about external effects that are in flight.
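
To make that description concrete, here is a minimal sketch in Elm of a result-like test state and a click helper. It is not the real elm-program-test implementation: the names Sketch, Running, and Failed are made up for illustration, the view is simplified to a list of button labels and messages, and the real type also tracks things like in-flight HTTP requests.

```elm
module Sketch exposing (Sketch(..), clickButton)

-- Hypothetical, simplified stand-in for the ProgramTest type:
-- either a failure with a message, or the running program state.
-- The view here returns (label, message) pairs instead of real Html.


type Sketch model msg
    = Failed String
    | Running
        { model : model
        , update : msg -> model -> model
        , view : model -> List ( String, msg )
        }


-- Render the view, look for a button with the given label,
-- and feed its message to the update function.
clickButton : String -> Sketch model msg -> Sketch model msg
clickButton label sketch =
    case sketch of
        Failed reason ->
            -- Once a step has failed, later steps are skipped.
            Failed reason

        Running state ->
            case List.filter (\( text, _ ) -> text == label) (state.view state.model) of
                ( _, msg ) :: _ ->
                    Running { state | model = state.update msg state.model }

                [] ->
                    Failed ("No button found with text: " ++ label)
```

Because every step takes the state and returns the state, steps can be chained in a pipeline, and a failure early on simply flows through to the end.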

And there's a, there's one convention that maybe is like a little tidbit that might be

helpful to talk about for people.

There's a convention of like these ensure function calls and expect function calls.

So you can ensure that an HTTP request was made.

So there's program test dot ensure HTTP request, and then there's program test dot expect HTTP

You want to talk about that distinction?

This is one of the rough edges with the API here, because there's a school of testing

thought where you should have one expectation per test, and that should be the end of your

So those are the primary functions here, the functions that take your program test value

and return an expectation that basically ends your test.

However, when you're writing these high level things, a lot of times you want these intermediate

Like, oh, your first step in your test is to have the user log in.

Well, maybe you're not on a page that has the login form, so you need to fail there

with an error message.

So basically all of the expect functions have a copy, an ensure function, that basically

does the same thing, but it returns a program test so that you can do an intermediate assertion,

continue your test and get to the end.

I wish there was a way to unify those into a single check, but I haven't come up with

a good API design approach to do that.

It almost seems like it could be like a builder pattern where you just only have either, you

know, pick a keyword, ensure or expect, but only have the current behavior of ensure,

which means that you can keep chaining things on.

And then at the end, you say to expectation, and that takes the whole test scenario and

turns it into a single expectation, which is the same as currently what expect HTTP

request does versus ensure.

So you can write your tests in that way.

There's a ProgramTest.done, which is the final step.

But I didn't want that to be the recommended approach because I think the thought of like

having a single expectation is, in my opinion, the preferred way to think about your tests.

So I didn't want to like make that recommended way be the second class citizen in terms of

how the API was designed.
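
A small sketch of the distinction, assuming a made-up counter program (the Msg type, view, and update below are illustrative and not from the episode). ensureViewHas checks the view and hands back the program test so the pipeline can continue, while expectViewHas produces the final Expectation.

```elm
module CounterTest exposing (suite)

import Html exposing (Html)
import Html.Events exposing (onClick)
import ProgramTest
import Test exposing (Test, test)
import Test.Html.Selector exposing (text)


-- Hypothetical program under test: a counter with one button.
type Msg
    = Increment


view : Int -> Html Msg
view count =
    Html.div []
        [ Html.button [ onClick Increment ] [ Html.text "Increment" ]
        , Html.text ("Count: " ++ String.fromInt count)
        ]


update : Msg -> Int -> Int
update msg count =
    case msg of
        Increment ->
            count + 1


suite : Test
suite =
    test "clicking twice shows a count of 2" <|
        \() ->
            ProgramTest.createSandbox { init = 0, view = view, update = update }
                |> ProgramTest.start ()
                |> ProgramTest.clickButton "Increment"
                -- Intermediate assertion: returns a ProgramTest, so we can keep going.
                |> ProgramTest.ensureViewHas [ text "Count: 1" ]
                |> ProgramTest.clickButton "Increment"
                -- Final assertion: returns an Expectation and ends the test.
                |> ProgramTest.expectViewHas [ text "Count: 2" ]
```

The alternative style mentioned above would use ensureViewHas for both checks and finish the pipeline with ProgramTest.done.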

But then, so like if you wanted to, oh, okay.

So there are also like simulate forms.

So there's like ensure HTTP request and that's sort of making an expectation and continuing

on kind of what you're recommending against.

You're saying maybe this is a smell and if you have too many ensures, maybe think about

the way you're splitting up your test cases.

If you're asserting too much.

Yeah, I'm kind of on the fence about that because I think like we were saying earlier,

you don't want a ton of these high level tests to check every edge case.

You kind of want to go through your happy path, which is going to do a lot of things.

So you tend to want to assert things along the way.

Having intermediate assertions also helps you with debugging when something goes wrong.

You can kind of tell what step failed.

So yeah, that's something I wish I had a better answer for that.

But unfortunately you kind of have to learn that thing that's specific to Elm program test.

If you have expect and ensure, they're basically the same thing that you use throughout your

I mean, in a way there are implicit expectations in certain things.

If you say click button with text, then you're asserting that there's a button with that

text and it will fail, right?

You don't have to say expect.

So that in your book wouldn't count as like too many assertions in the test.

In fact, yeah, those are things where Elm program test is helping you get a nicer error message

when something goes wrong.

So actually like I think you could probably realistically use a rule saying that your

normal tests should never use the ensure functions.

But if you're writing application specific helper functions for your tests, like you're

writing a helper function that's like go to the login form, you can use ensure functions in

that helper function, but then your actual test will just call that login thing.

It's essentially a higher level version of the click button type helper.

And then your actual test only has expect at the end.
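
A sketch of what such an application-specific helper might look like. The helper name logIn, the field ids, and the label texts are all hypothetical; they would need to match whatever the real login form uses.

```elm
module LoginHelpers exposing (logIn)

import ProgramTest exposing (ProgramTest)
import Test.Html.Selector exposing (text)


-- Hypothetical helper: the ensure call lives here, so the tests that
-- use this helper can keep a single expect at the end.
logIn : String -> String -> ProgramTest model msg effect -> ProgramTest model msg effect
logIn email password programTest =
    programTest
        -- Fail early with a clear message if we are not on the login page.
        |> ProgramTest.ensureViewHas [ text "Log in" ]
        -- fillIn takes the field id, the label text, and the new value.
        |> ProgramTest.fillIn "email" "Email" email
        |> ProgramTest.fillIn "password" "Password" password
        |> ProgramTest.clickButton "Log in"
```

A test would then read as start, logIn, and one final expect function, with the intermediate check hidden inside the helper.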

That's very cool.

Tell me if you want to work on that rule with me.

Because that could be an elm-review rule.

That sounds great.

Yeah, so just to reiterate what you're saying here, like when you say click text and we're

saying there's an inherent expectation in clicking the text because you're asserting

that that text exists on the page, you're saying that, if that didn't

exist as a building block, you could create your own building block like that where

you say log user in.

And if there's no login button, you could create the validation that that exists in

your own helpers and use ensure in that context.

And for instance, the API of Elm program test actually enforces certain best practices

like it has a click button function that you can call, but it can only click things that

are actually buttons or that are divs that have accessibility attributes that indicate

So if for whatever reason, your program is just a bunch of divs with on click handlers,

the built in click button function in Elm program test is not going to work.

I could make it work, but I've chosen not to.

But you can always write your own helper function that uses the like, find DOM element and do

this with it to kind of build your own helper functions that are needed to do whatever you

want that your app needs.
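
As a rough sketch of that kind of custom helper, the lower-level simulateDomEvent function can target any element found with a Test.Html query. The helper name clickElementWithId is made up for illustration.

```elm
module TestHelpers exposing (clickElementWithId)

import ProgramTest exposing (ProgramTest)
import Test.Html.Event as Event
import Test.Html.Query as Query
import Test.Html.Selector as Selector


-- Hypothetical helper: simulate a click on any element with the given id,
-- whether or not it is a real button.
clickElementWithId : String -> ProgramTest model msg effect -> ProgramTest model msg effect
clickElementWithId elementId =
    ProgramTest.simulateDomEvent
        (Query.find [ Selector.id elementId ])
        Event.click
```

This bypasses the accessibility checks that the built-in click button function performs, which is the trade-off being described.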

Yeah, that's an amazing feature.

So I guess we should maybe talk about some more of the rough edges.

So program test, I think the big one, which is kind of the most unfortunately, the most

complicated thing is if you're testing a program where you want to test these external

effects and simulate HTTP responses, or simulate interacting with your JavaScript ports, things

like that, there's a limitation in Elm at the moment where there's these data types

command and subscription, or sub, that in theory, like, theoretically, those things just are

a piece of data that represents some effect that the Elm runtime is going to perform on

So in Elm program test, I need to take a command that your program produces and know was there

an HTTP request made in that command and all the various different commands.

So unfortunately, that's not possible in Elm at the moment.

So well, okay, so I know that Evan has had a concern.

And this is a similar reason to why the HTML type is not directly inspectable or destructurable

is that Evan's been conservative about allowing for those kinds of destructurings in an attempt

to prevent packages getting published that do extremely complicated or like not explicit

So I think it's about the explicitness of Elm.

So there's no internal technical reason they couldn't be inspectable.

And I think it's something that like, it's not even a decision that they should never

be inspectable.

But it's that in production code, I think specifically, Evan has wanted to avoid allowing

the use of packages that can like take a command and transform it into some other command,

which could allow for things like oh, sending all of your HTTP requests, if you're using

this caching package, it transforms all your HTTP requests to go to some other server and

redirect all your private information to a man in the middle attack or something like

So I think like, my understanding is it's something that Evan would be open to and does

make sense, but only in the context of testing, and he would not want an API that's usable

in production code to be able to inspect at that level.

And you couldn't do a test that says this HTTP request, so HTTP.get with a URL, and

then say expect equals to something else?

In general, no, because so HTTP specifically, there are functions in that HTTP request type.

And most commands in fact, do have functions, like even your port commands, which are the simplest

ones, there's a function that takes the port value and turns it into a message.

Yeah, functions pretty much everywhere.

I guess functions can be equivalent by reference, but it gets to the area where it's like not

entirely reliable.

I believe Elm will, if you say function one double equals function two, if it's the exact

same reference, then it will equal true.

And otherwise, it will crash the program.

It'll give a runtime exception.

I think that's one of the things that they want to change in Elm 0.20.

Not getting those crashes when you compare functions or JSON regexes.

Well, the regex one has been fixed, but I think the idea maybe would be to somehow disallow

comparison of functions or something.

So we should talk a little bit about the effect pattern and how it relates to Elm program

Yeah, that's where we're going with that is to work around that is that I've had to, you

basically have to refactor your program first to define your own data type that represents

the effects you're going to produce.

Then you have in production, a function that turns those effects into commands.

And in the test side, you have basically a parallel function that does the exact same

thing but it just returns a different type, which is this simulated effect type that's

specific to Elm program test.

So like, HTTP dot get is not inspectable.

But if you had your own, you know, version of simulating an effect, so you have, like,

you click a button and it says, you know, that button fetches the latest todo items.

So you have like an effect, fetch todos.

So instead of that just being a command, HTTP dot get slash todos dot JSON or something,

it's going to be an effect called fetch todos.

Your update function is going to be wrapped in a little helper, and the update function

you write directly is going to be returning a model comma effect.

And then you're going to have to translate that effect in your main production code into

HTTP dot get slash todos for the fetch todos effect.

So it's just a custom type: type effects equals fetch todos and then all the other possible

But then in the test one, you write a translator that turns that effect not into HTTP dot

get but into simulated effect dot HTTP dot get.

So it's a drop in replacement except instead of importing HTTP, you're importing simulated

effect dot HTTP as HTTP.

Otherwise it's the same API, but it gives you something that you can inspect in Elm program test.

And unfortunately, the HTTP package has a similar restriction where you can't inspect

the HTTP body type or the HTTP expect type that's used for parsing.

Maybe you can inspect body.

But anyway, there's kind of this chain of things where, if you have an HTTP

expect type that like represents the decoder and all of that, you can't actually use that

for anything directly.

So that's another thing that I have to have a parallel version that's used in the test.
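
A minimal sketch of the effect pattern as described so far, shown in one module for brevity. The names Effect, FetchTodos, perform, and simulate, the Msg type, and the /todos.json URL are assumptions for illustration; in a real project the production interpreter and the test interpreter would usually live in separate modules.

```elm
module TodoEffects exposing (Effect(..), Msg(..), perform, simulate)

import Http
import Json.Decode as Decode
import ProgramTest exposing (SimulatedEffect)
import SimulatedEffect.Cmd
import SimulatedEffect.Http


type Msg
    = GotTodos (Result Http.Error (List String))


-- Hypothetical application-specific effect type returned by update.
type Effect
    = NoEffect
    | FetchTodos


todosDecoder : Decode.Decoder (List String)
todosDecoder =
    Decode.list Decode.string


-- Production interpreter: turns an Effect into a real command.
perform : Effect -> Cmd Msg
perform effect =
    case effect of
        NoEffect ->
            Cmd.none

        FetchTodos ->
            Http.get
                { url = "/todos.json"
                , expect = Http.expectJson GotTodos todosDecoder
                }


-- Test interpreter: the same shape, but built from the SimulatedEffect
-- modules so that elm-program-test can inspect the result.
simulate : Effect -> SimulatedEffect Msg
simulate effect =
    case effect of
        NoEffect ->
            SimulatedEffect.Cmd.none

        FetchTodos ->
            SimulatedEffect.Http.get
                { url = "/todos.json"
                , expect = SimulatedEffect.Http.expectJson GotTodos todosDecoder
                }
```

In a test, simulate would be passed to ProgramTest.withSimulatedEffects before calling ProgramTest.start, after which functions like ProgramTest.simulateHttpOk can answer the pending request.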

So just to run through how I think about doing that in the least annoying way possible is

you have to look at the function that you're calling in the real world that

is producing the command.

So in your example, HTTP dot get returns a command.

So your effect type should really just take the parameters to that.

So a little different than what you were saying you were saying you could have a fetch to

I think it's actually better to keep your effect type more generic where it's just representing

the functions that are going to get called.

So I'd recommend, like, it doesn't reflect the message type.

It's more akin to the command type, but it's a level of indirection.

So you can translate it into a simulated or a real command.

So like get takes the URL as a string.

It takes the HTTP expect, which again, we have to fake.

So then to create your expect, there's like the JSON body or expect JSON or whatever.

So the parameters to that are what you'd stick in your effect type.

So then on the real side, you just call HTTP get with the parameters and build it up.

And then the simulated version should look exactly the same because there's a whole bunch

of like, there's simulated effect dot HTTP, which is a module.

It has exactly the same API as the HTTP module.

It just returns simulated effects instead of commands.

So it's basically a bunch of boilerplate.

That's pretty annoying and takes up a whole lot of documentation of Elm program test.

But theoretically, it's possible to remove that limitation.

It's kind of like a detailed project.

If anyone out there is looking to help with this, ideally, Elm program test shouldn't

need any of this, it should just be able to inspect your commands and read the data without

you having to do anything to your program to be able to start using it.

That would be a game changer.

Yeah, actually, in Elm 0.18,

I prototyped a package that could do that and it needs kind of some integration with

the test runner itself to swap in test-only JavaScript to make that possible.

But ideally, it should happen.

It just hasn't had a chance to get implemented yet.

I'm wondering whether the effect pattern still has some merits if you're able to do that,

or if it's just useless boilerplate when we get to that point.

Yeah, I think like this is, it's a question that I've seen asked more on the Haskell side

Where in Haskell, there's a type called IO, which basically is similar to Elm's command,

where it's saying that this can have external effects of any type.

But then if you're really into the strong typing and limiting the scope, letting your

types limit the scope of what functions can do, then you look at saying I want a function

that like can't do everything, but it's allowed to talk to the database, let's

say, and maybe it's allowed to send log information, but it can't do anything

else like it can't read and write from the terminal console.

It can't, I don't know what other effects there are, it can't like sleep the computer or call the

halt command, whatever.

So in that environment, there's a lot of conversation about the best way to model that.

And this effect approach is one style of that where you can have a data type that represents

just the limited set of things that are allowed, you can end up with a whole bunch of different

types that represent different contexts that are allowed.

So you could do a similar thing in Elm, which personally, I think has an occasional use

where you maybe have some complicated UI component that wants to like ask for certain things

like maybe it needs to trigger a focus event somewhere else in the DOM.

But you don't want to allow it to make HTTP requests.

So you could do something like that where that module defines a type of things that

the parent component should do on its behalf.

And that makes sense.

But I think as a general, like, I think it makes sense in specific cases where you have

a component that like very clearly needs that responsibility.

But as a general pattern, I think it just gets overly complicated.

And in Elm program test, I would avoid even using that pattern at all if it were possible

Yeah, that makes sense.

I mean, it could potentially have the same benefit of, you know, how type signatures

in Elm allow you to reason about what's going on.

Is this changing the whole model?

Is that even possible?

Or is it narrowed down into, oh, this can only change this one data type.

So if I'm looking at this one piece of the model changing, I can narrow my focus and

ignore these areas of code.

You know, you can use the effect pattern to do similar things.

But it's so, as you're saying, it's sort of so heavyweight to do that, that it might not

be worth the benefit because managed effects in Elm are already so cleanly isolated that

it's in a pretty good state as it is.

I've thought about using a pattern like this for a sort of plugin architecture for authors

creating packages for Elm pages, because if you're performing a static HTTP request or,

you know, things like that, if you set up a package that generates an RSS feed, it would

be nice to know, is it allowed to make HTTP requests or not and be able to explicitly

give it permission for what it can do.

There are other ways to achieve that effect, though.

Like you could just not perform any requests that it performs except a particular wrapped

type and you have to pass in the request it can perform.

So you could pass in a reference to HTTP.

And anyway, there are other ways to achieve that effect.

And one other use of that pattern is if you need to process those effects in different

ways in different contexts.

So an example would maybe you have some kind of like, admin tool that lets you like, configure

And actually, we did this at NoRedInk, we had an assignment that students could do.

And normally it would like, you know, send data to the back end, here's the work they've

done, get data from other students.

But we also wanted a preview mode where teachers could play around with the assignment, and

also like simulate other students doing work that would like get sent to you and things

So that was a case where having an effect type as data was useful, because we can interpret

those effects in different ways in the real mode for the students doing the work, we would

send the HTTP requests in the preview mode, we have a different interpreter that basically

like uses a fake preview data structure.

And we can trigger different effects from like an extra panel of buttons for the teacher

So that's a case where you need this pattern, but also a relatively rare one, like

occasionally, you'll be doing something where you need that capability, but not often.

I can imagine like wanting to have like security audit log or something of all of the particular

effects that have been performed.

And so you could have a loggable chain of those things and be able to trace them.

That's a really interesting thing to think about.

So okay, I want to loop back and talk a little bit more about this thing

you brought up with the effect pattern with Elm program test where you're describing using

the effects type that you define as a sort of equivalent of a command.

So you've got like type effects equals and then you can have no effect, which would be

the equivalent of command dot none.

And then you can have sort of instead of get todos, that specific message to make that

HTTP request for getting todos, you're saying you could have a more general one.

And so you can have like a get data effect that takes a URL as part of its payload

of that constructor and it takes a decoder.

So there's an example that we'll link to in the show notes from the Elm program test examples

folder that sort of shows effects using HTTP simulations.

But how do you avoid having type variables in your effect type or do you just have

to bite the bullet and have a get data for every specific data type that gets returned?

Yeah, because right, you need different like in one case, maybe you have one request where

you have a decoder that decodes users and in another request, you have a decoder that

decodes account information or, yeah, todo items.

Yeah, so I need to make a good example.

The example we'll link to is kind of written as the simple way to do things and straightforward.

So there's not a ton of discussion, because the HTTP API in Elm is unfortunately a bit

tedious to work with.

There's a whole bunch of different types and like if you're making tasks

versus commands, there's like the expect type, but then there's also

some other type that I'm forgetting the name of.

So anyway, there is a way to deal with that.

The way you end up doing it is the effect type would have for instance, like an HTTP

request constructor, right, it would have the payload that it needs, have the headers, have

the method, and the way you end up doing it is you have a decoder that decodes messages.

And then you also have a separate function that takes an HTTP error and returns a message.

And then you also have the body that takes a JSON value.

So you've basically hidden all of your request specific types behind JSON and behind message.

So that's tedious to call directly, but then you make a helper function that basically

does take the decoder of type a, the function of type a to message or alternatively the

function of result HTTP error a to message, and then it builds the actual parameters to

the effect based on those and it would be like composing the decoder with the function

that takes the a value and turns it into a message and ends up just storing the decoder

of message in the effect type.

So that's the trick you can use to essentially hide things that like in Haskell,

you could deal with that with rank two types and things like that.

But in Elm, you have to have a helper function that has those extra parameters, but then

collapses them and returns something that doesn't care about those parameters anymore.
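
A sketch of that trick, under the assumption that the effect type is parameterized by the message type. The constructor and field names (HttpRequest, onError, decoder) and the helper name get are hypothetical, and headers are left out to keep it short.

```elm
module Effect exposing (Effect(..), get)

import Http
import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode


-- The effect only ever mentions JSON and msg, so it needs no extra
-- type variable for each kind of response that gets decoded.
type Effect msg
    = NoEffect
    | HttpRequest
        { method : String
        , url : String
        , body : Encode.Value
        , onError : Http.Error -> msg
        , decoder : Decoder msg
        }


-- Helper with the extra type variable `a`. It composes the decoder with
-- the message constructor, so `a` disappears from the resulting Effect.
get :
    { url : String
    , decoder : Decoder a
    , onResult : Result Http.Error a -> msg
    }
    -> Effect msg
get config =
    HttpRequest
        { method = "GET"
        , url = config.url
        , body = Encode.null
        , onError = \error -> config.onResult (Err error)
        , decoder = Decode.map (\value -> config.onResult (Ok value)) config.decoder
        }
```

Call sites would use the helper with a typed decoder and a message constructor, while the production and simulated interpreters only have to handle the single HttpRequest shape.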

Yeah, I have an example that I can link to.

I don't think there's an example of this in the Elm program test examples at the

moment, but I've got an example that I can link to that uses that pattern that you just

So yeah, that's sort of the conclusion I came to as well, that essentially the trick

is that you're using instead of the specific types you're decoding to, you're decoding

everything to a JSON decode value, and then you're having a function that takes that

JSON decode value and turns it into a message.

Hard to wrap your brain around, but yeah.

And finally, it's like way more tedious than I would like because the HTTP API just has

a kind of convoluted way of dealing with errors, depending on your exact call.

So but again, this is something that would just completely go away once the work to like

directly work with commands under test is done.

So yeah, again, if anyone wants to help, get in touch. It would be incredibly easier to finish

that work than to figure out all the stuff we were just talking about, about how to deal

with the current API in your tests.

So we've gone pretty in depth on the effect pattern, why that's needed.

We've talked about simulating effects and the APIs for that.

We've talked about the testing pyramid, when to use unit tests versus high level tests.

Let's sort of do a quick round of best practices and tips and tricks to, you know, using Elm

program test effectively.

So one thing we didn't, we didn't touch on a lot, but kind of a huge amount of work is

behind the scenes in Elm program test is the types of error messages that it can report about

So for instance, if you say click a button with this label, it will look for buttons

in a whole bunch of different ways.

Like it'll check for accessibility tags, it'll check for things that are actual buttons,

it'll check for a button that has an image with alt text in it of that text.

So I don't know if this is a best practice so much as just the best practice is to use

Elm program test.

It can help encourage you to write code in an accessible way.

Like another thing it will do is verify that you have a label for your checkbox.

If you want to click a checkbox and make sure that the label is hooked up in a way that

actually works, which there's like three different ways to do it, but you can also make mistakes

and have it not actually work in the browser.

Yeah, no, those are great tips.

I mean, I think those are good debugging tips for just how to understand if things aren't

wired up, what should you be learning more about or double checking that you've done

Do you have any opinion on how to select tags or elements?

So you say, select them by label, select them by text, but I know that a lot of people like

to use end to end test IDs.

So a specific attribute, for instance, at Humio, we call it end to end dash ID, and

we only use those for tests.

So I think like the goal of Elm program test is to avoid needing those things in a lot

of cases, specifically like with buttons.

Elm program test is smart enough to be able to search for the labels.

And there's like five different ways a button could get label text.

And Elm program test does that in a reusable way.

So I think often a lot of the reason that people add those IDs is because the testing

framework they're using isn't smart enough to find the thing by the label, all the different

ways that labels can be attached.

So it's just easier to start attaching IDs everywhere.

So if you're using Elm program test, I think a lot of those cases, you won't need those

IDs because Elm program test is smart enough to find what you mean by some user viewable

information that's on the page.

However, there are some limitations to that.

I think one reason that I heard also is that people don't want to have their test fail

when the text of the button changes.

You consider that as part of the spec from what I'm hearing?

I mean, I would say the approach I'd recommend in Elm program test if you wanted that type

of safety is to define some constants somewhere that have the text.

And then you can refer to that both in your test and in the real code.

Because you are sharing code between your...

It's just Elm code between the tests and the code.

I'd say like that, the reason for that is because of my goal of wanting the tests to

read like something that a user could understand, or maybe someone understands what HTTP requests

are, but they can read it and see like, oh yeah, we're clicking the go back button or

So that if you look at it, basically the scenario that in the workflow that you're testing is

clearly visible and readable as opposed to being hidden behind button IDs.

However, like if someone really wanted to have an ID focused thing, they could make a

test module of helper functions that do that.

Like there's some lower level things in Elm program test where you could implement the

set of helper functions that you want for your application if the ones that Elm program

test provides have some preferences that you don't really want.

But I think in general, like I'd like to see more focus on thinking from the user perspective

and haven't in practice seen a lot of issues with text labels changing where it was hard

You just go and change it or extract a variable or a constant if that's something that you

have happening a lot.

So for the, we talked about these sort of effect handlers that are taking your custom

type representing the effects in your application and turning it into a real command and a simulated

command or a simulated effect.

How do we keep those in sync?

Because so like one thing that I think can be helpful is, you know, kind of like I hinted

at earlier, import simulated effect.http as HTTP.

If you import it with that import alias, then the handler function in your test code and

your production code will look exactly the same.

Yeah, that's exactly right.

And all the simulated effect dot whatever modules in Elm program test, if you look in the

docs, it says, this is meant to be an exact parallel of this real module.

So that's really the key as we talked about briefly earlier, the constructors in your

custom effect type should basically, like, ideally just be the parameters that you're

directly going to pass to the function that produces the command.

It almost seems like that could be an opportunity in people's like build or test setup scripts

to just, I don't know, have certain modules where you define your real effect handlers.

And then you could just derive it from a pretty simple like copy paste of that module and

then amend the imports of HTTP to import simulated effect dot HTTP as HTTP.

So that could be an interesting thing to explore too.

Yes, but we're starting to get to the level of effort where personally, like, if you're going

to invest in doing that work for your application,

I'd rather you get in touch with me and work out a plan where we can actually get rid of

the need for this effect type completely.

That'd be so great.

I get the feeling that you want some contributions.

Yeah, well, yeah, there's just a whole handful of things that I can see the potential for

the API of Elm program test to be even nicer than it is and much easier to understand.

So yeah, there's a couple big projects that if folks are interested in, get in touch with

me although it would require a bit of commitment to kind of work through some design issues

and get this implemented.

So getting rid of the need for that effect wrapper is the big one.

Adding support for more commands that aren't currently represented as simulated effects,

like keyboard focus, the viewport scrolling, things like that.

If folks are interested in helping with that, I'd love to hear from them.

And it seems like you track these as issues in the GitHub.

Is that the place to look for them?

Certainly if there's something that's missing, either if you are going to work on contributing

to it or not, open a GitHub issue if there isn't one already.

But also not everything is in there.

Some of the bigger projects I haven't tracked there yet, not because I don't want them there

just because I don't have enough clear information that I want to put down yet.

But yeah, another big one is that the Test.Html module that's part of the Elm

test package has some missing features and isn't really composable in a nice way.

So there's some improvements there that could allow the simplification of some of the Elm

program test stuff.

Like currently, if you click a link, you have to both provide the text of the link and the

URL that it's supposed to be.

But theoretically, Elm program test should be able to get the URL

from the virtual DOM.

But it's just not possible the way that Test.Html is implemented right now.
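
To make the limitation concrete, here is a minimal sketch of a test that follows a link. The module name LinkTest, the Msg type, the view, and the example.com base URL are all made up for illustration; the point is only that ProgramTest.clickLink takes both the link text and the href it is expected to have.

```elm
module LinkTest exposing (suite)

import Html exposing (Html)
import Html.Attributes exposing (href)
import ProgramTest
import Test exposing (Test, test)


{-| Hypothetical message type; this tiny page never sends one.
-}
type Msg
    = NoOp


{-| A page containing a single link.
-}
view : () -> Html Msg
view _ =
    Html.a [ href "/settings" ] [ Html.text "Settings" ]


suite : Test
suite =
    test "following the Settings link leaves the page" <|
        \() ->
            ProgramTest.createSandbox
                { init = ()
                , update = \_ model -> model
                , view = view
                }
                -- A base URL is needed so the relative href can be resolved.
                |> ProgramTest.withBaseUrl "https://example.com/"
                |> ProgramTest.start ()
                -- Both the link text and the expected href must be given.
                |> ProgramTest.clickLink "Settings" "/settings"
                |> ProgramTest.expectPageChange "https://example.com/settings"
```

The second argument to clickLink is the piece that could, in principle, be read from the rendered view if the HTML querying API allowed it.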

So that's another kind of big project.

But if anyone's excited about it, I'd love to chat about it.

And small stuff, like you were interested in this morning: if you have any improvements

to the documentation, or even just issues about some part of the documentation that

is unclear, file an issue about it or make a PR with improvements.

If anyone's been using Elm program test, or is going through the process of learning it

and is inspired to write a blog post, that would be really useful because I've put a

lot of work into the documentation, but also I wrote it so I don't really have that perspective

of someone coming to try to use it.

Some more examples of that would be great to have.

So let's give people some resources to get started here.

So you mentioned the docs that you wrote, which are very thorough and well worth a read.

Even if another blog post would be helpful, they've got a lot of really good information.

So check out the guidebook, which is in the show notes, and just the Elm documentation

for the package itself.

Also be sure to, you know, there's a lot going on in the HTML test assertion helpers in the

Elm test package.

So we've got a link to that in the show notes as well.

Definitely check out that and familiarize yourself with the API.

And any other resources?

Is the Elm test Slack channel a good place to ask questions or discuss that?

Yeah, I think that actually is a good place.

Is it the testing channel?

Oh, yeah, just called test.

It's called #testing in the Elm Slack.

On the docs, I did want to mention, too, big thanks to some of my former NoRedInk

colleagues, specifically Katie Hughes and Michael Hadley and Brooke Angel, who helped

review the documentation, give suggestions, helped me improve that when I did the big

And Tessa Kelly also gave a really great talk about writing testable Elm.

She talked quite a bit about using Elm program test and some of the accessibility features

that you were discussing, too.

So that's worth a watch as well.

Well, I mean, there's so much more we could get into, so many great details to talk about.

But this was a lot of fun.

Thank you so much for chatting with us, Aaron.

Yeah, thanks for having me.

And until next time, talk to you later.