We discuss the fundamentals of test-driven development, and the testing tools in the Elm ecosystem.
August 10, 2020

elm-test Basics

TDD Principles

Fuzz Testing

When to Use Types or Tests

Should you test implementation details?

Higher-Level Testing in Elm

elm-program-test

Martin Janiczek's elm Europe talk on testing Msgs with ArchitectureTest


Hello, Jeroen. Hello, Dillon. I really hope we meet people's expectations with today's episode. That gets us off to a good start. You want to tell people the episode topic so they can appreciate how bad that pun was just now? Yeah, well, I think they got it from the episode title, but today we're going to talk about elm-test, which is dealing with expectations. Or do you want me to explain the pun? It's always funnier when you explain the joke. If there's one thing I've learned: always explain the joke. Okay, so you said something about meeting people's expectations, and in elm-test, you write expectations for a test to pass. See, if people weren't on the floor laughing the first time, now they're definitely laughing really hard. You're welcome. Okay, so let's quickly move on from that and give
people a little introduction to elm-test. So, for somebody who has never encountered it, what the heck is it? How do you even get set up with it in the first place? Yeah, so elm-test is the official library and tool to write tests in Elm and for Elm code. So you have two parts to it. You have one part that is the library, which is elm-explorations/test. And there's the CLI, which runs on Node. So to install it, well, the name is elm-test, and to install it you do npm install elm-test. That's pretty much it. Oh, and then actually, there's an elm-test init command, which is kind of helpful for setting things up, because there's a section in your elm.json which has test dependencies, and it needs to make sure it installs the proper version of elm-explorations/test, the package that allows you to make assertions and declare test cases. Yeah, so you have your production code, your source code. And next to that, you usually have a tests folder in which you put all the tests. And elm-test is going to read all the tests, find all the tests that are available in that folder, and, with a little bit of magic, build something using the test dependencies that are in your elm.json, and run them. Yeah, it's kind of hard to
wrap your brain around it at first. But what it does is it looks for any exposed values of the type Test in your tests folder. You could expose multiple top-level values of type Test in a module, and it will run those. Well, you create a list of tests with a describe. A describe is just a function that takes the title for the whole group and a list of tests. Yep, exactly. And really, what a test is in Elm is you just invoke one of the Expect helper functions, for example Expect.equal, and that takes two values. So if you say Expect.equal 1 2, it's going to say, I expected, let's see, actually, usually you pipe it, so it's hard to think about which one is expected and which is actual. If you say 1 |> Expect.equal 2, it's going to say, I got one, but I expected it to be two. Yeah, exactly. So ultimately, that's what an Elm test case is. It's literally just an expectation. So one of the things
that's really beautiful about the design of Elm is the fact that it's so testable. I think we might have talked about this on another episode, but people bend over backwards in other languages trying to mock all of these side effects, trying to assert on all of these impure things, and trying to make non-deterministic things deterministic. So you have some code that depends on time. You have some environment stuff that's picking up and reading global variables and things like that. And you have to do all of this setup. And some of it's implicit, some of it's required, some of it's not. You're not quite sure which of the pieces are required or not. You're not sure which pieces are deterministic or not. You might get tests randomly failing on the first of the month. With Elm, you don't have any of those problems. And I can't tell
you how, I mean, I've got a background in doing agile technical coaching. So I've spent
a good deal of time trying to sort of coach people on some of these best practices around
testing. And so many of these are just built into the Elm language because of the purity,
because testing is so natural when it's just input output. And so that's all an Elm test
is. It's like you have this input, you get this output, and then state a few things that
you expected about that output. That's all it is. And it's so nice to test in Elm. So
please do it. Please write tests in Elm.
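To make that concrete, here is a sketch of what a minimal test module might look like (the module name, test names, and function under test are just illustrative), using describe, test, and Expect.equal as discussed:

```elm
module ExampleTest exposing (suite)

import Expect
import Test exposing (Test, describe, test)

-- A minimal test module: expose a top-level value of type Test,
-- and elm-test will find it in the tests/ folder and run it.
suite : Test
suite =
    describe "String.reverse"
        [ test "reverses a known string" <|
            \() ->
                String.reverse "hello"
                    |> Expect.equal "olleh"
        ]
```

Running the elm-test CLI from the project root would pick this up, assuming the file lives in the tests/ folder.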
That is actually one of the things that is always quite hard with tests: the setting up. When you have complex functions and complex environments, to get the thing into a testable state, you need a lot of techniques, inversion of control, mocking, to get it testable. And in Elm, since you're only dealing with pure functions, a lot of those problems go away directly.
Yes. Yeah. And on the topic of testable design, I think this is one of the things that somebody who is newer to test driven development might not be familiar with, but I think it's a really important concept. Actually, many people consider test driven development to be a design practice rather than a testing practice. Because what test driven development does, when you're writing a unit test before you write the code that makes that unit test pass, is you're being deliberate about what you want the design to look like, rather than letting the implementation guide the design. You're letting your intentionality guide the design. And by definition, you're making it testable, because you're writing it in a way that's nice to test and nice to use before you write the implementation. If you write the implementation first, what's going to tend to happen is it's going to be a design that's very difficult to test. And there's this inherent quality with testable design that it's decoupled. Similarly, if you have not test-driven your design, it tends to get coupled. So test driven development is a good technique for designing your code in a way that is nicely decoupled and nice to maintain. Independent of the tests, it just tends to be nicer-designed code.
Definitely. So as you just said, when you do TDD, you use the functions, the library, before you've finished designing it. And when you use it, that's when you find the flaws in your API design. So if you try to make the API look good after the fact, it won't get you the same result as designing it over the course of the writing.
Yeah. So before we go into too many more details of the specifics of elm-test, maybe let's introduce the basic concepts of test driven development. For example, red, green, refactor. Red, green, refactor is just a very liberating way to work, actually, because it spreads out a large, difficult task over many small individual steps. And so I find it a lot more enjoyable to work in that style, rather than trying to get everything working at once, which is very overwhelming and kind of gives you this analysis paralysis. Test driven development allows you to say, how do I get this one case working? You know, if I'm doing something that transforms a list in some way, what if I give it an empty list?
You know, does it give me the correct result for an empty list? And then you write a test case that does that, and you make it pass. And in the process, you get several things wired up. And so it's actually moving you forward, where you're making sure that you get the wiring right, because you've got a fully working thing at each step. So the red part
is you write a failing test: you have an assertion that fails. That's the red part, the first part. And in Elm, a compiler error could be part of that failure
process, right? So you may write a red test where there's a compilation error; you fix that, then there's an expectation failure, and then you fix that. And that's okay. But the important thing is that you're following the failures you're getting when you run your test. So you write an assertion, you run the test, it prints out the current problem, which may be from the Elm compiler saying it doesn't compile. But you're fixing what running the test tells you you need to fix. And that's the test driven part. So that's the red part.
The green part is you fake it till you make it. You do the simplest thing you could possibly do to get it green. Usually the stupidest solution is the good one. If your input is like multiply six by seven, just hard code 42. It's stupid, but it works. And to make sure that doesn't stay, you write a different test later that says multiply two by three, and then 42 will not be the correct answer for that one. So there you have to generalize.
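As a sketch of that fake-it-till-you-make-it flow (the multiply function and test names here are illustrative): the first test could have been passed by hard-coding 42, and the second test forces the generalization:

```elm
module MultiplyTest exposing (suite)

import Expect
import Test exposing (Test, describe, test)

-- Green for the first test could have been `multiply _ _ = 42`;
-- the second test case forces this generalized implementation.
multiply : Int -> Int -> Int
multiply a b =
    a * b

suite : Test
suite =
    describe "multiply"
        [ test "six times seven is forty-two" <|
            \() -> multiply 6 7 |> Expect.equal 42
        , test "two times three is six" <|
            \() -> multiply 2 3 |> Expect.equal 6
        ]
```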
Yeah, exactly. There are so many really elegant principles at play here that I find are just very useful ideas in programming, like YAGNI: you ain't gonna need it. So that's the idea that you think you're gonna need to handle this case, you think you're gonna need this functionality, but you write it when you do need it, when you have concrete evidence that you need it. You know, in the case of building out a product, that concrete evidence might be that you observe users using it and see that they're running into a problem, or you get feedback from them. In the case of a test, the you-ain't-gonna-need-it part, you prove that you need it by writing a test. And then you need that design, but you don't design things in anticipation of, I'm gonna need to generalize this. This design discipline of fake it till you make it, doing the stupidest thing that could possibly work, or as some people like to say, the simplest thing that could possibly work, keeps you honest about not over-designing and anticipating what you're going to need. And also, our brains are much better equipped to solve one case than to come up with the generalized solution to something. It's so much easier to just think about one case at a time. So I hope that breaks it down.
So we got the red, we got green, what is a refactor?
The refactor, that's the one that people forget to do a lot of the time. Refactoring is, well, quite delightful to do in Elm. The refactor step is: when the tests are green, refactor. Now, I've been thinking a lot lately about refactoring in Elm and refactoring tools. You know, I've been doing some work on making some contributions to IntelliJ Elm for some automated refactorings. I know you're also thinking about these types of things with elm-review.
Yep, definitely.
The idea with the refactor step in TDD is now you have a test that demonstrates that the behavior you want is happening. So now you can safely refactor. One of the things in TDD also is that you don't want to start refactoring when you're in a red state, because then you make a refactoring and you don't have that green to tell you everything went well. Yeah. So if you started refactoring, either set aside the new test or stash whatever refactoring you were attempting. Just do one or the other, not both at the same time.
That's a great tip. Yeah, I find myself in practice using that technique quite a bit. You're writing a new test and then you're like, wait a minute, I really need to refactor this thing to build the thing that the new test case wants me to build. And as you say, you comment out the new test, you skip it, but now you're back at a green state and you can refactor all you want. And this is another really great principle of test driven development, which is: make the change easy, then make the easy change.
How is that related to testing?
Because it's part of the refactor step. So you want as much as possible to rely on it. So sometimes that means that you have a new test case. So red, green, refactor: we kind of introduced the different steps, but you iterate on that cycle, adding new cases. You do red, green, refactor, and you can refactor anytime it's green. It's really more of a state machine than a discrete sequence, right? Yeah, then you add a new test case as needed. When there's more behavior you need to add, you add a test case to prove you need that new behavior handled. Yeah. And then, as you say, you find that, I really need to generalize this piece of code, or extract this function, or add a parameter here, or whatever it might be, in order to make this next step easier. Like, this step is going to be too big unless I do a refactoring step first. And so, as you pointed out, you skip the new test you were adding, or you comment it out, you stash it, whatever you need to do to get back to a green state. But now you know, okay, I am going to need to do this refactoring to make the next step easier. And so you make the change easy, and then you make the easy change. So you want to do as much of the heavy lifting as possible as a set of refactoring steps, so that when you make the behavior change, it's dead simple. And the search space for where a problem could happen is much smaller.
I'm curious, do you refactor at every step or every few steps? I usually do every few, but I don't know if that's the best way to do it.
I think it totally depends. The way I think about refactoring, and really, we could definitely do several episodes just about that topic, and I'm sure we will, but the way I think about refactoring is: we're constantly reading code. We're reading code more often than we're writing code. So you want it to be very inexpensive to read code, because
we do it often. So we want to optimize for reading code and for changing code, right? If you can make reading and changing code extremely inexpensive and efficient, then you're going to have a good code base to work with. You know, you're in good
shape. So refactoring helps you do that. But as you read the code, you start to understand
things about the code, you start to see certain patterns, you start to realize like, okay,
this variable is called accumulator. But really, this is the concatenation of a list of strings. Yeah, this is the admin users. That's really what this accumulator represents. So you're reading through your code, and you see some variable and you're like, what is this doing, right? Because that's what happens when you're reading code. You're like, I need to add this feature. What the heck is this doing? You read it, you sort of understand it. And then when you
have that insight, refactor. So that's one cue you can use to refactor: when you start to understand something, take that understanding out of your head and put it back in the code. And now the next time you or someone else is reading that code, they don't have to do that extra step of processing it in their brain to get the understanding; it's more readily available. Or sometimes that understanding might not be a variable name, but it might be extracting a particular function that represents some operation, you know, grouping something together, having something in a certain module. But I find that helpful to think about as you're reading code: if you see something that can change, just refactor it, just do it. And I think people sometimes think about refactoring as something where you take a month on a branch and just do a giant rewrite. That's a lot of people's notion of refactoring. And that's called summer: when all the colleagues are on vacation, go refactor.
And I mean, it's one thing to dedicate a large chunk of time to refactoring, but it's another thing to do it as one giant step. You want to break it down into a lot of tiny steps, whether or not you spend a long time refactoring. So anyway, I like to keep myself honest about making tiny changes where there's almost zero risk that I've changed behavior unintentionally.
One thing I would like to point out, or to emphasize, is that during your refactor, you should not support new cases. So if something needs and deserves a test written for it, and doesn't work at the moment, and refactoring the way you do it makes a test pass, then you should first write the test and then refactor. Otherwise, you lose one good loop, one good cycle. And that cycle brings a lot of good things.
I do find that sometimes I like to generalize as part of the refactoring step. I'm not sure if that's compatible with what you're saying right now or not.
I guess not.
I mean, I guess I would clarify that I don't call that refactoring. I call that generalizing. And sometimes people use the word refactoring there, right? If you're refactoring, it means you're not changing the behavior. But if you suddenly fix a bug during your refactoring, it means it wasn't a refactoring step. And, you know, it's not that you shouldn't fix bugs or change your code in a way that resolves bugs. It's just that it's not a refactoring. And there's a particular role that a low- or zero-risk refactoring plays in the development process. It's a very helpful technique.
It's good to have the word generalizing in mind to distinguish the two steps.
Exactly. Exactly. They're like different modes of operation. So know which one you're in.
There are different approaches to test driven development. You can do this technique called triangulation, where every time you want to generalize your code, you write a failing test that proves that it needs to be generalized first. So basically, you only generalize code by responding to a failing test. I find in practice, sometimes that gives you... you know, you hard code a case. Like you're doing fizzbuzz and you say fizzbuzz of one is one, and then you just hard code fizzbuzz to return the string one. Now you say fizzbuzz of two is two, and you have to write a failing test in order to remove the hard coding of the number one from the return value. Yeah, but really, to me, it seems just noisy to have two test cases that are testing turning a regular number into a string. And so sometimes I'll do that as a generalization step instead. Kent Beck, in Test Driven Development: By Example, which is a really nice book, talks about this idea as removing duplication between the test case and the production code. So anyway, that's a technique that I sometimes reach for, because it can become a little bit dogmatic to just follow triangulation at every step, and it doesn't always feel natural in a given context. So I recommend that people try out these approaches. It can be nice to do some exercises, some katas; we can link to some code katas. It's really nice to just practice code katas and try out these different techniques to get experience with them. And then you have to use your own judgment in a real code base to figure out which techniques make sense for you. But if you've at least experienced them, then you have a better sense of how they help you write better code. Now you know everything about test driven development. Okay, well, shall we talk about some more of the details of the specific elm-test package? Yeah, we can do that. We didn't say how you write a test. Maybe
we should go into that. Let's do it. The way you write a test is you import the Test module and you call the test function from that module. So test, followed by the title of the test. And I find that one pretty important, because if the test fails, you know that whatever you wrote in the title is the thing that is not working anymore. The failing test message is part of the thing you're building when you write a failing test, because if it ever fails in the future, that's going to be what's guiding somebody to fix the broken thing. Yeah, so the title and the expectation error: exactly, use both. So yeah, you've got the title, and then usually you pipe with left pizza (<|), followed by an anonymous function which takes one argument, the empty tuple, so a unit. Yep. And then inside of that, you write the setup of the test, followed by the expectations that you want to make. The reason why there is an anonymous function is for performance reasons: if you only want to run the tests from one file, from one specific location, all the other tests are not executed uselessly, because Elm does eager evaluation. Yeah, if you make it a lambda, then you have to pass an argument for it to evaluate the body. Whereas if you don't call that lambda, it's not going to evaluate the body until you pass in the unit. Yeah, which makes everything much, much faster. And it allows the test framework to give you feedback as it's running things, rather than blocking on evaluating everything at once and then giving you feedback. And I think another thing, I guess there could have been a different design for this, but that unit argument of that lambda, it becomes a value if you're doing fuzz testing. Yeah. Shall we get into
fuzz testing now then? We may as well. Let's do it. So what is fuzz testing, Dillon? Well, fuzz testing is also known as property testing, or property-based testing. And I think actually property-based testing is a good term for it too, because it kind of gives you this sense that you're testing a general class of data. So you're making assertions about properties of that class of data, rather than, in the traditional unit testing style, making assertions about one specific case, one specific function call, and the output of that. So
what fuzz testing does is it uses a random generator with a random seed. You can build up either simple or complex data types: you could do Fuzz.int to have a fuzzer that gives you an Int, or you can build up a fuzzer very much like you build up a decoder and get a complex data type; you could build your own custom data type. And then, you know, you can get a list of those fuzzed values and compose them together, much like a decoder. And then you make assertions about those randomly generated values. And if you want to reproduce a particular failure, you can copy the random seed and pass that in as a flag when you run elm-test on the command line. Yeah. And when elm-test fails, you get a big error message saying, hey, if you want to reproduce it exactly the way I just ran it, run this command, and it contains a seed flag and the list of files.
Exactly. And the reason that it's running with a random seed: the point is, conceptually, you're not running the test against one value, you're running it against an infinite-sized set, and you're using a random sample of that. But every time your test runs, in your CI or your local environment, it's running on a different random sample. So as you run the tests more, you approach running the tests on the infinite sample of everything and asserting that property about that whole data set. Yes. I'm not sure if we actually said it, but the idea is that you write your test, your first test, and then it will run a lot of times, usually 100 times. Yeah, I think that's the default. With different inputs. Yes, exactly. So whenever you run your test, it gets run with 100 or more generated values, values that make sense and values that don't. Your setup will be run with a lot of different inputs, so if something goes wrong, you will know which values it went wrong for. And you'll generally have coverage for a lot more cases than you would have thought of by yourself. Yes. Yeah, because you're basically testing an infinite-sized set, because every time you run it, it runs with a different sample.
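A sketch of what such a fuzz test might look like, asserting a property over randomly generated lists rather than one hand-picked value (module and test names are illustrative):

```elm
module ReversePropertyTest exposing (suite)

import Expect
import Fuzz
import Test exposing (Test, fuzz)

-- Property: reversing a list does not change its length.
-- elm-test runs this with many randomly generated lists (100 by default).
suite : Test
suite =
    fuzz (Fuzz.list Fuzz.int) "reverse preserves length" <|
        \xs ->
            List.length (List.reverse xs)
                |> Expect.equal (List.length xs)
```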
Yeah. Another cool feature of fuzz testing is that it will shrink down the result set to give you the simplest failing test case. So if you're using a string fuzzer, for example... a common example is a palindrome, right? Or just any sort of reversible operation. You know, if I encode this to a JSON value and then decode that JSON value, I should have the same thing I started with. Or if I take a string and check if it's a palindrome, then the reverse of that string should also be a palindrome. That's the kind of property we're talking about when talking about property-based testing: it's a behavior property of the whole system. Exactly. So if that assertion were to fail on the empty string, maybe that property also fails on a 200-character-long string. But the failure it's going to give you is the empty string. So that's called shrinking: it reduces down the failures to find the simplest failure it can produce. Yeah, it's kind of a magical concept. It's kind of cool that that feature is just built into it.
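The encode-then-decode round trip mentioned above could be sketched like this (module and test names are illustrative):

```elm
module RoundTripTest exposing (suite)

import Expect
import Fuzz
import Json.Decode as Decode
import Json.Encode as Encode
import Test exposing (Test, fuzz)

-- Round-trip property: encoding any string to JSON and decoding it
-- back should yield the original value.
suite : Test
suite =
    fuzz Fuzz.string "encode then decode is identity" <|
        \s ->
            Encode.string s
                |> Decode.decodeValue Decode.string
                |> Expect.equal (Ok s)
```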
Yeah, the way I think it works is, it's going to try that string with 200 characters, or it will probably try the empty string first, but at some point it will try the 200-character-long string. And if it finds that it fails, it will try to simplify it, that's the shrinking part, by generating a few cases simpler than the 200-character one. So maybe 190 characters, or 199. It will generate a few and run the test on each of those. And if one of those fails, then it will try over and over and over again with those values, until it finds the simplest thing, which could be an empty string or could be something else, depending on what your problem is.
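To see shrinking in action, you could write a deliberately false property. Exactly which minimal counterexample the runner reports depends on the fuzzer, but it will shrink the failure toward the simplest failing input (names here are illustrative):

```elm
module ShrinkDemoTest exposing (suite)

import Expect
import Fuzz
import Test exposing (Test, fuzz)

-- A deliberately false property: it fails for any non-empty string,
-- and the runner will shrink the failing input toward a minimal example.
suite : Test
suite =
    fuzz Fuzz.string "all strings are empty (intentionally false)" <|
        \s ->
            String.length s
                |> Expect.equal 0
```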
Right. Martin Janiczek has been working on, I guess it's pronounced minithesis. I don't know, because it looks like mini thesis, but it's actually a mini version of something called Hypothesis. And Hypothesis is another approach to this idea of property-based shrinking, but it allows you to do property-based testing in a way where you can do andThen. The basic short description, from what I understand from the readme of the project, is that there are the random fuzz values, there are the actual concrete values, and then there's the seed that determines those values. And the idea of Hypothesis and minithesis is...
I'm just going to say mini thesis also.
I know, I know. It keeps track of the underlying seeds that generated those values. And so it allows you to do andThen, whereas the Elm fuzz library doesn't provide andThen. But those are all fun, sort of academic details. In practice, fuzzing is just
a really cool technique that's at your disposal when you write tests with elm-test; it's built in. I used it a few months ago: I was testing some logic around a money module in a code base. And it was really nice to do some property-based testing, because with money you want to make sure that there aren't any bugs. And so it was quite nice to be able to say, if I take the difference of two sums of money, then it should be zero if I'm taking the difference of the same value of money. Or that it parses negative money correctly: if there's a negative sign in front, whatever the actual value of the dollars and cents, the parsed value will be negative, right? That's a property you can assert across the whole set of data. And it's just an extra layer of confidence. Sometimes it's nice to be able to also reason about concrete values with a plain unit test, but it's nice to have both at your disposal.
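The money module from that code base isn't shown in the episode; as a self-contained stand-in (a hypothetical representation of amounts as integer cents), the difference-is-zero property might look like:

```elm
module MoneyPropertyTest exposing (suite)

import Expect
import Fuzz
import Test exposing (Test, fuzz)

-- Hypothetical stand-in for a money type: an amount in integer cents.
difference : Int -> Int -> Int
difference a b =
    a - b

-- Property: the difference between an amount and itself is always zero.
suite : Test
suite =
    fuzz Fuzz.int "difference of an amount with itself is zero" <|
        \cents ->
            difference cents cents
                |> Expect.equal 0
```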
The thing I have found very difficult with property testing is building the data sets. Usually what I find, and this is why I don't use it much, actually pretty much never, is that there's always some input you don't know whether to accept. For instance, you say, if you put a minus sign before the value, it should be negative. But that is not the case if the value is zero. So how do you say, I want a float or an integer, but not zero? Because for zero, it will fail. And it will definitely try zero at some point.
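One common workaround, anticipating the discussion here, is to build a fuzzer that cannot produce zero by transforming Fuzz.int (the helper names are illustrative):

```elm
module NonZeroFuzzers exposing (nonZeroInt, positiveInt)

import Fuzz exposing (Fuzzer)

-- A positive-int fuzzer: take the absolute value and add one,
-- so a generated 0 becomes 1 and a generated -1 becomes 2.
positiveInt : Fuzzer Int
positiveInt =
    Fuzz.map (\n -> abs n + 1) Fuzz.int

-- A non-zero int fuzzer: half the time, negate the positive value.
nonZeroInt : Fuzzer Int
nonZeroInt =
    Fuzz.oneOf [ positiveInt, Fuzz.map negate positiveInt ]
```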
Right. I know. I agree. That's a really tricky part of it. One way you can do that is you could take the int and add one.
Then what if it generated minus one?
Oh, isn't there a positive... or you could take the absolute value. You have to get clever with it. Yeah. I guess there's not a positive int fuzzer out of the box, but you could make one by taking the absolute value and then adding one. Yeah. And then you can give yourself this building block of a positive int fuzzer. But it is a little bit awkward. I think there's a way to say of a particular fuzz value that it should be excluded, but that can also lead to stack overflow issues with the fuzzing. Yeah. Or you could also ignore the test, say Expect.pass, meaning the test will just pass if the value is not valid. But if your values are rarely valid, then the test is not worth much. Right. Oh, there is an int range. There's Fuzz.intRange, so you can give it a range. There's Fuzz.floatRange. And you can get clever, like sometimes with characters. There's also Fuzz.constant and Fuzz.oneOf, so you could make a vowel fuzzer, and you can build up all sorts of complex things, but you do have to get a little clever. And I mean, much like Elm, sometimes the way that you want to express it isn't the way it's going to be natural to express it, and so you have to go a different route. But once you do, you can find an elegant solution. Yeah. Jeroen, do you ever test views
in Elm? No. Do you know the reason why? Tell me. Because he mostly writes elm-review, and there's no view in elm-review. Well, except in the name, but that's it. No, I don't. I know there's a module, or several modules, to test HTML in elm-explorations/test, but I've never tried them. Have you? I've used them a tiny bit, but I find that if I'm doing unit testing, usually what I want to do is create a data type that represents what the view is going to be, and make assertions on that data type. Sometimes in an object-oriented context, people talk about view objects, you know: something that represents everything in a formatted way, and the view is just templating it. It's just picking off little pieces of that view object, right? So that's what makes a lot of sense to me: have all of that data formatted in a particular way. And if I want to make assertions about that format, I don't need to grab it out with CSS selectors from the HTML output; I just make assertions about this data type. And then I pass that data type to be rendered as my view. But all the view is doing is picking off these values and presenting them directly. It's not manipulating them or doing any logic on them. Okay, but then you don't test your view
much. Right, exactly. So to me, doing a unit test at the view level never feels like it's adding value and giving me confidence. It just feels like a pain, and it feels like it's coupling me to some things that I don't want to be coupled to. And it doesn't make me feel more confident that I got things right in my view. If I want to change my view, then I change my view. I want to test the underlying logic. Now, that said, that changes if we're talking about a higher-level test rather than a unit-level test. If we're talking about an end-to-end test, then it's very valuable. Then you say, okay, I click on the login button, I type this into this input field, I click on this button, I navigate to this page, I should see this show up. In that case, making assertions about what's on the page is great. So that's the distinction I would make. Yeah, end-to-end scenarios bring a lot more value because you test a lot more things and make sure that things don't break, whereas with unit tests, you only test one tiny thing, and that could work but not the whole thing. Exactly. Yeah, the sort of spectrum from unit
to end to end testing is, you know, the lower level would be unit tests and the higher level
would be end to end tests. So on the lower level, you're more tied to the implementation.
You're less confident about the pieces fitting end to end. You're less realistic, right?
But they're faster to run. They're easier to write. Unit tests are great for exhaustively
checking corner cases. Especially with property based testing. Yeah, property based testing
is a great way to do that because they're very fast to run and it's easy to write a
lot of them. But if you think about exercising corner cases in an end to end scenario, you're
like, log into this page as this user. Now you have this combinatoric explosion where
you're like, okay, well you log in as this user and this user and you log in as a guest
user, you log in as an admin user, you log in as a regular user, and then you test out
the corner case for all of those. It doesn't make any sense. You don't have 10,000 end
to end tests in your project? Yeah, it gets insane, right? So that's the way I think about
it is the role of a unit test is to exhaustively check corner cases, but it's not to give you
confidence that the entire system is working together. So there's this notion of like the
testing pyramid where like if you split off the pyramid into three sections where the
very top triangle is one part of the pyramid, then there's like the middle slice of the
pyramid and then there's like the bottom part of the pyramid. The bottom part of the pyramid
is the biggest chunk. That's your unit tests. You want a lot of unit tests. The middle chunk
is like integration tests which don't exercise the full system, but they piece together some
parts of it. And then the top most chunk is like end to end tests or like smoke tests
and that sort of thing. And you want to have few of those, but they give you confidence
that everything is working out together. So, you know, to be honest, I hear more and more
people talking about the value of end to end testing and I find myself writing more and
more end to end tests as I go along because ultimately I want to have confidence that
I can hit the deploy button and just be confident that things are working. It depends on the
use case, but it's good to have like a sense of the role that these different types of
tests play and then you kind of have to use your own judgment.
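To make the view-object idea from earlier concrete, here is a minimal elm-test sketch. All the type and function names here are made up for illustration:

```elm
module ProfileViewTest exposing (suite)

import Expect
import Test exposing (Test, describe, test)


-- Hypothetical "view object": a record holding everything the view
-- will show, already formatted. The actual view function would only
-- pick these fields off; it would do no logic of its own.
type alias ProfileView =
    { displayName : String
    , joinedLabel : String
    }


-- Hypothetical formatting function, which is what we actually test.
toProfileView : { name : String, joinedYear : Int } -> ProfileView
toProfileView user =
    { displayName = String.trim user.name
    , joinedLabel = "Member since " ++ String.fromInt user.joinedYear
    }


suite : Test
suite =
    describe "toProfileView"
        [ test "trims the display name" <|
            \_ ->
                toProfileView { name = "  Jeroen ", joinedYear = 2018 }
                    |> .displayName
                    |> Expect.equal "Jeroen"
        , test "formats the joined label" <|
            \_ ->
                toProfileView { name = "Dillon", joinedYear = 2019 }
                    |> .joinedLabel
                    |> Expect.equal "Member since 2019"
        ]
```

The tests never touch HTML or CSS selectors; they only assert on the data type, and the view renders it directly.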
Yeah, I think you need all of them. Yeah, you need to balance the speed and how many
you want for each one. So that it remains maintainable and fast and so that it doesn't
hinder you more than it helps you.
Right, exactly. What I've seen a lot of with people being hindered by their test suites
and it becoming a burden is like in Ruby on Rails shops, people end up with like all of
these sort of integration tests that are mocking things like crazy. So they're doing a mixture
of like they're actually executing database requests, you know, they're actually like
performing database queries and they're sort of rendering HTML and making assertions about
the HTML, and like stubbing out HTTP requests, reaching in and stubbing out certain functions so they return something, and then making a mock to assert that this
method gets called on this one thing. So you don't know what's real and what's fake. And
to me, that kind of test has so little value because for one thing, it's very coupled to
the actual implementation. So if I change the implementation, you can end up with either
false positives or false negatives. So you can have a failing test when everything as
far as the user is concerned is working perfectly. You have a passing test when everything is
completely broken for the user. And so all I can say is I'm very happy that Elm doesn't
have or need mocking.
Yeah. That makes me think of another very cool part of Elm with regard to testing. You
know what one of the worst things with tests are is that when you have tests that depend
on each other, when you set up something in one test and the second test depends on the
first one to have been run. And that is so awful to debug and to run. And it doesn't work when you run that one test only. Like what?
I'd blocked that out of my memory, but you brought it back. Yeah. I think you don't have
that in Elm because everything's immutable.
Exactly. And deterministic. There's like some test helper. I don't remember if it's in RSpec or Minitest, but like I think in some Ruby thing, there's like a method that you
can call and it makes it so the order is deterministic. So the order is like always in the same order
rather than randomizing the order of the tests. For exactly the type of thing you're describing,
they run the tests in a random order, but there's some method you can call that makes
the order fixed. And the method name is like, I'm a horrible person and I don't know, I
kill puppies or something like that. It's like some awful name. It's like if you really
want to do it, you at least have to admit that you're a terrible person first.
I need to find that one.
We'll link to it in the show notes just for fun.
That's better than those React hidden functions that are like, please don't use me or something.
Oh my goodness. So I was writing a test in Ruby the other week and I actually like shipped
some code and there was a method missing error. And I'm like, what? How on earth did this
test pass? And then there was this exception in production. And then I was looking into
it and someone actually pointed it out. They're like, oh yeah, RSpec monkey patches the global variable context. So there was some undefined thing, context, but RSpec was monkey patching
it so it wasn't giving a method missing exception. So that was a fun one. Suffice it to say,
consider yourself very lucky to be working in Elm and testing is so much nicer than it
is in other languages. So please write tests.
Jeroen, how do you think about this sort of question of what do you test in a typed functional
language like Elm and when do you rely on types or perhaps, you know, certain properties
that something like Elm review or tooling can provide you?
Oh, usually I try to make impossible states impossible. I stop when I can't find a way
to make something impossible or when it becomes very unusable. I try to balance usability
and correctness. So I still don't know when I would use Elm review for something like
that. It really depends on the case. I think there's a lot of places where you could use
Elm review kind of as a test in a way, something that tests the contents of your code base.
Yeah. I mean, you can certainly use it to make assertions about literal values.
Yeah, exactly. So I made a blog post at one point saying here's a safe way to write regex
in a way that looks unsafe.
And that kind of writes the test for you without you having to write it.
Right. Because you could, yeah, like let's say you were writing a similar thing that, say, validates a username. And so you could write tests that say it returns Nothing if I pass an empty string, and it returns a Just username value, like a custom type that proves that it's a valid username, if I pass it this other value. So that is something
you can test. And that's in a way you're leveraging the type system, but you're letting your test
suite give it the stamp of approval that this type does indeed represent a valid check.
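A minimal sketch of that kind of test, with a hypothetical Username validator (these names are illustrative, not from an actual codebase):

```elm
module UsernameTest exposing (suite)

import Expect
import Test exposing (Test, describe, test)


-- Hypothetical custom type: holding a Username value is the
-- "stamp of approval" that the check passed.
type Username
    = Username String


fromString : String -> Maybe Username
fromString raw =
    if String.isEmpty raw then
        Nothing

    else
        Just (Username raw)


suite : Test
suite =
    describe "Username.fromString"
        [ test "returns Nothing for the empty string" <|
            \_ ->
                fromString "" |> Expect.equal Nothing
        , test "returns Just a Username for a valid value" <|
            \_ ->
                fromString "dillon" |> Expect.equal (Just (Username "dillon"))
        ]
```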
You could even do fuzz testing on that, right?
If you're only passing literal values to that, you could also use Elm review to help you
with that.
Yeah. Because you know that nowhere in the code base do I use a constant that is the empty string.
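A fuzz test for that same hypothetical validator could look something like this, checking the property that validation succeeds exactly when the input is non-empty:

```elm
module UsernameFuzzTest exposing (suite)

import Expect
import Fuzz
import Test exposing (Test, fuzz)


-- Hypothetical validator, inlined to keep the example self-contained.
fromString : String -> Maybe String
fromString raw =
    if String.isEmpty raw then
        Nothing

    else
        Just raw


suite : Test
suite =
    fuzz Fuzz.string "validation succeeds exactly for non-empty input" <|
        \raw ->
            case fromString raw of
                Just _ ->
                    -- accepted: the input must not have been empty
                    Expect.notEqual "" raw

                Nothing ->
                    -- rejected: the input must have been empty
                    Expect.equal "" raw
```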
Mm hmm. I mean, ultimately these are all just verification methods, you know, whether it's
static code analysis, which is, you know, essentially what Elm review is, a static code analysis tool. It could be, you know, a compiler, you know, types. It could be a unit test or an end-to-end test. They're all just tools at our disposal to verify our code.
Yeah. So what would I use tests for otherwise? It's usually business logic. So things that
I can't represent with types. So if I try to have a function that says if the value
is X, then do this. And if I have something else, give me something else that is something
that will always be valid type-wise. And that's where I would write a unit test to make sure that in the first case you always get X. In the other cases you always get something else.
You don't have to test the wiring like you do in JavaScript or maybe in Ruby. I don't
know Ruby that much. So there's a lot less tests that you have to write.
Right. Exactly. Yeah. There are a lot of tests in Ruby about, I mean, like a lot of, a lot
of API's in Ruby. It's like if you pass in a string, then you call it like this. If you
pass in a list of strings, then you call it like this. If you pass in a regex, then it's
going to run it like this. If you pass in a hash and it has a key called this, then
it's going to run it like this. And of course you have to check for nil. And
if this thing is nil, then it's going to interpret it as this or as that. And it's so much simpler
in Elm just having, having the confidence that those things are going to be wired up
correctly both for the caller and you know, your, your test cases are exercising all the
different paths. And you also just have the simplicity that you can't overload functions
like in other languages. And in some cases maybe it would be convenient, but it's a very nice quality of Elm that keeps things very easy to think about
and test.
Yeah. In the testing pyramid, I would actually put the Elm compiler at the base of
it. Yes. Like a tree or something.
I like this. I like this. Well, but you want more, you want even more.
Right. More than unit tests.
Yeah. I would also put Elm review between the Elm compiler and unit tests maybe. So
Elm compiler, Elm review, unit tests, integration tests, end to end tests. Maybe Elm review
could be after, it could be somewhere else, but.
Yeah. It might depend on the context too. Cause there are some things that Elm review
can make really great assertions about with static analysis. And there are some things
that it can't do good static analysis on. If it's like, if it's looking at literal
values, then it can do a lot. If it's looking at user input values, then it can't necessarily
make as many guarantees about that.
Yeah. And that's where you would use types or unit tests, probably unit tests in this case.
Or end to end tests.
I like this extended, extended pyramid. Yeah.
Use whatever you can to, to create confidence in your system.
So we get all these tools, like even Elm GraphQL gives you a lot of guarantees. I don't know
where you would put it in the pyramid, but it gives you some kind of confidence.
I mean, I would put that as part of the Elm compiler, you know, it's just extending the
number of guarantees that the Elm compiler can make for you.
Yeah, exactly. So this question came up of how or whether you should test internals of
your Elm modules. Do you have thoughts on that?
Yeah. So usually I try not to test implementation details. But I find that what you test in
unit tests are kind of like implementation details of end to end tests.
That's right. They totally are.
Everything is an implementation, an internal of something else.
So I think it's best to try to test at the highest possible level, even in unit tests,
when you can. But if something gets very complicated or impossible to test, because you think something
might get into some kind of state, but you don't know how, then testing the implementation
could be useful.
It's not something I do often, though.
I totally agree with you that unit tests are testing the implementation. When you look at it, I mean, from one perspective, any unit test is testing the implementation, as you said, of the end to end story. So you are testing the implementation, and you are actually
coupling yourself to specific implementation by writing a unit test. So that's why some
people really double down on end to end tests more, because they say, well, now I can change
the implementation, and my tests keep passing.
So I think one question to keep in mind is, how much extra setup am I having to do? How
much noise is there in the test where I can't tell whether the thing I care about in this
test is being exercised and is working or not?
So if you have to do a bunch of setup, like we mentioned at the beginning, if you have
to log in as this kind of user, that kind of user, that kind of user, and then you test
100 edge cases on this one part of the page, but you have to navigate in 10 pages deep
to test that. Maybe there's a unit test where you can really thoroughly exercise the edge
cases of if it's a guest user or an admin user or whatever, what's the visibility? What
are the visibility permissions?
And then you have one end to end test or a couple of end to end tests that exercise that
and make sure you indeed can only see pages that you have permission to view. You can't
see the admin panel if you're a guest or a regular user or whatever, but you want to
thoroughly exercise all of those permutations in unit tests. That's what unit tests are
great for. And if the implementation changes somehow, you can throw those tests away. It's
not a big deal, but it's nice to have some confidence that the user is going to see things
working correctly because it's actually exercising that code that your unit test is testing.
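As a sketch, exhaustively covering those role permutations in a unit test might look like this. The Role type and permission function are hypothetical:

```elm
module PermissionsTest exposing (suite)

import Expect
import Test exposing (Test, describe, test)


type Role
    = Guest
    | Regular
    | Admin


-- Hypothetical permission check, cheap to exercise exhaustively
-- in unit tests, while one end-to-end test confirms the wiring.
canSeeAdminPanel : Role -> Bool
canSeeAdminPanel role =
    case role of
        Admin ->
            True

        _ ->
            False


suite : Test
suite =
    describe "canSeeAdminPanel"
        [ test "guests cannot see the admin panel" <|
            \_ -> canSeeAdminPanel Guest |> Expect.equal False
        , test "regular users cannot see the admin panel" <|
            \_ -> canSeeAdminPanel Regular |> Expect.equal False
        , test "admins can see the admin panel" <|
            \_ -> canSeeAdminPanel Admin |> Expect.equal True
        ]
```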
But in terms of should you test internals of Elm modules, I very much agree that I think
it's a feature that you can't write Elm unit tests of internal private things. I don't
think that approach makes sense to me. I think of it as if you find yourself wanting to test
the internals of one module, then what that's saying is it belongs as its own responsibility.
I think about code often in terms of responsibility. Should this really be the job of the admin
privileges, like the admin module to know whether I have access to this or not? Or should
it be its own module that tells what you have access to? Maybe that belongs as its own responsibility.
And maybe the fact that I'm trying to test this function like admin has access to that's
like a private function in this module means that it wants to be its own responsibility
and it wants to be tested separately and in a separate module. I think a lot of test-driven development is about, it's not a magic bullet. It doesn't fix your design. It doesn't make
your code nice or make your code work.
Unfortunately, but it does expose issues and then you have to pay attention to the signals
it's giving you. So if something is uncomfortable, that might be a design smell. That might mean,
Hey, this code is hard to test. Well, what's that telling me about my design? We talked
about end to end testing, but we didn't really talk about how you would do that in Elm. I
think sometimes when people think about testing Elm, like we're so used to living within the
Elm ecosystem and we don't want to go outside of it, but I think it's worth just stating
that it's okay to use non Elm tools to test Elm code. For example, you can use Cypress
to exercise, you know, to pull up a browser and start clicking around and you know, you
might have to write some JavaScript and make some assertions there, but it's okay. Like
use the best tool for the job.
Yeah. I love Cypress. I find it very good. I'm very sorry that I've never had the chance to use it at work. I had the chance to try it out, set it up, and then it got forgotten. But that's what often happens with end to end tests. In my experience, they get left behind.
Oh, but they can be so helpful. Yeah. And then there's also elm-program-test. So I would characterize elm-program-test as more of an integration test than an end to end
test because it's not, it's not running end to end, right? It's not running in a browser.
Yeah. Yeah, exactly. It's, it's simulating putting pieces together rather than actually
doing that.
Yeah. You're writing scenarios like: I have this application or this element. When it gets displayed, the user clicks on something, and then this action gets triggered, or this command gets triggered, or this message, all of those other things. And then you cycle and try other things and do expectations all the way around.
Right. And then it simulates HTTP responses coming back with certain statuses and bodies
and stuff.
But I think that could be the topic of another episode.
Definitely could be the topic of another episode.
Let us know if you want that.
I think that could be a fun one. I've been getting a lot of value out of Elm program
test. I think it's a really cool tool. Definitely, definitely check it out. And, you know, one,
one point to mention on that topic is testing effects. So like effects are opaque in Elm.
You can't inspect a command, right? You can't like look at the command that's being returned
and make assertions about it. So one of the things that Elm program test has you do, which
is kind of a pain, but you get a lot of value and it's not the worst design idea. It's a
reasonable design. You have to create a custom type that's a sort of intermediary value that
represents all the possible effects that you can have in your domain. And then you have
to write a function that turns that effect custom type that you define into an Elm command,
which is an opaque thing. So you can make assertions about the commands that you're
receiving. So yeah, I mean, basically with elm-program-test, you kind of give it your init and update and view, and it ties those pieces together and simulates
some of the commands and stuff. So that's quite a cool technique. Martin Janiczek had an Elm Europe talk that he gave about this tool he built called ArchitectureTest and
Richard Feldman built like a similar tool for basically testing your update function.
So that's another kind of interesting concept. I haven't played around with those particular
tools too much, but there's so much you can play around with testing in Elm. Its purity makes it really fun to play around with testing things.
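A rough sketch of the effect pattern described above. The Effect constructors and the endpoint here are made up for illustration, and elm-program-test's own API differs in its details:

```elm
module Effects exposing (Effect(..), perform)

import Http


type Msg
    = GotUser (Result Http.Error String)


-- A custom type enumerating every effect the app can produce.
-- Unlike Cmd, these values can be inspected and asserted on in tests.
type Effect
    = NoEffect
    | FetchUser String


-- The one place that turns inspectable Effect values into opaque Cmds.
-- update returns an Effect; the real program pipes it through perform.
perform : Effect -> Cmd Msg
perform effect =
    case effect of
        NoEffect ->
            Cmd.none

        FetchUser userId ->
            Http.get
                { url = "/api/users/" ++ userId
                , expect = Http.expectString GotUser
                }
```

In tests you assert on the Effect value that update returned; only the real program ever calls perform.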
Yeah. I think the biggest hurdle is getting started. We like to write code and we like
to refactor, but writing tests is a bit of a pain. But once you get the hang of it, and once you have that coverage, then it feels very good.
Exactly. Yeah. And it is. I mean, I really think that it's a great way to slice
up a large task into small chunks because when you're looking at implementing something,
you're like, where do I even begin? Like, what do I even want it to look like when I call
the function? What do I even want the API to look like? Well, you know, with the test,
that's the first thing you do, the red step: you write what might it look like? And
then before you even run the test, you're like, is this really what I want it to look
like? And you can think about that without worrying about the implementation a little
bit. So it's a habit. And I think the way you get started, as with any habit, is try it out, and then try it out some more, and then try it out some more. Experiment.
Soon, you'll get Stockholm syndrome and you'll learn to love it.
All right. Well, I think that gives people enough to get started. And maybe we'll circle back another time with some deeper dives on elm-program-test, and maybe some
other testing techniques, some refactoring techniques. But hopefully that gives people
a good place to start. Yeah, I think it does. I hope it does. Let us know if it does. Let
us know if it meets your expectations.
Well, it didn't. Until next time. Until next time. Bye bye.