Elm and AI

Can we get strong guarantees from AI tools that are known to hallucinate? We discuss some strategies, and ways that Elm might be a great target for AI assistance.
April 11, 2023


Hello Jeroen.
Hello Dillon.
I'm very glad to be talking about this topic so promptly.
I think it's really important to get on it and talk about it right away.
I was thinking you were going to do a pun with AI or something and I was like, I couldn't
find anything to, anything funny to say, but yep.
Good one.
Good one.
Yeah, it was unprompted, but I had to go for it.
This was a handcrafted artisanal pun, not coded, not AI assisted in any way, I promise.
I believe you.
Not, not on the unprompted part.
It actually was not AI assisted, although, um, we're going to, I imagine we're going
to have a lot of bad puns out there these days with AI assisting people, you know, using
it for evil and creating more puns in the world.
So be on the lookout.
Well, that will compensate for the fact that people have fewer kids nowadays and there are fewer dads, you know. So I think we've introduced the subject.
We're going to talk about AI and Elm and all of those things that have been quite buzzy in the last few months.
And ever more and more, it's a bit hard to keep up.
It's going so fast.
So just to set the record, we're recording at the beginning of April, 2022... 2023.
No you are wrong.
If we were recording this in 2022 and talking about GPT-4, that would be incredibly prescient.
Yeah, but I'm just going to gaslight you and tell you you're wrong.
Just like that AI.
It's a hallucination.
Are you hallucinating?
Are you an AI?
I am legally contracted to say no to that.
So yeah, so this is April, 2023.
In one month, this will be outdated.
We hope it won't, but we'll see.
So GPT-4 was just released.
GitHub Copilot X was announced.
Yeah, I'm on the waiting list for that.
Haven't gotten access yet.
So yeah, this is the current state.
Midjourney is the best one to create images so far, and we have no clue how it will be in the future.
Things are moving very fast.
Yeah, very fast.
So these are all very interesting technologies and people think that they are going to revolutionize
a lot of things.
They're probably going to be right in some aspects and wrong in some others.
What I want to talk mostly today about was like the coding aspects of it.
Is AI going to replace developers?
Is it going to help them mostly?
And all those kinds of things that people might wonder as well.
How do you use this effectively to your advantage?
How do you avoid misusing them as well?
So you've played with GPT-4 at the moment, and I think you've had some interesting results.
Do you want to talk about that?
So maybe to set the stage a little bit regarding the question of will AI take our jobs?
I tend to think that, I mean, of course, it's very interesting and time will tell and anything
could happen.
Things are moving very quickly.
I don't think we can make predictions with too much accuracy at this stage with the way
things are going.
But I tend to believe that this is going to be a very powerful tool.
And I think that it's more likely that people who are good at using these tools will replace our jobs rather than the tools themselves replacing our jobs.
So I think it's an important skill, just like Googling was an important skill.
Of course, it's a next level of that.
It's more sophisticated.
It's able to do things on its own, which is not quite what Google was doing for us.
But you can be...
Googling is an important technical skill.
And if you are a developer, it's worth trying to get good at that.
And in the same way, I think using AI tools effectively is an important skill.
Similar to using an IDE effectively, for instance, instead of writing everything in Notepad or on literal punch cards or...
Well, now I'm going too far, but...
Well yeah, exactly.
And we have to keep up with the technology.
So I would say that maybe I'm an optimist, but I would say keep up with the latest technology
to stay relevant.
But hopefully we're going to be able to do more sophisticated things leveraging these tools.
Now, another sort of piece of background I want to introduce to this conversation is
to me, this is a really big question.
Is Elm and pure typed functional programming a good combination with AI?
Do you have an opinion on that?
Well, it could be a good one or it could be an irrelevant one in the sense that if an
AI writes all the code for you, and it's awesome at it, like much better than you could be,
like, let's say a hundred times better than you, then why do you need type safety?
Why do you need ease of refactoring and a language that is easy to maintain?
Maybe you don't.
I personally don't believe that that will happen.
I'll talk about that later.
I think it's actually better to have at least a typed language like Elm to work with, because at least so far, in April 2020-something, ChatGPT and all those tools give you incorrect results.
They're very impressive and they're often correct, but they're also, more often than not I'd say, incorrect or slightly inaccurate, or in the case of Elm, the output often has invalid syntax.
And you want to know about those, right?
Yeah, exactly.
And one thing that Elm gives us is a nice compiler with a great type checker and some
additional linting that it does on its own.
And that helps prevent a lot of issues that humans make, but that an AI can make as well, because those AIs have been trained on humans and Stack Overflow.
Probably not on the Elm Slack, by the way, which would be a great resource.
Yeah, that's true.
I've noticed a lot of like old syntax, like the prime syntax, the little apostrophe it
likes to use.
But I also think it's been trained on Haskell code.
Yeah, totally.
In some cases.
I really like your framing.
I think that's spot on for, you know, is typed pure functional programming a good fit for
these AI tools?
And I think this is exactly the right question.
Are humans going to be maintaining it or are machines going to completely just take over
the code and humans don't need to be involved?
I think that's exactly the right question.
I think that my gut feeling is right now for the next little while, at least, we've still
got a little time where we're going to be an important part of the process.
And I see it more as a collaboration between humans and AI at the point where the humans
are no longer needed.
Then, yes, the whole pure typed functional programming thing doesn't matter so much because
it's an implementation detail at that point.
But until then, I think it's extremely valuable because, as you say, it's very prone to hallucinations.
So like...
Hallucination being when it says something and it thinks it's right.
And when it isn't, when it isn't.
Yes, because these... and hallucination is sort of the technical term that OpenAI is using in some of these white papers talking about this nowadays.
But hallucination, it's very prone to hallucination because these are sort of predictive models
that kind of synthesize information, but it's not an exact science.
And sometimes it mixes things together that don't quite fit.
And so I think, I mean, Jeroen, I think it's fair to say that we really like having tools
that we can trust.
If there's one thing you can say about us as the hosts of Elm Radio, I think that's
a fair thing to say, right?
Yeah, yeah.
We do not accept half guarantees, right?
We like guarantees.
We like constraints.
We like guarantees.
Yeah, otherwise we would use TypeScript.
Right, exactly.
Then we get like pretty good... you can't really call it a guarantee if it's pretty
good, right?
Guarantees are not pretty good.
Guarantees are guarantees.
You get pretty good assurances, pretty good confidence, but you don't get guarantees.
We like guarantees, right?
Well, I guess to some extent Elm is also imperfect in that way.
Like we're not doing formal logic or proofs.
We're pretty close to that.
Especially considering how simple the language is and how usable.
And the things that you can know about an Elm program, you do know.
So like we don't get guarantees about everything, but we do get certain things for which they
are definitely guarantees.
And we like that.
And we try to get more of those things.
I know somebody who built a whole tool to try to get more guarantees through static
analysis for Elm code.
So people go to great lengths for these things, I hear.
So then when you're talking about guarantees and then AI that's prone to hallucination,
that becomes an interesting question, right?
Now I actually am pretty confident about our ability to do useful things with that.
Maybe that's counterintuitive because I'm talking about how much we care about guarantees
and then talking about hallucination.
I'm actually very reluctant to integrate things like GitHub copilot suggestions into my code
because I think it's a very easy way to introduce subtle bugs.
But the way I'm thinking about how AI fits into my workflow for writing Elm code and
my sort of ideals for tools that involve like trusting my tools so that I can do critical
thinking and then delegate certain types of problems with complete trust to a tool, right?
Those two things do fit together, but not out of the box.
If you just throw in whatever suggestions GitHub copilot is throwing at you, like for
example, I was playing around with GitHub copilot, which for anyone who hasn't used
it, it's now a paid tool, but it will give you sort of fancy AI assisted, GPT assisted
auto completions in your editor.
I think it's free for open source maintainers.
With a certain number of stars.
Unfortunately, I didn't qualify.
I think I qualified.
Or maybe I'm using the free version still.
Yeah, you might have been using the open beta.
I got it for free in the beta period, but you probably need like 10,000 stars or something.
I don't know.
Not many.
Hey, people, we need stars.
Get on it.
Otherwise, we need to pay for like $10 a month or something, which is unacceptable.
Yeah, our listeners can pay us in stars.
Go star a project right now.
Pause this podcast and star our projects.
So it's definitely interesting working with Copilot.
I have to say, I don't find it.
In some cases, I'll trust it.
I will have a custom type with four variants, and I will write out a function that says
my custom type to string, and then it fills it in perfectly.
And it's impressive, but there are certain things like that, that I have an intuition
that it's going to be really good and trustworthy at solving.
That said, it does hallucinate certain things.
It will hallucinate certain variants, because the process through which it is arriving at these suggestions does not involve understanding the types of the program like the compiler does.
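As a sketch of that, here is the kind of exhaustive mapping Copilot tends to complete well. The type and the function names are hypothetical; the point is that if a completion hallucinates a variant that doesn't exist, the Elm compiler rejects it immediately.

```elm
module Direction exposing (Direction(..), directionToString)


-- A hypothetical custom type with four variants.
type Direction
    = North
    | South
    | East
    | West


-- The kind of mechanical function Copilot usually fills in correctly.
-- If it hallucinated a fifth branch like `Up -> "Up"`, the compiler
-- would reject the suggestion rather than let the bug through.
directionToString : Direction -> String
directionToString direction =
    case direction of
        North ->
            "North"

        South ->
            "South"

        East ->
            "East"

        West ->
            "West"
```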
So to me, where it gets very interesting is when you start using prompt engineering to
do that.
And so I've been thinking about a set of principles around this.
So prompt engineering is when you ask a question of Copilot or, mostly, ChatGPT or other tools, and you do it in a specific way: you frame your questions in a specific way, you ask for specific things, you give additional instructions so that it gives you better results.
I don't know why people call it engineering yet, but it's very interesting.
Although, I mean, there are prompt engineer job posts out there, and I think this is kind
of going to become a thing.
So it feels more like politics, like when you try to phrase things in a way that sounds good, that makes you sound good.
It's more like a speech thing than an engineering thing so far.
I hope those literature majors in college are finally cashing in on those writing skills.
Oh, now you're not making fun of my poetry degree, right?
Now that I can make full blown web applications in two seconds.
Yeah, I mean, I think writing skills have been valuable for a long time, but this unlocks
a whole new set of things you can do, including engineering with your writing.
And it really is like, I mean, if you think about what these prompts are doing, like the
way that they work is they're based on the context they're given, they're sort of like
role playing with that context, essentially, because their basic premise is given like
this context, what's likely to follow it, right?
So if you write in a certain style, it is going to be more likely to follow with a certain style.
If you write in a style that sounds very much like a scientific paper, a scientific journal
publication, and you write about your method for your study and all these things, it's
probably going to give you more rigorous results.
And it's probably going to do the things that it's seen by gathering information from a
bunch of scientific journals, like coming up with a rigorous method and talking about
whatever, like counterbalancing, you know, addressing particular concerns and stuff.
So like, you have to really get it to role play to solve the problem you want it to solve,
to be like the smartest thing to address the problem you want.
And that's where nerds come into play again.
Yes, totally.
Because we kind of get that.
It's not just like some magic box to us, like we kind of can understand how it could synthesize
information so we need to give it useful context for that.
Now I was thinking like, people play Dungeons and Dragons.
Oh, that too.
Or used to role playing.
Yeah, very true.
Yeah, so like, I think priming it with good context is one thing that I've been thinking
about as I've been playing with it.
And another thing I think about is like, is like how verifiable is the problem you're
giving it?
So if you give it a problem and you're like, I don't know, what is the meaning of life?
And it tells you something, it's like, well, how do I verify it?
Like I haven't given it much context, I've given it sort of a vague thing, it's going
to be piecing together a bunch of things.
I'm not like giving it a specific problem to solve.
I'm not giving it a problem that I can verify that I've gotten the answer.
So if I give it some like an elm problem, and I have a way to check.
So like there are certain problems where it's difficult to find an answer, but it's easy
to know that the answer is correct once you have it.
NP problems.
Is that the term?
No, but you know, P equals NP.
I never quite got that.
I'm going to need to ask ChatGPT to explain that to me better.
Okay, let's not get into that one.
That's a few hours of content, which is probably not best served by us.
I guess like things like the traveling salesman problem you like would be an example of that,
And then yeah, that's an NP problem.
You can tell if you have a solution and it does fit an optimal path, it's easy to tell,
but it's not easy to derive an optimal path, something like that.
Yeah, almost.
You know whether the solution that is given is a solution because you can check it and
the checking is pretty easy.
But knowing whether it's optimal is extremely hard.
I see.
Right, right, right.
So like, is this the most optimal solution?
Well, to check that you would need to check all other solutions.
And it's easy to if you find a counter example, then yes, you know, it's not the most optimal
one, but to know that whether it is indeed the most optimal one, you're going to have
to check everything.
And that's, that's extremely expensive.
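To make that concrete in Elm terms: checking a candidate tour is just summing edge weights, a cheap and trustworthy operation, while proving optimality would mean searching the whole space. Everything here, including the distance function and the names, is a hypothetical sketch.

```elm
module Tour exposing (isBetter, tourLength)


-- Cheap verification: the length of a tour is just the sum of the
-- distances between consecutive cities. `distance` stands in for
-- whatever lookup the problem at hand provides.
tourLength : (city -> city -> Float) -> List city -> Float
tourLength distance cities =
    case cities of
        first :: rest ->
            List.map2 distance (first :: rest) rest
                |> List.sum

        [] ->
            0


-- Verifying that a proposed tour beats the best one found so far is
-- a single comparison. Proving it is *optimal* would mean checking
-- every other tour, which is the expensive part.
isBetter : (city -> city -> Float) -> List city -> Float -> Bool
isBetter distance candidate bestLengthSoFar =
    tourLength distance candidate < bestLengthSoFar
```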
Right, exactly.
So I think like, to me, that's the type of mindset for finding good ways to use these
tools to automate our coding.
Also like, like you mentioned, finding a counter example in the traveling salesman problem
is easy to verify because you just check how many nodes it traverses or whatever, right?
And it's, is the number smaller, right?
So that's a very cheap operation to test a counter example.
So let's say you write a prompt, you know, a prompt-engineering prompt, for ChatGPT to solve the traveling salesman problem for something, and you set it up and you prime it with some context, and you found one solution, but now it needs to find a better path.
And if it, if it gives you a more optimal path, then you're done.
You can easily verify that and you can say, you know that it provided you with something
valuable because you can easily verify that it's a valid solution and a more optimal solution.
So this class of problems, where it's easy and cheap to verify that the result is valuable, is the kind of thing where I find it to be a very interesting space.
And I think that Elm is very well suited to this type of problem.
So like one very simple example: if you want to write a JSON decoder. And now another consideration here is what inputs can we feed it, to prime it to give us better results.
So we want to prime it with good context, and we want to have verifiable results.
I've also been thinking about like that verification through a feedback cycle.
So iterating on that verification becomes interesting.
If you use the OpenAI APIs, you can automate this process where you can test these results, and then you can even give it information like Elm compiler output, or whether a JSON decoder succeeded.
So for example, you're trying to solve the problem of: I want to write a JSON decoder, and you either have a curl command to run to hit an API, or some JSON example of what the API gives you. That's your input.
You prime it with that.
You can even prime it with a prompt that helps step it through that process, to give you higher quality output, but then you can verify that.
So you say: your job is to write a JSON decoder. It needs to decode into this Elm type, and at the end it needs to give me a compiling JSON decoder of this type, and it needs to successfully decode given this input.
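As a minimal sketch of that kind of task, with a hypothetical `User` type and a made-up sample payload: the generated decoder either decodes the example JSON the API actually returns, or it fails in a way you can feed back into the next prompt.

```elm
module UserDecoder exposing (User, decodesSample, userDecoder)

import Json.Decode as Decode exposing (Decoder)


-- Hypothetical target type for the decoder the AI is asked to write.
type alias User =
    { id : Int
    , name : String
    }


userDecoder : Decoder User
userDecoder =
    Decode.map2 User
        (Decode.field "id" Decode.int)
        (Decode.field "name" Decode.string)


-- The verifiable criterion: running the decoder against the sample
-- input must give `Ok`. A hallucinated field name shows up here as an
-- `Err` you can hand back to the model.
decodesSample : Bool
decodesSample =
    case Decode.decodeString userDecoder """{ "id": 1, "name": "Jeroen" }""" of
        Ok _ ->
            True

        Err _ ->
            False
```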
That's all verifiable.
So if it gives you garbage or hallucinate something, or it gives you invalid syntax,
you can even tell it and it can iterate on that.
And you can kind of reduce the amount of noise.
Because I don't want to hear about hallucinations from AI.
So you know, like before I mentioned like how much we want guarantees, not like somewhat
high confidence.
I want guarantees.
But if we can throw away anything that's garbage and only get signal, no noise, then we can
do really interesting things.
And Elm is really good for that.
You would like to have a system where you skip the intermediate steps of telling it, hey, this is wrong because this doesn't compile.
So here's some source code.
Here's my request.
And then there's some back and forth between the Elm compiler, for instance, and the system,
the AI.
And then you only get to know the ending result.
That would be much nicer.
And then it's like a proven result.
It's a guarantee at that point.
So this is kind of the cool thing is like, with a little bit of glue and a little bit
of piecing things together, a little bit of allowing it to iterate and get feedback and
adapt based on that feedback, which is actually like GPT-4 is very good at this.
You can get guarantees, you can get guaranteed safe tools, especially with Elm.
With JavaScript, that would be more challenging.
Yeah, it would be hard to verify the results.
But I'm guessing, or at least whenever you say verifying the results, I'm thinking of
the Elm compiler.
But I'm also thinking of writing tests, you know.
I would probably also try to include the results of Elm tests to the prompt, if possible.
But that does mean that you need to verify things.
And that's kind of what our industry is all about, right?
Why we have software engineers and not just coders.
That's why we call ourselves engineers, is because we make things and we know it's going
to be, we know we shouldn't trust even ourselves.
We shouldn't trust the code that we're writing, the code that we're reading, and the code
that has been running for years, because we know, well, there are bugs everywhere.
So that's why we have all those tools, type systems, test suites, formal logic, manual
QA, all those kinds of things, to make sure that we do things correctly.
And also, even the processes, like the Agile movement, is running your project in such a way, or working in such a way, that you get better results out of it.
So we do need to verify our results.
And we can't just use the results of the AI willy-nilly.
I mean, we can, and people are.
I think that's actually kind of the norm.
It's going to become increasingly common to see, sort of like, this is a really weird
piece of code.
Does this even give the right results?
Like, oh, somebody just YOLOed this ChatGPT or this Copilot completion into the code and
committed it.
But I mean, it's something very different from what we do today.
Because in a lot of cases, we are still shipping code with not a lot of tests in practice.
I feel like most people don't write enough tests, myself included.
So this is just maybe strengthening the need for adding tests.
In a way, like our role changes, right?
Our role becomes more like verifying and guiding systems rather than like, I can write a line
of code.
That's not the super valuable asset anymore.
But I do feel like because it's going to be so easy to write code, and because you don't
go through all the steps of writing good code, you're not going to do it as much.
For instance, what you like to do, and myself as well, is to do things in a TDD style.
You know, you start with a red test, and you change the code to make the test green, but
you only change the code as much as necessary for that test to become green.
And then you continuously improve or complexify that function until it hits all the requirements.
But if I ask the tool, hey, can you give me a function that does this?
Well, I probably won't have all the same tests that would have resulted.
It's just like writing tests after the fact.
So you can probably ask the tool to write tests, but do you want an AI to write your tests?
Right, exactly.
It's kind of like, who monitors the police or whatever that sentence is.
Right, right.
I totally agree with your framing here.
I mean, I think this is a very good way to look at it.
And I think, you know, what do we want to take for granted is like how I would think
about that.
So for example, JSON decoders.
Do we want to take JSON decoders for granted? Kind of. We want to be able to write them with a lot of flexibility, but we don't want to spend a lot of brainpower creating and maintaining them.
So I mean, if they're verifiable, that's great.
If we can continue to verify them, if we can, I mean, better still, if we can use something
like GraphQL to make sure it stays in sync even better.
But we don't really want to have to think too much about building and maintaining those
low level details.
We want that to just be like, given a decoder that works.
And so this is a very good thing to delegate to AI, in my opinion. Whereas, solve this complex problem that has a lot of edge cases, and a lot of things to consider about the use case, how do we want it to behave and so on: these are the types of things where I think our job as an engineer is still extremely relevant.
Thinking about the user experience.
And in my opinion, I think that engineering these types of things is going to become a more important part of the job.
Sure, these AI systems can sort of do that, and we can tell them: think about the user experience, and think about these different use cases, and reflect that in the test suite you write.
But I think you want a human involved in really artisanally crafting user experiences and
use cases.
And then you want to say, okay, now that I've figured these things out, here's a suite of tests.
And if some AI thing can just satisfy those tests, maybe you're good, you know?
Actually, one of the things that I tried with ChatGPT 3, so maybe it's better now, but I think my point still holds: I told it, please write a function that satisfies these Elm tests.
So I wrote some tests and basically told it to write a function.
And it did so and it was pretty good, but it wasn't correct.
Like there were syntax errors, which I told it to fix.
And when those were gone, well, the tests were not passing.
Some of them were, but not, not all of them.
And the function that I needed was a little bit too complex to be such an easy function to implement, as you said before.
So basically the code that it wrote was pretty hard to read.
And so that means that, okay, I have something that I can use as a basis, and that I need to change to make the few failing tests pass.
But because it was so complex, I was like, well, how do I make the tests pass?
Well, to make the tests pass, I need to change the code. To change the code, I need to understand the code.
So how do we understand the code?
Well, if there's anything you've taught me, or other people in the agile community, it's that you can get an understanding of the code by changing the code, by doing refactoring techniques, so extracting variables, renaming things, changing how conditions work.
And as you do these steps, these tiny steps, because we like them, you start to get some
insights into the code and then you can finally notice, oh, well, this is clearly wrong.
Now I know what I need to change.
And the thing is that I find funny is that this is exactly how you work with legacy code.
But this code is only a few seconds old or a few minutes old, which means working with legacy code is becoming even more relevant, even for this new code, which I find very odd and all the more interesting.
That's a nice insight.
I like that.
I think, I mean, I do think that we need to guide what kinds of results we want also with
these steps, with prompt engineering and priming.
But I think you're right that this does become a sort of process of creating some code that
we can look at its behavior, we can see, we can get a test around it and see that the
test is passing and verify it, but then not really understand the code and need to do
that process of refactoring to get clarity and get it into a shape that fits our mental model, or strips away complexity.
But also like we can say, you know, here's a unit test, make this, like write some code
that makes this test pass.
And we can do some prompt engineering that says, do that using the simplest thing that
could possibly work.
Here's an example of the simplest thing that could possibly work.
In this test, there's this error message that the test is giving, and you write this thing where, okay, for sorting a list, it returns the hard-coded list and that makes it green.
And that's the simplest way it could make that work.
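A sketch of that example in elm-test terms, with made-up names throughout: the test pins down the expected behavior as the verifiable target, and the simplest thing that could possibly work ignores its input entirely.

```elm
module MySortTest exposing (suite)

import Expect
import Test exposing (Test, test)


-- The failing test we hand to the model as the verifiable target.
suite : Test
suite =
    test "sorts a small list" <|
        \() ->
            mySort [ 3, 1, 2 ]
                |> Expect.equal [ 1, 2, 3 ]


-- "Simplest thing that could possibly work": return the hard-coded
-- expected list. The test goes green, and only the next test case
-- forces a real implementation.
mySort : List Int -> List Int
mySort _ =
    [ 1, 2, 3 ]
```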
So you can actually illustrate that with examples.
You can write very long prompts and you can get it to do a sort of fake it till you make
it style process that you can actually understand.
So you can get it to like follow the kind of process you would follow and it totally
changes the results you get.
And if, in addition to that, you connect it to test output and compiler output so it can iterate on that, you can actually automate some of those things, which starts to become very interesting.
I'm wondering whether that would have the same effect, in the sense that if I do this and I only see the end result, which is kind of the point, well, will I have an insight into how this function works? Because I didn't write it.
So now it's just like someone else's code.
And again, if I need to change it, then I need to go through all those refactoring steps, making it easier to understand for myself, or just go read it.
But definitely the thing that I will keep in mind is that all these techniques about writing good code, they will stay relevant.
So if I don't want to lose my job, this is the kind of thing that I maybe should focus on, because I think that these will stay relevant.
Maybe my whole job will be removed.
Maybe I will get fired if it becomes way too good.
But maybe my chances of not being fired increase if I am one of those who are better at these tools.
Like there's a certain philosophy of using tools that I've arrived at through a lot of
craftsmanship principles and TDD and things like that, which is like, I don't want tools
that I can partially trust and I don't want tools that give me partial results.
I want tools that I can completely trust and that allow me to take a set of low level steps,
but think of them as one high level step.
So to me, that's the question.
Now, in the case of making a red test green and a TDD step, for example, like do the simplest
thing that could possibly work.
What if that was an atomic step I could take for granted?
That instead of a set of low level steps, I will look at the code, I will hard code
the return value, I will create a new module with the name that's failing.
It says could not find module of this name.
I will create that module.
I will create a function of the name that the error message in the failing test says
is missing.
I will write a type annotation that satisfies the compiler and return an empty value and
have a failing test.
And then to make it green, I will change that empty value to a hard coded value that makes
the test green.
What if I could just take that for granted and say, hey, computer, do that step, do that
TDD step to make it red and then make it green in the simplest way possible.
And I could take that for granted and then I can take it from there.
That would be great.
And then that's something I can fully trust and I can sort of verify it.
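The end state of that atomic step might look like this minimal scaffold. The module and function names are made up; the point is that each compiler or test error dictated the next mechanical move.

```elm
module Greeting exposing (greet)

-- Produced by the mechanical sequence described above:
--   1. the test says the `Greeting` module is missing -> create it
--   2. it says `greet` is missing                     -> add it, with an annotation
--   3. return an empty value (`""`)                   -> compiles, test is red
--   4. hard-code the value the test expects           -> test is green


greet : String -> String
greet _ =
    "Hello, world!"
```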
And so another principle I've been thinking about in sort of like designing these prompts
and these workflows using AI tools is guardrails.
So like not only verifying something at the end that it did the correct thing because
you can run the Elm compiler, you can.
But along the way, if you can say, OK, like, for example, you can create a new module and
a new function, but you can't touch any other code and you can't touch the test.
The test has to remain the same and the test must succeed at the end.
You sort of set up guardrails and you say, listen, if the AI given these guardrails can
give me a result that satisfies all these criteria by the end of it, then if it does
that, I can verify that it gave me what I wanted and I can fully trust it.
Those are the types of tools that I want.
So one thing that I was really amazed by, I'll share a link to this tweet, but I saw
this demo where this was actually with GPT-3, but this example stuck with me where somebody
was finding that GPT-3 did a poor job if you asked it questions that went through sort
of several steps.
Like if you said, what is the capital of the country where the Taj Mahal is?
Then it would give Agra or something like the city where the Taj Mahal is instead of
New Delhi, which is the capital of India.
So what they did is they did some prompt engineering.
So they gave it a starting prompt, which said, here's an example of doing this thing.
And they kind of gave an example question.
They gave some intermediary questions, where it said, okay, well, in order to answer this
question, I need to answer an intermediary question first.
And so they gave an example of that as the sort of context for the prompt.
And then they said, now answer this question.
Question, what is the capital of the place where the Taj Mahal is, the country where
the Taj Mahal is?
And then it went ahead and followed that set of steps.
Intermediary question, what country is the Taj Mahal in?
Answer, India.
Intermediary question, what is the capital of India?
New Delhi.
Simple answer, New Delhi.
And it got the correct answer.
So it got higher quality results because it was guided to break the problem down into
sub problems.
And pretty interesting, right?
It's a bit funny because I feel like everyone has this: oh, well, you know, I tried ChatGPT and it didn't give me great results, but then I tried to prime it, or I tried to give it this kind of prompt, written this way, and now I get great results.
And I feel like maybe everyone will have their own ideal prompt and people will share it.
It's kind of like, well, I have my set of key bindings of shortcuts for my computer.
Oh, you don't have a shortcut for this action in your IDE?
Oh, let me share it with you.
Or have their own proprietary prompts.
Maybe yeah.
And we will start to have like our very custom experience around working with AI.
It's like, hey, this is how I do it.
And this works for me.
And this might not work for you.
And then it'll be like, here's a very good prompt for generating the best prompt.
I kind of feel like maybe people who use Vim will have something like that.
There's a real art to it.
And I mean, you're getting it to break problems down.
So I took this concept, and I applied this idea of breaking the problem down into sub problems.
I actually had written this blog post years back about solving type puzzles in Elm, like a jigsaw puzzle: frame, then fill in.
So like a jigsaw puzzle, you start by filling in the borders and the corners.
The corner pieces are easy to find.
So you find the corners, then you've got sort of a set of fixed points.
So that's like one low hanging fruit.
It's easy to solve those problems.
Then you find the edge pieces and you can fill those in.
And now that you've got the edges, it makes solving the rest of the puzzle easier, right?
So that's one approach to solving jigsaw puzzles to break it down into sub problems.
But like with Elm types, I kind of, I use this technique a lot when I'm writing Elm
code as a human.
I will, I'll say like, okay, I don't know exactly what's going to fit here in this pipeline.
But I know I want to like take this starting value, I want to do something to it.
And then I want it to be passed through in this pipeline to this next function.
So sometimes I'll just like put a debug.todo there.
And then maybe I'll extract that debug.todo to a function or a let binding.
And then I'll try to get a type annotation for that value I have in my let binding.
That would satisfy the compiler.
Now I've broken it down into a sub problem.
So I took this sort of like, fill in this code, I don't know what goes here, I've turned
it into, okay, I know the type I need.
So that's sort of like finding the edge pieces in the puzzle.
So now I've created the sub problem for myself.
And now I can do things like, so I've got a debug.todo with a type annotation.
Now I can turn that into maybe some function around a debug.todo, or applied to a debug.todo.
So now I'm saying, well, I know that if I apply some function over a list, it will work.
And now I've further broken down that sub problem.
And now I can follow that step again and say, okay, break out another debug.todo,
give that a type annotation.
So now it's some function, I don't know what, and I follow that same process with it.
So it's breaking it down into sub problems.
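The frame-then-fill steps described here might look something like this in code (a hypothetical sketch; `Markdown.parse`, `Block`, and `Heading` are invented names for illustration):

```elm
-- Step 1: I don't know what goes at the end of this pipeline,
-- so I put a Debug.todo there as a placeholder.
headings : String -> List Heading
headings input =
    input
        |> Markdown.parse
        |> Debug.todo "REPLACE ME"


-- Step 2: extract the todo into a let binding and give it the type
-- annotation the compiler expects. Now I have a smaller sub problem.
headings2 : String -> List Heading
headings2 input =
    let
        extractHeadings : List Block -> List Heading
        extractHeadings =
            Debug.todo "REPLACE ME"
    in
    input
        |> Markdown.parse
        |> extractHeadings


-- Step 3: partially fill it in, pushing the Debug.todo one level
-- deeper, and repeat the same process on the new, smaller hole.
extractHeadings : List Block -> List Heading
extractHeadings blocks =
    List.filterMap (Debug.todo "REPLACE ME") blocks
```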
I use this technique all the time when I'm like, often with like API design, you're doing
weird type puzzles, but also just with like user code, like trying to parse some markdown
and take the markdown blocks and find and traverse them and map them into this type
and whatever.
So I use this technique a lot.
Now I tried to use this sort of same prompt engineering idea to teach GPT how to follow
this set of steps of breaking down a type puzzle.
And it was actually really good at it.
So I'll share like a little GitHub gist of my GPT prompt and the results it got.
But what I basically did is I told it, I said, your job is to solve this type puzzle.
And I gave it some guardrails.
So like the guardrails I gave it were, I'm going to give you some code, which has debug.todo
replace me in it.
Your job is to, you know, satisfy the type for that debug.todo.
And your final solution cannot include debug.todo.
You can write intermediary steps, which may include debug.todo.
And you are not allowed to change any code other than the section that says debug.todo.
So I gave it these guardrails.
Also these are verifiable things, right?
So you can test for this to see if it's given you a valid solution, given the guardrails
you wanted it to honor, because it might hallucinate and not do that, but you can check that.
So one thing that I'm thinking of is, will you be able to verify things accurately?
Maybe I'm being too, I'm trying to play the devil's advocate here a bit, and I might be
a bit too hard on ChatGPT.
But for instance, whenever you're working in that TDD style, when you do things one step
at a time, you discover edge cases, right?
So for instance, you have a function that takes a list as an argument and needs to return a number.
And first you hard code that number.
And then you notice, oh, well, what if that list is empty?
Oh, well then I need to write a test for if the list is empty.
Ah, okay.
So I'm going to do that.
But the, the AI might not do that, might not notice that.
So maybe it is going to fix things correctly.
Maybe not, but let's say it's going to do it correctly, but it's not going to have a
test for that.
Or you're not going to know, I mean, you are not going to notice that you need to write
a test with an empty list.
So the process is a bit hard to figure out if you don't do it yourself.
It's kind of like, I think, also one of the reasons why people say, well, you should pair
program rather than review someone else's code, because you will discover those things
while you're working.
Yeah, totally.
Yeah, I agree.
I think that's something that's going to play out more and more these days.
I think we're going to see like a lot of junior developers using these AI tools in exactly
the kind of way you're describing where maybe they trust it too much to do too many steps.
And then what happens is you're not really engaging with the problem intellectually, and
you're not thinking about these test cases that come up.
So I think there's an art to knowing when to use it.
So like the type of problem here, this sort of frame-then-fill-in problem I'm talking
about, this is a class of problem that I find myself spending a lot of mental effort
solving on a regular basis.
That is kind of this, you know, it has this quality we talked about with the traveling
salesman problem, where you know it when you see it: if you have a good solution, you
can easily verify that it solved it.
And yet it's not really doing anything too creative, because it's fitting pieces together.
It's not really writing too many implementation details.
And I find that often with Elm code, you arrive at these types of situations where like, if
you could make the types work, you would trust that the code worked because like, there's
only really one good way to fit these functions together to get the right thing.
Like you're not doing too much logic of, like, give it this empty value, give it this
or that.
Now, it might hallucinate these types of things.
But you could even use prompt engineering to tell it like, you're just fitting these
functions together.
I don't want you to write new logic or use default values.
So I think these types of things can be improved through prompt engineering and also through
selecting the right types of problems.
But like, for example, I gave it a pretty challenging problem with this sort of prompt
I designed.
I had it fill in this thing in an elm-pages example that I have, where it takes markdown
blocks and traverses these headings.
And what I did is I primed it with a list of all of the available functions that it could use.
And another thing you can do with a set of guardrails is you can say: only use these functions.
And you could even say you must use these functions.
And these other functions are available and you may not use any other functions.
And of course, these things are verifiable in the in the final output easily.
But why would you tell it it can only use these functions? Because you're now limiting
its creativity.
For instance, if you forget to mention a function, then it's not going to use that one.
And it's going to use something else, like List.foldl instead.
Well, so the way, basically, I was doing this as a proof of concept of saying: I'm
going to give you all of the functions that are in scope in this particular code base.
And I'm going to give you, like, all of the functions from the List module and from
the Result module.
So you can give it, like, a certain set of available functions and say, these are the types.
And I taught it, like, I even taught it how partial application works.
So I can take an example.
And I said, given this type, if you pass in this function, now the type annotation is this.
And I played around with it and tweaked it in these ways.
And it did a better job at solving these puzzles.
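The partial application examples in a prompt like that might look something like this (a sketch; the exact wording of the original prompt differs):

```elm
-- List.map on its own has this type:
--     List.map : (a -> b) -> List a -> List b

-- Applying just the first argument specializes the remaining type:
lengths : List String -> List Int
lengths =
    -- String.length : String -> Int, so here a = String and b = Int
    List.map String.length
```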
So it's pretty interesting.
You can kind of teach it in these prompts.
But you don't want to limit its creativity too much.
So but I get your point.
I wanted to solve a sort of uncreative problem of wiring things together. Like, if I
could automate that as an atomic step, I would love that.
You could even think of it this way: AI, give me all of the possible solutions using
the functions in scope here to satisfy this type, right?
Just like, and then you can just select, okay, these are all the valid ways to do that.
So if I can teach the AI, through a GPT prompt, to solve that type of puzzle and give
me all the possible results, and then I can use a tool to actually verify those results,
then I know I can 100% trust them.
That becomes a very interesting tool in my toolkit, where now I'm more capable as a programmer.
And I'm not sort of like hallucinating weird things into my code base, or as you said,
like having it write out these new cases that I'm not considering in my testing or considering
how it affects the user experience.
And if that's what I want the experience to be, it's just like, how do I solve this type
puzzle?
Oh, yeah, that's the only sensible, meaningful solution, and it found it.
And I can easily look at it and say, oh, yeah, Result.map, Tuple.mapFirst, List...
Yes, it didn't use anything too crazy.
It satisfies the types.
It's good.
I can easily look at it and see it did the right thing.
But it would have taken me a long time to solve it myself.
And so in case it wasn't clear, my idea is, this is something I want to experiment with
a little more, to fully automate this with the OpenAI API.
Because it's a pretty easy thing to automate to just say: what are the types of all
the functions in scope here?
Like, that's an automatable thing.
So feed that into the prompt, and give it the ability, through writing a small script
that hits the OpenAI API, to be told what the compiler error is and iterate.
So when I gave it this problem as a proof of concept, I manually told it the compiler
errors, and within two iterations of telling it the compiler error, it was able to get
the correct result, which I thought was pretty cool.
So one thing that for now I'm feeling is a bit of a problem, but it will probably be
solved very soon, is that, for instance, at the moment we do everything through ChatGPT,
because that's a lot better than Copilot so far.
But that means that every time you ask a new problem, you need to prime it again, you need
to give it the context and the rules.
So for instance, if you want to say, well, I have this problem here that I'm trying to
solve and I don't want to spend mental power on doing it.
So I'm asking the AI to do it for me, but I need to give it the functions in my codebase.
Well, that's a lot of work.
And so for the moment, when we do it with ChatGPT, you always need to write those functions in the prompt.
But at some point, I'm guessing that they will be able to analyze your codebase or your
source file, and then it will be able to solve the problem.
So a lot of our work as developers is not writing new code, it's editing existing code.
So I feel like that's going to be somewhat missing now, but it's probably going to be
solved pretty soon.
The only question is how much input you can give those AIs, and that's pretty low at the moment,
I think.
Actually, GPT-4, can't it take like 25,000 tokens as input?
So you can feed it a huge input.
That's nice.
That has improved more.
That's a game changer.
Yeah, yeah.
So you can feed it all of your documentation or all, you can give it huge inputs.
But until then, I'm thinking, for instance, if we had a tool that extracted information
from your codebase, which you could then just copy paste into the prompt, that could
be pretty cool.
So like having all the types for the functions.
And there is a tool that is called Elm Review, which has an extract feature.
Yeah, exactly.
To get whatever information is needed.
But yeah, that's still going to be quite annoying.
Like you probably don't want to send the types for all the APIs in your codebase.
But maybe that could be interesting.
Unless you automate it, and why not?
It gives good results.
Yeah, what I mean is more like, if you need to give too much input, the AI will not be
able to parse that or understand that, or it's just too big.
Like, I work on a 200,000-line codebase.
But how many type annotations does that represent, right?
Like hundreds of thousands, probably?
You think so?
That are in scope in a particular point in code?
Well, you can import anything.
Not everything is exposed, probably.
Yeah, exactly.
And a lot of things are going to be values that are locally scoped to a let. Like, I
would venture to guess, in a 200,000-line codebase...
With code comments?
Or maybe there are a thousand.
Maybe there are, like, on the order of 1,000 to 10,000.
Definitely no more than 10,000.
Yeah, no, if you only include the exposed things or the things in scope of that file,
then yeah, that's, that's a lot less.
Oh, there's also dependencies, but that's also not that much.
Right, maybe it's doable.
And you can use heuristics to filter things out, and not even heuristics, like, reliable
rules to filter things out.
You know that if you're trying to arrive at this type, some of the types just don't
connect in any way.
So if you can take a string and use it on this thing, and you can take an int and turn
it into a string, then those two things connect.
But if this custom type can't be turned into a string, then you know it's not going
to be involved in your problem at all, because your problem is dealing with strings,
and you can't turn that type to or from a string, for example.
Yeah, you can also remove all the modules that would create an import cycle.
So basically, you can't import anything that imports this file directly or indirectly.
So to me, this is the really interesting intersection.
So now you were mentioning earlier that you think that these tools will start to understand
your code more over time, but we're not there yet.
I actually, I don't see it that way.
I believe that these tools are going to continue doing what they're doing, which is that they've
been remarkably successful in being able to solve complex problems just through this crazy
synthesis and predictive text sort of thing.
No, I didn't mean it in the sense of understanding.
I meant it just in the sense of gathering the information.
Like every time you ask something to ChatGPT, you need to provide information about your
code base.
At the moment, it does not look at your code base.
I see.
I see.
Right, right.
But I don't think that they will, for example, know type information about your code base,
except insofar as it's part of the predictive text because it's in there.
But I think that they're getting enough mileage solving problems through this sort of predictive
text, that they're going to keep going with that.
But I think the interesting intersection, especially with typed pure functional programming
languages is if you, so humans have their role, these sort of like compiler tools and
static analysis tools have their role, and these AI tools have their role.
So with this trifecta, I think each of these pieces needs to do what it is best at.
Compilers are good at verifying things.
Humans are good at, do we even need humans anymore?
Humans are good at critically thinking, guiding these tools.
Humans have goals.
Humans are good at gathering requirements.
I'm not going to say they're good at it, but at the moment they're better than a machine.
And they have to because humans have goals.
The AI's job is not to have goals.
Humans have goals for humans.
When a machine wants to make a program for a machine, then it can do it on its own.
This is absolutely not discrimination that I'm mentioning.
The human is the customer.
The human is the one that gets to say whether you solved the problem or not, that gets to
make the calls of what the problem you're solving is.
So that's like, the human needs to do that.
There's no substitute for that.
Because as you said, if the customer is a machine or an API or something, then you can
automate it.
So the human only asks, well, I need this, and then the machine can do the rest.
And you can have these feedback cycles with compilers and all kinds of test suites.
So yeah.
So that trifecta is what becomes really interesting to me: the human sets the goals
and can sort of validate these criteria and accept or not accept them.
The compiler is a verification tool.
It is a tool for giving information through static analysis, information that is
guaranteed correct, and for checking that information.
Elm Review and other static analysis tools can provide similar inputs and verifications.
And AI can piece things together using those guardrails and inputs and verifications provided
by those other external systems.
So when those three things are sort of interacting, then I think it becomes really interesting,
especially, as I said, when we are using these things to create higher level building blocks
as humans.
So we can say, give me a decoder.
And I know that it satisfies these things.
And I don't have to use brainpower to check that because I know it's an automated verification
of that.
So I can trust it.
Give me a fake it till you make it simplest thing that could possibly work green test
for this test case and give it guardrails that allow me to trust that it's not going
beyond that and filling in implementation details.
Then you can actually trust these things.
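As a sketch with elm-test, that fake-it-till-you-make-it setup might look like this (the `total` function and the test are hypothetical examples, not from the episode):

```elm
module TotalTest exposing (suite)

import Expect
import Test exposing (Test, test)


-- The guardrail: the AI may write `total`, but may not touch this
-- test, and the test must pass at the end.
suite : Test
suite =
    test "total sums the prices in a cart" <|
        \() ->
            total [ 100, 250 ]
                |> Expect.equal 350


-- The simplest thing that could possibly work: a hard-coded value.
-- A second test case (say, an empty cart) would force a real
-- implementation like List.sum.
total : List Int -> Int
total _ =
    350
```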
And yeah, well, there's one question of: do you even need a compiler or type checker,
and the linter, and the test suites?
Could you not just ask the AI to verify things?
But then it comes again to the point of, well, who monitors who?
How do you trust the right checks?
Right, exactly.
And at the end of the day, we do trust the compiler.
Now, that said, it is possible for the compiler to have bugs and it can.
But for all intents and purposes, we fully trust the compiler.
We fully trust Elm Review.
Of course, it's possible for these things to have bugs.
But I think that's a good assumption.
Whereas with AI, I don't fully trust it unless I verify it.
The thing that is very important for me with regards to the compilers and linters and test
suites is that these are consistent.
Like if you run the same code, if you ask the compiler to verify the same code, it's
going to give you the same results.
If you run the same code in a test suite, it's going to give you the same results.
If you ask the AI to review your code, like, hey, are there any consistency issues that
the linter would tell me, for instance, then from one run to another, it could tell you
different things.
It's kind of like asking a human, hey, can you please review this code and tell me how
you can improve it?
Well, if I ask you today to do this seriously on my code base, you're going to find a lot
of problems.
If I ask you tomorrow to do it again from scratch, you're going to give me a whole
different set of problems.
Like, oh, well, I think I can rewrite this function.
But maybe it's good enough.
Linters, when they're dealing with consistency, they give you a certain minimum of consistency
of code that is written in a specific way.
And it could go higher, probably.
Like you want all your functions to be named in a very similar way, for instance, but that's
probably a bit too hard for a linter.
An AI would always tell you different things, and we don't want that.
So we need these to be trustworthy and consistent in the sense that it doesn't give you different
results every time.
And the lower level the task, the more we can trust it.
Just like Elm types, because the type system is so simple, it's easy to trust it.
Whereas TypeScript, it's so permissive, it's hard to trust it.
And there are so many caveats and exceptions that it's hard to trust such a complex and
permissive system.
So I do think that this might be a superpower of Elm.
And honestly, I think that maybe this could be a really appealing thing about Elm that
makes it more mainstream.
That, wow, this language, it turns out it's really good for automating and tooling.
And you know what?
Automating and tooling is really hot these days because people are building all sorts
of AI automation.
And we can have trusted AI automation.
So I think we're at this early stage where people are just sort of letting AI just write
their code, which is kind of crazy.
They're letting AI just execute shell commands for them.
I saw a recent thing where somebody, like...
We all knew it was going to happen when you start letting AI just fill in commands in
your shell.
But somebody basically, like, rm -rf'ed their drive, or broke their computer, and was
trying to fix it.
We all knew it was going to happen.
It's kind of a crazy state of things, right?
But if we can have tools that we can really trust and not have to worry about it doing
anything that's going to put things in a bad state or go beyond the scope of what we're
trying to do, just like perfectly reliably solve a difficult problem that we can now
take for granted.
That's awesome.
And I think Elm is a really good fit for that.
I've also heard the opposite point of view where this could be pretty bad for Elm or
for smaller languages in the sense that the AI is trained on code that is available.
And there's not a lot of Elm code out there compared to more mainstream languages.
So this could make adoption of new languages harder, or of smaller languages in general.
But as you said, if there are guarantees like the ones that Elm provides, that can even
out the playing field.
But if you're designing a language that doesn't have the same guarantees as Elm, and it's
just very new or very small, then you get kind of the worst of both worlds.
And this all depends on writing the tooling, right?
And so I think we have an opportunity to build really cool stuff leveraging these techniques
right now.
So I'm definitely going to be playing around with that.
Like I've got a lot of ideas.
I want to make this sort of automated type puzzle solver.
I think, you know, having it build JSON decoders starts to become really interesting where
like Mario and I were working on this Elm HTTP fusion thing, which is really cool for
like having a UI where you make an HTTP request, and then you can sort of click the JSON fields
you want and it generates a decoder.
It's like, that's great.
But what if you can tell it the type you want and it can figure out what fields to get and
generate something that is provably correct because you actually ran it and verified it,
and then you can fully trust it, but it just solves your problem.
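A decoder produced that way might look like this (a hypothetical sketch; the `User` type and JSON fields are invented for illustration):

```elm
import Json.Decode as Decode exposing (Decoder)


-- The human only specifies the target type they want...
type alias User =
    { name : String
    , age : Int
    }


-- ...and the tool proposes a decoder, which you can verify by
-- actually running it against the real response before trusting it.
userDecoder : Decoder User
userDecoder =
    Decode.map2 User
        (Decode.field "name" Decode.string)
        (Decode.field "age" Decode.int)
```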
And it sort of can solve that last mile problem where like, there are so many things I've
been trying to automate where it's difficult to do that last little piece of the puzzle
and AI can do that missing piece.
So I think this unlocks some really cool things.
I've been thinking about like some other use cases I'm thinking about are like, so for
example, like with Elm GraphQL, you know, we've talked with Matt about Elm GQL, which
sort of tries to be a simpler way of just taking a raw GraphQL query as a string.
And it's very easy to interact with GraphQL APIs through this raw query string.
And then it can generate type aliases for you of the response you get, and you just
paste in your GraphQL query string and it spits out some Elm code to execute that query.
And the trade off with that approach, in Elm GQL versus Elm GraphQL, as we talked about
in our Elm GQL episode, is with Elm GraphQL, you have to explicitly write everything
you're decoding in Elm code.
But you can maintain it in Elm code, and you get more fine grained control over the
types you decode into.
So there's a trade off.
With Elm GraphQL, you get complete control over the fine grained code and the types
that you decode into.
But what if you could just tell an AI tool: generate an Elm GraphQL query?
And using this sort of type puzzle solver I built, I can say here are all the functions
for generating Elm GraphQL types, solve this problem.
And here's the raw GraphQL query.
And here is the resulting Elm type I want.
And it could, I think it could solve that pretty well.
So some of these tools become more interesting when you have that extra bit of glue
from AI.
And that would solve all of Elm's problems, because all of Elm's problems are boilerplate.
It's boilerplate that's really easy to maintain once you have it.
So if it's very easy to confidently write boilerplate, then yeah, Elm becomes a lot
more exciting.
If we take your last example, it does mean that you redo the same logic every time,
and not necessarily in a framework- or library-oriented way.
So you would redo, you would inline the creation of the GraphQL query and the decoding,
instead of using a pre-made library, which simplifies the API for that.
But it could be very interesting nonetheless.
I think part of the challenge right now to using these tools effectively is like defining
the problems and the workflows to leverage these as much as possible.
Another thing on my mind here is like refactoring.
So we have, you know, built-in IntelliJ refactorings for, like, extracting a function
to a module. Like, what kinds of refactorings should we invest in building into IDEs
or language servers, versus using AI?
I mean, we could also just ask an AI to write those things to be integrated into the
IDE, for instance.
So for instance, if you go back to the linter example, I don't want an AI to review my code
because it's going to be inconsistent.
I can ask it to write a linter rule once and then I can run that linter rule multiple times.
But yeah, I definitely agree that there are cases where you will want to have a transformation
using AI rather than one that is hard-coded one way or another in an IDE.
That could be interesting to find.
And I mean, I don't know.
I'm very bullish on what we can do with these AI tools.
But I'll have you ask yourself whether you should.
Well, that's another question.
The thing I'm bearish on would be just saying AI build a plugin.
You know, people are, there's a lot of hype around like it built a Chrome extension for
this thing.
It built a whole app from a sketch on a napkin.
And so it's like, okay, that's very impressive.
It's very interesting, but I am skeptical of how useful that is actually going to prove
to be.
Like, I don't feel like that's what's going to take people's jobs away.
I don't feel like that's what's going to replace the work we're doing day to day.
I think it's these more mature things that we can really rely on where we're choosing
more constrained problems to do higher level operations and verify them and put guardrails
on them.
Like, I don't know.
I think that's my personal bias and obsession and people will get over that and not worry
about that and be able to do cooler things than I can do.
That's very possible.
I admit that's a possibility, but that's where I'm putting my money.
So like having it write the IDE completions for extracting functions and things like that.
It's like, the hard part isn't writing the happy path.
I can write the happy path of that.
I've actually done that in IntelliJ refactorings.
The hard part is everything else that it's not considering.
And if I have to babysit it to make sure it solved each of those cases, I may as well
do it myself.
Cause like the things it's going to miss, the things that I don't trust that it did
and I have to go check myself, it's easier to do them myself and engage with the problem
and solve it my way and know that I accounted for every corner case and wrote a test for
it than to just trust the AI and be like, okay, now I have to go check everything that
it did in this crazy, impossible to understand code.
That's not the way I would have solved it.
But if you paired with the AI...
That's the direction I think things are going.
Just like tell it very high level instructions.
But every time you give instructions, there's some bias, right?
So at least so far, the AIs, they're always very confident.
Which is problematic when they're actually wrong.
But they're also never going to say no.
Well, unless you ask for things it's been trained to say no to.
I've seen a lot of people in the Elm Slack ask questions like, how do I do X, or how
do I solve this problem?
And there's often that XY problem.
Like, you ask for a solution to X, but you're actually trying to solve a different
problem, Y.
And so even if I imagine that the AIs will become extremely, extremely good, like 100
times better than you and me combined, they're still only going to solve the problem
that you're asking them to.
Just like, let's imagine it's the smartest person in the world that you have free access to.
Well, if you ask them something and they don't think about whether it makes sense to
ask the question, then they're not going to tell you.
So you need to prompt them, but you also need to think about how you ask the question, what
question you ask.
And I'm thinking maybe we should ask them as well.
Like, Hey, I have this feature that I want to do.
So can you tell me how I transform this HTML to a string?
And maybe you should also ask, like, does this make sense by the way?
Because then they start answering that question.
Well, no, that doesn't make sense.
As I said, we're good at gathering requirements, well, we're not very good at it, but
that is our job.
And I think it will increasingly become our job.
So we're going to become AI managers.
I think that's a good way to look at it.
AI product owners.
Yeah, totally.
I think, and what you're talking about, the word that's been coming to mind for me is that
these AI engines are very suggestible.
Like if you say, I don't know, where might I have lost my keys?
Hint, I often forget them in my pants that I put in the hamper.
Then it's going to be like, are they in the hamper?
But it's going to run with it. I've seen that with the Elm compiler, which sometimes
says: hint, maybe you need to do this.
And then it's like, okay, sure, let me try.
And it gets fixated on this one path that the compiler sort of hinted at, and that's
not a good path.
So that's why with this type puzzle, I was trying to give it a very clear set of rules
and say, this is the set of steps you're following.
And then even teach it, this is how you adjust if you are incorrect in one of your guesses.
And so you really have to prime it and prevent it from getting fixated and biased in
one direction.
But you also set some guardrails, and if you were wrong in setting those guardrails,
that's going to be a real problem for you.
Oh yeah, absolutely.
And it is, I mean, these AI engines are also, they're interesting for ideation as well.
So there, I mean, that's a whole nother topic we could get into, but.
We mostly talked about using it for things that we know well and that we can validate and
verify, which I completely agree is probably the way to use it.
But it is also very good at helping you out when you don't know something.
And there it becomes a lot more dangerous because it's overconfident and it's going
to lead you to wrong results, wrong paths, and you're not going to be able to figure
those out.
But because it knows a lot more than you, it will, I think, in a lot of cases be used
in that way.
And there, people have to weigh the risks that are involved.
So definitely in some cases, it's going to be amazing.
For instance, I am not good at drawing, but I can ask an AI to draw something for me.
I actually do have a whole website filled with drawings, but I probably shouldn't train
it on that.
But yeah, if I ask the AI to do it, then that would probably give some better results.
But when it comes to code, if I can verify it, then it's better.
If I can't verify it, then it's something new to me.
Well, that is very interesting as well.
And the thing that I'm worried about here is that if I ask the tool to do something
that I don't know how to do, I might start over-relying on
it instead of learning properly and improving my own skill set.
I think that's going to happen a lot with a lot of people getting into development right now.
And yeah, as an experienced developer, it's a lot easier to know what to rely on
it for.
Whereas if you're just starting to write code and you let it write a regex for you, you're not learning to write
a regex yourself.
And you probably should figure that out instead of just blindly trusting the thing.
Or maybe it's okay to just be like, if the test passes, I don't really care how it arrived
at that.
Maybe that's okay too.
You know, but yeah, I can for instance, imagine a backend developer who knows a little bit
of Elm and they just ask the AI to generate the UI for their application or at least the
view parts of the application.
And that's going to be very helpful to get started.
But how do you make sure that things are correct with accessibility and all those concerns
that you don't know about?
Right, exactly.
Is it going to fit well with the design system you set up?
And there are all these assumptions that, yeah, so you have to know what to rely on
it for.
If you can have it perform a high-level task that you can fully verify
and trust it for, that's interesting.
If you can have it help you with ideation and generating a list of things to think about,
and that's input for you to consider some other things, that's also very good.
Because if something is helping you with ideation, you can filter out a little
bit of junk to find the diamond in the rough.
Oh, this one idea, I didn't consider that.
And that was really good.
So that's another use case.
But the sort of in-between space where you just YOLO it and blindly incorporate it into
your code, I'm honestly pretty skeptical of the utility of that.
And I'm skeptical of how maintainable it's going to be working with systems like that
and maintaining code where there's a lot of that happening.
I think it's going to be okay for things that you're okay with throwing away.
Well, that you're okay with and that you can throw away.
Yeah, yeah, right.
Yeah, if you can scope something down really narrowly.
I used it the other day for writing something to traverse a directory structure to find
the root Elm project by looking until it found an elm.json file.
For my elm-pages scripts, I changed it so you can do elm-pages run and then give a file
path, and it will find the nearest elm.json to the file you pass in.
And I wrote it with GPT-4 and I went through a few iterations and I guided it very clearly
with what I wanted in the result.
But I knew it was like, this is going to generate one function for me that if it works, I can
just let it do its thing.
Although I didn't like the style it used.
So I told it, instead of doing a bunch of for loops and while loops, can you do it using
functional style mapping and recursive functions?
And it modified it.
And then I said, can you use ESM imports instead?
And with a few tweaks, I had it refactor it to the style I wanted.
And so yeah, it was like a constrained thing.
And the next time you do that, you will prime it with, oh, use a functional style and use
ESM, etc.
And that was a constrained enough thing that I know from my experience that it's
an independent, separable problem: if it writes a function that does this, I
can use that, and it can be useful in my workflow.
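For readers curious what that kind of function looks like, here is a minimal sketch in the style described (ESM imports, recursive functions instead of while loops). This is a hypothetical reconstruction, not the actual elm-pages code; the function name `findNearestElmJson` is made up for illustration, and only Node's built-in `fs` and `path` modules are assumed.

```javascript
// Hypothetical sketch: walk up from a starting path, looking for the
// nearest ancestor directory that contains an elm.json file.
import * as fs from "node:fs";
import * as path from "node:path";

// Returns the nearest ancestor directory containing elm.json,
// or null if we reach the filesystem root without finding one.
export function findNearestElmJson(startPath) {
  const dir = path.resolve(startPath);
  if (fs.existsSync(path.join(dir, "elm.json"))) {
    return dir;
  }
  const parent = path.dirname(dir);
  // path.dirname of the root is the root itself, so stop there
  // instead of recursing forever.
  return parent === dir ? null : findNearestElmJson(parent);
}
```

Because the recursion also works when `startPath` is a file (joining `elm.json` onto a file path simply fails the existence check and moves up to the containing directory), the same function handles both a file path and a directory path.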
So I think there's an art to knowing when to rely on it as well.
I feel like we have a lot more to talk about, a lot of interesting aspects to cover, but
this has already been a quite long episode.
So I think we're going to cut it here.
Maybe we'll come back to this topic in a few months.
Who knows?
Let us know if you like it and if you want more of it.
And tell us what you've been doing with Elm and AI, or pure functional programming and AI.
We would love to hear from you.
We'd love to hear what clever things you come up with or just how you use it in your workflow
and let us know what you want to hear about with Elm and AI in the future.
Did you prompt the audience well enough so that they give you the answers that you're
looking for or do you need to rephrase it slightly?
Maybe let's give them some guardrails.
Give us your example use cases.
Give us an example of a problem you used it with.
There we go.
I think we're good.
Don't give us ideas if you think they're bad.
Okay, that should be it, I think.
Well, Jeroen, until next time.
Until next time.