Episode 297 - Ralf D.
Welcome everybody.
We stay at the intersection of AI and software architecture with Ralf Müller’s talk on the future of software architecture.
Please welcome him with a warm applause.
Thank you and welcome.
In the previous talk, Stefan Toth told you about the current state of agentic software development and agentic architecture. I will now try to take a look into the future: how GenAI and LLMs will change the way we code and how we will do software architecture.
So when we think about software architecture, the usual definition is that the architecture is the sum of all decisions that are hard to change and that determine the success or failure of the project.
So what happens if the LLM makes decisions that used to be hard to change easy to change?
For instance, what about the choice of the programming language?
I think that with LLMs, the option is now on the horizon to simply rewrite our software in another programming language, to migrate from a language nobody programs in anymore to a modern one. That could be done with LLMs.
But this also applies to small parts of your software. Here’s a post from Ingo Eichhorst who says: hey, I don’t care about the programming language anymore.
Let the LLM decide what’s the best language for the task.
The LLM will implement it and the LLM will take care of it.
So it’s no longer about what the best language for the team is, or the best language I know.
It’s more about what the best language for this task and for the LLM is.
So it might be that there’s a great language for a certain task but the LLM doesn’t know about it.
Then it’s not a fit.
But as you can see Ingo is fine with it and his repositories contain lots of different languages.
He also follows the approach of lazy architecture, because architecture is also about having options, options to change something in the future.
In the past, we thought about the architecture up front: which kinds of components we need.
And Ingo says: hey, I don’t have to set up Terraform, Puppet, or Ansible up front when a small bash script might be enough.
And again it’s handled by the LLM.
And if it grows bigger I can still decide whether to use a bigger product and let the LLM convert it.
And when we think about those decisions in software architecture, I often use the analogy of real architecture.
Now imagine you have a building, a house, and you have the floor plans, the architecture, the outer walls, the height of the ceiling.
And yes it’s not easy to change at the moment with the current technology.
But imagine that you could tear down such a building and rebuild it with a slight change in the architecture within one day.
Maybe in the future with those concrete 3D printers you can just tear it down again, rebuild it again.
So you don’t have to change the existing software anymore, but you might decide to change the whole plan.
So going back to software, this is currently called spec-driven development.
So I try to take a look into the future, but the future is approaching us so fast that yes this is already reality.
Stefan told us about spec-driven development.
So in the past we had the code and the code compiled to an executable.
And we iterated on the code, deleted the executable, changed the code, rebuilt the executable to match the specs.
Now people start to work on the specs: they build the code from the specs with an LLM, test the executable, throw away the code, change the spec, iterate on the spec, and try again.
This is currently reality.
People are doing it for fun in their spare time.
There was a post this week on LinkedIn where somebody created a spec for Mario Kart and invested ten hours in each iteration.
He was already on the sixth iteration.
The results were good.
You always change the spec, let the LLM create the code, and then execute it.
We had a question about what’s the effect on agile development.
So I was once a developer and I loved to code.
But now I don’t put my cursor in the editor anymore.
I only talk with the LLM, with the AI, create specs, create prompts.
And so I moved from being a coder to being a product owner.
And if I am now the product owner and I have the AI to build my code, I think we will see a change in how teams work.
And I’m not sure whether the agile approach is then still the right one.
I think this will change.
I don’t know in which direction exactly, but it will change.
And when we work as architects, we make decisions, lots of decisions.
And to make those decisions, the LLM helps us with its knowledge, with its solution space.
It has a big, huge solution space.
But to work with this solution space, I have to know what the solution space looks like.
And sorry, this diagram was created by Lisa Maria Schaefer, and it’s in German.
So just a few notes about it.
For instance, we have a knowledge cutoff.
You all know that when you use tools like GitHub Copilot, the LLM only sees one repository.
We have the luck that it seems that all of our open source code is in the training data.
That’s great.
Other industries don’t use open source and therefore don’t have their knowledge inside the AI.
So what are the consequences of this knowledge, this big solution space?
The LLM sees mostly the current project.
You can configure your Visual Studio Code in such a way that it sees more, but internal libraries and modules are not visible.
So if you created a wrapper around, for instance, Log4j, the LLM will see the wrapper, but it will not understand what the wrapper is for.
If you have microservices, with each microservice in its own repository, how should the LLM know the environment, the context, of these microservices?
You somehow have to tell it, or you should go back and model your system without internal libraries.
Try to use open source.
Open source was never a bad idea.
The LLMs have knowledge cutoffs, and often the cutoff lies more than a year in the past.
That might be a problem.
I know many very good developers who complain that the AI destroys their source code.
It’s not capable of helping them.
And I think it’s mostly because they use state-of-the-art libraries about which the LLM doesn’t know yet.
Even if it’s just a newer version than the one in the training data, the LLM will run into problems.
But regarding architectural decisions, I think it’s a valid decision to check what the latest version is that the LLM still knows, and to stick with that version long-term.
Just apply the security fixes, and the LLM will be more capable of helping you.
I think this is an important decision.
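To make this concrete, here is a minimal sketch of what such pinning could look like; the Maven setup and the Spring Boot version are just assumptions for illustration:

```xml
<!-- Hypothetical example: pin to a version line the LLM has seen in training,
     instead of always jumping to the latest release. -->
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-web</artifactId>
  <!-- Deliberately an older, model-known version; only bump the
       patch level for security fixes. -->
  <version>3.2.5</version>
</dependency>
```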
Yeah, we know LLMs mainly know open source, so use open source.
It makes so much sense.
Peter Hruschka just told us about special libraries.
Yes, there are those cases where you have to create your own libraries, but then you have to find ways to tell the LLM about this library.
This is also valid, but you have to be aware that you then step outside the knowledge space of the LLM.
And yeah, people who know me know it: I’m deeply into Docs as Code.
And we already heard about arc42 this morning.
Architecture documentation as Docs as Code is now really paying off.
The puzzle pieces are coming together.
Because now, if I open up a repository and let my AI agent explore it, it says: hey, there’s a src/docs/arc42 folder with the whole architecture documentation in it.
Great.
It’s much harder for the LLM to read some Word files or to go to Confluence or something like that.
So the Docs as Code approach is really great for the LLMs.
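Just to illustrate what the agent finds, the layout could look roughly like this; docToolchain’s defaults are similar, but the details vary per project:

```
repo/
├── src/
│   ├── main/...                  # application code
│   └── docs/
│       └── arc42/
│           ├── arc42.adoc        # entry point of the architecture docs
│           ├── chapters/         # the arc42 chapters as AsciiDoc files
│           └── decisions/        # ADRs, one file per decision
└── README.adoc
```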
Yeah, and when it comes to the lifecycle and maintenance of code, there’s also an interesting shift towards temporary and ephemeral code.
I don’t like that word, so let’s say short-lived code.
Hands up, who of you still type in bash commands?
And who of you uses the AI instead and just describes what you want to do, what kind of files you want to find?
This is really short-lived code.
You ask the LLM to create code for one execution.
Wow.
And we do it more and more often.
And also temporary code.
People are talking about downloading tools from the latent space, from the world knowledge.
So for simple scripts, just ask the LLM to create them.
And it’s not worth creating a library from them, because they’re just for this project at the moment.
That’s enough.
We don’t need more stability or something like this.
So this is really temporary code and everybody can recreate it.
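To sketch what I mean, this is the kind of one-off script you would let the LLM generate, run once, and delete; the 100 MB threshold is just an example:

```python
# Throwaway script: list all files over 100 MB below the current directory.
# Generated for a single run; not worth keeping or turning into a library.
import os

THRESHOLD = 100 * 1024 * 1024  # 100 MB, an arbitrary example value

for root, _dirs, files in os.walk("."):
    for name in files:
        path = os.path.join(root, name)
        try:
            size = os.path.getsize(path)
        except OSError:
            continue  # skip broken symlinks and unreadable files
        if size > THRESHOLD:
            print(f"{size / (1024 * 1024):8.1f} MB  {path}")
```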
There are different tools, even web-based vibe coding tools, for cases where you need, say, a small tool which collects data from friends about who brings what to the party.
You don’t have to Google for such tools anymore.
You use a tool like Spark, ask the AI to create such a tool, and there it is.
And that will bring up other problems, because if everybody does this, then, wow, we’ll have many of those tools just lying around.
And I heard about self-healing code, and I often wondered: hey, what do they mean by self-healing code?
So I opened up the Microsoft Bing search engine in my Microsoft Edge browser, opened the developer console, and noticed: wow, what’s going on here?
Do they test this software?
I don’t know.
Then I recognized this one symbol here, down at the bottom.
It says: ask Copilot to explain this error.
Wow, cool.
I clicked on it.
Sorry, again, it’s in German.
But it tells me what the reason for this error is.
And what the solution is.
Wow.
Wow.
You could just apply this.
It might not be the correct solution, but it very likely is.
So I think when we have incident tickets, when we have project tickets, issues, we will start to first ask the AI: do you have a solution?
Oh, it looks good.
Apply it.
We will categorize those solutions, those problem spaces.
And for some categories, we will say, I trust you.
Just implement this.
Already now, with my open source projects, I can simply assign Copilot to an issue on the GitHub web page.
Ten minutes later, I have a solution.
And I would say that 60 to 80 percent of the time, it works.
Great.
Not really self-healing, but the LLM helps to heal it.
But we also have a problem with, yeah, the mental model behind the code.
Who knows the name Peter Naur?
Not many.
Okay, some.
He was a computer scientist, and in 1985 he published a paper, Programming as Theory Building, about dead code and the mental model.
I think everybody knows this: you’re deep into work, you build up your mental model while coding, then you’re interrupted, and this mental model goes poof. Yeah, where did you leave off?
So this is interesting.
This is just a temporary mental model.
But Peter Naur said that the code is not the only truth.
There is a mental model which belongs to the code, built up by the developers who know the code.
And then you have legacy code, dead code. Dead code, for him, is not code which is not executed, but code which cannot be evolved anymore.
So I think everybody knows this kind of legacy code, where the developers say we have to rewrite it, because they lack the mental model.
What does it mean for us with AI?
We have legacy code written in an old language, and someone says: hey, let’s use an LLM to rewrite the code in a modern language.
Java, for instance.
So you take code without a mental model, which is exactly why it’s legacy code.
You can’t maintain it anymore.
Transform it to a modern language.
Do you get a mental model?
No.
The problem is still there.
Only if you manage to create a new mental model.
Peter Naur said that you can’t document the mental model.
And I hope he was wrong.
Because currently the machines try to create some init file.
Claude Code’s /init command, for instance, creates a CLAUDE.md which lists the technologies used, describes the folders, and builds up a small mental model.
But what the machine can’t do is explain the decisions which were made, because it only sees the current code.
It doesn’t know why this database has been used and not another database.
This knowledge is lost forever in such projects.
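As an illustration, such a generated file looks roughly like this; the content here is invented:

```markdown
# CLAUDE.md (sketch of what an /init run might produce)

## Tech stack
- Java 17, Spring Boot, Gradle
- PostgreSQL via JPA

## Layout
- src/main/java/   application code
- src/test/java/   tests
- src/docs/arc42/  architecture documentation

## Commands
- ./gradlew build  compile and run the tests

<!-- What the machine cannot reconstruct from the code alone:
     WHY PostgreSQL was chosen and not another database. -->
```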
And when we think about the mental model, I thought about cognitive capabilities.
And we have our human brain, and we now have the machine brain.
And what does it do with us?
So I have a diagram.
I’ve stolen it from Dierk König. On the x-axis is the experience of a developer over time.
And on the y-axis, there’s the complexity the developer creates.
And you get a curve which looks a little bit like this.
When you start out as a developer with a new language, you’re a beginner. You’re down there, you don’t have much experience, and you’re not able to build up much complexity.
But over time you learn, you gain more experience, and you manage to build up more complexity.
And Neal Ford once said that developers are drawn to complexity like moths to a flame.
Often with the same effect: they get burned.
So you develop more experience, more complexity, and here you have the first problem.
If you have a team with different levels of experience, then I think every one of you knows the problem: you ask the team to fix something.
And they answer, oh, that has to do with Peter.
But Peter is on holiday at the moment.
Only he knows about this part of the code.
That’s the first problem.
But the expert has another problem, because here’s the red line.
This is the line of complexity up to which he can build, and no further.
And the problem is: creating code is fun, creating complexity is fun.
But when there’s a bug, fixing that bug lies above this line.
I think you all know those bugs where nobody knows how to fix them, because they’re beyond this line.
So when we think about the machine, where does the machine sit?
At the moment, I would say down there.
We can review the generated code.
And most people say we have to review the code because it’s our responsibility.
The machine can’t take over the responsibility for the code, so we have to review the code and take over the responsibility.
So who likes to code?
Most hands up.
Who likes to do reviews?
No.
That’s a problem.
The work shifts from solving problems by coding to telling the machine how to code and then doing a review.
Doesn’t sound like fun.
And when I created those slides, I thought, what happens when the machine gets better and creates more complexity in a way that we can’t handle it anymore?
I said the future is coming to us fast, very fast.
And I made a mistake.
It’s not the complexity.
It’s the amount of code the machine creates.
We currently create so many lines of code, and not only lines of code, but also concepts, LinkedIn posts, and so on.
We can’t manage to review it all anymore.
And I’m only talking about the code review here, not about testing, the security review, and so on.
If we speed up coding, we also have to speed up the other phases.
But with the complexity: if the machine becomes capable of creating such complexity, we will not be able to review the code anymore.
But luckily, the curve comes down again on this side.
So as a developer, you sometimes notice the barrier, you gain still more experience, and hopefully you land down there.
The guru: the one who knows which library to use to take the complexity out.
The problem with the guru is that the complexity he creates is on the same level as a beginner’s.
So viewed from the outside, you don’t see a difference.
If you don’t know better, you can’t tell whether it’s the guru or the beginner.
But for the AI, the point is that we have to make sure that the complexity the AI creates doesn’t grow into the sky, but goes over this hill and lands here.
That is the goal.
So that the machine creates the right code without creating the complexity.
And as you can see, I have many topics, future topics in my slides.
And the next topic is bias and cultural values.
Bias: it’s always in the news that the LLMs have a strong bias.
And when we talk about the cultural bias, I found this diagram.
On one axis, you have the cultural distance from the United States, importantly, from the United States, and on the other, the correlation between GPT answers and human answers.
Luckily, we are up here on the same level as the United States, Great Britain, and Canada.
So this is fine for us.
But if you take a look at this diagram down here, I think there are many countries which will not train their own LLMs.
And for them, the cultural distance and the correlation are not good.
That’s a problem.
And if you now think about how many people in the world are influenced by which models: OpenAI alone has 700 million active users per week.
Wow.
Microsoft uses the OpenAI models for Copilot.
And what I’ve taken from the press is that the right to use the OpenAI models has been sold to Apple for zero dollars.
OpenAI wants everybody to get the OpenAI models as a consultant.
That’s a huge thing.
That’s influence.
And when I think about how some models from the tech bros are modified to follow the thought leadership, if you want to call it that, of the tech bro, then, wow.
Let’s skip this.
I don’t want to think in this direction.
Let’s return to the bias.
I once created an initial prompt.
I am a software architect.
I like Java.
I don’t like JavaScript.
And I wanted to brush up my slides.
Hey, generate an image for me.
And, yeah, I’m the old white man.
And, yeah, it’s correct.
Only one monitor.
A whiteboard.
An old calculator.
This one is cool.
But already a smartwatch.
This is the bias which people mostly talk about: the bias in the training data.
The training data reflects the state of our society, but we don’t want to have this mirror held up to us.
And so I thought, hey, let’s change the text a little bit and see what happens.
Now I love modern JavaScript frameworks.
Java is now 30 years old.
What do you think?
What’s the difference between those two persons?
Could be 30 years.
Two monitors.
No whiteboard.
Great.
That’s a bias.
And this was DALL-E.
Then I tried Flux, Flux AI.
And you see, you still have the bias.
Now I have two screens, but my younger ego has one, two, three, four screens when I count them.
So this bias is still there.
By the way, Flux AI is leading in image generation.
It’s from Black Forest Labs.
It’s a German company.
That’s something which is often forgotten when we talk about AI.
In Europe, we are quite good at AI, but everybody talks about American models.
So the machine gave me an answer here.
Classic Java versus modern full stack.
That’s the reason why there’s this bias.
And yes, there is bad bias in the data.
That’s something we shouldn’t forget.
But we can also use this bias as a good bias.
It’s like a drawer which I can open.
If I tell the machine: here, tell me the technologies behind Java, JavaScript, and Python.
Wow.
So I can use this mechanism to steer the LLM in the right direction.
And I found this so interesting that I talked a little bit with the AI about this concept.
And I found out about semantic anchors.
A semantic anchor is a wording which triggers concepts in the LLM.
Java, Python, and JavaScript are not such good triggers.
I have another one here.
arc42 documentation with ADRs according to Nygard.
Use a Pugh matrix for each ADR, and Docs as Code according to Ralf D. Müller.
Sorry for this shameless plug.
Use C4 for diagrams.
The interesting part is that Docs as Code is a well-known concept.
If I add my name, and I’ve given so many talks about Docs as Code and written so many articles, it knows that I use Docs as Code with AsciiDoc and with the docToolchain project.
And with this prompt, I have one semantic anchor.
Two: not just ADRs, but ADRs according to Michael Nygard.
Then it knows which sections it should use.
The Pugh matrix I learned from Six Sigma.
It’s a decision matrix, but a special decision matrix.
Then Docs as Code and C4.
I hope everybody knows C4 diagrams from Simon Brown.
With this small description, I get a really detailed specification of how I want the documentation to be created.
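Written out, the whole prompt is roughly just these four lines:

```text
Create arc42 documentation with ADRs according to Nygard.
Use a Pugh matrix for each ADR.
Follow Docs as Code according to Ralf D. Müller.
Use C4 for diagrams.
```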
This concept of the semantic anchors, I think it’s really important.
Because when we create agents, we often let the machine create the prompts.
And then you have a prompt of 2,000 or 3,000 lines.
You can create small prompts with the semantic anchors.
And these prompts are still readable.
And you can easily modify them.
For instance, TDD London School is another semantic anchor.
I didn’t know about the London School and the Chicago School before.
But once you know it, you can use it as a specific semantic anchor.
I think this concept is quite important.
And you can easily predict the future by creating it yourself.
So I created a repository where you can find a list of semantic anchors.
A first list.
Hey, call to action.
If you find new semantic anchors, create a pull request.
That would be great.
Yeah, and then there’s the knowledge cutoff we already talked about, and there’s the weighting of the technologies in the training data.
And that will be a problem in the future.
If we all decide to pin our versions down to the versions the LLM still knows, who will test the new versions?
Who will know about new versions?
If we let the LLM advise us on which technology to use, it will tell us about the technology it likes most, the one in its training data.
For instance, Python.
I was a Java developer, and this is still in my system prompt: I am a Java developer.
But I know the LLM likes to code in Python.
So now it knows how to explain Python to me with Java concepts as an analogy.
And this will result in more Python code.
And what about those other cool languages?
Will they survive in the future?
What about technologies?
How do we get new technology into the system?
We need concepts like RAG and MCP to give new knowledge to the LLM.
The context will not grow much further, I guess, because of the problem with the attention matrix.
It grows too fast.
You need too much computation.
You can’t calculate the full attention matrix if you scale the context up.
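To put rough numbers on this, as a back-of-the-envelope estimate: full self-attention over a context of length n scores every token against every other token, an n-by-n matrix, so the cost grows quadratically. Going from a 100,000-token context to a 1,000,000-token context multiplies that work by a factor of 100, not 10.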
So if you want to use the new technology in your architecture, think about it.
And regarding docs and knowledge management, for whom do we write the docs?
There was a pull request, in September 2024, for the Windows Subsystem for Linux.
Forty people thought: hey, this pull request is great.
In the pull request there was a suggestion to present a comparison as a table.
But someone at Microsoft thought about it and said: mmm, LLMs are not good at handling tables.
No, not a good idea.
Let’s close this pull request.
For whom do we write documentation?
It’s not us anymore.
It’s the LLM that should help us.
And then later on they accepted the pull request.
But you have to think about it.
There is so much going on where we support the LLM with documentation and don’t think about us humans anymore.
There’s more momentum behind giving knowledge to the LLM than there ever was behind making websites accessible.
So I mean, an accessible website would also help the LLM.
But most people like to skip the accessibility and just give the knowledge to the LLM.
And when we talk about documentation, the AI is good at explaining the code.
But when it only sees a code fragment, it can only tell you what it does, not why it is there.
And yeah, like everything which is generated: generate it temporarily, don’t turn it into permanent documentation, throw it away again.
Well, will the future of documentation look like this?
No, we have a chance.
When we manage to code with the LLM in such a way that the LLM follows the whole process, it knows what has been decided.
It can document the why.
The why is the most important part of the future documentation.
Everything else, what the code does, how it works, the AI can explain to us.
But the why, the decisions, the ADRs, the architecture decisions, that’s what’s important in the future.
If you manage to use the AI throughout the whole project, then you have a chance to also let the AI document the why, the reasoning behind your code.
Yeah, and we approach new challenges, but also new opportunities.
I have one challenge here.
People always ask me about ethics and responsibility.
So regarding ethics, I asked Gemini: hi Gemini, what would you say if I gave you access to weapon systems to destroy our enemies?
And it says, I cannot and will not participate in any actions related to violence.
Yeah, that’s the reaction we want to have.
Really?
I ask this question because we had this news: Google removes pledge to not use AI for weapons, surveillance.
So in our world, ethics is a moving target.
You elect a new president and ethics change.
I think that’s quite interesting that we have to think about what kind of ethics is in the machine, what kind of ethics do we want.
And if you think about the other diagram, how well GPT correlates with people in different countries, then I think we face some problems there.
So as a result, I think we need a redefinition of our role and also of the software development and software development cycle.
And there are some interesting months ahead.
I think it’s not years.
It’s more like months in which we have to change the way we work.
As a summary of my talk, the machine helps you to use its huge solution space.
But you have to know about the limits and the pitfalls, mainly the knowledge cutoff, but also the bias and how to bring in new technologies and how to bring in new knowledge.
arc42 and Docs as Code: it’s great.
The machine understands it, it finds the documentation, it knows how to work with it.
And also diagrams as code.
On Monday, we had the session by Jackie about diagrams as code and AI.
So great.
The AI knows how to define a diagram as code.
When it finds the definition of the diagram in the documentation, it can read it.
It costs fewer tokens than an image.
And we as humans can take a look at the image.
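As a small sketch, assuming the C4 model from the PlantUML standard library, such a diagram-as-code definition could look like this; the names and systems are invented for illustration:

```plantuml
@startuml
' Minimal C4 context diagram as code
!include <C4/C4_Context>

Person(user, "Customer", "Buys products online")
System(shop, "Web Shop", "The system we are building")
System_Ext(payment, "Payment Provider", "External service")

Rel(user, shop, "orders via")
Rel(shop, payment, "charges through")
@enduml
```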
Yeah, and regarding the documentation: let the machine explain the how from the code.
The why, we have to document ourselves.
Thank you for your attention.
I hope it was interesting.
If you want to, you can give me feedback and you will find me on LinkedIn with this avatar.
Thank you.
Thank you, Ralf, for the talk.
Are there any questions from the audience?
First one.
Yes.
So, of course, there’s this knowledge cutoff with the LLMs, but maybe you’ve also heard about MCP servers, which give LLMs more context, and more real-time context.
So what are your thoughts on this, in terms of how LLMs interact with modern technologies, and where it goes from there?
It’s about context management, as Stefan already said, because the easiest way is to drop the documentation into the context, and then the LLM just knows about it.
If you use an MCP server, the LLM has to know when it has to query the documentation, when it has to search for it.
So if the LLM thinks it knows everything, it will not use the documentation.
And yes, it is a good way, but I think we also have to restructure our documentation.
There are great approaches like Serena MCP, which helps you query source code in a more code-oriented way, not in a file-based way.
And I think this is important.
And I think this is also important for normal docs: don’t just query the files and take arbitrary random chunks; we have to find better ways.
And then, yes, this will be a solution.
Yeah.
Thank you, Ralf.
So I see there are so many different challenges that you have explained on the different slides.
One of the challenges I see: you mentioned that the why side of the documentation is not known to the LLM.
And for example, we created architecture documentation for the developers, for our stakeholders.
And now we also have a new stakeholder, which nowadays is the LLM, who can use this to generate code, right?
And for this, we need to have ADRs defined.
So what do you see if the LLM itself creates ADRs for us?
How can we use a model to help us create these design decisions, which will then be used by the LLM itself, but also by us?
What is your view on that, and how can we keep this why part unbiased?
So first, regarding the why part: if you have a pull request, many people already let the LLM create the explanation for the pull request, and then they test it.
They check whether the description of the pull request really describes which problem has been solved, or whether it just mentions what has been changed.
If an issue is referenced in the pull request, okay, that’s easy.
Regarding the ADR approach, yes, I think this is the right approach.
When you use the LLM with its solution space to create a decision and to document this decision, this is quite helpful.
But you have to make sure that you stay within one and the same context session, so that it is really helpful.
For instance, if you first let the LLM code something that uses a new technology, and then ask in another session: okay, can you now create an ADR for this new technology? It will not know anymore why this technology was chosen.
The ADR will still have good explanations, because LLMs are very good at, yeah, creating trustworthy-sounding output.
It sounds trustworthy, but it’s just: yo, we used MySQL because everybody does.
That’s a problem.
You need real decisions.
And the approach to create the ADR together with the LLM, I think this is the right way to go.
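Sketched minimally, such an ADR in Nygard’s format could look like this; the content is invented for illustration:

```markdown
# ADR 007: Use PostgreSQL for order data

## Status
Accepted

## Context
Orders need transactional consistency, and both the team and the
LLM's training data know relational SQL well.

## Decision
We use PostgreSQL instead of a document store.

## Consequences
Schema migrations become part of every feature, and the why of the
data model is recorded here instead of being lost with the session.
```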
Further questions?
Run, Forrest, run.
So, at the beginning of the talk, you talked about architectural decisions, and how some of those decisions can now be changed easily, and even more so in the future.
So maybe those architectural decisions are not architectural and fundamental anymore.
And if not, what is your opinion about future decisions? What could be decisions that are not that easy to change, as we have now, for example, with the language or, I don’t know, the infrastructure setup or whatever?
That’s a very good question, because if we manage to implement spec-driven development in such a way that it works, then we have the machine as a coding agent which implements a spec, hopefully overnight.
And then I think no decision is really an architectural decision anymore, in the sense of being hard to change, which would be great, because it would allow us to switch databases, to switch front-end technology, to do everything we would like.
But the architecture is still the fundamental runway.
It’s the basis on which we build.
And if you don’t create an architecture, well, the LLM likes JavaScript, HTML, and CSS, and you will get a web-based application every time.
So the architecture is really important nowadays with the LLM.
It gives the LLM guidance, even if it’s easy to change those decisions in the future.
Who is shaping this vision, this architectural vision?
The quality requirements.
So you should know the quality requirements of your customers.
They should be documented in arc42, in your architecture documentation, and they should be the basis of every decision.
If you make a decision and then notice that, well, you didn’t meet the quality requirements, then something is wrong.
Then there was a quality requirement you weren’t aware of, or maybe you made a decision which you didn’t have to make.
So quality requirements and decisions closely belong together.
And those quality requirements shape the architecture and software we build.
Okay.
Time for one more question, if there is any.
Doesn’t seem to be the case.
Thank you, Ralf.
You’re welcome.