AI Coding Is Solved. Software Engineering Isn’t.

If coding is "solved," is software engineering? Ajay Medury (software engineer) and Andrew Sierota (systems engineer) pick up where Episode 1 left off and get into the part that isn't solved: judgment. They trade notes on why weekly usage limits have quietly become the real project budget, what it's like to build a sharded Minecraft world solo as both product manager and principal engineer, what Amazon's New World got wrong about scale, running decorrelated multi-model code reviews, and what an AI "skill" actually is. It might all just be an act of intelligence.

In this episode:

Why "coding is solved" but software engineering isn't, and why judgment is the expensive part
The new bottleneck: weekly subscription usage limits as a hard budget
Breaking a big build into modules and submodules, and shrinking scope to actually ship
Wearing every hat at once: product manager and principal engineer
New World and the problem of scale at launch
Large codebases, heavy test coverage, and review rounds that exposed process gaps
Pre-flight checks, linters, and spec-tracing that cut review loops down
Decorrelated reviews: several different models reviewing blind, then taking the union of findings
What an AI "skill" is: system prompts, user prompts, and guardrails for long workflows
Severity tiers for findings: blockers, warnings, defers, suggestions, and nits
Why systems admins, as generalists by trade, may be an ideal audience for these tools

Chapters:

(00:00:00) - Real intelligence, or just an act?
(00:01:02) - Coding is solved; software engineering isn't
(00:02:02) - Keeping up with the release pace
(00:05:27) - Vibe coding vs. a repeatable process
(00:06:40) - Usage limits are the new budget
(00:08:40) - Breaking the build into modules
(00:12:18) - A sharded world, every hat on one builder
(00:17:12) - New World and the problem of scale
(00:25:04) - 200k lines and a 90-round review
(00:27:20) - Pre-flight checks that cut 90 rounds to 5
(00:30:49) - The podcast's own local-GPU pipeline
(00:33:27) - Learning by asking "what do you mean?"
(00:35:35) - When a large agent run burned through the budget
(00:38:05) - What is a skill, really?
(00:48:05) - Skills as guardrails for long workflows
(00:51:09) - Severity tiers: blockers to nits
(00:54:34) - Why sysadmins are ideal builders
(00:56:29) - A long-running Minecraft community, the real driver
(01:00:56) - Closing: an act of intelligence

Transcript

Speaker: 00:00:00

Hello, my name is Ajay Medury, and I'm a software engineer, and today I'm

Speaker: 00:00:06

joined by...

Speaker: 00:00:08

My name is Andrew Sierota, and I'm a systems admin.

Speaker: 00:00:13

Awesome, and today we are here to talk about various topics

Speaker: 00:00:18

in the AI ML space for a podcast that we have coined

Speaker: 00:00:24

Active Intelligence, because we're trying to figure out if it's real intelligence or

Speaker: 00:00:29

is it just acting?

Speaker: 00:00:33

And this podcast is for all aspiring creators, creatives, and

Speaker: 00:00:39

builders.

Speaker: 00:00:40

Or those who have already been doing it for a while and just are looking for new

Speaker: 00:00:44

tools and maybe ways to improve their workflows.

Speaker: 00:00:47

I'm curious if, you know, one of the things we talked about last time was

Speaker: 00:00:52

particularly like, you know, software engineering, writing code might be a solved

Speaker: 00:00:56

problem, but is software engineering the solved problem?

Speaker: 00:00:59

And I think, yeah, I was curious.

Speaker: 00:01:02

Yeah, yeah, picking up where we left off on the last episode there, I remember us

Speaker: 00:01:07

saying that coding could be solved, right, but software engineering definitely

Speaker: 00:01:12

isn't, and I think I was touching on this while we were chatting just before the

Speaker: 00:01:17

show, you know, coding is basically, you know,

Speaker: 00:01:21

Claude can write code all day long, faster than anyone can

Speaker: 00:01:27

humanly.

Speaker: 00:01:28

Mm-hmm.

Speaker: 00:01:29

It'll figure it out if you give it enough time, enough tokens.

Speaker: 00:01:31

Oh, yeah.

Speaker: 00:01:33

But judgment isn't free.

Speaker: 00:01:35

Yes.

Speaker: 00:01:36

And that's where humans are still very valuable, is judgment.

Speaker: 00:01:40

And that can be really expensive.

Speaker: 00:01:43

It could be a really expensive mistake if you have poor judgment on the usage of

Speaker: 00:01:48

your code.

Speaker: 00:01:49

And talking about that in particular, here's like bad judgment in terms of

Speaker: 00:01:53

accidentally putting a vulnerability out there that could now all of a sudden be

Speaker: 00:01:57

discovered by models much more easily.

Speaker: 00:01:59

The cost of that is pretty intense.

Speaker: 00:02:01

Yeah, yeah.

Speaker: 00:02:02

And I don't remember what the, that, there was the project that Anthropic did with

Speaker: 00:02:06

like the 30 big companies.

Speaker: 00:02:08

Yes.

Speaker: 00:02:09

To like the pre-release of Mythos.

Speaker: 00:02:11

Yeah.

Speaker: 00:02:11

And they were supposed to like patch everything, supposedly, before they released

Speaker: 00:02:15

Fable.

Speaker: 00:02:16

Yes.

Speaker: 00:02:17

Right.

Speaker: 00:02:18

That's funny, 4.8, Opus 4.8 was only released like two weeks ago.

Speaker: 00:02:25

That's.

Speaker: 00:02:25

And then less than two weeks later, we have Fable now.

Speaker: 00:02:29

Three days later, we don't have Fable.

Speaker: 00:02:31

Which is really interesting to me because traditional software engineering took a

Speaker: 00:02:35

lot more time, had a lot more rituals potentially.

Speaker: 00:02:39

And, you know, again, we kind of broached the subject last time is, were those

Speaker: 00:02:42

rituals still meaningful?

Speaker: 00:02:43

Like, do those still make sense to do today?

Speaker: 00:02:46

Like, because I can't even imagine a time like four or five years ago where you'd be

Speaker: 00:02:50

able to release a, you know, pretty significant version and then release the next

Speaker: 00:02:53

major version within weeks later.

Speaker: 00:02:55

I think you'd be waiting months between these kind of releases.

Speaker: 00:02:59

So.

Speaker: 00:02:59

Yeah.

Speaker: 00:02:59

Kind of, kind of hard to keep up with, to be honest.

Speaker: 00:03:02

You know, there's, there's so much changing so quickly.

Speaker: 00:03:05

By the time this is released, you know, who knows what would have changed.

Speaker: 00:03:08

Yeah.

Speaker: 00:03:09

Yeah.

Speaker: 00:03:10

I think that a lot of what we're doing now, and I guess

Speaker: 00:03:15

something that I've done, like, obviously the world's changing every day.

Speaker: 00:03:19

There's new tools, you know.

Speaker: 00:03:20

It's hard to, you know, want to use the latest and greatest all the time.

Speaker: 00:03:24

Yes.

Speaker: 00:03:25

And, but also keep adding new requirements.

Speaker: 00:03:28

To your project, right?

Speaker: 00:03:30

Yes.

Speaker: 00:03:30

Because, like, there was a point in time where in the, my Minecraft project right

Speaker: 00:03:35

now, I was just adding so much stuff because I was like, oh, yeah, I like this thing

Speaker: 00:03:38

can code everything for me.

Speaker: 00:03:40

Oh, yeah.

Speaker: 00:03:40

That's no longer a limit, but then once I started to get to, well, is it going to

Speaker: 00:03:46

work, right, there's just too many things to check.

Speaker: 00:03:51

Yeah.

Speaker: 00:03:52

And, and I think I remember last episode, I said, I think, you know, testing would

Speaker: 00:03:56

be 10 times as much as the production.

Speaker: 00:03:58

Yeah.

Speaker: 00:03:59

I'm actually thinking it's going to be 100 times more than the development now.

Speaker: 00:04:03

Yeah.

Speaker: 00:04:03

The realization is now dawning.

Speaker: 00:04:05

It's like, oh, no, this, there's a lot more.

Speaker: 00:04:08

I gave it the ability to build all this stuff.

Speaker: 00:04:10

The time it will take for me to now validate that, yeah, it just feels.

Speaker: 00:04:13

It's going to be a lot.

Speaker: 00:04:14

Yeah.

Speaker: 00:04:14

Yeah.

Speaker: 00:04:15

And I think that's the interesting part is because back when, back when I was in my

Speaker: 00:04:20

previous company, we were a cloud company.

Speaker: 00:04:22

And so we were trying to launch systems for others to use.

Speaker: 00:04:26

The expectations there were always like.

Speaker: 00:04:28

Let's maybe be a little more restricted because if we can restrict the size of the

Speaker: 00:04:34

system that we're like actually shipping out there, we can maybe do a better job of

Speaker: 00:04:37

building it and reduce things like risk, which is I'm imagining a question that

Speaker: 00:04:42

Anthropic is asking right now is what is the risk of actually releasing this model?

Speaker: 00:04:46

So it becomes a similar question.

Speaker: 00:04:48

I think we had, we had those kinds of conversations all the time where it's like,

Speaker: 00:04:52

all right, do we actually do less intentionally?

Speaker: 00:04:55

And the answer at times was yes.

Speaker: 00:04:57

Yeah.

Speaker: 00:04:57

We should do less intentionally.

Speaker: 00:04:59

I don't also think that always applies towards other kinds of systems like our

Speaker: 00:05:04

projects, right?

Speaker: 00:05:04

I do feel like we were briefly talking about this before the podcast about like, oh,

Speaker: 00:05:09

what is, you know, like, what does software engineering look like versus vibe

Speaker: 00:05:13

coding?

Speaker: 00:05:14

And there is definitely a good number of things I can bring up when I start talking

Speaker: 00:05:19

about it.

Speaker: 00:05:19

Though I do want to say like one distinction we kind of were, maybe we liked the

Speaker: 00:05:24

idea of it is software engineering.

Speaker: 00:05:27

Like in the big tech companies, like a well-oiled process, it's a repeatable thing

Speaker: 00:05:32

that they keep repeating to keep, you know, churning out new features, new products

Speaker: 00:05:36

and so on and so forth.

Speaker: 00:05:37

However, I do think that the vibe coding and more like, you know, building

Speaker: 00:05:43

locally, building a system today, like a lot of engineers who do it outside of the

Speaker: 00:05:48

big tech, I feel like it's more like a project where the project is something you

Speaker: 00:05:52

kind of figure out how to execute the project as you go along.

Speaker: 00:05:55

There isn't just an answer for every single thing.

Speaker: 00:05:57

Like you don't get told like, oh, this is where you, you know, release the code,

Speaker: 00:06:00

this is where you talk to next.

Speaker: 00:06:01

You know, you don't go step one, step two, step three with the actual like new world

Speaker: 00:06:06

of building systems, building like projects.

Speaker: 00:06:09

I do feel like there's a lot more variance.

Speaker: 00:06:12

And Andrew, it sounds like, sounds like you're kind of experiencing some of that,

Speaker: 00:06:15

right?

Speaker: 00:06:15

If I'm getting, if I'm getting it right.

Speaker: 00:06:17

Well, absolutely.

Speaker: 00:06:18

And it's funny, like, like in enterprise, you have budgets, you have deadlines, you

Speaker: 00:06:23

have a boss who's breathing down your neck.

Speaker: 00:06:25

Yeah.

Speaker: 00:06:26

Yeah.

Speaker: 00:06:27

But, but when, when you have your own project, all of a sudden, the bigger

Speaker: 00:06:33

limits is, especially when you're not paying for API, you're paying for a

Speaker: 00:06:37

subscription like ClaudeMax, CodexMax.

Speaker: 00:06:40

The biggest, the next big limit is your usage every week, your usage in every five

Speaker: 00:06:46

hour window.

Speaker: 00:06:47

And that's actually the budget now that I'm working with.

Speaker: 00:06:50

Like, like I have to calculate, like I got a 20 X max subscription for Codex and

Speaker: 00:06:55

Claude.

Speaker: 00:06:57

I can use that up a weekly limit in two days.

Speaker: 00:07:01

And I have two subscriptions.

Speaker: 00:07:03

So I can code for four days a week.

Speaker: 00:07:07

That's your budget.

Speaker: 00:07:08

Yeah.

Speaker: 00:07:08

That's my budget.

Speaker: 00:07:09

And, and then I have to think, well, that's just testing, reviewing the code.

Speaker: 00:07:14

And I haven't even really reached the stage where I'm doing practical tests.

Speaker: 00:07:19

Yeah.

Speaker: 00:07:19

Yeah.

Speaker: 00:07:19

Real, real world.

Speaker: 00:07:20

Yeah.

Speaker: 00:07:21

Yeah.

Speaker: 00:07:21

And so that's why I said, I'm like, wow, even with this process, I'm going to be

Speaker: 00:07:26

able to do it.

Speaker: 00:07:26

But as I'm practically fully automated now, it's still going to take a lot of time

Speaker: 00:07:30

because budgets aren't infinite for most projects.

Speaker: 00:07:34

Yes.

Speaker: 00:07:34

Yes.

Speaker: 00:07:35

And I think that's the idea of like within this budget for this project, how can I

Speaker: 00:07:39

actually figure out to execute the thing that I care about?

Speaker: 00:07:42

And I think that's the process of like, oh, wow.

Speaker: 00:07:45

I'm saying process project over and over again.

Speaker: 00:07:48

Maybe I'll say like, there is a self-reflection that needs to happen in projects to

Speaker: 00:07:53

feel like, Hey, what is, what is done?

Speaker: 00:07:55

done.

Speaker: 00:07:55

Like when do I actually, as you said, like earlier in your project, you're actually

Speaker: 00:07:59

like do more.

Speaker: 00:08:00

And eventually you started to realize like, okay, this might be a little too much

Speaker: 00:08:06

because then the amount of stuff that I can validate and actually make sure the

Speaker: 00:08:09

quality is good might be growing so quickly.

Speaker: 00:08:12

Then you have to take a judgment call, which actually lets you decide, or then you

Speaker: 00:08:17

have to, the AI won't do this for you.

Speaker: 00:08:19

So as a builder, you need to take a judgment call, say that, no, we're going to stop

Speaker: 00:08:24

here.

Speaker: 00:08:24

We're going to actually now figure out the validation.

Speaker: 00:08:26

We're going to start figuring out if everything's working.

Speaker: 00:08:28

At least I think that's the way I'm thinking about it.

Speaker: 00:08:31

I don't know if you feel a similar, like, do you feel like that's the process?

Speaker: 00:08:34

Yeah.

Speaker: 00:08:35

Yeah.

Speaker: 00:08:35

I, to go along with that for me, how I've kind of thought about it is, you know, I

Speaker: 00:08:40

broke the project down into phases, right?

Speaker: 00:08:43

And I was like, okay, we need, what can we start out with?

Speaker: 00:08:46

You know, an MVP, right?

Speaker: 00:08:47

Like what can we start out with?

Speaker: 00:08:49

Even that is pretty big in scale.

Speaker: 00:08:52

But at least that once that's done, there's a foundation for the later updates.

Speaker: 00:08:56

Right.

Speaker: 00:08:57

Um, and so right now I think that's why this first phase, even though I've already

Speaker: 00:09:02

reduced the scope, like I have like 16 planned modules, I'm only coding like, you

Speaker: 00:09:07

know, eight of them.

Speaker: 00:09:09

And one of them is a really big commons module.

Speaker: 00:09:11

Yeah.

Speaker: 00:09:12

But the thing is that they're being coded, but a lot of them are just skeletons for

Speaker: 00:09:16

the next phases.

Speaker: 00:09:17

I mean, I think, uh, I may say eight modules, eight modules.

Speaker: 00:09:21

Yeah.

Speaker: 00:09:21

I actually feel like that's a good thing.

Speaker: 00:09:23

Usually being able to break it down into smaller pieces.

Speaker: 00:09:26

So you actually then go and if something breaks, you want to ideally be able to

Speaker: 00:09:29

focus on one module.

Speaker: 00:09:30

Oh, absolutely.

Speaker: 00:09:31

Absolutely.

Speaker: 00:09:31

And do you feel like that's kind of the, it has worked pretty effectively?

Speaker: 00:09:35

So, so one of the modules that I have, um, has been broken down into like five sub

Speaker: 00:09:40

modules.

Speaker: 00:09:40

All right.

Speaker: 00:09:41

Okay.

Speaker: 00:09:41

Cause, cause there were, there were big enough elements individually

Speaker: 00:09:45

that I thought, you know, we need to break this down a little bit more, but for the

Speaker: 00:09:49

sake of the project planning, those modules are now instead of oh four, oh four, a,

Speaker: 00:09:55

b, c.

Speaker: 00:09:56

I don't want to go ahead and change all the other module numbers to squeeze those

Speaker: 00:10:00

in.

Speaker: 00:10:01

Yes.

Speaker: 00:10:01

Yes.

Speaker: 00:10:01

You don't want to renumber everything.

Speaker: 00:10:03

You just want to let it be a submodule of the existing.

Speaker: 00:10:06

Exactly.

Speaker: 00:10:06

So those are just submodules now.

Speaker: 00:10:08

Um, at the end of the day, um, it's all about, can we maintain it?

Speaker: 00:10:12

Um, yeah.

Speaker: 00:10:13

And I think interestingly, what you're experiencing is a, there, there's a little

Speaker: 00:10:18

bit of a mirror.

Speaker: 00:10:19

There's like the other side of the coin.

Speaker: 00:10:20

If you look at software engineering, traditionally, I think we were saying there's

Speaker: 00:10:23

processes that once one follows, uh, and the interesting thing is like what I've

Speaker: 00:10:28

experienced is when you're trying to actually build something, build something new,

Speaker: 00:10:32

usually you're getting requirements from somebody.

Speaker: 00:10:35

You're actually getting requirements from a product manager or from leadership

Speaker: 00:10:37

saying that, Hey, we've identified, uh, this opportunity in the market.

Speaker: 00:10:41

Can you go scope it out?

Speaker: 00:10:43

Can we actually go figure out the process actually entails that, right?

Speaker: 00:10:46

Like the process says, uh, leadership identified an opportunity product manager.

Speaker: 00:10:50

Now it goes and figures out what that opportunity is like, how big is it?

Speaker: 00:10:53

How much scope is it?

Speaker: 00:10:54

How many things need to be built out to capture that opportunity?

Speaker: 00:10:57

And then the software engineering folks get pulled in the senior, you know,

Speaker: 00:11:01

principal folks get pulled in, uh, where they're like, okay, what modules do we need

Speaker: 00:11:05

to actually make this?

Speaker: 00:11:07

Oh, do we need new products?

Speaker: 00:11:08

Do we need eight modules?

Speaker: 00:11:09

Do we need three?

Speaker: 00:11:10

Like, is it one good enough?

Speaker: 00:11:12

Uh, and interestingly, uh, you know, there's a lot of

Speaker: 00:11:15

things we can do with confused systems.

Speaker: 00:11:21

There's a lot of things we can do with skilled systems.

Speaker: 00:11:22

And even with, uh, you know, like the, uh,

Speaker: 00:11:26

the, the, the, the, the applications that we, that we wanted to very, very much, we

Speaker: 00:11:29

didn't want to, we wanted to shift that to the, to the, uh,

Speaker: 00:11:31

you know, the, and the, the second one, we were, we were really thinking

Speaker: 00:11:37

about how we can start working with new things.

Speaker: 00:11:43

All of these things are very different than what it would take traditionally the big

Speaker: 00:11:47

tech companies to do, right?

Speaker: 00:11:49

Because they need individuals to write the code in the past.

Speaker: 00:11:52

Maybe not anymore.

Speaker: 00:11:53

Yeah, yeah.

Speaker: 00:11:54

Hmm.

Speaker: 00:11:56

I think, you know, as I've been letting Claude code it, planning the

Speaker: 00:12:02

project out, having the spec sheets and everything like that, I begin to realize

Speaker: 00:12:07

that I kind of wish I actually shrunk it even more.

Speaker: 00:12:09

Shrunk it even more?

Speaker: 00:12:10

Like a lot more, actually.

Speaker: 00:12:13

A lot more.

Speaker: 00:12:13

Because right now I have, like, if we, I'm not sure we talked about, like, the

Speaker: 00:12:17

architecture of it all, but there's going to be, it's going to be a sharded system.

Speaker: 00:12:21

We're going to have nine separate worlds.

Speaker: 00:12:24

They're each going to be very smoothly transitions for players to go between them.

Speaker: 00:12:29

But that requires a lot of extra networking code on top of Minecraft already.

Speaker: 00:12:35

And I was thinking, I could have actually just done just one server.

Speaker: 00:12:41

Single host, yeah.

Speaker: 00:12:42

Single host.

Speaker: 00:12:42

Exactly.

Speaker: 00:12:43

Ignored all the extra networking stuff.

Speaker: 00:12:45

And, you know, I'd probably already be in game testing right now.

Speaker: 00:12:49

Mm-hmm.

Speaker: 00:12:50

Because I think that that extra layer, on top of all the other things that I wanted,

Speaker: 00:12:55

Yes.

Speaker: 00:12:55

is actually, that's the complicated part that Claude is spending a lot of time on.

Speaker: 00:13:01

Yeah.

Speaker: 00:13:02

There's a lot of gotchas that, on the first pass, it's not going to catch.

Speaker: 00:13:07

Yes.

Speaker: 00:13:07

And I thought about it.

Speaker: 00:13:10

I was like, maybe I should.

Speaker: 00:13:11

But, you know, I've already spent so much.

Speaker: 00:13:13

So you are the product manager and the principal engineer, like, dealing with this

Speaker: 00:13:18

at the same time.

Speaker: 00:13:18

Yes, and I'm like, okay, well, it's going to be worth it when it works, but I will

Speaker: 00:13:24

get back to you on when it works.

Speaker: 00:13:26

And this is where I do feel like the traditional software engineering and big tech

Speaker: 00:13:30

would have been like, oh, no, we failed.

Speaker: 00:13:32

Because the process should have already caught this at some point.

Speaker: 00:13:35

So I feel like that's the interesting differentiation I see right now.

Speaker: 00:13:40

Which is not always to say it was a good thing.

Speaker: 00:13:42

Because I do feel like going through this process of, like, or the project approach

Speaker: 00:13:47

of, like, let's go start, let's just see what we can build, and, you know, let's

Speaker: 00:13:51

make it unrestricted, right?

Speaker: 00:13:53

We're going to learn a lot more.

Speaker: 00:13:54

And I think that means we're actually going to figure out things more individually.

Speaker: 00:13:57

And I'm curious if you feel like doing the project in the way that you have done it

Speaker: 00:14:02

so far has actually taught you a lot more of what you would avoid next time.

Speaker: 00:14:07

And because you are mentioning that maybe you should have started smaller and, like,

Speaker: 00:14:11

started more restrictive.

Speaker: 00:14:13

And I'm curious if you feel like doing the project in the way that you have done it

Speaker: 00:14:13

so far has actually taught you a lot more of what you would avoid next time.

Speaker: 00:14:13

There's, like, history in software engineering that kind of points us to some of

Speaker: 00:14:17

this stuff, right?

Speaker: 00:14:19

And even then, a lot of software engineers don't actually follow it.

Speaker: 00:14:22

A lot of companies don't actually follow it.

Speaker: 00:14:24

There's no guarantee it's going to actually be repeatable and working.

Speaker: 00:14:28

So individual builders now have the same similar power.

Speaker: 00:14:32

And I'm curious, are you, like, do you feel like this is one of those moments where

Speaker: 00:14:38

you're like, all right, I'm going to go ahead and let this run as it does.

Speaker: 00:14:42

But next time, I'm actually going to do single host or try and figure out what

Speaker: 00:14:44

single host looks like.

Speaker: 00:14:45

Yeah, I have a couple ideas of new projects, I suppose.

Speaker: 00:14:50

And I thought about it a little bit.

Speaker: 00:14:53

I do think that the stuff I'm doing now is informing future

Speaker: 00:14:59

projects.

Speaker: 00:15:00

I definitely would have done it a lot differently.

Speaker: 00:15:03

And another part of it is I constantly do change what I do

Speaker: 00:15:08

all the time.

Speaker: 00:15:10

Like, half of the time.

Speaker: 00:15:12

Half of my time is spent actually optimizing the workflow, thinking about where can

Speaker: 00:15:15

we cut costs?

Speaker: 00:15:17

Where can I cut usage?

Speaker: 00:15:18

For example, I have directed Opus to actually use

Speaker: 00:15:24

Sonnet to implement fixes now.

Speaker: 00:15:26

Yes.

Speaker: 00:15:26

Because, I mean, Opus will write the, you know, the plan and then Sonnet's pretty,

Speaker: 00:15:31

pretty good at implementing it.

Speaker: 00:15:33

And at the end of the day, Opus will still review the changes.

Speaker: 00:15:37

So I'm still gaining it behind the front-tier model.

Speaker: 00:15:40

And it's not just Opus, right?

Speaker: 00:15:41

If I'm not mistaken.

Speaker: 00:15:42

Yeah.

Speaker: 00:15:42

Right.

Speaker: 00:15:42

And it's not just Opus.

Speaker: 00:15:43

Yes.

Speaker: 00:15:43

There's other reviewers.

Speaker: 00:15:44

I have GPT 5.5 as well.

Speaker: 00:15:48

DeepSeq.

Speaker: 00:15:50

It's very cheap.

Speaker: 00:15:51

It's very cheap.

Speaker: 00:15:52

It's very cheap.

Speaker: 00:15:54

You're welcome, China.

Speaker: 00:15:59

But, yeah, I think the decorrelated reviews have saved a lot of money,

Speaker: 00:16:05

actually, because I'm having three different models review the code at the same

Speaker: 00:16:10

time.

Speaker: 00:16:10

And there's a lot of use.

Speaker: 00:16:12

There's a lot of unions in their findings, and there's a lot of findings that they

Speaker: 00:16:16

individually would not have picked up.

Speaker: 00:16:17

They're all trained on different data, and that gives you three different

Speaker: 00:16:21

perspectives.

Speaker: 00:16:21

Perspectives, yeah, yeah, I love that.

Speaker: 00:16:23

And I think that's really important when you're trying to build a system that's

Speaker: 00:16:27

robust, and that's actually what I'm trying to do with Minecraft.

Speaker: 00:16:31

Like, the technology behind what's going on here is intentional to be robust,

Speaker: 00:16:37

because there is a lot of different communities that have done similar things.

Speaker: 00:16:42

But not to this scale, and not, like, even

Speaker: 00:16:47

normal MMOs have failed at doing this.

Speaker: 00:16:51

Actually being able to scale out.

Speaker: 00:16:52

Yeah, actually being able to scale and to, like, allow, like, seamless gameplay.

Speaker: 00:16:57

It remains to be seen if I can accomplish this myself, because I get a little

Speaker: 00:17:01

concerned, thinking, like, why hasn't, you know...

Speaker: 00:17:03

I was going to say, for our listeners, can you maybe, like, do you have a specific

Speaker: 00:17:07

thing that you can talk about where the scale was required?

Speaker: 00:17:10

Yes, yes, what was that game?

Speaker: 00:17:12

Amazon Game Studios, we played it a little bit.

Speaker: 00:17:14

Oh, yes, yes, New World, New World.

Speaker: 00:17:15

New World, and see, that was the game that I thought was, I thought Amazon was going

Speaker: 00:17:19

to solve this problem.

Speaker: 00:17:20

And when you say the solve this problem, which problem, if I may say, the problem in

Speaker: 00:17:25

particular.

Speaker: 00:17:25

The problem of scale.

Speaker: 00:17:27

The problem of when your game releases, you have millions of players that arrive,

Speaker: 00:17:32

and all of a sudden you have a 30,000 player queue.

Speaker: 00:17:36

Yes, yes, yes, yes, okay, yeah, yeah.

Speaker: 00:17:39

And not only that, it's not one big world.

Speaker: 00:17:42

There's hundreds of worlds that have lines.

Speaker: 00:17:45

Yes.

Speaker: 00:17:46

They're crashing left and right.

Speaker: 00:17:48

And I was like, come on, AWS, Amazon, had to have had the resources and know-how

Speaker: 00:17:53

to do this.

Speaker: 00:17:55

But yet, they made the same mistake as every other predecessor before them.

Speaker: 00:18:00

And I think that's the really interesting part, is, like, I'm also curious, like, if

Speaker: 00:18:06

I were to go back and ask them that question, what part broke, right?

Speaker: 00:18:09

Like, what was it the fact that now you would have to...

Speaker: 00:18:12

Just have n number of players on the map at the same time?

Speaker: 00:18:15

Was it that they're trying to communicate with each other over voice or something?

Speaker: 00:18:18

And that was essentially what was causing the breakage?

Speaker: 00:18:21

I'm curious of, like, what was their bottleneck?

Speaker: 00:18:25

Because, sorry, yeah, you had something in mind?

Speaker: 00:18:29

I do have something in mind.

Speaker: 00:18:31

Like, I played a lot of different MMOs.

Speaker: 00:18:34

A sharded system is really common, right?

Speaker: 00:18:38

The issue, I think, that Amazon...

Speaker: 00:18:42

That Amazon had with New World was they built the game like it was any other MMO.

Speaker: 00:18:48

They did not take advantage of their expertise.

Speaker: 00:18:50

From the get-go.

Speaker: 00:18:50

Yeah, they built their own game engine, but they didn't do anything unique.

Speaker: 00:18:56

Yeah.

Speaker: 00:18:57

They didn't structure it in a way where they could scale it automatically.

Speaker: 00:19:01

Yes.

Speaker: 00:19:02

Right.

Speaker: 00:19:02

And maybe they did, but it didn't work.

Speaker: 00:19:04

The process failed.

Speaker: 00:19:05

It didn't work.

Speaker: 00:19:07

The risk was not assessed properly.

Speaker: 00:19:09

Yes.

Speaker: 00:19:09

I mean, there were...

Speaker: 00:19:10

We played launch.

Speaker: 00:19:13

We literally could not play for a few days.

Speaker: 00:19:16

Yeah.

Speaker: 00:19:17

Yeah.

Speaker: 00:19:17

We actually gave up on the weekends.

Speaker: 00:19:19

Yeah.

Speaker: 00:19:19

Yeah.

Speaker: 00:19:20

Yeah.

Speaker: 00:19:20

We had to wait a few days.

Speaker: 00:19:22

And to me, that's millions of dollars being lost.

Speaker: 00:19:25

Yeah.

Speaker: 00:19:25

I mean, I think New World would be a completely different game.

Speaker: 00:19:28

If it was built with that scale from the get-go.

Speaker: 00:19:29

If it was built properly from the get-go.

Speaker: 00:19:31

Yeah.

Speaker: 00:19:32

And I think that's the interesting, like, trade-off there as well.

Speaker: 00:19:36

Like, this is...

Speaker: 00:19:37

My understanding is also, like, traditional software engineering would also ask you

Speaker: 00:19:41

that question is, do you know if you need that scale?

Speaker: 00:19:44

Like, do you know if the marketing has been effective enough?

Speaker: 00:19:47

And do you know if the product...

Speaker: 00:19:49

Does the product manager actually...

Speaker: 00:19:50

Have they talked to the marketing department and seen an insane amount of, like,

Speaker: 00:19:54

interest?

Speaker: 00:19:54

And have they been able to calculate the amount of interest to then inform the scale

Speaker: 00:19:58

decision?

Speaker: 00:19:59

Because it is very common for software engineering teams during this process of

Speaker: 00:20:04

building the product to ask this question of, like, hey, do we want to be scalable

Speaker: 00:20:09

to, like, two million players on day one?

Speaker: 00:20:11

Or do we want to be scalable, you know, like, one world will be scalable to, like,

Speaker: 00:20:15

100,000 users at a time, that kind of thing.

Speaker: 00:20:18

And they make these decisions so that they can, you know, like, punt some of the

Speaker: 00:20:22

very complicated, very difficult things, in this case being, like, instead of

Speaker: 00:20:26

splitting this module into six different sub-modules, module three into six

Speaker: 00:20:30

different sub-modules, I'm just going to make module three into two sub-modules for

Speaker: 00:20:35

today.

Speaker: 00:20:35

And that'll satisfy my needs for now.

Speaker: 00:20:38

In this case, it feels like that process was a failure because somewhere, somehow

Speaker: 00:20:42

they didn't understand that the demand was so high and one of the core values as

Speaker: 00:20:47

gamers, as, like, people who enjoy playing games, waiting to get in to play your

Speaker: 00:20:51

game is a game-breaking experience.

Speaker: 00:20:54

Especially when you're super excited and you pre-ordered the game.

Speaker: 00:20:57

Yeah, and you paid extra.

Speaker: 00:20:58

Yeah, you paid extra.

Speaker: 00:21:00

And it's such a shame because I actually did, like, once we finally did play.

Speaker: 00:21:03

Yeah, it was fun.

Speaker: 00:21:04

It was fun.

Speaker: 00:21:04

But the game quickly died off

Speaker: 00:21:08

because, I mean, you had millions of players who could not play.

Speaker: 00:21:12

Yes.

Speaker: 00:21:13

Day one.

Speaker: 00:21:13

And then sometimes your friends would end up on the other server or the other world.

Speaker: 00:21:17

Yeah.

Speaker: 00:21:18

And they couldn't come and join you.

Speaker: 00:21:19

So all of a sudden, one of the main reasons I play games is a social, like, it's a

Speaker: 00:21:23

social thing for me.

Speaker: 00:21:24

I want to play with other people.

Speaker: 00:21:25

I want to play with my friends.

Speaker: 00:21:27

And if I can't play with my friends, I'm going to find a much more difficult return.

Speaker: 00:21:31

Absolutely.

Speaker: 00:21:31

So I do genuinely, like, question.

Speaker: 00:21:35

That's the traditional software engineering method.

Speaker: 00:21:37

So, like, waits for such a long span because the cost of solving these problems tend

Speaker: 00:21:43

to be, again, you're making commitments to your boss.

Speaker: 00:21:45

Yeah.

Speaker: 00:21:46

You're answering to leadership.

Speaker: 00:21:47

You're saying that, all right, leadership is saying that, OK, you have a budget of

Speaker: 00:21:50

these many people for these many weeks.

Speaker: 00:21:52

And if you can get the game released in those weeks, great.

Speaker: 00:21:56

Otherwise, we're going to maybe, like, go a few, you know, not give you a promotion,

Speaker: 00:21:59

whatever it is, right?

Speaker: 00:22:01

The value proposition is so different versus I have seen indie games that have been

Speaker: 00:22:06

so successful and they haven't.

Speaker: 00:22:09

But I also do understand that they don't have that same pressure of, like, you know,

Speaker: 00:22:13

corporate top down of, like, you need to release this soon, they will release it

Speaker: 00:22:18

when they want.

Speaker: 00:22:18

So I'm actually also curious.

Speaker: 00:22:20

Do you feel like do you feel pressure to release your project?

Speaker: 00:22:24

So thankfully, I actually have not released public information on this yet.

Speaker: 00:22:29

Oh, so so if someone finds this podcast, they will recognize my voice.

Speaker: 00:22:34

Then then they're going to start asking questions immediately.

Speaker: 00:22:38

There there are some people who know that it's coming.

Speaker: 00:22:42

I haven't given any hard timelines myself because this is,

Speaker: 00:22:47

you know, a project that I'm figuring out as I'm going along.

Speaker: 00:22:52

Absolutely.

Speaker: 00:22:53

But there is pressure because I think I set up a lot of my own

Speaker: 00:22:58

deadlines in my head.

Speaker: 00:23:00

Right.

Speaker: 00:23:00

There's I have a lot of expectations of where I should be by a certain time.

Speaker: 00:23:04

And that's part of the workflow.

Speaker: 00:23:06

And I'm like, OK, I'm going to trim this.

Speaker: 00:23:07

I'm going to you know, I'm going to accept that, you know, a lot of these things

Speaker: 00:23:12

aren't exactly as I want them, but I'm going to leave it and we're going to move on

Speaker: 00:23:16

and just try to get this working right now.

Speaker: 00:23:18

And I actually think I'm trying to get to in game testing as fast as possible,

Speaker: 00:23:23

because like you were saying, like before the podcast, the practical testing can

Speaker: 00:23:28

serve way it's way more efficient than just, you know, traditional

Speaker: 00:23:34

tests.

Speaker: 00:23:35

Yeah.

Speaker: 00:23:35

Yeah.

Speaker: 00:23:36

And I think that's a really interesting topic.

Speaker: 00:23:38

Actually, we talked about we will jump into that a little more after.

Speaker: 00:23:41

I do want to ask.

Speaker: 00:23:43

But you don't feel at this point you don't feel like you would compromise on certain

Speaker: 00:23:48

aspects, though, despite the time pressure, there are certain things that you are

Speaker: 00:23:52

very much like this is a critical thing based on your experience.

Speaker: 00:23:54

Yes.

Speaker: 00:23:55

Yeah.

Speaker: 00:23:55

Based on my experience, there's a lot of things at this point that I'm I'm holding

Speaker: 00:23:59

on to, which is really interesting to me, because when you talk about big

Speaker: 00:24:03

corporations and Amazon releasing it, the distance between the person who actually

Speaker: 00:24:07

understands the experience and the person building the experience is actually non

Speaker: 00:24:11

-trivial.

Speaker: 00:24:12

So I do feel like in big tech or in generally like big software organizations, that

Speaker: 00:24:16

is something that is I I'm really excited about the, you know, like coding tools and

Speaker: 00:24:22

all of these things becoming much more democratized, because the person who actually

Speaker: 00:24:27

understands the most about the experience now can actually ask direct questions

Speaker: 00:24:29

about, like, which parts of the experience are actually going to be implemented

Speaker: 00:24:33

versus not.

Speaker: 00:24:34

And I feel like I made the joke by you being the PM and the, you know, sometimes

Speaker: 00:24:38

that's I do feel like that that's a good thing, because I actually feel like you're

Speaker: 00:24:42

able to not only understand, but ask questions and actually also get into focus in

Speaker: 00:24:46

the right way.

Speaker: 00:24:47

Absolutely.

Speaker: 00:24:48

Though the testing part is still like an all huge, you know, open box.

Speaker: 00:24:54

And I guess I'm curious.

Speaker: 00:24:55

So, like, we were talking about certain number of lines of test versus code.

Speaker: 00:25:00

And do you want to share?

Speaker: 00:25:01

Yeah.

Speaker: 00:25:02

Yes.

Speaker: 00:25:02

I think

Speaker: 00:25:04

Project now has approximately two hundred thousand lines of code.

Speaker: 00:25:08

And it's about it's not exactly fifty fifty, but it's a little bit.

Speaker: 00:25:12

It's about fifty fifty.

Speaker: 00:25:14

It's going to be by the end of it.

Speaker: 00:25:16

I also meant the two hundred thousand lines is non-trivial.

Speaker: 00:25:18

Yeah, that that is a lot.

Speaker: 00:25:20

And I'm clearly not, you know, hand reviewing any of this.

Speaker: 00:25:26

But there is plenty of standards and conventions that we talked about last time that

Speaker: 00:25:30

are being taken into consideration during the iterative review rounds.

Speaker: 00:25:35

And actually to speak on that, give a little bit more since I've actually interfaced

Speaker: 00:25:39

with the project since then, quite a bit.

Speaker: 00:25:43

One module took 90 review rounds to converge into

Speaker: 00:25:48

what I at a certain point I actually had started to strip away requirements for the

Speaker: 00:25:53

testing.

Speaker: 00:25:54

Is that the module that you broke up or is that the module?

Speaker: 00:25:56

That's the module that I wrote.

Speaker: 00:25:57

OK, OK.

Speaker: 00:25:58

That explains all.

Speaker: 00:25:59

That does explain all.

Speaker: 00:26:00

Yeah.

Speaker: 00:26:00

It did finally convert.

Speaker: 00:26:02

Into where it was mostly like comments and code were just not consistent with

Speaker: 00:26:07

previous changes and at a certain point, I'm like, OK, we're going to keep finding

Speaker: 00:26:11

things forever.

Speaker: 00:26:12

Yeah.

Speaker: 00:26:12

And so I'm like, I think this is a good place to stop.

Speaker: 00:26:16

Yes.

Speaker: 00:26:17

And actually, since I actually went through a review review process

Speaker: 00:26:23

with the review cycle.

Speaker: 00:26:26

Yeah.

Speaker: 00:26:26

With Claude.

Speaker: 00:26:27

And I said, OK, let's take a look.

Speaker: 00:26:30

I had it log all 90 rounds.

Speaker: 00:26:32

You reflected on the review process of like this 90 iterations.

Speaker: 00:26:37

OK, well, OK, yeah.

Speaker: 00:26:38

Tell us more.

Speaker: 00:26:39

I had it.

Speaker: 00:26:40

I had it, you know, from the get go log all 90 rounds.

Speaker: 00:26:44

It logged everything from all the different the three models that I used to review

Speaker: 00:26:48

things.

Speaker: 00:26:48

Really good space to do a reflection on 90 rounds is a lot.

Speaker: 00:26:51

I was like, I was like, I burned a lot.

Speaker: 00:26:54

Speaker: 00:26:55

and so it came back and looked through everything and it gave me.

Speaker: 00:27:00

So I was like, what?

Speaker: 00:27:01

I asked it.

Speaker: 00:27:02

You know, simple English.

Speaker: 00:27:03

Yeah.

Speaker: 00:27:03

What can we do to lower the amount of rounds?

Speaker: 00:27:07

Give me the executive review.

Speaker: 00:27:08

Yeah.

Speaker: 00:27:08

And it came back with a lot.

Speaker: 00:27:10

I'm not exactly familiar with maybe all the terminology specifically.

Speaker: 00:27:15

But there were things like I remember it saying, like, it will add it'll do like a

Speaker: 00:27:21

pre-flight check.

Speaker: 00:27:22

It will, like, trace all the methods and classes ahead of time against the specs.

Speaker: 00:27:27

It will, you know, it will have an index of what it needs to look for.

Speaker: 00:27:31

It added a few linters.

Speaker: 00:27:34

OK, yeah.

Speaker: 00:27:35

Yeah.

Speaker: 00:27:36

That's good.

Speaker: 00:27:36

It did.

Speaker: 00:27:37

So those are improvements.

Speaker: 00:27:38

Yeah, yeah.

Speaker: 00:27:38

Yeah.

Speaker: 00:27:39

And it's actually interesting.

Speaker: 00:27:40

The next model on the next module that it reviewed only took five rounds.

Speaker: 00:27:46

That 90 to five.

Speaker: 00:27:47

That's that's pretty.

Speaker: 00:27:48

That's pretty.

Speaker: 00:27:49

OK, I must also play devil's advocate and ask how big was the other one?

Speaker: 00:27:53

The it's I would say that they were similar, similar, similar.

Speaker: 00:27:58

But but here's the thing.

Speaker: 00:27:59

Here's the thing.

Speaker: 00:28:01

There's another reason why.

Speaker: 00:28:02

Yeah, because of the a lot of the rounds were

Speaker: 00:28:07

finding things that it should have picked up the first time.

Speaker: 00:28:11

OK, OK.

Speaker: 00:28:12

Right.

Speaker: 00:28:12

And that's where those things like tracing all the tracing the comments back.

Speaker: 00:28:17

Yeah, the linters, all of these things really did reduce a lot of the

Speaker: 00:28:22

noise.

Speaker: 00:28:24

So interestingly, I recently read this as well as like I was going through Claude's

Speaker: 00:28:29

has updated documentation online.

Speaker: 00:28:32

They actually I think a while ago put this user guide and I think it's pretty

Speaker: 00:28:37

buried.

Speaker: 00:28:38

Unfortunately, I do feel like it's a little buried.

Speaker: 00:28:39

One of their strong recommendations is plan first always.

Speaker: 00:28:42

But even before planning, you should actually ask it to understand research what

Speaker: 00:28:47

this module is doing or what this code looks like.

Speaker: 00:28:50

How does it actually trace down a particular feature?

Speaker: 00:28:52

So it's like, OK, how does your authentication flow work?

Speaker: 00:28:56

Like would be a good question.

Speaker: 00:28:57

Right.

Speaker: 00:28:57

And that actually does pre-work.

Speaker: 00:28:59

It says that, all right, capture all the information about the authentication flow,

Speaker: 00:29:01

because then I know exactly where I need to make the updates.

Speaker: 00:29:05

Sounds like that's you've experienced that firsthand now.

Speaker: 00:29:09

Yes, yes.

Speaker: 00:29:10

And like I was like last show,

Speaker: 00:29:14

there's a lot of things I'm finding out by brute force, like I'm developing the

Speaker: 00:29:19

process that, you know, a software engineer would have known.

Speaker: 00:29:23

Yeah.

Speaker: 00:29:23

Or would have been would have been instructed to do more than even know, like when

Speaker: 00:29:27

told, turn your brain off.

Speaker: 00:29:28

Just follow the process.

Speaker: 00:29:29

Yeah.

Speaker: 00:29:29

Which is I do feel like that's that's actually a very interesting distinction I want

Speaker: 00:29:33

to get back into later is

Speaker: 00:29:36

you're learning the reason why the judgment exists or the process exists.

Speaker: 00:29:41

You're using your judgment and then getting Claude to give you the right information

Speaker: 00:29:46

so that you can take the correct judgments that, you know, like probably engineering

Speaker: 00:29:51

teams have been doing ad nauseum across time.

Speaker: 00:29:54

And that ends up becoming either tribal knowledge or becomes very strict process.

Speaker: 00:29:58

It's what it feels like to me.

Speaker: 00:30:00

Like, that's that's what I'm hearing almost.

Speaker: 00:30:02

I'm curious how many more like do you feel like we also talked about using existing

Speaker: 00:30:08

tools, we also talked about like some project or so get you done as a repository

Speaker: 00:30:12

that I keep I keep talking, yes, yes, we'll we'll we'll see if that gets flagged in

Speaker: 00:30:16

some.

Speaker: 00:30:17

That is probably what got flagged on the podcast.

Speaker: 00:30:20

OK, that is probably all right.

Speaker: 00:30:22

So we were trying to syndicate our podcasts across things and the tool's name has

Speaker: 00:30:26

now become a problem.

Speaker: 00:30:27

So I'm sorry.

Speaker: 00:30:28

Andrew, you're going to have to find it.

Speaker: 00:30:29

You're going to have to get OK.

Speaker: 00:30:30

We're going to get flagged again.

Speaker: 00:30:33

Oh, it's an official.

Speaker: 00:30:34

I'll ask Claude to beep it out.

Speaker: 00:30:36

Yeah, yeah.

Speaker: 00:30:38

If it can do it, that would be fantastic.

Speaker: 00:30:40

That would be interesting.

Speaker: 00:30:41

And would also at some point love to learn more about the whole process.

Speaker: 00:30:44

We should be talking about that at some point, too.

Speaker: 00:30:45

Yes, yes, absolutely.

Speaker: 00:30:46

There's a whole process that I've used to normalize the audio to transcribe

Speaker: 00:30:52

per speaker.

Speaker: 00:30:54

That's yes, that's pretty amazing.

Speaker: 00:30:56

The podcast.

Speaker: 00:30:57

I did.

Speaker: 00:30:57

I did read the transcript, at least blurbs here and there.

Speaker: 00:31:00

I was like, OK, it's pretty good.

Speaker: 00:31:01

It was impressive.

Speaker: 00:31:03

And it was done with local compute as well.

Speaker: 00:31:06

Yeah, on GPF.

Speaker: 00:31:07

That's another that's a successful project right there.

Speaker: 00:31:09

Yes.

Speaker: 00:31:10

Anyway, sorry, coming back into this.

Speaker: 00:31:13

Do you feel like there are you're learning a lot of things based on your individual

Speaker: 00:31:17

judgment as well and you're learning a lot about the tool life is what I feel like

Speaker: 00:31:22

is most likely happening.

Speaker: 00:31:23

So maybe I should ask you that question.

Speaker: 00:31:25

You're right here.

Speaker: 00:31:26

Do you feel like.

Speaker: 00:31:27

You're learning a lot more about the tool and the process that you would follow and

Speaker: 00:31:30

building now with the modern AI tooling?

Speaker: 00:31:34

Definitely.

Speaker: 00:31:37

I guess, too, I think that was a really broad question.

Speaker: 00:31:40

It was very broad, I'm sorry.

Speaker: 00:31:41

So I was like, oh, where do I start with this?

Speaker: 00:31:44

I guess you have a little bit more specific to help me nail something down.

Speaker: 00:31:47

Yeah, yeah, definitely.

Speaker: 00:31:48

Definitely.

Speaker: 00:31:49

And without spending too much.

Speaker: 00:31:51

So like we talked about reviews, we talked about basically analyzing the codebase

Speaker: 00:31:56

ahead of time, basically doing research, pre-research ahead of time, then planning

Speaker: 00:32:00

and then executing.

Speaker: 00:32:02

Do you feel like there are more examples where you're like, OK, this is taking way

Speaker: 00:32:08

longer, this was too big, and I think maybe more like it

Speaker: 00:32:14

didn't understand me here clearly and it did all these things as well.

Speaker: 00:32:17

One of the things I think you mentioned to me earlier was you did tell Claude to

Speaker: 00:32:22

exercise his own judgment.

Speaker: 00:32:24

And that is something you're still working on figuring out if that was effective or

Speaker: 00:32:28

not.

Speaker: 00:32:28

Right.

Speaker: 00:32:29

Yes.

Speaker: 00:32:29

So so I guess, yeah, I remember talking about this last show.

Speaker: 00:32:33

Like, there's a lot of things that Claude will just fill in the blanks, fill in the

Speaker: 00:32:37

gaps, especially if you're not specific or intentful with your instructions.

Speaker: 00:32:42

If you say code this, well, it's going to code it.

Speaker: 00:32:46

But is it going to be the way you wanted it?

Speaker: 00:32:49

Is it going to be, you know, is it going to work more than once?

Speaker: 00:32:51

You know, you know, and will you be will you be able to follow its thought process

Speaker: 00:32:57

as well?

Speaker: 00:32:58

Exactly.

Speaker: 00:32:58

Yes.

Speaker: 00:32:59

I think before the podcast, we were talking a little bit about like how you've also

Speaker: 00:33:02

maybe you're using a particular workflow here to actually also learn new things

Speaker: 00:33:08

because it may tell you things and then you're just like, what do you mean?

Speaker: 00:33:11

Yes.

Speaker: 00:33:12

Yes.

Speaker: 00:33:12

And I'll speak to that.

Speaker: 00:33:14

Yeah, there's a lot of times where Claude will present an issue to me that popped up

Speaker: 00:33:19

for review.

Speaker: 00:33:21

And I'll read over it and I'll be like, what?

Speaker: 00:33:25

What are you saying?

Speaker: 00:33:26

Yeah.

Speaker: 00:33:27

And so I will literally just ask, like, what do you mean?

Speaker: 00:33:30

Like, could you expand on this a little bit more?

Speaker: 00:33:33

And then after it starts talking, I was like, OK, I'm starting to get the picture.

Speaker: 00:33:37

I ask more specific questions.

Speaker: 00:33:39

Right.

Speaker: 00:33:40

And I keep going and I keep going.

Speaker: 00:33:41

And eventually I have the entire picture and then I'll make a decision.

Speaker: 00:33:46

Yes.

Speaker: 00:33:47

And sometimes by the time I get to there, I'll say, well, we don't need.

Speaker: 00:33:51

We don't need any of this.

Speaker: 00:33:52

Yes.

Speaker: 00:33:53

OK.

Speaker: 00:33:53

And I think that is so this is this is great because this is actually what I wanted

Speaker: 00:33:57

to ask you is how much time did you end up spending doing that?

Speaker: 00:34:00

Because that is a huge value add, right?

Speaker: 00:34:03

Like that is all of a sudden you're like not only learning about the system, you're

Speaker: 00:34:06

also able to then detect it's like this was not a valuable portion of the system,

Speaker: 00:34:09

let's just get rid of it.

Speaker: 00:34:10

That's actually a really good thing.

Speaker: 00:34:11

Honestly, like sometimes you know the tool can do so much.

Speaker: 00:34:15

It does a lot.

Speaker: 00:34:15

And then you all of a sudden you can be like, oh, no, we can simplify.

Speaker: 00:34:18

We should simplify it.

Speaker: 00:34:20

And that is something that takes software engineering teams years potentially before

Speaker: 00:34:25

they realize that the systems they built, there are some portions which are not

Speaker: 00:34:28

useful and not needed and we can get rid of them and then also reduce the amount of

Speaker: 00:34:32

time we take or we spend maintaining them.

Speaker: 00:34:35

And that's cost.

Speaker: 00:34:36

That's again leadership cost.

Speaker: 00:34:38

I'm like curious, like how long now it took you to be able to learn some things

Speaker: 00:34:41

based on like just asking Claude.

Speaker: 00:34:43

Was it days?

Speaker: 00:34:44

Was it hours?

Speaker: 00:34:46

Oh, it's usually minutes.

Speaker: 00:34:47

It's usually all right.

Speaker: 00:34:48

Well, it's usually minutes.

Speaker: 00:34:50

I mean, it when something's costing me time, I

Speaker: 00:34:56

think it's going to be an easy lesson.

Speaker: 00:34:59

That's that's because you are definitely being able to see the, you know, your limit

Speaker: 00:35:04

getting exhausted.

Speaker: 00:35:05

Yeah, that's it.

Speaker: 00:35:07

I don't wait to see the percent change on weekly.

Speaker: 00:35:09

I hit refresh after two, three minutes and well, there's another percent gone.

Speaker: 00:35:13

And you're like, oh, that's that's a heavy.

Speaker: 00:35:15

Yeah.

Speaker: 00:35:16

Yeah.

Speaker: 00:35:16

It was when I had access to Fable for the three days there

Speaker: 00:35:22

was I actually was not specific on something that Fable

Speaker: 00:35:28

was researching for me.

Speaker: 00:35:29

And it presented me with a couple of options.

Speaker: 00:35:32

And I said, expand on this option.

Speaker: 00:35:35

It launched a hundred agent workflow

Speaker: 00:35:39

to research this and came back with the most worthless response

Speaker: 00:35:45

I've had, probably from Claude.

Speaker: 00:35:49

But it was successful in burning 20 percent of my weekly limit in 30 minutes.

Speaker: 00:35:53

Oh, that's 20x too.

Speaker: 00:35:55

Yeah, on 20x.

Speaker: 00:35:56

Yeah, that was about three point three million tokens for that response.

Speaker: 00:36:00

Oh, wow.

Speaker: 00:36:01

Yeah.

Speaker: 00:36:01

Yeah.

Speaker: 00:36:02

And that would have been what that would have been if it was all output.

Speaker: 00:36:05

That would have been like one hundred and fifty bucks an API.

Speaker: 00:36:08

Wow.

Speaker: 00:36:09

Yeah.

Speaker: 00:36:09

Yeah.

Speaker: 00:36:10

Yeah.

Speaker: 00:36:11

That's that's a lot.

Speaker: 00:36:12

That's an.

Speaker: 00:36:13

And that is also something that happens often.

Speaker: 00:36:19

But no, that's that's really interesting.

Speaker: 00:36:21

So it moves so fast that it essentially did so much work.

Speaker: 00:36:25

That was not really worth much at the end.

Speaker: 00:36:28

It told you something you mostly probably already knew.

Speaker: 00:36:30

Yeah.

Speaker: 00:36:30

And I was really simple.

Speaker: 00:36:32

I just expand on option two versus like go do a million.

Speaker: 00:36:36

You know, I didn't I didn't say do like a deep deep research report.

Speaker: 00:36:40

I don't know a PhD on this.

Speaker: 00:36:41

Yeah.

Speaker: 00:36:42

I just I just wanted a simple explanation.

Speaker: 00:36:44

And the thing was, I'm running multiple workflows.

Speaker: 00:36:47

I all tabbed came back.

Speaker: 00:36:50

I'm like, word, what happened here?

Speaker: 00:36:52

All right.

Speaker: 00:36:53

OK, so I do want to ask this question now is like in your mind, like, do you feel

Speaker: 00:36:58

like with events like this in general, like you now need to

Speaker: 00:37:03

really incorporate more rituals, more processes, make your project more of a

Speaker: 00:37:09

process if you want to make this into a.

Speaker: 00:37:12

Real life project or sorry, into a real life product or a real life.

Speaker: 00:37:16

Like, do you feel like now when you start running into things like this, your

Speaker: 00:37:20

confidence level has dropped enough that you now need to add things like guardrails

Speaker: 00:37:23

to increase your confidence, because one thing you already said was you did change

Speaker: 00:37:27

the process to do pre-work and that pre-work saved you from 90 to five iterations,

Speaker: 00:37:33

so like nine iterations on a single module to five, which is massive.

Speaker: 00:37:37

Do you feel like now you'd be more interested in looking at opportunities?

Speaker: 00:37:42

To add more guardrails, to slow it down intentionally?

Speaker: 00:37:45

So so that's a great thing to ask, because I actually that's my next kind of

Speaker: 00:37:51

process improvement part of the workflow is that I realized that I'm creating

Speaker: 00:37:57

a lot of workflows for it to follow.

Speaker: 00:38:00

It doesn't always follow them correctly.

Speaker: 00:38:02

Right.

Speaker: 00:38:03

Even if it's literally reading a script.

Speaker: 00:38:05

And then I realized I probably should be using skills and I haven't.

Speaker: 00:38:10

And so

Speaker: 00:38:12

my next step actually is to implement all the workflows that I have been using over

Speaker: 00:38:16

and over and implement them as formal like Claude or Kodak skills.

Speaker: 00:38:21

Yes.

Speaker: 00:38:21

Nice.

Speaker: 00:38:22

I said, OK, I am a little I may have some biased thoughts over there because I've

Speaker: 00:38:26

been using skills for a while.

Speaker: 00:38:28

But before we get into that, could I would you like to explain for audiences in case

Speaker: 00:38:33

like what is a skill versus like what is other prompting in traditional software or

Speaker: 00:38:38

prompting?

Speaker: 00:38:39

So, you know, maybe I you might be able to talk more to this, but

Speaker: 00:38:44

but I'll give you my interpretation of it first, because, again, I haven't actually

Speaker: 00:38:49

made a skill with Claude yet.

Speaker: 00:38:51

So that that was the next thing I'm going to do.

Speaker: 00:38:54

There's a really useful tool from Anthropic to do it.

Speaker: 00:38:57

But this is the skill tree.

Speaker: 00:38:59

The skill creator is so, so creative.

Speaker: 00:39:04

From my understanding.

Speaker: 00:39:08

Speaker: 00:39:10

essentially

Speaker: 00:39:12

is like a set of instructions that points Claude to where it needs to find the

Speaker: 00:39:17

information to replicate something

Speaker: 00:39:19

over and over, like a workflow.

Speaker: 00:39:21

Right.

Speaker: 00:39:22

Yes.

Speaker: 00:39:23

There is for me, it's like I don't know exactly how it was that different than

Speaker: 00:39:30

me telling you to read a file.

Speaker: 00:39:32

See, so that that's the question, because that's what I had already.

Speaker: 00:39:35

Like, all I have to do is tell Claude will convert this runbook into a skill.

Speaker: 00:39:41

That's all I'm going to say.

Speaker: 00:39:42

OK.

Speaker: 00:39:42

And then let's see if it works.

Speaker: 00:39:44

Right.

Speaker: 00:39:46

I'm hoping it will because it's it's all right.

Speaker: 00:39:48

All the instructions already there.

Speaker: 00:39:51

That's yes.

Speaker: 00:39:52

I think I think generally you've got the right gist of it.

Speaker: 00:39:54

It is a workflow for Claude to execute.

Speaker: 00:39:57

And in a sense of like instructions that it should know how to like methodically

Speaker: 00:40:01

say, do one, two, three, and you'll get the result that is intended as part of this

Speaker: 00:40:05

skill.

Speaker: 00:40:06

Like, for example, a skill could be like a tax preparation skill could be like, OK,

Speaker: 00:40:10

you know, put together the person's W2 and then put together all the right fields,

Speaker: 00:40:15

fill up the 1099 form or whatever the form is and then submit it.

Speaker: 00:40:19

Right.

Speaker: 00:40:19

Those would be the skill.

Speaker: 00:40:21

The thing I can add that I what I understand why it tends to behave differently than

Speaker: 00:40:25

traditionally, like just saying a prompt and then telling it to go read a file is

Speaker: 00:40:29

there is something the concept of system prompts versus user prompts.

Speaker: 00:40:35

Yeah.

Speaker: 00:40:35

So system prompts are what the AI model is essentially used to give themselves

Speaker: 00:40:40

context or like separate the role from like, oh, this is supposed to be me doing

Speaker: 00:40:45

some work versus this is what the person who's talking to me is asking.

Speaker: 00:40:48

So that person talking to me from a model perspective with the user prompt, whereas

Speaker: 00:40:53

the system prompt is essentially the actual model's own identity in a sense.

Speaker: 00:40:57

Like, what is its role?

Speaker: 00:40:59

What is its job?

Speaker: 00:41:00

What context is it?

Speaker: 00:41:01

Am I working for Anthropic?

Speaker: 00:41:02

What is my language supposed to be like?

Speaker: 00:41:04

Am I supposed to avoid saying things like, you know, like avoid profanity, avoid

Speaker: 00:41:09

doing like suggesting things that are not real, always base my things in reality.

Speaker: 00:41:13

So the there are two prompts that every AI system or every chatbot system basically

Speaker: 00:41:19

usually uses.

Speaker: 00:41:20

It is a system.

Speaker: 00:41:21

It is a user prompt because you could have n number of user prompts that get built

Speaker: 00:41:25

up over time.

Speaker: 00:41:25

But there's only one system prompt for that system to build over time, which was a

Speaker: 00:41:30

problem in the past, because if there was some very specific instructions like tax

Speaker: 00:41:35

reparation, that is very repeatable, that's very standard.

Speaker: 00:41:39

The system prompt might not be able to hold every single set of

Speaker: 00:41:45

like tax preparation

Speaker: 00:41:47

and, you know, like a writing expert and let's say, you know, a software engineer.

Speaker: 00:41:52

Right.

Speaker: 00:41:53

You can't put all of those into the system prompt.

Speaker: 00:41:54

It just gets too big.

Speaker: 00:41:55

So what they actually the innovation here with the skill was that the system prompt

Speaker: 00:42:00

would have a stub and then you could replace that stub with a user provided set of

Speaker: 00:42:06

instructions, which was a skill.

Speaker: 00:42:07

So you have within the system prompt, a specific section that's like add skill text

Speaker: 00:42:12

here.

Speaker: 00:42:13

And even the skill is supposed to follow a particular format that works well with

Speaker: 00:42:18

that model, which is supposed to give information like, all right, give me the

Speaker: 00:42:22

circumstances under which the skill needs to be invoked.

Speaker: 00:42:25

Give me, you know, like the actual step by step instruction first.

Speaker: 00:42:28

Give me some examples like how this works.

Speaker: 00:42:31

Give me like some starting text and what the eventual response should look like,

Speaker: 00:42:34

because then all of those things can go into that system prompt.

Speaker: 00:42:38

And then it's like the.

Speaker: 00:42:40

The AI model has a more specific role and is now executing a particular skill like

Speaker: 00:42:44

tax preparation and instead of the user prompt defining all of those things, which

Speaker: 00:42:49

ends up sometimes also being isolated and separated because you're also doing things

Speaker: 00:42:52

like when a user asks you for something, the AI model may or may not do everything

Speaker: 00:42:57

because a user may ask you to do bad things as well, like, you know, maybe a prompt

Speaker: 00:43:01

injection trying to do something negative is also a possibility.

Speaker: 00:43:04

So that same isolation that occurs on the user input or the sanitization

Speaker: 00:43:10

that occurs on user input does not apply to the skill and the system prompt.

Speaker: 00:43:13

Therefore, the skill is intended to be more focused with the instructions and follow

Speaker: 00:43:17

a particular format.

Speaker: 00:43:18

That's that's maybe the overly detailed explanation of it.

Speaker: 00:43:22

But yes, the intention is supposed to be that you can repeatable workflows end up in

Speaker: 00:43:27

skills.

Speaker: 00:43:28

They actually get honored more effectively.

Speaker: 00:43:30

So they actually get treated more like the model's gospel, like things that it'll

Speaker: 00:43:35

follow religiously versus not.

Speaker: 00:43:37

So, yes.

Speaker: 00:43:38

But exactly as you said, it is a workflow.

Speaker: 00:43:40

It is actually intended to be focused on a specific area and solve that repeatedly.

Speaker: 00:43:44

Yes, we just went into a five minute conversation about what a skill is.

Speaker: 00:43:48

But anyway, Andrew, come back.

Speaker: 00:43:50

Sorry.

Speaker: 00:43:50

Yeah, skills.

Speaker: 00:43:50

So you want to try skills next?

Speaker: 00:43:52

Yes.

Speaker: 00:43:52

Yes.

Speaker: 00:43:53

I want to try to create my own skills through the workflows I already have

Speaker: 00:43:57

outstanding and hopefully I will get more consistency.

Speaker: 00:44:01

That's my goal.

Speaker: 00:44:02

Like the workflows work, but there's so many instructions and so many

Speaker: 00:44:08

things that need to be followed.

Speaker: 00:44:10

Yes.

Speaker: 00:44:11

The models are just forgetting, conveniently forgetting certain steps along the way.

Speaker: 00:44:15

I'll be like, why aren't you doing this?

Speaker: 00:44:18

And then it'll be like, oh, it was because I interpreted,

Speaker: 00:44:24

you know, something I said earlier differently.

Speaker: 00:44:27

Like I sometimes it will say, for example,

Speaker: 00:44:32

sometimes I'll say continue autonomously.

Speaker: 00:44:35

Yes.

Speaker: 00:44:36

And it's very simple, straightforward.

Speaker: 00:44:38

Very straightforward.

Speaker: 00:44:38

It understands that.

Speaker: 00:44:39

Right.

Speaker: 00:44:40

But sometimes it will still,

Speaker: 00:44:43

you know, a blocker will come up even though I have the protocol for that.

Speaker: 00:44:47

It will show the decision menu and it would pause the whole workflow.

Speaker: 00:44:52

I'm not looking at it 24-7.

Speaker: 00:44:54

I come back, it's been sitting there for five hours waiting for me to say something.

Speaker: 00:44:59

Yeah.

Speaker: 00:45:00

And you're like, well, you know, you should have just continued autonomously.

Speaker: 00:45:05

You had the information.

Speaker: 00:45:06

Yeah.

Speaker: 00:45:07

And and then I I would ask it, so so why did you stop?

Speaker: 00:45:10

Yep.

Speaker: 00:45:11

And it was funny.

Speaker: 00:45:13

It literally said I didn't have a good reason to stop.

Speaker: 00:45:17

Oh, my gosh.

Speaker: 00:45:18

OK.

Speaker: 00:45:20

It needed your direction, Andrew.

Speaker: 00:45:21

Yeah, yeah, yeah.

Speaker: 00:45:24

And yeah, it was it's it can lose, I guess.

Speaker: 00:45:29

It was a simple thing to remember, but it still got drowned out

Speaker: 00:45:34

over time.

Speaker: 00:45:35

Yeah, I mean, and I definitely.

Speaker: 00:45:38

I think that is a challenge when context windows get long.

Speaker: 00:45:41

Yeah.

Speaker: 00:45:42

Models have been known to bias towards remembering the last thing you told them or

Speaker: 00:45:47

the first thing you told them.

Speaker: 00:45:48

And everything in between kind of just gets muddled.

Speaker: 00:45:52

Modern techniques and modern, you know, like the latest versions of Claude may be

Speaker: 00:45:57

like are better at this in certain

Speaker: 00:45:59

circumstances that are not so that they are still susceptible to them.

Speaker: 00:46:02

So there's still a possibility that will occur.

Speaker: 00:46:03

Something in the middle just gets lost.

Speaker: 00:46:05

When you said I imagine when you said continue autonomously, it's probably like

Speaker: 00:46:09

continue what?

Speaker: 00:46:10

And then it just was like, well, it was a bit better than that.

Speaker: 00:46:14

Well, yeah, I understand.

Speaker: 00:46:17

I totally agree.

Speaker: 00:46:18

Yeah.

Speaker: 00:46:18

So this does feel like there's some aspect of it where it got lost in the sauce

Speaker: 00:46:23

somewhere at some point.

Speaker: 00:46:25

And that is where something like a skill, which is like, OK, regardless of what the

Speaker: 00:46:29

person is asking me, this is what I'm supposed to be able to do is like a kind of

Speaker: 00:46:33

like it's it's kind of like a grounding.

Speaker: 00:46:38

Truth for it.

Speaker: 00:46:38

It's like this is the ground truth for me to follow or the grounding instructions

Speaker: 00:46:41

for me to follow.

Speaker: 00:46:42

So it won't ever like it always consider that, OK, regardless of what this person

Speaker: 00:46:46

said, what is the how does that work into the grounding truth of these instructions

Speaker: 00:46:50

that I'm supposed to follow and then it'll ask for clarification ahead of time and

Speaker: 00:46:55

it won't just arbitrarily just wait so that that is maybe the way to put it as well

Speaker: 00:46:59

as like you can if you have a skill for a researcher, the researcher will be like,

Speaker: 00:47:04

oh, I can't start until they give me all this information.

Speaker: 00:47:06

But once you give me that information, I can do this thing on its own.

Speaker: 00:47:10

And the similar thing is like when you hand it off from a researcher to, let's say,

Speaker: 00:47:13

a actual planning agent, then that planning agent also, if it has the right skill,

Speaker: 00:47:19

can also be like, let me clarify what I need ahead of time and then I can basically

Speaker: 00:47:22

move on.

Speaker: 00:47:23

So that's also like some of the skill benefits are also like giving a very solid

Speaker: 00:47:28

understanding of what's required to start.

Speaker: 00:47:30

And then because now you have a very clear understanding of what's required to

Speaker: 00:47:33

start, the model can also ask questions to make sure that it has enough information.

Speaker: 00:47:38

So you can ask clarifying questions and do all that stuff ahead of time.

Speaker: 00:47:42

So, yeah, sorry.

Speaker: 00:47:43

But yeah, I can keep going.

Speaker: 00:47:46

I think the important part for me

Speaker: 00:47:49

is that just like you were saying, as we started this like little tangent here,

Speaker: 00:47:54

was I like losing confidence in it, you know, performing things that I have

Speaker: 00:48:00

already established.

Speaker: 00:48:00

I have done what the models just did.

Speaker: 00:48:04

But yes, sorry, sorry.

Speaker: 00:48:05

And I'm hoping, like you implied, that the skills

Speaker: 00:48:11

will be a guardrail, that it will protect this workflow.

Speaker: 00:48:16

It's like I feel like the workflow is at a point where it's near perfection.

Speaker: 00:48:21

It's never going to be perfect, but it's near perfection enough that I really want

Speaker: 00:48:25

it to follow it.

Speaker: 00:48:27

And it needs to be able to follow it over multiple hours.

Speaker: 00:48:31

Yes.

Speaker: 00:48:31

Yes.

Speaker: 00:48:31

And I think that's a really important part is over time.

Speaker: 00:48:34

Yeah.

Speaker: 00:48:35

Yeah.

Speaker: 00:48:35

Because there's times where I see it, you know, midstream, there's regression.

Speaker: 00:48:40

It's like, oh, it's no longer like I'll try to give some concrete examples.

Speaker: 00:48:46

Like I would ask it to like it's part of a workflow and which is in a runbook.

Speaker: 00:48:52

So it has a file to check what it needs to do every time.

Speaker: 00:48:55

This is obviously not a skill.

Speaker: 00:48:57

I would say please print out the findings per model.

Speaker: 00:49:04

The severity.

Speaker: 00:49:05

Right.

Speaker: 00:49:05

A little brief description of what it was.

Speaker: 00:49:08

Yes.

Speaker: 00:49:08

Right.

Speaker: 00:49:08

And so that way, when I come over to check the logs later.

Speaker: 00:49:11

Yes.

Speaker: 00:49:12

Yeah, I can see.

Speaker: 00:49:13

OK, Deepsea found this.

Speaker: 00:49:15

OK, Codex found that.

Speaker: 00:49:17

All right.

Speaker: 00:49:18

And I also have some requirements for it to keep track of Deepsea usage since I'm

Speaker: 00:49:23

actually paying API spend on that.

Speaker: 00:49:25

Yes.

Speaker: 00:49:26

There's a lot of notes that I can go over and review later with Claude to optimize

Speaker: 00:49:31

things more, determine whether these models are worth keeping around.

Speaker: 00:49:34

Yeah, right.

Speaker: 00:49:35

And that's great.

Speaker: 00:49:36

I was actually curious.

Speaker: 00:49:38

So do you use these output contracts as the union mechanism as well?

Speaker: 00:49:42

Like, how do you you mentioned that there was all unions between the findings?

Speaker: 00:49:47

Yeah.

Speaker: 00:49:47

You rely on the contracts to essentially.

Speaker: 00:49:49

Yes.

Speaker: 00:49:50

Yes.

Speaker: 00:49:50

Because they're supposed to wait.

Speaker: 00:49:52

The orchestrator is supposed to wait for every single one to finish first.

Speaker: 00:49:56

They're all running blind.

Speaker: 00:49:58

Nice.

Speaker: 00:49:59

All of them have optimized prompts per like like angle for that for that

Speaker: 00:50:05

model.

Speaker: 00:50:05

Yeah.

Speaker: 00:50:06

Yeah.

Speaker: 00:50:06

For for that specific module, what I'm looking for.

Speaker: 00:50:09

Oh, nice.

Speaker: 00:50:10

Yes.

Speaker: 00:50:10

Yeah.

Speaker: 00:50:10

And it's preloaded again with all the trace patterns that it already knows it needs.

Speaker: 00:50:17

Nice.

Speaker: 00:50:17

And so as the reviews go along, it's only deltas.

Speaker: 00:50:20

And to be one question, just to clarify as well, like I said, contracts.

Speaker: 00:50:24

I didn't clarify.

Speaker: 00:50:25

It's a structure, right?

Speaker: 00:50:26

It's basically like a schema.

Speaker: 00:50:26

It's like, yes.

Speaker: 00:50:27

Yeah.

Speaker: 00:50:28

OK, so sorry, just quick clarification, but continue the workflow of the workflow

Speaker: 00:50:33

here.

Speaker: 00:50:33

Yeah, yeah, yeah.

Speaker: 00:50:34

So so how another part of refining the reviews that I remembered was instead

Speaker: 00:50:40

of reviewing the whole module every single time,

Speaker: 00:50:44

it would start to nail down on its own where the problems, where the seams are

Speaker: 00:50:49

really obvious is maybe a good way to put it.

Speaker: 00:50:52

And at a certain point, the majority of the reviews are only deltas.

Speaker: 00:50:57

What did we change?

Speaker: 00:50:59

What what needs more attention?

Speaker: 00:51:01

Did the fix implement correctly?

Speaker: 00:51:03

Did the test work right?

Speaker: 00:51:04

Yes.

Speaker: 00:51:06

And at a certain point it converges.

Speaker: 00:51:09

There's no more like I stage it like you had priority one, two or three.

Speaker: 00:51:13

Yes.

Speaker: 00:51:13

I have blockers, warnings,

Speaker: 00:51:18

defers, suggestions.

Speaker: 00:51:19

Yeah.

Speaker: 00:51:19

And nits.

Speaker: 00:51:22

Well, yeah, those those are that is actually also very common software engineering

Speaker: 00:51:26

methodology terminology.

Speaker: 00:51:27

Yes, I mean, Claude gave it to me.

Speaker: 00:51:29

Yeah, I mean, I asked it like, well, how would we set this up?

Speaker: 00:51:33

How would we define the different levels?

Speaker: 00:51:35

And that is what it chose for me.

Speaker: 00:51:38

And it made sense.

Speaker: 00:51:39

And I was like, oh, I'll go with it.

Speaker: 00:51:40

It makes sense.

Speaker: 00:51:41

It's pretty, pretty.

Speaker: 00:51:42

It's a good one.

Speaker: 00:51:43

Yeah.

Speaker: 00:51:44

OK, I think the one thing I will definitely say here is that sounds like

Speaker: 00:51:49

we're kind of, in a sense, also like converging onto a particular process,

Speaker: 00:51:55

because a lot of what I'm hearing is actually like very common with like a lot of

Speaker: 00:51:59

the capabilities are the way that, you know, we operate as a software company and

Speaker: 00:52:04

also like the way that my previous companies operated, like treating software

Speaker: 00:52:09

development more as a process.

Speaker: 00:52:10

And I'm even more curious now that as you have if you feel like you're transitioning

Speaker: 00:52:15

from more of a vibe coding to a more of a process driven approach or do you feel

Speaker: 00:52:20

like, oh, no, I don't actually want to get too much in the process, because that's

Speaker: 00:52:23

another big thing I've seen still, like I don't know how prevalent the term still

Speaker: 00:52:28

is.

Speaker: 00:52:28

I still think it's very prevalent.

Speaker: 00:52:30

Vibe coding is I hear it all the time.

Speaker: 00:52:32

So I want to get your take on do you feel like going away from vibe coding and more

Speaker: 00:52:38

engineering is a good thing or do you feel like I don't necessarily want to go in

Speaker: 00:52:42

engineering because there's also a traditional if you talk to people in the

Speaker: 00:52:45

industry, they're like software engineering is so slow is what it be like.

Speaker: 00:52:48

It's a very common thing to hear it as well, because, yes, there are processes and

Speaker: 00:52:52

rituals that make it slower.

Speaker: 00:52:54

So do you feel like you want to avoid becoming software engineering or what's your

Speaker: 00:52:58

take on that?

Speaker: 00:52:58

I'm curious.

Speaker: 00:52:59

Oh, I think.

Speaker: 00:53:01

I think I think this kind of puts everything we've talked about a little bit more.

Speaker: 00:53:06

No, I think it like all comes together here.

Speaker: 00:53:08

I was complaining earlier about New World and the feeling at launch.

Speaker: 00:53:13

Right.

Speaker: 00:53:13

Not being able to scale when clearly there it was Amazon who made the game, they own

Speaker: 00:53:19

AWS.

Speaker: 00:53:20

They have all the skills for this.

Speaker: 00:53:22

What happened?

Speaker: 00:53:25

It didn't make any sense to me, but uh,

Speaker: 00:53:30

they like take everything together.

Speaker: 00:53:31

One of the biggest focuses on this is I would like to do it right.

Speaker: 00:53:35

Yeah.

Speaker: 00:53:35

And and to do it right will require some standardization of processes.

Speaker: 00:53:41

So I guess to answer your question directly, I do think that this is becoming more

Speaker: 00:53:46

process driven than purely vibe coding.

Speaker: 00:53:49

Now, I guess maybe day one it was vibe coding because I didn't really have a

Speaker: 00:53:55

structure to anything yet.

Speaker: 00:53:56

Right.

Speaker: 00:53:57

It was a blank slate for me.

Speaker: 00:53:58

This is the first time I've, I've like, you know, I've written code manually before

Speaker: 00:54:02

in classes and stuff like that, but not two hundred thousand lines of anything.

Speaker: 00:54:06

Oh, yeah.

Speaker: 00:54:07

Oh, yeah.

Speaker: 00:54:07

And one thing I will throw out there is I think, Andrew, we were also talking a

Speaker: 00:54:12

little bit about your profession and since admin is still very there is a lot of

Speaker: 00:54:18

systems that you need to put together.

Speaker: 00:54:19

Right.

Speaker: 00:54:20

Right.

Speaker: 00:54:20

There's a lot of connections.

Speaker: 00:54:21

There's a level of architecture that also you need to consider is like, what is what

Speaker: 00:54:25

are these things actually do and understand them enough enough depth that you can

Speaker: 00:54:28

put them together.

Speaker: 00:54:29

And I feel like, you know, like we talked about the perspective that you're bringing

Speaker: 00:54:34

here, we also talked about this admins in particular, maybe being a really good

Speaker: 00:54:39

audience for this kind of tooling because, you know, software engineering

Speaker: 00:54:45

can be is a very wide term and a software engineer does a lot of things.

Speaker: 00:54:48

You can have like specialized roles.

Speaker: 00:54:51

You can also have generalists.

Speaker: 00:54:52

I feel like this admins, you have to be a generalist up to a large extent because

Speaker: 00:54:55

you're working directly with people and your scope is always huge.

Speaker: 00:54:59

So a lot of what we can do.

Speaker: 00:55:05

Right.

Speaker: 00:55:05

We do try to automate as much as possible.

Speaker: 00:55:08

Right.

Speaker: 00:55:09

That's how we have many more hands.

Speaker: 00:55:11

Solve it once.

Speaker: 00:55:12

Yeah.

Speaker: 00:55:12

Get it fixed one time.

Speaker: 00:55:13

Yep.

Speaker: 00:55:15

We do thankfully have other teams that can take different things like help desk and

Speaker: 00:55:20

stuff like that.

Speaker: 00:55:21

So we're not necessarily doing like all of the front line stuff all the time.

Speaker: 00:55:25

But when it comes to like, like campus infrastructure, networking issues, look at

Speaker: 00:55:31

the websites down or, you know, like, like something like that, that would

Speaker: 00:55:35

definitely fall into take for granted.

Speaker: 00:55:37

Yeah.

Speaker: 00:55:37

And into our wheelhouse.

Speaker: 00:55:39

I guess I think I got away from your question, though.

Speaker: 00:55:42

Could you could you repeat it?

Speaker: 00:55:43

The main aspect being a risk perspective is I do feel like you're building

Speaker: 00:55:49

something in your personal time that you also realize there is only so much time, so

Speaker: 00:55:54

much token budget that you have and a combination of that.

Speaker: 00:55:58

And also wanting to deliver something for you during your personal like for your

Speaker: 00:56:02

personal project.

Speaker: 00:56:03

And do you feel like at this point is risk of token

Speaker: 00:56:09

expenditure, maybe something that drives your decision towards going more into like

Speaker: 00:56:14

a process oriented approach?

Speaker: 00:56:15

Absolutely.

Speaker: 00:56:16

Or do you think it's maybe also a combination of like the experience you've had of

Speaker: 00:56:19

like orchestrating these systems for your professional life?

Speaker: 00:56:22

Or maybe it's a combination of both.

Speaker: 00:56:26

That's interesting.

Speaker: 00:56:27

I think specifically for Minecraft, it's more informed by my previous experiences

Speaker: 00:56:32

running LuxWander.

Speaker: 00:56:33

Ah, yeah.

Speaker: 00:56:34

OK.

Speaker: 00:56:34

Actually.

Speaker: 00:56:35

And actually, I just realized that as a keyword.

Speaker: 00:56:37

I just dropped right in there.

Speaker: 00:56:39

We're going to have to bleep that out later.

Speaker: 00:56:41

No, no, no, it's OK, it's OK.

Speaker: 00:56:43

I think it's more inspired by that.

Speaker: 00:56:46

There's a lot of ways where I could maybe like tweak it to parallel to some things

Speaker: 00:56:52

that work, but I think it's more so a lot of the development and what drives a lot

Speaker: 00:56:57

of the development.

Speaker: 00:56:57

And I think it's a lot of the decisions, it goes comes directly from experience that

Speaker: 00:57:00

I had running LuxWander before, like, and I think like I

Speaker: 00:57:05

guess is like a little fun story when when LuxWander first released in 2010,

Speaker: 00:57:11

it was, you know, Minecraft pre-alpha.

Speaker: 00:57:14

That was, I don't like to say it, 16 years ago.

Speaker: 00:57:18

Yeah, it was 16 years ago.

Speaker: 00:57:20

Oh, yeah.

Speaker: 00:57:22

And don't remind me.

Speaker: 00:57:25

And

Speaker: 00:57:27

you know, I released like some advertisements online, like on like Minecraftforms

Speaker: 00:57:33

.net, right?

Speaker: 00:57:34

Like I had like a thread, you know, and there was rudimentary,

Speaker: 00:57:39

you know, multiplayer Minecraft.

Speaker: 00:57:41

It crashed all the time.

Speaker: 00:57:43

You know, parts of the map corrupted all the time.

Speaker: 00:57:46

Updates were coming every day, you know.

Speaker: 00:57:49

So it almost feels like your original, that predated all of your professional

Speaker: 00:57:54

experience.

Speaker: 00:57:54

So yes, yes.

Speaker: 00:57:55

But your love for actually building this community and this actual like version of

Speaker: 00:58:01

Minecraft that everybody could enjoy in the way that you wanted it to actually was a

Speaker: 00:58:05

more of a driver, essentially, even today continues to be more of a driver to

Speaker: 00:58:10

building something really awesome, not to say that your professional experience

Speaker: 00:58:14

doesn't help a little bit here and there, but maybe it's a combination of like

Speaker: 00:58:19

wanting to build something and having the, you know, like a wish to build something,

Speaker: 00:58:22

but also like a little bit of the learnings that you've had over time, you know,

Speaker: 00:58:26

professionally and personally in your previous experience.

Speaker: 00:58:30

My reasoning for this is like some of the move from a project driven

Speaker: 00:58:36

approach of let's just like vibe coded and hope it works and hope it works like we

Speaker: 00:58:40

can deploy it versus a, oh, I actually have built something that people have used

Speaker: 00:58:45

and I want to build something again that people have will use and will really love

Speaker: 00:58:51

is a very strong driver for saying that I'm not just playing around.

Speaker: 00:58:55

I'm building something from like a place of wanting it to be successful.

Speaker: 00:59:01

And that is maybe also part of like and the risk of like my risk is I actually want

Speaker: 00:59:07

to build something as bad either, and that is like a personal feeling about it as

Speaker: 00:59:11

well.

Speaker: 00:59:11

But it's driving you to now make decisions that are resulting in like more definable

Speaker: 00:59:17

processes and actually improving the quality as well as reducing the cost so you can

Speaker: 00:59:22

actually finish and get it out the door.

Speaker: 00:59:25

And I mean, I said a lot of things.

Speaker: 00:59:26

Let me let me maybe finish it up and ask a question is, do you feel like

Speaker: 00:59:31

these tools essentially have really actually enabled you or do you feel like

Speaker: 00:59:37

these tools are just giving you more of a mirage of like getting the AI tools,

Speaker: 00:59:42

Claude in particular so far?

Speaker: 00:59:44

Was it so far?

Speaker: 00:59:44

So that is a really interesting question because I cannot say definitively

Speaker: 00:59:50

yet until I start testing it right now.

Speaker: 00:59:53

So.

Speaker: 00:59:54

So so I think ask me in a few weeks again.

Speaker: 00:59:56

Yeah, because because right now it's like undefined.

Speaker: 01:00:01

I don't have any proof yet besides the development.

Speaker: 01:00:05

And so I would like to answer that confidently.

Speaker: 01:00:08

But I guess I can answer the kind of like the idea before that.

Speaker: 01:00:13

I do think it's quite empowered me to actually create something that I've always

Speaker: 01:00:18

wanted to do.

Speaker: 01:00:20

Right.

Speaker: 01:00:20

There is there is a lot of things, a lot of like a big wish list of stuff that I had

Speaker: 01:00:24

that last time I ran Lux Wanderer,

Speaker: 01:00:27

but I just didn't really have the manpower.

Speaker: 01:00:30

I didn't have like the skills, you know, that's funny.

Speaker: 01:00:34

Yeah.

Speaker: 01:00:35

You still don't know.

Speaker: 01:00:37

You will.

Speaker: 01:00:38

You will.

Speaker: 01:00:39

You will.

Speaker: 01:00:39

Skills creator.

Speaker: 01:00:41

I will soon.

Speaker: 01:00:44

And

Speaker: 01:00:46

it has closed the gap, though, that I think like you said, it's democratizing,

Speaker: 01:00:52

you know, I guess, intelligence.

Speaker: 01:00:55

Yeah.

Speaker: 01:00:55

Yeah.

Speaker: 01:00:56

And then, you know, it might all just be an act.

Speaker: 01:00:59

And I think that is something that we will have to see if it's if it is a

Speaker: 01:01:05

act of intelligence.

Speaker: 01:01:08

That's all right.

Speaker: 01:01:09

And at that point, I think we got to end the show.

Speaker: 01:01:11

Thank you.

Speaker: 01:01:12

Thank you.

Speaker: 01:01:14

We'll have to come back and see if it is.

Speaker: 01:01:17

Yes.

Speaker: 01:01:17

Yes.

Speaker: 01:01:18

Cut the cut.

Episode 2

30th Jun 2026

AI Coding Is Solved. Software Engineering Isn’t.

Transcript

Listen for free

About the Podcast

About your hosts

Ajay Medury

Andrew Sierota