Threestup | A plug-and-play engineering team

Over time, we've noticed our relationship with AI shifting. It has become more of a collaborator than a tool; a rubber-ducking partner who is efficient, occasionally brilliant, at times wrong and disoriented, but always available nonetheless.

Working in fintech, where the margin for error is thin and expectations around security and quality are high, both externally and within our own team, that shift didn't happen lightly. But it happened steadily; a measured adoption that quickly became second nature.

Recently, I've been asking myself a few questions about this:

Where should AI fit into our workflow?
When does it help, and when does it hurt?
How do we balance velocity with responsibility?

This article is a reflection on those questions, not from the outside looking in, but from within the practical, high-stakes world of shipping production-ready software in fintech. It's about what AI can do, what it can't, and why it now demands the same level of understanding, intent, and discipline as any other core engineering skill.

Anything you can do, AI can do better?

By now, I imagine most software engineers are familiar with AI-powered IDEs and plugins such as GitHub Copilot and Cursor, tools built on large language models (LLMs). With time, you learn what they're good at, what they're not, and how to work around their limitations.

Where it works best

Used intentionally, AI gives you space to stay in flow and maintain momentum, allowing your mental energy to be spent on higher-level thinking rather than low-level grunt work.

It shines on routine or repetitive tasks: generating boilerplate, simplifying syntax, handling refactors, and giving you a structured head start. It's especially helpful during early conceptual phases, where you're shaping the structure of a solution or exploring possible directions. Sometimes you just need to get something into a client's hands so they can picture the idea.

It can also be a helpful second set of eyes, reviewing, reformatting, or suggesting improvements to code you've already written.

AI is at its best when it's assisting, not leading. Here are a few examples:

Boilerplate and repetitive tasks. AI is great for generating predictable code patterns, things like mapping functions, formatting data, setting up basic class or function structures, or converting one format to another.
Rapid prototyping and exploration. When you're sketching ideas, it helps a lot to generate rough (often very smooth!) solutions that you can quickly iterate on. It's particularly useful when you want to validate a direction without committing too much time upfront.
Learning and navigating unfamiliar APIs. Provides example-based starting points that help you get oriented faster. Treat them as a complement to, not a replacement for, the official docs, particularly as AI often presents outdated references.
Code cleanup and structure. Handles mechanical refactors well, renaming, extracting logic, or reorganising code. It lets you build quickly near existing code and tidy it up later with minimal overhead.
Internal documentation and summaries. In my experience, it can get a bit carried away with this particular task (far more detailed than it needs to be) but provides an excellent, well-structured starting point.
Self-review and static analysis. Surfaces potential issues, suggests improvements, and helps catch things you may have overlooked. A helpful companion for reviewing your own work before a formal PR.
Rabbit holes. A way to follow curiosity and quickly assess whether a path is worth pursuing, or dropping.

Approach with caution

The issue isn't only that AI can sometimes miss the mark. It's that you become unreliable when you rely on it without knowing what you're doing. Delegating tasks you don't understand to AI doesn't remove complexity; it buries it beneath a false sense of progress, both in the project and in your own development.

When you use AI to write code in areas you haven't fully grasped, you're not accelerating, you're guessing faster (or sometimes slower). And that guesswork compounds quickly in team environments where quality, security, and maintainability are paramount.

It's not about whether AI is trustworthy, it's about whether you are, once you've put it in charge of decisions you didn't validate.

Some examples where AI should not be the primary driver:

Major upgrades. Can leave you struggling to get AI to resolve build issues or dependency conflicts. What starts as a shortcut quickly turns into hours of troubleshooting.
Adding major building blocks across a codebase. Requires deep understanding of app structure and data flow. AI lacks that context.
Bootstrapping new projects. AI may suggest outdated setups and miss critical decisions. More importantly, it's a lost opportunity to go through the docs, understand the tooling, and configure the project in a way that fits your needs.
Solution and architecture design. AI can suggest ideas, but it can't make informed architectural decisions.
Autocompletion (context-aware suggestions). Not the most suitable place to call it out, but I think autocompletion is a feature best used with caution. It's helpful for rapid prototyping, but I found it disrupted focus and impeded learning. Turned it off within a day and haven't looked back.

Rule of thumb

AI is fast becoming a staple in modern software engineering, but using it well is a skill in itself, not a switch you flip, and that skill takes practice.

At Threestup, we've developed a few principles that guide how we work with AI. They help us move faster without losing clarity, use AI without outsourcing responsibility, and ensure that even when AI assists, we're still the ones making the calls:

Treat AI like an engineer. Sometimes junior, sometimes mid-level, occasionally brilliant and at times shockingly wrong. It can help a lot, but it still needs supervision.
Use AI for tasks you understand. Or could do yourself with enough time. AI should accelerate your thinking, not replace it. If you couldn't explain the solution without it, you shouldn't rely on it to build it for you.
Avoid using AI for high-risk or high-impact changes. Upgrades, auth flows, and architectural changes need human judgment.
Always validate AI's suggestions. Don't assume correctness. Read, test, and verify.
Use official documentation as your source of truth. AI helps you discover, but docs confirm the details.
Prioritise learning over outsourcing. Lean on AI to accelerate understanding, not to skip it.
Use AI to remove friction, not responsibility. You're still accountable for the code that ships, regardless of who (or what) wrote it.
Know when to stop and do it yourself. If AI starts generating confusion, it's time to take the keyboard back.
Be able to explain every line of AI-generated code. If you can't explain it, you shouldn't merge it.
Don't feed it sensitive code. Proprietary code should never be passed into external AI without approval.
Never expose PII or customer data. Real data should not be used in prompts, examples, or test cases passed to AI tools.

Real-world examples

I want to go through some of our real-world examples where AI has exceeded our expectations and, at times, missed the mark. These experiences serve as a reminder that how we frame a task matters just as much as what we're asking it to do.

From Pixels to Code

We started with a Figma page showing design tokens across five themes, each with different colour values, but no token exports. With Figma extensions off the table, we created a basic Flutter theme extension with a single token, then asked AI to replicate the structure across all tokens using a screenshot of the Figma page as input.

Despite having no access to raw data, it inferred a usable structure and generated a working, extensible implementation, supporting all five themes, based entirely on that visual reference. It was a striking example of creative problem-solving, and how AI can translate visual cues into structured code. It showcased AI not as a mere assistant, but as a catalyst for creative engineering.

In our first attempt, we gave AI a rough goal. The result was close, but not quite right. After refining the approach and providing clearer, structured instructions, it produced almost exactly what we had in mind.

The key takeaway here is that the quality of AI's output is tightly linked to how clearly you articulate the task. Using AI in the above way becomes an act of creativity. It's not about offloading work blindly, but about partnering with it, curious to see how far we can push its capabilities.

This kind of experimentation is part of our culture, especially when working within project constraints, and it helped unlock a new layer of engineering efficiency.
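
To make the seed structure in this example more concrete, here is a minimal plain-Dart sketch of "one hand-written token, defined per theme". It is a stand-in rather than our actual code: a real implementation would extend Flutter's ThemeExtension, and the theme names, the token name (surfacePrimary) and the colour values are all placeholders chosen for illustration.

```dart
// Plain-Dart stand-in so the sketch runs without Flutter.
// Theme names, token name and colour values are hypothetical placeholders.

enum AppTheme { light, dark } // the real project had five themes

class ColorTokens {
  const ColorTokens({required this.surfacePrimary});

  /// The single seed token written by hand (ARGB integer).
  final int surfacePrimary;

  /// Mirrors the copyWith convention used by Flutter theme extensions.
  ColorTokens copyWith({int? surfacePrimary}) =>
      ColorTokens(surfacePrimary: surfacePrimary ?? this.surfacePrimary);
}

/// One entry per theme; further tokens become further fields above.
const tokensByTheme = <AppTheme, ColorTokens>{
  AppTheme.light: ColorTokens(surfacePrimary: 0xFFFFFFFF),
  AppTheme.dark: ColorTokens(surfacePrimary: 0xFF000000),
};

ColorTokens tokensFor(AppTheme theme) => tokensByTheme[theme]!;

void main() {
  // Prints the dark theme's placeholder value in hex: ff000000
  print(tokensFor(AppTheme.dark).surfacePrimary.toRadixString(16));
}
```

With a shape like this in place, replicating it across the remaining tokens and themes is mostly mechanical, which is the part that was handed to AI.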

The Linguist

This is a good one that I'm sure most can relate to. Localisation had taken a backseat in the project, no surprise given the early pressure to ship features. Time hadn't been allocated to setting up even a basic abstraction layer for strings, so we found ourselves needing to implement one retroactively. That meant tracking down every string literal and template string across the app, a tedious task that could take hours, or even a full day, depending on the size of the codebase.

This is a great example where AI can shine, turning a time-consuming task into something that takes minutes instead of hours, with just some manual adjustment and review. The engineer's job shifts from extraction to validation, a much better use of time and focus.
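
For a sense of what the extraction pass involves, here is a deliberately naive Dart sketch that pulls quoted literals out of a source string and proposes a camelCase key for each. It only illustrates the mechanical nature of the task and is not the approach we used. The helper names (extractStrings, keyFor) are hypothetical, and the regular expression ignores escapes, interpolation, raw strings and multi-line strings, which is exactly where the manual review comes in.

```dart
// Naive sketch: hypothetical helpers, not a production extractor.

/// Matches a single-quoted or double-quoted literal on one line.
/// Ignores escapes, interpolation, raw and multi-line strings.
final literal = RegExp(r"'([^']*)'" '|' r'"([^"]*)"');

/// Proposes a camelCase key from the literal's words.
String keyFor(String text) {
  final words = text
      .toLowerCase()
      .split(RegExp(r'[^a-z0-9]+'))
      .where((w) => w.isNotEmpty)
      .toList();
  if (words.isEmpty) return 'empty';
  return words.first +
      words.skip(1).map((w) => w[0].toUpperCase() + w.substring(1)).join();
}

/// Returns proposed key -> original text. Later duplicates overwrite
/// earlier ones, so a reviewer still has to check for collisions.
Map<String, String> extractStrings(String source) {
  final found = <String, String>{};
  for (final match in literal.allMatches(source)) {
    final text = match.group(1) ?? match.group(2) ?? '';
    if (text.trim().isEmpty) continue;
    found[keyFor(text)] = text;
  }
  return found;
}

void main() {
  const source = "Text('Sign in'); Text(\"Forgot password?\");";
  // Prints: {signIn: Sign in, forgotPassword: Forgot password?}
  print(extractStrings(source));
}
```

Every literal this finds still needs a human decision: is it user-facing, is the key sensible, and does it need placeholders or plural forms?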

Separation Anxiety

We've also found that, at times, AI can produce complete nonsense. We tucked this snippet away a couple of months ago.

The context was a simple widget that fetched data and displayed it in a table. Up to that point, I hadn't used AI for tasks where I suspected it might misinterpret the intent, especially quick jobs I could easily do myself. But out of curiosity, I gave it a shot.

The task was straightforward: insert a separator between each row in the table. So it needed to be aware of how many rows existed and where the separators belonged. This was the result:

itemCount: data.length * 2 - 1, // Double the count to include separators
itemBuilder: (context, index) {
  // If index is even, show a research insight item
  if (index.isEven) {
    final itemIndex = index ~/ 2;
    final item = data[itemIndex];
    return Cell(data: item);
  } else {
    // If index is odd, show a separator
    return Separator();
  }
}

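This approach is flawed for a couple of reasons. First, it inflates the item count so that separators occupy slots in the same index space as the data, which muddies separation of concerns. The underlying list isn't modified, but the builder has to translate every index (index ~/ 2) just to reach an item, so layout details leak into data access. Second, it is fragile. The * 2 - 1 arithmetic does keep a separator from appearing after the final item, but it evaluates to -1 when the list is empty. If the snippet lives inside a ListView.builder, which the excerpt suggests but doesn't show, Flutter's ListView.separated already handles separators between items without any index arithmetic. Five stars for effort, one star for execution.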

Granted, I didn't specify that the separator should appear only between rows, not before the first or after the last, and perhaps the AI lacked enough context. But ultimately, it's our responsibility to recognise when AI is suitable, when it needs guidance, and where its limitations begin.
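
For anyone who wants to check the interleaving arithmetic from the snippet, here is a small pure-Dart sketch with an added guard for the empty list. The helper names (slotCount, slotAt, layout) are hypothetical, and strings stand in for the Cell and Separator widgets so it runs without Flutter. In a real list view, Flutter's ListView.separated constructor with its separatorBuilder parameter would likely be the simpler choice, assuming the widget in question is a list view.

```dart
// Pure-Dart sketch; strings stand in for the Cell and Separator widgets.

/// Slots needed for [itemCount] rows with a separator between each pair
/// of rows, and none before the first or after the last.
int slotCount(int itemCount) => itemCount <= 0 ? 0 : itemCount * 2 - 1;

/// What the builder would render at [index]: even slots hold data,
/// odd slots hold a separator.
String slotAt(List<String> data, int index) =>
    index.isEven ? data[index ~/ 2] : '---';

/// Materialises the full layout so it can be inspected.
List<String> layout(List<String> data) =>
    List.generate(slotCount(data.length), (i) => slotAt(data, i));

void main() {
  print(layout(['a', 'b', 'c'])); // [a, ---, b, ---, c]
  print(layout(<String>[])); // []
}
```

Walking through a three-item list by hand like this is the kind of validation the rules of thumb above ask for before accepting or rejecting a suggestion.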

Shaping the process

I've yet to see this fully embedded in a typical SDLC process, but I believe we're heading in that direction. AI-assisted development naturally improves velocity, and often leads to a sharper focus on quality, not because the AI replaces thinking, but because it frees engineers to do more of it. This section doesn't introduce anything radically new, it's more a reflection on a mindset shift that still needs to take hold across product teams.

This shift includes rethinking how we plan and estimate. AI doesn't make estimation easier, it makes it different. Some tasks will now take minutes instead of hours, while others will still require deep thought, validation, or manual intervention. AI doesn't eliminate complexity, it shifts it around.

That's why estimations need to evolve. It's not enough to just say 'AI will help', we need to understand where it's likely to be effective, and where it may introduce friction, and factor it in. With continued use, these boundaries become clearer and easier to anticipate.

When factoring AI into your estimates:

Identify where AI will create momentum. Call out the parts of the task AI is likely to accelerate.
Highlight areas that require engineering judgment. Such as architectural decisions, nuanced logic, or edge case handling.
Allocate time for review and validation. Verifying AI-generated output still takes focus and care.
Adjust as you go. If AI-generated work turns out less than ideal or more complex than expected, be prepared to refine your estimates during implementation.

Product teams should start treating AI as part of the process, not a wildcard. If you're estimating a feature, consider how much time you could be saving, and where you'll likely spend it instead.

However

We don't believe AI should be applied by default. It's not suited to every project, team, or environment, particularly where in-depth learning, research, or first-principles thinking is essential. The value comes from knowing when it makes sense, not forcing it in. Unless there's a clear reason and direction, it's often better left out at the start (or entirely).

AI as a skill: the modern engineer

For us, using AI in software engineering is no longer a novelty, it's a skill. Just like testing, debugging, or making sound architectural choices, knowing how and when to use AI is part of being a modern engineer.

But we've noticed something unfortunate: even great engineers often hesitate to mention when AI played a role in their work. Whether in retrospectives, demos, or interviews, it's sometimes left unsaid, perhaps out of fear of being misjudged.

Which is a shame. We know these engineers are sharp, capable, and creative. They're making good decisions to increase productivity, reduce friction, and move faster, exactly the kind of decisions we want them to make. Using AI well isn't something to hide, it's something to recognise and praise.

That's why we believe AI should show up in CVs: not just what you've built, but how you work. In interviews, we let candidates decide whether to use AI, in a way that reflects real-world conditions. If they do, we're watching for how they use it: Do they direct it well? Can they course-correct when it's wrong? Can they explain the outcome?

Because here's the truth: we don't want candidates who'll spend hours on mundane tasks, it's not a good use of engineering time. We want engineers who know how to protect their mental energy, the ones who'll hand the routine, repetitive work to AI, so they can focus on what actually matters: solution design, architecture, and thinking clearly about hard problems.

Using AI well isn't (and shouldn't be) a shortcut, it's a sign of engineering maturity. And in the hiring process, that's the kind of signal we're always looking for.

The real question for us isn't 'do you use AI?', it's 'do you use it well?'

The brush, not the artist

A lesser-known version of the Mona Lisa hangs in the Prado Museum in Madrid. For centuries, it was dismissed as a copy, until researchers discovered it was painted in parallel with Leonardo da Vinci's original, likely by one of his students in the same studio. The composition, technique, and timing were so closely matched that for a time, some questioned whether da Vinci had painted both.

The student was learning, and da Vinci remained the artist. The vision, decisions, and mastery behind the work were uniquely his. This is how we view AI in engineering today: a powerful tool that can assist, accelerate, and emulate, but the creative intent still comes from us. The artistry lies in how we guide it, what we choose to build, and the problems we decide to solve.

The artistic thread running through this article wasn't intentional. It sort of wove itself into the narrative as I recorded voice notes and scattered thoughts while painting my house. There's something about repetitive physical work that, for me, creates space for clearer thinking. I'm very happy with the results, both the article and the paint job.