Last week we announced the launch of an Agentic AI development team, operating with human oversight and taking over dev, test, CI/CD and UX responsibilities. In just three days, the impact on our workflow has been impossible to ignore.
We pored over the data from GitHub – commits, pull requests, issue trackers, and comments – to analyse exactly what changed once our AI team got to work. The results were eye-opening.
Technically, we use a combination of OpenAI Codex (released this weekend) and the OpenAI Codex CLI (released last week), running in GitHub Codespaces. Whilst the original plan was to orchestrate calls to the OpenAI API ourselves, these off-the-shelf tools significantly shortened our time to market.
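For context, here is a minimal sketch of what that orchestrated-API approach might have looked like; the model name, prompts and surrounding plumbing are illustrative assumptions rather than our production setup:

```python
# Illustrative only: a bare-bones "agent" step calling the OpenAI API directly,
# roughly the plumbing we would have had to build before adopting Codex and the Codex CLI.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def propose_patch(task_description: str, file_contents: str) -> str:
    """Ask the model to draft a code change for a single, well-defined task."""
    response = client.chat.completions.create(
        model="gpt-4.1",  # assumed model name, for illustration only
        messages=[
            {"role": "system", "content": "You are a senior developer. Reply with a unified diff only."},
            {"role": "user", "content": f"Task: {task_description}\n\nCurrent file:\n{file_contents}"},
        ],
    )
    return response.choices[0].message.content


# Everything around this call – branching, applying the diff, running tests,
# opening a pull request – is exactly what the off-the-shelf tooling handles for us.
```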
This no-nonsense report breaks down how Agentic AI affected our development speed, what improved, what didn’t, and how it felt to work with an artificial collaborator. (Spoiler: things moved faster, but there’s more to the story.)
Development by the Numbers: A Surge in Commit Activity
The most obvious change was sheer volume. In the past three days, our repository’s commit count roughly doubled compared to the whole of the week before. Code changes that would normally trickle in at a moderate pace came in a flood: 42 commits in the last three days, versus just 20 in the previous week – more than double the total in under half the time, and close to a fivefold increase in daily commit rate. This jump in output aligns with broader trends seen in the industry, where AI coding assistants have been shown to boost individual developer productivity by over 50%.
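For transparency, the commit counts came straight from the repository history; here is a minimal sketch of the kind of query involved, using the GitHub REST API (the repository name, token handling and date windows are placeholders, not our real values):

```python
# Minimal sketch: count commits in two date windows via the GitHub REST API.
import os

import requests

REPO = "example-org/example-repo"  # placeholder repository
HEADERS = {
    "Accept": "application/vnd.github+json",
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
}


def count_commits(since: str, until: str) -> int:
    """Return the number of commits on the default branch between two ISO-8601 timestamps."""
    count, page = 0, 1
    while True:
        resp = requests.get(
            f"https://api.github.com/repos/{REPO}/commits",
            headers=HEADERS,
            params={"since": since, "until": until, "per_page": 100, "page": page},
        )
        resp.raise_for_status()
        batch = resp.json()
        count += len(batch)
        if len(batch) < 100:
            return count
        page += 1


baseline = count_commits("2025-05-05T00:00:00Z", "2025-05-11T23:59:59Z")  # placeholder: previous week
with_ai = count_commits("2025-05-16T00:00:00Z", "2025-05-18T23:59:59Z")   # placeholder: last three days
print(f"{baseline} -> {with_ai} commits ({(with_ai / baseline - 1) * 100:.0f}% increase)")
```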
In our case, every single day of the week saw more code pushed than usual.
Daily commit counts for each day of the week, comparing the baseline week (grey) to the week with Agentic AI (orange).
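The chart itself is easy to reproduce; a minimal matplotlib sketch along these lines, where the per-day counts would come from the commit query above (the numbers below are illustrative placeholders, with only the Thursday and weekend values matching figures quoted in this post):

```python
# Minimal sketch of the grouped bar chart: baseline week vs. week with Agentic AI.
# Per-day counts are illustrative placeholders, not our exact data.
import matplotlib.pyplot as plt
import numpy as np

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
baseline = [4, 3, 4, 3, 4, 1, 1]  # placeholders (Thu and Sat match the text)
with_ai = [7, 6, 7, 8, 7, 3, 4]   # placeholders (Thu, Sat and Sun match the text)

x = np.arange(len(days))
width = 0.4

plt.bar(x - width / 2, baseline, width, label="Baseline week", color="grey")
plt.bar(x + width / 2, with_ai, width, label="With Agentic AI", color="orange")
plt.xticks(x, days)
plt.ylabel("Commits per day")
plt.legend()
plt.tight_layout()
plt.show()
```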
Every day showed a notable increase in commits, with weekday output roughly doubling and even the traditionally quiet weekends seeing more activity. As the chart above illustrates, Agentic AI’s influence was felt from Monday to Sunday. On weekdays, the AI helped sustain a higher tempo – for example, Thursday went from 3 commits (baseline) to 8 commits with AI, as it tackled tasks in parallel with human developers. Perhaps more striking, however, was the weekend.
This project usually slows down on Saturdays (only 1 commit in the previous Saturday), but with Agentic AI working in the background, we still saw 3 commits on Saturday and 4 on Sunday. In effect, we now have an active contributor around the clock.
The development pace is no longer strictly bounded by human working hours or energy – a new code change might land at 3 A.M. on a Sunday, which was virtually unheard of before. It’s not just raw commit counts either.
The frequency of updates increased: commits were being pushed more regularly throughout each day instead of in one or two clumps. This suggests Agentic AI was breaking work into smaller, frequent updates. Code quality aside (we’ll get to that later), such incremental progress kept momentum high and continuously visible.
We'd start each day and find fresh commits already waiting in the repository, courtesy of our AI developers.
Pull Requests and Issues: Faster Merges, Quicker Fixes
Commit volume is one thing, but what about the bigger picture of features and fixes?
Here, too, the AI made a mark. We saw a jump in pull request activity – 5 PRs were opened and merged in the last three days, compared to only 2 the week prior. More importantly, the turnaround on these PRs was faster. In one case, an agent opened a pull request to implement a new analytics dashboard in the morning and had it merged by end of day.
This kind of same-day delivery for a significant feature was rare before. Typically, a feature PR might span a couple of days of development and review. Now we’re seeing major feature branches coming together in hours.
The AI doesn’t procrastinate or get tired, and it can crunch through boilerplate and tests at speed, enabling human reviewers to focus on the important parts of the code.
Issue tracking tells a similar story. We closed 30 issues this week (up from 1 last week) and many of them were resolved far more quickly than our historical norm.
One example: a bug report about an authentication failure was filed on Wednesday morning; Agentic AI had drafted a potential fix by early afternoon. By Wednesday evening, after a brief human code review, the patch was merged and the issue marked resolved.
Bug turnaround time shrank from days to mere hours.
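Those turnaround figures come straight from the issue timestamps; here is a minimal sketch of the calculation, using the same placeholder repository and token handling as above:

```python
# Minimal sketch: average time-to-close for recently closed issues via the GitHub REST API.
# Repository name, token handling and the date filter are placeholders.
from datetime import datetime
import os

import requests

REPO = "example-org/example-repo"  # placeholder repository
HEADERS = {
    "Accept": "application/vnd.github+json",
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
}

resp = requests.get(
    f"https://api.github.com/repos/{REPO}/issues",
    headers=HEADERS,
    params={"state": "closed", "since": "2025-05-16T00:00:00Z", "per_page": 100},
)
resp.raise_for_status()

hours_to_close = []
for issue in resp.json():
    if "pull_request" in issue:  # the issues endpoint also returns PRs; skip them
        continue
    opened = datetime.fromisoformat(issue["created_at"].replace("Z", "+00:00"))
    closed = datetime.fromisoformat(issue["closed_at"].replace("Z", "+00:00"))
    hours_to_close.append((closed - opened).total_seconds() / 3600)

if hours_to_close:
    average = sum(hours_to_close) / len(hours_to_close)
    print(f"{len(hours_to_close)} issues closed, average time to close: {average:.1f} hours")
```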
This rapid response is a direct quality-of-life improvement for the team and our users – critical problems get fixed before they can linger or disrupt progress. It’s the kind of agility you hope for in a fast-moving startup, now made possible even with a small team, thanks to AI support.
To be clear, not every issue was a quick win. Some complex problems still required careful thinking and didn’t benefit as much from AI’s brute-force speed. But for a lot of routine bugs and small feature requests, having Agentic AI suggest solutions or even automatically open a pull request with a fix made a huge difference. It’s as if we gained an extra set of hands that proactively picks up the to-do list and starts knocking items off it.
How Our Workflow Changed with an AI in the Loop
The day-to-day development workflow evolved in subtle but important ways.
First, code review and discussion became even more central. With Agentic AI generating code, we found ourselves spending more time reviewing and discussing changes, and a bit less time writing trivial code from scratch.
In fact, the volume of comments on pull requests and commit threads roughly tripled – about 30 comments this week vs. 10 last week, indicating very active discussions.
Far from replacing human collaboration, the AI actually sparked more dialogue. We had candid conversations about why the AI chose a certain implementation, or whether a generated function was optimal, which led to clarifications of requirements and sometimes deeper design discussions.
Rather than just coding in isolation, the team was collectively steering the AI’s contributions in the right direction.
Second, we had to adjust our task planning. We learned to feed Agentic AI well-defined problems in parallel with our own work. For example, while a human developer focused on refining the UI, we let the AI handle writing unit tests for a module. In another case, we assigned the AI to draft an initial version of a complex algorithm while engineers concentrated on integration and edge cases.
This parallelism was a big win – development felt more async and multi-threaded. However, it required trust and verification. We quickly instituted a rule: no AI commit goes to production without a human code review. That kept us confident in quality while still enjoying the AI’s speed.
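One way to make a rule like that hard to bypass is to encode it in branch protection rather than rely on discipline alone; a minimal sketch of the kind of call involved (placeholder repository, and the exact settings shown are illustrative):

```python
# Minimal sketch: require at least one human approval before anything merges to main.
# Placeholder repository; the specific protection settings are illustrative.
import os

import requests

REPO = "example-org/example-repo"  # placeholder repository
HEADERS = {
    "Accept": "application/vnd.github+json",
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
}

protection = {
    "required_status_checks": None,            # leave CI status checks unchanged here
    "enforce_admins": True,                    # no bypassing the rule, even for admins
    "required_pull_request_reviews": {
        "required_approving_review_count": 1,  # at least one human approval
    },
    "restrictions": None,                      # no push restrictions
}

resp = requests.put(
    f"https://api.github.com/repos/{REPO}/branches/main/protection",
    headers=HEADERS,
    json=protection,
)
resp.raise_for_status()
print("Branch protection updated:", resp.status_code)
```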
We also noticed a clear pattern in the commits Agentic AI produced. The AI is very good at churning out boilerplate and repetitive code, so many of its commits were things like adding missing documentation blocks, refactoring code for style consistency, or expanding test coverage.
These are tasks that, honestly, humans often put off.
Now they’re being handled continuously. Our codebase is already cleaner and better documented after a week, simply because the AI diligently took care of those housekeeping tasks.
The humans, on the other hand, could focus on the harder parts – figuring out product requirements, solving tricky architecture problems, or tuning the user experience – essentially the creative and analytical work that AI isn’t as good at.
The Pace vs. Quality Trade-off: An Honest Assessment
So, did Agentic AI make everything better?
It definitely made us faster – but “faster” is only good if the quality holds up.
Here’s the candid truth after one week: the code produced with AI’s help is generally solid, but not perfect. We did encounter a few hiccups. In the rush of rapid commits, a couple of new bugs slipped through.
For instance, one automated refactor by the AI introduced a subtle regression in the date parsing logic of our app. It wasn’t caught until a day later when a user reported something odd. We had to roll back part of that change and re-approach it.
This was a reminder that speed can amplify mistakes – if an AI is going to move fast, our safety nets (tests, reviews, monitoring) need to keep pace too.
Fortunately, our test suite caught some issues early, and we added more tests (often written by the AI itself) as a backstop.
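Here is a sketch of the kind of backstop test we mean; the parse_event_date helper and module path are hypothetical stand-ins for the real date-parsing code that regressed:

```python
# Hypothetical regression test for the date-parsing bug described above.
# parse_event_date and app.dates are illustrative stand-ins, not our real code.
from datetime import date

import pytest

from app.dates import parse_event_date  # hypothetical module path


@pytest.mark.parametrize(
    "raw, expected",
    [
        ("2025-05-18", date(2025, 5, 18)),   # ISO format
        ("18/05/2025", date(2025, 5, 18)),   # day-first format
        ("18 May 2025", date(2025, 5, 18)),  # long-hand format
    ],
)
def test_parse_event_date_handles_known_formats(raw, expected):
    assert parse_event_date(raw) == expected


def test_parse_event_date_rejects_garbage():
    with pytest.raises(ValueError):
        parse_event_date("not a date")
```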
We learned that AI-generated code still needs the same scrutiny as human code. If anything, we’re now doing code review with an even finer-toothed comb.
On the bright side, reviewing AI code often means tweaking or improving it rather than starting from zero – a net time saver.
But it’s not a free lunch; you must budget time for oversight. Commit message quality was another minor issue – the AI’s messages tended to be very utilitarian (“Update dependencies”, “Refactor function X”) and lacked the context or rationale a developer would normally include. We’ll likely fine-tune this by prompting the AI to produce more descriptive commit notes, or by manually editing them in important cases.
Our team’s morale and productivity saw interesting effects as well. Some devs reported feeling more productive and focused, since they could offload tedious coding to the AI and spend time on creative problem-solving.
Others admitted to a learning curve – it’s a new skill to guide an AI effectively and to double-check its work without redoing it entirely.
Overall the sentiment is positive: we enjoy having Agentic AI as a junior developer who works at lightning speed and never sleeps, but like any junior dev, it needs supervision and mentorship (albeit in a different form).
In meetings, we’re already talking about refining our guidelines for when to trust the AI’s changes and when to be extra cautious.
Conclusion: Faster, but Wiser
In just one week, Agentic AI has undeniably accelerated our development cycle. More commits, quicker merges, and faster fixes – the quantitative metrics all point up and to the right.
Our velocity has improved in ways we can measure (commit counts, issue resolution time) and in ways that are harder to quantify (the team is able to attempt more tasks in parallel, knowing an ever-ready assistant is handling the grunt work).
This lines up with what many developers are finding: AI tools can significantly speed up coding, and our experience so far bears that out.
However, this candid review wouldn’t be complete without the flip side: speed means nothing without control.
We’ve learned that while Agentic AI can turbocharge our workflow, we have to steer the ship carefully.
The past week taught us to improve our test coverage, maintain rigorous code reviews, and continuously communicate as a team to direct the AI’s efforts effectively.
When we got that balance right, the AI genuinely amplified our productivity – it felt like we suddenly had a larger, more capable development organisation without actually hiring new people.
When we got it wrong (by trusting a change too quickly, or not communicating the exact requirements), we had to hit the brakes and course-correct.
Our verdict after week one? Agentic AI is a game-changer for development speed, but it’s not a magic wand.
It’s a powerful tool – one that can lead to sloppy outcomes if misused, or outstanding results if handled with care. In these seven days, we saw both sides.
The key is treating the AI as what it is: a very fast, tireless assistant that still relies on our guidance.
With that understanding, we’re optimistic. If this first week is any indication, our project’s pace is about to go into overdrive – and we’ll be keeping a close, watchful eye as we let this AI accelerator run.
In the end, faster is better only when you stay smart about it. And that’s exactly how we plan to proceed in the weeks to come, with eyes wide open.