QCon AI New York 2025: AI Works, Pull Requests Don't: How AI Breaks the SDLC and What to Do About It



Michael Webster, principal engineer at CircleCI, presented “AI Works, Pull Requests Don't: How AI Breaks the SDLC and What to Do About It” at the first QCon AI New York 2025.

Webster began his presentation with a timeline and the current state of AI adoption.

More and more organizations are integrating AI into their development workflows, resulting in an increase in code generation.

Research on AI-assisted coding, led by Dr. Hao He and software engineering researchers at Carnegie Mellon University, examined the long-term impact on more than 2,000 open-source software projects, comparing projects that adopted Cursor, an agentic AI coding assistant, against a matched set of projects that did not. The study found that Cursor initially improved development speed, but the gains were temporary: within about a month, an accumulation of static analysis warnings and steadily increasing code complexity led to a measurable slowdown in development velocity.

Similarly, the “State of AI-Assisted Software Development” study conducted by DORA concluded that while AI-assisted coding is effective at increasing development speed, it is also accompanied by increased delivery instability.

When it comes to code reviews, the number of code additions can be as much as 25 times greater than the number of code deletions, which is a challenge for any organization. According to the State of Code Review report conducted by Graphite, large organizations take about 13 hours to merge a pull request, compared to about 4 hours for small organizations. However, the report found that this discrepancy exists partly because smaller organizations are more than four times more likely to skip formal code reviews altogether.

Webster discussed the impact of AI on the software development lifecycle (SDLC) and continuous integration/continuous delivery (CI/CD) processes at CircleCI.

Using AI in the SDLC increased development speed by 3-5x for about a month, before persistent technical-debt accumulation was observed.

Queuing theory is the mathematical study of queues that analyzes arrival rates, service times, and system capacity to predict queue lengths and waiting times. An example of this is Little's law, defined as L = λW:

  • L = average number of items in the system
  • λ = average arrival rate = exit rate = throughput
  • W = average time an item spends in the system (wait plus service time)

Applying queuing theory, Webster defined delay as the condition in which the arrival rate exceeds the processing rate: whenever work arrives faster than it can be processed, the queue, and therefore the wait time, grows without bound.
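Little's law and the delay condition can be illustrated with a short sketch. The function names and the pull-request numbers below are hypothetical, used only to make the formula concrete:

```python
def littles_law_items(arrival_rate: float, wait_time: float) -> float:
    """L = lambda * W: average number of items in the system."""
    return arrival_rate * wait_time

def queue_grows(arrival_rate: float, processing_rate: float) -> bool:
    """Delay condition from the talk: backlog accumulates
    whenever arrivals outpace processing."""
    return arrival_rate > processing_rate

# Hypothetical example: 10 pull requests arrive per day (lambda),
# each waiting an average of 0.5 days to merge (W).
print(littles_law_items(arrival_rate=10, wait_time=0.5))  # → 5.0 PRs in flight

# If AI raises arrivals to 10 PRs/day but reviewers can only
# process 8/day, the review queue grows indefinitely.
print(queue_grows(arrival_rate=10, processing_rate=8))  # → True
```

The takeaway matches Webster's argument: AI raises λ (code arriving for review) far faster than it raises the processing rate, so review queues lengthen.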

The traditional testing approach is to run the complete test suite on every change. Code coverage data can reduce testing time: build a map from code to the tests that exercise it, run only the tests whose dependencies changed, and rebuild the map periodically to keep it accurate.

Test impact analysis (TIA) is a testing strategy that identifies and runs only the tests affected by recent code changes, eliminating the need to run entire test suites, increasing CI/CD throughput and ultimately reducing costs. Tests are mapped using code coverage data.

Webster introduced the TIA strategy used at CircleCI, which features a map of defined endpoints and associated TypeScript test files. In a recent study of 7,500 tests, testing time using this TIA strategy was reduced from 30 minutes to 1.5 minutes. The same principles could apply to code reviews, he argued, since not all code requires the same level of scrutiny.

Webster concluded his presentation by introducing Chunk, an AI agent developed by CircleCI that claims to “verify code at the speed of AI.” Its approach: verify first; use the pipelines and environments that developers already rely on; fix flaky tests and ensure software is production-ready through adequate testing and code coverage; and learn from merges, reverts, rollbacks, and review comments.
