And not only that: chatbots never get mean and never argue with each other
Generative AI engines have become an important tool for programmers, and some are even using them without telling their superiors. Others are already thinking about the next step: eliminating human programmers altogether. A recent experiment suggests that this goal is no longer so far away.
ChatDev. This is the name of an imaginary software development company built on the ChatGPT 3.5 generative AI model. The experiment was conducted by a team of researchers from several universities and aimed to show how a set of independent AI-controlled agents could work together to solve a range of software development problems.
Waterfall development. The researchers published their results in a study entitled “Communicative Agents for Software Development”. They followed the well-known waterfall model of software engineering and divided the virtual company's work into four sequential phases: design, coding, testing and documentation.
Bots for different roles. The problems were then solved not by a single bot, but by a series of bots (or rather chatbots) that communicated with one another. The chatbots had different roles and each was assigned to a phase. For example, the “CEO” and “CTO” of ChatDev focused on the design phase.
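The division into phases and roles can be pictured with a small sketch. This is not ChatDev's actual code: the `Agent` class, the `run_waterfall` function and the canned replies are assumptions made for illustration, and the "Programmer", "Tester" and "Writer" roles are hypothetical names (the article only mentions the CEO and CTO). Each reply function stands in for a call to a language model.

```python
from dataclasses import dataclass
from typing import Callable, List

# The four waterfall phases named in the study, in order.
PHASES = ["design", "coding", "testing", "documentation"]


@dataclass
class Agent:
    role: str                     # e.g. "CEO" or "CTO"
    phase: str                    # the phase this agent is assigned to
    reply: Callable[[str], str]   # stand-in for a language model call


def run_waterfall(task: str, agents: List[Agent]) -> List[str]:
    """Pass the task through every phase in order.

    Within a phase, each agent answers the previous message,
    so the output of one bot becomes the input of the next.
    """
    transcript = []
    message = task
    for phase in PHASES:
        for agent in (a for a in agents if a.phase == phase):
            message = agent.reply(message)
            transcript.append(f"[{phase}] {agent.role}: {message}")
    return transcript


def canned(text: str) -> Callable[[str], str]:
    """Build a fake reply function that ignores its input (illustration only)."""
    return lambda _previous: text


agents = [
    Agent("CEO", "design", canned("Which language should we use?")),
    Agent("CTO", "design", canned("Python")),
    Agent("Programmer", "coding", canned("<source code>")),      # hypothetical role
    Agent("Tester", "testing", canned("No errors found")),       # hypothetical role
    Agent("Writer", "documentation", canned("<user manual>")),   # hypothetical role
]

transcript = run_waterfall("Build a Gomoku game", agents)
for line in transcript:
    print(line)
```

In a real system each canned reply would be replaced by a model call that receives the role description and the conversation so far; the point here is only the ordering of phases and the hand-off between roles.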
Minimal human interaction. At each stage, the AI chatbots worked together with little human intervention on tasks ranging from deciding which programming language to use to identifying errors in the code. The project code is available on GitHub.
Five in a row. From there, the researchers checked how ChatDev behaved when carrying out software projects. How long would it take and how much would it cost to complete them? One of the examples they gave was instructing ChatDev to create a simple game called Gomoku, also known as “Five in a Row.”
What language do we use? In the first design phase, the CEO asked the CTO to suggest a programming language to “meet the needs of the new user.” The CTO responded that the recommendation was Python, to which the CEO replied, “Great!” and then explained that “its simplicity and readability make it a popular choice for both beginners and experienced developers.”
Seven minutes and one dollar. After assigning ChatDev 70 different tasks, the study concluded that each project (all of them relatively simple) took about seven minutes to complete and cost, on average, less than a dollar. This included the validation and testing phases as well as the identification of potential vulnerabilities. Overall, 86.66% of the generated software ran without errors.
The effect promises to be remarkable. Studies like this show that the impact on the programming world can be significant, but the researchers made it clear that there were limitations such as errors and subjectivity in the language models. Of course, the role of the human programmer in reviewing and validating the code and its output is still very relevant, but efforts like these make the future automation of development processes seem more immediate than ever.