The State of AI in software engineering

AI in software engineering

AI (Artificial Intelligence) refers to the ability of machines to perform tasks that would normally require human intelligence, such as learning, reasoning, and problem-solving.

AI-powered tools are being developed to help developers automate repetitive tasks: generating code, identifying bugs, optimizing performance, and more. They aim to improve software quality and reduce development time by automating parts of the development process and by providing suggestions and information when needed.

In this article, I'll try to summarize the state of AI in software engineering in mid-2023 as I see it.  

AI tools available to developers

I recently started using some AI-enabled tools for the research and coding part of an upcoming blog post, and my experience was overwhelmingly good. I think it is best to categorize these tools by the problems they help to solve. The main areas I identified are coding and information retrieval.

Coding

AI-powered coding assistance has the potential to revolutionize the way developers write code. Tools like Tabnine or GitHub Copilot use machine learning to suggest code based on the context of what is being written. GitHub Copilot is trained on billions of lines of code from public GitHub repositories, which helps it suggest code that is often both accurate and relevant.

Tools like these are helping developers in several ways. First, they can help save time by automating repetitive coding tasks. Second, they can help reduce errors by suggesting code that is more accurate and efficient than what a developer might write on their own.

I have been using GitHub Copilot for a few weeks now. In my experience, a few of the suggested method bodies or even classes were nonsense, but most of the time they were useful. Using Copilot made coding faster for me. The quality of the help you get depends on the code you already have: the more context the AI model has, the better its suggestions become. It surprised me how good it was.

However, AI-powered coding assistance tools are not perfect and still have limitations. Their suggestions are not always relevant or accurate, they may struggle with complex or unusual code patterns, and they can suggest code that contains security vulnerabilities or other flaws. There are also concerns about the potential impact of these tools on the job market for developers, as they may eventually be able to replace some aspects of a developer's job. Privacy and copyright-related problems can arise too. Many tools provide FAQs that try to clarify their stance on these questions. The one from GitHub, for example, is worth checking out.

Overall, the state of AI in coding assistance is rapidly evolving and shows great promise for the future of software development. Tools like GitHub Copilot are just the beginning of what is likely to be a major shift in the way developers write code.

Information retrieval

One of the biggest opportunities I see in AI tools is not limited to software engineering. At Wise, we have 5000+ employees, more than 600 engineers, and 500+ microservices. Our autonomous teams write documentation and share information in Confluence, GitHub, Jira, Slack, and a few more platforms. If a project requires you to touch domains you are not familiar with, one of the most difficult problems you will face is getting information about them: how a process works, what services play a part in it, what the rules are, who owns what, and so on. A big part of my daily work is understanding projects and code written by others so I can contribute too. This usually means searching for docs in Confluence, asking around on Slack, reading readmes on GitHub, and reading code, trying to put something sensible together.

A ChatGPT-like tool trained on all the data available in a company could be a huge step toward making this often slow and painful learning process faster and more efficient. It is no accident that Atlassian introduced Atlassian Intelligence for its cloud-based products just a few days ago. Judging by its promises, it could deliver some of what I mentioned above, and I'm curious to see it. But this tool is still limited to Atlassian products and can't process data from GitHub, Slack, or Google Drive docs.

I'm sure you know the feeling: after reading something in a book or article you think, 'oh yeah, I see how it works', but when you start to implement it you run into edge cases or aspects you hadn't thought about initially. The book did not go into that much detail, and sometimes it is really difficult to find answers to these kinds of questions online. In a scenario like this, I asked ChatGPT as I would ask an experienced colleague about the topic. Not only could the AI answer my questions, it was also capable of generating an implementation that was syntactically and semantically correct. When I started asking specific questions about the implementation ('Why did you use a temporary file for the writes first?'), it managed to answer those clearly and correctly. Going even further, it provided alternative solutions and rewrites when I asked it to use a different Java lib or to modify the code in some other way.
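
The generated code itself is not reproduced in this post. To illustrate the kind of pattern that temporary-file question was about, here is a minimal sketch, assuming the goal is that readers never see a half-written file. The class name AtomicFileWriter and its write method are hypothetical names chosen for this example, not what ChatGPT produced.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only (assumed names, not the original generated code).
final class AtomicFileWriter {

    // Writes content to a temporary file first, then moves it over the target,
    // so a crash in the middle of the write leaves the old target untouched.
    static void write(Path target, String content) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // Keep the temp file in the same directory so the move stays on one file system.
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            try {
                // Whether an existing target is replaced by an atomic move is
                // implementation specific; on common platforms it is.
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            } catch (AtomicMoveNotSupportedException e) {
                // Fallback: a plain replace, which is no longer guaranteed to be atomic.
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } finally {
            // No-op after a successful move; cleans up the temp file after a failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Calling AtomicFileWriter.write(path, "some text") replaces the file at path in one step where the file system supports it, which is probably the reasoning behind an answer to a question like the one I asked.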

I had a very positive experience, to say the least. Imagine a chatbot that knows everything about the tech or any other aspect of the organization you work for, and that you can ask about anything at any time. It sounds almost invaluable.

But before we take this very near future for granted, there are some concerns too. One of the main issues around AI tools in this area is giving third-party companies access to sensitive company data, including source code. These offerings often reserve the right to use any inputs to train and further enhance their models, which carries the risk of leaking PII and other sensitive data. It is the task of security teams to figure out how a company can use these tools within regulations. GitHub Copilot with a business license, for example, offers the option to exclude user input and suggested code from model training, which keeps that data safe for the company.

Another problem is the quality of the data the tools have to work with. If a team is not creating documentation, we only have the code to work with. A powerful enough AI model could make sense of what is going on in a complex service, but we are not there yet. An even bigger problem can be outdated or duplicated docs on a topic. In these cases, I as a human can ask the colleagues who wrote them what is going on, but an AI can't. If the response does not mention that the information may be outdated, a developer can end up trusting a suggestion that is long out of date or simply wrong.

+1 Writing this article

I also tried to use ChatGPT to help me write this post. First I asked for actual paragraphs, but even after refining my prompts, and although the results made sense, they were just not my style, so I ended up not using anything the tool generated. However, I still found it very useful for research. I used it to suggest topics and themes, look up information, and so on. It felt like a faster version of searching Google by myself.
I also have to mention that I use Grammarly to check my English and it also has AI-powered features.

Summary

Overall, AI can revolutionize software development by improving efficiency, reducing errors, and helping developers create better, more sophisticated applications. As I see it, many companies are getting familiar with the options and starting to introduce these tools for internal use. For now, AI tools may be banned at your job, mainly because of security and privacy concerns, but as soon as these are solved, it will be big.