While it may feel as though natural language processing (NLP) tools such as ChatGPT emerged overnight, they have in fact been around for quite some time. It is exciting to think of what this advancement could mean for a multitude of applications: work, school, and learning new skills. These tools are touted as making our lives easier, but how much can they actually (accurately) accomplish today? What are their limitations?
When it comes to these tools helping with our work, many stories have emerged about how AI could one day replace many jobs. Coders and software engineers appear on the list of purportedly at-risk fields, but can these programs actually produce the same caliber of work as a trained and experienced developer? Cloudberry’s Senior Developer, Mikhail Kornienko, sought to find out just how advanced these offerings are at performing the craft he has honed over his 20+ years in the field. The following details Mikhail’s experience testing the capabilities and shortcomings of these tools in writing (meaningful) code:
What tools are currently available?
Currently, a few tools are available to write code using various AI models. The following have been reviewed and are being used in Cloudberry development:
- GitHub Copilot: $10 per month subscription
- OpenAI’s ChatGPT: Limited free version available, $20 per month subscription or paid-per-token subscription
GitHub Copilot was originally created as a tool to support programming, so its training data is geared squarely toward programming needs (though it can also generate sensible completions beyond the code itself, such as writing or auto-completing a comment for a chunk of code).
Copilot supports a variety of languages; what matters is whether your Integrated Development Environment (IDE) can actually plug into the system. Some IDEs, like VS Code or PhpStorm, support this integration; others, such as Apple’s Xcode, do not (at least not without jumping through hoops).
How can these tools best be leveraged as they exist today?
Copilot. We primarily use Copilot in an always-enabled mode, allowing it to generate code as it sees fit.
In many cases, the code served up is in pretty good shape, saving time on grunt work.
Our recommended best practice for using AI at the moment: let it do the grunt work, under strict developer control.
What constitutes “grunt work”:
- Generating chunks of code as it sees fit, based on the context it has access to (usually the source code, plus additional context provided during coding, such as the programmer correcting things Copilot initially generated)
- Generating chunks of code on request: the programmer writes a comment describing the functionality of code (a function, class, etc.) that doesn’t yet exist, and Copilot creates it from that description
- Generating predictable structures (for example, sequences of objects whose contents change predictably, such as incrementing IDs, based on the first few items in the sequence)
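To make the last two kinds of grunt work concrete, here is a minimal Python sketch; the function and data are hypothetical, chosen only for illustration, and the comments mark which part an assistant like Copilot would typically fill in:

```python
# Comment-driven generation: the programmer writes only the comment
# below, and the assistant proposes the function body.

# Return the total price of a cart, applying a percentage discount
# (e.g. discount=10 means 10% off), rounded to 2 decimal places.
def cart_total(prices, discount=0):
    subtotal = sum(prices)
    return round(subtotal * (1 - discount / 100), 2)

# Predictable structures: after the first couple of entries, the
# assistant usually suggests continuations with the IDs incrementing.
users = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Carol"},  # a typical suggested continuation
]
```

In both cases the programmer still reviews the suggestion; the time saved is in the typing, not the thinking.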
Strict developer control is necessary because the AI can generate code that simply doesn’t work. Although this is rare, it does happen. More often, the AI generates code with small, easy-to-miss bugs or gotchas, which can in some cases cause extensive collateral damage. Perhaps most hilariously, the AI can also generate “dreamt-up” (hallucinated) code, which appears correct but relies on functionality that doesn’t actually exist in the language.
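As a small illustration of such “dreamt-up” code: an assistant may confidently call a method that does not exist. In the Python sketch below the method name is invented (which is exactly the point), and a quick check exposes the fabrication:

```python
# A hallucinated suggestion can look perfectly plausible, e.g.:
#
#     cleaned = text.remove_whitespace()   # str has no such method!
#
# Running it would raise AttributeError. A quick sanity check:
print(hasattr(str, "remove_whitespace"))  # the invented method -> False
print(hasattr(str, "strip"))              # the real method it resembles -> True
```

This is why generated code should be run and reviewed, never pasted in on trust.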
In a normal flow, we simply let Copilot run alongside us, proposing chunks of code, which we quickly review and either accept as-is, accept on the condition that the code is modified before use, or reject altogether. Unsatisfying code from Copilot is not the end of the world: even rejected code sometimes contains “ideas” or “hints” of its thought process that a programmer can pick up and run with, writing new code that fuses the programmer’s own thinking with Copilot’s input. Copilot, in turn, often picks up on the programmer’s revised direction; a few “manual” hints (the programmer writing a few lines of correct code) might be all it takes for its suggestions to improve markedly.
ChatGPT. While Copilot is used for general-flow coding, ChatGPT can be used for even more.
For this test, we set up a local server that talks to the ChatGPT API using its on-demand (per-token) pricing rather than the $20/month plan, which makes it much more affordable. We did not use the free plan, because it tends to be throttled or altogether unavailable at times (even though the free tier can sometimes be faster, and may even run a newer iteration of the GPT engine entirely, such as GPT-4 vs. GPT-3.5). So far, GPT-3.5 has worked well enough for our use cases.
The primary use of ChatGPT is for “research”, or writing code on-demand, followed by a programmer talking to the system and asking it to make specific changes to the code, or providing feedback.
Using ChatGPT for research is essentially like having documentation for any programming language, concept, or algorithm, but in an extremely fast and flexible way. All of this information is already available through general Google searches or the official docs, but ChatGPT makes the process easier and, in most cases, much faster. Depending on the information provided, it is possible to jump straight into coding, or ChatGPT can be asked to expand on a specific topic, allowing one to dig deeper and explore concepts further. This can be very useful when writing code in a new language or framework, where Copilot won’t be much help (because the programmer has very little knowledge of how things should actually be coded). In that case, ChatGPT can be fed natural-language requests and will output code. That code, in turn, can be read, analyzed, and learned from by the programmer, checked for potential issues, and iteratively and interactively improved or modified in human-readable form, simply by chatting with the system.
ChatGPT works best for programming (and for most professional topics) when the user limits, or channels, its focus onto something specific rather than letting it draw on the full breadth of its training data. This “channeling” is normally done using “prompts”: a prompt is given to the system at the beginning of the interaction (prompts for GPT systems are a complex topic in general, and we have yet to fully understand them). The following is an example prompt that could be used when writing code in PHP with the Laravel framework:
“You are a Laravel programmer helper. You help the person who asks you different programming-related questions. The system you are helping with is built using Laravel. You must use Laravel 8 or later to generate responses. Be brief and concise. When unsure about something, mention it.”
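In practice, a prompt like this is sent as the “system” message of a chat completions request, while the programmer’s question goes in the “user” message. The Python sketch below shows only how such a request body is assembled; the message roles and model name follow OpenAI’s published chat completions API, the helper function is our own illustration, and no network call is made here:

```python
import json

def build_chat_request(system_prompt: str, question: str,
                       model: str = "gpt-3.5-turbo") -> str:
    """Assemble the JSON body for a chat completions request.

    The system prompt "channels" the model onto a specific role;
    the user message carries the actual programming question.
    """
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    }
    return json.dumps(payload)

prompt = "You are a Laravel programmer helper. Be brief and concise."
body = build_chat_request(prompt, "How do I define a one-to-many relation?")
```

A local server like the one described above would forward this body to the API with an authorization header carrying the API key, and relay the model’s reply back to the programmer.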
Overall, AI systems definitely help and can be big time savers. But as with everything else, to make these systems serve you well, you need to use them and learn them, and in the case of AI, learn to control and limit them. Always be ready to be deceived, or to get something generated that will kill your database, or worse.
The Developer’s risk of being replaced: Mikhail’s take
As a developer, embracing change and adapting has always been essential, not only to stay relevant but to remain in the field at all, so whether or not to embrace AI is not really a difficult choice.
The concern about making developers obsolete has been hotly debated. Some developers have tried (and continue to try) not to touch the AI-assisted approach with a ten-foot pole. They could have the last laugh, who knows? Throughout the history of development, developers have gone from no documentation, to documentation in the form of Xerox copies, to piles of books with documentation and code snippets, to the very early days of online communities and rough search engines, to Google and Stack Overflow, and now, finally, to the ChatGPTs of the world (!), and have not yet become obsolete. Does this mean all the talk of being replaced is simply the latest exaggeration? If we choose to stay relevant and embrace AI, we must also embrace the inherent consequences that accompany this scary and exciting advancement.