Why your AI Code Completion tool needs to Fill in the Middle
Excerpt
Analyzing how fill-in-the-middle allows Codeium to make better suggestions.
Code Completion Models
Large language models have been trained over billions of bytes of data to perform exactly one task extremely well: given the preceding N characters, predict the next one. The driving force behind the AI revolution we're currently experiencing is that being able to predict the next character with high accuracy is an incredible superpower. It allows you to build chatbots like Bing and ChatGPT, copywriting assistants like Jasper, and code completion tools like Codeium and Copilot.
The models powering code completion tools know how to complete entire functions just from their signatures:
They can see your imports and predict what task you're trying to complete:
But there's a problem: the model only knows about the code before your cursor. What about everything that's after? The existing code there can be incredibly useful when programming, providing information about potential functions to call, coding practices to emulate, and approaches to take.
So, what's the solution? Enter Fill in the Middle (FIM). Introduced in a paper last year by OpenAI, FIM is an under-discussed training technique that allows language models to incorporate the context that comes after the cursor.
How Fill-in-the-Middle works
It's quite simple: let's say we have a training example that looks like this:
The quick brown fox jumps over a lazy dog
and we want the model to learn to predict the middle text "jumps over" from the prefix "The quick brown fox " and the suffix " a lazy dog".
First, we make two cuts to separate these sections, introducing new tokens <PRE>, <MID>, <SUF>, and <EOM> (end of middle):
<PRE>The quick brown fox <MID>jumps over<SUF> a lazy dog<EOM>
Then we simply transpose the middle and suffix:
<PRE>The quick brown fox <SUF> a lazy dog<MID>jumps over<EOM>
Now, we train exactly like we did before, predicting the following text "jumps over<EOM>" from the earlier text "<PRE>The quick brown fox <SUF> a lazy dog<MID>". The model automatically learns the meaning of the special tokens and learns that it is expected to generate text that makes sense after the prefix but before the suffix!
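The transformation above can be sketched in a few lines of Python. This is a minimal illustration, not Codeium's training code: the function name make_fim_example is hypothetical, and the sentinels are plain strings here, whereas a real tokenizer would reserve dedicated token IDs for them.

```python
from typing import Tuple

# Sentinel strings as named in the text. Assumption: a real setup would map
# each of these to a dedicated special token in the tokenizer's vocabulary.
PRE, SUF, MID, EOM = "<PRE>", "<SUF>", "<MID>", "<EOM>"


def make_fim_example(document: str, cut1: int, cut2: int) -> Tuple[str, str]:
    """Split `document` at two character offsets and return (context, target).

    The context is the prefix and suffix with the middle removed; the target
    is the middle followed by the end-of-middle token.
    """
    if not 0 <= cut1 <= cut2 <= len(document):
        raise ValueError("cuts must satisfy 0 <= cut1 <= cut2 <= len(document)")
    prefix, middle, suffix = document[:cut1], document[cut1:cut2], document[cut2:]
    # Transpose middle and suffix so the middle is predicted last.
    context = f"{PRE}{prefix}{SUF}{suffix}{MID}"
    target = f"{middle}{EOM}"
    return context, target


doc = "The quick brown fox jumps over a lazy dog"
start = doc.index("jumps over")
context, target = make_fim_example(doc, start, start + len("jumps over"))
print(context)  # <PRE>The quick brown fox <SUF> a lazy dog<MID>
print(target)   # jumps over<EOM>
```

Concatenating context and target gives the full training sequence, which is then trained on with the usual next-token objective.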
At inference time, if we're trying to infill a document like the following:
we can present it as
to the model and request characters until the model emits an <EOM> token, at which point it has successfully joined the prefix with the suffix.
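The inference loop can be sketched as follows. Everything here is an assumption made for illustration: build_fim_prompt and infill are hypothetical helper names, the generate callable stands in for whatever streaming interface a real model exposes, and fake_model is a canned stub rather than an actual model.

```python
from typing import Callable, Iterable, List

# Sentinel strings as named in the text (assumed to be special tokens).
PRE, SUF, MID, EOM = "<PRE>", "<SUF>", "<MID>", "<EOM>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the text around the cursor in the order the model was trained on."""
    return f"{PRE}{prefix}{SUF}{suffix}{MID}"


def infill(
    prefix: str,
    suffix: str,
    generate: Callable[[str], Iterable[str]],
    max_tokens: int = 256,
) -> str:
    """Request tokens until <EOM> (or a length cap), then splice in the middle."""
    pieces: List[str] = []
    for i, token in enumerate(generate(build_fim_prompt(prefix, suffix))):
        if token == EOM or i >= max_tokens:
            break
        pieces.append(token)
    return prefix + "".join(pieces) + suffix


def fake_model(prompt: str) -> Iterable[str]:
    # Stand-in for a real model: emits a fixed completion, then <EOM>.
    yield from ["jumps", " over", EOM, " never reached"]


result = infill("The quick brown fox ", " a lazy dog", fake_model)
print(result)  # The quick brown fox jumps over a lazy dog
```

The max_tokens cap is a defensive choice for the sketch: a model that never emits <EOM> would otherwise generate indefinitely.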
FIM vs non-FIM models
With FIM, we can greatly improve the accuracy of code completion tools by providing context to the model that would otherwise be missing. Let's see some examples comparing two different code autocomplete tools, Codeium and Tabnine Pro.
Codeium is a free code completion product used by tens of thousands of developers around the world. Codeium's enterprise offering allows customers to self-host Codeium in their virtual private cloud or on-premises to ensure that no data is sent outside of the company. Tabnine is an AI code assistant that also offers self-hosting for enterprises.
Here are two suggestions with the same prompt for each tool. Codeium, on the left, is using a FIM model which can see the usage of the distance function below the cursor and is able to infer that it is supposed to compute the edit distance between a and b. Tabnine Pro, on the right, at the time of writing likely didn't use FIM, and gives a worse suggestion as a result.
[Screenshots: Codeium (left), Tabnine Pro (right)]
In this Golang code, Codeium understands that it needs to initialize the messages channel, while Tabnine just outputs Hello World:
[Screenshots: Codeium, Tabnine Pro]
Codeium can even generate an accurate docstring for an already-implemented function:
[Screenshots: Codeium, Tabnine Pro]
Conclusion
Software engineering is rarely a linear task: programs are usually not written in one shot from start to end. Most day-to-day programming involves adding functionality, refactoring code, and fixing bugs, all tasks that benefit greatly from context after the cursor.
It should be no surprise, then, that code completion models trained with FIM capabilities easily outperform simple left-to-right models. Indeed, when we deployed FIM for all Codeium users, we saw large increases in our acceptance rates and user satisfaction.
Off-the-shelf code completion models like Salesforce Codegen (which powers FauxPilot) have not been trained with FIM, so code completion tools that want to use FIM need to train their own models. This is harder than it may seem: there are some subtleties involved in choosing where to cut the document and in ensuring that your model's left-to-right performance does not suffer.
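To make those two subtleties concrete, here is a minimal sketch of one possible approach: transform only a random fraction of training documents, leaving the rest as ordinary left-to-right examples, and pick the two cut points uniformly at random at the character level. This is an illustrative assumption, not a description of how Codeium trains its models; the function name maybe_fim and the default fim_rate of 0.5 are placeholders chosen for the example.

```python
import random

# Sentinel strings as named in the text (assumed to be special tokens).
PRE, SUF, MID, EOM = "<PRE>", "<SUF>", "<MID>", "<EOM>"


def maybe_fim(document: str, rng: random.Random, fim_rate: float = 0.5) -> str:
    """Return `document` unchanged or in FIM order, chosen at random.

    With probability `fim_rate` the document is cut at two random character
    offsets and rearranged as prefix, suffix, middle. Otherwise it is left
    as a plain left-to-right example, so the model keeps seeing both kinds.
    """
    if rng.random() >= fim_rate:
        return document
    # Two uniformly random cut points; sorting guarantees cut1 <= cut2.
    cut1, cut2 = sorted(rng.randint(0, len(document)) for _ in range(2))
    prefix, middle, suffix = document[:cut1], document[cut1:cut2], document[cut2:]
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}{EOM}"


rng = random.Random(0)  # seeded so runs are reproducible
for _ in range(3):
    print(maybe_fim("def add(a, b):\n    return a + b\n", rng))
```

Character-level cuts are only one option. Cutting on line or syntax boundaries, or tuning how often the transformation is applied, are the kinds of choices the paragraph above refers to.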
If you'd like to try out Codeium's FIM code completion model, head over to our playground or try us out in your IDE of choice.