The Kollected Kode Vicious

Kode Vicious - @kode_vicious


The Drunken Plagiarists

Working with Co-pilots

 

Dear KV,
After more than a year of hearing people talk about AI and co-pilots, I finally tried one on a small project. I even paid for the privilege of doing so, figuring that the paid version would be superior to the free one. But what I have found confuses me, and I'm wondering if you too have tried any of these tools. From your previous articles, it seems you might not be focused on the latest tools in our industry. So, maybe you've just continued to use vim and Makefiles. Have you tried these things, and do you have any words of wisdom for the rest of us who are looking at them now?

Co-Piloted

Dear Co-Piloted,
It may shock KV's readers to learn that I'm a bit of a tools dweeb; whenever some new tool comes out that supposedly will help me create or understand software better, I am willing to try it. This applies not just to tools but also to techniques. I'm even a certified Scrum Master, but that's a story for another time.

I have tried many editors, several IDEs, various debuggers, and all manner of new and interesting tools in my career, and continue to do so. Like you, I had held off on trying the tools based on LLMs and even now continue to throw up in my mouth whenever I'm forced, in conversation, to refer to these as AI. Being able to spit out passable marketing doggerel has about as much in common with intelligence as does an American Presidential election. In fact, those two are clearly, deeply related.

Before trying to use these tools, you need to understand what they do, at least on the surface, since even their creators freely admit they do not understand how they work deep down in the bowels of all the statistics and text that have been scraped from the current Internet. The trick of an LLM is to use a little randomness and a lot of text to guess the next word in a sentence. Seems kind of trivial, really, and certainly not a measure of intelligence that anyone who understands the term might use. But it's a clever trick and does have some applications.

If you're typing a suicide note, for example (and the corpus of text contains thousands of these), it's quite possible the code will be able to guess which word you might use next, since thousands of people who came, and went, before you typed it as well.

Code is an even more constrained environment than prose, in a way, because code must be run through a process that has a strict syntax, one that's far stricter than any human language. It's thought this narrowness facilitates the process of guiding the creation of code, with templating being cited as an early use case for these technologies. And who would not want help with such drudgery? Many pieces of code that are written, especially for the visual web, are just copy-and-pasted versions of other pages, and the same might be said for other areas of coding.

While help with proper code syntax is a boon to productivity (consider IDEs that highlight syntactical errors before you find them via compilation), it is a far cry from SEMANTIC knowledge of a piece of code. Note that it is semantic knowledge that allows you to create correct programs, where correctness means the code actually does what the developer originally intended. KV can show many examples of programs that are syntactically, but not semantically, correct. In fact, this is the root of nearly every security problem in deployed software. Semantics remains far beyond the abilities of the current AI fad, as is evidenced by the number of developers who are now turning down these technologies for their own work.
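To make the point concrete, here is a minimal Go sketch (the function names are mine, invented for illustration): both versions compile cleanly, so both are syntactically correct, but only one matches the intent. Unsigned wraparound in a length calculation is exactly the kind of semantic error that becomes a security hole when the result is used as a buffer size.

```go
package main

import "fmt"

// remaining compiles cleanly, but the semantics are wrong: if used
// ever exceeds total, the unsigned subtraction wraps around, and a
// caller treating the result as a buffer size will read far too much.
func remaining(total, used uint32) uint32 {
	return total - used // underflows when used > total
}

// remainingChecked states the intent the compiler cannot check for us.
func remainingChecked(total, used uint32) (uint32, error) {
	if used > total {
		return 0, fmt.Errorf("used %d exceeds total %d", used, total)
	}
	return total - used, nil
}

func main() {
	fmt.Println(remaining(10, 12)) // wraps to 4294967294
}
```

No amount of next-word guessing distinguishes these two: both look like perfectly ordinary code.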

Guessing the next word used by a cohort of morons, which is what co-pilots actually do, leads to incredibly incisive text such as

server.mtx.Lock() // Lock the cache

 

Yes, thank you, that's the mutex Lock method. But WHY do we lock the cache? And what do we do about it later? This is akin to a comment such as

i++ // Increase i by 1

 

The only reason anyone is impressed by this is that it's written in a form that is more palatable to those who wish to anthropomorphize their machines, something Dijkstra warned about in the 1960s.

Another classic from our new Robot Master:

// Get retrieves the value for a given key if it exists and is not expired.
// Parameters:
// - ctx: context for the request.
// - request: contains the key to retrieve.

 

Wow! Really? If a cursory glance at the code wasn't enough to tell me this, I shouldn't be here at all.

Finally, my favorite feature of co-pilot programs is the abject plagiarism. We already know that the text and code being typed out by these things comes from scanning billions of lines of text and source code available in GitHub, but they can even be helpful in unintended ways. A colleague who was taking a night class in distributed systems showed me what happened when his professors (foolishly) suggested the students "use the new tools" in order to become more modern developers. As he accepted more and more of the co-pilot's suggestions, he noticed a pattern: It was as if someone was typing in another file from somewhere else. The coding style itself was one of the clues, but eventually the co-pilot gave itself away completely by saying, "You know there is a file just like this over in this other repo?" In a way, this makes sense. But as part of a homework exercise, it's just hilarious.

The more I've used these tools in my projects, the more I've realized that co-pilots are nothing more than drunken plagiarists, sitting behind you and your code with their hot, gin-soaked breath whispering semantic nothings in your ear. They are not a boon to your work, they are a rubber crutch, one that will cruelly let you down when you need it most. Now, we all just need to get real work done while we wait for this latest hype cycle to die a justified and fiery death.

KV

 

George V. Neville-Neil works on networking and operating-system code for fun and profit. He also teaches courses on various subjects related to programming. His areas of interest are computer security, operating systems, networking, time protocols, and the care and feeding of large codebases. He is the author of The Kollected Kode Vicious and co-author with Marshall Kirk McKusick and Robert N. M. Watson of The Design and Implementation of the FreeBSD Operating System. For nearly 20 years, he has been the columnist better known as Kode Vicious. Since 2014, he has been an industrial visitor at the University of Cambridge, where he is involved in several projects relating to computer security. He earned his bachelor's degree in computer science at Northeastern University in Boston, Massachusetts, and is a member of ACM, the Usenix Association, and IEEE. His software not only runs on Earth, but also has been deployed as part of VxWorks in NASA's missions to Mars. He is an avid bicyclist and traveler who currently lives in New York City.

Copyright © 2024 held by owner/author. Publication rights licensed to ACM.


Originally published in Queue vol. 22, no. 6