The Crypto-CS-SETI challenge:
An Un-programming Challenge

A challenge to all bright minds in the IT and CS world: Can you disassemble a program for an unknown computer, given only the compiled ROM image?


I hereby announce a challenge, beginning immediately, to gather as much information as possible about the program file linked below.

Imagine finding a crashed flying saucer.

This time it is no hoax and there are no black helicopters involved.

And there is no doubt that it is not of our planet because there is absolutely no way Akhnaten could have bought a used carbon-carbon heat-shield on eBay—and, in particular, one that cannot be carbon-14 dated as it has no carbon-14 atoms.

By deciphering the hieroglyphs, we learn that there was a royal cover up (which is why we haven't heard about it until now) and that the alien technology was based on self-replicating biological principles, which, unfortunately, replicated nowhere nearly as fast as the locusts and cockroaches could eat them.

As more and more of the secret royal journal is translated, things start to make sense: where Akhnaten got his strange ideas and why he was so terrible at genetics, but not, alas, the Philip Glass opera.

But apart from the heat-shield, the only actual physical evidence is some broken test-tubes and a small glittering object, which archaeologists quickly declare "probably had ceremonial or religious significance," which is simply their way of saying "no idea..."

On subsequent examination, the object is found to be a small triangular corner, sawed from a piece of printed circuit board and then used as jewelry or as a talisman—proving the archaeologists both wrong and right.

An X-ray examination reveals that one part of the object contains what is clearly some kind of mask-programmed ROM storage.

We read out the bit-pattern...and then what?

What, if anything, can we deduce about the rest of the alien world from only a program? We don't know what it does, and we have no information about the computer for which it was written [1].

It could have been an alien's favorite game cartridge for his Erodommoc-46 home computer, or it might be a piece of a supercomputer that unlocked the tasty secret of DNA-computing. Or maybe it was the mobile phone with which they digitally signed their marriage certificate several hundred thousand years ago.

Of course, none of the above is true, except possibly for the bit about the opera, but the question is worthy of study in its own right. It's not only a purely academic question, but also a very real problem for curators of computer history collections, where artifacts don't always come with enough metadata to allow proper identification. The task is akin to breaking a cryptographic code, only instead of searching for highly confidential messages such as "Hvid armes hovedkvarter har ikke mere jordbærsyltetøj, kan I sende en stafet over åen med et par glas?" (Danish, roughly: "The white army's headquarters is out of strawberry jam; can you send a courier across the stream with a couple of jars?") we are looking for "while (something()) {something_else(); };". And instead of having been encrypted, it has merely been compiled, and that should make the task much easier: nobody has actively tried to hide anything. Quite the contrary, like all terrestrial programmers, they probably went out of their way to make sure the computer would understand their program in precisely one way.
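The difference between encrypted and merely compiled data can be made measurable. The following is a minimal sketch, not anything taken from the challenge itself: it computes the Shannon entropy of the byte histogram, which is close to 8 bits per byte for well-encrypted data and is usually noticeably lower for machine code. The function name `byte_entropy` and the sample data are illustrative assumptions.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Return the Shannon entropy of the byte histogram, in bits per byte."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Hypothetical stand-ins, not the challenge file.
    uniform = bytes(range(256)) * 64          # every byte value equally often
    repetitive = b"\x3e\x00\xd3\x01" * 4096   # a short pattern repeated
    print(f"uniform:    {byte_entropy(uniform):.3f} bits/byte")
    print(f"repetitive: {byte_entropy(repetitive):.3f} bits/byte")
```

A low figure does not prove that an image is code, and a high one does not prove encryption (compressed data also scores high), so this is a first filter rather than a verdict.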

But we cannot make too many assumptions about the code.

It might have been developed in a highly structured manner with formal methods and proven correct because it was part of something important, or it could have been optimized by a genetic-algorithm compiler and be the most terrible spaghetti code and only marginally less tangled than animal DNA code.

Can we actually disassemble a program for an unknown computer? The best way to find out is to try, so here is my challenge to all bright minds in the IT and CS world:

With the publication of this article I release a 16384-byte program file [2].
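The size alone already says something: 16384 is 2^14, so a byte-addressed device would need 14 address bits to reach all of it. A few more statistics can be gathered before anything is known about the CPU. The sketch below is only an illustration under that assumption; the function names `longest_run` and `first_look` are hypothetical, and the sample data is synthetic, not the challenge file.

```python
from collections import Counter


def longest_run(data: bytes) -> tuple[int, int]:
    """Return (byte_value, length) of the longest run of one repeated byte."""
    best_value, best_len = -1, 0
    run_value, run_len = -1, 0
    for b in data:
        if b == run_value:
            run_len += 1
        else:
            run_value, run_len = b, 1
        if run_len > best_len:
            best_value, best_len = run_value, run_len
    return best_value, best_len


def first_look(data: bytes) -> dict:
    """Collect a few statistics that need no knowledge of the CPU."""
    size = len(data)
    return {
        "size": size,
        # A power-of-two size is what a single ROM chip would typically give.
        "power_of_two": size > 0 and (size & (size - 1)) == 0,
        # Address bits needed to reach every byte, assuming byte addressing.
        "address_bits": max(size - 1, 0).bit_length(),
        "most_common": Counter(data).most_common(4),
        "longest_run": longest_run(data),
    }


if __name__ == "__main__":
    # Synthetic stand-in: a little "code" followed by padding.
    fake_rom = b"\x12\x34" * 100 + b"\xff" * (16384 - 200)
    for key, value in first_look(fake_rom).items():
        print(f"{key}: {value}")
```

Long runs of a single value often turn out to be unused space, which hints at where the program ends, though that too is a guess to be checked against other evidence.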

The challenge is to deduce as much as possible about the program, about the device and about what it does. The program is from a computer very few people have ever come into contact with and for which no copies of the documentation are thought to exist anymore. And Google is absolutely clueless about it, so it is a pretty good stand-in for an alien EPROM.
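One plausible first step toward deducing something about the device, offered as an illustration rather than as the method anyone actually used, is to guess the instruction word size. If instructions are a fixed number of bytes wide, the byte holding the opcode tends to be less varied than the bytes holding operands, so byte statistics differ by position within the word. The sketch below measures that for a candidate width; all names in it are hypothetical, and the sample image is synthetic.

```python
import math
from collections import Counter


def entropy(values) -> float:
    """Shannon entropy, in bits, of the values in an iterable."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def positional_entropy(data: bytes, width: int) -> list[float]:
    """Entropy of the bytes found at each offset modulo `width`."""
    if width < 1:
        raise ValueError("width must be at least 1")
    return [entropy(data[offset::width]) for offset in range(width)]


def spread(data: bytes, width: int) -> float:
    """Largest minus smallest positional entropy for a candidate width."""
    per_position = positional_entropy(data, width)
    return max(per_position) - min(per_position)


if __name__ == "__main__":
    # Synthetic stand-in, not the challenge file: 2-byte words whose first
    # byte takes only 4 values and whose second byte takes all 256.
    fake_rom = bytes(b for i in range(8192) for b in (i % 4, i % 256))
    for width in (1, 2, 3, 4):
        per_position = positional_entropy(fake_rom, width)
        print(width, [round(h, 2) for h in per_position])
```

A clear spread at some width and its multiples, but not at smaller widths, hints at that width. Variable-length instructions, or fields that do not fall on byte boundaries, would blur the picture, so an inconclusive result says little either way.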

The only help you get is this:

It is an unmodified real-world program that somebody once wrote to do something beneficial. The computer still works and I have studied it sufficiently with a logic analyzer to be able to judge any submitted results from the challenge.

There is no prize for winning this challenge, because I have absolutely no idea where to put the goal-posts. The real prize will be learning something about programs as archaeological or crypto-zoological artifacts, and I strongly urge publication of this learning.

I also urge anyone who might be able to identify this ROM image, whether from having programmed this computer or from recognizing the code for some other reason, to please contact me privately rather than publicly spoil the challenge for the rest.

And with that, good luck!

References

[1] A somewhat similar situation was discussed in http://en.wikipedia.org/wiki/History_Lesson_%28short_story%29

[2] The binary file for the challenge is available here:
http://phk.freebsd.dk/UCO/


POUL-HENNING KAMP (phk@FreeBSD.org) is one of the primary developers of the FreeBSD operating system, which he has worked on from the very beginning. He is widely unknown for his MD5-based password scrambler, which protects the passwords on Cisco routers, Juniper routers, and Linux and BSD systems. Some people have noticed that he wrote a memory allocator, a device file system, and a disk encryption method that is actually usable. Kamp lives in Denmark with his wife, his son, his daughter, about a dozen FreeBSD computers, and one of the world's most precise NTP (Network Time Protocol) clocks. He makes a living as an independent contractor doing all sorts of stuff with computers and networks.


Comments

Displaying 10 most recent comments. Read the full list here

rstos | Thu, 19 Jan 2012 13:13:26 UTC

Julian: I have already deleted my work directory and most of my students have done so also. This has moved from an interesting challenge for them to learn about unknown processors on to a lesson on how NOT to run such a challenge.

Poul-Henning Kamp | Thu, 19 Jan 2012 15:57:09 UTC

There is a reason this was called "a challenge", which is not the same as "a competition". If you care to read the article, it should be quite evident to you that the challenge was about broadening our understanding of code and computers, adding to our sum of human knowledge. It should also be quite obvious that there is no way I can prevent anybody from cheating or breaking the rules of the challenge; in fact, I was quite worried that somebody would come out and say "Hey I wrote that, it's ..." and ruin the challenge. That is why the challenge is set up as a "gentleman's challenge". If you don't care to participate, don't participate; it's entirely voluntary. If you don't want to abide by the rules of this challenge, at least be a gentleman, and don't ruin it for others. The group in question broke the rules: it is evident from the material they sent me that they already quite early tried to identify the device using searches on the web. If you want to ace the challenge and identify the device, you do it by analyzing the program, and from the program *alone* find out what the computer does. This particular ROM image is pretty unique in the world, in the sense that we have no CPU documentation for the computer that runs it. Probing our knowledge and understanding of code using a CPU which people had heard about or read about would invalidate the premise "based on only the ROM image". Ideally, I could have designed a computer and written a program just for such a challenge, but it would not have made the challenge life-like and realistic the way this one is: hardware and software written for educational purposes are never realistic. This ROM image is not realistic, it is *real*. I still think it is a damn good challenge, and I am very amazed at the skill, speed and deductions of the group who solved it in 5 days.
I am grateful for the information they shared with me, and I am even more grateful that they have not spoiled the challenge for everybody else; in that respect, they have been true gentlemen. As for winners, I think anybody who manages to solve some part of this challenge will feel like a winner to themselves; it is not in any way an easy challenge. But the true winners, in the sense that there can be any, will be those who publish their methods, observations and tools, and thus add to our understanding of computers and code. Poul-Henning

shawn_e | Thu, 19 Jan 2012 16:33:24 UTC

"cluster" comes to mind and I'm not talking about a large array of alien CPUs working together.

Segher | Thu, 19 Jan 2012 18:07:16 UTC

Hi everyone. I am the miscreant who stumbled upon that infernal block diagram (which btw is in the public domain and available on many websites). We reversed (most of) the instruction set and internal architecture of the CPU, and some of its peripherals. It became clear that this machine was of earthly origin, but nothing else was apparent, not even an approximate age, or what part of the world it is from. Since I'm quite into computer-related archaeology, and interested in way too many things, I soon found myself on yet another yak-shaving expedition, as always seems to happen. Then suddenly, O Fortuna!, I was looking at a block diagram that matched everything we knew about our alien CPU in detail. That gave us a name for this CPU but nothing much else. Us being human beings, we threw that name into google, not expecting much (we were told we wouldn't find anything, after all!) Unfortunately that ends up identifying the device this CPU is from. Then we contacted phk. We did not "cheat" in any way. The challenge is to find out what we could from the binary alone; that is what we did. When we could no longer adhere to the rules, through no fault of our own, we were open about that. Poul sent us the schematics of the device, so there is no way we can honestly reverse any more of it; and I do not find looking further into the code driving it to be very interesting, given what its function is, so that's the end of that as well. But. This was quite a nice challenge for us, and it's really a weird historical computing artefact. I do encourage everyone else to keep looking at it and discover, well, whatever there is to discover, and hopefully your journey will not be stopped short by a train wreck like ours was! Segher

Poul-Henning Kamp | Thu, 19 Jan 2012 23:30:36 UTC

Thank you for being graceful about this. If you ever come near Denmark, call me, and I'll show you a LOT of old computing artifacts in datamuseum.dk Poul-Henning

XAD | Sat, 28 Jan 2012 08:59:37 UTC

I don't feel able to participate in this challenge, but I would like to mention that there is a fundamental flaw in it: decoding human-produced code is far from decoding alien-produced code. It should be expected that the "hardware" and the "software" could be based on entirely different principles: other materials, non-binary processors and logic, chaotic organization, and so on. As a challenge, I would have suggested decoding the DNA processing logic (not the DNA content, which is now quite well known). Even this would be much easier than decoding alien-based code, since we have a good understanding of its reactions to external inputs.

Poul-Henning Kamp | Sun, 05 Feb 2012 21:51:28 UTC

Well, until we see some actual aliens, I'll refrain from speculating too much on how they build their computers. I don't think it is a bad assumption that they might have binary computers; the principle has a lot going for it, so much in fact that it has out-competed all other types on this planet by a large margin.

Poul-Henning Kamp | Sat, 05 May 2012 19:20:04 UTC

This is just to add a pointer to a very interesting blog post from the crew who reverse engineered the unknown CPU's instruction set in a matter of days: http://fail0verflow.com/blog/2012/unprogramming-intro.html

rcf | Thu, 19 Jul 2012 10:32:24 UTC

Are there plans to release any solutions to this challenge? The fail0verflow team suggested they would be showing their methods but haven't posted anything on this since their introduction to the challenge in March.

hepp | Sat, 27 Oct 2012 14:00:23 UTC

I really hope there will be an approach-and-tools blog post at some time. I really want a fly-on-the-wall look at how you get from this blob to something useful.