May 15, 2005

Obfuscating Code

Robert Cringely, in a post early last year, raised some concerns about reverse engineering .NET code:

.NET is almost exclusively Just-In-Time compiled. JIT'ing means, "I was just about to interpret this, but I'll compile it at the very last minute instead." In effect, the .NET code remains in interpretation-intended form right up until the end. The point is that it carries around tons of info with it that makes reverse engineering easy just as with interpreted languages. The original Microsoft BASIC was an interpreted language and subject to this vulnerability, which is why it was so easy to copy on punched paper tape and why Bill Gates once referred to many of his earliest users as "thieves." Many languages are interpreted including some of my favorites like Forth, PostScript, and Scheme. Java is interpreted and subject to this same vulnerability but the evolution of Java has led to it being used mainly for server applications where the source is a bit further out of reach. .NET, on the other hand, is Microsoft's chosen successor to Visual BASIC, and effectively exposes source code at the very heart of Microsoft consumer and enterprise applications.
The answer to providing a modicum of security for interpreted applications has to this point been obfuscation -- making the code look different so it can be difficult to decompile and figure out. Obfuscation used to mean padding the code with extra variables and gibberish -- that is until a company in Cleveland, Ohio, called Preemptive Solutions Inc. came out with a bytecode optimizer for Java. Called DashO, this software was intended to make Java programs load and run faster by removing all code that wasn't necessary, which is to say de-obfuscating and making perfectly clear what had been so carefully muddied before.

Preemptive also makes Dotfuscator for the .NET market. A "community edition" of this obfuscator was included with VS.NET 2003. Microsoft knew they had a thorny problem on their hands-- balancing the utility of source code access with the legitimate need to protect commercial software.

I believe you can attribute much of .NET's success to its transparency; it's free, easy to obtain, easy to write, and easy to reverse engineer. I've read dozens of blog posts where authors successfully decompiled Microsoft .NET libraries to diagnose difficult problems. .NET's openness is also an indirect compliment to the open source movement, where "security through obscurity" is a derogatory slur.
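To make the "carries around tons of info" point concrete, here is a minimal sketch by analogy. It uses Python bytecode rather than .NET IL, and `check_license`, its parameters, and its constants are all hypothetical names invented for the illustration. The idea is the same, though: the compiled form still holds the original variable names and literal values.

```python
# Analogy only: this inspects Python bytecode, not .NET IL.
# check_license and everything inside it is a made-up example.
def check_license(user_key, trial_days_left):
    secret_salt = "s3cr3t"
    expected_key = secret_salt + "-PRO"
    return user_key == expected_key or trial_days_left > 0

code = check_license.__code__

# Parameter and local variable names survive compilation.
print(code.co_varnames)

# So do the literal constants, including the "secret" string.
print(code.co_consts)
```

Anyone holding the compiled function can read the names and the salt without ever seeing the source file, which is roughly the situation a decompiler exploits in a managed assembly.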

On the other hand, there are special conditions where you do need some additional security. Why pay for a component when you can download it, easily decompile it, and comment out the trial restrictions? If I were selling a commercial .NET component, I'd be a fool to release a trial version without obfuscating it first. As with all client-side protection methods, this is only a stopgap intended to raise the difficulty bar. But it's still worth doing. I lock the front door of my house, too. Right after I activate my nuclear-powered laser attack robots.

I believe it's best to err on the side of transparency. That buys you a lot more in the long run. You'll want to leverage basic "locking the front door" efforts, such as obfuscation, to keep cracking your licensing code from being a trivial one-click operation. But don't expend a lot of additional effort on protecting your code-- all client-side protection mechanisms are vulnerable by definition. Instead, keep improving and refining your code. You're a lot more likely to beat would-be pirates through frequent, meaningful updates than you are by bothering your customers with increasingly onerous security measures.
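As a rough illustration of what "locking the front door" looks like in practice, here is a toy identifier-renaming pass. It is a sketch of the general technique only, not how Dotfuscator or any real obfuscator works, and it is written in Python (3.9 or later, for `ast.unparse`) rather than against .NET. `RenameLocals`, `is_licensed`, and the key string are hypothetical names made up for the example.

```python
import ast

class RenameLocals(ast.NodeTransformer):
    """Toy obfuscator: swaps argument and local names for meaningless ones."""

    def __init__(self):
        self.mapping = {}

    def _alias(self, name):
        # Hand out a0, a1, a2, ... in order of first appearance.
        if name not in self.mapping:
            self.mapping[name] = f"a{len(self.mapping)}"
        return self.mapping[name]

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        # Rename a name when it is assigned, or when it was already renamed.
        # Names that are only ever read (builtins, for instance) stay as-is.
        if node.id in self.mapping or isinstance(node.ctx, ast.Store):
            node.id = self._alias(node.id)
        return node

# Hypothetical licensing check, used only as input for the sketch.
SOURCE = '''
def is_licensed(user_key, trial_days_left):
    expected_key = "ABC-123"
    return user_key == expected_key or trial_days_left > 0
'''

tree = RenameLocals().visit(ast.parse(SOURCE))
obfuscated = ast.unparse(tree)
print(obfuscated)

# The renamed code still behaves the same way.
namespace = {}
exec(obfuscated, namespace)
print(namespace["is_licensed"]("ABC-123", 0))
```

The descriptive names are gone, but the function name, the control flow, and the literal key are all still sitting there. That is why this kind of pass only raises the bar. A real tool also has to cope with things this sketch ignores, such as public APIs, reflection, and callers that pass arguments by name.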

One alternate solution is to write code in languages that are already obfuscated*, as demonstrated in the International Obfuscated C Code Contest. Here are two winning entries from 2004. Note that this is actual source code you're viewing!

Or, for ultimate obfuscation, you can opt to write all your code in whitespace language.

* I kid! Or maybe not.

Posted by Jeff Atwood

Comments

Another option is to ship source. I've debugged problems in Delphi apps by stepping into the VCL (Visual Component Library) to see where it was blowing up. Typically it was because my code was doing something bad that the VCL wasn't expecting.
This way, you can see what's going on under the covers, and can still ship an optimised binary. The best of both worlds.

Mike Swaim on May 15, 2005 9:04 PM

Right. A lot of companies make the (smart, IMHO) decision to sell source licenses and bypass all these issues entirely. I believe Dan Appleman's stuff is like that:

http://www.devx.com/opinion/Article/20513/1763

Jeff Atwood on May 15, 2005 10:24 PM

Using dotNET greatly affects vertical market companies, as it exposes their algorithms and other trade secrets. These are companies that serve a narrow market such as health care, CNC controllers, etc.

Security by obscurity is indeed not good to rely on. But it does drive up the cost of reverse engineering as well as other measures. My company made CNC machinery and tried for years to work with our competition. Slowly, over the course of a decade, we managed to import and export data to most of our competition's machines. But it was costly and the translation often imperfect. However, since our competition went to a dotNET version, finding out how to import and export their data has been two orders of magnitude easier.

The whole process has left me wondering about the wisdom of using dotNET in a commercial application. Applications have their innermost secrets exposed when written in a dotNET language. Granted, the internals of DLLs and ActiveX components are still as difficult to decode even if they are called from a dotNET application. But needing to code trade secrets in an older framework in order to keep them defeats the benefits of dotNET.

RSC on May 16, 2005 8:01 AM

We discovered this little fact when poking around with Reflector. Just grab a commercial product written in .NET and run Reflector against it. Take, for instance, ASPNET Menu... not legally, of course, but you could have your own version of that nice product by decompiling it and staring at the source code. It's wrong... but it works. Remember though... folks have done this for years with Java bytecode.

Brian Swiger on May 16, 2005 10:39 AM

I firmly believe that tools like Reflector and its decompiler actually make the .NET platform a more attractive environment to work with. Developer documentation is always insufficient (if not downright inaccurate) - having instant, easy access to the source for any (managed) component, especially the BCL itself, makes a developer far more productive, and thus makes the platform itself more appealing.

I personally find Reflector every bit as useful as downloadable source - maybe more so, because it doesn't involve downloading big code bases, mucking around in Visual Studio, etc. I've never felt the need to download Rotor, for example.

Kevin Dente on August 18, 2006 9:39 AM

"If you’re shipping high-level source code in any form, including bytecode, self-hosted executables, or encrypted bundles, you’re ultimately shipping your source code. Get used to that idea, or go back to writing in C."

http://www.matasano.com/log/1055/de-obfuscation-for-the-impatient/

Jeff Atwood on June 3, 2008 3:46 AM

> If programs are for communicating with other programmers, why do we have a contest that encourages such complete perversions of best practises?

http://selfexplanatorycode.blogspot.com/

Jeff Atwood on August 5, 2008 2:10 PM

I'd like to note that the image* of the maid-girl's head is that of the character Rinia from Moekan / Moekko Company.

More info at the IOCCC:
http://www0.us.ioccc.org/2004/omoikane.hint
http://www0.us.ioccc.org/whowon.html

* http://www.codinghorror.com/blog/images/obfuscated_1.gif

Kaori on December 1, 2008 9:12 PM
