23 Feb 22

Not Invented Here vs Not Implemented Here

Not Invented Here (NIH) Syndrome is a derogatory term for people or organizations that insist on coming up with their own solutions to the problems they face rather than using other, commonly available solutions. I think software engineers tend to abuse this notion, though, and end up creating poorer-quality products as a result.

This stems in large part from the conflation of "Not Invented Here" with "Not Implemented Here." If Apple had reasoned that somebody else already invented the phone, that plenty of phones were already on the market, and therefore decided not to create the iPhone, they'd obviously be worse off for it. But their own implementation of a phone was a massive success. So was NIH Syndrome a bad thing in their case?

If we take the case against NIH to its logical conclusion, Apple never would have created the iPhone, Linus Torvalds never would have created Linux, and Google wouldn't have created Gmail. After all, according to the NIH principle, they re-invented phones, operating systems, and mail services. I'd argue that they didn't re-invent them, though; they re-implemented them in a way that was superior (by at least some metric) to the other implementations available at the time.

Joel Spolsky makes a plausible case that an organization should build its core business functions itself, but that it's OK to outsource technology or business functions outside its core competency. I'd go much further and say that it's perfectly fine for engineers to create their own implementation of an already viable solution even when it falls outside their core business. I think it's desirable, even honorable, for a company or product to be as self-sufficient and independent as possible, and often the embrace of NIH brings an organization an entirely new line of business.

In the case of Apple, mobile devices were certainly outside their core business, yet they embraced NIH Syndrome and created a whole new core business: they built a better implementation of a phone. Had they strictly opposed NIH and insisted on using an existing third-party product because phones were outside their core competency of "computers," they'd be worse off for it.

A common tendency amongst software engineers is to insist on code re-use, which usually takes the form of third-party dependencies. I think that much of the time, code re-use is a clay idol. You inherit functionality, sure, but you also inherit the vulnerabilities, the bugs, and maybe even the team dynamics.

So, Mr. Smarty Pants, can you do better?

Maybe! It's premature to automatically assume that you can't do better and jump too quickly to gluing together others' solutions. And if you do use a third-party product, library, whatever, you need to think long and hard about the implications and whether or not you understand it. The Python and NPM ecosystems are rife with layers upon layers of dependencies, often broken, often vulnerable, and often providing functionality an average engineer could put together in a short amount of time (looking at you.)
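To make that concrete, here's a minimal sketch in Python (the function and its name are invented for illustration) of the sort of micro-utility that routinely ends up as a dependency:

```python
# A sketch of the kind of "utility" that often ships as a third-party
# package, yet takes minutes to write and test yourself.
def left_pad(text: str, width: int, fill: str = " ") -> str:
    """Pad `text` on the left with `fill` until it is `width` characters."""
    if len(fill) != 1:
        raise ValueError("fill must be a single character")
    return fill * max(0, width - len(text)) + text

assert left_pad("42", 5, "0") == "00042"
assert left_pad("hello", 3) == "hello"  # already wide enough; unchanged
# (Python's built-in str.rjust already does this, which is rather the point.)
```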

DIY has its costs, but third-party solutions have costs too. You have to get familiar with their APIs, and you become dependent on their documentation, their maintainers, and probably some byzantine fetch/build process that you have to integrate with your own. TANSTAAFL and all that. Cliché, but true.

"Don't roll your own crypto" is a common refrain. This is where "not invented here" vs "not implemented here" are often conflated. Is it a probably (I would never say "always") a poor decision to invent your own new crypto algorithm? Yes, probably. Almost certainly. Is it a poor idea to implement a well-known crypto algorithm on your own? Here I'm going to equivocate.

Can you do worse than OpenSSL? Sure. Could you do better? Definitely! Companies and orgs do this all the time: we've got LibreSSL, Microsoft's CryptoAPI, Apple's CryptoKit, BearSSL, BoringSSL, GnuTLS, Mozilla's NSS, and so on. If it's OK for them to re-invent the wheel, why is it wrong for you?

I'd caution readers that there's more to crypto than just an algorithm implementation; safely managing keys in memory and similar concerns are difficult problems that third-party libraries have probably already solved. But maybe it's worth exploring a bit and making an educated decision. In the process you'd certainly learn something; you might even decide that by the time you're done you have a superior solution that's a whole new line of business. Maybe your solution will cut your org's compute costs. Maybe you've just dodged the next Heartbleed-level scandal. Then again, maybe your solution introduces a whole bunch of _new_ vulnerabilities, performs terribly, and is more difficult to use than its competitors. My point is, don't automatically assume that a third-party solution is going to be superior to DIY just because it wasn't invented here.
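Here's one such subtlety, sketched in Python, to make the "more to crypto than the algorithm" point concrete. Verifying a MAC tag with a naive equality check can leak timing information, which is exactly the kind of detail a from-scratch implementation has to get right:

```python
import hmac

# Naive comparison: `==` typically bails out at the first mismatched byte,
# so an attacker who can time many verification attempts may recover a
# secret tag byte by byte.
def naive_verify(expected_tag: bytes, received_tag: bytes) -> bool:
    return expected_tag == received_tag

# The standard library's constant-time comparison closes that side channel.
def safer_verify(expected_tag: bytes, received_tag: bytes) -> bool:
    return hmac.compare_digest(expected_tag, received_tag)
```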

(Of course, it's entirely possible to use NIH crypto improperly too.)

For a lot of problems, it's entirely plausible that the "juice isn't worth the squeeze." You probably don't need a from-scratch operating system, database, or crypto library for your line-of-business app. This is basically Joel Spolsky's argument, and it's hard to disagree. But there are counterexamples too. Look at Log4j/Log4Shell: a bunch of companies so hated the idea of NIH Syndrome that they (in some cases blindly) relied upon a third-party package with JNDI lookups and remote-code-execution potential when all many of them needed was to spit out some lines to disk. Rolling their own logging library may well have been the better option.
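For the sake of argument, here's a minimal sketch of what "spitting out some lines to disk" can look like. The names and format are invented, and it has none of Log4j's features, but it also has none of Log4j's JNDI attack surface:

```python
import datetime
import threading

# A deliberately tiny roll-your-own logger: timestamps, levels, one file.
# No network lookups, no JNDI, no evaluation of untrusted input.
class FileLogger:
    def __init__(self, path: str):
        self._file = open(path, "a", buffering=1)  # line-buffered text file
        self._lock = threading.Lock()              # safe across threads

    def log(self, level: str, message: str) -> None:
        timestamp = datetime.datetime.now().isoformat(timespec="seconds")
        with self._lock:
            self._file.write(f"{timestamp} [{level}] {message}\n")

    def info(self, message: str) -> None:
        self.log("INFO", message)

    def error(self, message: str) -> None:
        self.log("ERROR", message)

logger = FileLogger("app.log")
logger.info("service started")
logger.error("something went wrong")
```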

Furthermore, I find that in software there's often an inverse correlation between the number of third-party dependencies a product uses and its quality. If I go to a project on GitHub and see a dozen third-party libraries that need to be downloaded or built first, I know I'm in for a rough time. Those projects often mix varied build systems that interact in unpredictable ways, and when I look at the log files, it's frequently the dependencies that are triggering the errors and unhandled exceptions.

Using NIH software has other costs: threading models may differ, or it might rely on a design pattern that just doesn't fit the design of your software, causing so many problems that it would have been easier to DIY. It's not uncommon to see software that relies so heavily on a third-party library that the library starts spreading its tentacles throughout the code. Even worse is when your third-party library is written in a completely different language, or even a different language ecosystem. One product I've worked with (I won't name names) uses no fewer than three interpreters/VMs in addition to native code: the Python interpreter, the JVM, and Erlang's BEAM VM. Would you expect this to be a bloat-free, performant, easy-to-deploy solution? Of course not. But hey, at least they didn't have "NIH Syndrome."

If you find yourself struggling with something wonky like calling Java code from the .NET Framework, writing complex serialization/deserialization code or FFI bindings, or designing elaborate message queues to connect various third-party components, it may be time to roll up your sleeves and just write the dang thing yourself. Sure, it's "NIH," but by the time you're finished it may actually have been less costly.
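As a hypothetical sketch of that trade-off (the jar, its JSON protocol, and the checksum are all invented for illustration), compare the glue code with the DIY version:

```python
import json
import subprocess

# The glue route: shelling out to a hypothetical Java tool and round-tripping
# JSON just to reuse one small function. Now you depend on a JVM, a jar, and
# a serialization contract.
def checksum_via_java(data: str) -> int:
    proc = subprocess.run(
        ["java", "-jar", "checksum-tool.jar"],  # hypothetical third-party jar
        input=json.dumps({"payload": data}),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)["checksum"]

# The DIY route: the same (made-up) checksum in a few lines of native Python.
def checksum_native(data: str) -> int:
    total = 0
    for byte in data.encode("utf-8"):
        total = (total + byte) % 65536
    return total
```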

I don't want this article to be construed as "you have to build everything yourself." There are good solutions out there, and they deserve consideration! But don't fear accusations of "NIH Syndrome" when you have carefully weighed the benefits and costs of building your own implementation of some necessary product or component and decided it was worth it to DIY.