The 3 reasons infrastructure projects fail
*** Two quick ads:
I’m working on a guide to building biotech data infrastructure that I’m calling “Unblocking Biotech Data Teams”. If you want to read an early copy of it, fill out this form and I’ll send it to you.
I’m developing a service that leads startups through the process in the guide, with the goal of having a functioning prototype system within 2-6 weeks (depending on stage and complexity). I have the bandwidth to do this with three startups in Q1. If you might want to be one of them, you can read more about it and apply here.
End of ads ***
It’s another listicle this week. It was probably a bad idea to start the year by sneering at them. I’ll probably have a few more before all is said and done…
Last week’s listicle was about the three core things that data infrastructure must allow a biotech data team to do. Given how straightforward these three things are (or seem to be), it might be surprising how often biotech startups completely miss the mark. But having seen this kind of failure first hand, and talked to lots of others who have seen it, I’ve identified three main ways that data infrastructure projects fail. The approach to building data systems that I’ll be writing about for the next few weeks is designed specifically to avoid these failure modes, so I want to get really clear on what they are.
All three stem from the fact that there are two fundamentally different levels at which you can design and understand a data system: 1) the technical level of digital components that make up the system, and 2) the conceptual, system-wide level of processes and conventions that tie the technical components together.
The technical components are what most teams think of as the system. They ensure that users can do what they need to do consistently and efficiently. The technical layer is the result of deliberate decisions about what tools to build or buy. The process layer, on the other hand, is often implicit: the result of individuals making pragmatic but isolated decisions about how to get their jobs done. The technical layer can facilitate the organization's work, but if it's not coordinated through a deliberately designed process layer, the individual functions it supports will come to naught.
So ultimately, it's the process layer that is the most fundamental part of the system. The technical components support and automate the processes, allowing users to work more efficiently and ensuring they do so consistently. But these technical components can generally be swapped in and out without fundamentally changing the process layer.
Once you recognize these levels, the three failure modes are a lot less surprising:
Failure Mode 1: Addressing only part of the problem
Often, teams design data infrastructure to address the functions that the designers are most familiar with, or that they've experienced the most problems with. Because these projects ignore large parts of the process level, the technical components that they build in partial isolation don't integrate into the organization's broader processes.
One team I worked with built a meticulous system for registering and tagging data files. But the system didn’t have a good way of grouping the files, and the tags didn’t correspond to concepts that users outside their team cared about. So even though they knew where every file was, there was no way to know what was in the system if you didn’t know what you were looking for. They had only solved part of the problem.
More often, teams build an application that addresses one step in a larger process without thinking about how it will fit into users’ broader workflows. What applications are users relying on for the adjacent steps, and will they be willing to switch applications for that one step? How will the necessary data in the other systems make it into the app?
Failure Mode 2: Relying on purely technical solutions
Even if project teams recognize the full scope of the system they need to design, they often focus the design on the technical layer, working around an implicitly defined process layer. If the existing process layer isn't well suited to your biotech's platform and science, or if it isn't consistently understood across the organization, the technical layer cannot be effective.
The classic example of this is experiment planning. Or rather, what many bench teams do is more like documenting than planning: they go into the lab and write down what they’re doing as they do it. Afterwards, the data team has to sift through the artifacts of this process to figure out the metadata they need. The data team may introduce tools to speed up this sifting process, but that can only do so much.
Instead, the overall process needs to shift from documenting to planning, with the lab team writing down what they’re going to do before they go into the lab. Then the data team can build tools to automate parts of the design process, possibly making it easier than the old documenting process. And those tools can guarantee that the metadata is in a consistent form from the beginning.
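To make the planning-first idea a bit more concrete, here is a minimal sketch in Python. The `PlannedSample` record, its field names, and the `plan_to_metadata` helper are all hypothetical, invented for illustration; a real plan would carry whatever fields your platform and science need. The only point is that when the plan is written down before the bench work, the metadata falls out of it in a consistent form instead of being reconstructed afterwards.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PlannedSample:
    """One row of an experiment plan, written BEFORE going into the lab.

    All field names here are hypothetical examples.
    """
    experiment_id: str
    sample_id: str
    cell_line: str
    treatment: str
    dose_um: float  # planned dose in micromolar


def plan_to_metadata(plan):
    """Turn a list of planned samples into metadata records keyed by sample ID.

    Raises ValueError on duplicate sample IDs, so problems surface at
    planning time rather than after the data has been generated.
    """
    metadata = {}
    for sample in plan:
        if sample.sample_id in metadata:
            raise ValueError(f"duplicate sample_id: {sample.sample_id}")
        metadata[sample.sample_id] = asdict(sample)
    return metadata


# A tiny illustrative plan with made-up values.
plan = [
    PlannedSample("EXP-001", "S1", "HEK293", "vehicle", 0.0),
    PlannedSample("EXP-001", "S2", "HEK293", "compound_A", 1.0),
]
metadata = plan_to_metadata(plan)
```

Because every record comes from the same structure, the data team no longer has to infer which columns exist or how they are spelled; that consistency comes from the process change, and the code just enforces it.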
Failure Mode 3: Jumping to a long-term solution
Because of the sense of urgency in addressing the pain and friction of working with data, as well as the desire to do things the proper way, many teams start trying to build what they consider a long-term solution right from day one. But building long-term solutions takes time. While you're focused on the technical layer, you're not addressing the process layer, so the problems you have today will only compound.
For example, if you want the bench teams to start planning instead of documenting, it would be nice to have an intuitive app that they could use. But building that app can take months. And experiments are complicated enough that even then you probably won’t capture all the flexibility they need, which means more iterations. All told, it could take close to a year before the app is ready. And that’s if you can get there at all.
Meanwhile, the whole time you’re building the app, the bench scientists are continuing to document instead of plan and your data team is continuing to spend days sifting through metadata from a lousy process.
Instead, the trick is to iterate the process and technical components in parallel. And since implementing technical components takes longer than introducing new processes, the first iteration should use the simplest possible technical implementation. In fact, you can often use the tools you already have so there’s essentially zero technical implementation. (This is why I target 2-6 weeks for the initial prototype in the System Design Program I advertised above.) This can be hard to imagine if you’re used to focusing on the technical level. But with some creativity and some practice, you’ll start to notice all sorts of opportunities for this.
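As one illustration of what a near-zero technical implementation might look like, the first iteration of an experiment plan could be a shared spreadsheet that the bench team fills in before the experiment, exported as CSV and checked with a few lines of Python. This is a sketch under assumptions: the column names and the `check_plan` function are hypothetical and not part of any particular tool.

```python
import csv
import io

# Hypothetical columns that every planned sample must fill in.
REQUIRED_COLUMNS = ["sample_id", "cell_line", "treatment"]


def check_plan(csv_text):
    """Return a list of human-readable problems found in a CSV plan."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing column: {c}" for c in missing]
    problems = []
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        for column in REQUIRED_COLUMNS:
            # Short rows yield None for missing cells; treat that as empty.
            if not (row[column] or "").strip():
                problems.append(f"row {row_number}: empty {column}")
    return problems


# Made-up plan in which the second sample is missing its cell line.
example = "sample_id,cell_line,treatment\nS1,HEK293,vehicle\nS2,,compound_A\n"
problems = check_plan(example)
```

The spreadsheet is the process change; the script is only a thin check on top of it, and it could later be swapped for a proper app without changing how the bench team plans.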
Conclusion
If you’ve been working in this space long enough, you’ve probably seen one or more of these failure modes. Maybe you’ve even been responsible for some of them. (I certainly have.) So over the next few weeks I’ll get into how you can avoid them. Stay tuned!