You have a process, you just don't know what it is

Feb 14, 2024

The last few weeks, I've been writing about the different processes that define how biotech startups collect, organize and leverage data, with the goal of understanding the operational and technical components needed to support these processes. The idea is that the best way to get to an end-to-end system that meets your needs is to evolve these processes, incrementally and iteratively, rather than spending months developing an end-to-end system and trying to roll it out all at once.

This week, I want to introduce a system I’m calling “levels of formality” for understanding where in this evolution each process and component lives or should live. These are the levels I use in the System Evaluation program I recently rolled out to help biotech startups identify the highest priority places to invest in their data infrastructure. The scale goes from completely informal (you don’t know what the process is, but it somehow seems to get done) to completely automated (you know what the process is so well that you programmed a computer to do it for you).

This is important because it’s often difficult to deal with large jumps in the level of formality between adjacent processes. For example, if metadata collection is at one of the lower levels - Excel sheets in arbitrary formats scattered around different file systems - there’s no way you’ll be able to automate the analysis that uses that metadata. So when you look at your entire system from this perspective, you can often find bottlenecks that are upstream from where you think the problem is.
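One way to make this concrete is to write down each process, its current level, the level you want it at, and what it depends on, then look for places where a downstream target sits far above an upstream process's current level. The sketch below is only an illustration: the `Process` fields, the process names, and the `max_gap=1` threshold are assumptions of mine, and the levels are the 0 to 4 scale defined later in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    name: str
    current_level: int  # where the process is today, 0 (informal) to 4 (automated)
    target_level: int   # where you would like it to be
    upstream: list[str] = field(default_factory=list)  # processes whose output it consumes


def find_formality_gaps(processes, max_gap=1):
    """Return (upstream, downstream, gap) for every dependency where the
    downstream target exceeds the upstream current level by more than max_gap."""
    by_name = {p.name: p for p in processes}
    gaps = []
    for p in processes:
        for up_name in p.upstream:
            gap = p.target_level - by_name[up_name].current_level
            if gap > max_gap:
                gaps.append((up_name, p.name, gap))
    return gaps


# Hypothetical pipeline: the team wants automated analysis, but metadata
# collection is still ad hoc spreadsheets.
pipeline = [
    Process("metadata collection", current_level=1, target_level=1),
    Process("assay run", current_level=3, target_level=3),
    Process("analysis", current_level=2, target_level=4,
            upstream=["metadata collection", "assay run"]),
]

for upstream, downstream, gap in find_formality_gaps(pipeline):
    print(f"{upstream} is {gap} levels below the target for {downstream}")
```

In this made-up pipeline the only flagged dependency is metadata collection, which is the kind of upstream bottleneck described above.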

Now, to be clear, your goal should not be to get every single process and component to the highest, most automated level of formality. The appropriate level depends on a lot of factors, most importantly the frequency of the process. The less often you do a process, the less well you understand it and the more flexible you need to be. Lower levels of formality are more flexible, so they’re more appropriate for less frequent processes. This is basically a more fine-grained version of the four stages of infrastructure I wrote about a few months ago.

In fact, you’ll probably have different versions of the same process at different levels of formality. For example, many startups have one core assay that they run over and over, then a long tail of assays that they run only in specific situations. So it makes sense for the processes around the core assay to be much more formal than the rest. Problems arise when you expect every assay to be at the same level of formality as the core assay, but you don’t have bandwidth to do this for all the long tail assays, so you just ignore them.

Each level is defined by a combination of how consistent the process is and how structured any data that results from the process is. Here they are:

Level 0: Informal

We’re starting with a semantic/semi-philosophical question: If no one knows what the process is, do you have one? I’m going to say yes - if things get done, you have a process. It’s just informal, probably highly inconsistent, with any data captured in a completely unstructured form - emails, slide decks, Post-it notes or, more often, just in people’s heads. That’s why I’m starting the scale at Level 0 instead of 1. This is where most processes start, and if you can’t write down what the process is, this is its level.

Level 1: Manual

At this level you have some shared understanding of the process, often some regularly scheduled meetings or a habit of scheduling meetings that drive the process. But any data that’s generated is stored in ad hoc formats that aren’t machine readable - text documents, slide decks, emails, etc. For example, if a program team meets regularly to discuss upcoming experiments and maybe update a slide deck that will be presented at the next all hands, that counts as a Level 1 experiment design process. The data can also be in formats that are structured but too inconsistent to be easily machine readable. For example, if bench scientists capture plate maps/sample sheets in spreadsheets but they make up the format each time, that’s still Level 1.

Level 2: Structured

This level has slightly more consistency than Level 1, but the main difference is that data outputs are structured in a form that is machine readable, possibly with minimal manual help. If the program team updates a shared Excel file with dates and experiment parameters (in addition to the slide deck or instead of it), that’s structured, so we’re at Level 2. If the bench scientists use a template to capture plate maps or sample sheets in a consistent format, that’s Level 2.
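A rough test for whether a plate map is really at Level 2 is whether a short script can read it without anyone fixing it up first. Here is a minimal sketch of that check, assuming the template is exported as CSV for a 96-well plate; the column names (`well`, `sample_id`, `concentration_um`) are hypothetical and stand in for whatever your own template defines.

```python
import csv
import io
import re

# Hypothetical template: exactly these columns, in this order.
REQUIRED_COLUMNS = ["well", "sample_id", "concentration_um"]
WELL_PATTERN = re.compile(r"[A-H](?:[1-9]|1[0-2])")  # 96-well plate, A1 to H12


def read_plate_map(text):
    """Parse a templated plate map, raising ValueError if it deviates from the template."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != REQUIRED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    rows, seen = [], set()
    for row in reader:
        line = reader.line_num
        # DictReader marks extra fields with a None key and missing ones with a None value.
        if None in row or None in row.values():
            raise ValueError(f"line {line}: wrong number of fields")
        well = row["well"].strip().upper()
        if not WELL_PATTERN.fullmatch(well):
            raise ValueError(f"line {line}: bad well {row['well']!r}")
        if well in seen:
            raise ValueError(f"line {line}: duplicate well {well}")
        seen.add(well)
        try:
            concentration = float(row["concentration_um"])
        except ValueError:
            raise ValueError(f"line {line}: concentration is not a number") from None
        rows.append({
            "well": well,
            "sample_id": row["sample_id"].strip(),
            "concentration_um": concentration,
        })
    return rows


example = "well,sample_id,concentration_um\nA1,S-001,10\nb12,S-002,0.5\n"
for record in read_plate_map(example):
    print(record)
```

If a sheet where the scientist made up the format fails this kind of check, that is a sign the process is still at Level 1.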

Level 3: Assisted

At this level, team members use software to enforce both a consistent process and a consistent, structured output. Maybe the program team uses a LIMS to design the experiments (or something like Kaleidoscope). Maybe the bench scientists design the plate maps/sample sheets in an app specifically for that purpose. There’s still human judgement and manual sub-processes, but the overall process is constrained.

Level 4: Automated

This is the level where a computer does everything for you. So not only does the process itself need to be consistent enough to be turned into an algorithm, but the upstream processes that produce its inputs need to be extremely consistent - probably Level 3 at least. For our experiment design example, this might make sense for high-throughput screening, where a compound list from an earlier process can be automatically turned into an experiment definition. For the plate maps example, the bar is higher: the plate maps should reflect what was actually in the plate, not just what was planned. So to get that to Level 4, you need physical lab automation - robots and whatnot - then code to turn their log files into plate maps.
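To give a feel for the experiment design half of that example, here is a small sketch that turns a compound list into a planned plate layout. It is deliberately simplified and the details are my assumptions: 96-well plates, one well per compound, no controls or replicates, and made-up compound IDs. It also only produces the planned layout; the as-run plate map would still have to come from the instruments.

```python
ROWS = "ABCDEFGH"
COLUMNS = range(1, 13)


def layout_plates(compound_ids):
    """Assign each compound to the next free well, filling 96-well plates row by row."""
    wells = [f"{row}{col}" for row in ROWS for col in COLUMNS]
    layout = []
    for index, compound_id in enumerate(compound_ids):
        plate, position = divmod(index, len(wells))
        layout.append({
            "plate": plate + 1,        # plates are numbered from 1
            "well": wells[position],
            "compound_id": compound_id,
        })
    return layout


# Hypothetical compound list handed over by an upstream process.
compounds = [f"CMPD-{i:04d}" for i in range(1, 99)]
plan = layout_plates(compounds)
print(plan[0])
print(plan[-1])
```

This only works because the input is a clean, predictable list, which is why the upstream process needs to be at a high level too.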

Conclusion

Now that you’ve gone through these five levels, you can start to apply them to the different processes and components within your own organization - Which level makes sense for each one? Which level is it currently at? And which upstream components/processes are blocking your team from getting it where it needs to be? Over the next few weeks, I’ll continue to explore what these processes and components are. But in the meantime (I have to sneak in another plug) I’d love to walk your team through this process via my System Evaluation program.

Scaling Biotech
