Is your digital twin identical or fraternal?

Apr 12, 2023

I recently proposed three reasons why Biotech ML/AI projects often struggle to have a tangible impact, and the last of these was difficulty getting data and metadata from the lab quickly and consistently. This is most acute near the end of the project, when you’re trying to build a repeatable process. But it’s also the most technical of the three problems, because it relies heavily on the software you have available. So in this and the next few posts, I want to explore why this kind of software is so hard to get right. And to frame the discussion, I’ll argue that what this software really needs to do is create a “digital twin” of the lab.

But before you click unsubscribe and delete this email: yes, the term “digital twin” has become a marketing buzzword, and has been misused in ways that would make Don Draper blush. But behind all that, there’s actually a tangible and meaningful definition: A digital twin of the lab is an accurate and up-to-date database and accompanying infrastructure that records everything that has happened, is happening and will happen in the lab: inventory, experiment planning, automation, data tracking, etc.
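To make that definition a little more concrete, here is a minimal sketch of what the core of such a database might look like: an append-only log where every record says whether something is planned, in progress or completed. Everything in it (the `LabEvent` and `LabTwin` names, the fields, the `plate-001` identifier) is a made-up illustration, not a description of any real product or schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum


class Status(Enum):
    PLANNED = "planned"          # will happen
    IN_PROGRESS = "in_progress"  # is happening
    COMPLETED = "completed"      # has happened


@dataclass(frozen=True)
class LabEvent:
    """One record in the twin. All field names here are hypothetical."""
    kind: str            # e.g. "inventory", "experiment", "automation"
    subject: str         # illustrative identifier, such as a plate or sample ID
    status: Status
    timestamp: datetime
    details: dict = field(default_factory=dict)


class LabTwin:
    """Append-only log of lab events; records are added, never edited."""

    def __init__(self):
        self._events = []

    def record(self, event):
        self._events.append(event)

    def current_status(self, subject):
        """Return the most recently recorded status for a subject, or None."""
        for event in reversed(self._events):
            if event.subject == subject:
                return event.status
        return None


# Usage: the experiment is recorded when it is planned, not after it ends.
twin = LabTwin()
twin.record(LabEvent("experiment", "plate-001", Status.PLANNED,
                     datetime(2023, 4, 10, 9, 0)))
twin.record(LabEvent("experiment", "plate-001", Status.IN_PROGRESS,
                     datetime(2023, 4, 11, 9, 0)))
```

The design choice worth noticing is that the plan goes into the log before anything happens at the bench, so the record exists from the very beginning rather than being packaged up at the end.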

This is a lot of work, both to build the infrastructure and to keep it up to date. But as biotech research becomes more and more digital, all the data in this digital twin becomes essential for your data team to do its thing. The closer you can get to a digital twin, the more time you can spend doing the interesting work that you were actually hired to do.

Today, all the software that bench scientists use to collect data is essentially trying to create this digital twin, just with lousy marketing. And while these individual components may be succeeding or failing to different degrees, overall the end result is often failure:

Traditional ELNs encourage bench teams to wait until the end of the experiment to package and upload their records of what happened, rather than record it from the very beginning.
Off-the-shelf software never quite fits the shape of your lab processes, leaving gaps and inconsistencies in what it records.
Key information that these systems don’t track, or can’t record at the right time, ends up in spreadsheets on laptop hard drives, or in presentation slides, or on post-it notes.

I don’t think it has to be this way. Morally, there has to be something better, but no clear silver bullet has emerged. So in the next few posts, I want to ask the questions: What is so hard about building a digital twin? Where are the current solutions failing? And maybe I’ll even suggest a few solutions. (No promises.)

Stay tuned!

Scaling Biotech is brought to you by Merelogic. We’ll help you turn your ML prototypes into tangible impact, whether it takes a few small tweaks to how your team operates or larger changes to your tools, infrastructure and projects. If you want to explore what this might look like for your team, send me an email at jesse@merelogic.net
