Here’s the scenario: Your drug discovery program revolves around a particular data type that needs to get generated in the lab and analyzed by your data scientists. But as you’ve moved from a single drug program to many, and from one data scientist to a team, things haven’t been working as well as they used to. There are delays, communication breakdowns and tangible frustration.
In my last few blog posts, I described an approach to building processes, tools and shared mental models that can address problems like this one. In the next few weekly posts of this newsletter, I want to go through a more detailed example to see how this process might play out in practice. In today’s post, I’ll give an overview of the problem. You’ll have to wait until next week to start seeing the solution.
To make things a bit more tangible, I’ll choose mass spectrometry as the data type, but you can substitute your favorite assay and it will be pretty much the same. The pipeline will involve the following steps:
1. A biologist in one of your many drug programs prepares a cell sample based on one of an evolving set of a dozen or so protocols.
2. The sample gets handed to someone on the mass spec team who follows a more standardized protocol to pass it to the instrument that generates the raw data.
3. The raw data goes to someone on the bioinformatics team who runs a pipeline based on how the sample was prepared and what question is being asked.
4. The output of the bioinformatics pipeline goes to a data scientist who runs a mostly custom analysis to answer the biologist’s question.
In an ideal world, the bioinformatics pipeline in step 3 would run automatically as soon as the raw data is ready, only requiring a bioinformatician to review and approve the quality report at the end. The analysis in step 4 isn’t quite as standardized, but there are a number of initial steps that are the same each time and could be automated as well.
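One way to picture that ideal: if the prep metadata traveled with the raw data, the pipeline parameters could be resolved automatically instead of by tracking people down. Here's a minimal sketch of that idea; all names, protocols, and parameters are hypothetical, not any real mass spec tooling.

```python
from dataclasses import dataclass

@dataclass
class RawData:
    """Instrument output plus the context it was generated in."""
    sample_id: str
    prep_protocol: str   # which of the ~dozen prep protocols was used
    question: str        # what the biologist wants answered
    path: str            # location of the raw instrument file

# Illustrative mapping from (prep protocol, question) to pipeline settings.
PIPELINE_PARAMS = {
    ("protocol_A", "protein_id"): {"search_engine": "engine_x", "fdr": 0.01},
    ("protocol_B", "quantification"): {"search_engine": "engine_y", "fdr": 0.05},
}

def dispatch(data: RawData) -> dict:
    """Pick pipeline parameters from metadata attached to the sample,
    so no one has to email the biologist or the mass spec team."""
    key = (data.prep_protocol, data.question)
    if key not in PIPELINE_PARAMS:
        raise ValueError(f"No pipeline configured for {key}; flag for human review")
    return PIPELINE_PARAMS[key]

if __name__ == "__main__":
    raw = RawData("S-001", "protocol_A", "protein_id", "/data/S-001.raw")
    print(dispatch(raw))
```

The point isn't the code itself but the design choice it encodes: the handoff carries its own context, and a human only steps in when the lookup fails.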
Unfortunately, that’s not how it is. Instead, when the data comes out of the instrument, your bioinformatics team has to reach out to the biologist and to the mass spec team to figure out the parameters for their pipeline. Then your data scientist has to track down all three of them. The result is a lot of delays, missed opportunities and general frustration.
Getting from this frustrating mess to something resembling the ideal process is a big jump. But you can make it more manageable by breaking it into Minimum Viable Changes (MVCs). And in my next few newsletters, I’ll walk through what this might look like.
If you haven’t already, you can subscribe below to find out what comes next.