*** A quick note/ad: As I’ve been working with the first cohort of startups doing my System Evaluation program, I’ve realized that a big goal for many of them is to create more consistent conventions/standards across teams. So I’ve reframed the description to emphasize this, and also renamed it as a Stack Audit, which I think more accurately reflects what it is. Check out the new description here. ***
One of the biggest reasons I started caring about the stuff that I discuss on this newsletter was seeing my colleagues in comp bio and data science working at the mercy of what felt like unforced errors. They would spend most of their time tracking down, reformatting and cleaning sample sheets and plate maps. They'd get urgent requests to have slides ready by Monday morning on data that they didn't know existed until Friday afternoon. And when the data turned out to be too noisy and too small, they still felt overwhelming pressure to make it say something positive.
So, since this is post number three in a series on how to motivate biotech organizations to get better at managing data, it seems like a good theme to explore: Computational Biologists and Data Scientists are expensive and hard to find. But they spend a lot of their time doing things that either should be unnecessary (reformatting and cleaning up data) or that are unreasonably stressful (short deadlines and squeezing results out of lousy data). It seems like the organizations that hire and pay them should be interested in having them spend more time on productive things that won’t make them want to quit their jobs.
I’ve been trying to make that argument for years. In fact, that was the main theme of the guide to Unblocking Biotech Data Teams that I released a first draft of in January. And while I think most people agree in theory that this is a problem (I suspect many of you reading this have felt it first-hand), I’ve started to notice that it ends up having much less impact on actual decision making than I was expecting.
When I’m in a cynical mood, I start to understand why. And I think it’s worth calling this out. So this week, instead of writing about how to make this argument work, I want to explore why it doesn’t.
The work gets done
Computational biologists and data scientists, in general, are hard workers who want to produce results and don’t want to complain. Now, obviously this isn’t universal. There are some out there who care more about their mental health than about their jobs. But biotech seems to select for the ones who will spend their weekend making slides for the last minute urgent request using poorly formatted and fundamentally inadequate data.
And even if they complain about it afterwards (which they don’t all do) to the right people (which they don’t all do) and come with concrete, actionable suggestions to prevent this in the future (which is really hard), the fact remains that the work got done, the board members liked the slides and everyone knows it will get done the next time too.
So, this is not to say that the organization doesn’t care about the situation. Leadership wants their data scientists to be able to work on more important things. They don’t want them to rage quit after one too many late nights or long weekends working. But at the end of the day, if the work gets done they’re going to focus their time, attention and resources on the thousand other things that aren’t working.
Drawing the alternative
Now, to be clear, I’m not suggesting that biotech data teams should refuse to work on last-minute urgent requests, or on lousy data. I mean, I have suggested that in particular situations, but I don’t think it’s a universal answer. Most of the time it doesn’t work because it feels like an overreaction - the work could get done if they would just do it, right?
Instead, I think you have to make an argument for what these teams could do if they had better data and didn’t have to spend half their time cleaning it. Deeper insights? Better outcomes? Faster results? (Note: You can’t just say those things - you have to be more specific.)
Or you can make the case for other direct benefits of better data infrastructure and more organized data. This is why I like the notion of the Data Driven Biotech - it focuses on how a biotech organization that cares about data is fundamentally different. A side effect of caring about data is making life easier for data scientists. But that doesn’t have to be the reason an organization does it.
A fork in the road
Another reason I like the idea of the Data Driven Biotech is that it means organizations have to make a decision: They can keep doing things the way they always have in the past because it’s worked fine so far. Or they can recognize that the biotechs that own the future will be the ones that decide today to be more data driven.
For any biotech in that first category, data infrastructure is purely an expense. No argument will convince them otherwise, because it would go against their strategy of spending resources on what’s worked before and is working fine now.
For any biotech in that second category, data infrastructure is an investment. The question isn’t why these things matter - it’s how to efficiently and effectively allocate those investments.
So I guess what I’m realizing is that it shouldn’t be about convincing your organization to care more about data. It should be about helping them decide which bucket they fall in. If they pick data driven, you can help them figure out how to make it happen. If they pick the other one… well, that’s up to you.
I read somewhere (or was told) that in an organization it's better to be closer to profit centers (think sales) than to cost centers (think IT). Having been in the latter, that was not a fun thought.
This, it seems, is very much the same question: looking at data as a cost (infrastructure, the people associated with it, all the risks) versus looking at it as a source of revenue (which I'd say it ultimately is).
$2/min. That was in 1988. That's what everyone's time is worth. However, don't expect to be paid for it. It's a metric, that's all. Consider a meeting with twelve people, lasting an hour and a half: $2,160. Back in '88, $2 would cover coffee, a donut, and tip. Today, it's $5, without the tip. When I meet two local friends we figure it's still $2 per, and we're paying $10 rent.
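The commenter's back-of-envelope math can be sketched as follows (the function name and rate are illustrative, taken from the comment's 1988 figure):

```python
def meeting_cost(people, minutes, rate_per_person_minute=2):
    """Nominal cost of everyone's time in a meeting, in dollars,
    using the commenter's $2 per person-minute metric."""
    return people * minutes * rate_per_person_minute

# Twelve people for an hour and a half:
print(meeting_cost(12, 90))  # → 2160
```

The point of the metric isn't the exact dollar figure; it's that a dozen-person meeting burns time at a rate worth making visible.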
Work time is precious. I do a quick calculus as to whether a call, text, or email will be faster; do we need a trail, etc. Is there a subtext that warrants an f2f?
Jesse points to "unforced errors". Perhaps I repeat myself.
"Fundamentally inadequate", "lousy", and "noisy" are words that apply to restaurants as easily as to data. Well-run eateries understand scheduling, and making payroll. The value of time is not uniform. Football has different rules for the final two minutes; basketball is driven by the buzzer. Financial markets inversely value derivatives as expiration dates approach. Rideshares have surge pricing, and recently fast food has been testing the waters.
In the lab, time is not laminar. It can be inordinately viscous. Good luck to all.