At first glance, Eli Lilly’s recent announcement of a platform hosting foundation models trained on over $1 billion worth of internal data doesn’t seem like anything special. Pharma companies often announce platforms like this for internal users, and we never get to hear whether they’re actually usable or just sit on the shelf. But this one is different because their new platform, called TuneLab, isn’t just for internal users. Surprisingly, they’re also making it available to certain biotech startups.
(A quick side note: I’ve been working with Kaleidoscope on a project that they just published. Program & Project Management for Next Generation Biotech is a long-form guide exploring how ops/planning in biotech and pharma are different from other sectors (and what you can do about it). Go check it out! (After you read this post.))
The usual story for large, data-heavy technical projects at big pharmas goes like this: sure, these companies have decades of data that was extremely expensive to collect, but it’s scattered across legacy systems, so any attempt to consolidate it into a single system or model gets bogged down in technical details, internal politics and concerns about large investments with dubious returns.
So pharmas generally end up partnering on these projects with biotech startups that have no legacy systems, less politics, and venture capital to absorb the risks of dubious outcomes. Sure, it costs more for these startups to generate all new data. But expensive is better than impossible.
This Eli Lilly announcement flips this story around. In this version, the big pharma has actually managed to get its act together and collect those decades of data into one place, in a form that can be fed into a model. And it’s the startups who get to use it.
This must’ve been both a technical feat and a huge amount of cat herding. The claim that the data cost over $1 billion to generate seems like the right order of magnitude, though it’s probably only a fraction of what Lilly has spent to collect data over the years.
This is yet more evidence that foundation models are finally providing motivation for the data organization and cleanup projects that most pharma companies have been talking about for years, then putting off.
Now, it’s not just any startups that get to use the platform and these foundation models. It will be the startups in Lilly’s existing Catalyze360 program, which is basically their venture arm. So these are companies that Lilly has invested in and hopes to one day partner with or acquire.
They do say that part of the deal is that participating startups will contribute to the models through a federated learning platform called Rhino. So that’s some motivation for Lilly to do this. But I would expect that the bigger motivation is helping their investments succeed while attracting more startups to the program.
Either way, an announcement like this wasn’t on my bingo card, so I’ll be paying close attention to see what comes of it. Will the teams within Lilly and their partner startups use this platform to make significant breakthroughs? Will other large pharmas adopt this model for their own venture programs?
Only time will tell.