Discussion about this post

Jacob Wert

I like how this post also touches on a big issue in data capture in the life sciences: tribal knowledge. Different scientific domains view things in slightly different ways, so while a biologist might think metadata field A means one thing, a chemist might read it as something else. There’s also a schema design problem: when the person who initially developed a schema leaves the company, whoever comes in next may not understand why it was designed the way it was.

I think that LLMs can definitely help here, but it will be interesting to see how different domain experts prime LLMs in different ways. Grounding LLMs with clearly defined rules and documentation certainly helps, but depending on how the data is captured, an out-of-the-box model might extract and organize data differently depending on which group in a company generated it.
