No, I'm not having a stroke. And no, I'm not using Mad Libs to write my newsletter titles. And while this week's subject line may seem like it was written by an under-trained LLM, it actually means something. In fact, I'm quite proud of it, thank you very much.
Hear me out...
As I've been working with different biotechs recently, I keep running into the question of how to organize data within a directory structure - what's the best convention for folders and subfolders so you can always find what you need. And my conclusion is that the best approach shouldn't involve directories at all.
A directory structure forces you to settle on one way to look things up: will you start with project, then instrument? Or maybe by date first? Is it the date when the experiment started, when it ended or when the data was uploaded?
These are all pieces of metadata, and it's generally considered a bad idea to stuff metadata into file names. I don't think stuffing it into folder names is any better.
What you really want is a way to look up files and datasets by any metadata in whatever order makes sense at the time you're looking for it. The right solution is a searchable table/database. Then it won't matter what the directory structure is, or even if there is one.
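To make that concrete, here is a minimal sketch of such a lookup table using Python's built-in `sqlite3` module. The table name, columns, file paths and the `find_files` helper are all made up for illustration; a real setup would have its own metadata fields and would probably live in a shared database rather than in memory.

```python
import sqlite3

# Hypothetical metadata records: (path, project, instrument, date started).
# Note that the paths themselves carry no meaning.
RECORDS = [
    ("data/0001.fastq", "projA", "sequencer", "2024-02-03"),
    ("data/0002.csv", "projA", "plate_reader", "2024-01-10"),
    ("data/0003.fastq", "projB", "sequencer", "2024-02-03"),
]

# Columns that callers are allowed to filter on (assumed for this sketch).
SEARCHABLE = {"project", "instrument", "started"}


def build_index(records):
    """Load the metadata records into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        "path TEXT PRIMARY KEY, project TEXT, instrument TEXT, started TEXT)"
    )
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", records)
    return conn


def find_files(conn, **criteria):
    """Return paths matching every given metadata value, in any combination."""
    unknown = set(criteria) - SEARCHABLE
    if unknown:
        raise ValueError(f"unknown metadata fields: {sorted(unknown)}")
    sql = "SELECT path FROM files"
    if criteria:
        # Column names come from the allow-list above; values are bound as parameters.
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in criteria)
    sql += " ORDER BY path"
    return [row[0] for row in conn.execute(sql, tuple(criteria.values()))]


conn = build_index(RECORDS)
by_project = find_files(conn, project="projA")
by_instrument_and_date = find_files(conn, instrument="sequencer", started="2024-02-03")
```

The same table answers "everything for projA" and "everything from the sequencer on a given date" without either question being baked into where the files live.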
And yet...
We keep using directory structures to organize data because it's a readily available solution - it's built into every storage system - and it's what users are used to. Building that search box is a lot of work and there's a decent chance users would still keep using folders to find things anyway.
Sound familiar? If you've been following this newsletter, you may recall me writing very similar things about Excel's role in data collection. It's why Excel is both the bane of my existence and the first tool I recommend for prototyping processes before building something better.
I think directory structures can play a similar role as the duct tape that allows us to build the first of many iterations. In fact, I don't think we have a choice in the matter.
But let's all agree that we *can* build something better. In fact, for my sanity I think we have to.
If you could use some help building that prototype and iterating it into something better, check out my consulting company, Merelogic, where I help biotech startups design processes, conventions and digital tools to break through the tech problems that come between them and the science.
This is very well said. macOS has "Tags", which come close?
We did a project and could not decide between a tag-based and a folder-based structure. Flexible, metadata-based organization is superior because it can give you back a directory or orient the display to any "directory structure" you can imagine.
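As a rough illustration of that last point, the sketch below derives a folder-like view from metadata on demand. The records, field names and the `virtual_tree` helper are hypothetical; the only idea taken from the discussion above is that the "directory structure" becomes a display choice rather than a storage decision.

```python
from pathlib import PurePosixPath

# Hypothetical files described only by their metadata.
FILES = [
    {"name": "run1.csv", "project": "projA", "instrument": "plate_reader", "date": "2024-01-10"},
    {"name": "run2.csv", "project": "projA", "instrument": "sequencer", "date": "2024-02-03"},
    {"name": "run3.csv", "project": "projB", "instrument": "sequencer", "date": "2024-02-03"},
]


def virtual_tree(files, order):
    """Build folder-like paths by nesting the metadata fields listed in `order`."""
    return sorted(
        str(PurePosixPath(*(f[key] for key in order), f["name"]))
        for f in files
    )


# Same files, two different "directory structures".
by_project_then_instrument = virtual_tree(FILES, ["project", "instrument"])
by_date_then_project = virtual_tree(FILES, ["date", "project"])
```

Changing the `order` argument is all it takes to switch from a project-first layout to a date-first one, which is exactly the choice a real directory tree forces you to make once and live with.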