On flexibility, consistency and schema enforcement

There's more than one way to manage data consistency, and not all are created equal.

Scaling a biotech research platform requires enough flexibility to collect data in an ever-changing environment, and enough consistency to keep the collected data usable.

Making sure your data conforms to your chosen schema is hard.

Giving yourself just enough flexibility to change the schema is harder.

Or rather, the hard part is deciding what “enough” means.

In most situations, there will be four layers where you can enforce the schema: the user, the user interface, the API and the database.

Each layer is progressively harder to change, moving you towards the consistency side of the trade-off.

Excel relies on the user to enforce the schema: they can enter any data they want anywhere in the spreadsheet, so they had better know what they’re doing.

Low-code solutions like Airtable rely on the user interface: the user tells the system how they want to enter their data, then the system makes sure they follow their own rules.

After that come the API and the database, which is where we start to get into the choice between relational and NoSQL databases.

For these last two layers, the user doesn’t get to make choices anymore: that’s the developer’s job.
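To make the difference between those two developer-owned layers concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `samples` table, the `name` and `volume_ml` fields, and the helper names are hypothetical, and SQLite stands in for whatever relational database you actually use. The same rule is enforced twice, once in API code and once as a database constraint.

```python
import sqlite3


def validate_sample(payload: dict) -> dict:
    """API-layer check (hypothetical): changing it only takes a code deploy."""
    name = payload.get("name")
    if not isinstance(name, str) or not name.strip():
        raise ValueError("name must be a non-empty string")
    volume = payload.get("volume_ml")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(volume, bool) or not isinstance(volume, (int, float)) or volume <= 0:
        raise ValueError("volume_ml must be a positive number")
    return {"name": name.strip(), "volume_ml": float(volume)}


def make_db() -> sqlite3.Connection:
    """Database-layer check: the same rule, but changing it needs a migration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE samples (
            name      TEXT NOT NULL CHECK (length(name) > 0),
            volume_ml REAL NOT NULL CHECK (volume_ml > 0)
        )
        """
    )
    return conn


def insert_sample(conn: sqlite3.Connection, row: dict) -> None:
    """Write a row; the database applies its constraints no matter who calls this."""
    conn.execute(
        "INSERT INTO samples (name, volume_ml) VALUES (?, ?)",
        (row["name"], row["volume_ml"]),
    )


if __name__ == "__main__":
    conn = make_db()
    insert_sample(conn, validate_sample({"name": "A1", "volume_ml": 2}))
    try:
        # A script that skips the API still cannot get bad data past the database.
        insert_sample(conn, {"name": "A2", "volume_ml": -1})
    except sqlite3.IntegrityError as exc:
        print("rejected by the database:", exc)
```

The API check is the more flexible of the two, because any code path that bypasses it can still write whatever it likes. The database constraint is the more consistent one, because nothing gets past it, but loosening it means migrating the schema.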

Each of these approaches will be perfect for some situations and terrible in others.

It all comes down to where you want that particular data/process to live along the flexibility/consistency trade-off.