LSTMs could be trained in a self-supervised way, just not efficiently. Transformers let training be parallelized across the whole sequence, so you could scale up model size, which was the main breakthrough.
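roughly what I mean, as a toy numpy sketch (made-up shapes, gates and causal masking omitted, so this is an illustration, not a real implementation):

    import numpy as np

    T, d = 128, 64                     # sequence length, hidden size
    x = np.random.randn(T, d)          # one input sequence

    # RNN/LSTM-style: each step depends on the previous hidden state,
    # so the T steps have to run one after another.
    W, U = np.random.randn(d, d), np.random.randn(d, d)
    h = np.zeros(d)
    for t in range(T):                 # inherently sequential
        h = np.tanh(x[t] @ W + h @ U)

    # Attention-style: every position is computed from the whole sequence
    # at once, so the work is one big matmul that parallelizes trivially.
    Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    A = np.exp(Q @ K.T / np.sqrt(d))
    out = (A / A.sum(-1, keepdims=True)) @ V   # all T positions in parallel

the loop has a serial dependency on h; the attention path is one matmul over the whole sequence, which is what lets you saturate the hardware at training time.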
Interesting. I thought LSTMs also had constraints on their "memory" that limit their effectiveness on long text. So they would still need something like attention to get the kinds of results you get from a transformer.
they do have limits on "context window" (in an LSTM it's a state vector that can in theory carry information across unlimited context distance, but is bounded by the information capacity of that fixed-size vector; toy sketch below), but you could scale them up the same way... except then they get really hard to train.
With transformers, you just keep making them bigger and reap the benefits of the bitter lesson.
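here's a toy sketch of that capacity point (again made-up shapes, no real LSTM gating): no matter how long the input, the recurrent path has to squeeze everything into one d-dim vector, while attention keeps every past position around and can address it directly:

    import numpy as np

    d = 64
    history = np.random.randn(10_000, d)   # 10k tokens of "context"

    # LSTM-style: whatever survives of those 10k tokens must fit in h,
    # a fixed d floats, no matter how long the input gets.
    h = np.zeros(d)
    for tok in history:
        h = np.tanh(tok + h)               # new state overwrites old state

    # Attention-style: a query can pull from any of the 10k positions,
    # at the cost of keeping all of them around in memory.
    q = np.random.randn(d)
    scores = history @ q                   # one score per past position
    w = np.exp(scores - scores.max())
    ctx = (w / w.sum()) @ history          # weighted read over the whole history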