This is wild. The idea of using a single embedding space for DNA, RNA and proteins feels like a step toward a universal biological language model. The workaround with the expanded alphabet is clever,kind of like forcing a multilingual AI to think in concepts rather than just words. Makes me wonder what else could be thrown into the mix.
I love how you break things down, Jesse! Curious to see what you find the most useful aspects of Evo2 -- functional genome/variant annotation?
This is wild. The idea of using a single embedding space for DNA, RNA and proteins feels like a step toward a universal biological language model. The workaround with the expanded alphabet is clever,kind of like forcing a multilingual AI to think in concepts rather than just words. Makes me wonder what else could be thrown into the mix.