Datasets & Models
We believe in open science, and every project from the lab is released under the MIT license.
AMPLIFY
A family of efficient protein language models pre-trained on large-scale sequence data.
Coming soon
SaAMPLIFY
Structure-aligned variants of AMPLIFY, enriched with protein structural knowledge via a lightweight post-training step.
Coming soon
AMPLIFY Dataset
A curated large-scale protein sequence dataset built from UniProt, SCOP, and OAS, used to pre-train the AMPLIFY family of models.
Coming soon
Code
FLAIR Codebase
A unified codebase for pre-training, fine-tuning, and evaluating the FLAIR Lab's protein language models.
Coming soon