by Teresa Zulueta-Coarasa, Fynn Beuttenmueller, Anna Kreshuk, Beatriz Serrano-Solano
In AI4Life, we believe that interoperability and standardisation are the backbone of a healthy AI research ecosystem, allowing data and models to be reused and combined across different research groups, institutions, and platforms. Without standards, valuable datasets and AI models often remain underutilised, difficult to reuse and reproduce, and less impactful than they could be.
One of the main goals of AI4Life has been to create and promote standards for sharing AI models and AI-ready datasets for biological images. By doing this, we aim to ensure that data and models are truly FAIR (Findable, Accessible, Interoperable, and Reusable) so they can support scientific discovery for years to come.
In January 2023, the BioImage Archive organised a workshop that brought together 45 experts from diverse backgrounds: data producers, annotators, curators, AI researchers, bioimage analysts, and software developers. Together, they defined recommendations for sharing annotated, AI-ready biological image datasets.
These recommendations are grouped under the acronym MIFA: Metadata, Incentives, Formats, and Accessibility.
The MIFA guidelines have been published in Nature Methods (https://www.nature.com/articles/s41592-025-02835-8). They are expected to help researchers more easily train and evaluate AI models across diverse biological imaging tasks and unlock the value of archived imaging data.
In addition to datasets, AI4Life also supports a model metadata standard. This standard describes how pre-trained models should be documented so that others can find, reuse, and integrate them into their work. It is openly available and registered in FAIRsharing, a trusted global resource for standards, repositories, and policies.
The model standard is implemented through the bioimageio.spec Python package, which provides a versioned metadata format for models. It works together with the bioimageio.core library, which offers utilities and adapters to make models compatible with different tools and frameworks.
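To make the idea of a versioned metadata format more concrete, here is a minimal sketch in plain Python. It does not use bioimageio.spec or bioimageio.core and should not be read as their API: the `ModelRecord` class, its fields, the `check_record` helper, and all example values are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


# Hypothetical, simplified stand-in for a versioned model description.
# The real format is defined by bioimageio.spec; these fields are illustrative only.
@dataclass
class ModelRecord:
    name: str
    format_version: str
    weights: dict = field(default_factory=dict)  # weight format -> file location


def parse_version(version: str) -> tuple:
    """Turn 'major.minor.patch' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def check_record(record: ModelRecord, supported_major_minor=(0, 5)) -> list:
    """Return a list of human-readable problems; an empty list means the record passed."""
    problems = []
    if not record.name.strip():
        problems.append("name must not be empty")
    if parse_version(record.format_version)[:2] != tuple(supported_major_minor):
        problems.append(f"unsupported format_version {record.format_version}")
    if not record.weights:
        problems.append("at least one weights entry is required")
    return problems


# Example values are made up for this sketch.
record = ModelRecord(
    name="example-nucleus-segmentation",
    format_version="0.5.3",
    weights={"pytorch_state_dict": "weights.pt"},
)
print(check_record(record))  # prints []
```

Carrying an explicit format version in every record is what lets a consumer decide whether it knows how to read the rest of the metadata before trying to load the model.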
With this approach, models can be shared in a way that is:
The BioImage Archive has developed the AI4Life Model Evaluation Platform to benchmark datasets and models more directly, building bridges between the BioImage Archive and the BioImage Model Zoo.
While pre-trained models are already very useful, they are even more powerful when bundled together with their training datasets and training code. The model metadata standard supports linking to datasets and code through corresponding metadata fields and a minimal description format for datasets and notebooks.
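As a rough illustration of how such links could work, the sketch below stores a model record that points to a dataset and a notebook by identifier, then resolves those pointers against a small in-memory registry. The field names `training_data` and `training_code`, the identifiers, and the `resolve_links` helper are hypothetical and chosen for this example; they are not taken from the published standard.

```python
# Hypothetical minimal descriptions; all field names and identifiers are illustrative.
dataset = {"type": "dataset", "id": "example-dataset", "name": "Example annotated nuclei"}
notebook = {"type": "notebook", "id": "example-notebook", "name": "Example training notebook"}
model = {
    "type": "model",
    "id": "example-model",
    "name": "Example nucleus segmentation",
    "training_data": {"id": "example-dataset"},   # link by identifier
    "training_code": {"id": "example-notebook"},  # link by identifier
}

# A registry maps identifiers to full descriptions, as a catalogue might.
registry = {entry["id"]: entry for entry in (dataset, notebook)}


def resolve_links(model_record: dict, known: dict) -> dict:
    """Return a copy of the model record with each link replaced by the full record."""
    resolved = dict(model_record)  # shallow copy, the input stays unchanged
    for key in ("training_data", "training_code"):
        link = model_record.get(key)
        if link is None:
            continue
        if link["id"] not in known:
            raise KeyError(f"{key} points to unknown id {link['id']!r}")
        resolved[key] = known[link["id"]]
    return resolved


resolved = resolve_links(model, registry)
print(resolved["training_data"]["name"])  # prints Example annotated nuclei
```

Linking by identifier rather than embedding the full description keeps the model record small and lets the same dataset be referenced by many models.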
The dataset description currently available in bioimageio.spec serves as a starting point; plans are underway to extend it with deeper integration of the MIFA guidelines. In the future, this will make programmatic access to well-described datasets even easier, enabling researchers worldwide to train, compare, and improve AI models for bioimaging.