The BioImage Archive launches a collection of explorable AI-ready image datasets

by Teresa Zulueta-Coarasa

Artificial Intelligence (AI) methods have revolutionised the analysis of biological images, but their performance depends on the data the models are trained on. Therefore, to develop, benchmark, and reproduce the results of AI methods, developers need access to high-quality annotated data.

One of the missions of AI4Life is to democratise access to well-annotated datasets that are standardised to facilitate their reuse and presented in a manner that is useful to the community. As part of this effort, the BioImage Archive has launched a gallery of datasets that can be explored in-browser, without the need to download the images and annotations. Each dataset is presented in a consistent way, following community metadata standards that cover information such as the dataset's biological application, the type of annotations it contains, the licence the data are released under, and the models that have been trained on it. Furthermore, because all images are converted from their original formats into the cloud-ready file format OME-Zarr, these datasets could potentially be analysed directly in the cloud.
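
To illustrate why OME-Zarr suits cloud analysis, here is a minimal Python sketch of how a client might inspect the resolution pyramid described in an image's metadata before fetching any pixel data. The layout follows the "multiscales" block of the OME-NGFF 0.4 specification as we understand it. The metadata values, the function names `list_resolution_levels` and `coarsest_level`, and the idea of picking the coarsest level for a quick preview are illustrative assumptions, not the BioImage Archive's actual tooling.

```python
import json

# Hypothetical "multiscales" metadata, as an OME-Zarr (OME-NGFF 0.4) image
# would store it in its .zattrs file. All values here are made up.
EXAMPLE_ZATTRS = json.loads("""
{
  "multiscales": [{
    "version": "0.4",
    "axes": [
      {"name": "c", "type": "channel"},
      {"name": "y", "type": "space", "unit": "micrometer"},
      {"name": "x", "type": "space", "unit": "micrometer"}
    ],
    "datasets": [
      {"path": "0", "coordinateTransformations": [
        {"type": "scale", "scale": [1.0, 0.5, 0.5]}]},
      {"path": "1", "coordinateTransformations": [
        {"type": "scale", "scale": [1.0, 1.0, 1.0]}]},
      {"path": "2", "coordinateTransformations": [
        {"type": "scale", "scale": [1.0, 2.0, 2.0]}]}
    ]
  }]
}
""")


def list_resolution_levels(zattrs):
    """Return each pyramid level's array path and per-axis pixel size."""
    multiscale = zattrs["multiscales"][0]
    axis_names = [axis["name"] for axis in multiscale["axes"]]
    levels = []
    for dataset in multiscale["datasets"]:
        # Each level carries a "scale" transformation giving its pixel size.
        scale = next(
            t["scale"]
            for t in dataset["coordinateTransformations"]
            if t["type"] == "scale"
        )
        levels.append(
            {"path": dataset["path"], "pixel_size": dict(zip(axis_names, scale))}
        )
    return levels


def coarsest_level(zattrs):
    """Pick the level with the largest spatial pixel size (smallest array)."""
    spatial = [
        axis["name"]
        for axis in zattrs["multiscales"][0]["axes"]
        if axis["type"] == "space"
    ]
    return max(
        list_resolution_levels(zattrs),
        key=lambda level: sum(level["pixel_size"][name] for name in spatial),
    )


if __name__ == "__main__":
    for level in list_resolution_levels(EXAMPLE_ZATTRS):
        print(level["path"], level["pixel_size"])
    print("preview level:", coarsest_level(EXAMPLE_ZATTRS)["path"])
```

Because each level is stored as independently addressable chunks, a tool could read this small metadata file first and then request only the chunks of the level it needs, rather than downloading a whole image.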

The BioImage Archive team plans to keep enriching this collection with more datasets over time, with the aim of establishing a community resource that can empower the development of new AI methods for biological image analysis.