Online, 24-25 January 2023
AI4Life & BioImage Archive FAIR AI Workshop
by Matthew Hartley and Teresa Zulueta-Coarasa
Artificial Intelligence (AI) and deep learning (DL) are transforming the way we analyse biological images, particularly benefitting the processing of large and heterogeneous image datasets. The development of these AI models relies on high-quality annotated images, which determine a model’s performance, robustness, and scalability. Therefore, providing open access to useful, annotated datasets that adhere to the FAIR principles (Findability, Accessibility, Interoperability and Reusability) is essential for the development, reproducibility, and reuse of AI models. However, sharing biological image AI datasets is challenging because the community lacks widely adopted standards for representing annotation data.
The BioImage Archive (BIA), EMBL-EBI’s data resource for open life sciences image data, provides general purpose deposition services for any imaging dataset accompanying a publication, as well as reference image data. As part of the AI4Life project, we want to improve the BIA’s support for image annotations as part of AI-ready datasets and to develop annotation standards for the community.
To this end, we held a virtual workshop on 24 and 25 January 2023 with 46 community experts from various backgrounds, including data generators, annotators, curators, AI researchers and software developers. The participants discussed four main topics:
Each topic was first discussed in breakout rooms, after which each group presented its conclusions to the rest of the participants. This approach resulted in lively and insightful discussions and a series of recommendations.
The immediate output from the workshop will be a white paper, co-authored by the workshop participants, summarising community recommendations. Furthermore, the BIA team is now working on transforming the workshop outcomes into metadata standards for AI datasets, and into software and tools to facilitate the deposition and sharing of AI images and annotations.
Beyond these outputs, we expect that the workshop recommendations will help annotation generators and consumers work together more effectively, facilitating the construction of benchmark collections of datasets to test model generalisation and reuse. Thank you to all the participants for their valuable contributions!