Machine learning and computer vision scientists thrive on challenges. Online competitions to create cutting-edge AI tools have become essential for driving innovation in AI, rallying researchers to solve complex problems, and providing them with the most valuable commodity needed to build their models: high-quality training data.
These challenges have led to a host of new AI tools for medicine and science. For example, the Critical Assessment of Protein Structure Prediction (CASP) competition led to the development of AlphaFold, one of the most significant contributions of machine learning to the life sciences today. At Janelia, a 2016 challenge led to the development of new methods and tools to identify neurons, synaptic clefts, and synaptic partners in electron microscopy images that are widely used in connectomics research.
“Challenges are the tool in the machine learning community and computer vision community to move things forward,” says Stephan Saalfeld, a Senior Group Leader and Head of Janelia’s Computation and Theory research area. “People who are not directly working with experimentalists and with data use this as an opportunity to develop new tools – just grab a data set and see if a new idea works – and it has been very successful across the entire community.”
Now, Janelia is turning to this winning formula to improve on existing AI tools for segmenting high-resolution volume (3D) electron microscopy images. Today, Janelia is launching the CellMap Segmentation Challenge, which will give computer scientists around the world the chance to see who can develop the next best tool for automatically identifying many different types of structures inside cells.
The challenge is co-sponsored by HHMI’s Open Science Strategy Team, which is leading an initiative to explore novel ways of sharing and engaging with research data.
“The challenge also aligns well with AI@HHMI, HHMI’s new initiative to embed AI systems throughout every stage of the scientific process,” Saalfeld says.
“The challenge is a fantastic resource to the community as a whole, it’s a fantastic recruiting tool because you get to know people who are interested in that problem space, and it’s a fantastic way to contribute to science as a whole because you are bringing people into a space that is a little bit under-explored,” he adds. “We at Janelia are pretty good at developing these tools, but there’s the rest of the world and there are other excellent groups that are interested in these topics and if they develop better methods, then we all benefit from that.”
Seeding the challenge
While AI segmentation tools exist for 2D microscopy images, there are fewer models that work well on 3D images and can pick out many different cellular organelles in many types of cells and tissues, due primarily to a lack of high-quality training data.
To remedy that, the CellMap Segmentation Challenge is providing researchers worldwide with a large collection of annotated high-resolution 3D electron microscopy images on which to build their models.
The training data includes 289 annotated electron microscopy volumes, covering 22 different cell and tissue samples from four different organisms. Experts meticulously annotated 40 organelle classes, providing one of the most diverse and comprehensive training datasets to date.
The organizers of the challenge hope that this wide range of training data will enable researchers to design artificial neural networks that can pick out many different cellular organelles across many types of cells and tissues imaged under diverse conditions. That, in turn, could lead to models that perform better on new datasets than those trained on a single dataset.
The data was generated by the CellMap Project Team, which segments and annotates high-resolution 3D electron microscopy images of various biological samples and makes them freely available to the scientific community.
“Everyone has models, but the real bottleneck is access to high-quality training data. That’s where this library comes in,” says Aubrey Weigel, Project Scientist for CellMap. “Our expertly annotated dataset provides researchers with the foundation they need to push their models forward.”
The new challenge highlights the power and promise of Janelia’s Project Teams, which were created to take on large-scale scientific challenges that are difficult to pursue in an individual lab or in a traditional academic environment, says Wyatt Korff, Janelia’s Senior Director of Project Teams.
He adds: “CellMap showed the value of applying machine learning to volume electron microscopy datasets in a couple of different cells and now the goal is: can that scale up, in terms of the size of the datasets, the number of things you can segment, the number of datasets, all of it.”