Grand Challenges in Biomedical Image Analysis

GitHub: https://github.com/DIAGNijmegen/grand-challenge

We develop and maintain grand-challenge.org, an open-source platform for hosting challenges in biomedical imaging, collecting expert annotations, and making deep learning models accessible to clinicians and researchers worldwide.

Introduction

The growing reliance on medical imaging for clinical decision-making, combined with an ageing population, has led to unprecedented demand for trained specialists to interpret medical images. As this demand cannot be fully met, physician workloads continue to increase, raising the risk of interpretation errors. Computer-aided detection and diagnosis systems, including deep learning algorithms, have been developed to help reduce this burden and support more reliable clinical decisions.

Biomedical image analysis challenges have proven highly successful in accelerating the development and validation of such algorithms. Since 2012, we have developed and maintained an open-source framework for hosting these challenges and operate an instance at grand-challenge.org.

The platform has hosted leading challenges across a wide range of medical domains, including:

  • PI-CAI – prostate cancer detection in MRI
  • TIGER – tumor-infiltrating lymphocyte assessment in breast cancer pathology
  • LUNA25 – lung nodule malignancy risk estimation in CT scans

Today, the platform is integral to our research workflow, from private challenges used in education to ensuring reproducibility and sharing validated algorithms.

Grand Challenge currently supports more than 120,000 users and hosts over 390 public and private challenges.

Features of the Platform

Challenges

Researchers can use the framework to organize biomedical (imaging) challenges with fully customizable content. Organizers define challenge goals, datasets, rules, and timelines through dedicated web pages. Public training datasets are typically hosted externally (for example on Zenodo or the AWS Open Data Registry), while private test datasets are securely uploaded to the platform.

Organizers also provide an evaluation method in the form of a Docker container. This container evaluates the predictions generated by participants' algorithms on the test dataset, producing standardized performance metrics that are displayed on a live leaderboard.

Participants download the public dataset, train their algorithms locally, and submit their solutions as Docker containers. Each submission is automatically executed on the private test dataset, evaluated using the organizer’s evaluation method, and benchmarked against other entries.
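
As a rough illustration, an evaluation container's entry point could be structured like the sketch below. The /input and /output paths, the predictions.json layout, and the Dice metric are illustrative assumptions rather than a fixed platform interface; each challenge defines its own inputs, reference standard, and metrics.

```python
# Minimal sketch of an evaluation container entry point.
# Paths, file names, and the metric are assumptions for illustration only.
import json
from pathlib import Path

INPUT_DIR = Path("/input")    # assumption: participant predictions are mounted here
OUTPUT_DIR = Path("/output")  # assumption: metrics.json is expected here
GROUND_TRUTH = Path("/opt/ground_truth/reference.json")  # hypothetical, baked into the image


def dice(pred: set, truth: set) -> float:
    """Dice overlap between two sets of positive items (e.g. voxel or lesion IDs)."""
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def main() -> None:
    predictions = json.loads((INPUT_DIR / "predictions.json").read_text())
    reference = json.loads(GROUND_TRUTH.read_text())

    # Compute one metric per case, then aggregate for the leaderboard.
    per_case = {
        case_id: dice(set(predictions.get(case_id, [])), set(truth))
        for case_id, truth in reference.items()
    }
    metrics = {
        "case": per_case,
        "aggregates": {"mean_dice": sum(per_case.values()) / len(per_case)},
    }
    (OUTPUT_DIR / "metrics.json").write_text(json.dumps(metrics, indent=2))


if __name__ == "__main__":
    main()
```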

Reader Studies

Grand Challenge also provides a powerful reader study module for collecting expert annotations and assessments. Reader studies allow multiple readers to answer predefined questions across structured case collections. Images can be uploaded in many formats, including DICOM, MHA, NIfTI, PDF, and text, and can be displayed across multiple synchronized viewports.

Reader progress is tracked in real time, and results can be exported in CSV format. A broad range of question types is supported, including text input and multiple choice, with configurable widgets and custom validation options. The platform also supports extensive annotation types such as bounding boxes, polygon annotations, and binary or multi-label segmentations.

Editors can upload ground-truth annotations, enabling automated performance monitoring of readers and quantitative analysis of question and case difficulty. This makes reader studies suitable not only for research, but also for education and training.
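
As an illustration of the kind of downstream analysis this enables, the sketch below computes per-reader accuracy from an exported answers file. The column names (reader, answer, ground_truth) are hypothetical; the actual CSV schema is defined by the platform's export.

```python
# Sketch: per-reader accuracy from an exported reader-study CSV.
# Column names (reader, answer, ground_truth) are assumptions, not the
# platform's actual export schema.
import csv
from collections import defaultdict


def reader_accuracy(csv_path: str) -> dict[str, float]:
    correct: defaultdict[str, int] = defaultdict(int)
    total: defaultdict[str, int] = defaultdict(int)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            total[row["reader"]] += 1
            if row["answer"] == row["ground_truth"]:
                correct[row["reader"]] += 1
    return {reader: correct[reader] / total[reader] for reader in total}


if __name__ == "__main__":
    for reader, accuracy in reader_accuracy("answers.csv").items():
        print(f"{reader}: {accuracy:.1%}")
```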

Algorithms

A Grand Challenge algorithm consists of executable code that extracts information from medical data such as images, reports, or multimodal inputs. Algorithms can be kept private or shared publicly with verified users.

Users with access can run algorithms on their own data by uploading cases directly to the platform. The generated outputs can be inspected interactively in the browser or downloaded for further analysis, enabling direct clinical and research validation.
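
For programmatic use, submitting a local case to an algorithm might look roughly like the sketch below, using the GCAPI Python client. The token, algorithm slug, and file path are placeholders, and the exact client method names should be verified against the GCAPI documentation, as they may differ between versions.

```python
# Sketch of submitting a case to an algorithm via the GCAPI client.
# Token, slug, path, and the upload_cases call are placeholders to verify
# against the GCAPI documentation for your installed version.
import gcapi

client = gcapi.Client(token="<your-personal-api-token>")

# Upload a local image to a (hypothetical) algorithm; the platform runs the
# algorithm and the outputs can then be inspected in the browser or via the API.
job = client.upload_cases(
    algorithm="my-algorithm-slug",
    files=["/path/to/case.mha"],
)
print(job)
```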

Platform Architecture

The grand-challenge.org project is developed on GitHub and consists of two core components:

  • The grand-challenge.org framework – a reusable platform for hosting biomedical challenges
  • CIRRUS – a medical imaging workstation platform integrated into clinical workflows

Many platform features are accessible programmatically through the REST API, with convenient access via the Python client library GCAPI.
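
As a sketch of direct REST access without the client library, the snippet below lists algorithm titles. The endpoint path, authorization scheme, and response layout are assumptions to check against the API documentation.

```python
# Sketch: querying the REST API directly with requests.
# The API root, "BEARER" token header, and paginated "results" field are
# assumptions; consult the API documentation for the exact interface.
import requests

API_ROOT = "https://grand-challenge.org/api/v1/"  # assumed versioned API root
headers = {"Authorization": "BEARER <your-personal-api-token>"}

response = requests.get(API_ROOT + "algorithms/", headers=headers, timeout=30)
response.raise_for_status()

for algorithm in response.json()["results"]:
    print(algorithm["title"])
```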

While the Grand Challenge framework is open source and licensed under Apache 2.0, CIRRUS is currently closed source.

The platform is built as a web application using Django, backed by a PostgreSQL database and a Celery task queue with Redis as message broker. We use Amazon Web Services to provide the secure and scalable infrastructure required for handling the sensitive data used on our platform, and to ensure the data can be viewed in a responsive manner regardless of a user's location.

The full application stack is distributed through Docker containers, which are automatically published to AWS Elastic Container Registry after successful test runs. A Docker Compose configuration enables both developers and administrators to deploy complete local or on-premise instances of the platform.

For detailed setup instructions, see the Grand Challenge GitHub repository and the developer documentation.

People

  • James Meakin – Lead Research Software Engineer
  • Henkjan Huisman – Professor, Diagnostic Image Analysis Group
  • Sjoerd Kerkstra – Research Software Engineer
  • Paul Konstantin Gerke – Research Software Engineer
  • Harm van Zeeland – Research Software Engineer, Diagnostic Image Analysis Group
  • Miriam Groeneveld – Product Owner
  • Anne Mickan – Research Software Engineer
  • Chris van Run – Research Software Engineer
  • Thomas Koopman