Early Detection in Lung Cancer

The Bonnie J. Addario Lung Cancer Foundation has set the audacious goal of making lung cancer a chronically managed disease by 2023. As with all cancers, the earlier lung cancer can be detected, the better the patient outcomes. When the disease is still localized in the lungs, the five-year survival rate is 55%. For tumors that have spread to other organs, the five-year survival rate drops to just 4%.

Right now only 16% of lung cancer cases are diagnosed at an early stage. Clinical researchers are studying techniques for better early screening of high-risk groups. A 3-D scan of the lungs, called a CT scan, has been shown to detect the presence of cancer more effectively than traditional x-rays. In the hands of trained radiologists, these scans can help find early signs of cancer and save lives.

However, these scans also have an extremely high false positive rate: the test often suggests that cancer is present when it is not. The risk of false positive results erodes trust in these tests and can lead to unnecessary invasive follow-up procedures, financial burden, and worry for patients.
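
To make the term concrete, here is a minimal sketch of how false positives are usually summarized from screening counts. The counts are invented purely for illustration and are not results from any study; the function name `screening_rates` is likewise hypothetical. Note that "false positive rate" formally means the share of cancer-free patients who test positive, while the share of positive tests that turn out to be wrong is the false discovery rate; both matter to a patient facing follow-up.

```python
def screening_rates(tp, fp, tn, fn):
    """Summarize a screening test from confusion-matrix counts.

    tp: cancers correctly flagged      fp: healthy patients flagged
    tn: healthy patients cleared       fn: cancers missed
    """
    return {
        # Share of cancer-free patients who still get a positive result.
        "false_positive_rate": fp / (fp + tn),
        # Share of positive results that turn out not to be cancer.
        "false_discovery_rate": fp / (fp + tp),
        # Share of actual cancers the test catches.
        "sensitivity": tp / (tp + fn),
    }


# Hypothetical counts for 1,000 screened patients (illustration only).
rates = screening_rates(tp=9, fp=231, tn=759, fn=1)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With counts like these, a test can catch most cancers and still send many healthy patients to follow-up, which is the problem described above.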

Enter Artificial Intelligence

Recent advances in machine learning are giving computers the ability to analyze data from images and find patterns that humans might miss. Some of the same approaches that help auto-tag your friends on Facebook or find cats in pictures on the web are being applied to CT scans to separate the true positive cancer results from the false ones.

In early 2017, data scientists from around the world came together in the Data Science Bowl presented by Booz Allen Hamilton and Kaggle to build open machine learning algorithms for early lung cancer detection.

Using CT scans already labeled by teams of radiologists, participants developed statistical models that tried to classify whether a scan contains a cancerous lesion or not. The algorithms that were best at diagnosing new images won the challenge and were released under an open source license. While still at an early research stage, these approaches hold significant promise for improving the reliability of CT scans.

A New Kind of Data Challenge

But what if we could carry these early advances forward today so that they are useful to real clinicians working on the front lines of lung cancer screening? That’s where this challenge comes in.

There is a daunting chasm between research algorithms and clinical practice. We want to bridge this gap by developing an end-to-end application, as a community, that connects the predictive power of machine learning with functional software tested against errors and a clean user interface focused on clinical use.

This application will focus on three big challenges that can help radiologists detect lung cancer in practice:

Analyze CT scan images to detect concerning nodules, distinguish them from background tissue, and pinpoint their location.

Use what we’ve seen from nodules in the past to predict whether the identified nodules are cancerous or benign.

Find the boundaries of nodules and create automatic measures to help radiologists refine and build out the computer-aided diagnosis.
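
As a rough illustration of how the three tasks above fit together, here is a toy sketch on a synthetic 3-D volume. Everything in it is an assumption made for illustration: the function names, the fixed intensity threshold, and especially `toy_risk_score`, which is a placeholder formula and not a clinical or trained model. Real CT data and the approaches used in this project are far more involved.

```python
import numpy as np


def make_synthetic_volume(shape=(32, 32, 32), center=(16, 12, 20), radius=4):
    """Build a toy scan: dark background with one bright sphere as the 'nodule'."""
    zz, yy, xx = np.indices(shape)
    dist2 = (zz - center[0]) ** 2 + (yy - center[1]) ** 2 + (xx - center[2]) ** 2
    volume = np.zeros(shape, dtype=np.float32)
    volume[dist2 <= radius ** 2] = 1.0
    return volume


def segment(volume, threshold=0.5):
    """Task 3 (boundaries): mark voxels brighter than a fixed threshold."""
    return volume > threshold


def locate(mask):
    """Task 1 (location): centroid of the marked voxels, in (z, y, x) voxel units."""
    return np.argwhere(mask).mean(axis=0)


def measure(mask, voxel_mm=(1.0, 1.0, 1.0)):
    """Automatic measures: volume and bounding-box extent in millimetres."""
    coords = np.argwhere(mask)
    extent_vox = coords.max(axis=0) - coords.min(axis=0) + 1
    spacing = np.asarray(voxel_mm, dtype=float)
    return {
        "volume_mm3": float(mask.sum() * spacing.prod()),
        "extent_mm": extent_vox * spacing,
    }


def toy_risk_score(volume_mm3, scale_mm3=500.0):
    """Task 2 (classification) PLACEHOLDER: maps size to a 0..1 score.

    Hypothetical formula for illustration only; a real system would use a
    model trained on labeled nodules.
    """
    return volume_mm3 / (volume_mm3 + scale_mm3)


volume = make_synthetic_volume()
mask = segment(volume)
centroid = locate(mask)
measures = measure(mask)
score = toy_risk_score(measures["volume_mm3"])

print("centroid (z, y, x):", centroid)
print("extent (mm):", measures["extent_mm"])
print("toy score:", round(score, 3))
```

The point of the sketch is the shape of the pipeline: a mask of the nodule feeds both the location and the measurements, and those in turn feed whatever model estimates malignancy.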

Learn more about the technical details of these challenges, the available data, and the path to contribute in the Challenge Description.

Together we can use technology to help clinicians catch lung cancer early enough to manage its impact on patients and save lives.