With approximately 100,000 people in the US diagnosed with melanoma each year, and a total of 5 million diagnosed with a variety of other skin cancers, a lot of research has gone into improving current methods of early detection and speeding up diagnoses. Unfortunately, in addition to a highly trained pair of eyes and significant waiting time, physicians require skin biopsies to confidently diagnose a skin lesion as melanoma, the deadliest form of skin cancer. Thankfully, researchers at IBM have been studying how to take advantage of Watson’s computing capabilities, combined with recent advances in machine learning algorithms, to assist physicians in examining skin lesions. Medgadget recently had the chance to chat about this work with Noel Codella, a researcher at IBM’s T. J. Watson Research Center.
Mohammad Saleh, Medgadget: Tell us about the science behind IBM’s new work aiming to diagnose skin cancer with a “selfie”.
Noel Codella, IBM: As you’re probably aware, skin cancer in the US is a big problem. There are about 5 million new cases every year in the US alone. The most serious of these is melanoma, which accounts for about 100,000 of those instances and eventually leads to about 10,000 deaths. To really solve this problem, we need to improve technology to help reduce the number of deaths.
Right now, if a patient has some concerns, they go to a dermatologist’s office, where the physician examines the lesion with the naked eye. Even for a trained expert in melanoma detection, the expected performance of a dermatologist is about 60% accuracy. So we’re looking into ways to improve their ability to distinguish melanoma from non-melanoma, so that patients don’t undergo unnecessary procedures.
Some have started to use a device called the dermatoscope, a type of magnifying glass that both enlarges the image and eliminates surface reflectance. This device helps them magnify the lesion and visualize deeper layers of the skin so they can get a better picture of what’s going on biologically. Doctors who are specially trained to use this device achieve better levels of performance, at around 70-80% accuracy. But even with this device, diagnosis is not perfect, so physicians tend to be very conservative.
Roughly 9 excisions are performed for every one melanoma found. Catching the disease early gives patients a very high survival rate – something like 95% over the first 5 years. But if the disease starts to spread beyond the superficial layers of the skin and into the lymphatics, the survival rate drops below 60%, and even down to 15% in advanced cases.
So there are two sides to the problem: you want to prevent deaths, but you also want to prevent unnecessary surgical excisions, which can lead to scarring and patient discomfort.
Medgadget: So biopsies are typically being performed more than they’re really needed?
Codella: They’re done right now to be safe and make sure that as many disease cases as possible are caught. The cost of missing the disease is higher than the cost of excising something that turns out not to be diseased. So we’ve really been trying to help the community approach this problem from two different angles. On one side, we’ve been doing our own experimentation on using machine learning and artificial intelligence to see how well we can get a system to recognize disease. But we’ve also been working with larger organizations (particularly the International Skin Imaging Collaboration, ISIC) to engage the community to look at this problem, and to bring awareness and resources so that others can also tackle it, do their own experiments, and push the field as a whole forward.
Within our own experimentation, we’ve been working on a combination of many different types of machine learning approaches – making them work together on the task of analyzing these dermoscopy images. We published a report late last year in which we took one of the systems we had trained on some data and evaluated it on a test dataset. Within this dataset, a subset of one hundred images was also evaluated by eight top experts in the field, who looked at the images and judged which were melanoma and which were not. On those hundred images, we drew a direct comparison between our techniques and the dermatologists. From that study, we saw that our system was able to achieve 76% accuracy, while the average accuracy for the eight dermatologists on that dataset was 70.5%. So it was a promising study that showed the potential for this technology to be helpful here, and we continue to work on that angle.
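To make that kind of head-to-head evaluation concrete, here is a minimal Python sketch of scoring a classifier’s melanoma/non-melanoma calls against biopsy-confirmed labels, alongside an expert reader’s calls on the same images. All of the names, labels, and scores below are hypothetical; this is not IBM’s code or data.

```python
# Illustrative sketch of a head-to-head evaluation: score a model's
# melanoma/non-melanoma calls against ground-truth labels, alongside a
# dermatologist's reads of the same images. All values are hypothetical.
from sklearn.metrics import accuracy_score, roc_auc_score

def evaluate_reader(y_true, y_pred, name):
    """Report plain accuracy for one 'reader' (model or dermatologist)."""
    acc = accuracy_score(y_true, y_pred)
    print(f"{name}: accuracy = {acc:.1%}")
    return acc

# y_true: 1 = melanoma (confirmed by biopsy), 0 = benign
y_true       = [1, 0, 0, 1, 0, 1, 0, 0]
model_scores = [0.91, 0.20, 0.35, 0.75, 0.10, 0.60, 0.45, 0.30]  # model probabilities
model_calls  = [1 if s >= 0.5 else 0 for s in model_scores]      # thresholded decisions
expert_calls = [1, 0, 1, 1, 0, 0, 0, 0]                          # one expert's reads

evaluate_reader(y_true, model_calls, "Automated system")
evaluate_reader(y_true, expert_calls, "Dermatologist")
print(f"Model AUC = {roc_auc_score(y_true, model_scores):.2f}")
```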
An example of the subtle differences between dermoscopic images of malignant and benign skin lesions.
On the community-engagement side, we’ve been working very closely with ISIC. They’ve been putting together a large archive of images that is publicly available to the community, and they’ve worked with a wide variety of academic and industry partners to put it together. We’ve worked with them to host two public challenges at the International Symposium on Biomedical Imaging. We’ve seen some very promising participation in these challenges – over these two years, we’ve had 125 submissions from different universities and industry organizations aiming to push the technology further, better diagnose disease, and come up with new analyses to help clinicians.
Medgadget: Are these challenges for people to submit algorithms that make use of Watson?
Codella: No, these are people using their own algorithms and developing new and varied techniques to approach the same problem. So it’s really asking the community what can you come up with? How can we, together, push the field forward and improve the clinical performance of these systems? The challenges have been broken into three tasks – localize or segment a lesion to separate it from normal skin, identify some clinically interesting patterns that doctors actually look for when diagnosing a lesion, and lastly identify what type of disease this lesion is. We’ve got a lot of interest and participation, with a lot of people who haven’t previously looked at skin cancer now starting to give this topic some focus. These challenges are based on snapshots from the ISIC archive dataset, which are available to the public even after the challenges are over.
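To give a rough feel for the first of those tasks, lesion segmentation, the sketch below uses a simple classical baseline: Otsu thresholding followed by keeping the largest connected region. Real challenge entries use far more sophisticated methods, and the file name here is only a placeholder.

```python
# A deliberately simple baseline for the lesion-segmentation task: separate
# the lesion from surrounding skin with Otsu thresholding and keep the
# largest connected region. Only an illustration of the task, not a
# competitive challenge entry.
from skimage import io, color, filters, measure, morphology

def segment_lesion(image_path):
    rgb = io.imread(image_path)
    gray = color.rgb2gray(rgb)
    # Lesions are typically darker than surrounding skin, so keep pixels below the Otsu threshold.
    mask = gray < filters.threshold_otsu(gray)
    mask = morphology.remove_small_objects(mask, min_size=500)
    mask = morphology.remove_small_holes(mask, area_threshold=500)
    # Keep only the largest connected component as the lesion.
    labels = measure.label(mask)
    if labels.max() == 0:
        return mask
    largest = max(measure.regionprops(labels), key=lambda r: r.area).label
    return labels == largest

# Hypothetical file name; dermoscopy images are available from the public ISIC archive.
lesion_mask = segment_lesion("ISIC_0000000.jpg")
```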
We still see individuals and organizations downloading this data, conducting some research on it, and coming up with interesting observations and new ways of approaching the problem. It’s really very promising and we’re glad to see ISIC and IBM making this kind of impact to the field. We’re continuing to work on this problem and advance it further, and we’re also interested in looking at other diseases and new modalities.
Medgadget: Can you talk a little bit about the technology behind this and how you’re teaching Watson to do this? What kind of data are you feeding it?
Codella: That data also comes from the collaboration with ISIC. The way the system learns is interesting – we don’t rely on any single machine learning technique by itself. The system really uses a combination of many techniques and pools them together to try to find optimal combinations of many approaches that work synergistically to enhance performance on a given task. It’s a combination of deep learning, hand-crafted features, unsupervised learning, and many different techniques put together to address this challenge. We see in our studies and others that this type of technique always seems to outperform the use of single methods alone. And the optimal combination of those techniques changes depending on the task at hand.
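As a loose illustration of that idea – pooling several learners rather than relying on any single one – the toy example below stacks a few base classifiers under a simple meta-classifier. The components and fusion strategy of the actual system are not described in that detail here, so everything in this sketch is an assumption for illustration.

```python
# Toy illustration of pooling several learning approaches rather than relying
# on any single one: train a few base classifiers and let a simple
# meta-classifier (stacking) learn how to combine their outputs.
# This sketches the general idea only, not IBM's system.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in features; in practice these might be deep-network activations
# plus hand-crafted color/texture descriptors extracted from dermoscopy images.
X, y = make_classification(n_samples=600, n_features=40, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_learners = [
    ("svm", SVC(probability=True, random_state=0)),
    ("forest", RandomForestClassifier(n_estimators=200, random_state=0)),
]
ensemble = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),  # learns how to weight the base models
)
ensemble.fit(X_train, y_train)
print(f"Held-out accuracy of the fused model: {ensemble.score(X_test, y_test):.1%}")
```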
An example of lesion localization using computer vision.
Medgadget: Is this technology meant to bypass the visual inspection or an unnecessary biopsy performed by a dermatologist? Maybe you could walk us through how you envision this technology will be used and where the physician would factor in.
Codella: We’re trying to figure out the best way to help doctors do their job of finding this disease and improving care, and that is still being worked out with our partners and the community. There are a lot of ways in which the doctor’s clinical workflow could be augmented using these technologies, and it will be a group effort to figure out the best way to do that. You could draw an analogy to the way biological technologies are used today. There are many different types of blood tests that a doctor can use to try to detect diseases, but these tests are not relied on alone – they’re used in conjunction with observed symptoms, patient history, and a whole ecosystem of knowledge around a patient before a doctor makes the final decision. So we have to figure out where this technology will fit into that ecosystem to really enhance the doctor’s performance and minimize the number of unnecessary procedures.
Medgadget: Some tech enthusiasts go as far as claiming that AI will soon replace even our doctors. How would you respond to such claims, and to physicians who are resistant to the integration of AI technology, perhaps out of worry for their own role in the clinic?
Codella: What we’re doing with AI is really trying to find the best ways to augment clinicians and enable them to do their job better – catch more disease, minimize the number of unnecessary procedures, and improve care overall. There really is no intent within the technical community that I’m working with to replace doctors. Even if you imagine a perfect system that could detect every disease, you would still need someone to treat the patients. The tech community really sees the problem as how we can help doctors, not how to replace them.
Medgadget: So why do you think some physicians are particularly resistant to the integration of AI in the clinic?
Codella: I think that whenever a new technology is developed, there is some degree of resistance. Some of that resistance is founded on legitimate concerns that should be taken into consideration to make sure these systems are implemented properly. As a community, we really need to listen to one another and make sure that these issues are reviewed in the best manner they possibly can be.
Medgadget: Would you classify this technology as a diagnostic or screening device, and does it provide any indication of disease progression?
Codella: You have to realize that the technology as it stands right now is an experiment. We don’t have a product, and we don’t have plans to release a product right now. The community first needs to look at all the experiments that have been performed and decide together what the best way is to augment the clinical workflow in the future.
Medgadget: You briefly touched on the potential for other medical applications for this technology. Could you elaborate on that?
Codella: Many of the underlying algorithms come from other fields. There’s a lot of cross-domain and cross-application sharing of techniques and ideas within the computer vision community. IBM as a whole is working on projects beyond skin cancer, including work on radiology and other medical imaging. Some applications probably haven’t been discovered yet. So we encourage students to really learn about machine learning and artificial intelligence, and to get involved with the community!
Get Involved: IBM’s OutThink Melanoma Initiative in Australia