In the past 40 years, there has been roughly a million-fold improvement in semiconductor technology. “We have witnessed essentially the same improvement in DNA sequencing in just ten years,” explained Andreas Sundquist, PhD, CEO and co-founder of DNAnexus (Mountain View, CA), in his talk at FutureMed. There has been a 100,000-fold improvement in DNA sequencing throughput in only eight years, and nearly a million-fold reduction in the price of sequencing in ten years. “Today, it costs more to manage and analyze the data than it does to produce the data,” he said.
Because sequencing is improving at such a steep exponential rate, sweeping adoption of the technology is imminent. “By 2020, every single one of you will have your DNA sequenced,” Sundquist said to those in attendance at FutureMed.
Sundquist became a computational genomicist by way of computer science, a field in which he holds a PhD from Stanford. “I came into Stanford thinking I was going to do computational architecture,” he said. “But I very quickly saw what was happening in [genomics] and noticed that there are some amazing opportunities here.”
Nevertheless, several challenges remain, including the massive scale of the data, infrastructural issues, and systemic hurdles, Sundquist said.
Data Scale: By 2014, we are likely to have sequenced one million human genomes. That amounts to around one exabyte (one quintillion bytes) of data that will need to be stored somewhere. “This is really equivalent to what some of the biggest data powerhouses in Silicon Valley are working with today,” Sundquist explained.
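A quick back-of-the-envelope check shows what the two figures quoted above imply per genome. This is only a sketch: the per-genome average is derived from the quoted totals, not a number stated in the talk, and it assumes decimal (SI) byte units.

```python
# Back-of-the-envelope check of the storage estimate quoted in the talk.
# Assumption: decimal SI units (1 exabyte = 10**18 bytes).
EXABYTE = 10**18
TERABYTE = 10**12


def bytes_per_genome(total_bytes: int, genome_count: int) -> float:
    """Average storage per genome implied by a total and a genome count."""
    return total_bytes / genome_count


# One exabyte spread across one million genomes.
per_genome = bytes_per_genome(EXABYTE, 1_000_000)
print(f"{per_genome / TERABYTE:.1f} TB per genome")
```

In other words, the estimate works out to roughly one terabyte of stored data for each sequenced genome.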
Infrastructural Issues: Today, sequencing companies deliver genomic data to customers by shipping them a hard drive. “When you are talking about terabytes, petabytes, exabytes of data, this is a really big problem: getting that data where you need it to analyze it and work with it is actually a logistical nightmare.”
“What if instead of moving the data to where you need it, you actually do the reverse? You took the computation, the task you are going to perform, and move it to the data instead,” Sundquist said. That is technically possible, but it is very challenging because data and tools sit in silos: it is hard to accommodate the various computer cluster environments and data storage schemes in use.
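The logistics problem described above can be made concrete with a rough transfer-time estimate. The sketch below is illustrative only: the sustained link speed of 1 Gbit/s is a hypothetical assumption, not a figure from the talk, and protocol overhead is ignored.

```python
# Rough estimate of how long it takes to move data over a network link.
# ASSUMED_LINK_BPS is a hypothetical value chosen for illustration.
SECONDS_PER_DAY = 86_400
ASSUMED_LINK_BPS = 1e9  # assumed sustained throughput: 1 Gbit/s


def transfer_days(size_bytes: float, link_bits_per_second: float) -> float:
    """Days needed to move size_bytes over a link, ignoring overhead."""
    seconds = size_bytes * 8 / link_bits_per_second
    return seconds / SECONDS_PER_DAY


sizes = [("terabyte", 1e12), ("petabyte", 1e15), ("exabyte", 1e18)]
for label, size in sizes:
    days = transfer_days(size, ASSUMED_LINK_BPS)
    print(f"one {label}: {days:,.2f} days")
```

Under that assumption a terabyte moves in a couple of hours, a petabyte takes about three months, and an exabyte would take centuries, which is why shipping drives, or moving the computation to the data, becomes attractive at scale.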
Systemic Hurdles: DNA sequencing is an expert-dominated field. “It is sort of like the computer industry in the 70s, which was a homebrew world,” Sundquist said. “You had to actually put together your own computer and then write your own software to run on that computer that you built.”
Similarly, professionals in the DNA sequencing world must put together their own platforms and write their own software. This is a problem because physicians, who will help advance genomics in clinical practice, do not have the time to develop custom genomic platforms.
Another systemic problem is that the tools developed for DNA sequencing are not built to work together; interoperability is lacking at present as data formats continue to evolve.
‘Operating System’ for Genomics
Sundquist explained that he had the above challenges in mind when he started DNAnexus. “We founded the company with a mission to unlock what we believe is a huge potential for DNA-based medicine and biotechnology.” The company is seeking to do that by building a collaborative and scalable platform. The firm has attracted financial support from Google Ventures and TPG Biotech.
This year, the company plans to debut “something akin to an ‘operating system’ for genomics,” according to Sundquist. The platform will bring together data and analysis into the same environment. “We want this to ultimately be a community-driven ecosystem of genomic data, tools, and collaboration.”
Predictions
In two to five years, Sundquist expects genomics to “really take off” with a “rapidly growing ecosystem” that will attract regulatory scrutiny. In addition, broad-scale progress will be achieved in the next few years. Thousand-genome projects will likely become routine. In total, in two to five years, we will have sequenced tens of millions of human genomes, and sequencing will be “commonly used for diagnosing rare Mendelian disorders and for tumors.”
There will, however, be a fair number of ethical questions that need to be addressed. “There is always the potential for genetic discrimination. What happens when we are in a world where you can actually screen and maybe even select people based on their genomics?” Sundquist said. “If we are not cautious, we may be in a brave new world.”
If the regulatory climate is not friendly in the United States, there may be much more sequencing performed offshore.
Ten years from now, DNA sequencing will be ubiquitous, Sundquist predicted. “It is going to be a cheap, routine lab test,” and the resulting data will be merged into our medical records. Once at least one billion genomes have been sequenced, we can likely create a family tree of the world.
Whole new industries will be created around these technologies. For instance, genomic data will be used to create software that personalizes everything from clinical practice to fitness regimens.