Adenine, thymine, guanine, cytosine. Biology teachers repeated those molecules to us in high school. We had to memorize them—although they were already embedded in our brains, literally speaking. Working in tandem, those four bases compose a four-letter code giving instructions to every living cell, ultimately sketching a blueprint for life.
The code, DNA, is microscopically wound in the elegant double helix that James Watson and Francis Crick discovered in 1953. It seemed charmingly simple: Two ribbons of chemical code, weakly attracted to one another, could be unzipped and copied (allowing a cell to divide), or transcribed to RNA, which used the instructions to make proteins. These proteins, like miniature laborers, then flew off to whatever cellular task the DNA had in mind.
Something was odd about this system, though: Out of 3 billion bases, less than 2 percent actually manufactured the proteins making up the human body. The remaining 98 percent were apparently useless—discarded byproducts, some said, of millennia of evolution. Even Francis Crick, in a paper he co-authored in 1980, dismissed much of the genome as “little better than junk.” For scientists to “hunt obsessively” for the beneficial function of such DNA, Crick argued, would be “folly.”
Yet hunt they did. And now, 32 years later, they’ve painstakingly ferreted out a discovery revolutionizing our understanding of the genome: At least 80 percent of human DNA, perhaps more, is biochemically active. With the help of new DNA sequencing technology, researchers have identified functions for millions of DNA segments previously dismissed as junk. Their findings, announced in a raft of 30 scientific papers released Sept. 5, reveal a genetic code breathtakingly more complex than we ever imagined, and one that challenges Darwinism.
The papers result from a nine-year project called the “Encyclopedia of DNA Elements,” or ENCODE for short. Following up on the Human Genome Project (which published the first draft of human DNA in 2000), ENCODE involved 442 researchers from the United States, United Kingdom, Japan, Spain, and elsewhere. They discovered that supposedly inactive regions of DNA actually contain some 4 million “switches” that turn genes on or off and control how cells behave. What’s more, these genetic switches could be the key to understanding, and perhaps curing, diseases ranging from lupus to Alzheimer’s to autism to diabetes.
For the ENCODE scientists, the task of finding these switches was colossal. They performed 1,649 experiments on 147 types of human cells. They endured 675 teleconference calls between 2008 and this June to discuss their findings. They produced 15 trillion bytes of data—enough to fill 3,200 DVDs. The entire project cost U.S. taxpayers $185 million.
For all that, we still know little. The lead coordinator for ENCODE, Ewan Birney of the UK-based European Bioinformatics Institute, estimates we’ve uncovered about one-tenth of the human genome’s secrets. “I get this strong feeling that previously I was ignorant of my own ignorance, and now I understand my ignorance,” he joked to Scientific American. “It’s slightly depressing as you realize how ignorant you are. But this is progress.”
Birney suggested an updated description of the genome: A “jungle.”
It’s a jungle because the genome, we now realize, contains not one but several layers of information, several dimensions of complexity. The newly catalogued regulatory switches—all 4 million of them—are one of the dimensions. These stretches of DNA modify how a cell, and ultimately the body, reads other DNA segments. The switches can turn off certain genes, or slow down the production of a particular protein. Researchers knew such switches existed before ENCODE began, but didn’t realize how many.
John Stamatoyannopoulos, a University of Washington researcher involved with the project, told me the DNA switches are like long sentences that passing proteins know how to read: “When all the words in the sentence fill up with these proteins, the switch turns on, and then it goes and reaches over and turns on the gene that it’s supposed to control.”
Stamatoyannopoulos was part of a team studying how the gene switches might be affecting diseases. Past studies have focused on cataloguing spots where the DNA of healthy people slightly differs from that of people with diseases. Such catalogues enabled researchers to identify genetic markers associated with an increased risk of developing heart disease, celiac disease, asthma, or other illnesses.
The perplexing thing was that almost all the markers in those studies pointed to regions of DNA that don’t code for proteins—regions that appeared to be inactive. Stamatoyannopoulos’ team found that most of the disease markers actually land in regions of the genome now known to contain gene switches. One group of switches, for instance, is associated with Crohn’s disease, a digestive disorder.
The finding suggests a new way to study, and fight, diseases: looking at the genetic switches associated with them and learning what makes them flip on or off. Cancer, immune disorders, and even schizophrenia are linked to these various gene switches.
ENCODE researchers haven’t yet learned where all these switches are, or how they might trigger diseases. But their new data provide a roadmap for future investigation.
The discoveries of ENCODE are a fulfilled prophecy in the eyes of Stephen Meyer, the director of the Center for Science and Culture at the Seattle-based Discovery Institute, the nation’s leading intelligent design (ID) think tank. Meyer said ID proponents predicted back in the 1990s that much so-called junk DNA would turn out to be functional, and “that’s exactly what’s happened.”
The debate over junk DNA has roots as far back as 1972, when some biologists first used the term junk to describe segments of the genome that didn’t encode instructions for proteins. These segments appeared inactive, just tagging along for a ride in the chromosomes while the protein-coding segments did all the work. With less than 2 percent of the human genome coding for proteins, there seemed to be an awful lot of derelict DNA hanging around.
At the time, few biologists thought noncoding DNA had an undiscovered function. Others “seized on the notion of junk DNA as evidence for Darwinian evolution and against intelligent design—since a designer would presumably not have filled our DNA with so much junk, but centuries of mutations might have,” Jonathan Wells, a senior fellow at the Center for Science and Culture, said by email. Wells documents the debate in his 2011 book, The Myth of Junk DNA.
Famed atheist Richard Dawkins, for example, wrote in his influential 1976 work, The Selfish Gene, that noncoding DNA was like a “parasite” within the genome, fulfilling an innate, evolutionary drive to survive in spite of being biologically useless. In subsequent decades, Dawkins and other neo-Darwinists raised the junk argument repeatedly: Why would a creator insert worthless code into a genome? This evolutionary talking point has persisted until today, surprisingly, even though the discoveries of the past decade have shown that less and less of the genome can accurately be described as useless.
So ingrained is the idea of junk DNA that when the ENCODE researchers announced their findings in September, they triggered an immediate academic backlash from critics who said they were “hyping” the data. “The creationists are going to love this,” complained Larry Moran, a biochemist at the University of Toronto, on his personal blog. “This is going to make my life very complicated.”
The critics’ major complaint was the “80 percent” figure: It describes how much DNA researchers found (with some extrapolation) being actively copied to RNA. But that widespread copying doesn’t prove every single piece of RNA is doing something useful in the cell, skeptics said. Much of the copying could be random—and therefore, much DNA might still be “junk” after all.
The debate was vocal enough to prompt a quasi-apology from Birney, the ENCODE organizer, who admitted on his own blog that “we could have used different terminology” to convey the magnitude of the findings. Yet, he insisted that even viewing ENCODE’s findings as conservatively as possible, we should still say at least 20 percent of the genome is actively involved in regulating genes. Some argue the data justifies a higher figure, 50 percent. Either way, the amount is a significant increase from the 2 percent to 5 percent once thought to be important.
The percentage may increase as researchers learn more. Incidentally, Birney thinks further research may show not 80 percent but 100 percent of DNA bases are biochemically active. In the meantime, he recommended his colleagues scuttle the junk DNA term.
The more we learn about the human genome, the more astounding its complexity becomes. Multiple dimensions of information encoded in DNA work in unison, offering a staggering array of combinations we barely understand, but which are crucial to the health and traits of each person.
The most basic unit of information, of course, is the adenine-thymine-guanine-cytosine sequence that forms the DNA code itself. A second layer of information involves the structures that package the DNA strand, like histones, the protein spools around which DNA is wrapped. (Chemical modifications to the histones influence how readily the DNA is read.)
Add to that the regulatory switches, constantly governing what portions of DNA should be read and what shouldn’t, dictating what proteins are produced and how genes are expressed. The regulators work differently in different types of cells—a brain cell and muscle cell wouldn’t have the same switches active, for example. (That offers clues to the mystery of how cells know whether to turn into muscle, bone, skin, or kidney.) Some switches are preprogrammed to flip on at certain points in the body’s development, while others might only respond to external stimuli.
As you might expect, the regulatory switches are often found right next to the genes they control. But sometimes they’re found hundreds of thousands of base pairs away. Which brings us to yet another layer of complexity—the three-dimensional shape of DNA.
Like a chain coiled in a bucket, DNA is tightly folded against itself, so segments (or links) that would otherwise be far apart lie close together. The precise folding of DNA means that ENCODE researchers often found regulatory switches actually touching the genes they control, even though they’re linearly distant. It’s a surprising new way to think about the importance of DNA’s shape.
Together, these interdependent mechanisms make a case against Darwinian evolution, said Georgia Purdom, a genetics researcher at the creationist organization Answers in Genesis. Evolving an existing organ into something different would require a genetic choreography of mutations: “It’s not just changing one gene. You have to change not only that gene, but the regulation of that gene, and other genes that are involved ... in that particular biological pathway.”
Some DNA stretches may still appear inactive, but Purdom thinks they could contain segments that have lost function since creation—or that are only active at certain times, like during an embryo’s development.
Meyer of the Discovery Institute said the hierarchical layers of information controlling the genome have a particular benefit: They allow for the efficient storage of vast amounts of genetic information. “The functions they perform are so strikingly similar to functions that we’ve designed to solve similar information and storage problems in high tech digital computers.”
Meyer noted we still don’t understand all of the ways cells communicate, such as during the development of an animal embryo in utero. Some information needed for the animal’s growth is apparently conveyed through the three-dimensional structure of the embryo itself.
“It’s layers within layers of complexity. That’s what’s being revealed in biology. It’s mind-boggling.”
ENCODE has given us a glimpse of the complexity. It has hacked a path into the genetic jungle, and planted a few guideposts that will accelerate future discoveries.
The United States may spend $123 million supporting the next phase of the ENCODE project, in which researchers will continue charting out DNA functions within various cell types. They’ll try to understand how gene regulation occurs over time, gaining a movie-like perspective. They’ll try to figure out how gene switches combine, telling cells what type of tissue to become, or influencing visible human traits like height and weight, aging, or diseases like Crohn’s.
“For centuries to come, biologists will be making fundamentally new discoveries about the features of living things,” predicted Wells. “But probably not if they begin by assuming that those features are junk.”