Cover Story

Debunking junk

Science | Researchers are showing that the old Darwinian ideas about 'junk' DNA were simplistic. Welcome to the new genome

Issue: "Reassessing the genome," Oct. 6, 2012

Adenine, thymine, guanine, cytosine. Biology teachers repeated those molecules to us in high school. We had to memorize them—although they were already embedded in our brains, literally speaking. Working in tandem, those four bases compose a four-letter code giving instructions to every living cell, ultimately sketching a blueprint for life.

The code, DNA, is microscopically wound in the elegant double helix that James Watson and Francis Crick discovered in 1953. It seemed charmingly simple: Two ribbons of chemical code, weakly attracted to one another, could be unzipped and copied (allowing a cell to divide), or transcribed to RNA, which used the instructions to make proteins. These proteins, like miniature laborers, then flew off to whatever cellular task the DNA had in mind.
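For readers who like to see the pairing rule spelled out, here is a toy sketch of the copying and transcribing described above. It is an illustration only: the function names and the six-letter snippet are made up for this example, and real cells rely on enzymes, strand direction, and proofreading that the sketch ignores.

```python
# Toy illustration only: not a model of real replication or transcription.

# Base pairing in the double helix: A pairs with T, G pairs with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(strand: str) -> str:
    """Return the partner strand, base by base, in the same left-to-right order."""
    return "".join(DNA_PAIRS[base] for base in strand)


def transcribe(coding_strand: str) -> str:
    """Return an RNA copy of a coding strand: same letters, with U in place of T."""
    return coding_strand.replace("T", "U")


example = "ATGGCC"  # hypothetical snippet, not a real gene
print(complement_strand(example))  # TACCGG
print(transcribe(example))         # AUGGCC
```

Taking the complement twice returns the original strand, which is why either ribbon can serve as a template for rebuilding the other.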

Something was odd about this system, though: Out of 3 billion bases, less than 2 percent actually coded for the proteins making up the human body. The remaining 98 percent were apparently useless: discarded byproducts, some said, of millennia of evolution. Even Francis Crick, in a paper he co-authored in 1980, dismissed much of the genome as “little better than junk.” For scientists to “hunt obsessively” for the beneficial function of such DNA, Crick argued, would be “folly.”

Yet hunt they did. And now, 32 years later, they’ve painstakingly ferreted out a discovery revolutionizing our understanding of the genome: At least 80 percent of human DNA, and perhaps more, is biochemically active. With the help of new DNA sequencing technology, researchers have identified functions for millions of DNA segments previously dismissed as junk. Their findings, announced in a raft of 30 scientific papers released Sept. 5, reveal a genetic code breathtakingly more complex than we ever imagined, and one that challenges Darwinism.

The papers result from a nine-year project called the “Encyclopedia of DNA Elements,” or ENCODE for short. Following up on the Human Genome Project (which announced a first draft of the human genome in 2000), ENCODE involved 442 researchers from the United States, United Kingdom, Japan, Spain, and elsewhere. They discovered that supposedly inactive regions of DNA actually contain some 4 million “switches” that turn genes on or off and control how cells behave. What’s more, these genetic switches could be the key to understanding, and perhaps curing, diseases ranging from lupus to Alzheimer’s to autism to diabetes.

For the ENCODE scientists, the task of finding these switches was colossal. They performed 1,649 experiments on 147 types of human cells. They endured 675 teleconference calls between 2008 and this June to discuss their findings. They produced 15 trillion bytes of data—enough to fill 3,200 DVDs. The entire project cost U.S. taxpayers $185 million.
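The DVD comparison is easy to check with a few lines of arithmetic. This is a rough sketch: the byte total and disc count come from the figures above, while the 4.7-gigabyte capacity of a single-layer DVD is an assumption added here, not something the researchers reported.

```python
TOTAL_BYTES = 15 * 10**12    # "15 trillion bytes", from the article
DVD_COUNT = 3_200            # from the article
DVD_CAPACITY = 4.7 * 10**9   # assumption: nominal single-layer DVD, 4.7 GB

# Capacity per disc implied by the article's two figures.
bytes_per_dvd = TOTAL_BYTES / DVD_COUNT

# Discs needed if each one holds the assumed 4.7 GB.
dvds_needed = TOTAL_BYTES / DVD_CAPACITY

print(f"{bytes_per_dvd / 1e9:.2f} GB per DVD implied by the article")
print(f"{dvds_needed:.0f} DVDs needed at 4.7 GB each")
```

Under that assumption the two numbers agree to within about 10 discs, so "3,200 DVDs" is a fair rounding.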

For all that, we still know little. The lead coordinator for ENCODE, Ewan Birney of the UK-based European Bioinformatics Institute, estimates we’ve uncovered about one-tenth of the human genome’s secrets. “I get this strong feeling that previously I was ignorant of my own ignorance, and now I understand my ignorance,” he joked to Scientific American. “It’s slightly depressing as you realize how ignorant you are. But this is progress.”

Birney suggested an updated description of the genome: A “jungle.”

It’s a jungle because the genome, we now realize, contains not one but several layers of information, several dimensions of complexity. The newly catalogued regulatory switches—all 4 million of them—are one of the dimensions. These stretches of DNA modify how a cell, and ultimately the body, reads other DNA segments. The switches can turn off certain genes, or slow down the production of a particular protein. Researchers knew such switches existed before ENCODE began, but didn’t realize how many.

John Stamatoyannopoulos, a University of Washington researcher involved with the project, told me the DNA switches are like long sentences that passing proteins know how to read: “When all the words in the sentence fill up with these proteins, the switch turns on, and then it goes and reaches over and turns on the gene that it’s supposed to control.”

Stamatoyannopoulos was part of a team studying how the gene switches might be affecting diseases. Past studies have focused on cataloguing spots where the DNA of healthy people slightly differs from that of people with diseases. Such catalogues enabled researchers to identify genetic markers associated with an increased risk of developing heart disease, celiac disease, asthma, or other illnesses. 

The perplexing thing was that almost all the markers in those studies pointed to regions of DNA that don’t code for proteins—regions that appeared to be inactive. Stamatoyannopoulos’ team found that most of the disease markers actually land in regions of the genome now known to contain gene switches. One group of switches, for instance, is associated with Crohn’s disease, a digestive disorder.
