Decoding ENCODE

When I was a science writer at Cold Spring Harbor Laboratory, back in the early 1990s, I attended the annual genome meeting and heard Sydney Brenner make his pitch for the fugu genome. The puffer fish, known to sushi aficionados and neurobiologists for its tiny gland that produces the neurotoxin tetrodotoxin, has a marvelously condensed genome, Brenner said, free of “junk DNA.”

The standard dogma of the day was that the human genome was 99% junk—an evolutionary midden-heap, strewn with the discarded wrecks of past experiments, genes that had mutated out of all functionality, mind-numbing repetitive sequences where the polymerase got stuck and churned out the nucleotide version of Nebraska, and spare parts that might be used in assembling some future genetic component. The Human Genome Project would take far longer and be far more expensive if we tried to sequence all of it, Brenner said. By studying the fugu genome, he argued, we could cut to the chase, learning about the genes without sifting through all this trash; perhaps then we could use fugu genes to identify the functional sequences in the human genome. But Brenner’s Fugu Genome Project went into science’s own scrapheap. Not long after this meeting, Craig Venter’s shotgun sequencing techniques began to accelerate the Human Genome Project, shortening the projected finish time and slashing budget projections. The Fugu Genome Project continued, but it had nothing like the impact Brenner envisioned.

Photo: Barbara McClintock, from CSHL Archives

Then I went back to grad school, studied the history of science, and began my dissertation research on Barbara McClintock, the maize geneticist who worked at Cold Spring Harbor for half a century and who won a Nobel Prize in 1983 for her discovery of mobile genetic elements. Late in her career, McClintock became deeply interested in all forms of gene regulation. Development and evolution were united in her mind by means, dimly understood, of turning genes on and off and modulating their activity. She was convinced there was a higher-order organization that controlled the genes; phenotypes resulted from patterns of gene action. Most people think that McClintock’s discovery of transposable elements was ignored or dismissed by the scientific community. I found that wasn’t true. Her fellow scientists believed that McClintock had found movable elements. They just didn’t believe those elements controlled evolution. McClintock’s late work continued her theme of gene regulation and interaction. The genome, she wrote, was a “sensitive organ of the cell,” dynamic and responsive, not a blueprint or an instruction manual.

Study of the human and other genomes has revealed that “junk DNA” is itself junk—much of that noncoding sequence is involved in gene regulation. Some of it is of the sort that McClintock envisioned; some is beyond even her imagination. The genome is now understood in terms much closer to McClintock’s mystical-sounding notion. As I wrote in my book, The Tangled Field, she deserves more credit as an early proponent of the complex, dynamic genome.

This week, the genome community has been all aflutter with news of the ENCODE project, a sophisticated genome database that catalogs patterns of gene activity. It turns out that most of Brenner’s junk DNA isn’t junk. Much of that non-gene sequence is deeply important to gene function: it’s full of regulatory sequences, what the press are calling “switches,” that determine how and when and in what context the genes act. Had the National Institutes of Health invested as heavily in fugu as Brenner had hoped, it would likely have taken much longer to reach this level of subtle understanding. As Michael Eisen points out in a nice critique of the science media machine, none of this is actually news. The junk DNA model has been out of style in science for years, and the ENCODE project has not identified “millions of switches” that regulate the genome. It would be more accurate to say that it has identified millions of potential switches; all the science of those switches still has to be done. Many of the science writers who have plumped this story simply dressed up the ENCODE project’s press release, dumbing down a lot of complex science into an easily digestible but historically misleading narrative.

Oversimplification is endemic in both science and science journalism. The former is a set of methods for making the complex simple—and the latter is a set of methods for making science simple. I did both before studying history, which is a set of methods for making the simple complex—or, rather, for decoding the complexity in what we oversimplify. Addressing subjects as massively complex and integrated as the genome—or the brain, or the immune system, or an ecological community—requires both approaches.