Skepticism and Science
As many of you who read science news will know, the ENCODE data set is currently lighting up the world of molecular biology. For the first notable time since the Human Genome Project, loads of genomic data are accessible to anyone with an internet connection; seriously, there’s even an iPhone/iPad app. Furthermore, where the Human Genome Project said, “Here’s the human genome,” ENCODE has now responded, “Here’s what the human genome does” - or at least has moved us closer to answering that question. Where the Human Genome Project is a dictionary, ENCODE is an encyclopedia - and a very valuable one at that. No doubt about it, ENCODE is awesome.
The press coverage of ENCODE, however, has not been up to par. While science journalism can be hit and miss at the best of times, ENCODE seems to have caused a giant “miss” among several major, highly-regarded news sources. (New York Times, I’m looking at you.) While ENCODE has been lauded by the press for debunking “junk DNA”, some of the claims made about ENCODE’s research (however cool it may be) are just not true. In fact, I would argue that by misrepresenting the facts in news stories, journalism has clouded the amazing contribution ENCODE has made to molecular biology - one that no scientist will contest, as it was a massive, 10-year international project involving 442 scientists that has spawned 30 research papers in different journals - basically, some seriously hard-core research.
As a case in point, several sources (see below) have attributed the revelation that “junk DNA isn’t actually junk” to the ENCODE project. In fact, scientists have known for decades that protein-coding genes are regulated by non-coding DNA sequences - “gene switches” - found in the “junk DNA”, or non-protein-coding sequences of our genome. That’s uncontested, and there are plenty of reviews on the subject (as in this review, and its references) that were written long before ENCODE’s publication.
As Mike White, Ph.D., a member of the Department of Genetics and the Center for Genome Sciences and Systems Biology at the Washington University School of Medicine, and a frequent science blogger, says, “ENCODE is significant because they’ve provided a very useful data set, and not because they’ve a) shown that non-coding DNA is important (we knew that), or b) most of the genome has phenotypically important regulatory function (it does not) or c) that most of the genome is evolutionarily conserved (not true either). What they have shown is that much of the genome is covered by introns, and it’s hard to find biochemically inert DNA, which those of us who have tried to generate random, ‘neutral’ DNA sequences (for, say, spacers in synthetic promoter experiments) will agree with.”
T. Ryan Gregory, an evolutionary biologist at the University of Guelph in Canada, has compiled a list of news sources covering the ENCODE beat under the title “The ENCODE media hype machine.” Let’s have a look at just a few offenders.
The New York Times is perhaps most disappointing, confusing activity with necessity:
The human genome is packed with at least four million gene switches that reside in bits of DNA that were once dismissed as “junk” but that turn out to play critical roles in controlling how cells, organs, and other tissues behave. The discovery, considered a major medical and scientific breakthrough, has enormous implications for human health because many complex diseases appear to be caused by tiny changes in hundreds of gene switches…
As scientists delved into the “junk” - parts of the DNA that are not actual genes containing instructions for proteins - they discovered a complex system that controls genes. At least 80 percent of this DNA is active and needed.
According to ENCODE, at least 80% of this DNA is biochemically active - not “needed”. Furthermore, ENCODE did not discover a complex system that controls genes; they discovered exactly how complex the network that scientists already knew existed really is.
USA Today also seems a bit lost, stating that the 80% of the genome that ENCODE found biochemically active consists of gene switches (promoters and enhancers):
International research teams have junked the notion of “junk” DNA, reporting that at least 80% of the human genetic blueprint contains gene switches, once thought useless, that controls the genes that make us healthy or sick.
Wired, meanwhile, is so confused that it’s hard to know where to start:
Molecules that didn’t form protein-coding genes were mostly overlooked, partly because they were considered less important, but also because new tools and techniques were needed to study them.
It’s definitely news to me they’re less important - if anything, they’re more important than the coding sequences, as we understand less about them and they serve to significantly regulate the protein-coding bits of our genome.
In the ENCODE data are thousands of newly identified structures known as pseudogenes, fossil genes and dead genes, which look like protein-coding genes but perform other functions.
Pseudogenes perform other functions? Oh really?
Sure, I’m being a bit pedantic. But honestly, science journalism has gotten out of control. There are obviously very reasonable parts to all of these articles as well, but they’re drowned in so much hype and “catch phrases” designed to grab attention that the end result is a total distortion of some totally awesome scientific research that deserves to make the front page for what it’s actually accomplished. Especially in an age when we have eminent Harvard researchers fabricating data, we really don’t need journalists drawing false conclusions about meticulously collected data just to jazz it up and make it more interesting to the layman.
Finally, I think the scientists quoted in these pieces are part of the problem. For example, NPR had a very good piece about ENCODE, by all accounts, but its credibility was slightly tarnished by this quotation:
“Most of the human genome is out there mainly to control the genes,” said John Stamatoyannopoulos, a geneticist at the University of Washington School of Medicine, who also participated in the project.
There’s nothing empirically wrong with this statement, except that it’s drastically overblown; it would never be made at a genetics conference, where it would be torn to shreds. In fact, that’s the problem with most of the scientists I’ve seen quoted in these articles: They make absurdly broad claims for function using an extraordinarily loose definition (“reproducible biochemical activity”). It’s very, very tricky to demonstrate function. And, more importantly (those of you who hated statistics, prepare to groan), they’re operating without a serious null hypothesis: What exactly do you expect non-functional DNA to look like?
As Mike White also pointed out, it’s not going to be inert. “Nucleosomes have low sequence specificity, and so we expect, in a large genome, many regions that, just by chance, have a random piece of DNA that reproducibly positions nucleosomes. Transcription factors recognise short, degenerate sequences that occur, again, just by chance, all over the genome. And so again, in a large genome, we expect plenty of reproducible but functionally irrelevant TF binding. That’s going to lead to pervasive, tissue-specific transcription at low levels, along with various chromatin marks. Transcription factor binding sites turn over fairly rapidly by evolution, and so we expect dense, complicated networks just by chance,” he writes. If the biology terminology proved a bit much, basically what he’s saying is that it’s really hard to define “inert” DNA: transcription factors, the proteins that bind DNA to switch genes on and off, recognise short sequences that will appear by chance throughout supposedly neutral DNA, causing low-level biochemical activity (transcription) that doesn’t serve any valuable function. Evolution can turn these degenerate sequences into dense, complicated networks just by chance.
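To put a rough number on “just by chance”, here is a minimal back-of-the-envelope sketch in Python. It is my own illustration, not anything from ENCODE or from Mike White, and it leans on some loudly simplified assumptions: every base is equally likely and independent of its neighbours (real genomes are not like that), both DNA strands are counted, and the motif and genome length in the usage lines are placeholders rather than a real transcription factor’s binding site. The function names are made up for this example.

```python
# How many bases each IUPAC nucleotide code can stand for
# (e.g. R = A or G, N = any base).
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def match_probability(motif):
    """Chance that a random stretch of DNA matches the motif.

    Assumes all four bases are equally likely and independent,
    which is a simplification of any real genome.
    """
    probability = 1.0
    for code in motif.upper():
        probability *= IUPAC_DEGENERACY[code] / 4
    return probability


def expected_chance_sites(motif, genome_length, both_strands=True):
    """Expected number of purely accidental motif matches in a genome."""
    # Number of windows of the motif's length that fit in the genome.
    positions = max(genome_length - len(motif) + 1, 0)
    # Doubling for the reverse strand is only approximate: it
    # double-counts motifs that read the same on both strands.
    strands = 2 if both_strands else 1
    return positions * strands * match_probability(motif)


if __name__ == "__main__":
    # Placeholder values: a made-up 8-letter motif with two degenerate
    # positions, and a genome of roughly three billion base pairs.
    illustrative_motif = "ACGTRACN"
    approximate_genome_length = 3_000_000_000
    print(expected_chance_sites(illustrative_motif, approximate_genome_length))
```

The point of the sketch is the shape of the arithmetic rather than the exact figure: a short motif only has to beat odds of one in four (or better, at degenerate positions) per base, and a large genome offers billions of tries. That is the kind of null expectation a claim of “function” would need to be measured against.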
The moral of the story? Be a skeptic! Just the other day, when I was writing my post “Harnessing Viruses”, I read the PLoS Genetics paper, a ScienceDaily article, and a PopSci article about the same set of results. I found some significant discrepancies between the PopSci article and the actual scientific paper itself - for example, the paper itself lauded weakening the tumour’s defenses as being paramount to fighting cancer, whereas in its article PopSci interpreted that as “beefing up the body’s defenses.” It may seem insignificant, and PopSci’s claim may even be factually accurate, but it was not supported by the paper they cited as their source. Popularising science is a great goal, and I think it can really inspire people; I know how intimidating scientific papers can be to read, and communicating science is instrumental in getting everyone to see its importance and value, and in creating a more scientifically literate society. That being said, though, science has to be communicated carefully: When it’s done badly, as with ENCODE, it can be just as bad as having shoddy science in the first place.
Read. Ask questions. Be skeptical. And do celebrate ENCODE for the contributions it has made, because it’s an extraordinary data set that I have no doubt will contribute to molecular biology and medicine for years to come, much like its predecessor the Human Genome Project.
Images above: The ENCODE logo, and a picture taken from one of the ENCODE papers by Gerstein et al. in Nature. The images are entitled: “Visualisations of networked linkages between genetic components broadly across the human genome (right), and a smaller, hierarchically arranged subset (left).”
Regardless of phase, all Canadian snowshoe hare populations follow a characteristic boom-and-bust cycle. As their numbers climb toward carrying capacity, the hares begin to exhaust available food supplies and become increasingly vulnerable to predation by lynxes, coyotes, and raptors. After crashing in spectacular fashion, the population remains small until competition eases and food reappears. Rinse and repeat every 10 years.
The image above shows the degree of synchrony in Canadian snowshoe hare population cycles, using data collected from 1931 to 1948. The red regions represent hare populations that are ‘ahead’ - those that reach their peak population densities, the ‘boom’ in the cycle, up to two years earlier than average - while green regions correspond to delays. Thus, the ‘central’ hare population in the middle of the red area (along the border of Saskatchewan and Manitoba) is about four years out of sync with certain populations in Yukon and southern Ontario, as indicated by the contour lines.
Figure taken from What Drives the 10-year Cycle of Snowshoe Hares? (Krebs et al., 2001). Give it a read!
God, you’re beautiful, cytochrome C
(Picture: Cytochrome C by Irving Geis)
The Methylococcus capsulatus bacterium on the left contains a functional copy of the newly-discovered gene hpnR, which methylates hopanoids (a type of lipid) at the C-3 position. The image on the right shows what happens if this gene is deleted: the bacteria begin to lose their intracytoplasmic membranes (the gray bundle-looking things).
These membranes are thought to play an important role in stress tolerance in bacteria. In oxygen-deficient environments, the intracytoplasmic membranes promote increased methane oxidation, enhancing survival. Read the paper for a nice, albeit technical, account of why you should care about the gene (eating oil spills! geologic dating! extreme life!)
Photo credit: Paula Welander / MIT, via
Finished The Immortal Life of Henrietta Lacks today. It’s a bit repetitive (this was not a story that needed 300 pages to be told), but it’s compelling and well-researched, so give it a shot if you don’t mind tearing up a little.
Essay by JBS Haldane on the subject of basic morphological considerations. May very well make you think entirely differently about the subject of size. Can you imagine what it would be like to live with no consideration of gravity?
CTrombley, who linked me this, also said that some bacteria are so light that they can “climb the air”, because viscous forces have more effect than gravity at such small scales. The sky isn’t the greatest place to live, but life continues there nonetheless.
Haldane is wonderful; too bad about the rest of the reading list.
Image by Christian Gautier, BIOS/PHONE Photo Agency.
Nature, red in bill and claw? You’ve probably heard of brood parasitism in cuckoos; now, video evidence has confirmed anecdotal accounts of the same nasty behavior in honeyguides. From the recent (open-access!) article A stab in the dark: chick killing by brood parasitic honeyguides.