Friday, October 19, 2012

Writing and Statistics

It is clear that some writing students are struggling to comprehend how statistical research is referenced and presented in writing for general audiences. The apparent problem, which was called a "contradiction" in class, is that science relies on precise data to report imprecise results. I am including a link to an essay I composed this morning in an attempt to explain why generalized statistics (the goal of all research) do not contradict precise writing. The underlying problem is that few people understand scientific research methods or statistics.

The right approach in a class is to ask for clarification and to try to understand why "imprecise" data can be the "most precise" data for an article or essay. Writers learn to use two or three sentences to strike this balance, educating the public while also using data appropriately. Scientific findings and data are never precise. That's one of the rare instances when "never" is the best word. Science assumes that everything we hold to be true might someday be disproved. Therefore, we express findings in terms of percentages of confidence. As I explained in the first few nights of class: science admits to imprecision and doubt, while literary scholarship makes absolute claims.

This is a study reported in the New York Times:
The average American watches 34 hours of television per week … which sounds suspiciously like a full-time job. That slightly depressing, slightly baffling statistic comes to you courtesy of the Nielsen Company, which reports that total viewing rose about 1% in 2010.
Now, of course, some people watch more, and some people watch less. A statistical mean (average) is imprecise. It is a best estimate, based on complex statistical models. That is how research data are collected. I cannot make a range map of the scarlet tanager (a beautiful bird) by tracking every single scarlet tanager in the world. Instead, I study a subset of banded tanagers and hope they represent the entire population. I then report, with a percentage of confidence, what the range of the species is. Even if I am 80 percent certain of the range, there will still be several scarlet tanagers that find a nice golf course in Minnesota to call home. They don't make my research any less valid; they are statistical outliers.
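To make the idea of an average as a "best estimate" concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the population is simulated, the numbers are made up, and the function name `sample_mean_hours` is hypothetical. It is not how Nielsen or any field biologist actually works. It only shows that a mean computed from a sample lands close to the population mean while individual viewers sit far from it.

```python
import random
import statistics


def sample_mean_hours(population, sample_size, seed=0):
    """Estimate the population mean from a random sample of the given size."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample)


# Hypothetical population: weekly TV hours for 10,000 simulated viewers.
# The center (34) echoes the quoted figure; the spread (12) is invented.
rng = random.Random(42)
population = [max(0.0, rng.gauss(34, 12)) for _ in range(10_000)]

true_mean = statistics.mean(population)
estimate = sample_mean_hours(population, sample_size=500)

print(f"population mean: {true_mean:.1f} hours")
print(f"estimate from a sample of 500: {estimate:.1f} hours")
print(f"lightest viewer: {min(population):.1f}, heaviest viewer: {max(population):.1f}")
```

The sample estimate will not match the population mean exactly, and neither number describes the heaviest or lightest viewer. That gap is the imprecision a writer has to acknowledge.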

A writer needs to appreciate that science is uncertain, while still writing with the highest level of precision possible. If a study you cite claims that the average American watches 34 hours of television, that's the best data we have. We know that not every household was studied; that would be impossible. We also know that some people don't have televisions. Statistics are, by nature, imprecise. You write about statistics precisely, yet you also either assume or explicitly explain that statistics are incomplete realities.

Read the essay I've composed and try to understand that the complication is not a contradiction. Being as precise as possible does not require perfection. It requires writing based on the best evidence available, while knowing all evidence is flawed. If you want to learn about writing with as much accuracy as possible, within the limits of research and statistical methods, you need to accept that science is uncertainty.

Science seeks generalizations, which is not a contradiction. We generalize about gravity. We generalize about evolution. That's science. Our generalizations are based on precise data and precise correlations. Yet they are always presented with confidence levels that are never 100 percent.
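For readers who want to see why the confidence level is never 100 percent, the following Python sketch computes an interval around a sample mean at several confidence levels. It is a rough illustration under stated assumptions: the sample values are invented, the function name `mean_confidence_interval` is hypothetical, and it uses a simple normal approximation rather than whatever method a given study would actually use.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, level):
    """Normal-approximation confidence interval for the mean of a sample."""
    if not 0 < level < 1:
        # A level of exactly 1 (100 percent) would need an infinitely wide interval.
        raise ValueError("confidence level must be strictly between 0 and 1")
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(len(sample))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    return mean - z * std_err, mean + z * std_err


# Hypothetical weekly TV hours reported by 17 invented households.
sample = [12, 20, 25, 28, 30, 31, 33, 34, 35, 36, 38, 40, 41, 45, 48, 52, 60]

for level in (0.80, 0.95, 0.99):
    low, high = mean_confidence_interval(sample, level)
    print(f"{level:.0%} confident the mean is between {low:.1f} and {high:.1f} hours")
```

The more confidence we demand, the wider the interval becomes. Total certainty would require an interval so wide it says nothing, which is why the sketch refuses a level of 1.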

