One of the people who replied to my post on readability formulas asked me what I thought of cloze testing. The timing of the question was perfect because I've recently been thinking and reading about cloze testing as an alternative to standard readability formulas. I've been thinking about this subject because I recently FAILED a cloze test that Jakob Nielsen included in one of his Alertbox newsletters on web usability.
In responding to the question, my answer grew so long--a not unusual occurrence in my case--I decided to make cloze testing the subject of a new post because I think my experience with a cloze test illustrates some of its problems (It has to have problems; otherwise how could I do so badly?).
If I scored the passage according to the advice of the cloze procedure's inventor, W.L. Taylor, who believed the reader had to fill in each blank with the exact word originally deleted, I got very few right (and no, I am not revealing how many I got wrong or right). If I went with one of the alternatives Nielsen suggests, i.e., accepting synonyms as correct answers, I got a little over 60% correct, which technically means that the text was within my range of comprehension, as long as you accept that method of scoring. My husband also took it, and he got a score of 100, again following Nielsen's suggestion that synonyms were fine.
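To make the scoring debate concrete, here is a minimal Python sketch of the two approaches: exact-word scoring and synonym-tolerant scoring. The function name, the sample words, and the synonym list are all invented for illustration; they are not taken from Nielsen's test or from Taylor's paper.

```python
def score_cloze(answers, originals, synonyms=None):
    """Return the percentage of blanks filled correctly.

    answers   -- the reader's words, one per blank
    originals -- the deleted words, in the same order
    synonyms  -- optional dict mapping a lowercase original word to a set of
                 acceptable alternatives (hypothetical; the scorer decides these)
    """
    synonyms = synonyms or {}
    correct = 0
    for given, original in zip(answers, originals):
        given, original = given.strip().lower(), original.strip().lower()
        # Exact scoring accepts only the original word; lenient scoring
        # also accepts whatever the scorer has listed as a synonym.
        accepted = {original} | {s.lower() for s in synonyms.get(original, ())}
        if given in accepted:
            correct += 1
    return 100 * correct / len(originals)


# Invented example data, not from any real test.
originals = ["privacy", "settings", "share", "friends", "change"]
answers = ["privacy", "options", "post", "friends", "alter"]
allowed = {"settings": {"options"}, "change": {"alter", "modify"}}

exact = score_cloze(answers, originals)             # exact-word scoring
lenient = score_cloze(answers, originals, allowed)  # synonyms accepted
print(exact, lenient)
```

The same set of answers produces two different scores depending on which rule the scorer adopts, and the lenient score depends entirely on which synonyms the scorer chose to allow.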
My humiliating experience on Nielsen's cloze test highlights just two of the problems related to using it as a test of comprehension: (1) There's a good deal of disagreement about what a correct answer is (some people get around the debate by making the tests multiple choice and supplying several possible words or phrases to fill in the blanks, which I think is a good idea), and (2) the reader's background knowledge, or lack of it, plays a big role in performance. The passage was about Facebook privacy policies. My husband is on Facebook. I am not (which is clearly the reason he scored higher than I did; at least this is what I tell myself).
But there are other problems or issues that are much debated in the literature on cloze. For instance, should the test-maker eliminate every 5th word? (I think that's the most common choice.) Or should she eliminate every 6th, 7th, or 9th word in the "mutilated" passage? (That's how some researchers describe passages with deletions. I mention it here because I find the word choice just hilarious.)
Another question is, should the elimination be random, or should it target particular parts of speech, nouns versus verbs, for example? Some researchers think it's nouns that are the big meaning carriers. Thus, eliminating them makes the test too hard to be useful. I could go on here, but I think my point is clear: Using cloze testing to determine either ease of reading or rate of comprehension raises an awful lot of complex questions about both test creation and scoring.
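For readers who want to see how mechanical the fixed-interval deletion is, here is a small Python sketch that blanks every nth word of a passage. It is an illustration under simple assumptions, not a faithful reproduction of any published procedure: words are split on whitespace, so punctuation attached to a deleted word disappears with it, and the function name and sample passage are made up.

```python
def make_cloze(text, n=5, blank="_____"):
    """Replace every nth word with a blank; return the passage and the deleted words."""
    words = text.split()
    deleted = []
    # Index n - 1 is the nth word, then every nth word after that.
    for i in range(n - 1, len(words), n):
        deleted.append(words[i])
        words[i] = blank
    return " ".join(words), deleted


# Invented sample passage.
passage = ("The quick brown fox jumps over the lazy dog "
           "near the quiet river bank today")

mutilated, answers = make_cloze(passage, n=5)
print(mutilated)
print(answers)
```

Changing `n` from 5 to 7 or 9 changes which words go missing, which is one reason the same passage can yield an easier or harder test depending on the interval chosen.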
All that being said about why not to use cloze tests to determine comprehension or readability, I still think cloze tasks, or exercises, are a great way to sharpen students' sense of how a text builds meaning step by step or sentence by sentence. I also think cloze-based exercises make readers focus on context cues, not just for vocabulary but for overall meaning.
For those reasons, I would definitely suggest including cloze tasks in the classroom. They are a terrific way to get a sense of, for instance, how well or poorly students make use of linguistic cues that tell readers about relationships between ideas. I'm working, for example, on a cloze exercise where all explicit connectives have been eliminated, i.e., transitions, conjunctions, and opening adverbial clauses.
Used in this way, I think cloze testing (or, more precisely, cloze tasking) becomes an excellent and very specifically focused diagnostic tool. Of course, it also becomes, technically, more a fill-in-the-blank exercise than a formal cloze exercise, where the deletions are usually decided on by a formula, or at least they were when Taylor introduced the cloze procedure in 1953. But Taylor's formula has been fooled with so much since he first published it in Journalism Quarterly more than half a century ago that I don't feel compelled to follow it too abjectly, which is pretty much my attitude toward all formulas, now that I think about it.
Thursday, March 10, 2011
Wednesday, March 2, 2011
Thoughts on Readability Formulas
Although I want to return to the subject of voice in writing, a topic that really intrigues me, this post is on readability formulas, a subject I'm frequently asked about in relation to my textbooks.
There is, however, a specific stimulus for this post. In an effort to go paperless in my office, I’m re-reading old journal articles and deciding which ones I want to scan into my online notebook. To that end, I re-read a 1982 article in the Reading Research Quarterly (V.18 #1 p.23) in which the authors described their research on informal reading inventories and commented that using passages with different levels didn't seem to affect students' performance. In other words, as the grade level of the text went up or down based on the readability formulas used, students’ comprehension scores didn't go up or down with them.
The authors then went on to write: “Although readability formulae reflect word difficulty and sentence complexity, they fail to account for one's familiarity with a text. Presumably, this failure to control for students' familiarity with reading material diminishes the validity of both commercial and curriculum-based inventories developed with readability formulae.”
Despite the resurgence of readability formulas since that article was written (in the 1980s, they were roundly and repeatedly criticized, and both the IRA and NCTE discouraged their use in the creation of written materials), I think the authors' suggestion that readability formulas are not fully adequate to the task of revealing how well students might or might not understand what they read is still timely. Actually, as many discussions of readability formulas and their history point out (see, for just one instance, the Plain Language Association website), the formulas were meant to measure ease of reading, not comprehension.
If you find that distinction confusing, you're not the only one. But after pondering it a bit, I think it means a passage coming in at a low grade level, based on a readability formula, could be easily read if you define "reading" as knowing and pronouncing the words. However, that ranking can't tell you whether or not readers can readily grasp the concepts or relationships expressed in the passage. That is, a passage given a low grade level by a readability formula is not necessarily easy to understand. By the same token, passages that earn high grade levels aren’t necessarily hard to read.
Readability formulas don't measure comprehension because they do not take into account key comprehension factors such as the syntactic complexity of the language, familiarity of the vocabulary, reader’s background knowledge, and the text’s conceptual difficulty. Instead, they rely mainly on length of sentences and numbers of syllables, with some formulas also counting elements like passive constructions and prepositions.
For instance, a readability formula would treat ennui and boredom as equals because both words have two syllables. Yet the truth is most students would immediately know the meaning of boredom and be dumbfounded by the word ennui. Similarly, inconceivable has one fewer syllable than unimaginable, but that certainly doesn't make it easier for student readers to interpret.
Readability formulas also rely heavily on length of sentences to identify ease of reading. Sentence length, though, doesn't tell the whole story. As the web usability guru Jakob Nielsen points out in his discussion of writing for the web, these two sentences are the exact same length but conceptually, they are far from equal:
He waved his hands.
He waived his rights.
As Nielsen correctly says, "everybody understands what the first sentence describes, [however] you might need a law degree to fully comprehend the implications of the second sentence."
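A quick calculation shows why a formula cannot tell these two sentences apart. The sketch below uses the commonly published Flesch-Kincaid grade-level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The word and syllable counts are my own hand counts, and a tool such as Word may count sentences or syllables somewhat differently, so treat the numbers as illustrative only.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts (commonly published form)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hand-counted (assumed) values: each sentence has four one-syllable words.
waved = flesch_kincaid_grade(words=4, sentences=1, syllables=4)   # "He waved his hands."
waived = flesch_kincaid_grade(words=4, sentences=1, syllables=4)  # "He waived his rights."

print(round(waved, 2), round(waived, 2))
print(waved == waived)
```

Because the formula sees only counts, the two sentences receive identical scores; nothing in the inputs reflects the legal concept packed into "waived his rights."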
Given what I see as the limitations of readability formulas, it's always disconcerting to be asked what readability formula I use to write my textbooks because, in all honesty, I have to say “none.” I use the Flesch-Kincaid readability feature of Word strictly as a predictor of potential difficulty. If a passage comes out higher than I think it should be for the book’s audience, I check it for syntactical and linguistic features known to cause problems, e.g., distance of pronouns from their referents, embedded clauses, and passive constructions. (The Purdue Online Writing Lab has a list of five principles for readability that I find invaluable, available as a PDF or PPT series.) And if I really want to cover a topic that consistently comes out with a high grade level, for instance, passages on the brain with all those pesky references to the multisyllabic word hemispheres, I will classroom test to see how students do with the passages in question.
While I could wax even longer on how readability formulas should be used with extreme caution, I’ll end with a quotation from a study put out by the University of Illinois at Urbana, available on the web at the ERIC Clearinghouse and titled “Conceptual and Empirical Bases of Readability Formulas”:
Problems arise when difficult words and long sentences are treated as the direct cause of difficulty in comprehension and are used in readability formulas to predict the readers' comprehension. Readability formulas are not the most appropriate measure and cannot reliably predict how well individual readers will comprehend particular texts. Far more important are text and reader properties which formulas cannot measure. Neither can any formula be a reliable guide for editing a text to reduce its difficulty.
The study was published in 1986, but to my mind, the sentiments are not the slightest bit dated.
Tuesday, March 1, 2011
For Jordan and Julie
When I first decided to post about topics I found important, my friend Jordan Fabish--whose wonderful essay on dual reading I have posted on my web site--said with some skepticism, "Now there's another thing you will have to keep up." Typically, Jordan was more prescient than I was about my ability to post regularly, and it's been, ahem, a long time between topics.
I have, however, been inadvertently shamed into trying again by Julie Williams of Ramapo College, who wrote me a really nice e-mail saying that she agreed with what I wrote about the voice on the page, even if it was written a year ago! So within the next day or so, I’ll post on a topic that I’m asked about a lot--readability formulas. And if I dawdle, I’m hoping Jordan and Julie will hold me to my promise and get me moving.
I do want to mention, though, for anyone who responds: please don't tell me what you think via e-mail. I will just start exchanging e-mails on the subject and, once again, forget about posting anything here, when my goal really is to get an exchange going among people who share my interest in best practices or current problems in teaching reading and writing.
About this blog: For years now, whenever I wanted to test out a new exercise or figure out how I’d like to address a new topic, I’ve been sending out an SOS to teachers I’ve worked with or met at conferences and online, asking them what they thought of my approach or if they had another way of addressing, say, improving students’ ability to stay focused while reading on the Web.
Probably later than it should have, it’s now occurred to me that a blog might be a good way to bring others into these online discussions, which, for me anyway, have been incredibly valuable. So every week or so, I’m going to post my thoughts on a topic that I consider really central to the teaching of reading and writing. In every post, I’ll include practical strategies for addressing the topic discussed.
My hope is that other instructors will respond with their thoughts and, over time, we can come up with a repository of teaching methods geared to specific objectives like teaching coherence in writing or using linguistic cues in reading and a host of others.