Wednesday, February 11, 2015

PARCC Test Readability, Part 2: Looking at the Questions

This is the second post in a series on issues related to the readability of the PARCC sample tests, tests that many students in the country will be taking for the first time this year. In the previous post I focused on the texts students were to read for the test. With one notable exception, I found that the reading passages were aligned with the revised Lexile levels the test makers used for guidance, but about two years above grade level on other widely used measures of text difficulty.

Readability, however, is about more than the level of difficulty of the text itself. It is also about the reading task (what the student is expected to do with the reading) and the characteristics of the reader (prior knowledge, vocabulary, reading strategies, motivation).

In this post I will look at the second aspect that must be considered in any full assessment of readability: the task the reader faces based on the reading. Since this is a testing environment, the task is answering reading comprehension questions and writing about what has been read.

In any readability situation the task matters. When students choose to read a story for pleasure, the task is straightforward. It is more complex when we ask them to read something and answer questions that someone else has determined are important to an understanding of the text. Questions need to be carefully crafted to help students focus on important aspects of the text and to allow them to demonstrate understanding or the lack thereof.

There are many different types of questions that can be asked of a reader, of course. Some questions are easier than others. Bloom's Taxonomy offers a useful scaffold for understanding the levels of questions, moving from literal to inferential to evaluative, but for our purposes here I will use the categories developed by literacy professor Dr. Taffy Raphael, known as Question-Answer Relationships, or QARs.

QARs are useful because they describe the relationship between the question and the kind of work the student must do to answer it. Four general types of questions are identified in this scheme:

  • Right There: The answer to the question is right in the text and is usually in the same sentence. The reader can point to the answer in the text. Bloom would call this the literal level of understanding.
  • Think and Search: The answer to the question is right in the text, but not in the same sentence. The reader needs to think about different parts of the text to come up with an answer. Bloom would designate this as the comprehension level.
  • Author and You: The answer is not in the text. The reader needs to think about what s/he knows, what the author says, and how these two things fit together. Bloom would call this the inferential level.
  • On Your Own: The text got you thinking, but the answer is inside your own head; it is not directly answered in the text. On Your Own questions are not usually a part of a standardized test.
I looked at the questions that followed one reading passage for each of grade levels 3-8 on the PARCC sample tests. The passages I chose were the same as those that I analyzed for readability in the previous post (A Once in a Lifetime Experience, Just Like Home, Moon Over Manifest, Emancipation: A Life Fable, The Count of Monte Cristo, and Elephants Can Lend a Helping Trunk). I looked at a total of 34 test items, which represented all of the questions asked about these passages. Using the QAR framework, I found the following:
  • Right There questions: 0
  • Think and Search questions: 16
  • Author and You questions: 18
  • On Your Own questions: 0
When I looked more closely at the questions, some other patterns emerged. The structure of the test called for students to answer one question and then, in the next question, determine what textual evidence supported their answer. In other words, if you got the first question wrong, it was very likely you would get the second question wrong, because the second question was dependent on a correct answer to the first. Of course, it is also possible that the second question would help the test taker revise the answer to the first, but I think that would be more likely to happen in an instructional situation than in a testing situation. Fully 16 of the 34 questions examined were dependent on a correct answer to the previous question. Essentially, if a student made one error on these comprehension questions, it would likely become two errors.

Here are some more discoveries from a closer look at the questions:
  • Five questions were vocabulary-in-context questions
  • Sixteen questions focused on identifying textual support for an answer
  • Three questions dealt with the main idea of the story
  • Two questions dealt with the structure of the story
From this analysis, I draw several conclusions:
  1. The questions are generally focused on higher-order thinking skills (the inferential level). Literal understanding of the text (which would be assessed by Right There questions) is either assumed or not valued by the test designers. This is, I suppose, consistent with the CCSS call to "raise the bar," but it will certainly affect student scores on the test.
  2. The questions indicate that the test designers are very focused on the idea of "citing textual evidence" as a dominant reading skill. Textual evidence is repeatedly cited in the CCSS, so I suppose it is unsurprising that it plays a prominent role in this CCSS-aligned test. Citing textual evidence is a highly valuable and necessary reading ability; whether it deserves to be addressed in almost fifty percent of the questions on a test of reading comprehension is certainly open to debate.
  3. There are other ways to know if students are using textual evidence. One way is, of course, to ask a Right There question. In order to answer a question that is "right there" in the text, a student must, by definition, use textual evidence. The PARCC test completely ignores this way of assessing the use of textual evidence in favor of inferential-level questions tied to other inferential questions.
  4. The test designers seem to be more interested in textual analysis than in reading comprehension. Analysis of text is, of course, a part of reading comprehension, but it is only a part. Many of the questions on this test could be answered without a good understanding of the text as a whole. The focus seems to be on small bits of understanding (vocabulary in context, textual support) rather than on a more generalized comprehension of the text. The irony here is that while the CCSS calls for a higher level of critical thinking about text, these questions focus on a very narrow view of the text itself.
  5. There are some things to like about these tests. First of all, the tests use complete stories rather than truncated segments of stories. This should assist students in building the understanding they need to respond to the questions. Secondly, the test highlights an important aspect of being a thoughtful reader: citing textual evidence. This emphasis will surely lead teachers to stress this skill in their instruction, which is a good thing.
Whenever a new test is rolled out, we know from past experience that test scores will go down. Over time, schools, teachers, and students adjust, and scores trend back up. It will be no different with the PARCC tests. As the scores rise, questions will arise, such as "Have we been focused on the right things in these tests?" and "Have the tests led to better, more thoughtful readers?" Based on my analysis of these test questions, I am not confident that they have.

In my next post, I will look at the third variable in readability: the reader.