This week, Occupy Math looks at math tests — and some other tests — from the perspective of fairness. It turns out that questions that test the same skills can have extremely adjustable difficulty levels. There is also the issue of tests designed for failure. For that, we will look at some examples of cosmically unfair questions. On the issue of math tests, this post discusses the differences between easy and hard questions for the same topic. Occupy Math can probably dial the average grade on a test across a range of 20% by playing with the way questions are phrased. All this will give you some perspective on how to survive a test (it helps to be able to spot structurally hard questions) — but mostly the message is this.
Fairness is largely an illusion and enforcing it is close to impossible. Hope for competent teaching and mercy instead.
In the 1960s the state of Louisiana used a literacy test as a qualification to vote. An article on this test appears in Slate and is worth a read. The nominal reason for this was that people who could not read were probably too ignorant to vote (this hypothesis was not tested and, given that people talk to one another, was probably false). Worse than that, because of past discrimination this test was much harder for black citizens than white citizens. Finally, it was administered at the discretion of the parish officials and thus many illiterate citizens who did not have brown skin were not tested. There is no hope this test was fair — but, on top of everything else, it was well designed to be difficult to pass. This latter issue is the focus of today’s post. Consider Question 25 from the test.
25. Write down on the line provided, what you read in the triangle below.
The triangle creates a reason for breaking the words into strange, irregular blocks and placing the double occurrence of “the” on different lines makes it somewhat invisible. This is not a literacy question — if it were, the question would be “what is the error in the phrase below?” Rather, this question exploits the human mind’s ability to extract meaning from noisy sources. A lot of people will filter out “the the” and make a correct sentence. This special ability of the human mind helps people get the wrong answer — which makes this question an anti-literacy test? Let’s continue on to Question 28.
28. Divide a vertical line in two equal parts by bisecting it with a curved horizontal line that is only straight at its spot bisection of the vertical.
This is also only nominally a literacy question. Many people who could read, but had not yet had a geometry course, would get this one wrong because their vocabulary did not include “bisecting”. Occupy Math finds “curved horizontal line” really confusing as well. Wording like this makes the problem harder to answer correctly without enhancing its ability to check literacy. Let’s finish our trip to 1960s Louisiana with Question 30.
30. Draw five circles that one common inter-locking part.
From a mathematical perspective the correct answer is to not draw anything at all! The question is nonsensical and so (mathematically) commands no action. If you correct it (add “have” before “one”) it becomes arguably impossible (try it!). Occupy Math doubts, however, that not drawing anything would have been viewed as a correct answer by the polling examiner. In fact, almost any possible answer to this question has a reason that it is wrong. It is actually pretty clever to construct a question that has no correct answer.
Occupy Math also resents that mathematics was being co-opted in this vile voter suppression effort. If you have time, look over the questions in the copy of the test in the Slate article. How many of these are actually literacy questions? How many are confusing math questions? Consider this — to make a literacy test very hard to pass, the Louisiana officials made it a math test. That hurts. It also supports Occupy Math’s slogan: “Math is the right of all free people.”
What about fair questions on math tests?
A perennial topic in math is the slope of a line. The slope is usually defined as rise over run — to find the slope you take a segment of the line and divide how far up it went by how far over it went. If the line went down, well, down is negative up, and the line has a negative slope. Here is a picture of a line with a slope of one — each of its segments goes up exactly as far as it goes over. This should be simple, right, just rise over run?
Let’s look at some questions that test the student’s knowledge that slope equals rise over run.
- Compute the slope of the line through the points (2,1) and (6,9).
- Compute the slope of the line through the points (3/2,5/2) and (7/4,-3).
- Find the pair of points from (1,1), (3,4), (2,5), and (7,3) so that, of all the pairs of points, the one you pick has a line through them with the largest slope.
- Determine values for (a,b) that cause the line through the points (-1,2) and (a,b) to have a slope of -5/2.
The first question has a number of kind features. The distance over is 6-2=4, the distance up is 9-1=8, and the slope is 8/4=2, so every number involved is a positive whole number. This question tests the knowledge that slope equals rise over run in the least stressful way possible.
The second question is exactly like the first in form but the content is very different. There are lots of improper fractions, the answer is negative, and computing rise over run requires the student to divide fractions. The answer, if you want to check your prowess, is -22 (which is also large enough and strange enough as a number to add to the difficulty).
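For readers who want to check the arithmetic in the first two questions, here is a minimal Python sketch that does rise over run with exact fractions. The helper name `slope` is chosen for illustration and is not part of the original questions.

```python
from fractions import Fraction

def slope(p, q):
    """Rise over run for the line through points p and q, as an exact fraction."""
    (x1, y1), (x2, y2) = p, q
    run = Fraction(x2) - Fraction(x1)    # how far over
    rise = Fraction(y2) - Fraction(y1)   # how far up (negative means down)
    if run == 0:
        raise ValueError("vertical line: slope is undefined")
    return rise / run

# Question 1: whole numbers throughout.
q1 = slope((2, 1), (6, 9))

# Question 2: same form, but the rise is -11/2 and the run is 1/4.
q2 = slope((Fraction(3, 2), Fraction(5, 2)), (Fraction(7, 4), -3))

print(q1, q2)  # 2 -22
```

The code is identical for both questions; all of the added difficulty in the second one lives in the numbers.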
The third question is made more difficult by the obvious mechanism of actually requiring the students to compute or estimate six different slopes! Except it doesn’t. If you make a picture and plot all four points, only slopes involving (1,1) are positive and the line from (1,1) to (2,5) obviously has the steepest slope. This problem is harder — but it also rewards cleverness. Ploughing through and computing all the slopes is not only a lot of work, it also gives the students lots of chances to make an arithmetic mistake. Making a picture also engages the powerful human visual system to help check your conclusions. Here is the picture with blue positive slopes and magenta negative slopes.
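The brute-force route for the third question can be sketched in a few lines of Python, as a check on the picture-based shortcut. This assumes "largest slope" means the largest signed value of rise over run; the names `points`, `slopes`, and `steepest` are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

points = [(1, 1), (3, 4), (2, 5), (7, 3)]

def slope(p, q):
    """Rise over run; safe here because no two points share an x value."""
    (x1, y1), (x2, y2) = p, q
    return Fraction(y2 - y1, x2 - x1)

# Four points give six pairs, so six slopes to compute.
slopes = {pair: slope(*pair) for pair in combinations(points, 2)}
for (p, q), m in slopes.items():
    print(p, q, m)

# The pair with the largest slope.
steepest = max(slopes, key=slopes.get)
print("steepest:", steepest, "slope", slopes[steepest])
```

The three positive slopes all involve (1,1), and the largest is the slope of 4 from (1,1) to (2,5), which agrees with what the plot shows at a glance.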
The fourth question also suffers from improper fractions and negative slopes, but the real reason it’s harder is that it has variables or unknowns in it. Worse — this question has an infinite number of correct answers. To solve it you just pick some value for a (other than -1), do the rise-over-run calculations to find b, and then report the result. Both the use of variables and the fact you’re free to just pick one of the two numbers in your answer make this a very hard question. “Just pick anything” is a counter-intuitive step for most students.
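The "just pick anything" step can be made concrete with a short Python sketch. Rearranging rise over run gives b = 2 + (-5/2)(a + 1), so each choice of a other than -1 yields one valid answer. The function name `b_for` and the sample values of a are arbitrary choices made for illustration.

```python
from fractions import Fraction

def b_for(a, slope=Fraction(-5, 2), anchor=(-1, 2)):
    """Given a, return the b that puts (a, b) on the line through anchor with the given slope."""
    x0, y0 = anchor
    if a == x0:
        raise ValueError("a must differ from the anchor's x value, or the run is zero")
    # slope = (b - y0) / (a - x0), solved for b
    return y0 + slope * (Fraction(a) - x0)

# Three of the infinitely many correct answers.
answers = [(a, b_for(a)) for a in (0, 1, 3)]
for a, b in answers:
    print(f"(a, b) = ({a}, {b})")
```

Choosing a = 1 makes the run 2, which cancels the denominator of the slope and gives the whole-number answer (1, -3).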
To summarize — all four of these questions would merit a yes if you asked “does this question test the knowledge that the slope of a line is rise over run?” but they do it in such different ways that figuring out which question is fair is really hard. Do you want to reward cleverness as in Problem 3? Are you trying to force the student to learn to deal with variables as in Problem 4? Do you need almost everyone to pass, a goal supported by Problem 1? Occupy Math hopes that you can see that fairness is both tricky and depends, in part, on which students you have and what you’re trying to help them learn.
Using other people’s test questions.
Occupy Math has occasionally been present at epic disasters of exams where a question was asked that had a severe flaw in it. One of Occupy Math’s fondest memories from his undergraduate days is of a computer science final in a discrete math class. The test had seven questions on definitions and one mathematical proof. Definitions require study, nothing more, and most people can master them; those questions were of the “are you paying attention” variety. Occupy Math polished them off and turned to the proof. The instructor had unintentionally asked the class to prove something that wasn’t true, so Occupy Math wrote down a counter-example and turned in his exam, 12 minutes into a two-hour test. The instructor took the exam, looked at the space to do the proof, and then said several very rude words. Loudly. This confused the other students and, sadly, of the several hundred people in the room, only three noticed the proof was correctly answered with a counter-example. The rest received partial credit for trying to prove something false.
This example shows why good test questions are precious and why instructors save exams from year to year and borrow questions from one another. There is a danger here: you should never use someone else’s questions without working them yourself! This lets you check that the questions really test what you want them to at an appropriate level for your class. For that matter, you may have taught a slightly different batch of techniques to your students than the other instructor did, meaning that a problem the other instructor thought was hard was really easy for your students, or vice-versa.
A Phantom Menace: Corporate Test Questions!
This issue of using other people’s questions becomes really sharp-edged in the context of the ongoing death spiral of textbook publishers. The price of the textbooks we use in university has gotten completely out of control. Occupy Math is fighting back: he wrote his own textbook for first-year calculus at one-third the price of the book we were using and is working to bring the price down even more. The textbook publishers are fighting back too. If you’re going to charge $225.00 for a textbook, then one way to justify it is to offer more than just the text. Occupy Math has seen several publishers with software that generates practice problems and can even write and grade quizzes and exams for you.
Occupy Math has looked at three of these systems and, so far, they are a menace to mathematics education. To put it another way, the questions they ask are like the first question in the “slopes of lines” questions above, except occasionally they have no right answer at all. Part of the problem is that the people writing the software are programmers, but they are not — and apparently are not working with — actual math teachers. This will probably get better over time, but even with a good digital tutor hundreds of dollars is too much for a text. This situation has gotten so far out of hand that Occupy Math got a memo from the President of his university about containing textbook costs. Beyond that, the tests a teacher gives are an important yardstick for their professional competence. That alone makes outsourcing your test questions perilous.
Back to Fairness
The take-home message here is that the fairness of a question or a test is context-dependent. That’s not to say there are not some completely bogus questions or testing strategies. At a minimum the question should be about what was taught or, possibly, things you were supposed to already know. The question should make sense, respect the vocabulary the students have, and should serve some educational goal. Within those bounds there is a lot of room to kick the difficulty of a “fair” test up and down. That suggests a second important point: don’t make your teacher angry. You won’t like the tests that are written when they’re angry.
Even though Occupy Math is pretty sure that there is a large component of illusion in fairness, he strives to be fair to his students. One technique he has hit on is tests that ask students to pick some questions off a list and answer them. Since everyone gets the same questions to pick from, this is fair in at least one sense. This type of test permits the students to demonstrate what they have learned — no one can protest “you asked the one question I didn’t know!” A more subtle point is that part of the test is choosing the right questions (for you). Near the end of Indiana Jones and the Last Crusade, one of the bad guys chooses a false Holy Grail, drinks from it, and gets his face melted with trademark Spielberg special effects. The warden of the Grail observes, laconically, “He chose poorly.” A part of this sort of choice-based exam is to try to avoid having face-melting questions in the mix. Maybe one. You know, to remove the unworthy.
Occupy Math hopes this discussion of how fairness does and doesn’t work has been at least a little helpful. It is distilled from years of writing tests, with a few really bad questions in the mix. Occupy Math collects terrible questions and excellent questions. True, some questions could be either depending on the context (if you send in an example, the context is a nice thing to include). Occupy Math would prefer to write things related to your interests and concerns: you are invited to comment or tweet!
I hope to see you here again,
University of Guelph,
Department of Mathematics and Statistics