Goldilocks Information

Today Occupy Math is going to try to share some hard-won wisdom about how to design math problems, conduct meetings, and think a little more clearly. The post is inspired by the incredible work of Claude Shannon, the father of information theory. Most people think of information, from a stock tip to the dates of the Wars of the Roses, as something that you overhear, look up, google, or learn in school. This point of view is not wrong, but there is another view. Speaking colloquially, Shannon discovered that information has its own existence, its own physics, and can be tamed and manipulated like steam or electricity. Living DNA is a collection of biological macromolecules, but it is also an extraordinary information repository. Its sequence, geometry, and epigenetic context jointly hold the notes and chords of life. Shannon’s work is key to everything from understanding the totality of life to making your hard drive work efficiently. Today’s topic concerns getting the amount of information “just right”.

The story Goldilocks and the Three Bears is short but has sparked a lot of discussion. Is the story about why it is better for children to behave? Do the bears, examining the break-in, exemplify detectives? People also have trouble identifying the good guys, at least once they get past the fact that the main character is a cute little blonde girl. Since I hang out with scientists, the main thing I notice is that the story is not about thermodynamics – it always bugs scientists that mama bear’s medium bowl is cold while baby bear’s small bowl is just right (served at the same time, the smaller bowl would cool faster). In spite of this, Goldilocks has come to mean “just right”.

The zone around a star where an orbiting planet could have liquid water on its surface is called the Goldilocks zone. The economists are in on this – an economy with moderate sustainable growth is a Goldilocks economy. Similarly, in marketing, Goldilocks pricing is finding the price that maximizes profit. A very low price yields high sales; a very high price yields high per-sale profit. Somewhere in between is the correct price for highest profit. Almost any discipline has things they need to get just right. Occupy Math has spent decades trying to get test questions for his math courses just right.

When you write a math problem, say for a test, you must consider a bunch of factors including:

  • Did you ask exactly one question? This is as opposed to asking two or more questions or, even worse, less than a whole question.
  • Given what you covered in class and the homework, is the problem one the students can do in at least one reasonable way?
  • Does the problem, however nifty, test the material it is supposed to be testing?
  • Will the responses the students turn in be things you can grade in a reasonable amount of time?
  • Is this question a duplicate, relative to the skills it tests, of others? Repetition should be absent or intentional.
  • Is the language used to phrase the question clear and unambiguous? See the well-known example, below.

The clearer question would have been “Find the value of the variable x.” That question gives more information about which answer is desired. When my wife was at university, she had a professor whose test questions started “Prove or disprove and salvage.” This is a perfect sort of question for a fourth-year math major or graduate student. It would kill first-year students.

The trick is that both the information in the question and the information in the answer need to be Goldilocks – not too much, not too little.

When I took my topology course in graduate school, the first exam was a week-long take home exam. I was up all night three nights that week and turned in 27 pages of solutions. When we turned in the exam, the instructor asked the class what we thought of the exam. After a really long, sleepy silence he said “Come on, say something!” I replied “I hope your dog dies.” This was a bit rude, but that exam was a giant rookie mistake (this was the first time the gentleman had taught a course in his research specialty). He tried to test us on everything he’d covered with problems that were ornate, beautiful, and showed how clever he was. Decades later I still remember the feeling of ultimate, deathly exhaustion. I had been hit over the head with way too much information and been compelled to cough up huge amounts of information.

Shannon’s work showed that information is a measurable quantity (it is measured in bits, and the quantity being measured is called entropy). The theory of information uses this for secret codes, efficient communication channels, and even thermodynamics. Occupy Math wants to try to present some of the results as rules of thumb and practical guidelines.
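To make "measurable" concrete, here is a minimal Python sketch of Shannon's entropy formula for a list of probabilities. The function name `entropy_bits` and the example distributions are made up for illustration; they are not from Shannon's papers or any particular library.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution.

    Outcomes with probability zero contribute nothing and are skipped.
    """
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


# Illustrative distributions (assumptions for this sketch only).
fair_coin = entropy_bits([0.5, 0.5])     # a fair two-way choice: exactly 1 bit
biased_coin = entropy_bits([0.9, 0.1])   # a lopsided choice: well under 1 bit
four_way = entropy_bits([0.25] * 4)      # four equally likely options: 2 bits

if __name__ == "__main__":
    print(f"fair coin:   {fair_coin:.3f} bits")
    print(f"biased coin: {biased_coin:.3f} bits")
    print(f"four-way:    {four_way:.3f} bits")
```

The point of the sketch is that the less predictable the outcome, the more bits it carries. A choice you could have guessed in advance carries very little information.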

Have you ever been at a meeting where a decision must be made?

Meeting participants try to come up with all the relevant factors so that the decision can be made with complete information and no error. Even if we ignore the fact that this is only possible for the very simplest decisions, there is the question of how much information each factor contains. Suppose we are deciding between two possible textbooks for a class. Both cover all the topics needed. The committee is producing at most one bit of information – which book to choose. If a member of the committee is looking at individual problems in the book (which has thousands of problems) then he is probably mining relevant information with such low information content that there is no chance it should change the decision. Notice: “should”, not “will”.
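A rough way to put numbers on the textbook example is to ask how much a factor reduces the committee's uncertainty. The Python sketch below is a deliberate simplification, not a full information-theoretic treatment. It treats the committee's belief as a single probability that book A is the right choice, and the probabilities 0.51 and 0.95 are invented for illustration.

```python
import math


def binary_entropy_bits(p):
    """Uncertainty, in bits, of a two-way choice whose first option has probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bits_supplied(p_before, p_after):
    """Drop in uncertainty when a factor moves the belief from p_before to p_after."""
    return binary_entropy_bits(p_before) - binary_entropy_bits(p_after)


# Hypothetical beliefs that book A is the better choice.
undecided = 0.5                # no preference yet: one full bit of uncertainty
after_one_problem = 0.51       # assumed tiny nudge from inspecting one exercise
after_price_check = 0.95       # assumed large shift from a big price difference

if __name__ == "__main__":
    print(f"one exercise: {bits_supplied(undecided, after_one_problem):.4f} bits")
    print(f"price check:  {bits_supplied(undecided, after_price_check):.4f} bits")
```

Under these made-up numbers, the single exercise supplies roughly three ten-thousandths of a bit, while the price check supplies most of the one bit the committee needs. That is the sense in which a low-information factor should not change the decision.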

The concepts of critical requirements and deal breakers are really helpful in situations like this.

If there are high-information factors that make it pretty obvious what the decision will be, look for them. Hard. For textbooks this might be price and availability. For a new piece of software you need to look at other people’s experience and the way service works. Identifying the high-information factors ahead of time can seriously reduce meeting duration.

Thinking about information as a quantity like mass or charge was a very odd idea at first. Now Shannon and his disciples have made it commonplace. Data scientists try to pull knowledge out of the huge amounts of information in big data. Statistics and machine learning use information theory to decide what advertisements you should see when you look at a YouTube video. Do you have good high-information decision factors to suggest or that have helped you in the past? Have you ever encountered an awful test question? Please comment or tweet to let us know about them.

I hope to see you here again,
Daniel Ashlock,
University of Guelph,
Department of Mathematics and Statistics

