Time for a Confession... I lied

Okay folks... it's time for a confession... last week I lied to a lot of you...

But first read this...

"Don't take anything for

for granted"

Now go back and read it again. If you were one of the 80% of people who didn't spot the typo the first time, don't fret -- you are not alone.

It is a well-known phenomenon that your own typos are very difficult to spot.

Last week, I posted a "survey" on here about a new survey platform I was considering switching to and asked for some feedback on it. In all honesty, I already use Typeform as a survey platform and find its simplicity and fluidity second to none.

The actual reason for that survey was to confirm a phenomenon that I have seen in my own business, and to use data to determine whether it is merely a psychological effect for myself or whether there is some real truth behind it.

If you only have 2 minutes, here is the TL;DR:

  • My data supports previous findings that there are 2 cognitive systems in your mind: System 1, which is your intuitive thinking system, and System 2, which is based more upon deliberate thinking and reasoning.

  • The questions asked were intended to automatically engage System 1 of your mind - they all had intuitive answers (unfortunately, the intuitive answers were incorrect).

  • Making text harder to read, either by decreasing the contrast of the text (grey on white) or by making the background a strange colour (white text on blue background), led to around a 20% increase in test scores overall.

  • If you are proofreading text in Microsoft Word, change your page background to black (this automatically makes your text white) - it really works!

Now if you have more time to read, here are the full details.

My Observation

I have always found that, when proofreading a document, printing it out and viewing it in a different medium seems to improve my ability to pick up typos. Conversely, when reading a document on screen, I could read a sentence multiple times and still not pick up an otherwise glaringly obvious typo.

I made nothing of it until I read Shane Frederick's paper on the Cognitive Reflection Test (CRT), which discusses previous research about the two patterns of brain function - to recap: System 1 (intuition and quick judgment) and System 2 (logic and deliberate thinking). That paper also describes a very simple 3 question test which can be used to show which system is being primarily engaged.

I wondered whether the same factors could explain why I was finding it so hard to pick up typos in my own documents. After all, I am so familiar with them and, for a native English speaker, reading is effectively second nature and does almost seem automatic.

I then came across another study which varied a number of factors in order to engage the System 2 brain process. In particular, the researchers chose a particularly difficult to read font to stop people from answering with their "gut" instinct and instead engage System 2. Using difficult to read fonts, there was an overall 25% improvement in test scores.

My obvious thought was that a similar process was behind why proofreading on the "familiar" computer screen produced such poor results compared to proofreading on the "unfamiliar" paper format. My second motivation was to find a better way to proofread so that I wouldn't needlessly print documents out (and waste our valuable natural resources).

Method

The 3 question Cognitive Reflection Test from Shane Frederick was used as a quasi-CAPTCHA test on a "survey". Participants were not told that their cognitive reasoning was being tested, only that they were being asked to provide feedback on a question and answer platform.

The three questions are:

  1. A bat and a ball cost $1.10 in total.
    The bat costs $1 more than the ball.
    How much does the ball cost?
    (Intuitive answer $0.10; correct answer $0.05)
     

  2. It takes 5 machines 5 minutes to make 5 widgets.
    How long would it take 100 machines to make 100 widgets?
    (Intuitive answer 100 minutes; correct answer 5 minutes)
     

  3. In a lake, there is a patch of lily pads.
    Every day, the patch doubles in size.
    If it takes 48 days for the patch to cover the entire lake, how long would it take for the patch to cover half of the lake?
    (Intuitive answer 24 days; correct answer 47 days)
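If the correct answers feel wrong at first glance, the arithmetic can be checked directly. The following is only a small sketch of the reasoning behind each answer; the variable names are my own and are not part of Frederick's test.

```python
from fractions import Fraction

# Question 1: ball + bat = 1.10 and bat = ball + 1.00,
# so 2 * ball + 1.00 = 1.10, which gives ball = 0.05.
total = Fraction(110, 100)
difference = Fraction(1)
ball = (total - difference) / 2
bat = ball + difference

# Question 2: 5 machines spend 5 minutes each (25 machine-minutes)
# on 5 widgets, so one widget costs 5 machine-minutes.
# 100 widgets shared across 100 machines therefore take 5 minutes.
machine_minutes_per_widget = Fraction(5 * 5, 5)
minutes_for_100 = machine_minutes_per_widget * 100 / 100

# Question 3: the patch doubles every day, so it covered half the lake
# exactly one day before it covered the whole lake.
full_day = 48
half_day = full_day - 1

print(f"ball costs ${float(ball):.2f}, bat costs ${float(bat):.2f}")
print(f"100 machines need {minutes_for_100} minutes")
print(f"half the lake is covered on day {half_day}")
```

Exact fractions are used for the first question so that the check is not affected by floating point rounding.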

An A/B split-testing platform was used to direct traffic to the exact same survey on 3 different sites as follows:

  1. A webpage with normal black text on a white background.
     

  2. A webpage with 20% grey text on a white background.
     

  3. A webpage with white text on a blue background.

Participants were directed to a single link (and generally were not aware of the A/B split test).
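For anyone curious how a single link can fan out to three versions of a page, here is a minimal sketch of one common approach: hash a visitor identifier and use it to pick a variant. This is not how the platform I used works internally (I don't know its internals); the variant names, the `assign_variant` function and the visitor ids are all made up for illustration.

```python
import hashlib

# Hypothetical names for the three pages described above.
VARIANTS = ["black_on_white", "grey_on_white", "white_on_blue"]


def assign_variant(visitor_id: str) -> str:
    """Map a visitor to one variant; the same id always gets the same page."""
    digest = hashlib.sha256(visitor_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as a large integer bucket number.
    bucket = int.from_bytes(digest[:8], "big") % len(VARIANTS)
    return VARIANTS[bucket]


if __name__ == "__main__":
    # Simulate some made-up visitors and count how they are split.
    counts = {name: 0 for name in VARIANTS}
    for i in range(3000):
        counts[assign_variant(f"visitor-{i}")] += 1
    print(counts)
```

Hashing rather than picking at random means a returning visitor sees the same version each time, which keeps the groups from contaminating each other.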

Results

Normal

Average Score: 0.94

Average Time to Complete: 4.02 minutes

Average Satisfaction: 8.95

Results: 23 out of 43 impressions

Grey on white

Average Score: 1.67

Average Time to Complete: 6.74 minutes

Average Satisfaction: 7.94

Results: 18 out of 42 impressions

White on blue

Average Score: 1.48

Average Time to Complete: 4.43 minutes

Average Satisfaction: 9.24

Results: 18 out of 43 impressions
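To make the three sets of numbers easier to compare, the figures above can be put on a common scale. This is a rough sketch only: it assumes the score is out of 3 (one point per question), and the `summarise` helper is just a name I picked.

```python
MAX_SCORE = 3  # assumption: one point for each of the three questions

# (average score, completed responses, impressions) as reported above
results = {
    "Normal": (0.94, 23, 43),
    "Grey on white": (1.67, 18, 42),
    "White on blue": (1.48, 18, 43),
}


def summarise(avg_score, responses, impressions):
    """Express the score as a share of the maximum and the completion rate."""
    return {
        "share_correct": avg_score / MAX_SCORE,
        "completion_rate": responses / impressions,
    }


baseline = summarise(*results["Normal"])
for name, row in results.items():
    s = summarise(*row)
    # Gain over the normal page, in percentage points of the maximum score.
    gain = (s["share_correct"] - baseline["share_correct"]) * 100
    print(f"{name:14s} correct {s['share_correct']:.0%}  "
          f"gain {gain:+.0f} pts  completion {s['completion_rate']:.0%}")
```

On that assumption, the two harder to read pages score roughly 18 to 24 percentage points higher than the normal page, while their completion rates are about 10 points lower.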

Discussion

My hypothesis appears to have been confirmed. These results should of course be treated with caution given such a small sample size, but there was a vast difference in results.
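To give a feel for how much a sample of this size can wobble, here is a hedged sketch that puts a rough 95% interval around the completion rates. I only have per-group counts to hand for completions, so the score difference itself is not tested here. The Wilson score interval is my choice of method for the illustration, not something taken from the studies mentioned above.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half, centre + half


# (completed responses, impressions) for each version of the page
completions = {
    "Normal": (23, 43),
    "Grey on white": (18, 42),
    "White on blue": (18, 43),
}

for name, (done, shown) in completions.items():
    low, high = wilson_interval(done, shown)
    print(f"{name:14s} {done / shown:.0%} completed "
          f"(roughly {low:.0%} to {high:.0%})")
```

With around 40 impressions per page the intervals are wide and overlap one another, so the gap in completion rates on its own could easily be noise.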

It does not appear that anybody discovered the true purpose of the test; however, there were many puzzled comments along the lines of "I don't know what I just did".

Some of the other conclusions I have drawn are:

  • The average scores were much lower than those of the Harvard studies described above. That is to be expected, though, given the selection bias involved in conducting the test on Harvard students and the difference in medium (students paid to take the test versus a voluntary
     

  • The completion rates for the poorly designed (i.e. harder to read) surveys were noticeably lower (18 out of 42 and 18 out of 43 impressions, versus 23 out of 43 for the normal page). These were also the surveys from which outlier answers to the CRT had to be removed (nonsense answers like "ADSAD"). It is apparent that a poorly designed survey will generate fewer results and more low effort answers (although that is to be expected).
     

  • While using a very light shade of grey on white does generate the best scores, it also carries the biggest time penalty and led to the lowest overall satisfaction score.
     

  • Overall, I would recommend the "opposite" colour scheme because it has a lower time penalty than light grey but still generates a significant increase in scores. This accords with my own experience that reading from paper makes it easier to pick up typos (it is not the extra time spent that is critical but the unfamiliarity).

    Also, in the context of written text, it is extremely rare that you would apply a background colour to an entire page. However, it is much more likely that you would apply different colours to the text in headings. For practical purposes, this means that you can switch the background colour back to the original without too much difficulty; however, it will be much more difficult to reverse the text colour change without also reversing the corrections made to the text. This is the same reason why changing fonts, as described in the above study, is also difficult (because it is common for headings and body text to use different fonts).