Does teaching philosophy to children improve their reading, writing and mathematics achievement?
(Guest post by @mjinglis)
I’ve been getting a bit concerned that the EEF’s evaluations of educational methods,
which were meant to help provide a more solid evidence base for teaching,
are actually leading to the same sort of unreliable research and hype that we have seen all too often in educational research.
The following guest post is by Matthew Inglis (@mjinglis),
who kindly offered to comment on a big problem with the recent, widely-reported study
showing the effectiveness of Philosophy for Children (P4C).
On Friday the Independent newspaper tweeted that
the “best way to boost children’s maths scores” is to “teach them philosophy”.
A highly implausible claim, one might think: surely teaching them mathematics would be better?
The study which gave rise to this remarkable headline was conducted by
Stephen Gorard, Nadia Siddiqui and Beng Huat See of Durham University.
Funded by the Education Endowment Foundation (EEF),
they conducted a year-long investigation of the ‘Philosophy for Children’ (P4C)
teaching programme. The children who participated in P4C engaged in group dialogues on important philosophical issues – the nature of truth, fairness, friendship and so on.
I have a lot of respect for philosophy and philosophers.
Although it is not my main area of interest, I regularly attend philosophy conferences,
I have active collaborations with a number of philosophers, and I’ve published papers
in philosophy journals and edited volumes.
Encouraging children to engage in philosophical conversations sounds like a good idea to me.
But could it really improve their reading, writing and mathematics achievement?
Let alone be the best way of doing this?
Let’s look at the evidence Gorard and colleagues presented.
Gorard and his team recruited 48 schools to participate in their study.
About half were randomly allocated to the intervention: they received the P4C programme.
The others formed the control group. The primary outcome measures were Key Stage 1 and 2 results for reading, writing and mathematics. Because different tests were used at KS1 and KS2,
the researchers standardised the scores from each test so that they had a mean of 0
and a standard deviation of 1.
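The standardisation step can be sketched in a few lines of Python. This is only an illustration: the marks are invented, the variable names are hypothetical, and the use of the population standard deviation is an assumption rather than a detail taken from the report.

```python
from statistics import mean, pstdev


def standardise(scores):
    """Return z-scores: subtract the mean, then divide by the standard deviation."""
    m = mean(scores)
    s = pstdev(scores)  # assumption: population SD; the report's choice may differ
    return [(x - m) / s for x in scores]


# Hypothetical raw results on two different scales (invented for illustration).
ks1_points = [13, 15, 15, 17, 19, 21]
ks2_marks = [38, 52, 61, 70, 77, 92]

ks1_z = standardise(ks1_points)
ks2_z = standardise(ks2_marks)

# After standardising, both sets have mean 0 and SD 1, so a pupil's
# position relative to their cohort can be compared across the two tests.
print([round(z, 2) for z in ks1_z])
print([round(z, 2) for z in ks2_z])
```

Once both tests are on this common scale, a gain score is simply the KS2 z-score minus the KS1 z-score.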
The researchers reported that the intervention had yielded greater gains for the treatment group than the control group, with effect sizes of g = +0.12, +0.03 and +0.10 for reading, writing
and mathematics respectively. In other words, the rate of improvement was around a tenth
of a standard deviation greater in the treatment group than in the control group.
These effect sizes are trivially small, but the sample was extremely large (N = 1529),
so perhaps they are meaningful. But before we start to worry about issues of statistical significance,
we need to take a look at the data. I’ve plotted the means of the groups here.
Any researcher who sees these graphs should immediately spot a rather large problem:
there were substantial group differences at pre-test.
In other words, the process of allocating students to groups, by randomising at the school level,
did not result in equivalent groups.
Why is this a problem? Because of a well known statistical phenomenon called regression
to the mean. If a variable is extreme on its first measurement, then it will tend to be closer
to the mean on its second measurement. This is a general phenomenon that will occur
whenever two successive measurements of the same variable are imperfectly correlated.
Here’s an example from one of my own research studies
(Hodds, Alcock & Inglis, 2014, Experiment 3). We took two achievement measurements
after an educational intervention (the details don’t really matter),
one immediately and one two weeks later. Here I’ve split the group of participants into two –
a high-achieving group and a low-achieving group –
based on their scores on the immediate post-test.
As you can see, the high achievers in the immediate post-test performed worse
in the delayed post-test, and the low achievers performed better.
Both groups regressed towards the mean. In this case we can be absolutely sure
that the low-achieving group’s ‘improvement’ wasn’t due to an intervention
because there wasn’t one: the intervention took place before the first measurement.
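The same pattern is easy to reproduce with simulated data. The sketch below is not the Hodds, Alcock & Inglis data: it assumes each participant has a stable true score, that each test adds independent noise, and that the groups are formed by a median split on the first test. All of these are illustrative assumptions.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Assumption: a stable "true" score per participant, plus independent
# measurement noise on each test. Nothing happens between the two tests.
n = 2000
true_scores = [random.gauss(0, 1) for _ in range(n)]
immediate = [t + random.gauss(0, 1) for t in true_scores]
delayed = [t + random.gauss(0, 1) for t in true_scores]

# Assumption: split into high and low achievers at the median of the
# immediate test.
cutoff = sorted(immediate)[n // 2]
high = [i for i in range(n) if immediate[i] >= cutoff]
low = [i for i in range(n) if immediate[i] < cutoff]


def group_means(indices):
    """Mean immediate and mean delayed score for one group."""
    first = mean(immediate[i] for i in indices)
    second = mean(delayed[i] for i in indices)
    return first, second


high_first, high_second = group_means(high)
low_first, low_second = group_means(low)

print(f"high achievers: {high_first:.2f} -> {high_second:.2f}")
print(f"low achievers:  {low_first:.2f} -> {low_second:.2f}")
```

The high group's mean falls and the low group's mean rises on the second test, even though the simulation contains no intervention at all.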
Regression to the mean is a threat to validity whenever two groups differ on a pre-test.
And, unfortunately for Gorard and colleagues, their treatment group performed
quite a bit worse than their control group at pre-test. So the treatment group was always going
to regress upwards, and the control group was always going to regress downwards.
It was inevitable that there would be a between-groups difference in gain scores,
simply because there was a between-groups difference on the pre-test.
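A rough sense of the size of this effect comes from a standard result. If pupils are drawn from a single population, scores are standardised, and individual pre-test and post-test scores correlate at r, then a pupil's expected post-test score is r times their pre-test score, so their expected gain is (r - 1) times their pre-test score. Averaging over each group gives the expected difference in gains when the intervention does nothing. The sketch below uses invented values for the pre-test gap and for r (neither is taken from the report), and it ignores the clustering of pupils within schools, which would change the size of the effect.

```python
def expected_gain_gap(pretest_gap, r):
    """Expected treatment-minus-control difference in gain scores
    when the intervention has no effect at all.

    pretest_gap: treatment mean minus control mean at pre-test, in SD units.
    r: correlation between individual pre-test and post-test scores.
    """
    return (r - 1.0) * pretest_gap


# Hypothetical inputs, chosen only to illustrate the arithmetic.
hypothetical_gap = -0.20  # treatment group starts 0.2 SD below control
hypothetical_r = 0.70     # assumed pre/post correlation

gain_gap = expected_gain_gap(hypothetical_gap, hypothetical_r)
print(round(gain_gap, 3))
```

With these invented inputs, the treatment group would be expected to out-gain the control group by about 0.06 standard deviations with no treatment effect whatsoever. When r is 1 the expected gap disappears, which is why the problem only arises when the two measurements are imperfectly correlated.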
So what can we conclude from this study? Very little. Given the pre-test scores,
if the P4C intervention had no effect whatsoever on reading, writing or mathematics,
then this pattern of data is exactly what we would expect to see.
What is most curious about this incident is that this obvious account of the data
was not presented as a possible (let alone a highly probable) explanation in the final report,
or in any of the EEF press releases about the study. Instead, the Director of the EEF was quoted
as saying “It’s absolutely brilliant that today’s results give us evidence of [P4C]’s positive impact
on primary pupils’ maths and reading results”, and Stephen Gorard remarked that
“these philosophy sessions can have a positive impact on pupils’ maths, reading
and perhaps their writing skills.” Neither of these claims is justified.
That such weak evidence can result in a national newspaper reporting that
the “best way to boost children’s maths scores” is to “teach them philosophy”
should be of concern to everyone who cares about education research and its use in schools.
The EEF ought to pause and reflect on the effectiveness of their peer review system
and on whether they include sufficient caveats in their press releases.