Many people who teach English reading and spelling are under the impression that, as they talk, English speakers select from 44 sounds called phonemes. In this picture, we combine these phonemes to create words, which then become the stuff of speech. Some of these experts believe that these facts, when combined with the way letters are used to symbolise these sounds, form the basis of what they call the ‘science’ of reading.
I am always a little sceptical when those holding controversial opinions seize on the word ‘science’ with which to dignify them. The business of science being complex, however, I shall withhold judgement about the appropriateness of the term ‘science of reading’. What I can say with reasonable confidence, though, is that the notion that English speakers form their speech from 44 sounds called phonemes is a frighteningly long way from the truth.
For a start, if every English speaker spoke with the same 44 sounds, we’d have a lot of trouble distinguishing each other’s voices. Quite evidently, the noises we each produce when we speak are different. Recordings of two people from exactly the same part of the south of England would reveal, for example, that the ‘u’ sound in ‘mud’ was produced differently enough that, just from hearing that single word, their nearest and dearest could tell immediately who was speaking. Our voices do differ on an individual level, and this is mostly to do with the particular shape of our mouths, nasal cavities, and tongues, along with the tone of our voice box. But it is also, of course, equally obvious that there would be enough similarity between the noises made by our two speakers for them to be recognised as the same sound for the purposes of speech. These sounds are treated as belonging to the same category on the basis, not of the particular shapes and sizes of the vocal organs, but of what is actually done with them – chiefly, the way in which, and the extent to which, our mouth opens, and the position of our tongue within it. A category of sound defined in this way is what real scientists (that is to say, linguists) call a phone – and certainly not a phoneme.
A phone is thus a class of sounds, which are created in the same way, and which, in general, we recognise when hearing them as being essentially the same, individual voice quality notwithstanding. Given this, I suspect many readers will at this point be highly puzzled, since what I have described as being a phone, they thought was a phoneme. This is a very common confusion, but the two are pretty distinct concepts – unlike a phone, a single phoneme can in no way be clearly identified with a single type of sound.
To understand this point, consider the way that differences in the sounds of speech are much more noticeable when two speakers with different accents pronounce the same word. If, for example, we compared a recording of one of our English southerners saying ‘mud’ with one of someone from the north of England, most listeners would not only be able to distinguish the speakers’ voices, but they would also likely consciously register a different sound being used. This is because in most northern English accents the ‘u’ in ‘mud’ is said roughly so as to rhyme with ‘would’, whereas in the southern accent the two vowels are clearly distinct. In practice, the comment we might make on hearing the difference would not be so much ‘oh, that’s Jim speaking and that’s John’ but, ‘oh that’s a northerner speaking, and that’s a southerner’. The difference in this case is indeed a difference in vocal organ position: the tongue and mouth position is actually different, so that a truly distinct phone is produced by each speaker.
Crucially, though, while there would be a distinction in phones between the northern and southern speakers, this difference would have no impact on meaning: there would most likely be no confusion that the pig might be rolling in ‘mad’ or ‘mid’, rather than ‘mud’. Because the difference between these two sounds has no impact on meaning here, the two phones in this situation can thus be treated as simply two types (called ‘allophones’) of the same ‘super category’ – and it is this super category that linguists generally call a phoneme. A phone, then, is a category of sounds, while a phoneme is a category of categories of sounds.
It follows from the way in which they are defined that phoneme categories are not firm and fixed, as a mathematician might like them – they alter with context. The two variant phones used to pronounce ‘mud’ are allophonic (ie, the same phoneme), since their difference has no impact on meaning. But when a northerner uses the same vowel to pronounce the action of ‘putting’ in golf, the difference is phonemic, since the southerner is at least in some danger of getting confused – ‘did you put or putt the ball in the hole?’ – one’s cheating, the other is the whole point of the exercise. The flexibility of a phoneme across a range of possible sounds is evident, not just across the accents of two speakers, but even within the variations that can occur in one speaker’s speech – such as mine. The difference between the short and long ‘a’ in words like ‘bath’ is perhaps the most powerfully recognisable marker of the difference between a northern and southern English accent. I was brought up in a northern ‘short a’ area of England, but in a southern ‘long a’ family. My accent still wobbles inconsistently between the two, so that this phoneme for me has two quite distinct realisations – one matching the phone I use to pronounce ‘bad’ and the other a longer phone, matching the one I use to say ‘bard’. The central phoneme in ‘bath’ is thus demonstrably unstable in my accent, ranging between two quite distinct sounds.
You might be thinking that your accent, unlike mine, is ‘pure’, and that I am an oddity. Well, your accent may well be purer than mine, but linguistic research is pretty conclusive that very few people speak as consistently as they might think. Another common example of this sort of thing is evident, in British English speakers at least, in the two equally ‘correct’ pronunciations of the past tense of ‘spill’ – ‘spilled’ or ‘spilt’, which is an unusual example only in that it is reflected in two variant spellings. The two pronunciations have the same meaning, so the two sounds – ‘t’ and ‘d’ – are, in this context, allophones of the same phoneme. But the difference between them is obviously phonemic when comparing, for example, the pronunciation of ‘to’ with ‘do’, where the sound change signals two clearly distinct meanings, and so two distinct phonemes.
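For readers who like to see an idea pinned down, the examples so far (‘mud’, ‘put’ and ‘putt’, ‘spilt’ and ‘spilled’, ‘to’ and ‘do’) can be turned into a rough sketch in Python. Everything in it is an assumption made for illustration: the toy lexicon, the approximate IPA symbols, and the function name `swap_outcomes` are mine, and real phonological analysis is far messier. The sketch simply swaps one phone for another in each known pronunciation and records whether the result is still the same word (the pair behaves as allophones there) or a different word (the difference is phonemic there).

```python
# Toy lexicon (an illustrative assumption, not real data): each word lists the
# phone sequences a southern English listener might accept as that word.
LEXICON = {
    "mud": [("m", "ʌ", "d"), ("m", "ʊ", "d")],  # southern and northern vowels
    "put": [("p", "ʊ", "t")],
    "putt": [("p", "ʌ", "t")],
    "spilt/spilled": [("s", "p", "ɪ", "l", "t"), ("s", "p", "ɪ", "l", "d")],
    "to": [("t", "uː")],
    "do": [("d", "uː")],
}


def swap_outcomes(phone_a, phone_b, lexicon):
    """Swap phone_a for phone_b, one position at a time, in every pronunciation.

    Returns (same, different):
      same      - words where the swap yields another pronunciation of the
                  same word (the two phones act as allophones there)
      different - pairs of words where the swap turns one word into another
                  (the two phones contrast, i.e. the difference is phonemic)
    """
    # Index every pronunciation by the words it can stand for.
    by_form = {}
    for word, forms in lexicon.items():
        for form in forms:
            by_form.setdefault(form, set()).add(word)

    same, different = set(), set()
    for word, forms in lexicon.items():
        for form in forms:
            for i, phone in enumerate(form):
                if phone != phone_a:
                    continue
                swapped = form[:i] + (phone_b,) + form[i + 1:]
                for other in by_form.get(swapped, ()):
                    if other == word:
                        same.add(word)
                    else:
                        different.add(frozenset((word, other)))
    return same, different


# The same pair of phones can land in both sets, depending on the word.
vowel_same, vowel_different = swap_outcomes("ʌ", "ʊ", LEXICON)
stop_same, stop_different = swap_outcomes("t", "d", LEXICON)
```

With this toy lexicon, the two vowels leave ‘mud’ unchanged but separate ‘put’ from ‘putt’, and ‘t’ and ‘d’ leave ‘spilt/spilled’ unchanged but separate ‘to’ from ‘do’. Whether a difference between two phones counts as phonemic depends on where it occurs, which is why no fixed list of sounds can capture it.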
The phonemic system is therefore a shifting and flexible structure of differences – not one in which a finite and invariable set of symbols is simplistically and consistently selected, as with a code. This is evident in the speech of foreigners, which can be almost entirely ‘wrong’ when it comes to sound selection, in particular of vowels, and yet still make sense. Read aloud, and compare the following utterances (better still, for the fairest test, read aloud to someone else, and see what they hear):
1. Keen yai tool ma thoo wa te tha boonk?
2. Ken ya till meh thee wee ta thee benk?
Chances are, 2 was much easier to understand. And that’s despite the fact that, just as in 1, all the vowels are wrong. The point is that, while the vowels in 2 are all wrong, they stand in the right relation to each other – they are related to each other in roughly the way they need to be, so that we can readily identify them as allophonic to a ‘correct’ pronunciation. 2 of course is a much more plausible account of what you might hear when listening to a learner of English – non-native speakers come to English with an implicit understanding of phonemic difference derived from their own languages – they may miss the mark with the phones, but they have a reasonable chance of getting the crucial differences at least right enough.
Phew! It’s really no wonder that so many folk get confused about all this, but I do wonder why some of them have to do so in such a shouty ‘we-know-all-the-science-why-can’t-you-hippies-just-get-with-the-programme’ way. Unfortunately, this confusion has led to the idea taking hold among many working in literacy that writing is a straightforward one-to-one ‘coding’ of letters to sounds (this is discussed in ‘The Advantages of Phonics are Immodest’ as the ‘alphabetic thesis’). Thus there is assumed to be a simple and consistent correspondence between a fixed set of physical written symbols – graphemes – and a fixed set of physical sound symbols – phonemes. As we have seen, though, the list of phonemes in a language is not a list of fixed sounds, far from it – it is a rather abstract and flexible system for registering a complex series of structural differences. Writing is therefore a much more subtle interaction between abstract categories, and not a direct one-to-one mapping between fixed sets of physical entities.
And, it should be said, the over-simplification of the ‘decoding’ model occurs at both ends of the story. Just as a phoneme is not a fixed sound, but a flexible class of similar-enough sounds, so a grapheme is not actually a squiggle, but a class of similar-enough squiggles. Somehow, though, we – both educators and people in general – are more reconciled to the subtlety at the writing end of things. We readily accept that, for example, letters have to be written well to be read, but that individuals will each have their own handwriting style, which may well contain perfectly forgivable idiosyncrasies. We know that my written ‘a’ is not the same as yours, or even the same each time I write ‘a’, and that all of my and your written ‘a’s are probably quite different from a printed ‘a’. In some people’s handwriting, an ‘a’ may end up looking more like an ‘e’ in someone else’s. And all this variation is OK, so long as meaning is clear. Exactly the same kind of variation exists in speech but, perhaps because speech is so fleeting – lost in time the moment it is produced – our intuitive sense of the complexity of writing is not matched when we think and talk about speech, with some potentially pretty negative outcomes.
A student once approached me, for example, and complained that they couldn’t spell. They told me that they knew that they didn’t speak properly, and that this was the reason they struggled with spelling. They did not have a speech impediment, but their accent was perhaps a little broader than those of the other students in the class. This somewhat heart-breaking conversation was, I fear, very much the result of an excessively simplistic approach to spelling taken in the instruction I and others had delivered to this young person. We had affirmed the message, again and again, that the key to a word’s spelling was in identifying its phonemes and syllables, and that these could be easily accessed simply by listening to yourself speak. This student, conscious of the natural differences between the phones of their speech and those of their peers, had come to feel that their inability to work out how to spell a word like ‘university’ was down to their sub-standard speech: ‘proper’ speakers talk in phonemes, but I must talk in phonic sludge.
There is, it must be said, a great deal of obvious truth in the alphabetic principle. Quite clearly, writing connects to speech, and the idea of graphemes-as-letters linking to phonemes-as-sounds is undoubtedly helpful, especially for very early readers. The problem comes, as we have seen, when teachers lose sight of the complexity. In my student’s case, the fact that some of their phones were not identical to those of their peers, combined with an instruction that confused phonemes with phones, had led them to the most destructive and harmful misconception about the source of their difficulties.
And it’s not just an issue of accents. Many teachers – and experts – fail to realise, not only that speech is essentially composed of phones not phonemes, but that many of the phones we speak can’t even be clearly categorised as phonemes at all. To see what I mean, try saying the following sounds, as naturally as you can, and saying the bolded parts more strongly:
1. Yoo-niv-ersity
2. Yoo-nav-ersity
3. Yoo-nev-ersity
4. Yoo-nuv-ersity
5. Yoo-nov-ersity
The chances are that, while one or more of these options may have seemed more natural to you, or closer to the way in which you would say ‘university’, none of them is likely to have sounded outright ‘wrong’. This is because the second syllable in this word is unstressed. This unstressed syllable may be said with a whole range of sounds, and yet the meaning of the overall word is perfectly preserved.
Now, we’ve come across something like this before: recall the way that northern England and southern England speakers produced different ‘u’ sounds in ‘mud’. Since this variation had no impact on meaning, the different sounds were designated as allophones – similar enough for meaning that we counted them as being the same phoneme. I suppose that the variant sounds in the pronunciation of ‘university’ could be described as allophones in this way, but this seems to me to be pushing the notion to the limits of its usefulness. The choice here is not just between two quite distinct and relatively stable phones, but a sound whose permissible realisations range dramatically across a wide spectrum of possibility. Even within a single speaker’s speech, the realisation of these unstressed vowels can shift and change from word to word. Indeed, there is such irregularity in their production, so little that actually matters about the way that they are said, that it would perhaps be better to think of them as oral punctuation marks – meaningful only in their capacity to create the necessary breaks between the other, fully meaningful sounds of the word.
This fact has huge implications for spelling. Multisyllable words invariably contain an unstressed vowel, so huge numbers of the phones used in English cannot be properly aligned with a phoneme. The spelling of these vowels thus floats away from the speech of modern English. More often than not it indicates some historic pronunciation, and its ultimate motivation generally has to do with the meaning of the word, or word-part, in which the vowel occurs, and not with how it is said.
Take the word ‘opportunity’ as an example. The over-zealous partisans of the alphabetic principle will usually have taught children to believe rigidly that the spelling ‘or’ represents the ‘or’ sound, as might be found in the word ‘port’. But, of course, rarely does anyone actually say ‘opportunity’ in this way – ‘op-port-tune-it-ty’. That’s because the relevant syllable is unstressed, like the ‘iv’ in ‘university’. You could, if you wanted, say ‘or’ without any harm to meaning, but people generally tend to say something like one of these:
o-pe-chune-et-ee
o-pi-chune-et-ee
o-pa-chune-et-ee
o-pu-chune-et-ee
o-po-chune-et-ee
Given these permissible variations in pronunciation (which, to be clear, are barely audible, given the unstressed quality of the vowel), it’s impossible to simply listen to one’s speech to work out how to spell the word. The only thing you can do is look to the word’s meaning and its history. In fact, the second syllable of ‘opportunity’ is spelt ‘por’ because the word’s origin lies in the Latin expression for a wind which is ‘directed to port’ – this being a pretty good opportunity if you were one of the Roman sea-captains who presumably coined the word.
When thinking about how to spell a word in English, etymology – the history of words – and morphology – the meaning of word parts – are thus just as important as the way that we speak, if not in many cases more so. This fact is hardly a revelation: it has been well understood for as long as there has been a notion of ‘correct’ spelling (which, in the scheme of things, is not all that long – only a couple of centuries or so). It is in danger of being forgotten in some circles, however, and the reason is almost certainly the growing over-reliance on the alphabetic principle in the teaching of the opposite skill to spelling: reading.
The American Congress Report on Literacy, considered by many to be a final word on the research into reading instruction, states pretty clearly that drilled, explicit instruction in the alphabetic principle (what is sometimes called ‘phonics’) is for most students only of much use in their very first year of formal schooling (chapter 2, page 94). And yet, increasingly, primary school children are subjected to programmes of study in the relations between phonemes and graphemes that are considerably longer than this. The UK government has introduced a test for students in these relationships that takes place at the end of the second year of schooling, ensuring that students will have to study this material for well over one year, if not two. And, worse, many private providers of reading programmes encourage schools to buy into packages of three or even four years of this kind of instruction. Older students identified as ‘poor spellers’ are often subjected to even more phonics, which seems pretty perverse, given the way that spelling actually works.
In my experience, most people who use this material presume the misleading definition of a phoneme as a fixed sound, and end up explicitly teaching children that this is the case. And the effect of this overkill has been to skew understanding – in the minds of both students and teachers – about how English spelling truly works.