Universal grammar, the hypothesis that humans have innate structural language rules, has its advocates; it is hard to find any for a universal vocabulary.
If we are not born with any semantic understanding of words, we must learn them. By watching how others use language we acquire knowledge of how words are intended to be used. There are continuing dust-ups on what meaning is, but that meanings (whatever they are) are learned through empirical experience is uncontroversial.
Most semantic knowledge relies on an understanding of words; indeed, this statement is almost tautological. Even our theories of nature, though they may be succinctly stated in mathematical symbols, are empty without words. We might agree the Schrödinger equation is beautiful, but it takes a textbook’s worth of words to ground its abstract elegance.
The greatest part of the edifice of personal knowledge is based on one’s understanding of words. How strong are these foundations? Where do they come from?
Here is a personal calculation:
I include details on how I arrived at these estimates in an appendix section. The figures are based on rules of thumb and coarse estimates of how I spend my time. For example, a film script is about 10 thousand words, suggesting one thousand words for every 10 minutes of film or TV, and I watch about two films’ worth of content in a week.
Much more dramatic than a current weekly total is to consider a lifetime diet of words. Combining the above estimates with ones from other phases of my life gives a figure very close to 1.5 billion words:
No doubt this calculation is imperfect. I’ve not accounted for words heard on the bus, or seen on cereal packets. I’ve brushed aside subtle questions about when a word is a word rather than a patch of squiggles or an incoherent sound. Yet I believe this accurately captures the scale of language encountered in a lifetime.
To add some color to this, the English-language articles on Wikipedia total about 3.9B words. So 1.5B is not too shabby by this measure. On the other hand, it is a fraction of 1% of the words within the 14M books on the shelves of the Bodleian Library.
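As a rough check on these comparisons, here is a minimal sketch of the arithmetic. The lifetime, Wikipedia and Bodleian figures are those quoted above; the 100k words per book is an assumption borrowed from the novel-length figure in the estimates, not a measured average for the library.

```python
# Rough scale comparison; every input is a coarse estimate.
LIFETIME_WORDS = 1.5e9     # lifetime figure quoted in the text
WIKIPEDIA_WORDS = 3.9e9    # English-language Wikipedia, quoted in the text
BODLEIAN_BOOKS = 14e6      # books on the shelves, quoted in the text
WORDS_PER_BOOK = 100_000   # assumption: every book is novel-length


def share(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


wikipedia_share = share(LIFETIME_WORDS, WIKIPEDIA_WORDS)
bodleian_share = share(LIFETIME_WORDS, BODLEIAN_BOOKS * WORDS_PER_BOOK)

print(f"Share of English Wikipedia: {wikipedia_share:.0f}%")
print(f"Share of the Bodleian: {bodleian_share:.2f}%")
```

Under these assumptions a lifetime of words is a little over a third of Wikipedia and roughly a tenth of one percent of the Bodleian.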
It is curious that the written and spoken word vie for my attention with almost equal success. I am also surprised that leisure time provides such a dominant share of my printed and parol nutrition. As is often the case, however, I struggled to disentangle work from leisure in my accounting. For example, much of what I have categorized as leisure reading is work-related. I have not marked it as such since I do it outside of work hours. Perhaps part of the reason leisure outdoes other settings is that in work and education the balance may be towards production rather than consumption — a calculation for another day perhaps.
It is always sobering to see a reminder of the finitude of one’s experience. Though this is but one yardstick to measure the size of a life, it is a significant one. As Fernando Pessoa writes via his mouthpiece Bernardo Soares:
“Prose encompasses all art, in part because words contain the whole world, and in part because the untrammelled word contains every possibility for saying and thinking.”
You’ll find Pessoa/Soares in different spirits some pages apart:
“A man of true wisdom, with nothing but his senses and a soul that’s never sad, can enjoy the entire spectacle of the world from a chair, without knowing how to read and without talking to anyone.”
I don’t think either quote is quite true or is even intended to be so. Yet there is something here. Words are important, but so are the wordless spaces they span.
Over our lives, words will be granted to us day by day, the wages of living. They take us on a journey from knowing nothing but what we see, to inhabiting worlds we could not have imagined. Set your course carefully: you can never see the whole world.
Notes
Quotes are from the Book of Disquiet by Fernando Pessoa, translated by Richard Zenith.
It is germane to note that there are proponents of deep physics (an innate feeling for how objects behave) and also of deep psychology. The implication would be that while we are not born knowing the names for certain categories, such as hard, soft and stationary, we have conceptual slots waiting for a label to be attached. If this is the case, it could well be that our vocabulary learning is underdetermined by the data. Such a view extends Chomsky’s poverty of the stimulus argument from grammar universals to encompass lexical learning.
Estimates
My current consumption:
My current consumption | Basis of estimate | Weekly words |
Books | One novel-length book per week | 100k |
News & periodicals | ⅓ of The Economist each week & a miscellany of other articles | 50k |
Conversing socially | The amount of time I spend listening to flowing speech socially compresses to perhaps 45 minutes per day (close to 90min of two-way conversation). I assume 125 words per minute. | 40k |
Podcasts & radio | My mileage varies considerably. I often listen to podcasts while doing chores. Estimate: 45 minutes per day. | 40k |
TV & film | There are ~10k words in a movie which suggests a simple rule of thumb: 1k words for every 10 minutes of content. I, almost religiously, watch one film per week, and about the equivalent length of TV | 20k |
Social media | I rarely venture onto FB and Twitter and I confess to only reading a portion of group conversations on WhatsApp | 3k |
Meetings | 2 to 3 hours in meetings on a typical day. Again assuming ~2 words per second but discounting ~15% of these (since they come from me) | 85k |
Emails | A fairly long email is around 500 words, most are short and many I skim | 15k |
Slack | At Opensignal we rely on Slack a lot. While the quantity of messages eclipses email, they are (typically) concise. | 20k |
Misc. work reading | Pull requests, papers, Jira tickets | 20k |
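The table can be tallied to give a weekly total and a split between reading and listening. The sketch below only sums the figures above; the labelling of each row as "written" or "spoken" is an assumption made for this illustration.

```python
# Weekly word estimates (in thousands) copied from the table above.
# The "written"/"spoken" label on each row is an assumption for illustration.
CURRENT_WEEKLY_K = {
    "Books": (100, "written"),
    "News & periodicals": (50, "written"),
    "Conversing socially": (40, "spoken"),
    "Podcasts & radio": (40, "spoken"),
    "TV & film": (20, "spoken"),
    "Social media": (3, "written"),
    "Meetings": (85, "spoken"),
    "Emails": (15, "written"),
    "Slack": (20, "written"),
    "Misc. work reading": (20, "written"),
}


def totals_by_mode(table: dict[str, tuple[int, str]]) -> dict[str, int]:
    """Sum weekly words (thousands) for each mode of consumption."""
    totals: dict[str, int] = {}
    for words_k, mode in table.values():
        totals[mode] = totals.get(mode, 0) + words_k
    return totals


current_totals = totals_by_mode(CURRENT_WEEKLY_K)
current_total_k = sum(current_totals.values())

print(current_totals)
print(f"Total: {current_total_k}k words per week")
```

On this labelling the week comes to about 208k written and 185k spoken words, roughly 393k in all.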
My forecast consumption:
Consumption forecast: retirement | Basis of estimate | Weekly words |
Books | One novel-length book per week | 100k |
News & periodicals | Roughly the same as current levels | 50k |
Conversing socially | Assuming a 50% increase | 60k |
Podcasts & radio | Doubling to 90 minutes per day | 80k |
TV & film | Assuming three films’ worth per week | 30k |
Meetings | I will keep myself busy with something, perhaps an hour a week on Handforth parish council | 6k |
Social media | Against the tides of fashion, I plan to increase my social media engagement | 6k |
Emails | The idea of a plentiful correspondence appeals to me, though I doubt it will reach Darwin’s levels | 8k |
Consumption estimates for my former selves
Consumption estimates: school | Basis of estimate | Weekly words |
Books (home) | One novel-length book per month | 25k |
News & periodicals | A couple of features and several articles per week | 20k |
Conversing socially | Assuming a 50% increase on current levels | 60k |
Podcasts & radio | Fewer podcasts, but quite a lot of radio | 40k |
TV & film | Slightly more than current levels | 20k |
Lessons (talking) | Assuming 15 minutes of talking in each lesson and 6 lessons per day | 45k |
Lessons & homework (reading) | Assuming another 15 minutes of reading in each lesson, another 30 minutes at home. | 120k |
Social media | A tiny bit of MSN | 1k |
Consumption estimates: early years | Basis of estimate | Weekly words |
Books | We spend perhaps an hour a night reading to my son, with many interjections on his part. At nursery, it is probably quite similar. | 35k |
Listening to adults | Even if we’re not talking to him, he’s listening! | 40k |
Conversing socially | There is a lot of talking in the nursery I’m sure! | 90k |
Songs | Assuming 30 minutes per day, 1k for every 10 minutes | 20k |
TV & film | He does watch more TV than I do, perhaps 1 hour per day | 40k |
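For completeness, here is a minimal sketch of how weekly figures like these might be rolled up into a lifetime total. The weekly totals are sums of the four tables above; the phase durations in years are hypothetical placeholders, not the figures used in the essay, and phases without a table of their own (university, for instance) are left out.

```python
WEEKS_PER_YEAR = 52

# phase: (weekly words in thousands, duration in years)
# Weekly totals are sums of the tables above.
# Durations are hypothetical placeholders chosen for illustration only.
PHASES = {
    "early years": (225, 5),
    "school": (331, 13),
    "current": (393, 47),
    "retirement": (340, 20),
}


def lifetime_words(phases: dict[str, tuple[int, int]]) -> int:
    """Total words across all phases, given weekly thousands and years."""
    return sum(
        weekly_k * 1_000 * WEEKS_PER_YEAR * years
        for weekly_k, years in phases.values()
    )


total = lifetime_words(PHASES)
print(f"{total / 1e9:.2f} billion words")
```

With these placeholder durations the sum lands in the same range as the lifetime figure quoted in the essay, though different durations would shift it materially.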