Fiat Lex: A Dictionary Podcast

Getting A Word Into The Dictionary

May 17, 2018

Welcome to Fiat Lex, a podcast about dictionaries by people who write them! Yes, really.

Meet Kory and Steve, your intrepid and nerdy lexicographer-hosts who will give you the drudge's-eye view of English and dictionaries in all their weirdness. In our first episode, we:

- blow your minds by telling you that "the dictionary" doesn't exist;
- talk about how new words get into dictionaries (not by petition, so STOP ASKING) and how that's not as straightforward a process as you would think;
- explain how lexicographers find new words, which sometimes involves beer and diapers;
- touch on how words get taken out of dictionaries, and how that's not as straightforward a process as you would think, either. Assuming you think about such things. (Who are we kidding here?)

- Kory spells a word aloud correctly, which will probs never happen again;
- Steve channels Chumley the Walrus and then goes right into fancy linguist talk about velars and coronals;
- Tennessee represents!


Steve:   Hi, I'm Steve Kleinedler

Kory:     and I'm Kory Stamper.

Steve:   Welcome to Fiat Lex,

Kory:     a podcast about dictionaries by people who write dictionaries.

Steve:   We're so glad you're here listening to us talk about this. So we've been thinking about doing this for a while.

Kory:     Yeah, and we just want to give you a little intro. What's the whole point of doing a podcast about dictionaries? Well, dictionaries have lots of interesting information in them and everyone uses them.

Steve:   And who are we, you might be wondering? Why should you be listening to us as opposed to anyone who has a concrete thought about anything under the sun? Kory and I have both worked on dictionaries for several years. I was on staff with the American Heritage Dictionary for over 20 years,

Kory:     and I was on the staff of the Merriam-Webster dictionaries for over 20 years. Gosh, we've probably got 50 years of editing experience between us.

Steve:   Yeah. Especially if you count all the stuff we did beforehand. I worked on a lot of dictionaries for a company that was called National Textbook Company that has since been eaten and subsumed by other media conglomerates. They might be part of Tronc now for all I know.

Kory:     TRONNNC

Steve:   The Tribune group. And my background is I have a degree in linguistics. I took a lexicography course at Northwestern and I started getting freelance work from my professor after I graduated, and one thing led to another, as they say.

Kory:     And I have no degree in linguistics. I have a degree in medieval studies and I fell into this job-- literally, almost tripped on a newspaper which had the want-ad for the Merriam-Webster position.

Steve:   Well, medieval studies though, are hugely important in this field from the standpoint of etymology or just understanding how words work.

Kory:     Yeah, that's true. There are a lot of medievalists in dictionary companies. We could run our own Ren Faire.

Steve:   Yes. And that ties in also--we have both written books. I have written an English textbook called "Is English Changing?" published by Routledge and the Linguistic Society of America,

Kory:     And I have written a not-textbook, regular-book, called "Word by Word: The Secret Life of Dictionaries," which is out in paperback this year.

Steve:   And in that book you can find out how Kory literally tripped over a newspaper and ended up in the position that she did.

Kory:     So to speak. All right, so again, dictionaries. What are they? Why are they? Who uses them? Who cares?

Steve:   Everyone uses them to some extent, whether-- Even though people may not use print ones as much as people used to, certainly people look up words all the time, whether they enter terminology into a search bar or look it up in print. That content comes from somewhere.

Kory:     And we are the people who write that content. One of the questions we get all the time and we thought would be a great question to address today in our inaugural podcast, is how words get into the dictionaries that you use

Steve:   and how they get out of them.

Kory:     Yes. Yeah. Let's talk about--let's talk about how words move in and out.

Steve:   Well, it's important to note that some people-- you hear people refer to "The Dictionary" as if there were only one and one authority, kind of like the Bible--which is also laughable because there's multiple versions of the Bible as well. Dictionaries are still in the process of being written, compiled, dictionary entries are being drafted, edited, written, and existing ones change over time.

Kory:     Yeah. And not only do they change, but different dictionaries serve different purposes. So different definitions are going to look different depending on who the audience is, who's--which company's writing those dictionaries. You know, Steve and I wrote for different dictionary companies though everyone assumes that we wrote "The Dictionary."

Steve:   Everyone also assumes that we're constantly at war.

Kory:     We're not, we're buddies.

Steve:   We are. We're friends.

Kory:     Yay, friends forever!

Steve:   And as Kory mentioned, there are different audiences for dictionaries, not just different companies. So you could, for example--there are several different legal dictionaries out there and they are going to take a more ingrained approach to the legal defining than a general purpose dictionary will. And you will find all sorts of dictionaries. Slang dictionaries, for example.

Kory:     Yep. So, so with that in mind, we'll just talk about general dictionaries, which are dictionaries that we've both worked on. So how do words get into the dictionary?

Steve:   The answer is not whimsy.

Kory:     Sadly. So quit asking me to put your damn word in the dictionary.

Steve:   Oh, actually: we're talking about how words don't get put in dictionaries, but a good way to not get a word included in a dictionary is to write to a dictionary company and say, "Hey, I invented this word," or "I think we should add this word." Even if you are a third grader who writes a very cute, plaintive letter. Sorry, but that's not how it works.

Kory:     Those are the worst letters, too, because we have to write back and say "no," which is, you know...I mean.

Steve:   Who wants to shatter the dreams of a third grader?

Kory:     Yeah. We are basically just autonomous thesauruses, but we still do have feelings. We don't like hurting other people's feelings. The way that words get in generally is through usage. Not usage as in, like, "I'm writing a dictionary and I've used the word now in print once, and so, enter it," but sort of sustained and widespread usage. And, generally, written usage, which is kind of a bugbear, but that's what we got.

Steve:   It also depends on the kind of word: you know, what realm it is, what category it falls into. Some words--and these are in the vast minority--have a very easy path. So if you are a scientist who has synthesized a new chemical element, you and your team get to name that, and as long as the governing board approves it, that's the name. And you know what? In it goes, because the people in charge said so. So tennessine, for example, which was synthesized by researchers in several universities in the state of Tennessee, [they] named element 117 that. And uh, there you go. That's all you need.

Kory:     Tennessine?

Steve:   Tennessine.

Kory:     T-e-n-n-e-s-s-i-n-e? How do you spell it?

Steve:   [Chumley the Walrus voice] That's right, Charlie.

Kory:     [laughter] The amazing thing is that I just spelled that aloud, and I can't actually spell aloud.

Steve:   And that was a Chumley the Walrus imitation. I'm dating myself there. [Chumley the Walrus voice] Sorry, Tennessee.

Kory:     Alright, so usage. I said "written usage" and this is a bugbear. But the reason that we use written usage is it's a standard way that we can do it. So why don't we take spoken usage? Because that's actually how words get created first, is usually in speech. They usually don't get written down first.

Steve:   The words that are used in the spoken vernacular are completely 100 percent valid. And there are outfits out there that track this type of thing. Corpuses, which are large collections of words. There's some corpuses that compile written documentation and other ones that compile samples of recorded speech. Dictionaries, however, tend to focus on words that have been written. Generally, but not always, and more so in the past than now. Not just written, but from edited sources.

Kory:     Yeah. Edited, prose sources. So poetry doesn't really count, because you can use a word with a really nonstandard meaning in poetry--or with no meaning in poetry, you can just use it for sound. But part of the reason that's difficult is because we now have access to more transcripts of spoken English, and the problem with that as a lexicographer is, it's really actually hard to transcribe a word you've never heard before from speech into print. You can misspell it, you can mishear it. You can not understand the context. So. That's one of the reasons why we focus on written, edited English. Though the "edited," even that's kind of going away these days.

Steve:   More and more, you will see references to things in blog posts which aren't always edited, or even, you know, the comment section, or that kind of thing. And as to the spoken ones, you can phonologically determine the phonemes that are used. But if you were transcribing-- it's the same problem that newspaper journalists have in quoting people. Usually the quoted English in newspaper articles is written out in standard English. Even though when you speak informally, you're changing the velar "-ng" at the ends of words like "going" to the coronal "-n," like "going" to "goin'", and you're probably not going to write "g-o-i-n-apostrophe" in most examples of written transcriptions. However, that is what is being said. So, would you include that? Would you not? In the past when you had the finite print page, that limited what you could put into a book. Especially when there's a regular phonological change like that velar to coronal nasal pattern that I mentioned.

Kory:     Right. So the other thing that's interesting about this is, this is how all words get in, and the way that you find new words to put into the dictionary has also--I think it's changed over even the last 10 years.

Steve:   Absolutely. In the past, when I first started, you had boxes and boxes of note cards on which someone had dutifully typed or printed out and pasted onto that note card, a usage of that word, also known as a "citation." But even in the nineties when I started, that shoe box of cards was already supplemented with returns from what we call a KWIC concordance. This program that overlays on top of a large corpus. You can search on a specific word and it will show you every instance of that word with five or 10 or 12 words, whatever you decide on either side of it, to get some context by it. So even in the nineties--and before then, I just wasn't working before then-- you're juggling these cards and these citations in your concordance.
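[Editor's sketch: the KWIC ("keyword in context") concordance Steve describes can be approximated in a few lines of Python. This is only an illustration of the idea, not the software either dictionary staff used; the function names `kwic` and `format_kwic`, the `width` parameter, and the sample sentence are all made up for the example.]

```python
import re

def kwic(text, keyword, width=5):
    """Return (left context, keyword, right context) for every hit.

    `width` is the number of words kept on either side, the "five or
    10 or 12 words, whatever you decide" from the description above.
    """
    # Crude tokenizer (an assumption): runs of letters, digits, apostrophes.
    tokens = re.findall(r"[A-Za-z0-9']+", text)
    target = keyword.lower()
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

def format_kwic(hits):
    """Right-align the left context so the keyword lines up in a column."""
    pad = max((len(left) for left, _, _ in hits), default=0)
    return [f"{left:>{pad}}  {word}  {right}" for left, word, right in hits]

# Hypothetical sample text, standing in for a large corpus.
sample = ("Hanging chad was on every ballot. Nobody expected a pregnant "
          "chad to matter, but each chad counted.")

for line in format_kwic(kwic(sample, "chad", width=2)):
    print(line)
```

[Lining the keyword up in one column is what makes a concordance quick to scan: an editor can read down the page and see at a glance which words tend to sit on either side of the term.]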

Kory:     But even the way that we got citations I think has changed. It used to be--so at Merriam-Webster, it used to be that all of the editors read for at least an hour, maybe two hours a day. We had a source list that was a list of magazines, journals, books--not just journals and magazines, but trade journals, specialty journals. And we would go through as an editorial floor and divvy stuff up and say, "You're going to be the one who's reading _National Review_ and _The Nation_," and you would read-- I mean, ideally you read every issue that got delivered to you, and you read looking specifically for words that caught your eye, which were generally new words or new uses of old words. And that's how we used to get citations. This was before these, these big corpora were available. I mean, not just available for purchase, but just available, period.

Steve:   The first edition of the American Heritage Dictionary back in the sixties used a corpus called the Brown Corpus, from Brown University. But in addition to these collected citations. So corpus material had always been used. However, editors still read in the manner Kory described and collected citations well into the mid-2000s, by which time, you know, much like every other corporation in the world, outside pressures meant more people were doing more things. And that was one thing that, because information was so much more easily obtainable, reading time for markup decreased over the years. But it wasn't just books or periodicals that you were assigned to. I remember once when we were discussing what the proper plural of "pierogi" is--is "pierogi" a plural? You know, those little Polish potato dumplings? Is the singular "pierog," which is what it would be in various Slavic languages, but not in English? I took a box of Mrs. T's Pierogies and cut the carton and pasted that onto a note card as citational evidence. And you will find in the files, not just handwritten stuff from way back when or taped or glued-on photocopies. But sometimes you will find like portions of boxes or whatnot appended to these note cards.

Kory:     Oh yeah. I used to bring in things. At Merriam-Webster, we had a filing cabinet where you put all of your marked materials, and we had a typists' room--these poor women, their whole job was to type up citations and put them in our database and put them on cards. And I remember one day coming in--it was really early, early on in my time--coming in and someone had put like a Lean Cuisine box in the marking pile, and I went to go throw it away because I thought it was trash, and I saw someone had marked it. And then I went crazy. I think I've marked beer bottles and left them there. I remember marking diaper boxes when my kids were little. People mark menus, take-out menus--

Steve:   What's with the focus on food that we're all marking?

Kory:     I'm really hungry. Yeah.

Steve:   Speaking of those poor women, we had a poor intern in the early 2000s--for some reason we had our main citation file, but there was also a separate one that had been started for a separate purpose. And it was annoying because you always had to check in two places. So over the course of three summers with three different interns, they had to alphabetize this smaller set of cards into the main ones--which, not only putting it in the right place, but then that of course forces everything back.

Kory:     Right.

Steve:   So it was, for three summers, this is basically what a college student did.

Kory:     That's life skills right there. I'm sure that's worth some kind of college credit.

Steve:   Yeah. And so through examining these citations, you find evidence of how long a word might have been used, how widespread it is. We generally don't enter terms that are hyper-specific to one, you know, one occupation or one location. It's a general purpose dictionary. So there's usually some type of general frequency. By the time a specialized term has also reached the general public, that's one indication that it's time to go in.

Kory:     Yeah. And I think the rate at which some specialized terms sort of become widespread is different. So I remember, both "AIDS" and "SARS" got into Merriam-Webster dictionaries really quickly, because it was, just sort of--all of that evidence was there right away. You knew that these were syndromes and diseases that were not going to go away.

Steve:   Ditto with us for "Zika."

Kory:     Yep. But the other thing that's really interesting is that, when you've got sort of this big body of words in front of you, you also see these really weird patterns of usage. Like, sometimes you'll have a word show up in print once every couple of years or once every five or 10 years, and then boom. And other times you have a word that shows up and booms right away, and then drops out of use really quickly. And particularly in the old days, when everything was dead-tree publishing, you couldn't justify entering a term that was brand-new unless you could justify that it was going to be around for another 10 years, because that was the lifecycle of a dictionary revision. And I mean, it sounds ridiculous, but in print publishing, you can't afford two or three lines on a page for a word that is just not going to be common in five years.

Steve:   It's this test of ephemerality that used to be very important. Of course, nowadays you can just add a term online, and it won't necessarily make it into print. I remember one of the very last words we entered for the fourth edition of the American Heritage College Dictionary was "dotcom," and it was, this was still in the late '90s. It was, I think, right before or during the bubble. It was probably a little sooner than we normally might have, but it was like, "all right, this is now or never. This word is probably going to stick around." In that case, it's like, let's err on the side of caution and put it in. But even at that point, the writing was on the wall, as they say.

Kory:     Yeah. And often, I mean, I don't know if it was like this for you, but I often found whenever we did revisions and we started looking through the citational evidence, I would always find more and more and more words to enter. And then you have to do this very weird--you have to get very choosy in weird ways.

Steve:   Or, if you're working on a printing--and again, this refers back to the day of... Did I just use "refer back" right? Is someone going to ding me on that?

Kory:     Sure, I don't care.

Steve:   I don't care either. Ding me if you want.

Kory:     Sense two! Sense two of "ding."

Steve:   Yes. Uh--what were we talking about? Referring back? What am I referring back to?

Kory:     To print.

Steve:   Oh, right. So if you're doing a new printing and, say, someone has died and you have to "open that page" to fix the death date, then you can go anywhere on that page! It's like, "oh, I can add this, I can add this." So just by the sheer alphabetic accident of where the word falls, it's like, "This page is open, I can insert this word." Whereas if it was spelled slightly different and fell on a different page, you might not have been able to do that.

Kory:     Right. And which kind of--so, this underscores something that's really interesting too about dictionaries: that nobody realizes dictionaries are a commercial proposition. Everything is driven by how much will it cost, how much time will it take, will we recoup our expenses? And that's just, you know, that just doesn't happen very much with language.

Steve:   Here's an anecdote. The fourth edition of the American Heritage Dictionary was in full color.

Kory:     Oh ho ho.

Steve:   Which of course was expensive, but one thing it did: because the headword was in its own color, it meant that you didn't have to reverse-indent the entry.

Kory:     Ooooh.

Steve:   And because of that, the entries could be flush on the left margin, which gained us, like, two characters for every line of an entry after the first line. The savings in space by getting those extra two characters aligned was one of the things that offset the cost of going into color. But of course, then we ate it up by just cramming that much more into it. The amount of space--I mean, when people...And this ties into our next bit about how do words come out of a dictionary (and the short answer is, not often), when we talked about all the new words that were added to the Fifth Edition that weren't in the Fourth Edition, and people said, "Where'd the space come from, it's the same length?" A lot of it was interesting design choices. Oh-- I'm sorry, that was between the Third and the Fourth. The fact that you didn't have to take up that space for the indent saved us, you know, allowed us to keep thousands of words. I mean, when you look at two characters per line, over 2000 pages, that really adds up.

Kory:     And you know, when people ask about getting a word into the dictionary, one of the other parts of the commercial bit that no one realizes is that, you know, we are _never going to be caught up_ with getting words into the dictionary. We are always, always, always behind, always having to make these weird editorial choices that are half-based on, is this page going to be open? Or if you're going online, even, how many people can we get on staff who are going to be able to do this kind of defining quickly? And then we need to have someone proofread it, and we have to have someone copy edit it, and then the pronunciation editor needs to go through it, and then the etymologist needs to go through it. It's not just me farting around at my laptop saying, "I'm going to enter the word 'CRISPR' today!" That doesn't happen. It still needs to go through, you know, anywhere from five to 10 other sets of eyes before it makes it online.

Steve:   "CRISPR" the gene editing?

Kory:     Oh yeah. Naturally.

Steve:   Shout out to Carl Zimmer. We can tweet at him after this podcast now.

Kory:     So, so that's how words get in. It's through written usage. That's not historically always been how it is. The earliest English dictionaries, the word lists were just sort of... In the 1600s and early 1700s, they were mostly just words that the single author thought of. So whatever they thought was worth entering, whatever they thought was worth studying. So early dictionaries were hard-word dictionaries mostly, and they were written mostly by wealthy white dudes.

Steve:   And then, we're, of course, talking about living languages. If you are writing a dictionary of a dead language, it is possible to include every word. Because, you know, again, I always go back to Tocharian B. We know what words were used and unless there's another archaeological find where they find more inscriptions, the words that we have are the words that are there. And so you can have that finite list. Kory, how do words come out of a dictionary?

Kory:     With difficulty. So I don't know what the criteria at American Heritage is, but generally speaking, once a word gets into the dictionary, people keep using that word or people feel like they now have license to use that word more. They feel like the word has been made official even though that is not at all what the dictionary does.

Steve:   And like you said earlier, just that test for ephemerality. Because we're not adding words until we think they're going to stick around, there's, there's less chance of a word having to come out because it hasn't stuck. And you never know when it's going to come back to life.

Kory:     Oh God. "Snollygoster"!

Steve:   Oh yeah--you do "snollygoster" and then I'll do mine.

Kory:     "Snollygoster!" So very quickly, the way that we determine whether a word is eligible to be removed from the dictionary at Merriam-Webster is, you need to prove that it has had no significant historical written usage, and that it has no current written usage. And that's within a timeframe of, it really depends, but I think when we were doing the Collegiate, we were aiming for 50 years of no written use. Which, that's actually impossible to find now that everything is digitized. Now you can go on Google Books and you can find one dude in 1956 who has used this word consistently in every article he's written, and that breaks it. So, actually, we enter far more words than we end up taking out. And when we do take words out, it has to be well considered. Enter "snollygoster." So "snollygoster" is a word that's a noun, it refers to a shrewd or unprincipled person. And it was removed from Merriam-Webster's Collegiate dictionary for the 10th edition, I believe. So that would have been '93. And at that point, you know, they reviewed the evidence and said, eh, has a lot of use back in the forties and fifties, but not really much since. And we need the space. You always need the space. So they pulled it out and then it turns out that William Safire _really_ loved the word "snollygoster" and began using it in his columns. And then Bill O'Reilly _really, really_ loved "snollygoster" and began using it on his TV shows. And so for the 11th edition, pretty recently, we had to put "snollygoster" back in, because now people are using it again.

Steve:   And the example I like to use about the danger of removing words: in the late nineties when we were finishing up work on the [Fourth] Edition and we needed space on this one page, we talked about dropping the sense of "chad" associated with punch cards. Because usually when we do drop things for space, they tend to be geographical entries that are suburbs of Los Angeles or Chicago or something that's encyclopedic information. The space is much better used for a vocabulary word. But obsolescent technology is--

Kory:     Oh yeah, that's a big one--

Steve:   It's a fertile ground for possible deletions. And we almost deleted "chad." And then I remembered when it was going back and forth among the editors, I remembered that there were still some states that used punch cards for voting, and we're like, oh, well we should keep it in then. And lo and behold, one year later, right after the book came out, uh, _Florida_. And it's good that we kept it in, because suddenly "chad" was on everyone's lips.

Kory:     Yeah. Hanging chads, pregnant chads--

Steve:   All those chads. Oh Chad.

Kory:     _Chad._

Steve:   So, it's about that time. We hope that you have found this entertaining.

Kory:     Yeah. And if you want to tweet at us, you can tweet at us. We are @FiatLexPodcast, F-I-A-T-L-E-X podcast. One of us will answer you. If you have things you want to hear on the podcast, let us know. Actually, both of these questions, how do words get in and how do words get taken out, were suggested by faithful Twitter followers.

Steve:   Don't tweet at us that "FiatLex" is combining Greek and Latin. We know that and we'll talk about that in a later podcast.

Kory:     Yeah, you'll have to get over that. So thanks for joining us. We'll see you next time.

Steve:   Bye.