Black Box Testing in Linguistics

Nachele Thanhthu and Nyklus Affanita
Dept. of Linguistics and Name Science
Orvall Oryan School for the Onomastically Challenged

“If it’s a black box, one linguist can pick it up; no number can, however, open it. If it’s any other color, it isn’t of any theoretical interest.”
SpecGram Letters Editor

Despite recent stunning advances in neurolinguistics and long-standing claims of “psychological validity” in other areas of linguistics, the human language faculty is still, in the technical sense, a black box. That is, so the metaphor goes, there is no good way to see inside the box to figure out how it works; instead, in a nutshell, the experimenter must surmise the function of the box by comparing inputs and outputs, hypothesizing about the relationships between them, and testing those theories with new inputs.

One generally unchallenged but implicit assumption in the black box metaphor is that the black box doesn’t behave differently because it is being observed, tested, or studied. When dealing with language, that is often not true. Subjects of linguistics experiments who know they are subjects of linguistics experiments may very well behave differently as a result of that knowledge. The assumptions one may confidently draw in such a situation are all of the type that begin with phrases such as: “In a controlled laboratory setting...”. This is akin to training your cat to stay off the dining room table. Usually all you can really accomplish is to keep the cat off the table while you are in the house.

So, we decided to apply a black-box-within-a-black-box test to the study of pragmatics (one of the even murkier portions of the black box of Language) by sending the following question to the editors of several reputable linguistics journals.

“How many linguists does it take to pick up a box from the ground?”

None responded except the editors of SpecGram. By a fortuitous accident, our team of poorly trained and unskilled linguistics undergrads sent two copies of the letter to SpecGram, and the letters editors answered it a second time. Then we intentionally sent it a third, and a fourth, and a fifth time. Apparently, different editors answered each time, and the editorial team as a whole did not notice the repetition.

The responses of the SpecGram editors are reproduced below.

  2. Picking up a box from the ground requires at least 1 + N linguists: one to identify the language in which the action is to be conceptualized, and then a variable number of additional linguists depending on the particular language and the various analytic stances on it among the linguists who study it. After all, in some languages one might not pick a box up from the ground, but rather move it off the ground’s head, or shove it antigroundwards. If you’re studying English, assume N = 250 ±100. Generativists and their modern equivalent can be discounted, since to them, picking up the box is irrelevant; it’s the valid potential for pickupability that’s the key, and that exists even if there are no people to pick up the box, or no box, or no ground. If you’re dealing with a South American language, a good rule of thumb is to assume N = (the staff of the SIL)/5. If it’s a language only studied by the Russians, N will remain undefined, since we can’t read their articles; if it has only been studied by the French, N will likewise remain undefined, since if we read their articles, they will gloat because they’ve forced someone to find French relevant.

  3. At the universities we attended, none. That’s what the grad students are for.

  4. Provided the box isn’t very heavy, most linguists are able-bodied adults of average build, and very few should have any difficulty lifting a box off the ground; or, more specifically, the average linguist would have no more trouble lifting a box off the ground than the average human being. Why would you think otherwise?

  5. You need at least...

    • one armchair theoretician to conceptualize the situation and its desired future outcome
    • one typologist to interpret that into its (formal) grammatical equivalent
    • one lab assistant to write it down and draw a nice picture of it
    • one fieldworker to go out and find an informant who can pronounce what he sees in the picture
    • one phonetician to record what the informant says
    • another phonetician to listen to the first phonetician’s tape and say, “Hmm, fascinating”
    • another typologist who can write a grammar fragment based on the tape’s content
    • another fieldworker to find another informant who can make a grammaticality judgement
    • and probably several others

    It seems unlikely that real linguists would ever get around to actually performing the task, though, unless it’s done as a test in order to elicit a sentence from an informant.

We have to admit to some difficulty determining the scope of the unit being tested. Is it the presumably individual editors who provided such diverse responses? Is it the collective epistolary editorial entity? As this is only an initial investigation into the efficacy and analyticity of the approach, we elected to evade an exact answer, and the audience may adjudge independently as the analysis unfolds.

Response (1) surprised us. We thought our covert experiment had been exposed or recognized, but the eventual presence of response (2) alleviated our fears. Using a new discourse coding system we have been developing over the last decade (Affanita and Thanhthu, to appear), we would mark this response as {+metaphorical}, [+snarky], <+theoretical>, 〘+true〙, and ⁅+unhelpful⁆.

Response (2) surprised us again, considering its length and pointlessly baroque detail. We encoded it as ⁽+elaborate⁾, ⦑+fuzzy⦒, ⦓+numerical⦔, ⦅+snarky⦆, ₍+theoretical₎, ⦉+true⦊, and 『+unhelpful』.

Response (3) was also surprising, but only because we didn’t expect a third response at all. We encoded it as +concise, ⭆+concrete⭅, !¿+degrading?¡, ↜+exploitative↝, ↪+snarky↩, ↷+true↶, and ↺+unhelpful↻.

Response (4) surprised us simply by the lack of surprises it contained. We encoded it ⬝+concrete⬞, ⬨+deliberately_obtuse⬧, ⬖+general⬗, ⭑⭒⭑+obnoxious⭒⭑⭒, ⬔+practical⬕, ⬘+snarky⬙, ⬬+true⬭, and ⬡+unhelpful⬢.

Response (5) was predicted by our Black Box Neural Network Predictive Process™ (Affanita and Thanhthu, to appear) after feeding it the first four responses, except that all the e’s were replaced with q’s and vice versa. Both the accuracy of the prediction and the e/q isomorphism were shockingly surprising. We encoded it ⨭+concrete⨮, ⧰+elaborate⧱, ⦪+fuzzy⦫, ⨫+general⨬, ⪻+meta-theoretical⪼, ⫷+practical⫸, ⨴+predictable⨵, ☻+snarky☺, ⚇+true⚇, and ✿+unhelpful❀.

Unfortunately, we don’t have the space to expound on the ⪻delimiters⪼ used in our feature encoding system, but suffice it to say that the feature names are generally transparent and the delimiter shapes are generally iconic. Even with the potentially limited detail provided by the underspecified encodings, a clear pattern emerges.
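
For readers who would rather check the pattern than take our word for it, the encodings given above can be intersected mechanically. What follows is only a minimal sketch: the feature names are transcribed from the encodings of responses (1) through (5) with the delimiters dropped, and the names `responses` and `common_features` are illustrative assumptions, not part of the coding system itself.

```python
# Feature sets transcribed from the encodings of responses (1)-(5).
# Delimiters are omitted; only the feature names are kept.
responses = {
    1: {"metaphorical", "snarky", "theoretical", "true", "unhelpful"},
    2: {"elaborate", "fuzzy", "numerical", "snarky", "theoretical",
        "true", "unhelpful"},
    3: {"concise", "concrete", "degrading", "exploitative", "snarky",
        "true", "unhelpful"},
    4: {"concrete", "deliberately_obtuse", "general", "obnoxious",
        "practical", "snarky", "true", "unhelpful"},
    5: {"concrete", "elaborate", "fuzzy", "general", "meta-theoretical",
        "practical", "predictable", "snarky", "true", "unhelpful"},
}


def common_features(encodings):
    """Return the set of features present in every response (hypothetical helper)."""
    return set.intersection(*encodings.values())


shared = common_features(responses)
print(sorted(shared))  # ['snarky', 'true', 'unhelpful']
```

A plain set intersection is enough here because the encodings, as reported, carry only positive feature values.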

The features common to all responses are variants of the UR-features +SNARKY, +TRUE, and +UNHELPFUL. In an earlier but as-yet incomplete analysis of the pragmatics of mathematicians’ speech, we discovered a recurrent theme of +TRUE and +UNHELPFUL among mathematicians (Thanhthu and Affanita, to appear). This trend is exemplified in the following sample of reported speech:

Two scientists riding in a hot air balloon had been blown off course by strong winds. They had not seen another person in hours, and they were completely lost.

As they approached a mountain range, they spotted a woman atop a mountain, and yelled to her, “Can you tell us where we are?”

After some thought, the woman replied, “Yes.” The scientists immediately yelled back, “Where are we?”

The woman thought for a few minutes, and just before the scientists floated out of shouting range, she replied, “You are up in a balloon, over a mountain range, travelling at 15 to 20 miles per hour.”

One scientist turned to the other, crestfallen, and said, “Well, she wasn’t very helpful.” The other replied, “Yeah, she was a mathematician.”

“What?” responded the first, stunned. “How do you know she was a mathematician?”

“Well, she took a ridiculously long time to answer a simple question, her answer was indubitably correct, and it was totally useless.”

These findings are consistent with the stereotype of abstruse and introverted mathematicians. However, the addition of +SNARKY in the case of the editors of Speculative Grammarian is metamorphic, to borrow a geological metaphor. The lack of ABSTRUSEness is quite telling. So as not to jeopardize our chances of publication, let us merely declare satirical linguists, as a class, to be ☯+pesky☭.


While this type of black box testing is difficult to perform, its scope difficult to define, and its results difficult to interpret, such testing should take a rightful place in the toolbox of ambitious linguists.

