Linguistic Topology

I. Juana Pelota-Grande
Centre den Geometrik Linguistiken

Table 1. OBV-IUS Renderings of Romance Languages (panels: Latin, Italian, Spanish, Portuguese, French, Romanian)

Table 2. OBV-IUS Renderings of Slavic Languages (panels: Russian, Polish)

Table 3. OBV-IUS Renderings of Germanic Languages (panels: High German, Low German, Dutch, Afrikaans)

Table 4. OBV-IUS Renderings of Afro-Asiatic Languages, and Persian (panels: Hebrew, Berber, Arabic, Persian)

Table 5. OBV-IUS Renderings of East Asian Languages (panels: Mandarin, Cantonese, Japanese, Korean)

Table 6. OBV-IUS Renderings of Sign Languages (panels: American Sign, British Sign, French Sign, German Sign)

Table 7. OBV-IUS Renderings of Finno-Ugric Languages (panels: Finnish, Hungarian)

Table 8. OBV-IUS Renderings of Isolate Languages (panels: Basque, Burushaski, Pankararú, Ainu)

That many groups of human languages are sprung from some common source is obvious to any student of the subject, whether linguist, philologist, or polyglot. However, the detailed nature of these genetic relationships is often difficult to determine unambiguously--and is frequently the subject of heated debate.

The impressive array of analytic methods brought to bear on the problem is a testament to the inherent difficulty of the task. However, one of the human species' most amazing tools for analysis--vision--has never been fully and properly applied to the problem.

Language and vision are two towering pillars of human cognition, yet the traditional bridge between them--the written word--is a weak conduit at best. Visually rich representations of linguistic information, such as syntactic trees, often allow the high-bandwidth processing of our visual cortexes to provide key insights.

Visual intuitions, while not necessarily rigorous, can point the way toward otherwise obscured solutions to numerous problems--hence the ascendency of computer-aided data visualization in so many fields.

In order to further harness this amazing innate capability, we at the Centre den Geometrik Linguistiken have devised a method of projecting large amounts of aggregate information about a language into a six-dimensional space.

The original eight dimensions included three basic dimensions for syntactic, morphological, and phonological information, as well as five more dimensions used to represent semantic and distributional information about the lexicon.

Factor analysis of the data allowed us to find a set of normalized independent basis vectors spanning the syntactic/morphological/phonological space. These composite dimensions of a language we have given the intuitive labels ostentatious, brittle, and votive. The OBV co-ordinate system maps naturally to three spatial dimensions.

Similarly, we were able to collapse the five dependent dimensions of the lexicon down to three independent dimensions, which we call instinct, understatement, and schadenfreude. The three dimensions of the IUS co-ordinate system map readily onto RGB color co-ordinates.

Thus we can reduce the eight original linguistic dimensions to six composite dimensions, three spatial and three color, allowing us to render this grammatical signature of a language in OBV-IUS co-ordinates, which are easily interpreted by the human visual system.

The results are remarkable--family resemblances are unmistakable. Consider first some members of the Romance family, shown in Table 1.

The genetic relationship is clear. Note that Romanian visually stands out from its sister languages, but the reasons for this become apparent when we look at a couple of Slavic languages, in Table 2.

The mixed heritage of Romanian is made obvious by OBV-IUS. The pedigree of other languages which have blended to various degrees with genetically unrelated language families is similarly clear. Consider the Germanic languages, including the Romance-influenced bastard, English, in Table 3.

The visually distinct appearance of English in the OBV-IUS system is the result of importing so much Latinate vocabulary and of the long and intimate contact between Old English and Old French. In fact, it seems clear that English probably has as much claim to being a Romance language as Romanian does.

While Persian, an Indo-Iranian language, is not directly related to Arabic, the influence of a shared writing system and long cultural contact is quite apparent from Table 4.

Looking at the languages of East Asia in Table 5, it is unsurprising that the so-called “dialects” of Chinese are in fact closely related. The influence of both Japanese and Chinese on the Korean language is also made plain.

Sign Languages were more challenging to fit into our system, but with a little creativity, they were in fact made to conform. The various relationships between sign languages are interestingly portrayed by the OBV-IUS interpretation in Table 6.

French Sign Language and American Sign Language are genetically related, while American Sign Language and British Sign Language share a superstrate language (English). Note the particular similarities among those three, but also the superficially similar “shape” of German Sign Language. Also, note the similarities between each sign language and its superstrate language.

OBV-IUS renderings of small language families and Isolate Languages play up the distinctive features of those languages and families which make them so hard to categorize.

Table 7 shows a representative sample from the Finno-Ugric family, which is clearly distinct from the other languages previously rendered in the OBV-IUS way.

Some of the singularly Isolate Languages show quite interesting and unusual features in their OBV-IUS renderings.

Table 8 shows the grammatical chaos of Basque, the unprecedented uniqueness of Burushaski, and the self-reflexive qualities of Pankararú.

The last entry in Table 8, Ainu, defies all logic and explanation. No wonder it is still an Isolate.

This brief introductory paper offers merely a taste of what the study of Linguistic Topology has to offer the wider field. Visual analysis will eventually bring many Topological Universals to light, while also illuminating Topological Constraints on, for example, language change and acquisition.

It is with much anticipation that we at the Centre den Geometrik Linguistiken await feedback on this revolutionary new theory from linguistics practitioners world-wide. New data for Topological Analysis of other languages is also muchly welcomed.

