It would be a great leap forward in humanities-nerd–science-nerd relations if we could convince the STEMmier nerds that linguistics is at least kinda interesting. Of course, linguistics really is interesting, but we have to rope them in first; baby steps are needed before we go full-tilt “minimally transmogrificational principals and parametric aleph movement Chornskyan hegemony” on them. We need to ease them in with simpler things—fun stuff like auto-antonyms, and useful stuff like what distinguishes en dashes and em dashes, or how to properly use a semicolon.
It seems fair to expect our partially hybridized computational linguist friends to serve as the bridge to that foreign land. As with many of the techier nerds, though, they aren’t always the most conversational bunch. To provide a metaphorical railing for this figurative bridge—or is it an allegorical trail to the symbolic bridge? Has this analogy gone off the metaphorical rails? Is it actually a trestle? Have I misplaced an entire train of thought? Where was I?! Right... To encourage our CompLing brethren and sistren to converse with other techies, I propose as a conversation starter the topic of improving Roman numerals! It’s not really relevant to linguistics—or any other humanities, or any of the sciences—but it is kinda interesting.
Our first motivation is to naturalize a sense of quasi-exponentiality in counting. Computer programmers are already overly fond of powers of two—though by 131,072 most have certainly lost the thread—and data scientists love taking logarithms of everything. (Is logarithmancy still a thing? I bet it is in whatever unholy places there may be where data science intersects the Harry Potter fandom.)
A handy method for pseudo-exponential reckoning is to enumerate and/or approximate as follows: 1, 2, 5, 10, 20, 50, 100, etc. Each step is 2 or 2.5 times the size of the last, each number is familiar and arithmetically “friendly”—no awkward 7s or ungainly 13s here!—and there’s only ever one significant digit to keep track of.
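For the programmatically inclined, the progression can be sketched in a few lines of Python. The function names (`one_two_five`, `round_down_125`) are made up for this illustration and come from no established library; treat it as a sketch rather than a standard.

```python
from itertools import count, islice


def one_two_five():
    """Yield 1, 2, 5, 10, 20, 50, 100, ... without end."""
    for exponent in count():
        for mantissa in (1, 2, 5):
            yield mantissa * 10 ** exponent


def round_down_125(n):
    """Largest member of the 1-2-5 series not exceeding n (assumes n >= 1)."""
    best = 1
    for value in one_two_five():
        if value > n:
            return best
        best = value


print(list(islice(one_two_five(), 7)))  # [1, 2, 5, 10, 20, 50, 100]
print(round_down_125(131072))           # 100000
```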
Our second motivation is to get rid of the subtractive nature of Roman numerals. While there is a logic to it (even potentially a non-subtractive logic to it†), it is still a big turn-off for many people.
We synthesize these objectives by introducing new symbols for 2, 20, 200, and 2000, filling the gaps in the Roman numeral system needed to enable our 1–2–5–10 progression and obviating the need for subtraction. The new symbols (most being the letter following the one that stands for a given power of ten in the modern Latin alphabet) are as follows:
J = 2, Y = 20, G‡ = 200, N = 2000
A good topic for further discussion is how best to continue: is the apostrophus (ↁ) or vinculum (V̅) better for expanding to 5,000 and beyond? Is there a superior, novel solution?‡‡† (Nerds love solving problems, so let them go to town!)
And while the proposed system doesn’t actually make addition and multiplication with Roman numerals easy, it does make both at least somewhat tractable.‡‡‡ Addition is merely a regrouping exercise.
38 + 17 is YXVJI + XVJ.
Regrouping gives YXXVVJJI.
XX is Y, VV is X, JJI is V, so now we have YYXV.
YYX is L, so we have LV, which is 55!
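The regrouping exercise above can be sketched in Python. The rule list is one reading of the groupings used in the example (XX is Y, VV is X, JJI is V, YYX is L), filled out by analogy for the other symbols; that filling-out and all the function names are assumptions of this sketch.

```python
# Symbol values in the proposed system (I, V, X, L, C, D, M plus J, Y, G, N).
VALUES = {"I": 1, "J": 2, "V": 5, "X": 10, "Y": 20, "L": 50,
          "C": 100, "G": 200, "D": 500, "M": 1000, "N": 2000}

# Regrouping rules: each left side is traded for an equal-valued right side.
RULES = [("II", "J"), ("JJJ", "VI"), ("JJI", "V"), ("VV", "X"),
         ("XX", "Y"), ("YYY", "LX"), ("YYX", "L"), ("LL", "C"),
         ("CC", "G"), ("GGG", "DC"), ("GGC", "D"), ("DD", "M"),
         ("MM", "N")]


def sort_symbols(s):
    """Order symbols from largest to smallest value."""
    return "".join(sorted(s, key=VALUES.get, reverse=True))


def regroup(s):
    """Apply the rules until none fits, re-sorting after each trade."""
    s = sort_symbols(s)
    changed = True
    while changed:
        changed = False
        for old, new in RULES:
            if old in s:
                s = sort_symbols(s.replace(old, new, 1))
                changed = True
                break
    return s


def add(a, b):
    """Addition is just pooling the symbols and regrouping."""
    return regroup(a + b)


def to_int(s):
    """Decode a numeral by summing symbol values (no subtraction needed)."""
    return sum(VALUES[ch] for ch in s)


def to_numeral(n):
    """Encode greedily, largest symbol first."""
    out = []
    for sym in sort_symbols("".join(VALUES)):
        while n >= VALUES[sym]:
            out.append(sym)
            n -= VALUES[sym]
    return "".join(out)


print(sort_symbols("YXVJI" + "XVJ"))  # YXXVVJJI
print(add("YXVJI", "XVJ"))            # LV
```

Because every rule trades symbols for fewer symbols of the same total value, the loop always terminates.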
Multiplication is aided by recognizing certain multiplicative patterns. I×10 = X; X×10 = C; C×10 = M. Similarly for the J–Y–G–N and V–L–D series.
A simple example follows.
3×4 is JI×JJ.
Distributive regrouping gives us JI×(J+J) = JI×J + JI×J.
Multiplying by J is just writing it down twice, so now we have JIJI + JIJI.
JIJ = JJI = V, so we can regroup again to get VI+VI = VVII = XJ = 12!
Compare that to the legacy system version:
3×4 is III×IV.
Distribute to get III×V − III×I.
Multiply through and group VV = X to get VVV − III = XV − III.
V − III = IIIII − III = II, so XV − III = XII = 12.
Subtraction is a bummer!
Let’s try a slightly larger problem—134×19:
134×19 is CYXJJ×XVJJ.
Distributive regrouping gives us CYXJJ×(X+V+JJ).
Multiply through to get CYXJJ×X + CYXJJ×V + CYXJJ×JJ.
Multiplying by ten just moves one step up the I–X–C–M, J–Y–G–N, or V–L–D series, so CYXJJ×X = MGCYY.
Admittedly tedious, but in contrast briefly consider—but do not pursue—the subtractive horror of CXXXIV×XIX.
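For anyone who would rather delegate the tedium, here is a rough Python sketch of the same procedure: distribute over the symbols of the multiplier, step up a series for each factor of ten, write the multiplicand down once, twice, or five times, and regroup. Treating multiplication by V as "write it down five times" is an assumption of this sketch, as are the rule list and the function names.

```python
VALUES = {"I": 1, "J": 2, "V": 5, "X": 10, "Y": 20, "L": 50,
          "C": 100, "G": 200, "D": 500, "M": 1000, "N": 2000}

RULES = [("II", "J"), ("JJJ", "VI"), ("JJI", "V"), ("VV", "X"),
         ("XX", "Y"), ("YYY", "LX"), ("YYX", "L"), ("LL", "C"),
         ("CC", "G"), ("GGG", "DC"), ("GGC", "D"), ("DD", "M"),
         ("MM", "N")]

# One step up the I-X-C-M, J-Y-G-N, or V-L-D series.
TIMES_TEN = {"I": "X", "X": "C", "C": "M",
             "J": "Y", "Y": "G", "G": "N",
             "V": "L", "L": "D"}


def sort_symbols(s):
    return "".join(sorted(s, key=VALUES.get, reverse=True))


def regroup(s):
    """Trade groups of symbols for larger ones until no rule applies."""
    s = sort_symbols(s)
    changed = True
    while changed:
        changed = False
        for old, new in RULES:
            if old in s:
                s = sort_symbols(s.replace(old, new, 1))
                changed = True
                break
    return s


def times_ten(s):
    """Shift every symbol one step up its series; D, M, and N have no successor here."""
    try:
        return "".join(TIMES_TEN[ch] for ch in s)
    except KeyError as exc:
        raise ValueError(f"no symbol for ten times {exc.args[0]}") from None


def multiply(a, b):
    """Distribute a over the symbols of b, then regroup the whole pile."""
    pieces = []
    for sym in b:
        value = VALUES[sym]
        partial = a
        while value >= 10:          # peel off factors of ten
            partial = times_ten(partial)
            value //= 10
        pieces.append(partial * value)  # value is now 1, 2, or 5
    return regroup("".join(pieces))


print(times_ten("CYXJJ"))         # MGCYY
print(multiply("JI", "JJ"))       # XJ
print(multiply("CYXJJ", "XVJJ"))  # NDYYVI
```

NDYYVI is 2000 + 500 + 20 + 20 + 5 + 1 = 2546, which is indeed 134×19.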
With our kinda interesting, semi-exponential, eminently tractable numeri upgrade in hand, we’re ready to go out into the world and have appropriate, mutually beneficial relations with our new STEMmy, chummy, and oh-so yummy friends!
† You can think of counting in standard Roman numerals as an increment function (“add ‘I’ to the end”) and some rewrite rules to apply sequentially to the end of the Roman numeral:
IIII → IV / ____#
IVI → V / ____#
VIV → IX / ____#
IXI → X / ____#
Similarly:
XXXX → XL / ____#
XLX → L / ____#
LXL → XC / ____#
XCX → C / ____#
And of course:
CCCC → CD / ____#
CDC → D / ____#
DCD → CM / ____#
CMC → M / ____#
So, starting at CLXXXVIII (188) and incrementing by I to CC (200) we get:
CLXXXVIII + I = CLXXXVIIII
IIII → IV / ____#, giving CLXXXVIV
VIV → IX / ____#, giving CLXXXIX
CLXXXIX + I = CLXXXIXI
IXI → X / ____#, giving CLXXXX
XXXX → XL / ____#, giving CLXL
LXL → XC / ____#, giving CXC
CXC + I = CXCI
CXCI + I = CXCII
CXCII + I = CXCIII
CXCIII + I = CXCIIII
IIII → IV / ____#, giving CXCIV
CXCIV + I = CXCIVI
IVI → V / ____#, giving CXCV
CXCV + I = CXCVI
CXCVI + I = CXCVII
CXCVII + I = CXCVIII
CXCVIII + I = CXCVIIII
IIII → IV / ____#, giving CXCVIV
VIV → IX / ____#, giving CXCIX
CXCIX + I = CXCIXI
IXI → X / ____#, giving CXCX
XCX → C / ____#, giving CC
Honestly, it seems pretty straightforward when you look at it like this, but people still don’t seem to like it. This also goes to show that most subjects are amenable to linguisticky interpretation if you look hard enough. Ah, Linguistics! Is there anything it can’t do!?
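The increment function and rewrite rules above translate almost directly into Python. This is a minimal sketch: the rules are the twelve listed above, the "/ ____#" environment becomes an end-of-string check, and the names `increment` and `RULES` are just labels chosen for the illustration.

```python
# The rewrite rules, in the order given; each applies only at the end (#).
RULES = [("IIII", "IV"), ("IVI", "V"), ("VIV", "IX"), ("IXI", "X"),
         ("XXXX", "XL"), ("XLX", "L"), ("LXL", "XC"), ("XCX", "C"),
         ("CCCC", "CD"), ("CDC", "D"), ("DCD", "CM"), ("CMC", "M")]


def increment(numeral):
    """Add 'I' to the end, then apply the rules until none matches."""
    numeral += "I"
    changed = True
    while changed:
        changed = False
        for old, new in RULES:
            if numeral.endswith(old):
                numeral = numeral[:-len(old)] + new
                changed = True
    return numeral


# Reproduce the walk from CLXXXVIII (188) to CC (200).
n = "CLXXXVIII"
for _ in range(12):
    n = increment(n)
print(n)  # CC
```

With no rule for MMMM, the sketch only matches standard notation up to MMMCMXCIX (3999).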
‡ Alas, D is already taken, but G’s similarity in shape to C makes for a pleasantly felicitous alternative.
‡† Note that a number of nattering nobodies insinuate that this makes nine somehow “naughty by nature”. Nincompoops!
‡‡ In Canada, Z would be an acceptable alternative for the value of 1, but only when dealing with quantities of 41.
‡‡† Of course there is a superior, novel solution, but it’s usually best to let your nerd try to figure it out for themselves.
For those among the curious for whom the superior, novel solution isn’t immediately obvious...
The legacy system has 1-based I–X–C–M and 5-based V–L–D, to which we’ve added 2-based J–Y–G–N. Adding more simple symbols (as opposed to the typographically challenging apostrophus or vinculum) just makes sense. For the 1-based and 2-based extensions, we add similarly shaped pairs: E/F (10,000 and 20,000), H/K (100,000 and 200,000), O/Q (1,000,000 and 2,000,000), P/R (10,000,000 and 20,000,000), and S/Z (100,000,000 and 200,000,000). We can extend the 5-based run by taking graphemic inspiration from the apostrophus and adding W (a doubled V = 5,000), T (a flipped and doubled L = 50,000), B (stacked Ds = 500,000), A (flipped, barred V = 5,000,000), U (rounded V = 50,000,000), and finally ß (similar to S and Z, which are maximally valued in their respective runs = 500,000,000).
For those who need to go farther, I propose a more typographically propitious billion (10⁹) multiplier, the grave accent, giving the series Ì–X̀–C̀–M̀–È–H̀–Ò–P̀–S̀ (1 billion to 100 quadrillion), J̀–Ỳ–G̀–Ǹ–F̀–K̀–Q̀–R̀–Z̀ (2 billion to 200 quadrillion), and V̀–L̀–D̀–Ẁ–T̀–B̀–À–Ù–ß̀ (5 billion to 500 quadrillion). While that is sufficient for many of one’s more pedestrian numerical needs, it is by no means the end of the show!
The quintillion (10¹⁸) multiplier, the acute accent, gives Í to ß́.
The octillion (10²⁷) multiplier, a dot above, gives İ to ß̇.
The undecillion (10³⁶) multiplier, a diaeresis, gives Ï to ß̈.
The quattuordecillion (10⁴⁵) multiplier, a tilde, gives Ĩ to ß̃.
The septendecillion (10⁵⁴) multiplier, a breve, gives Ĭ to ß̆.
The vigintillion (10⁶³) multiplier, an inverted breve, gives Ȋ to ß̑.
Once you get to numbers of a reasonable size, the silly names like million, billion, vigintillion, jillion, and godzillion become more a hindrance than a help, so we continue on with just good ole scientific notation.
a circumflex indicates a multiplier of 10⁷² (Î to ß̂)
a caron indicates a multiplier of 10⁸¹ (Ǐ to ß̌)
a macron indicates a multiplier of 10⁹⁰ (Ī to ß̄)
a ring indicates a multiplier of 10⁹⁹ (I̊ to ß̊)
a prime indicates a multiplier of 10¹⁰⁸ (I̍ to ß̍)
a double prime indicates a multiplier of 10¹¹⁷ (I̎ to ß̎)
a left half ring indicates a multiplier of 10¹²⁶ (I͑ to ß͑)
a right half ring indicates a multiplier of 10¹³⁵ (I͗ to ß͗)
a left arrow indicates a multiplier of 10¹⁴⁴ (I᷾ to ß᷾)
a right arrow indicates a multiplier of 10¹⁵³ (I͐ to ß͐)
a comma indicates a multiplier of 10¹⁶² (I̓ to ß̓)
an x indicates a multiplier of 10¹⁷¹ (I̽ to ß̽)
a bridge indicates a multiplier of 10¹⁸⁰ (I͆ to ß͆)
Moving any of the multipliers below the symbol gives a further multiplier of 10¹⁸⁰.
Finally we reach the limit of the current system, with a few more modifiers, including a hint at ways to extend it further, bringing in stylistic variants, like bold, italic, double-struck, fraktur, and encircled:
I͛ I᷃ I̾ I͋ I͊ 𝐈 𝐼 𝕀 𝕴 Ⓘ (10⁷²⁹–10⁸¹⁰)
Some will complain that these distinctions are typographically too subtle, and perhaps they are—though polytonic Greek was a great success, no? At the very least they provide an opportunity for your science nerd to become familiar with processing Unicode programmatically. Another win-win!
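As a first taste of that programmatic processing, here is a small Python sketch that decodes extended numerals using the standard `unicodedata` module. It covers only the first seven diacritics (grave through inverted breve). The table names, the function names, and the decision to sum symbol values are assumptions made for the illustration.

```python
import unicodedata

# Base symbols: 1-based, 2-based, and 5-based runs, 10^0 through 10^8.
BASE = {}
for mantissa, run in ((1, "IXCMEHOPS"), (2, "JYGNFKQRZ"), (5, "VLDWTBAUß")):
    for exponent, letter in enumerate(run):
        BASE[letter] = mantissa * 10 ** exponent

# Combining marks and the power of ten each one multiplies by.
MARK_EXPONENT = {
    "\u0300": 9,   # grave accent
    "\u0301": 18,  # acute accent
    "\u0307": 27,  # dot above
    "\u0308": 36,  # diaeresis
    "\u0303": 45,  # tilde
    "\u0306": 54,  # breve
    "\u0311": 63,  # inverted breve
}
EXPONENT_MARK = {exp: mark for mark, exp in MARK_EXPONENT.items()}


def numeral_value(text):
    """Sum the symbol values; a mark scales the base letter before it."""
    values = []
    # NFD splits precomposed letters such as È into E plus a combining grave.
    for ch in unicodedata.normalize("NFD", text):
        if ch in MARK_EXPONENT:
            if not values:
                raise ValueError("combining mark with no base symbol")
            values[-1] *= 10 ** MARK_EXPONENT[ch]
        else:
            values.append(BASE[ch])  # KeyError for anything unsupported
    return sum(values)


def make_symbol(letter, exponent):
    """Attach the mark for 10**exponent and compose where Unicode allows."""
    return unicodedata.normalize("NFC", letter + EXPONENT_MARK[exponent])


print(numeral_value("\u00cc"))        # Ì  -> 1000000000
print(numeral_value("\u00df\u0300"))  # ß̀ -> 500000000000000000
print(make_symbol("I", 9) == "\u00cc")  # True
```

Normalizing to NFD first matters because some of the symbols (Ì, È, İ) exist as single precomposed code points while others (S̀, ß̀) can only be written as a letter followed by a combining mark.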
Below is a table with all of the diacritics proposed in this superior, novel solution—enlarged and applied to I for simplicity and clarity.
Ì Í İ Ï Ĩ
Ĭ Ȋ Î Ǐ Ī
I̊ I̍ I̎ I͑ I͗
I᷾ I͐ I̓ I̽ I͆
I̖ I̗ Ị I̤ Ḭ
I̮ I̯ I̭ I̬ I̱
I̥ I̩ I͈ I̜ I̹
I͔ I͕ I̦ I͓ I̪
Ȉ I̋ I᷀ I᷁ I͌
I̐ I͒ I҅ I҆ I᷄
I᷅ I᷆ I᷇ I᷈ I᷉
Ỉ I̒ I̔ I̿ Ḯ
I̫ I̼ I᷂ I͙ I̻
I̝ I̞ I̙ I̘ I̟
I͚ I̢ I̡ Į I̧
I͍ I͎ I͉ I̳ I̺
I͛ I᷃ I̾ I͋ I͊
𝐈 𝐼 𝕀 𝕴 Ⓘ
‡‡‡ Tractable. That’s a word the engineering types seem to like. Put those sociolinguistics skills to good use and sprinkle in those shibboleths so you bond with your v̶i̶c̶t̶i̶m̶ t̶a̶r̶g̶e̶t̶ new friend!