Numeri++
Praenomen Gentilicium Cognomen, Esq.
It would be a great leap forward in humanities-nerd–science-nerd relations if we could convince the STEMmier nerds that linguistics is at least kinda interesting. Of course, linguistics really is interesting, but we have to rope them in first; baby steps are needed before we go full-tilt “minimally transmogrificational principals and parametric aleph movement Chornskyan hegemony” on them. We need to ease them in with simpler things—fun stuff like auto-antonyms, and useful stuff like what distinguishes en dashes and em dashes, or how to properly use a semicolon.
It seems fair to expect our partially hybridized computational linguist friends to serve as the bridge to that foreign land. As with many of the techier nerds, though, they aren’t always the most conversational bunch. To provide a metaphorical railing for this figurative bridge—or is it an allegorical trail to the symbolic bridge? Has this analogy gone off the metaphorical rails? Is it actually a trestle? Have I misplaced an entire train of thought? Where was I?! Right... To encourage our CompLing brethren and sistren to converse with other techies, I propose as a conversation starter the topic of improving Roman numerals! It’s not really relevant to linguistics—or any other humanities, or any of the sciences—but it is kinda interesting.
Our first motivation is to naturalize a sense of quasi-exponentiality in counting. Computer programmers are already overly fond of powers of two—though by 131,072 most have certainly lost the thread—and data scientists love taking logarithms of everything. (Is logarithmancy still a thing? I bet it is in whatever unholy places there may be where data science intersects the Harry Potter fandom.)
A handy method for pseudo-exponential reckoning is to enumerate and/or approximate as follows: 1, 2, 5, 10, 20, 50, 100, etc. Each step is 2 or 2.5 times the size of the last, each number is familiar and arithmetically “friendly”—no awkward 7s or ungainly 13s here!—and there’s only ever one significant digit to keep track of.
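For the programmers in the audience, the progression can be sketched in a few lines of Python. The names `one_two_five` and `approximate` are made up for this illustration, and rounding down is only one plausible reading of "approximate".

```python
from itertools import islice


def one_two_five():
    """Yield 1, 2, 5, 10, 20, 50, 100, ... without end."""
    power = 1
    while True:
        for digit in (1, 2, 5):
            yield digit * power
        power *= 10


def approximate(n):
    """Return the largest 1-2-5 value that does not exceed n (n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    best = 1
    for value in one_two_five():
        if value > n:
            return best
        best = value


print(list(islice(one_two_five(), 7)))  # [1, 2, 5, 10, 20, 50, 100]
print(approximate(131072))              # 100000
```

Each step in the generated series is 2 or 2.5 times the previous one, which is all the "quasi-exponentiality" the scheme asks for.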
Our second motivation is to get rid of the subtractive nature of Roman numerals. While there is a logic to it (even potentially a non-subtractive logic to it^{†}), it is still a big turn-off for many people.
We synthesize these objectives by introducing new symbols for 2, 20, 200, and 2000—filling the gaps in the Roman numeral system needed to enable our 1–2–5–10 progression, and obviating the need for subtraction. The new symbols—most being the letter following the one that stands for a given power of ten in the modern Latin alphabet—are as follows:
I=1
J=2
X=10
Y=20
C=100
G=200^{‡}
M=1000
N=2000
The only symbols that need to be doubled now are the new 2-based symbols, and no subtractive arrangements are necessary:
1=I
2=J
3=JI
4=JJ
5=V
6=VI
7=VJ
8=VJI
9=VJJ^{‡†}
10=X
11=XI
12=XJ
13=XJI
14=XJJ
15=XV
16=XVI
17=XVJ
18=XVJI
19=XVJJ
20=Y
30=YX
40=YY^{‡‡}
50=L
60=LX
70=LY
80=LYX
90=LYY
100=C
200=G
300=GC
400=GG
500=D
600=DC
700=DG
800=DGC
900=DGG
1000=M
2000=N
3000=NM
4000=NN
5000=ↁ or V̅
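The table can also be generated mechanically. The following is a minimal Python sketch, assuming a greedy largest-symbol-first strategy; the names `SYMBOLS`, `to_numeri`, and `from_numeri` are hypothetical, and the range stops at 4,999 because the symbol for 5,000 is still up for discussion.

```python
# Symbol values, largest first, as given in the tables above.
SYMBOLS = [
    (2000, "N"), (1000, "M"), (500, "D"), (200, "G"), (100, "C"),
    (50, "L"), (20, "Y"), (10, "X"), (5, "V"), (2, "J"), (1, "I"),
]
VALUES = {symbol: value for value, symbol in SYMBOLS}


def to_numeri(n):
    """Write n (1 to 4999) by greedily taking the largest symbol that fits."""
    if not 1 <= n <= 4999:
        raise ValueError("only 1 to 4999 is covered by this sketch")
    pieces = []
    for value, symbol in SYMBOLS:
        count, n = divmod(n, value)
        pieces.append(symbol * count)
    return "".join(pieces)


def from_numeri(numeral):
    """With no subtractive pairs, reading a numeral is a plain sum."""
    return sum(VALUES[symbol] for symbol in numeral)


print(to_numeri(9))           # VJJ
print(to_numeri(38))          # YXVJI
print(from_numeri("CYXJJ"))   # 134
```

Because nothing is ever subtracted, `from_numeri` needs no lookahead at all, which is the practical payoff of the second motivation.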
A good topic for further discussion is how best to continue: is the apostrophus (ↁ) or vinculum (V̅) better for expanding to 5,000 and beyond? Is there a superior, novel solution?^{‡‡†} (Nerds love solving problems, so let them go to town!)
And while the proposed system doesn’t actually make addition and multiplication with Roman numerals easy, it does make both at least somewhat tractable.^{‡‡‡} Addition is merely a regrouping exercise.
- 38 + 17 is YXVJI + XVJ.
- Regrouping gives YXXVVJJI.
- XX is Y, VV is X, JJI is V, so now we have YYXV.
- YYX is L, so we have LV, which is 55!
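The regrouping steps above can be sketched as string rewriting in Python. This is one possible mechanization, not a definitive one: `ORDER`, `RULES`, `regroup`, and `add` are names invented here, and only XX→Y, VV→X, JJI→V, and YYX→L come from the worked example. The remaining rules are filled in by analogy and are an assumption.

```python
# Symbols from largest to smallest; sorting by this order is the regrouping.
ORDER = "NMDGCLYXVJI"

# Value-preserving rewrites. Rules not shown in the worked example
# (II, JJJ, YYY, LL, CC, GGG, GGC, DD, MM) are assumed by analogy.
RULES = [
    ("II", "J"), ("JJJ", "VI"), ("JJI", "V"), ("VV", "X"),
    ("XX", "Y"), ("YYY", "LX"), ("YYX", "L"), ("LL", "C"),
    ("CC", "G"), ("GGG", "DC"), ("GGC", "D"), ("DD", "M"),
    ("MM", "N"),
]


def regroup(symbols):
    """Sort the symbols, then rewrite until no rule applies."""
    s = "".join(sorted(symbols, key=ORDER.index))
    changed = True
    while changed:
        changed = False
        for old, new in RULES:
            if old in s:
                # Apply one rewrite, re-sort, and start over.
                s = "".join(sorted(s.replace(old, new, 1), key=ORDER.index))
                changed = True
                break
    return s


def add(a, b):
    """Addition is concatenation followed by regrouping."""
    return regroup(a + b)


print(add("YXVJI", "XVJ"))  # LV
```

Every rule shortens the string, so the loop always terminates. Sums past 4,999 simply pile up N's, since no larger symbol is defined at this point.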
Multiplication is aided by recognizing certain multiplicative patterns. I×10 = X; X×10 = C; C×10 = M. Similarly for the J–Y–G–N and V–L–D series.
A simple example follows.
- 3×4 is JI×JJ.
- Distributive regrouping gives us JI×(J+J) = JI×J + JI×J.
- Multiplying by J is just writing it down twice, so now we have JIJI + JIJI.
- JIJ = JJI = V, so we can regroup again to get VI+VI = VVII = XJ = 12!
Compare that to the legacy system version:
- 3×4 is III×IV.
- Distribute to get III×V − III×I.
- Multiply through and group VV = X to get VVV − III = XV − III.
- V − III = IIIII − III = II, so XV − III = XII = 12.
Subtraction is a bummer!
Let’s try a slightly larger problem—134×19:
- 134×19 is CYXJJ×XVJJ.
- Distributive regrouping gives us CYXJJ×(X+V+JJ).
- Multiply through to get CYXJJ×X + CYXJJ×V + CYXJJ×JJ.
- Multiplying by ten just moves one step up the I–X–C–M, J–Y–G–N, or V–L–D series, so CYXJJ×X = MGCYY.
- CYXJJ×V = C×V + Y×V + X×V + JJ×V = DCLY.
- CYXJJ×JJ = C×JJ + Y×JJ + X×JJ + JJ×JJ
- = GG + YYYY + YY + JJJJJJJJ
- = GGCYXVI.
- MGCYY + DCLY + GGCYXVI = MDGGGCCCLYYYYXVI = MDDGCCLLYYVI = MMGCCCYYVI = MMDYYVI = NDYYVI = 2546. It works!
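The same distribute-then-regroup procedure can be sketched in Python. All names here (`TIMES_X`, `TIMES_V`, `split_symbol`, `times_symbol`, `multiply`) are hypothetical, the regrouping rules beyond those shown in the addition example are assumed by analogy, and products that overflow the defined symbols are written as runs of N, which is a convenience of this sketch rather than part of the proposal.

```python
ORDER = "NMDGCLYXVJI"
RULES = [
    ("II", "J"), ("JJJ", "VI"), ("JJI", "V"), ("VV", "X"),
    ("XX", "Y"), ("YYY", "LX"), ("YYX", "L"), ("LL", "C"),
    ("CC", "G"), ("GGG", "DC"), ("GGC", "D"), ("DD", "M"),
    ("MM", "N"),
]
SERIES = ["IXCM", "JYGN", "VLD"]  # each step right is a factor of ten

# Times ten: one step up a series; overflow becomes a pile of N's (assumption).
TIMES_X = {"I": "X", "J": "Y", "V": "L", "X": "C", "Y": "G", "L": "D",
           "C": "M", "G": "N", "D": "NNM", "M": "N" * 5, "N": "N" * 10}
# Times five, symbol by symbol.
TIMES_V = {"I": "V", "J": "X", "V": "YV", "X": "L", "Y": "C", "L": "GL",
           "C": "D", "G": "M", "D": "ND", "M": "NNM", "N": "N" * 5}


def regroup(symbols):
    """Sort the symbols, then rewrite until no rule applies."""
    s = "".join(sorted(symbols, key=ORDER.index))
    changed = True
    while changed:
        changed = False
        for old, new in RULES:
            if old in s:
                s = "".join(sorted(s.replace(old, new, 1), key=ORDER.index))
                changed = True
                break
    return s


def split_symbol(symbol):
    """Return (base, k) such that symbol = base * 10**k, base in I, J, V."""
    for series in SERIES:
        if symbol in series:
            return series[0], series.index(symbol)
    raise ValueError(f"unknown symbol: {symbol!r}")


def times_symbol(a, symbol):
    """Multiply numeral a by a single symbol, without regrouping."""
    base, k = split_symbol(symbol)
    if base == "I":
        part = a
    elif base == "J":
        part = a + a  # multiplying by J is writing it down twice
    else:
        part = "".join(TIMES_V[ch] for ch in a)
    for _ in range(k):
        part = "".join(TIMES_X[ch] for ch in part)
    return part


def multiply(a, b):
    """Distribute a over the symbols of b, then regroup once at the end."""
    return regroup("".join(times_symbol(a, symbol) for symbol in b))


print(multiply("JI", "JJ"))        # XJ
print(multiply("CYXJJ", "XVJJ"))   # NDYYVI
```

Regrouping only once at the end keeps the sketch short; regrouping each partial product first, as in the worked example, gives the same result.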
Admittedly tedious, but in contrast briefly consider—but do not pursue—the subtractive horror of CXXXIV×XIX.
With our kinda interesting, semi-exponential, eminently tractable numeri upgrade in hand, we’re ready to go out into the world and have appropriate, mutually beneficial relations with our new STEMmy, chummy, and oh-so yummy friends!
^{†} You can think of counting in standard Roman numerals as an increment function (“add ‘I’ to the end”) and some rewrite rules to apply sequentially to the end of the Roman numeral:
- IIII → IV / ____#
- IVI → V / ____#
- VIV → IX / ____#
- IXI → X / ____#
Similarly:
- XXXX → XL / ____#
- XLX → L / ____#
- LXL → XC / ____#
- XCX → C / ____#
And of course:
- CCCC → CD / ____#
- CDC → D / ____#
- DCD → CM / ____#
- CMC → M / ____#
So, starting at CLXXXVIII (188) and incrementing by I to CC (200) we get:
- CLXXXVIII + I = CLXXXVIIII
- IIII → IV / ____#, giving CLXXXVIV
- VIV → IX / ____#, giving CLXXXIX
- CLXXXIX + I = CLXXXIXI
- IXI → X / ____#, giving CLXXXX
- XXXX → XL / ____#, giving CLXL
- LXL → XC / ____#, giving CXC
- CXC + I = CXCI
- CXCI + I = CXCII
- CXCII + I = CXCIII
- CXCIII + I = CXCIIII
- IIII → IV / ____#, giving CXCIV
- CXCIV + I = CXCIVI
- IVI → V / ____#, giving CXCV
- CXCV + I = CXCVI
- CXCVI + I = CXCVII
- CXCVII + I = CXCVIII
- CXCVIII + I = CXCVIIII
- IIII → IV / ____#, giving CXCVIV
- VIV → IX / ____#, giving CXCIX
- CXCIX + I = CXCIXI
- IXI → X / ____#, giving CXCX
- XCX → C / ____#, giving CC
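For the skeptical, the increment function and the end-anchored rewrite rules can be sketched in Python. The names `RULES` and `increment` are invented for this illustration, and `str.endswith` stands in for the `/ ____#` context.

```python
# The twelve rewrite rules listed above, each applying only at the end
# of the numeral (the "/ ____#" environment).
RULES = [
    ("IIII", "IV"), ("IVI", "V"), ("VIV", "IX"), ("IXI", "X"),
    ("XXXX", "XL"), ("XLX", "L"), ("LXL", "XC"), ("XCX", "C"),
    ("CCCC", "CD"), ("CDC", "D"), ("DCD", "CM"), ("CMC", "M"),
]


def increment(numeral):
    """Add 'I' to the end, then rewrite the tail until nothing applies."""
    s = numeral + "I"
    changed = True
    while changed:
        changed = False
        for old, new in RULES:
            if s.endswith(old):
                s = s[: -len(old)] + new
                changed = True
                break
    return s


n = "CLXXXVIII"
for _ in range(12):
    n = increment(n)
print(n)  # CC
```

Twelve increments take CLXXXVIII (188) to CC (200), passing through the same intermediate forms as the derivation above.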
Honestly, it seems pretty straightforward when you look at it like this, but people still don’t seem to like it. This also goes to show that most subjects are amenable to linguisticky interpretation if you look hard enough. Ah, Linguistics! Is there anything it can’t do!?
^{‡} Alas, D is already taken, but G’s similarity in shape to C makes for a pleasantly felicitous alternative.
^{‡†} Note that a number of nattering nobodies insinuate that this makes nine somehow “naughty by nature”. Nincompoops!
^{‡‡} In Canada, Z would be an acceptable alternative for the value of 1, but only when dealing with quantities of 41.
^{‡‡†} Of course there is a superior, novel solution, but it’s usually best to let your nerd try to figure it out for themselves.
For those among the curious for whom the superior, novel solution isn’t immediately obvious...
The legacy system has 1-based I–X–C–M and 5-based V–L–D, to which we’ve added 2-based J–Y–G–N. Adding more simple symbols (as opposed to the typographically challenging apostrophus or vinculum) just makes sense. For the 1-based and 2-based extensions, we add similarly shaped pairs: E/F (10,000 and 20,000), H/K (100,000 and 200,000), O/Q (1,000,000 and 2,000,000), P/R (10,000,000 and 20,000,000), and S/Z (100,000,000 and 200,000,000). We can extend the 5-based run by taking graphemic inspiration from the apostrophus and adding W (a doubled V = 5,000), T (a flipped and doubled L = 50,000), B (stacked Ds = 500,000), A (flipped, barred V = 5,000,000), U (rounded V = 50,000,000), and finally ß (similar to S and Z, which are maximally valued in their respective runs = 500,000,000).
For those who need to go farther, I propose a more typographically propitious billion (10^{9}) multiplier, the grave accent, giving the series Ì–X̀–C̀–M̀–È–H̀–Ò–P̀–S̀ (1 billion to 100 quadrillion), J̀–Ỳ–G̀–Ǹ–F̀–K̀–Q̀–R̀–Z̀ (2 billion to 200 quadrillion), and V̀–L̀–D̀–Ẁ–T̀–B̀–À–Ù–ß̀ (5 billion to 500 quadrillion). While that is sufficient for many of one’s more pedestrian numerical needs, it is by no means the end of the show!
The quintillion (10^{18}) multiplier, the acute accent, gives Í to ß́.
The octillion (10^{27}) multiplier, a dot above, gives İ to ß̇.
The undecillion (10^{36}) multiplier, a diaeresis, gives Ï to ß̈.
The quattuordecillion (10^{45}) multiplier, a tilde, gives Ĩ to ß̃.
The septendecillion (10^{54}) multiplier, a breve, gives Ĭ to ß̆.
The vigintillion (10^{63}) multiplier, an inverted breve, gives Ȋ to ß̑.
Once you get to numbers of a reasonable size, the silly names like million, billion, vigintillion, jillion, and godzillion become more a hindrance than a help, so we continue on with just good ole scientific notation.
- a circumflex indicates a multiplier of 10^{72} (Î to ß̂)
- a caron indicates a multiplier of 10^{81} (Ǐ to ß̌)
- a macron indicates a multiplier of 10^{90} (Ī to ß̄)
- a ring indicates a multiplier of 10^{99} (I̊ to ß̊)
- a prime indicates a multiplier of 10^{108} (I̍ to ß̍)
- a double prime indicates a multiplier of 10^{117} (I̎ to ß̎)
- a left half ring indicates a multiplier of 10^{126} (I͑ to ß͑)
- a right half ring indicates a multiplier of 10^{135} (I͗ to ß͗)
- a left arrow indicates a multiplier of 10^{144} (I᷾ to ß᷾)
- a right arrow indicates a multiplier of 10^{153} (I͐ to ß͐)
- a comma indicates a multiplier of 10^{162} (I̓ to ß̓)
- an x indicates a multiplier of 10^{171} (I̽ to ß̽)
- a bridge indicates a multiplier of 10^{180} (I͆ to ß͆)
Moving any of the multipliers below the symbol gives a further multiplier of 10^{180}:
- I̖ I̗ Ị I̤ Ḭ I̮ I̯ I̭ I̬ I̱ I̥ I̩ I͈ I̜ I̹ I͔ I͕ I̦ I͓ I̪ (10^{189}–10^{360})
For those who actually use moderately large numbers, two further series of multipliers are available:
- Ȉ I̋ I᷀ I᷁ I͌ I̐ I͒ I҅ I҆ I᷄ I᷅ I᷆ I᷇ I᷈ I᷉ Ỉ I̒ I̔ I̿ Ḯ (10^{369}–10^{540})
- I̫ I̼ I᷂ I͙ I̻ I̝ I̞ I̙ I̘ I̟ I͚ I̢ I̡ Į I̧ I͍ I͎ I͉ I̳ I̺̺ (10^{549}–10^{720})
Finally we reach the limit of the current system, with a few more modifiers, including a hint at ways to extend it further, bringing in stylistic variants, like bold, italic, double-struck, fraktur, and encircled:
- I͛ I᷃ I̾ I͋ I͊ 𝐈 𝐼 𝕀 𝕴 Ⓘ (10^{729}–10^{810})
Some will complain that these distinctions are typographically too subtle, and perhaps they are—though polytonic Greek was a great success, no? At the very least they provide an opportunity for your science nerd to become familiar with processing Unicode programmatically. Another win-win!
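As a starting point for that Unicode exercise, here is a minimal Python sketch that reads an extended numeral by splitting each character into a base letter and its combining mark. Everything named here (`BASE`, `MARKS`, `value`) is hypothetical. Only the first eleven above-the-letter multipliers are covered, and the code points chosen for them are assumptions based on the usual Unicode combining marks with those names.

```python
import unicodedata

# Base symbols: 1-, 2-, and 5-valued letters for each power of ten.
BASE = {}
for power, (one, two, five) in enumerate(
    ["IJV", "XYL", "CGD", "MNW", "EFT", "HKB", "OQA", "PRU", "SZ\u00df"]
):
    BASE[one] = 10 ** power
    BASE[two] = 2 * 10 ** power
    BASE[five] = 5 * 10 ** power

# Combining mark -> power-of-ten multiplier (code points are assumed).
MARKS = {
    "\u0300": 9,   # grave
    "\u0301": 18,  # acute
    "\u0307": 27,  # dot above
    "\u0308": 36,  # diaeresis
    "\u0303": 45,  # tilde
    "\u0306": 54,  # breve
    "\u0311": 63,  # inverted breve
    "\u0302": 72,  # circumflex
    "\u030c": 81,  # caron
    "\u0304": 90,  # macron
    "\u030a": 99,  # ring above
}


def value(numeral):
    """Sum the symbols of a numeral, applying each diacritic multiplier."""
    total = 0
    current = None  # value of the most recent base letter
    # NFD splits precomposed letters into base letter plus combining mark.
    for ch in unicodedata.normalize("NFD", numeral):
        if unicodedata.combining(ch):
            if current is None or ch not in MARKS:
                raise ValueError(f"unsupported mark: {ch!r}")
            # Stacked marks multiply together; the text leaves this open.
            current *= 10 ** MARKS[ch]
        else:
            if current is not None:
                total += current
            current = BASE[ch]
    if current is not None:
        total += current
    return total


print(value("NDYYVI"))              # 2546
print(value("\u00cc") == 10 ** 9)   # True (precomposed I with grave)
```

Normalizing to NFD first means precomposed letters and letters typed with a separate combining mark are treated the same way, which spares the nerd at least one afternoon of confusion.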
Below is a table with all of the diacritics proposed in this superior, novel solution—enlarged and applied to I for simplicity and clarity.
Ì Í İ Ï Ĩ
Ĭ Ȋ Î Ǐ Ī
I̊ I̍ I̎ I͑ I͗
I᷾ I͐ I̓ I̽ I͆
I̖ I̗ Ị I̤ Ḭ
I̮ I̯ I̭ I̬ I̱
I̥ I̩ I͈ I̜ I̹
I͔ I͕ I̦ I͓ I̪
Ȉ I̋ I᷀ I᷁ I͌
I̐ I͒ I҅ I҆ I᷄
I᷅ I᷆ I᷇ I᷈ I᷉
Ỉ I̒ I̔ I̿ Ḯ
I̫ I̼ I᷂ I͙ I̻
I̝ I̞ I̙ I̘ I̟
I͚ I̢ I̡ Į I̧
I͍ I͎ I͉ I̳ I̺
^{‡‡‡} Tractable. That’s a word the engineering types seem to like. Put those sociolinguistics skills to good use and sprinkle in those shibboleths so you bond with your victim target new friend!