code
|
text
A Polemic on the Typography
Noah Syrkis
December 29, 2024
In our information-theoretic age, the bit has replaced the atom as the most
fundamental unit of wealth
[1]
. If any phenomenon befits the name
Great
Replacement
, it is this. The notion of information as an exact, measurable
quantity was given to us by
C. E. Shannon [2]
with his seminal
1948
paper
A Mathematical Theory of Communication
. With it, he establishes information
theory as a science, introducing three towering theorems on which our modern
world is (arguably) built:
1)
The Data Compression Theorem, proving that for
a given source that produces symbols from an alphabet
𝜉
=
{
𝑠
0
,
𝑠
1
,
…
,
𝑠
𝑛
−
1
}
with probabilities
{
𝑝
0
,
𝑝
1
,
…
,
𝑝
𝑛
−
1
}
, there exists an optimal coding scheme that
can compress messages to an average length approaching the source entropy;
2)
The Channel Coding Theorem, proving that reliable communication over
noisy channels is possible, at the rate of mutual information between the two
endpoints; and
3)
The Rate-Distortion Theorem, describing the minimum bit rate
required to achieve a given level of distortion when compressing beyond the
entropy. Information theory, as such, is about the transmission of symbols. The
transmission of symbols, however, is about much more than one might think.
Indeed, the symbol too reigns supreme in the context of even deoxyribonucleic
acid (DNA), a source with the alphabet
{
𝐴
,
𝐶
,
𝑇
,
𝐺
}
. Senescence is the failure of
the body to transmit its own DNA undistorted forward in time to itself. Mortality
is
an information-theoretic phenomenon with an information theoretic solution
[3]
.
Figure 1: The Alphabet in Fraktur
The importance of the symbol is reflected in that has ubiquitously been drawn
with typographical masterpieces like Fraktur (seen in
Figure 1
), the double-
struck letters, used to denote fundamental number sets like the naturals
ℕ
, the
reals
ℝ
and the complex
ℂ
, peppering the pages of any modern scientific text,
D. Knuth [4]
‘s font Computer Modern (with which this page is rendered), and,
more recently, Fira Code (with which this page is typed). Indeed, the first, and
perhaps only take away the average reader of Shannon’s paper will notice, is the
beautifyl way in which it is typed. This too is true for
[5]
. In the case of Fraktur,
ubiquity has been controversial. Now it has been demoted to denote more exotic
mathmatical concepts like Lie algebras
𝔤
, prime ideals
𝔭
, and maximal ideals
𝔪
.
But, Fraktur was the de facto font of the Third Reich, and Germanic territories
at large before that. Etymologically, the word “Fraktur” comes from the Latin
frāctūra
(“a fracture”), built from
frāctus
(“to fracture”). The typeface has its ori
gin in the 16th century. The Holy Roman Emperor Maximilian I commissioned
Hieronymus Andreae to create a new typeface for Albrecht Dürer’s woodcut
print
Triumphal Arch
¹
. 195 separate wood blocks were cut, and used to imprint
36 large paper sheets for a final composite measuring 295
×
357 centimetres,
making it one of the largest and most complex prints of that kind ever produced
—a masterpiece befitting the creation of its own typeface. By the 20th century,
Fraktur had become the German typeface (ubiquity). However, Adolf Hitler,
who was otherwise well-versed in aesthetics, is known to have disliked the font,
publicly denouncing it in a 1934 speech to the Reichstag:
Your alleged Gothic
internalization does not fit well in this age of steel and iron, glass and concrete […]
.
On this particular dimension, Hitler was so out of step with his compatriots. Yet
it took almost a decade to abolish the script, with the intervening years seeing
Roman characters (of which Fraktur is not) scolded as being under “Jewish
influence”, urging their use to be replaced by Fraktur, the true “German script”.
Even with the immense power afforded the Führer with his position as such,
it took him almost a decade to abolish the font, when Matrin Bormann, in a
1941 Schrifterlass declared Fraktur to not be “Gothic” but rather Schwabacher
“Jewish letters” (controversy). The final demise of Fraktura as the font of the
Third Riech was motivated not only by Hitler’s warped sense of typographical
beauty—if that had been enough, it had like been abolished around the afore
given 1934 quote—but also practical matters. In the German occupied terriotries,
people stuggled to read Fraktur, having not been trained to do so during their
schooling. Further, stereotypes (type of printing plate developed in the late 18th
century and widely used in letterpress, newspaper, and other high-speed press
runs) supporting the font for the pressing of Joseph Goebbels’ propaganda were
few and far between in these newly invaded countries. In its stead Antiqua, a
type face much easilyer discernable by the subject of the extended Third Reich,
would be used.
In the case of the double-struck letters, for those who have had to explain to
students that mathematics is not the mere study of numbers (
1
,
2
,
3
,
…
) but
rather the sets thereof (
ℤ
=
{
0
,
1
,
−
1
,
2
,
−
2
,
…
}
)—or sets in general—the idea of
denoting these by blackboard bold letters is ingenious—draw two diagonal lines,
N
→
ℕ
, and the studious mind readily understands that this symbol denotes
something more profound. Popularized in the 1960s during the typewriter era,
this shorthand for conceptual difference was particularly apt for the page: strike
the letter key twice, and voilà…
However, in artifcial intelligence, there is a long standing debate between
suymbolic and
sub
-symbolic approaches to solving intelligence. With the deep
learning revolution, this debate has largely been settled with subsymbolism as
the winning paradigm. But what is subsymbolism? If the atoms of DNA, the neu
cliotides denoted by the afforementioned alphabet
{
𝐴
,
𝐶
,
𝑇
,
𝐺
}
, are symbolic,
what atoms might that atoms of deep learning be so as to merits the prefix
sub
?
Again, words is the enemy of clarity of thought (see
/terminology
). As an example
of a subsymbolic deep learning see
/miiii
, the trained model therein implements
an algorithm similar to
Eq. 1
, the modular addition algorithm reverse engineered
by
N. Nanda, L. Chan, T. Lieberum, J. Smith, and J. Steinhardt [6]
.
𝑥
0
→
sin
(
𝑤
𝑥
0
)
,
cos
(
𝑤
𝑥
0
)
𝑥
1
→
sin
(
𝑤
𝑥
1
)
,
cos
(
𝑤
𝑥
1
)
(1)
For those nont mathematically inclindes,
Eq. 1
might seem inscruitable, but
comparing that to the weights of the trained model directly, focusing only on
the first two lines, of the equation, visualized as deep learning weights, we have
Figure 2
. To quote Anders Søgaard, there is no directing mapping from
Figure 2
to the mathematical notation in
Eq. 1
.
Figure 2: The embedding layer of the MIIII model
The success of sub-symbolic AI, has only further increased the importance of
Liebniz’s array, the way in which data has most often be represented. Now,
programs, too, are increasingly represented as such. And yet, the way in which
arrays are turned to ink is a an frequent abomination, compared to the care with
which symbols are.
Figure 3: Taliban's much improved Afghan flag
¹
https://www.metmuseum.org/art/collection/search/388475
References
[1]
A. McAfee,
More from Less: The Surprising Story of How We Learned to
Prosper Using Fewer Resources–And What Happens Next
. New York: Scribner,
2019.
[2]
C. E. Shannon, “A Mathematical Theory of Communication,”
Bell Sys
tem Technical Journal
, vol. 27, no. 3, pp. 379–423, 1948, doi:
10.1002/
j.1538-7305.1948.tb01338.x
.
[3]
M. N. Bin-Jumah
et al.
, “Genes and Longevity of Lifespan,”
International
Journal of Molecular Sciences
, vol. 23, no. 3, p. 1499, Jan. 2022, doi:
10.3390/
ijms23031499
.
[4]
D. Knuth, “Computer Modern.” 1992.
[5]
T. M. Cover and J. A. Thomas,
Elements of Information Theory
, 2nd ed.
Hoboken, N.J: Wiley-Interscience, 2006.
[6]
N. Nanda, L. Chan, T. Lieberum, J. Smith, and J. Steinhardt, “Progress Mea
sures for Grokking via Mechanistic Interpretability,” no. arXiv:2301.05217.
arXiv, Oct. 2023.