05 April 2023

Early Chinese Telegraph Codes

From Kingdom of Characters: The Language Revolution That Made China Modern, by Jing Tsu (Riverhead Books, 2022), Kindle pp. 91-92, 106-108, 110-111, 123-124:

In Morse code, the basic symbols were dots and dashes. The system’s twenty-six combinations of dots and dashes, ranging from one to four symbols, were meant to accommodate the twenty-six letters of the alphabet, with another ten combinations of five symbols each for numbers zero to nine.

To send a message, a telegraph operator pressed an electric switch, in the form of a key: a short tap for a dot and a long one for a dash. The message was converted into an electric current that traveled along the wires and was reverse translated into letters and numbers on the receiving end. The sound of clicking patterns could become so familiar that an experienced telegrapher could tell what word was being coded from its distinct rhythm. Telegraph costs were determined by how long they took to transmit—each dot or space was a single unit, and a dash—three times as long as a dot—was three units. As Morse explained early on, his system was designed to be cost-efficient. The most frequently used letter in English, “e,” was also the least expensive: It was represented by a single dot. The high frequency of “e” holds true for most European languages, from Italian to Dutch. But Morse code clearly favored the American English alphabet. An English letter takes up somewhere between one and thirteen units. To add even a single diacritical mark to the letter “a”—as when making the French “à”—required ten more units. So there was already plenty to disagree about among Roman alphabet users.
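The unit arithmetic in this paragraph can be checked with a short sketch. It assumes the International Morse alphabet and the usual convention that the gap between symbols inside one letter lasts one unit; the name `letter_units` is only illustrative. Under those assumptions the totals fall in the one-to-thirteen range quoted above.

```python
# International Morse patterns for the 26 letters (assumed alphabet).
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}


def letter_units(letter: str) -> int:
    """Transmission time of one letter, in dot-length units."""
    pattern = MORSE[letter.upper()]
    # A dot lasts one unit, a dash three.
    marks = sum(1 if symbol == "." else 3 for symbol in pattern)
    # One unit of silence separates consecutive symbols within a letter.
    gaps = len(pattern) - 1
    return marks + gaps


costs = {letter: letter_units(letter) for letter in MORSE}
cheapest = min(costs, key=costs.get)
dearest = sorted(l for l, units in costs.items() if units == max(costs.values()))
print(cheapest, costs[cheapest])   # the single-dot letter
print(dearest, max(costs.values()))
```

In this table "E" is the only one-unit letter, while "J", "Q", and "Y" each take thirteen units.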

The inequities of Morse code were on a different scale for the Chinese. International telegraphy recognized only the Roman alphabet letters and Arabic numerals used by the majority of its members, which meant that Chinese, too, had to be mediated via letters and numbers. Whereas English could be English, and Italian mostly Italian, Chinese had to be something other than itself. Every Chinese character was transmitted as a string of four to six numbers, each of which cost more than a letter. The assigned code for a Chinese character first had to be looked up in a codebook before being converted to the dots and dashes of Morse code. Coding and converting Chinese characters into an ordinary telegram of twenty-five words required at least half an hour, whereas a comparable message in English took only about two minutes. Untold opportunity costs accrued with every telegraph that was delayed when the operator had to pause to check a character against its assigned number in a codebook or had to take extra time to correct an error.
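A rough sketch of why a numeric code was expensive, assuming International Morse digits, one unit of silence between symbols, and three units of silence between consecutive digits. The helper names `digit_units` and `code_units` are hypothetical, and the sketch covers only the four-digit case.

```python
# International Morse patterns for the ten digits (assumed alphabet).
DIGITS = {
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}


def digit_units(digit: str) -> int:
    """Transmission time of one digit, in dot-length units."""
    pattern = DIGITS[digit]
    marks = sum(1 if symbol == "." else 3 for symbol in pattern)
    return marks + (len(pattern) - 1)  # one-unit gaps inside the digit


def code_units(code: str) -> int:
    """Transmission time of a digit string such as a four-digit character code."""
    # Assumption: a three-unit gap separates consecutive digits.
    return sum(digit_units(d) for d in code) + 3 * (len(code) - 1)


per_digit = {d: digit_units(d) for d in DIGITS}
print(min(per_digit.values()), max(per_digit.values()))
print(code_units("5555"), code_units("0000"))
```

Under these assumptions every digit takes between 9 and 19 units, against the one to thirteen units quoted for a letter, and a single four-digit character takes between 45 and 85 units before any codebook lookup time is counted.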

...

[Septime Auguste] Viguier possessed the confidence and skill set that Great Northern [Telegraph Company (大北電報公司 / 大北电报公司 Dàběi Diànbào Gōngsī)] was looking for. He had already worked on developing a code for Chinese telegraphy years earlier for the French government in support of their failed efforts to interest the Chinese Empire in their telegraph cables. He was well versed in early word-copying machines like the Caselli pantelegraph, a precursor to the modern-day fax machine. When the French project was shelved, Viguier ended up in Shanghai—ripe for the Danes’ recruitment. He was the best candidate but not well-liked. Colleagues immediately noted his preening and boastfulness—the French way, they sneered. Viguier later also had a nasty exchange with the managing director Suenson, and his relationship with the company soured over questions of compensation and credit. Nonetheless, Viguier was able to work quickly enough to build out the Danish professor’s incomplete scheme. By June 1870, he had the first version. In 1872, he delivered the final, standardized telegraphic code table for 6,899 characters in The New Book for the Telegraph (Dianbao xinshu).

...

Viguier came up with a tabular form of twenty rows and ten columns per page. He assigned an arbitrary four-digit code from 0001 to 9999 to each character, with empty spaces left for potentially 3,000 more codes to accommodate customized vocabulary for individual business purposes. Each page contained 200 square spaces for listing 200 characters and their numerical codes. The code only included a relatively small number of characters out of the 45,000 or so that were extant. The mass scaling of telegraphy meant that it was geared toward the common person and the common tongue, so restricting the number of characters was not only efficient but also practical.
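The page arithmetic can be sketched as follows. The excerpt gives only the grid size (twenty rows by ten columns) and the code range, so the fill order used here, consecutive codes running across each row and then down the page, is an assumption, and `locate` is a hypothetical helper rather than anything from the codebook itself.

```python
import math

ROWS, COLS = 20, 10          # grid size given in the excerpt
PER_PAGE = ROWS * COLS       # 200 squares per page


def locate(code: int) -> tuple[int, int, int]:
    """Return (page, row, column), all 1-based, for a four-digit code.

    Assumes consecutive codes fill each page row by row; the real
    codebook may have been laid out differently.
    """
    if not 1 <= code <= 9999:
        raise ValueError("code must lie between 0001 and 9999")
    page, within_page = divmod(code - 1, PER_PAGE)
    row, col = divmod(within_page, COLS)
    return page + 1, row + 1, col + 1


pages_needed = math.ceil(9999 / PER_PAGE)
print(locate(1), locate(200), locate(201))
print(pages_needed)
```

With 200 squares per page, the full range of 9,999 codes fits on 50 pages, and a lookup in either direction is one division away once the layout is fixed.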

...

But Viguier’s telegraphic code did not go unchallenged. Almost immediately, the Chinese tried to outdo and improve upon it. A quiet young Chinese translator who had been part of that diplomatic mission to Europe in 1868, Zhang Deyi, became the first Chinese to do so. Zhang noted the pain of having to send Chinese messages back to the Chinese office in China in “foreign letters” whenever more urgent service was required. He also saw how Western telegrams were more secure, as secret messages were sent in numbers. That inspired Zhang to construct his own Chinese telegraphic codebook by following a similar format.

While the published version of Viguier’s work was an important landmark, Zhang zeroed in on its sloppiness. Viguier’s numbering of characters did not make them terribly easy to use for the Chinese. The continuous numbers did not separate out characters into groups, which was how the Chinese were accustomed to searching for characters in a dictionary. He decided to trim down the format of Viguier’s system and do some reorganization to make its content clearer. Zhang’s own New Method of Telegraphy (Dianxin xinfa) was published two years after Viguier came up with a draft of his telegraphic code in 1873. It reordered the characters so that the numbers were less arbitrary. Zhang used the same 214 radicals, but reselected about 7,000 characters from the Kangxi Dictionary and assigned them numbers from 0001 to 8000.

...

Westerners like Viguier had mapped Chinese onto numbers. Then the Chinese themselves had tried to use numbers to remap the alphabet. They kept bending the stick back and forth. Wang [Jingchun] was increasingly of the mind that one could put the Western alphabet in service of Chinese Romanization more permanently. He turned to Bopomofo, the Chinese phonetic alphabet approved at the 1913 National Language Unification Conference in Beijing, and its idea of an auxiliary phonetic alphabet formed from different styles and parts of Chinese characters. Working from this basis, Wang designed a use for Roman letters that was Latin in name but readapted to signal the three linguistic properties of Chinese characters: the phonetic representation of sound, tone, and the radical.

To indicate sound in his New Phonetic System, Wang mapped the sounds of Bopomofo—represented by symbols ㄅ, ㄆ, ㄇ, ㄈ, etc.—onto alphabet letters that shared similar starting consonant sounds. So ㄅ, ㄆ, ㄇ, ㄈ would match the letters “b,” “p,” “m,” and “f.” To show tone, Wang picked five letters to represent the five tones used in traditional and medieval phonology: “B” stands for the level or even tone; “P” marks the second or rising tone; “X” represents the third tone, which falls first then rises; “C” is fourth or falling tone; and “R” denotes the fifth or neutral tone. The last property, the radical, takes up two letters—a consonant and a vowel. Wang used two letters to spell the pronunciation of the radical part of the character only; e.g., tu for 土, li for 力, ko for 口, etc., in a way that was not dissimilar to what Wang Zhao had done with the Mandarin Alphabet. With one letter for sound, another letter for tone, and two more for phoneticizing the radical’s spelling, this system yielded a four-letter code for every character. The Chinese character could then be transmitted via telegraphy without using numbers at all. Wang’s idea took after other Romanization systems of the time, which were developed not for telegraphy per se, but to address the broader question of literacy. He borrowed from that conversation, run by linguists and ethnographers, to design a solution for what he had seen in the diplomatic arena.
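The way the four letters fit together can be sketched as below. Only the tone letters and the one-plus-one-plus-two structure come from the excerpt. The order of the parts, the upper-casing, the function name `wang_code`, and the sample inputs are all assumptions made for illustration, not entries from Wang's actual table.

```python
# Tone letters as listed in the excerpt: level, rising, falling-rising,
# falling, and neutral.
TONE_LETTERS = {1: "B", 2: "P", 3: "X", 4: "C", 5: "R"}


def wang_code(sound_letter: str, tone: int, radical_spelling: str) -> str:
    """Assemble a four-letter code from sound, tone, and radical spelling.

    Assumed order: sound letter, tone letter, two-letter radical spelling.
    """
    if len(sound_letter) != 1 or not sound_letter.isalpha():
        raise ValueError("sound must be a single letter")
    if tone not in TONE_LETTERS:
        raise ValueError("tone must be 1 through 5")
    if len(radical_spelling) != 2 or not radical_spelling.isalpha():
        raise ValueError("radical spelling must be two letters")
    return (sound_letter + TONE_LETTERS[tone] + radical_spelling).upper()


# Hypothetical inputs: initial sound "m", fourth (falling) tone, and a
# radical spelled "tu" (the excerpt's spelling for the radical 土).
print(wang_code("m", 4, "tu"))
```

Because the result uses only Roman letters, it could be keyed directly in Morse code without the detour through digits.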

During the year the Far Outliers spent in China in 1987-88, we had occasion to send a telegram to fellow teachers who were spending winter holidays in their hometown of Jingdezhen, famous for its pottery. They had written most of the text of the telegram and all we had to do was add the day and time when our train would arrive. So before we boarded the train for Jingdezhen, I handed the text of the telegram to a clerk at the telegraph office, who proceeded to rewrite the message as a series of four-digit codes, one for each Chinese character. The message was very short and she had probably memorized some of the most frequent codes for 'arrive', 'depart', and dates and times, but it still looked like a tedious chore.

This interesting chapter includes a very misleading table, shown below. The table shows American Morse code (also called Railroad Morse or Morse landline code), which was standardized in 1844 and used by American railroads as late as the 1970s. Its variable spacing and variable dash lengths did not travel well through undersea cables. Central Europeans used a modified code, the Hamburg alphabet, that evolved into the International Morse Code standardized in 1865. Working in the 1870s, Viguier and Zhang almost certainly used the international standard, the one where 'SOS' is rendered by the familiar dit dit dit dah dah dah dit dit dit.

Erroneous Morse Code
