Give yourself more security than you need today.
When the unexpected happens, you’ll be glad you did.
— Bruce Schneier
Codebreakers are linguistic alchemists, a mystical tribe attempting to conjure sensible words
out of meaningless symbols.
— Simon Singh
Simon Singh63 also has this to say about codes and coding:
Codes are more relevant today than ever before. Nowadays our phone calls bounce off satellites and our e-mails pass through various computers, and both forms of communication can be intercepted with ease, jeopardizing our privacy. Similarly, as more and more business is conducted over the Internet, safeguards must be put in place to protect companies and their clients. Encryption is the only way to protect our privacy and guarantee the success of the digital marketplace. Cryptology will provide the locks and keys of the information age.64
One of my duties as a naval officer duringWorldWar II was as my ship’s coding officer.65 That meant that I had to encipher messages for transmission over our ship’s radio and decipher incoming messages. When in port I also had to make monthly visits to fleet headquarters to obtain replacement coding materials. This was one of my most interesting assignments as I learned a great deal about how secret information was communicated at that time. In particular, I learned that secrecy should only be considered temporary: over time ciphers would be broken and the information they carried learned by the enemy. The trick then is to make that time period as long as possible.
Perhaps the best example of this temporal quality of communication secrecy was the 1941 Pearl Harbor episode. Our cryptographers had broken the Japanese diplomatic code, PURPLE, and they had partially broken the Imperial Japanese Navy code, JN-25. Some decoded messages suggested an attack but, for various reasons the information was not deciphered and communicated in time to give the naval base commanders an opportunity to deploy their defenses and send ships out of the crowded harbor. The result was this nation’s most serious naval defeat in history: 2,402 men and women were killed and 1,282 wounded; eight battleships, three cruisers and three destroyers were sunk or damaged; and 188 airplanes were destroyed, almost all on the ground.
Today encrypting and decrypting messages involves computers loaded with complex programs. You might sit at a keyboard-like console and type a message in standard English (what cryptographers call plaintext) and the machine would translate that message into an encrypted form to be sent over open communication channels like the Internet. What I will explore with you in this chapter are: (1) some of the mathematics that supports communication secrecy and (2) some of the history of this subject. I begin with some coding history.
5.1 Secret communication
One means of passing secret messages is to have couriers carry them in some way hidden on their person or among their belongings. Singh cites one extreme example of an obviously low priority message written on a man’s shaved skull. He then allowed his hair to re-grow before passing through enemy territory. Upon arrival at his destination he had his head shaved once again. This is a rather extreme example of what is called steganography, a word derived from the Greek for “concealed writing.”
Twenty-five hundred years ago the Greek Demartus carried a message hidden under the wax of what appeared to be blank tablets to warn the Athenians of a planned invasion by the Persian king, Xerxes. This allowed the Athenians and Spartans to build the fleet that defeated Xerxes’ forces at the Battle of Salamis in 480 BCE. That episode may suggest that only the ancients employed such devices, but hidden messages continue to be used today. In one instance during World War II, German spies in South America reduced pages of text to millimeter size. These so- called microdots were then printed as the periods that ended sentences in otherwise unexceptional documents. The FBI discovered this process and warned agents to watch out for these “shiny periods” in correspondence they intercepted.
There is, however, a serious problem with such hidden messages. Upon discov- ery, their plaintext is immediately evident.
In a real sense all written and even spoken language is in code. If you don’t know a foreign language, messages in that language carry little meaning without a codebook. If the language is Russian, for example, the codebook would be a Russian- English dictionary. It is, in fact, reasonable to consider linguists as cryptologists, for both seek to understand difficult texts. United States military forces took advantage of this form of language secrecy during World War II by enlisting Navajo Indians as front-line radio communicators. These so-called Windtalkers spoke to each other over open communication lines in their own language.
Figure – 5.1:
The Rosetta Stone
Major advances in decryption were made in the early 19th century when Egyptian artifacts were discovered and the ancient languages they represented were decrypted by French and British scholars after Napoleon’s abortive expedition against Egypt. A key discovery in 1799 was the Rosetta Stone, an artifact from 196 BCE that is now in the British Museum. A decree from Ptolemy V, it was divided into thirds with the same text (about tax relief) printed in hieroglyphic, Demotic and ancient Greek. The Greek text could be read and it served by comparison to translate the others that had until then not been deciphered. Thomas Young first translated the Demotic text and Jean-Franc¸ois Champollion the hieroglyphics. Their work provided access to thousands of previously indecipherable documents.
The flags you see hoisted on navy vessels each represent a letter, but they may also convey messages that are found in signal officer’s codebooks. For example, the all-red B (called “bravo”) flag, raised alone, indicates that the ship is working with dangerous cargo like explosives.66 Such signals are useful when the ships are within eyesight and their codebooks are shared: in particular, they allow radio silence. Those codebooks are what you see being thrown overboard in weighted bags in films about ships before they are surrendered to an enemy.
Two important communication ciphers are the Morse and ASCII (pronounced “ask’-e”) codes. The Morse code has letters and numbers represented by series of dots and dashes. Originally designed in the 1840s for use over telegraph lines (this was before the spoken voice was transmitted over those lines) this code remains in use today largely by amateur radio operators. In one odd use of this code, the tempo of the music that concludes the “Inspector Morse” series on television is based on Morse code with −− −−− ·−· ··· · spelling out morse.
Figure – 5.2:
Morse Code
ASCII is short for American Standard Code for Information Interchange. It has almost completely replaced the Morse code for communication purposes, using bi- nary representation to convey numbers, letters and other symbols. On the following chart you will note that the symbols are listed with their decimal (base 10), hex- adecimal (base 16) and octal (base 8) values. For example, the letter “a” is coded as 97Dec , 61Hx and 141Oct. It is important to understand that these three numbers are equal and are also equal to the binary number 1100001Bin. These relationships will be clarified in the following subsection.
Figure – 5.3:
Basic ASCII Code for Digits and Letters
Char Dec Hx Oct | Char | Dec | Hx | Oct | Char | Dec | Hx | Oct |
0 48 30 060 | A | 65 | 41 | 101 | a | 97 | 61 | 141 |
1 49 31 061 | B | 66 | 42 | 102 | b | 98 | 62 | 142 |
2 50 32 062 | C | 67 | 43 | 103 | c | 99 | 63 | 143 |
3 51 33 063 | D | 68 | 44 | 104 | d | 100 | 64 | 144 |
4 52 34 064 | E | 69 | 45 | 105 | e | 101 | 65 | 145 |
5 53 35 065 | F | 70 | 46 | 106 | f | 102 | 66 | 146 |
6 54 36 066 | G | 71 | 47 | 107 | g | 103 | 67 | 147 |
7 55 37 067 | H | 72 | 48 | 110 | h | 104 | 68 | 150 |
8 56 38 070 | I | 73 | 49 | 111 | i | 105 | 69 | 151 |
9 57 39 071 | J | 74 | 4A | 112 | j | 106 | 6A | 152 |
| K | 75 | 4B | 113 | k | 107 | 6B | 153 |
| L | 76 | 4C | 114 | l | 108 | 6C | 154 |
Decimal values | M | 77 | 4D | 115 | m | 109 | 6D | 155 |
0-47, | N | 78 | 4E | 116 | n | 110 | 6E | 156 |
58-64, | O | 79 | 4F | 117 | o | 111 | 6F | 157 |
91-96 and | P | 80 | 50 | 120 | p | 112 | 70 | 160 |
123-127 | Q | 81 | 51 | 121 | q | 113 | 71 | 161 |
are | R | 82 | 52 | 122 | r | 114 | 72 | 162 |
designated | S | 83 | 53 | 123 | s | 115 | 73 | 163 |
control | T | 84 | 54 | 124 | t | 116 | 74 | 164 |
characters | U | 85 | 55 | 125 | u | 117 | 75 | 165 |
and symbols | V | 86 | 56 | 126 | v | 118 | 76 | 166 |
| W | 87 | 57 | 127 | w | 119 | 77 | 167 |
| X | 88 | 58 | 130 | x | 120 | 78 | 170 |
| Y | 89 | 59 | 131 | y | 121 | 79 | 171 |
| Z | 90 | 5A | 132 | z | 122 | 7A | 172 |
Binary, octal and hexadecimal numeration
Although binary representation offers an extremely simple means of communication —0 and 1 represented by alternative states—writing lengthy binary numerals takes up much space and can easily result in communication errors. If, for example, you want to represent the simple word “Go” in ASCII code, you would have to write the binary numbers for the decimal numbers 71 and 111. Thus you would have a string of 14 so-called binary bits: 1000111 1101111.
Octal (base 8) and hexadecimal (base 16) representation allow you to represent that number, and others like it, in briefer form. The reason those two bases are chosen is that they are closely related to binary in a way that decimal numbers are not. Consider first base 8. Here is a table showing the digits in base 8 compared with those in base 2:
The binary numbers are represented with three digits because that is the least required to represent all the digits 0 through 7, and 7 is the largest digit in base 8.
Octal gives you a way to represent lengthy binary numbers in shorter form. Suppose, for example, you wish to represent the binary number 1101111 in octal. To do so, break it into groups of three digits starting from the right, just as you separate decimal numbers into thousands. You get 1,101,111. Now convert each 3-digit set into octal and you have 157. The ASCII table confirms this octal number as 157, because 1101111 was the binary number we used for the letter “o”. But consider the still larger number we get when we string those 14 digits together—10001111101111.
You can represent that in octal equally easily: break it into 10,001,111,101,111 which corresponds to the octal: 21757. Once you get used to this representation, conversion back and forth is very easy.
You can, moreover, shorten binary numbers still further by using hexadecimal or hex representation. The word hexadecimal means base 16.68 There is a problem with hexadecimal, however. Base 16 needs digits to represent numbers corresponding to the decimal numbers 0 to 15, and we have available only the ten digits of our decimal system. To address this problem, we simply continue with letters from the alphabet, counting 1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,10. Here then is a table equating 4-digit binary numbers with their corresponding hexadecimal digits:
Now we can translate those binary numbers four digits at a time: our letter “o” is now represented as 110,1111 which is 6F in hex, just as it appears in the ASCII table. And our larger number, 10,0011,1110,1111 is 23EF. It should not take you long to convince yourself that converting between these bases, which are powers of 2, is much easier than converting to and from the decimal system.
There is a matter that is indirectly related to this representation. Although it is less efficient, many calculators operate with what are called binary-coded decimals.
Instead of converting a number like 379 directly to binary (it would be 101111011) it is converted digit by digit into 4-digit binaries, thus 0011 0111 1001, and the numbers are processed internally in that form. Notice that those four-digit binaries are also equal to single hexadecimal digits (3, 7 and 9) that are always the same as the decimal values. But notice too how efficiency is lost: coding the decimal number 379, for example, takes 9 binary bits but 12 binary-coded decimal bits.
68Hex is a Greek prefix for 6 and dec for 10; the two combine in hexadecimal to make 16. This is similar to the word dodecagon, a 12-sided polygon with do representing 2 and dec again 10.
Some additional coding examples
Children use various techniques to make their communications confidential — or more often simply part of an “in” language. Pig Latin is one such code.69 Another is splitting a message by the “rail fence” technique. Here is how you would encrypt the message, “Paranoia is our profession,” which, it has been suggested, should be the motto of the North American Defense Command:
First, write the letters of the phrase together, omitting spaces between words:70 Then you copy the text in two rows, alternating letters:
Now copy those two rows as they appear, PRNIIORRFSINAAOASUPOESO, and you have your message. Your friends decipher the message by splitting it in two and arranging a similar fence.
The very first military cryptographic device, the Scytale, was used by the Spartans as early as the seventh century BCE to encode and decode messages by a somewhat similar breakdown method. To illustrate this tool, here is how one of Benjamin Franklin’s aphorisms “Three may keep a secret when two of them are dead” is encrypted. The letters are broken into six rows of characters. (Five characters are added at the end to square off the array.)
Now the characters are printed on a narrow vertical strip, but reading the letters by columns; thus, the strip would begin:
The Scytale itself is a rod with a polygonal cross section similar to that of a wooden lead pencil but larger, its sides conforming to the size of the characters on the message strip. This illustration assumes a rod with a hexagonal cross section. Those receiving the coded message would wrap the narrow strip around the rod in spiral fashion and the message would appear in lines just as they were before coding. Of course, this could also be accomplished simply by cutting the message strip after each group of six letters and placing the resulting strips side by side.
Exercises 5.1
(5.1.1) Decode the rail fence code: WAEUHTFADEMAEAENERSCSUFSRASRMDO.
(5.1.2) Change the following numbers. (You may need to use Panel 5.1.1 to answer (c), (d) and (e). Binary representation and arithmetic is developed in Chapter 8.)
(a) 1111110011101Bin to octal
(b) 1111110011101Bin to hexadecimal
(c) 234Oct to binary
(d) 234Dec to binary
(e) A5DHex to binary
(5.1.3) Use the ASCII table and Panel 5.1.1 to answer the following:
(a) Notice that the ASCII basic code has 128 entries: 0 to 127. How many binary digits are necessary to represent each of these numbers.
(b) What are the binary representations for the ASCII values for Z and z?
(5.1.4) You intercept a message designed for decoding by use of a scytale, but you do not know how many sides there are in the scytale’s polygonal cross-section. The single column of digits is ordered CTOHCHSOUNAOIANLSVDSGGAYEEMERTODDEXAIUETSY. Decode this message.
5.2 Substitution ciphers
A B C D E F G H I J K L MNOP Q R S T UVW X Y Z
D E F G H I J K LMNO P Q R S T UVW X Y Z A B C
For the message Caesar himself might have sent, CROSS THE RHINE, you would have:
FURVV WKH UKLQH.
This kind of secret communication is, of course, easy both for encryption and decryption. Panel 5.2.1 will do both for you. To decode a message you need only shift the coded letters an equal distance to the left or 26 minus your initial shift further to the right.
Although such shifting is simple to carry out, it is equally simple to break this kind of code. There are, after all, only 25 different shifts and it would not take long to discover which one is being employed. It is not a significant intellectual leap to the idea of mixing the letters randomly in the second line of that coding list.
Then there are not just 25 possibilities but 26! = 403,291,461,126,605,635,584,000,000 possibilities for coded messages. To see how that huge number is arrived at, consider the placement of the letters one at a time in that coding table. You have 26 choices for the first letter, then 25 for the second and so on until there is only one place left for the last letter. It is the product of those numbers—26 · 25 · 24 · … · 3 · 2 · 1 = 26!— that leads to that huge number of possibilities.
Another problem arises with that approach, however. If the letters are completely at random, it will be necessary for the receiver to have a record to use for decoding – unless, that is, the receiver has an unusual memory. (That is the kind of record that you see spies in movies sneaking into offices or hotel rooms to copy.) One way to get around needing such a written record is to have at least part of the coded list be a phrase that can be recalled easily. You may have met just such a phrase in learning to keyboard: “The quick brown fox jumps over the lazy dog.” That sentence was designed to use every letter in the alphabet and thus will serve us quite well. It has, however, some repetitions. We can easily drop the first “the”72 and then omit further letter duplications. Here then is our more easily remembered list for encryption:
A B C D E F G H I J K LM N O P Q R S T U V W X Y Z
Q U I C K B R O WN F X J M P S V E T H L A Z Y D G
A secret agent would be able to recreate that list from memory and it represents only one of millions of possibilities for such coding.73 This code can then be used with an additional Caesar shift or directly (with a further shift of 0.)
With no further shift CROSS THE RHINE would be encrypted as: IEPTT HOK EOWMK. The program of Panel 5.2.2 carries out this ciphering process for you. Notice that it works for both ciphering and deciphering. If we use a shift of 3 letters, the pairing would be:
A B C D E F G H I J K LMNOP Q R S T U VW X Y Z
C K B R OWN F X JMP S V E TH L A Z Y D G Q U I
and this time CROSS THE RHINE would be encrypted as: BLEAA ZFO LFXVO.
For many centuries such substitution ciphers were thought to be safe from decryption. But code breakers74 are generally very creative people, and mathematics began to play an increasingly important role in their work. Remarkably, given a reasonable amount of text, any straightforward75 substitution cipher can be broken in a short time. We can see how this is done by considering an example from American literature in the next section.
Exercises 5.2
(5.2.1) Caesar ciphers are so transparent that they are often used in the same way as writing answers to puzzle questions upside down: to hide the answers only temporarily. One such cipher is designated ROT13. ROT13 is an abbreviation for rotate 13, and is a Caesar shift of 13 letters. (Note: There is no program for this. You simply use Panel 5.2.1 with a shift distance of 13.)
(a) Encipher the word MATH with ROT13.
(b) Encipher your answer to (a) with ROT13.
(c) What does this indicate about ROT13?
(d) Why does this work?
(e) What would be the result if you encoded with ROT13 and encoded the result with ROT13.
(f) Let the Caesar shift of 0 letters be designated I. How does I relate to successive shifts of ROT13?
(5.2.2) Many newspapers and magazines include substitution ciphers as puzzles but, unlike most secret messages, they are broken into words, which helps in deciphering. Other clues are often offered to help readers solve them. Here is a quotation of this type for you to decrypt. Suggestion: the fifth and sixth words should give you a good start.
“AZ PEG JWMM ZEMUB PEG’SW Y QEMMWCW BJGFWDJ, ZEMUB YSW BE AHISWBBWF. PEG QYD XW Y BJGFWDJ AD YDPJKADC YDF DEJ KYLW JE UDEN YDPJKADC. TGBJ BYP JEOAQEMECP ES HYSADW XAEUADWBAB, YDF JKW IWSBED PEG’SW JYMUADC JE NAMM QKYDCW JKW BGXTWQJ JE KAHBWMZ. AZ JKAB FEWBD’J NESU, HWDJAED JKW DWGSYM BPDYIBWB EZ WHXSPEDAQ IACWEDB.”–QKGQU IYMYKDAGU
(5.2.3) In the very funny motion picture about pre-WorldWar II America, A Christmas Story, the young boy Ralphie orders a decoder ring from the Orphan Annie radio program and decodes the broadcast secret message. (Confession: I did too.) It turns out to be an advertisement for the show sponsor. You can do even better with Panel 5.2.1 playing the role of the ring.
(a) Caesar shift by 10: BEWARE THE IDES OF MARCH.
(b) What positive number shift would undo a shift of 10?
(c) Use this shift to decode your answer to (a).
(5.2.4) Carry out the tasks of exercises (5.2.3a) and (5.2.3c) using “Quick Brown Fox” coding.
5.3 Edgar Allen Poe: an early cryptologist
BXTTKXZBEEYMARCWYERTPERTEACZY
MARCKCFYZEECBADTLAOTMCKCXLCCE
BMKARYLACCMHYMQACEMTLARCBEABM
KWOMTLARHBYMWLBMNRECFCMARZYHW
CBEAEYKCERTTADLTHARCZCDACOCTD
ARCKCBARERCBKBWCCZYMCDLTHARCA
LCCARLTQXRARCERTADYDAODCCATQA
Legrand explains how he began his task:
Now, in English, the letter which most frequently occurs is e. Afterwards, succession runs thus: a o i d h n r s t u y c f g l m w b k p q x z;
with e predominating so remarkably that an individual sentence of any length is rarely seen in which it is not the prevailing character. Here, then, we have, in the very beginning, the groundwork for something more than a mere guess. The general use which may be made of the table is obvious – but, in this particular cipher, we shall only very partially require its aid. As our predominant character is C, we will commence by assuming it as the e of the natural alphabet. To verify the supposition, let us observe if the C be seen often in couples – for e is doubled with great frequency in English – in such words, for example, as meet, fleet, speed, seen, been, agree, &c. In the present instance we see it doubled no less than five times, although the cryptograph is brief.
That is not the only statistical clue that Legrand uses in deciphering the message, but I will leave further work on it to exercise (5.3.1), with help there if you need it.
Wikipedia suggests, however, that “recent analyses show that letter frequencies, like word frequencies, tend to vary, both by writer and by subject. One cannot write an essay about x-rays without using frequent Xs, and the essay will have an especially strange letter frequency if the essay is about the frequent use of x-rays to treat zebras in Qatar. Different authors have habits which can be reflected in their use of letters. Hemingway’s writing style, for example, is visibly different from Faulkner’s. Letter, bigram, trigram, word frequencies, word length, and sentence length can be calculated for specific authors, and used to prove or disprove authorship of texts, even for authors whose styles aren’t so divergent.”
If you look back at the symbols used for Morse code, you will see that more common letters are represented by shorter combinations of dots and dashes. Such letter frequencies also play a role in both design and strategy for a number of games including hangman, Scrabble, Wheel of Fortune, Definition, Bananagrams, and cryptograms.
One aspect of this study of letter frequency even had an effect on popular lit- erature. Before computers set type for books and newspapers, linotype machines were used and the central two lines of the machine keyboard reflected the letters then considered most common: e t a o i n s h r d l u. A linotype operator who made an error typing a line would finish the line by simply running his finger along the row of characters, producing those letters, and occasionally they would fail to remove the resulting metal slug that printed it. Thus the strange words etaoin shrdlu occasionally appeared in print. The words soon gained their own pronun- ciation, eh-TAY-oh-in SHIRD-loo, and the words took on a kind of in-joke status. For example, Etaoin Shrdlu is a central character in Max Shulman’s Barefoot Boy with Cheek and Shrdlu is a murderer originally played by Edward G. Robinson in Elmer Rice’s play, The Adding Machine.
Perhaps the most famous story associated with breaking a substitution code was that of Mary Queen of Scots. Driven out of Scotland, Mary headed south to seek refuge with her cousin, the English queen Elizabeth I. But the Catholic Mary posed a threat to the Protestant Elizabeth and so Mary was immediately imprisoned. Isolated and increasingly depressed, Mary communicated in a substitution code with a group seeking to rescue her in what came to be known as the Babington Plot. Elizabeth’s minister, Sir Francis Walsingham, intercepted Mary’s messages and a linguist on his staff, Thomas Phelippes, broke the code through careful frequency analysis of a number of messages. Mary’s co-conspirators were captured and executed by horrible means. Mary was then tried, found guilty and finally beheaded on February 8, 1587.
Exercises 5.3
(5.3.1) Decipher the Gold Bug code, using as few of the following hints as possible.
(a) Once he has entered e for C throughout the Kidd cipher, Legrand says:
Now, of all words in the language, the is most usual; let us see, therefore, whether there are not repetitions of any three characters, in the same order of collocation, the last of them being C. If we discover repetitions of such letters, so arranged, they will most probably represent the word the. Upon inspection, we find no less than seven such arrangements, the characters being ARC.We may, therefore, assume that A represents t, R represents h, and C represents e the last being now well confirmed.
(b) Legrand continues:
Let us refer to the last instance but one, in which the combination ARC occurs. We know that the A immediately ensuing is the commencement of a word, and, of the six characters succeeding this the, we are cognizant of no less than five. Let us set these characters down, thus, by the letters we know them to represent, leaving an asterisk for the unknown t*eeth. Here we are enabled, at once, to discard the th, as forming no portion of the word commencing with the first t; since, by experiment of the entire alphabet for a letter adapted to the vacancy, we perceive that no word can be formed of which this th can be a part. We are thus narrowed into t*ee, and, we arrive at the word tree, as the sole possible reading. We thus gain another letter, r, represented by L, with the words the tree in juxtaposition.
(c) More from Legrand:
Looking beyond these words, we again see the combination ARC, and employ it by way of termination to what immediately precedes. We have thus this arrangement: the tree thrTQXh the. Now, if, in place of the unknown characters, we substitute astersks, we read: the tree thr***h the. when the word through makes itself evident at once. But this discovery gives us three new letters, o, u and g, represented by T, Q and X.
(d) Still more, if you need it:
Referring to the beginning of the cryptograph, we find the combination, BXTTK. Translating, as before, we obtain *goo*, which assures us that the first letter is a, and that the first two words are a good.
(e) Legrand also finds words like degree and thirteen before completing the decryption.
(5.3.2) Edgar Allan Poe’s cipher was fictional, but one substitution cipher played a role in the tragic episode of the Zodiac killer in San Francisco during the 1960s and 1970s. A coded message was delivered to regional newspapers on July 31, 1969, together with a covering letter demanding front page coverage and giving details of killings that had not been divulged by the police. The message was decoded by Donald and Bettye Harden. They looked for and found that the letter group TBCCBHS translated to killing in the murderer’s code. (An additional hint: in the code the letter A occurs most often.) You can get a glimpse into a serial killer’s mind by decrypting his message (which includes misspellings and omitted words.) Although not enough evidence identified the main suspect, Arthur Leigh Allen, he was later jailed for another crime, child molestation, and the killings stopped. Here, again with letters substituted for other symbols, is the murderer’s frightening message for you to decipher:
5.4 The de Vigen`ere cipher
First developed by Giovan Bellasco in 1553, a modification of the Caesar cipher was popularized by Blaise de Vigen`ere (that family name pronounced something like vidz-un-are’), another 16th century cryptologist, and de Vigen`ere’s name has come to be associated with it. This method was long regarded as le chiffre indechiffrable – the indecipherable code. It is based on a set of Caesar shifts and a code word or\ phrase known only to the sender and receiver.
Consider all of the possible letter-shifts written out in an array, with the plaintext letters across the top. (Note that the very first line of code letters represents a zero-shift.)
An example will show how the code works. Suppose the key word is FIVE and you want to send the message bringfriends over an open line. To encode b, you trace from b down to the line that begins with F, the first letter in FIVE. To encode r, you trace down below r to the line beginning with I, the second letter in FIVE. To encode i, trace down below i to the line beginning with V. To encode n, trace down below n to the line beginning with E. Then you repeat the process, to encode g, trace down below g to the line beginning with F. If you continue in this way, your coded message will be: GZDRLNMMJVYW.
Now think about that large array. It would be useful for work with any accompanying key word, but in our case with the word FIVE, we needed only the lines beginning with those letters. Thus we can use the far simpler array:
With this reduced array, you can encode that phrase much easier. And you can reverse the process to decode the message. This coding process is incorporated in Panel 5.4.1.
It should be clear that this code is much more difficult to break than any specific Caesar shift and is even more difficult to break than the kind of substitution cipher Poe and Mary Queen of Scots used, because individual letters are coded according to different rules. For some reason, however, it did not catch on for several centuries. If Mary and her co-conspirators had used it, she might not have suffered the fate she did.
Exercises 5.4
(5.4.1) Using Panel 5.4.1 with the keyword AID.
(a) Code the phrase need rescue fast. (Note: you can use spaces in phrases to be coded, but not in keyword phrases.)
(b) You receive the following message coded using the same keyword: HMOP QV OV WHM ZAG. Decode the message using Panel 5.4.2.
(5.4.2) In 2010, CatherineWright, curator at the Museum of the Confederacy in Richmond, Virginia, opened a tiny vial disclosing a message written in a de Vigen`ere cipher on July 4, 1863. The message was in response to a request for assistance by General Pemberton, who was defending Vicksburg from the army of Union General U. S.
Grant. The message key is MANCHESTERBLUFF. Here are the message words (with some of the transcription errors corrected) printed under the de Vigen`ere key. I have included the first few letters of the decoded message in lower case letters. The rest of this interesting message was not sent, because the situation changed dramatically on that historically important day: General Pemberton surrendered.77
SEAN WIEUIIUZH DTG CNP LBHXGK OZ BJQB FEQT XZBW JJOY TK FHR TPZWK PVU
genl pemb…
RYSQ VOUPZXGG OEPH CK UASFKIPW PLVO JIZ HMN NVAEUD XYF DURJ BOVPA SF
MLV FYYRDE LVPL MFYSIN XY FQEO NPK M OBPC FYXJFHOHT AS ETOV B OCAJDSVQU
M ZTZV TPHY DAU FQTI UTTJ J DOGOAIA FPWHTXTI QLTR SEA LVLFLXFO
(a) Because you would almost certainly make typing errors entering that message, I will just give you the experience of seeing the result. You need only run Panel 5.4.3.
(b) Are there typos in the message?
5.5 Modular arithmetic
Some of you may have questioned whether cryptology is really a mathematical topic, because you have not seen many numbers so far in this chapter. Well, here they are. Modular arithmetic gives us a way to translate many of the problems you have already seen into numerical form. You saw a bit of that with ASCII code, but now you will see it in ways that offer new and important coding techniques.
Today modular arithmetic is often introduced to schoolchildren as clock arithmetic. 78 Indeed, the clock does offer a way to see what is going on. Our clocks repeat every 12 hours. (Is that news to you?) Thus, if at 10 p.m. you want to know what time it is after 9 hours of sleep, you don’t get 19 p.m. but instead 7 a.m. Forget the a.m. and p.m. designations (as your watch does unless you set it to military time) and you have simply the weird result, 10 + 9 7. Using modular notation, this is represented79 as 10 + 9 7 (mod 12) or more often simply as 10 + 9 = 7 mod 12. I adopt this latter form.
You might also ask: Will my watch show the same time 43 hours from now as it will 32 hours from now?80 In effect, you are asking: Is 43 = 32 mod 12? We can answer that question by removing some 12s from each side of the equivalence.
Removing 3 12s from the left member leaves 7, and two 12s from the right member leaves 8. Thus you have reduced the question to: Does 7 = 8 mod 12? And you know the answer to that question: They are not equal.
There is, however, an even simpler way to answer that question. We have a definition of = (or ) in any modular system:
a = b mod n iff n / (a − b)
Read that statement, “a equals b mod n if and only if n divides a minus b.”
This too requires some explanation. First of all, that vertical bar, |, means “divides evenly,” that is, the division leaves no remainder. And iff is an abbreviation for “if and only if”. In other words the expressions that appear on each side of that phrase are logically equivalent: the first implies the second and the second implies the first. Thus the single line includes both a statement and its converse. We can see how this particular statement works with our example: Does 43 = 32 mod 12? To be equivalent by this definition, 12 | (43-32) or 12 | 11. Clearly 12 does not divide 11 evenly so these numbers are not equivalent.
As its name suggests, modular arithmetic, like our normal arithmetic, is concerned with addition, multiplication, exponentiation and their inverse operations.
We don’t need to worry about the details of those operations. For our purposes it is enough to say that normal arithmetic and modular arithmetic are alike in many ways, but that in modular arithmetic we are best served by keeping our numbers, x, in the range 0 x < N, when N is the modulus.82
How can you represent any number with a given modulus, N, in that range? You divide the number by N and use the remainder. The program of Panel 5.5.1 will reduce any x mod N in this way.
In the coding applications that follow, we are going to be interested most often in two applications of modular systems: (1) converting powers like be into this range, so that we have be mod N = p with 0 p < N, and (2) reducing the value of x in the modular equation ax = 1 mod y. In this kind of processing, we have to guard against our numbers getting too large.83 To address the first task, we must return to our original means of raising numbers to powers, that is, by multiplying one factor at a time: be = b · b · b… for e factors, reducing the size of the product at each step. This is accomplished by Panel 5.5.2.
To address the second task, solving ax = 1 mod y, given a and y, you can use Panel 5.5.3.
Now let’s apply math to coding. We do this very simply. We encode the 26 letters of the alphabet as the numbers 1 to 25 and 0 mod 26. We have then the following correspondence:
and, once we have translated our message into those numbers, we can deal with a 3-shift Caesar cipher with the instruction C = M + 3 mod 26. That would give the following pairing:
Be sure you see what is happening in that process: The first row of letters is represented in the second row by their position number (mod 26) in the alphabet. Then the transformation C = M + 3 mod 26 is applied to each M in the second row to obtain the Cs of the third row. Finally, those numbers are each represented in the last row by their third row position number. You should also notice two things: (1) how the modular system takes effect there, with the final four letters rounding back to early letters in the alphabet, just as did the Caesar shifts, and (2) that 26 is playing the role of the modular number 00 in this and the following translations. In fact, that equation may suggest other mathematical shifts. Why not C = 5M mod 26? We can do that as well and have the following reassignments:
We can do still more. We can even combine a multiplicative shift with a Caesar (additive) shift, with C = 5M+ 3 mod 26:
With the 5M+ 3 mod 26 code, GO FOR IT would translate to the plaintext numbering, 07 15 06 15 18 09 20, and would be encoded as 12 00 07 00 15 22 25, these numbers representing the letters LZ GZO VY.
Decoding the simple Caesar shift is straightforward, C = M + 3 mod 26 is decoded by M = C − 3 mod 26. But multiplication becomes more involved. C = 5M mod 26 is decoded by M = 21C mod 26 and C = 5M + 3 mod 26 is decoded by M = 21C − 11 mod 26.
Those are tougher codes to decrypt, but still well within the ability of cryptographers armed with computers to make long series of trials with differing forms of the modular equation, aM + b mod 26. All of these codes are addressed in Panel 5.5.4 which allows you to encrypt and decrypt any shift of the form PX+Q.
With these tools you are now ready to examine some 21st century cryptographic procedures in the next section.
Exercises 5.5
(5.5.1) Reduce the following to numbers less than the modulus. You may need to use Panel
5.5.1 for some, but others you should be able to figure out on your own:
(a) Use C = M+ 3 mod 26 to encode the message, MEET AT SEVEN. (b) Decode the resulting coded message of (a) with M = C − 3 mod 26. (c) Use C = 5M mod 26 to encode the message, DO NOT BE LATE. (d) Use M = 21C mod 26 to decode the coded message of (c). (e) Use C = 5M+ 3 mod 26 to encode the message, I WILL BE THERE ON TIME. (f) Use M = 21C − 11 mod 26 to decode the coded message of (e).
(5.5.5) Until now all coding has been performed on letters, but the use of the ASCII code allows coding within letters. Consider the following example. You and a friend have a key, OK, that you use again and again in coding messages, just as you did with a key phrase in Vigenere ciphering. Here is how the cipher works to send the word MONDAY to your friend:
1. Write out the ASCII code for MONDAY (see page 101):
001001101 001001111 001001110 001000100 001000001 001011001
2. Below this repeatedly write out the code for OK (again on page 101) so that you have the arrangement:
001001101 001001111 001001110 001000100 001000001 001011001
001001111 001001011 001001111 001001011 001001111 001001011
3. The coded message is what you get by comparing digits in the two lines: if the digits are alike, enter 0; if unlike, enter 1. The coded message will be:
000000010 000000100 000000001 000001111 000001110 000010010.
4. To decode the message, your friend does exactly what you did and then looks up the resulting ASCII letters.
(a) Use this procedure to decode the message obtained in step 3. Compare your result with the initial ASCII message.
(b) Encode and decode the message GO.
(5.5.6) Weekdays repeat mod 7. Understanding this, in the 1880s, German mathematician Christian Zeller developed rather complicated formulas for determining the day of the week given the date. For our current (Gregorian) calendar the formula is:
Not only is the formula complicated but some of the variables in it must be calculated. The variables are D: day of the month; M: month (3 = March, 4 = April, … 13 = January, 14 = February); K: year during any century; J: century (Thus, for the year 2016, J = 20, K = 16) and W: day of the week (0 for Saturday, 1 for Sunday, . . .)
You don’t have to worry about using that formula and doing those messy calculations, however. The program of Panel 5.5.5 takes dates in standard form (1 = January, 2 = February, etc.; year in 2016 format) and applies Zeller’s formula to determine the day of the week for any date to which the Gregorian calendar applies.
(a) Check Panel 5.5.5 by using it to find the weekday for today.
(b) Determine the day of the week on which you were born.
Historically, Pope Gregory XIII announced the new calendar that would be given his name on February 24, 1582. It replaced the slightly simpler Julian calendar, named for Julius Caesar, who decreed its use in 45 BCE, just one year before he was murdered on the Ides of March. Zeller’s congruence for the Julian calendar differs only slightly from the formula he introduced for the Gregorian calendar. It is:
Panel 5.5.6 implements Zeller’s congruence for the Julian calendar.
The following table indicates when some countries instituted the calendar change:
(c) In 1752 the United States had not yet separated from Britain. Determine the day of the week for the last day for what was called the Old Style calendar and the first day of the New Style calendar for the British colonies. (It is said that riots broke out in some locations because people thought that they were losing days from their lives.)
(d) Thomas Jefferson lived at a time when the conversion was only slowly being accepted. For that reason he had both Old Style and New Style dates engraved on his tombstone. His Old Style birthday was April 2, 1743, and his New Style birthday was April 13, 1743. What were the days of the week for those two dates?
(e) What do your answers to (c) and (d) suggest about the day of the week at the change-over?
5.6 The RSA code
Coding techniques continued to be refined, and early in the 20th century machines were developed that coded messages automatically.84 Despite this, much secret communication continued to be by messages using various forms of substitution ciphers.
It seemed that secret coding had matured and only refinements could be added. But then, as so often happens in science, a remarkable breakthrough occurred. In 1978, three MIT mathematicians, Ron Rivest, Adi Shamir and Leonard Adleman, 85 published a quite simple coding procedure that could not be broken within a reasonable time even with the use of powerful computers.
There are several things that are interesting about this new procedure, which is referred to as RSA to memorialize its authors. First, it derived from the most esoteric branch of mathematics, number theory, the study of the integers. Number theorists are considered even in the mathematics community as ivory tower types who have no interest in the real world. One of them, G. H. Hardy, in his famous 1940 essay, A Mathematician’s Apology, said, “No one has discovered any warlike purpose to be served by the theory of numbers or relativity, and it seems very unlikely that anyone will do so for many years.” Indeed, his entire essay is an argument in support of pure mathematics as separate from applications. At the very time that he wrote this essay, however, number theoretic ideas, binary arithmetic in particular, were showing the way to the computing technology that pervades modern society including its wars and RSA provides one more example of the application of this field.
Second, it is reasonable to consider the question, Do we really want unbreakable codes? Clearly, some people do not. Agencies like the CIA devote substantial resources to breaking codes in order to defend against spies and terrorists. During the 1990s some cryptographers claimed that they were being harassed by the National Security Agency and the FBI, and the historic 1995 case Bernstein v. United States ultimately resulted in a decision that cryptographic programs are protected as free speech by the United States Constitution.
Finally, there is the simplicity of the idea of RSA-type code. The following example may seem complicated as I first describe it, but the procedure is easily implemented on computers. I will describe the RSA procedure in terms of two friends, one of whom sends and the other receives coded messages.
The friends, Eve and Adam, establish a way for Adam to send Eve secret messages over open communication lines. But Eve has had relationships with men go sour in the past, so she doesn’t want Adam to know her secret system. It may seem remarkable but, with these restrictions, Adam can send messages to Eve in code that even he cannot decipher. I will explore the reasons for that later.
Eve has computer access and can use programs like Mathematica or Maple to identify large prime numbers86 and to multiply them. By large, I mean numbers with hundreds of digits. Here, for example, is a prime with 100 digits identified by
Mathematica:
8543948143683640329580086824678208458410818089426611079788166431288878903122562200091848347746523
In what follows, the primes P and Q would be numbers of that size. In the example I will use to illustrate the process, I will instead use small numbers to illustrate the conepts. While it will be useful for you to follow what is going on here, you will have all of these procedures implemented on a panel.
Setting up the RSA coding procedure
1. A general public key number is published for everyone to see – for example, it might appear on the Internet. It will be useful to have this number a prime. We’ll useE = 47for this number.
2. Eve chooses two large primes, P and Q, and makes only their product,PQ = N, public as her personal code number. For example, she might publish N on her Facebook page. For our example, I’ll have Eve choose the numbers,P = 19and Q = 23, so that N = PQ = 437. Note however that she keeps P and Q secret. She might, for example, post the message, “Hi, everybody, my personal code number is 437 and I remind you that the general public key is 47.” (Remember that her real numbers would have hundreds of digits.)
3. Eve also calculates two other numbers that she will use regularly in decoding messages. The first is the number R = (P-1)(Q-1). In this example R = (19-1)(23-1) = 18×22 = 396. The second is the number D which satisfies the equation, ED = 1 mod R. For our example, 47D = 1 mod 396 leads to D = 59. (You can check that computation with Panel 5.5.3.)
All of the calculations behind those three steps are in Panel 5.6.1. All Eve had to do was chose the primes P and Q and enter them together with E to calculate R and D.
At this point everyone knows two numbers: E = 47 and N = 437. Eve knows much more; she knows that P = 19, Q = 23, R = 396 and D = 59. (In this example you could figure out those primes by factoring 247, but only because I chose such small numbers.87 You will see in the next section that large prime products are very difficult to factor.)
Sending an RSA coded message
Now Adam, or anyone else for that matter, can send Eve secret messages. I’ll keep this very simple: suppose infatuated Adam just wants to send Eve the message, X, representing a kiss. Panel 5.6.2 carries out the following procedures for him.
1. Adam first has to represent X with a number. He could use ASCII octal code (88) for X, but we’ll simply use the numerals for the letters of the alphabet on page 111. X is the 24th letter so his plaintext message isM = 24.
2. To convert M to code C, the program then evaluates the modular equation
C = ME mod N. (You can check the computation of our example,
C = 2447 mod 437, using Panel 5.5.2.) His result is C = 139.
3. Adam posts the public message, “Hi, Eve, 139, Adam.”
Receiving an RSA coded message
Eve checks her incoming messages and finds Adam’s. She uses Panel 5.6.3 to decode the 139. The panel does the following for her:
1. The program solves the modular equation: M = CD mod N, or in this case M =13959 mod 437. (You can check this calculation with Panel 5.5.2.) Her result isM = 24.
2. Eve knows that this represents X as the 24th letter in the alphabet, but she wonders if Adam is trying to tell her something about algebra. Notice that you need only three programs to establish and use the RSA encryption decryption
process: Panel 5.6.1 to establish the public values of N and E and the secret value of D; Panel 5.6.2 to encode a message; and Panel 5.6.3 to decode a message. Using those programs, you can send and receive messages similar to those you sent and received with the Vigenere programs.
We have, of course, only sent and received extremely simple messages and we have done so with a restricted version of the RSA system; however, those three programs represent the way the same process is used with computers that can access and process much larger numbers and longer messages.
Exercises 5.6
Note: These exercises are only illustrative. The actual coding process is significantly more complicated.
(5.6.1) Using the values P=17, Q=37 and E=43:
(a) Using Panel 5.6.1 determine the corresponding values of N and D.
(b) Using Panel 5.6.2 encode the message READY TO BUY. (Record these values to use in (c).)
(c) Using Panel 5.6.3, decode your result in (b). Does it give you the message that was sent?
(5.6.2) Using the values P=89, Q=97 and E=71:
(a) Using Panel 5.6.1 determine the corresponding values of N and D.
(b) Using Panel 5.6.2 encode the message DOWNSTAIRS.
(c) Using Panel 5.6.3, decode your result in (b). Does it give you the correct text?
(5.6.3) You may have noticed that the letter A, the first letter in the alphabet, is 1 in both plain text and code. By examining step 2 in the sending process, explain why this happens. It represents a flaw in the process we have used. If you had used the ASCII octal values for letters, this would not have happened. There is a more serious flaw in the procedure I have shown, however. Using the set-up values of exercise 5.6.2, send the message AAA BBB CCC DDD.
(a) What do you notice about how each letter is translated?
(b) Your answer to (a) indicates that all the work of my examples has produced a simple substitution code like the one Legrand used in Poe’s story or that appear in newspapers as word puzzles. Code breakers would just have to collect a number of messages that Eve received in order to break her code. One way Eve might avoid this would be to remain unpopular so she receives few messages, but obviously coders use other means to avoid this. The website www.di-mgt.com.au/rsa alg.html#simpleexample discusses ways to avoid this problem. Do not get bogged down with the vocabulary of this website. Instead, just respond to the following question: Is using the RSA coding procedure as simple as I have made it out to be in this section—and the initial examples in the website do as well?
Do not, because of your answer to (b), throw out the baby with the bath water. RSA coding represents a paradigm shift, a game change in the world of coding, with research ongoing.
5.7 Why RSA Coding is such a breakthrough
Suppose that the cleaning lady gives p and q by mistake to the garbage collector, but that the product pq is saved.
How to recover p and q? It must be felt as a defeat for mathematics that the most promising approaches are searching the garbage dump and applying memo-hypnotic techniques.
– Hendrik W. Lenstra Jr.
There are two major reasons why RSA coding represents serious problems to code breakers. First, you should understand that, even with Eve’s value of N that Adam uses in coding his message, X, that he sent over an open line as C = 139, carries no meaning to anyone but Eve. In fact, if Adam forgot the message he sent, he could not then retrieve it from its coded form! The reason is that it is not possible to work backward to solve a modular equation the way it usually is with an algebraic equation. Consider, for example, the equation 4x+1 = 29. It is quite straightforward to find the single value for x that satisfies that equation, x = 7. But now consider the modular equation 4x+ 1 = 29 mod 9. That equation also has the solution x = 7, but an infinite number of other values will also satisfy it: x = 16, x = 25, x = 34, and so on.
Knowing this, simply trying different values of D for the equation M = 139D mod 437 to determine M doesn’t work, because you get many possibilities for each calculation.
But you could find the value of D by factoring N to give you P and Q. That would be easy for the values of N I used in my examples. I used those particular values for two reasons: (1) to keep the calculations clear, and (2) to keep calculating time reasonable. Large powers were involved and I wanted to avoid slow processing. You don’t want to sit for 20 minutes waiting to get an answer to a homework question.
I will try to show you why finding P and Q, knowing N, when N is a very large number is a problem for code-breakers. Stated another way, we want to find out why simply factoring N should serve as such a deterrent to them. To explore this question, I introduce Panel 5.7.1, which simply does what it says. It factors positive integers into their prime factors. This panel is also useful for identifying primes, when no smaller prime factors are found.
Since Panel 5.7.1 reports calculating time, I will use it to show you that factoring large numbers takes much computing time. Consider for this purpose the two largest primes less than 100, then 1000, 10,000 and 100,000. Then form the product of each pair and use the panel to time how long it takes to find those factors. Here is a table giving this information from my own experimentation. You should check the first few results. Your results should be within a few seconds of mine, depending on the computing device supporting your book. (Unless you have time to spare, forget checking the last: 2323 seconds is 38 minutes and 43 seconds.88)
1. As the number of digits in the prime factors increases by 1, the time is multiplied by about 10. 2. For just 5-digit factors we have already reached over 38 minutes.
Our 10-digit calculators cannot continue this table, but you should see that things would increase dramatically if it could be extended with this same increasing tenfold factor. The relationship between the number of digits in the prime factors, n, and the time in seconds, t, is approximately t = 2.3 · 10n−2. Using this equation to calculate larger factors, here are some of the approximate times that would result:What that means is that, if a computer with the power of yours could handle 24-digit numbers, it would take over 700 years to find the prime factors of the number 999,999,999,950,000,000,000,429. A reasonable response to this is: Yes, but my computing device is small and this is a rather simple program. Big computers can calculate thousands of times, perhaps even millions of times, as fast. If your computer is 1,000 times as fast, it would still take 250 days to factor that number, 1,000,000 times as fast, still 6 hours. And indeed you could devise a more efficient program. Don’t miss the point here, however. Factoring with many-digit numbers takes a significant amount of time even with the most powerful computers. With a computer program like Mathematica you can multiply two 100-digit prime numbers in an instant, but it would take a great deal of time to factor the resulting 200-digit product even with the finest current technology.
Exercises 5.7
(5.7.1) Here are some other modular situations that show different possible answers:
(a) Suppose it is 10 a.m. and someone asks you in how many hours will it be 4 o’clock? You could answer 18 hours and be correct, but what are some other correct answers? This corresponds to solving the equation 10 + x = 4 mod 12.
(b) You know that x = 0 is a solution of 3x = 0 mod 12. By testing even numbers from x = 2 to x = 24, find some additional solutions.
(5.7.2) Okay, you understand that, with large values of P and Q, it would be hard to recover their values from N, but we have been forced to use small N. Let’s see how you can work backwards from knowing N = 391 and E = 15 to find D:
(a) With Panel 5.7.1 find the prime factors of N.
(b) Use those values of P and Q to calculate R = (P − 1)(Q − 1).
(c) Now knowing E and R, you can find D from the modular equation
ED = 1 mod R. When you have found D, you have broken the code and can translate any coded message sent to the receiver who used those numbers.
(5.7.3) Suppose technology advances to the point at which computer programs factor large numbers 100,000,000,000,000 (that’s 100 trillion) times as fast as your little calculator. You now use this technology to attempt to factor a 200-digit composite number into two approximately 100-digit primes.
(a) Extend the table at the end of Section 5.7 to 200 factor digits to determine the time it would take with your calculator to carry out this task. Powers of 10 can help you represent this. For 10 digits you have 7 · 100 years, for 11 digits 7 · 101 years and so on. You want to find the number of years that corresponds to 200 digits.
(You should not have to work out each year to find this value.)
(b) Divide that time by 100,000,000,000,000 = 1014 to get an idea of how many years it would take with your assumed more modern technology.
(c) Does your answer to (b) suggest that cryptanalysts should seek an entirely different method for breaking RSA-type codes?
(5.7.4) In 1965, Gordon Moore, president of the transistor manufacturer Intel, predicted that computers would double in speed every two years.90
(a) Doubling every two years means multiplying by what factor every ten years?
(b) Modern computers operate very fast. In fact the acronym MIPS, for million instructions per second, is used to keep track of how fast they operate. Here is a table of the approximate speed of the fastest computers over the past 40 years:
Figure – 5.9:
Have computers lived up to Moore’s projection?
(c) What is a simple way that RSA coders can respond to this increase in computer speed?
References
63Many topics in this chapter are extensions of those in Simon Singh’s excellent book with the same title as this chapter.
64The Science of Secrecy, pp. x-xi.
65There is a technical difference between codes and ciphers. Codes involve whole words or even phrases: interesting examples are found in newspaper personal ads where you find acronyms like LTR for long-term relationship, ND for non-drinker, GSOH for good sense of humor, S single, and ISO for in search of. Ciphers, on the other hand, involve the encryption of single letters: thus, if each letter is represented by its numerical order – a = 01, b = 02, etc. – “single” would become “190914071205”. In common usage this distinction is not made, and it is not made here. I will be talking almost exclusively about ciphers but using the words “code” and “cipher” as synonyms.
66We had an interesting experience with misrepresentation of a flag message on our navy ship. As commander of a small squadron of vessels, my captain asked our signal petty officer to hoist a message telling the other ships to sail at 5 knots – just over 5 miles per hour. The signal was hoisted: “Make speed five.” Unfortunately he should have signaled “Make speed zero five.” His message instead told the other ships to make what is called flank speed: go as fast as you possibly can. The other ships were almost to the horizon before the error was corrected.
69Pig Latin has nothing to do with Latin. It is a childish encoding of words by moving the beginning consonant sound of a word to the end of the word and annexing the sound “ay” to this. Thus “pig” becomes “igpay” and “house” becomes “ousehay.”
70Coded messages are often also broken into two parts like “our profession Paranoia is” to further confuse cryptanalysists, who look for salutations like “Dear friend” at the beginning of messages.
78This was one of the nonsensical curriculum offerings taken up by some of the New Math ele- mentary school textbooks of the 1970s. There were no reasonable applications for the topic and all it provided was busywork. You may have met this topic yourself and soon, quite reasonably, forgot it. It is an appropriate topic now, however, because you will soon find that it has important applications.
79the sign is read “is equivalent to” or “is congruent to”.
80I cannot imagine anyone asking that question, but it does provide me an opportunity to introduce more ideas about modular systems.
85Five years earlier, British mathematician Clifford Cocks and his colleague, James Ellis, had discovered an equivalent system but, because they worked for British Intelligence, their discovery was not revealed until 1998 and was not known to the MIT team. In fact all of this work was based on still earlier discoveries. Before 1980 the team of Whitfield Diffie, Martin Hellman and Ralph Merkel had devised the idea of a cipher that did not require the transfer of a secret key.
86Recall that prime numbers are positive integers with exactly two different factors, themselves and one. Thus 5 is prime, because its factors are 1 and 5; 6 is not prime because it has factors 1,2,3 and 6; 1 is not prime because it has only one factor, itself; and 0 is not prime because it has an infinite number of divisors.
