Can someone explain how the modulus works?

Question

I think I have understood everything else. I have not included the rest of the code, but the task is basically to go through all the letters and shift each one by n places. For example, 'B' becomes 'G' if n is 5.

As I understand it, the code uses ASCII table values, so the line below becomes: (65 + ((66 - 65 + 5)) % 26)

character = (char)('a' + ((character - 'a' + n)) % 26);

What I don't understand is how the modulus % 26 "resets" the result so that the alphabet starts over again. If someone could explain this in a simple way, I would be grateful.

Acorn · Accepted Answer

The C % operator yields the remainder of integer division.

For instance:

24 % 26 == 24
25 % 26 == 25
26 % 26 == 0
27 % 26 == 1

And so on.
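To make "remainder of integer division" concrete, here is a minimal sketch that computes the remainder by hand. The function name remainder_by_hand is made up for this illustration, and it assumes a non-negative a and a positive b.

```c
#include <assert.h>

/* Hypothetical helper (not from the question): computes the remainder
   "by hand" to show what a % b means. Assumes a >= 0 and b > 0. */
int remainder_by_hand(int a, int b)
{
    assert(a >= 0 && b > 0);
    int quotient = a / b;        /* integer division discards the fraction */
    return a - quotient * b;     /* what is left over; same value as a % b */
}
```

With b set to 26, the values 24, 25, 26, 27 give 24, 25, 0, 1, matching the list above.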

Therefore, your example:

65 + (66 - 65 + 5) % 26 == 65 + 6 % 26 == 65 + 6 == 71

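Here the remainder changes nothing, because 6 is already less than 26. It matters once the shifted value reaches 26 or more: for 'Y' (code 89), 65 + (89 - 65 + 5) % 26 == 65 + 29 % 26 == 65 + 3 == 68, which is 'D'.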

Steve Summit · Answer

It's a hallmark of the modulus operator that it "wraps around". One way of thinking about it involves what's sometimes called "clock arithmetic".

Let's look at some integers (first row) and their remainders modulo three (second row). You can clearly see that although the numbers keep getting bigger and bigger, the remainders never get bigger than 2; they keep wrapping around: 0, 1, 2, 0, 1, 2.

1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
1  2  0  1  2  0  1  2  0  1  2  0  1  2  0  1  2  0  1  2  0  1  2  0  1

And obviously this works the same for numbers modulo 26.
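The two rows above can be regenerated with a short loop. This is only a sketch; fill_remainders and its parameters are names invented for this illustration, and it assumes non-negative inputs and a positive modulus m.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores first % m, (first + 1) % m, ... in
   out[0..count-1]. Assumes first >= 0 and m > 0. */
void fill_remainders(int first, int m, int *out, size_t count)
{
    assert(out != NULL && first >= 0 && m > 0);
    for (size_t i = 0; i < count; i++) {
        out[i] = (first + (int)i) % m;   /* never reaches m, so it wraps */
    }
}
```

Calling it with first = 1 and m = 3 reproduces the second row of the table, and with first = 24 and m = 26 it yields 24, 25, 0, 1.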

But in the basic recipe, you get modulus numbers ("remainders") that go from 0 up to N-1. So how can you use this to do "modular addition" on letters in the range [A..Z]? It's pretty easy to see how, say, M + 5 goes to R, but how do you arrange for Y + 5 to wrap around to D?

Well, if you want to map the character codes for the letters A..Z to and from the integers [0..25], one simple way is to subtract or add the character code for the letter A, which is 65 in ASCII. (Actually we don't even need to do things in terms of that "magic number", as we'll see in a minute.)

In detail: take your letter value, subtract 65 to map it to the range [0..25], add your Caesar cipher offset (5), which might take you outside the range [0..25], take it modulo 26 to wrap it back into the range [0..25] again if necessary, then finally add 65 to map it back to the range [A..Z] again. In C that's

(((c - 65) + 5) % 26) + 65

Or, graphically:

  -65    +5   %26   +65
A     0     5     5     F
.     .     .     .     .
.     .     .     25    Z
.     .     .     0     A
.     .     .     .     .
Z     25   30     4     E

This works because, in ASCII at least, the character codes for all the uppercase letters are contiguous. (But don't try this with EBCDIC, kids!)
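The four columns of the diagram can be written out as separate steps. This is a sketch under the same assumptions as the text (ASCII, uppercase input) plus one more: a non-negative offset n. The name shift_upper_ascii is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper; assumes ASCII, an uppercase letter in c,
   and a small non-negative offset n. */
char shift_upper_ascii(char c, int n)
{
    assert(c >= 65 && c <= 90);      /* 'A'..'Z' in ASCII */
    assert(n >= 0);
    int index   = c - 65;            /* 'A'..'Z' -> 0..25 */
    int shifted = index + n;         /* may now be larger than 25 */
    int wrapped = shifted % 26;      /* wrap back into 0..25 */
    return (char)(wrapped + 65);     /* 0..25 -> 'A'..'Z' */
}
```

With n = 5, 'A' goes to 'F' and 'Z' goes to 'E', as in the first and last rows of the diagram.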

If you wanted it to work for lowercase letters, you'd have to use 97, not 65.

But as I mentioned, using magic numbers like 65 and 97 is (a) a nuisance, because you have to know or look up those numbers, and (b) a sign that you don't know C as well as you could, because there's a nifty shortcut. Be lazy, let the computer do the dirty work: Since characters in C are represented by their values in the machine's character set, the value of the character "capital A" is, literally, the character constant 'A'.

So instead of writing

(((c - 65) + 5) % 26) + 65

you can just write

(((c - 'A') + 5) % 26) + 'A'

Or, for lower case:

(((c - 'a') + 5) % 26) + 'a'

Now you don't have to worry about figuring out the magic numbers, and anyone reading the code doesn't have to try to figure them out, either.
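Putting the pieces together, here is one possible sketch of a complete shift routine. The names caesar_shift and caesar_shift_string are invented for this example, not taken from the question's code. It assumes a character set where the letters are contiguous, as in ASCII. It also normalises n first, which goes beyond what the question asks: in C (since C99) the result of % takes the sign of the left operand, so a negative n would otherwise produce a negative index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: shifts a letter by n places and leaves every other
   character unchanged. Assumes contiguous letters, as in ASCII. */
char caesar_shift(char c, int n)
{
    int k = ((n % 26) + 26) % 26;    /* normalise n into 0..25, even if n < 0 */
    if (c >= 'A' && c <= 'Z')
        return (char)((c - 'A' + k) % 26 + 'A');
    if (c >= 'a' && c <= 'z')
        return (char)((c - 'a' + k) % 26 + 'a');
    return c;                        /* digits, spaces, punctuation */
}

/* Hypothetical helper: applies caesar_shift to a NUL-terminated string
   in place. */
void caesar_shift_string(char *s, int n)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        *s = caesar_shift(*s, n);
}
```

Shifting by 5 and then by -5 returns the original text, which is a quick way to check the wrap-around in both directions.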

Tags:

c

Henrik Maaland

