Simple XOR: In ruby using a test driven approach

XOR works on bits. So what is a bit? Put simply, it is a 0 or a 1. The XOR operation compares 2 bits and if they are the same it returns a 0. If they are different it returns a 1. Take a look at this this figure below to get a better idea:

0 ^ 0 = 0
0 ^ 1 = 1
1 ^ 0 = 1
1 ^ 1 = 0

I decided to arm my brain with some cryptography knowledge, and XOR is one of the first methods I am learning about. I should mention that it is noted XOR is not the best way to encipher your data. Not even close actually. So please don’t use anything you read here for anything more than an experiment.

With that disclaimer out of the way I am comfortable focusing on using a test driven approach to learn.

At any point if you are stuck are not able to follow along, take a look at the source code hosted on github. I will be implementing the methods in ruby, and testing with rspec. We start by creating a file xorer.rb and adding a describe block:

describe Xorer do
end

When we run rspec we see the following error:

$ rspec xorer.rb
/Users/mattweppler/developer/projects/xorer/xorer.rb:1:in `<top (required)>': uninitialized constant Xorer (NameError)
...

This is because we have not yet defined an Xorer module. So above the describe block let’s define it:

module Xorer
end

When we run rspec this time, we see the following:

$ rspec xorer.rb
No examples found.

Finished in 0.00003 seconds
0 examples, 0 failures

It is time to add our next test, but what should it be? Basically, I want to take a key and XOR against some plain text. So given these two strings I need to start by converting them into there binary representation. We will need to take it character by character, first turning the character into a byte or character code.

So given the character a, the byte should be 97. That will be our first test.

describe Xorer do
  it 'converts a character to a byte' do
    expect(Xorer.char_to_byte('a')).to be(97)
  end
end

Running rspec now we should see the following:

$ rspec xorer.rb

Xorer
  converts a character to a byte (FAILED - 1)

Failures:

  1) Xorer converts a character to a byte
     Failure/Error: expect(Xorer.char_to_byte('a')).to be(97)
     NoMethodError:
       undefined method `char_to_byte' for Xorer:Module
     # ./xorer.rb:6:in `block (2 levels) in <top (required)>'

Finished in 0.00036 seconds
1 example, 1 failure

Failed examples:

rspec ./xorer.rb:5 # Xorer converts a character to a byte

The key to this error is the undefined method 'char_to_byte' for Xorer:Module. Which points out the obvious, we are missing a method named char_to_byte in our Xorer module. So let’s add this:

module Xorer
  def self.char_to_byte(char)
  end
end

When we run rspec this time, it fails with:

$ rspec xorer.rb

Xorer
  converts a character to a byte (FAILED - 1)

Failures:

  1) Xorer converts a character to a byte
     Failure/Error: expect(Xorer.char_to_byte('a')).to be(97)

       expected #<Fixnum:195> => 97
            got #<NilClass:8> => nil

       Compared using equal?, which compares object identity,
       but expected and actual are not the same object. Use
       `expect(actual).to eq(expected)` if you don't care about
       object identity in this example.
     # ./xorer.rb:8:in `block (2 levels) in <top (required)>'

Finished in 0.00125 seconds
1 example, 1 failure

Failed examples:

rspec ./xorer.rb:7 # Xorer converts a character to a byte

Basically, that it was expecting the char_to_byte method to return 97 but instead nothing or nil was returned. So let’s fulfill our expectation and return 97.

module Xorer
  ...
  def self.char_to_byte(char)
    97
  end
  ...
end

…and when we run our test, it passes!

$ rspec xorer.rb

Xorer
  converts a character to a byte

Finished in 0.00071 seconds
1 example, 0 failures

The problem is that while it passes this test, what if we passed the character b to the char_to_byte method? Let’s change our test to the following:

describe Xorer do
  ...
  it 'converts a character to a byte' do
    expect(Xorer.char_to_byte('b')).to be(98)
  end
  ...
end

…and when we run our test, it passes! Oops… no, actually it doesn’t. It fails with:

$ rspec xorer.rb

Xorer
  converts a character to a byte (FAILED - 1)

Failures:

  1) Xorer converts a character to a byte
     Failure/Error: expect(Xorer.char_to_byte('b')).to be(98)

       expected #<Fixnum:197> => 98
            got #<Fixnum:195> => 97

       Compared using equal?, which compares object identity,
       but expected and actual are not the same object. Use
       `expect(actual).to eq(expected)` if you don't care about
       object identity in this example.
     # ./xorer.rb:9:in `block (2 levels) in <top (required)>'

Finished in 0.00079 seconds
1 example, 1 failure

Failed examples:

rspec ./xorer.rb:8 # Xorer converts a character to a byte

So why did I waste your time? Well, I didn’t really waste your time. This process of TDD begins with taking a small idea and asking yourself: Do I have a test for that? If the answer is No, well the next step is to Write a test. Once you have your test, the next question you ask is: Does the test pass? If the answer is No, well the next step is to Write just enough code for the test to pass.

Then you ask again: Does the test pass? At the time when our code_to_byte method was returning 97 and our test was expecting a 97, we could answer yes to that question. We had written just enough code for the test to pass. So the next question would be: Do you need to refactor? Well since we know that hard coding 97 as a return value is not really changing the character to a byte, our answer is Yes.

So our next step is to Refactor the code. Let’s change our test back to pass a, and expect the result to be 97. Then we change the char_to_byte method to the following:

module Xorer
  ...
  def def self.char_to_byte(char)
    char.ord
  end
  ...
end

We ask ourself Does the test pass? and to answer that question we run rspec and get the following:

$ rspec xorer.rb

Xorer
  converts a character to a byte

Finished in 0.00071 seconds
1 example, 0 failures

Great! It passes again. So what is the ord method we’re calling on char? From the ruby docs we see that ord returns the Integer ordinal of a one-character string. Which is exactly what we’re trying to do. Or is it? Well, kind of, but not exactly.

We know that a bit is a 0 or a 1. An On or Off. One of two possible values. A byte is 2 to the power of 8. Which gives us a possible 256 values, 0 to 255 (since computers start counting at 0). 97 is the ascii value of the character a and it does fall into this range.

The problem arrises if the plain text has unicode values. For instance, given '六'.ord (which is Chinese for “Six”) will return the value: 20845, which is two bytes long. To keep things simple I am assuming that we will only be dealing with English characters.

So we ask ourself again Do you need to refactor? I am happy with our implementation of this feature so the answer is No. Moving on it is now time to Select a new feature to implement.

We now need to turn the byte into a binary value to work so that we can XOR it. So let’s add the following:

describe Xorer do
  ...
  it 'converts a byte to a binary string' do
    expect(Xorer.byte_to_binary_string(97)).to eql('1100001')
  end
  ...
end

If you cannot count in binary you can take my word that the binary representation of 97 is 1100001. Since we’re learning an explanation would be better.

So computers only work with 0’s and 1’s and 97 is a byte (value between 0 and 255 or 2 to the power of 8). Take a look at the figure below:

--------------------------------------
| 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
--------------------------------------
|     |  1 |  1 |  0 | 0 | 0 | 0 | 1 |
--------------------------------------

A bit is considered On if it has a value of 1 and Off if it has a value of 0. If it is Off we do not count it. With all the bits Off (00000000) the value is 0. With all the bits On (11111111) the value is 255.

1 plus 2 equals 3.
3 plus 4 equals 7.
7 plus 8 equals 15.
15 plus 16 equals 31.
31 plus 32 equals 63.
63 plus 64 equals 127.
127 plus 128 equals 255.

In the case of ‘1100001’ we only have 7 digits so the left most bit would be a 0. So we have a 1 in the 1 column, a 1 in the 32 column, and a 1 in the 64 column.

1 plus 32 equals 33.
33 plus 64 equals 97.

With that out of the way let’s run our test:

$ rspec xorer.rb

Xorer
  converts a character to a byte
  converts a byte to a binary string (FAILED - 1)

Failures:

  1) Xorer converts a byte to a binary string
     Failure/Error: expect(Xorer.byte_to_binary_string(97)).to eql('1100001')
     NoMethodError:
       undefined method `byte_to_binary_string' for Xorer:Module
     # ./xorer.rb:13:in `block (2 levels) in <top (required)>'

Finished in 0.00084 seconds
2 examples, 1 failure

Failed examples:

rspec ./xorer.rb:12 # Xorer converts a byte to a binary string

We see that we have not yet defined the byte_to_binary_string method. So in our Xorer module let’s add that:

module Xorer
  ...
  def self.byte_to_binary_string(byte)
    byte.to_s 2
  end
  ...
end

The Fixnum class has a to_s method that takes a radix base between 2 and 36. If we run our test now its passing:

$ rspec xorer.rb

Xorer
  converts a character to a byte
  converts a byte to a binary string

Finished in 0.00139 seconds
2 examples, 0 failures

We don’t need to refactor this so let’s work on our XOR feature. We need a test for it. I’m going to pop the training wheels off and step it up a notch. Assume that after adding a test I run rspec and based on the error message, add the least amount of code to make it pass.

Select a new feature to implement: xor

Do you have a test for that?

describe Xorer do
  ...
  it 'xors two binary values' do
    expect(Xorer.xor(0, 0)).to be(0)
  end
end

Does the test pass?

$ rspec xorer.rb

Xorer
  converts a character to a byte
  converts a byte to a binary string
  xors two binary values (FAILED - 1)

Failures:

  1) Xorer xors two binary values
     Failure/Error: expect(Xorer.xor(0, 0)).to be(0)
     NoMethodError:
       undefined method `xor' for Xorer:Module
     # ./xorer.rb:21:in `block (2 levels) in <top (required)>'

Finished in 0.00124 seconds
3 examples, 1 failure

Failed examples:

rspec ./xorer.rb:20 # Xorer xors two binary values

Write just enough code for the test to pass

module Xorer
  ...
  def self.xor(val_1, val_2)
    val_1 ^ val_2
  end
end

Does the test pass?

$ rspec xorer.rb

Xorer
  converts a character to a byte
  converts a byte to a binary string
  xors two binary values

Finished in 0.00113 seconds
3 examples, 0 failures

Do you need to refactor? Based on the requirements of this test No, we don’t need to refactor. So let’s work on our next feature.

Our xor method works fine when given binary values like 0 in this case. What happens if we pass in binary strings like ‘1100001’ & ‘1100010’? Let’s write another test for that, but first what is the value we’re expecting supposed to be? Let’s actually XOR this by hand:

1 ^ 1 = 0
1 ^ 1 = 0
0 ^ 0 = 0
0 ^ 0 = 0
0 ^ 0 = 0
0 ^ 1 = 1
1 ^ 0 = 1

So we’re expecting the value: ‘0000011’.

describe Xorer do
  ...
  it 'xors two binary strings' do
    expect(Xorer.xor('1100001', '1100010')).to eql('0000011')
  end
  ...
end

Does the test pass?

$ rspec xorer.rb

Xorer
  converts a character to a byte
  converts a byte to a binary string
  xors two binary values
  xors two binary strings (FAILED - 1)

Failures:

  1) Xorer xors two binary strings
     Failure/Error: val_1 ^ val_2
     NoMethodError:
       undefined method `^' for "1100001":String
     # ./xorer.rb:11:in `xor'
     # ./xorer.rb:29:in `block (2 levels) in <top (required)>'

Finished in 0.00127 seconds
4 examples, 1 failure

Failed examples:

rspec ./xorer.rb:28 # Xorer xors two binary strings

Write just enough code for the test to pass

module Xorer
  ...
  def self.xor(val_1, val_2)
    val_1 = val_1.to_s.scan(/./).map { |i| i.to_i }
    val_2 = val_2.to_s.scan(/./).map { |i| i.to_i }
    xord = []
    0.upto(val_1.size - 1) do |i|
      xord << (val_1[i] ^ val_2[i])
    end
    xord.join('')
  end
  ...
end

Ok, so that method just got a bit more complicated… So what are we doing here? First we need to handle either a string or a number. To do that we can just turn the values into strings with the to_s method. Next we are calling the scan method on the string and passing in the argument /./. This is a regular expression. Regular expressions are outside of what I want to cover, but in this case take a deep breath and accept this explanation.

The /’s use enclose the regular expression the way '’s enclose a string. So the heart of this particular regular expression is the ., which says match any character.

The scan method iterates through the string, matching the given pattern. For each match, a result is generated and either added to the resulting array.

Next we call the map method on the array. map takes a block and transforms each value in the array. In this case we are taking each digit in the string and calling the to_i method on it. This allows us to do an ^ (XOR) on the values.

So now we create an array to hold the XOR results. We run a loop from 0 to val_1’s size, minus 1, using i as the index. Finally we join the xord array and return the resulting string.

Does the test pass?

$ rspec xorer.rb

Xorer
  converts a character to a byte
  converts a byte to a binary string
  xors two binary values (FAILED - 1)
  xors two binary strings

Failures:

  1) Xorer xors two binary values
     Failure/Error: expect(Xorer.xor(0, 0)).to be(0)

       expected #<Fixnum:1> => 0
            got #<String:70312811550080> => "0"

       Compared using equal?, which compares object identity,
       but expected and actual are not the same object. Use
       `expect(actual).to eq(expected)` if you don't care about
       object identity in this example.
     # ./xorer.rb:31:in `block (2 levels) in <top (required)>'

Finished in 0.00138 seconds
4 examples, 1 failure

Failed examples:

rspec ./xorer.rb:30 # Xorer xors two binary values

Unfortunately when we run this our previous test fails. This is because we are now returning a string instead of an integer. Since we changed the api we need to update the failing test. So update the failing test to:

describe Xorer do
  ...
  it 'xors two binary values' do
    expect(Xorer.xor(0, 0)).to eql('0')
  end
  ...
end

Does the test pass?

$ rspec xorer.rb

Xorer
  converts a character to a byte
  converts a byte to a binary string
  xors two binary values
  xors two binary strings

Finished in 0.00116 seconds
4 examples, 0 failures

Do you need to refactor? No, this looks good. So, on to the next feature. Next we have to go the other way. From a binary string to a byte and from a byte to a char. Let’s start with converting a binary string into a byte.

Do you have a test for that?

describe Xorer do
  ...
  it 'converts a binary string to a byte' do
    expect(Xorer.binary_string_to_byte('1100001')).to eql(97)
  end
  ...
end

Does the test pass?

$ rspec xorer.rb

Xorer
  converts a character to a byte
  converts a byte to a binary string
  xors two binary values
  xors two binary strings
  converts a binary string to a byte (FAILED - 1)

Failures:

  1) Xorer converts a binary string to a byte
     Failure/Error: expect(Xorer.binary_string_to_byte('1100001')).to eql(97)
     NoMethodError:
       undefined method `binary_string_to_byte' for Xorer:Module
     # ./xorer.rb:39:in `block (2 levels) in <top (required)>'

Finished in 0.00146 seconds
5 examples, 1 failure

Failed examples:

rspec ./xorer.rb:38 # Xorer converts a binary string to a byte

Write just enough code for the test to pass

module Xorer
  ...
  def self.binary_string_to_byte(binary_string)
    binary_array = binary_string.reverse.scan(/./).map { |i| i.to_i }
    count = 1
    sum = binary_array[0] * count
    1.upto(binary_array.size - 1) do |i|
      count *= 2
      sum += binary_array[i] * count
    end
    sum
  end
  ...
end

So what is going on here? We have a binary string and start by calling reverse on it. So the name implies, it reverses the characters in the string. More on why we’re doing that in a moment. Next we do the same scan & map that we used before to create an array of binary numbers.

Next we create a count variable and set it to 1. Our algorithm is simple, we multiply the bit by count and add that value to sum. Then we multiply the value of count by 2 and that becomes the new value of count.

Since we’re dealing with our binary string in reverse this makes perfect sense. count starts at 1, then becomes 2, then 4, and so on…

Finally we return the sum, which is now our byte or character integer.

Does the test pass?

$ rspec xorer.rb

Xorer
  converts a character to a byte
  converts a byte to a binary string
  xors two binary values
  xors two binary strings
  converts a binary string to a byte

Finished in 0.0013 seconds
5 examples, 0 failures

Do you need to refactor? No, our test is passing and this works fine for now.

Select a new feature to implement: convert a byte to a char

Do you have a test for that?

describe Xorer do
  ...
  it 'converts a byte to a character' do
    expect(Xorer.byte_to_char(97)).to eql('a')
  end
  ...
end

Does the test pass?

$ rspec xorer.rb

Xorer
  converts a character to a byte
  converts a byte to a binary string
  xors two binary values
  xors two binary strings
  converts a binary string to a byte
  converts a byte to a character (FAILED - 1)

Failures:

  1) Xorer converts a byte to a character
     Failure/Error: expect(Xorer.byte_to_char(97)).to eql('a')
     NoMethodError:
       undefined method `byte_to_char' for Xorer:Module
     # ./xorer.rb:55:in `block (2 levels) in <top (required)>'

Finished in 0.00134 seconds
6 examples, 1 failure

Failed examples:

rspec ./xorer.rb:54 # Xorer converts a byte to a character

Write just enough code for the test to pass

module Xorer
  ...
  def self.byte_to_char(byte)
    byte.chr
  end
  ...
end

Here we’re using the chr method which returns a string containing the character representation.

Does the test pass?

$ rspec xorer.rb

Xorer
  converts a character to a byte
  converts a byte to a binary string
  xors two binary values
  xors two binary strings
  converts a binary string to a byte
  converts a byte to a character

Finished in 0.00146 seconds
6 examples, 0 failures

Do you need to refactor? No, this is actually a good stopping point for now.

In my next post I will add the ability to read in a ‘key’ (string of characters) and a file containing ‘plain text’ from the command line. We will XOR the ‘key’ against the ‘plain text’ and output the resulting ‘cipher text’ to a file. We will run into some problems along the way and add some helper methods. Then we will look at pack and unpack as potential ways of working with the data.

Hope this was somewhat helpful, and if you have any comments or feedback please let me know. You can view the source code hosted on github.

See part 2 of our simple xorer.