Custom Source Control
I, like a lot of developers these days, really enjoy working with Git. It just makes sense to me. However, not every shop has bought in. Some use Subversion, Perforce, Mercurial, and/or Fossil to name a few. That is just to mention a few, as there are even more that I have not worked with. Git though, I have used enough to appreciate. While working with other SCM’s I’ve faced issues where I find myself wishing for one (or many) of Git’s features.
Working with Perforce the other day, I started editing a file. It was an unfamiliar codebase and I just wanted to get an idea of the flow of the application. When I hit save however I get a ‘readonly’ message. So I could do p4 open
, or ignore the message and save anyway. Now, I may have to do this several times and its more than likely that none of the edits I made have saved to code. This is quite a hindrance. So I continue ignoring the readonly message and keep saving files and deleting the changes.
The thing I like about Git is that it stays out of your way for the most part, until it’s time to commit. So I can edit to my heart’s content and do a git status
to see if I left anything in the code that shouldn’t be there, and quickly.
That being said, I wanted to build a little tool to abstract some of the differences of other SCM’s away. Things like a git status
amongst others. The problem however, is that I have a hard time finding the time to build such a tool. Also, I haven’t really figured out all of the functionality I would like this tool to have.
Then I came across a video “Source Control Made Easy” by Jim Weirich, a man well known in the Ruby community, who recently passed away. I liked his teaching style and feel I’ve learned a lot from his talks. One of my personal favorites is on testing called “Roman Numerals Kata”. I didn’t know him Jim, but it seems like he would have been a fun person to be friends with.
“Source Control Made Easy”, is kind of a talk about Git, but not directly. Or at least it doesn’t seem that way at first. The following is part of the description for this video:
“In this 49-minute screencast, Jim Weirich takes you on a journey of how you might design and build a source control system from scratch. Along the way you’ll gain a deeper understanding of the first principles behind systems like Git, so things begin to make more sense.”
I highly recommend this video to anyone interested in learning about not only Git, but understanding the principles of source control in general. Also, note that the great people over at the Pragmatic Programmers are donating 100% of the purchase price to Jim’s family.
So, as a tribute to Jim Weirich, I decided to take a shot at implementing the source control system he talks about in ruby and to incorporate some test driven development as well.
Initial Stories:
I’ll start by writing out some quick user stories.
Initialize a new repository
As a user
I want the ability to initialize a repository
So that I can begin adding snapshots to it
Create snapshots of my work
As a user
I want to create snapshots
So that I can save snapshots of my work
Checkout previous snapshots
As a user
I want to checkout a previous snapshot
So that I can fix issues or get back to a working state
With those stories in place to drive my development, I wanted to take a few minutes up front to think about or pseudo code a quick possible implementation. I don’t really have much in terms of expectations here. I just want to get some of the ideas into focus.
initialize
- calling initialize from within a directory should:
- create new hidden directory .esc
- create new sqlite database to store what?
- create HEAD file which contains hash (manifest filename) of the latest snapshot (empty initially) (maybe a db entry)
snapshot
- create metadata file (metadata) which will contain the manifest hash, snapshot author (name, email), timestamp, comments, current head (parent to this snapshot)
- create manifest file (manifest) which will contain hashes of the files in the snapshot, maps the hashes to original filenames and directories
- get a list of all files with paths in the working directory
- iterate through the list, calculate the file hash
- search the repository for a file with the calculated hash
- if found, just add the hash and the filename with pathname to the manifest file
- if not found
- check if we need to create a directory for the file (create hash directories: a..z maybe)
- copy the file to the repository directory
- add the hash and the original filename with pathname to the manifest file
- calculate the hash of the manifest file rename it from manifest to hash
- calculate the hash of the metadata file and rename it from metadata to hash update metadata file with snapshot hash
- update HEAD to point to this snapshot (metadata filename)
checkout(version number)
- check .esc for the metadata file (version number/hash)
- if metadata file is not found, fail and inform the user
- if found, get the manifest hash
- print the metadata info out to the console
- open the manifest
- scan the file line by line,
- get the actual path/filename for an entry and see if it exists in the working directory
- if it doesn’t, just copy the file changing the hash to the filename and placing it in the correct directory
- if it does, calculate the hash for the file in the working directory
- if the hash is the same, don’t do anything with that file
- if the hash is different, overwrite the existing file with the file from the repository
One more thing before we get on with the actual coding. I am trying to keep this simple. That being said, I may not adhere to any strict standards or practices. I will try to point them out as I go. This will free me up to:
- Write code as fast as possible since it’s been hard enough to find the time to write these days.
- Refactoring would be a great exercise for any reader who would like to continue this project. Ideally, I will do a follow up post where I refactor. I want to code this almost raw, and think about things as I go almost like the Roman Numeral Kata.
Initialize
…but before we actually start writing the application code, let’s get our initial testing in place. Create a directory, and call it whatever you want. I am calling mine: custom_source_control
.
mkdir custom_source_control
cd custom_source_control
Now open your favorite editor, create a new file named custom_source_control.rb
and add the following to it.
#!/usr/bin/env ruby
require 'minitest/autorun'
describe CustomSourceControl do
before do
@csc = CustomSourceControl.new
end
describe 'when a repository is initialized' do
it 'must create a new hidden directory named .esc' do
@csc.repository_exists?.must_equal true
end
end
end
So if you aren’t familiar with minitest
you should continue reading the post. If you like what you’ve seen checkout the README over on github. Basically, we just described the first thing we’d like to test. The DSL’s that people write using Ruby are great and this almost reads like English (or some weird robot form of it). Let’s just look at the words between the quotes:
‘when a repository is initialized’, ‘must create a new hidden directory named .esc’
Another thing to point out is the before
block. We can pretty much infer that the before
block will run before any of our tests. Then there is that require 'minitest/autorun'
thing at the top. That just makes the tests run when we execute the ruby file. Let’s make the ruby file executable, and execute it.
chmod u+x custom_source_control.rb
./custom_source_control.rb
(test)
./custom_source_control.rb:5:in `<main>': uninitialized constant CustomSourceControl (NameError)
Here we gave the script the ability to be executed as a command. It ran the script… and failed, kind of.
Actually, this is ruby telling us that we don’t have a constant CustomSourceControl
in our script, but were acting as if we did. CustomSourceControl
is going to be our class. So we’ll need to add it. We are going to write our actual implementation code above our tests and everything else, but below the shebang. Wait, what’s a ‘shebang’? It’s that little #!/usr/bin/env ruby at the top of our file. Remember the chmod u+x custom_source_control.rb
we just did. Well chmod u+x custom_source_control.rb
tells the operating system the file is executable, and #!/usr/bin/env ruby
tells it to use ruby to execute it.
#!/usr/bin/env ruby
class CustomSourceControl
end
require 'minitest/autorun'
...
Note the use of ...
, do not type it in the editor. This is just me saying “more text may come before or after”.
(retest)
E
Finished tests in 0.000628s, 1592.3567 tests/s, 0.0000 assertions/s.
1) Error:
CustomSourceControl::when a repository is initialized#test_0001_must create a new hidden directory named .esc:
NoMethodError: undefined method `repository_exists?' for #<CustomSourceControl:0x007fe974157038>
./custom_source_control.rb:15:in `block (3 levels) in <main>'
We now get an error (denoted by the E
) that the CustomSourceControl
class doesn’t have a repository_exists?
method. We will just add that method and retest:
class CustomSourceControl
...
def repository_exists?
end
...
end
(retest)
F
Finished tests in 0.020405s, 49.0076 tests/s, 49.0076 assertions/s.
1) Failure:
CustomSourceControl::when a repository is initialized#test_0001_must create a new hidden directory named .esc [./custom_source_control.rb:17]:
Expected: true
Actual: nil
We are getting an actual failure (denoted by the F
) now. I mean, where do we get off expecting true to be returned from the repository_exists?
method. We haven’t even implemented it, of course it’s going to return nil…
So let’s implement it.
class CustomSourceControl
...
def repository_exists?
true
end
...
end
(retest)
.
Finished tests in 0.000590s, 1694.9153 tests/s, 1694.9153 assertions/s.
…and we’re passing (denoted by the .
)!
Seriously? We passed? Yeah… and I want you to know that I realize just returning true
doesn’t mean that the repository actually exists. I know this for a few reasons, but let’s take the most straightforward way and prove this.
ls -la
ls -la
if you don’t know, or couldn’t tell, just lists all the files in a directory. We need the a
in order to see all files, including the hidden ones. On *nix files/directories that begin with a .
are hidden.
drwxr-xr-x 4 mweppler _guest 136 Mar 7 07:05 .
drwxr-xr-x 29 mweppler _guest 986 Mar 7 07:05 ..
-rwxr--r-- 1 mweppler _guest 377 Mar 7 07:04 custom_source_control.rb
Nope, no .esc
directory here… So what did that prove? The testing stuff? Why do it? We are taking a very systematic and pragmatic approach here. This testing stuff is good for a few reasons. We will see this come to light a bit later. For now though, please just accept it.
With this little bit of testing so far, we’ve really just exercised our minds as well as minitest
. As our code grows and we move on to other projects, we will likely forget all the intricacies of what we wrote. Our tests here should give us a bit of confidence, even this early on in our development cycle. Also, our code is small and easy to manually test, so we know minitest
is doing its job.
Let’s really implement the repository_exists?
method now.
class CustomSourceControl
...
def repository_exists?
Dir.exists? '.esc'
end
...
end
(retest)
F
Finished tests in 0.018134s, 55.1450 tests/s, 55.1450 assertions/s.
1) Failure:
CustomSourceControl::when a repository is initialized#test_0001_must create a new hidden directory named .esc [./custom_source_control.rb:18]:
Expected: true
Actual: false
…and were failing again :(, that rush I got from passing never quite lasts long. Keep calm and carry on. It’s good that we’re failing, we should be failing. We have yet to create our .esc
directory. Now I don’t know about you, but I want to get to passing again. I got that itch now…
class CustomSourceControl
...
def repository_exists?
Dir.mkdir '.esc'
Dir.exists? '.esc'
end
...
end
(retest)
.
Finished tests in 0.000720s, 1388.8889 tests/s, 1388.8889 assertions/s.
There is nothing crazy going on here, just calling the mkdir
(make directory) method on the Dir
class. But guess what, we’re passing again. If we check the file system, we see our newly added directory.
drwxr-xr-x 5 mweppler _guest 170 Mar 7 07:15 .
drwxr-xr-x 29 mweppler _guest 986 Mar 7 07:05 ..
drwxr-xr-x 2 mweppler _guest 68 Mar 7 07:15 .esc
-rwxr--r-- 1 mweppler _guest 412 Mar 7 07:15 custom_source_control.rb
Lets move on to our next test.
describe 'when a repository is initialized' do
...
it 'must create an empty HEAD file' do
@csc.head_exists?.must_equal true
@csc.head_contents.empty?.must_equal true
end
end
(test)
EE
Finished tests in 0.000708s, 2824.8588 tests/s, 0.0000 assertions/s.
1) Error:
CustomSourceControl::when a repository is initialized#test_0002_must create an empty HEAD file:
NoMethodError: undefined method `head_exists?' for #<CustomSourceControl:0x007fad59a170f8>
./custom_source_control.rb:23:in `block (3 levels) in <main>'
2) Error:
CustomSourceControl::when a repository is initialized#test_0001_must create a new hidden directory named .esc:
Errno::EEXIST: File exists - .esc
./custom_source_control.rb:5:in `mkdir'
./custom_source_control.rb:5:in `repository_exists?'
./custom_source_control.rb:19:in `block (3 levels) in <main>'
What? Why do we have 2 errors? We were just passing. If we look at our initial test we can see that the .esc
file already exists. We have to clean up after ourselves like good TDD citizens. First let’s manually remove the existing .esc
directory. Then let’s add some code to our tests that will clean up after each test is run.
rmdir .esc
require 'fileutils'
class CustomSourceControl
...
end
...
describe CustomSourceControl do
after do
FileUtils.rm_rf '.esc' if Dir.exists? '.esc'
end
...
end
We added the require 'fileutils'
right above our class CustomSourceControl
statement, and after our after
block inside our main describe
block above our before
block. Was that confusing? If so, you can double check your work with the project I have hosted on github.
(retest)
.E
Finished tests in 0.001229s, 1627.3393 tests/s, 813.6697 assertions/s.
1) Error:
test_0002_must create an empty HEAD file(CustomSourceControl::when a repository is initialized):
NoMethodError: undefined method `head_exists?' for #<CustomSourceControl:0x007feac2805818>
custom_source_control.rb:29:in `block (3 levels) in <main>'
Rerun our tests and this time our initial test is passing again. We can now deal with the error we are seeing. I don’t know about you, but I am finding this process very helpful. It’s an iterative approach, one that you likely do anyway, just without the testing.
So, if history is any indicator of things to come (and we read the error message), we know the next step is to create the head_exists?
method in our CustomSourceControl
class.
class CustomSourceControl
...
def head_exists?
end
...
end
(retest)
F.
Finished tests in 0.011323s, 176.6316 tests/s, 176.6316 assertions/s.
1) Failure:
test_0002_must create an empty HEAD file(CustomSourceControl::when a repository is initialized) [custom_source_control.rb:31]:
Expected: true
Actual: nil
We could take the same approach we took in our previous test and return true
to start, then test, fail, refactor, etc… This would be the right approach, but I’ll leave that for you to do on your own. Get comfortable with the process and messages.
Once you’ve run through the exercise, after a few iterations, you should have a method similar to the repository_exists?
method except instead of creating a directory, we’re going to create a file.
class CustomSourceControl
...
def head_exists?
File.new(File.join('.esc', 'HEAD'), 'w')
File.exists? File.join('.esc', 'HEAD')
end
...
end
(retest)
E.
Finished tests in 0.001215s, 1646.0905 tests/s, 823.0453 assertions/s.
1) Error:
test_0002_must create an empty HEAD file(CustomSourceControl::when a repository is initialized):
Errno::ENOENT: No such file or directory - .esc/HEAD
custom_source_control.rb:7:in `initialize'
custom_source_control.rb:7:in `new'
custom_source_control.rb:7:in `head_exists?'
custom_source_control.rb:34:in `block (3 levels) in <main>'
…but wait! Why is this you ask! Well we only create .esc
when we call repository_exists?
, and then after each test, it is removed if it exists. So .esc
doesn’t exist anymore.
Let’s think about our story for a second. What we really care about, from a high level, is repository initialization. So lets skip
this test and refactor a bit.
describe 'when a repository is initialized' do
...
it 'must create an empty HEAD file' do
skip
@csc.head_exists?.must_equal true
@csc.head_contents.empty?.must_equal true
end
end
(retest)
.S
Finished tests in 0.029118s, 68.6860 tests/s, 34.3430 assertions/s.
That S
tells us that we’re skipping a test. Update the CustomSourceControl
class with the following:
...
class CustomSourceControl
def head_exists?
File.exists? File.join('.esc', 'HEAD')
end
def initialize_repository
Dir.mkdir '.esc'
File.new(File.join('.esc', 'HEAD'), 'w')
end
def repository_exists?
Dir.exists? '.esc'
end
end
...
We’ve created an initialize_repository
method. We’ve moved both the .esc
directory creation and the HEAD
file creation out of the xxx_exists?
methods and into initialize_repository
. This all makes sense. The xxx_exists?
should only be responsible for checking that something actually exists, not creating anything. initialize_repository
on the other hand, its purpose is to handle the tasks involved in initializing the repository. One of those tasks is creation of the repository structure.
If we run this now what do you think will happen?
(retest)
FS
Finished tests in 0.008715s, 229.4894 tests/s, 114.7447 assertions/s.
1) Failure:
test_0001_must create a new hidden directory named .esc(CustomSourceControl::when a repository is initialized) [custom_source_control.rb:33]:
Expected: true
Actual: false
Well, we fail since we haven’t actually called the initialize_repository
method anywhere in our code. Is that what you guessed? So where should we call the initialize_repository
method? If you guessed ‘in our before
block’, you win.
...
before do
@csc = CustomSourceControl.new
@csc.initialize_repository
end
...
(retest)
S.
Finished tests in 0.000897s, 2229.6544 tests/s, 1114.8272 assertions/s.
Great, we’re passing again. Let’s remove the skip
statement from the test and rerun.
(retest)
E.
Finished tests in 0.002892s, 691.5629 tests/s, 691.5629 assertions/s.
1) Error:
test_0002_must create an empty HEAD file(CustomSourceControl::when a repository is initialized):
NoMethodError: undefined method `head_contents' for #<CustomSourceControl:0x007f970d009618>
custom_source_control.rb:39:in `block (3 levels) in <main>'
It looks like we’re missing that head_contents
method. We should add that.
class CustomSourceControl
...
def head_contents
File.open('.esc/HEAD', 'r') { |f| f.read }
end
...
end
(retest)
..
Finished tests in 0.001290s, 1550.3876 tests/s, 2325.5814 assertions/s.
You did it! You can now initialize a new repository! Let’s move on to our next story…
Snapshot
Create snapshots of my work
As a user
I want to create snapshots
So that I can save snapshots of my work
If you recall from our brief pseudo code, we’ve already kind of mapped out a few steps. If you don’t recall that, try again, try harder, or just reread that part above.
Let’s update our before block to include a new snapshot
method.
before do
@csc = CustomSourceControl.new
@csc.initialize_repository
@csc.snapshot
end
(retest)
EE
Finished tests in 0.002971s, 673.1740 tests/s, 0.0000 assertions/s.
1) Error:
test_0002_must create an empty HEAD file(CustomSourceControl::when a repository is initialized):
NoMethodError: undefined method `snapshot' for #<CustomSourceControl:0x007fd35a985078>
custom_source_control.rb:34:in `block (2 levels) in <main>'
2) Error:
test_0001_must create a new hidden directory named .esc(CustomSourceControl::when a repository is initialized):
NoMethodError: undefined method `snapshot' for #<CustomSourceControl:0x007fd35a98bce8>
custom_source_control.rb:34:in `block (2 levels) in <main>'
As you might have guessed, it failed since we haven’t actually created a snapshot
method. You also probably guessed that, that is going to be our next step. And you’d be right.
class CustomSourceControl
...
def snapshot
end
...
end
Deep breath…
(retest)
..
Finished tests in 0.001580s, 1265.8228 tests/s, 1898.7342 assertions/s.
…and we’re passing. Now, let’s create a new describe
block for our snapshot story and test the existence of a metadata file. The test will be inside our main describe CustomSourceControl do
block, outside and below the describe 'when a repository is initialized' do
.
...
describe CustomSourceControl do
...
describe 'when a repository is initialized' do
...
end
describe 'when we take a snapshot' do
it 'must create a metadata file' do
@csc.metadata_exists?.must_equal true
end
end
end
Which will undoubtedly fail…
(test)
..E
Finished tests in 0.002115s, 1418.4397 tests/s, 1418.4397 assertions/s.
1) Error:
test_0001_must create a metadata file(CustomSourceControl::when we take a snapshot):
NoMethodError: undefined method `metadata_exists?' for #<CustomSourceControl:0x007fb122a36760>
custom_source_control.rb:53:in `block (3 levels) in <main>'
Now, let’s make it pass.
class CustomSourceControl
...
def metadata_exists?
File.exists? File.join('.esc', '__metadata__')
end
...
def snapshot
File.open(File.join('.esc', '__metadata__'), 'w')
end
end
First, we are adding the necessary metadata_exists?
method and checking that a file .esc/__metadata__
actually exists. It won’t, and has to be created as part of the snapshot so we add that code to our snapshot method.
(retest)
...
Finished tests in 0.001947s, 1540.8320 tests/s, 2054.4427 assertions/s.
Here is a homework assignment:
- Follow the same process for the manifest file test/creation. Make sure you follow the process as you go.
It should look something like this:
(our test)
describe 'when we take a snapshot' do
...
it 'must create a manifest file' do
@csc.manifest_exists?.must_equal true
end
end
…and this:
(method to test existence)
class CustomSourceControl
...
def manifest_exists?
File.exists? File.join('.esc', '__manifest__')
end
...
end
…and last but not least:
(actually create the file)
class CustomSourceControl
...
def snapshot
File.new(File.join('.esc', '__metadata__'), 'w')
File.new(File.join('.esc', '__manifest__'), 'w')
end
...
end
(retest)
....
Finished tests in 0.002804s, 1426.5335 tests/s, 1783.1669 assertions/s.
Our next step is to get a list of files in the current working directory.
describe 'when we take a snapshot' do
...
it 'gets a list of files in the current working directory' do
@csc.cwd_files.must_equal ['custom_source_control.rb']
end
end
class CustomSourceControl
...
def cwd_files
all_files_wildcard = File.join '**', '*'
Dir.glob(all_files_wildcard)
end
...
end
Notice how I skipped running the test and went straight to implementation? Well I didn’t actually skip the testing part. I just didn’t write it here. Keep this in mind. You should be testing as often as possible. Get familiar with the messages and try to understand what they are telling you is wrong.
(retest)
.....
Finished tests in 0.003872s, 1291.3223 tests/s, 1549.5868 assertions/s.
We moved pretty quickly in that last cycle of: write a test, watch it fail, write code to make it pass. The last part in that cycle which I have not done (for the most part) is refactor. It’s called Red, Green, Refactor. Refactoring is an important part of the cycle and I normally wouldn’t skip over it. I am doing so here however to get you familiar with the other parts of the cycle with the intention that we will revisit and refactor in another post. I mentioned this before, but want to reiterate the point here.
Let’s create our SHA1 file hashes.
(write a test)
describe 'when we take a snapshot' do
...
it 'creates a file hash for all files in the current working directory' do
hashes = {
'custom_source_control.rb' => '1527a36a8246ad0c07b9d5478c7374d3d576752d'
}
@csc.cwd_hashes.must_equal hashes
end
end
(watch it fail)
(test)
.....E
Finished tests in 0.006143s, 976.7215 tests/s, 976.7215 assertions/s.
1) Error:
test_0004_creates a file hash for all files in the current working directory(CustomSourceControl::when we take a snapshot):
NoMethodError: undefined method `cwd_hashes' for #<CustomSourceControl:0x007fbafa8158a0>
custom_source_control.rb:84:in `block (3 levels) in <main>'
(write code to make it pass)
require 'openssl'
...
class CustomSourceControl
...
def cwd_hashes
sha1 = OpenSSL::Digest::SHA1.new
hashes = {}
cwd_files.each do |file|
hashes[file] = sha1.hexdigest(File.read(file))
sha1.reset
end
hashes
end
...
end
What did we just do here? First, we are adding openssl
, which provides the methods necessary to hash files. In the cwd_hashes
method, we’re creating an instance of OpenSSL::Digest::SHA1
and later using the hexdigest
it provides to hash files in the current working directory.
(retest)
..F...
Finished tests in 0.020276s, 295.9164 tests/s, 345.2357 assertions/s.
1) Failure:
test_0004_creates a file hash for all files in the current working directory(CustomSourceControl::when we take a snapshot) [custom_source_control.rb:95]:
--- expected
+++ actual
@@ -1 +1 @@
-{"custom_source_control.rb"=>"1527a36a8246ad0c07b9d5478c7374d3d576752d"}
+{"custom_source_control.rb"=>"0b85c90f96ef4d8f0937338760be827d68f93469"}
We’re getting the wrong hash, and that’s because were actually updating the very file we’re hashing/testing. We can never (at least I can’t think of a cleaver way) pass like this. We need to account for this and work around it. What we’re going to do is create two files manually, sha1sum
them and remove all but those two files from the hash returned from the cwd_hashes
method.
Here I am going to use the cat
command and paste the text in. You can use any means you’re comfortable with to create the files and add the test to them. It is important however that you add the empty newline
by hitting the enter
key at the end of the sentence. I am also assuming that you have shasum
or sha1sum
installed. If you don’t you can safely skip over that command.
cat > test_file_1.txt
Copy and paste in the following:
this is test_file_1.text
Then type ctrl+c to quit. Run shasum
or sha1sum
to get the files hash.
sha1sum test_file_1.txt
bb4d8995cfa843effc83d6ddcea1a8351c09497f test_file_1.txt
Repeat the process for the second test file.
cat > test_file_2.txt
this is test_file_2.text
sha1sum test_file_2.txt
5d3140359919315ea06e3755cdc81860e9d7c556 test_file_2.txt
Now let’s update our test.
...
it 'creates a file hash for all files in the current working directory' do
actual_hashes = {
'test_file_1.txt' => 'bb4d8995cfa843effc83d6ddcea1a8351c09497f',
'test_file_2.txt' => '5d3140359919315ea06e3755cdc81860e9d7c556'
}
expected_hashes = @csc.cwd_hashes.keep_if { |key, value| key == 'test_file_1.txt' || key == 'test_file_2.txt' }
expected_hashes.must_equal actual_hashes
end
...
(retest)
..F...
Finished tests in 0.019896s, 301.5682 tests/s, 351.8295 assertions/s.
1) Failure:
test_0003_gets a list of files in the current working directory(CustomSourceControl::when we take a snapshot) [custom_source_control.rb:88]:
--- expected
+++ actual
@@ -1 +1 @@
-["custom_source_control.rb"]
+["custom_source_control.rb", "test_file_1.txt", "test_file_2.txt"]
This gets us passing the test in question, but now we’re failing a previous test. If we inspect the message we see that the newly introduced files are causing gets a list of files in the current working directory
test to fail. We can simply add the new file names to the array of actual file names.
...
it 'gets a list of files in the current working directory' do
@csc.cwd_files.must_equal ['custom_source_control.rb', 'test_file_1.txt', 'test_file_2.txt']
end
...
(retest)
......
Finished tests in 0.004923s, 1218.7690 tests/s, 1421.8972 assertions/s.
Now we need our list of files in the repository.
describe 'when we take a snapshot' do
...
it 'gets a list of files in the current working directory' do
@csc.repository_file_list.must_equal ['__manifest__', '__metadata__', 'HEAD']
end
end
(test)
......E
Finished tests in 0.008568s, 816.9935 tests/s, 816.9935 assertions/s.
1) Error:
test_0005_gets a list of files in the current working directory(CustomSourceControl::when we take a snapshot):
NoMethodError: undefined method `repository_file_list' for #<CustomSourceControl:0x007fd440844718>
class CustomSourceControl
...
def repository_file_list
all_files_wildcard = File.join '.esc', '*'
Dir.glob(all_files_wildcard).map { |pathname| File.basename pathname }
end
...
end
(retest)
.......
Finished tests in 0.004527s, 1546.2779 tests/s, 1767.1747 assertions/s.
Ok, now that we have our list of existing files in the repository. We can compare what is new, with what already exists and return both of those lists.
describe 'when we take a snapshot' do
...
it 'returns a list of new and existing files' do
deltas = @csc.deltas
deltas[:new].keep_if { |key, value| key == 'test_file_1.txt' || key == 'test_file_2.txt' }
deltas[:new].must_equal ['test_file_1.txt', 'test_file_2.txt']
deltas[:existing].must_equal []
end
end
(test)
...E....
Finished tests in 0.008715s, 917.9575 tests/s, 917.9575 assertions/s.
1) Error:
test_0006_returns a list of new and existing files(CustomSourceControl::when we take a snapshot):
NoMethodError: undefined method `deltas' for #<CustomSourceControl:0x007f8c2a83e790>
Let’s add that deltas
method.
class CustomSourceControl
...
def deltas
new, existing = [], []
cwd_hashes.each do |key, value|
if repository_file_list.include? key
existing << key
else
new << key
end
end
{ :new => new, :existing => existing }
end
...
end
Here, we’re going through our current working directory hashes and checking if any exist in the repository. If they do, we add them to the existing array. If they do not, we add them to the new array. Then we create a hash with the keys :new
& :existing
, add the arrays, and return that hash.
(retest)
........
Finished tests in 0.005304s, 1508.2956 tests/s, 1885.3695 assertions/s.
I think the next step should be to add the files to the manifest, then based off the manifest copy and hash the files added to the snapshot.
describe 'when we take a snapshot' do
...
it 'adds entries to the manifest file' do
expected_content = %Q{5d3140359919315ea06e3755cdc81860e9d7c556 => test_file_2.txt (new)\nbb4d8995cfa843effc83d6ddcea1a8351c09497f => test_file_1.txt (new)}
@csc.write_manifest
@csc.manifest_contents.gsub!(/.*? => custom_source_control\.rb \(new\)\n?/, '').chomp.must_equal expected_content
end
end
(test)
.......E.
Finished tests in 0.009311s, 966.5986 tests/s, 1073.9985 assertions/s.
1) Error:
test_0007_adds entries to the manifest file(CustomSourceControl::when we take a snapshot):
NoMethodError: undefined method `write_manifest' for #<CustomSourceControl:0x007fb13084c888>
class CustomSourceControl
...
def hash_for_file(file = nil)
sha1 = OpenSSL::Digest::SHA1.new
sha1.hexdigest(File.read(file))
end
...
def write_manifest
file_deltas = deltas
file_list = []
file_deltas[:new].each { |filename| file_list << "#{hash_for_file filename} => #{filename} (new)"}
file_deltas[:existing].each { |filename| file_list << "#{hash_for_file filename} => #{filename} (existing)"}
File.open(File.join('.esc', '__manifest__'), 'w') do |file|
file_list.sort!.each { |entry| file.puts entry }
end
end
...
end
Here we’re getting the deltas and writing them to the manifest file. We also added a helper method hash_for_file
to return the hash of any file we pass in. I can see this coming in handy.
(retest)
........E
Finished tests in 0.010894s, 826.1428 tests/s, 917.9365 assertions/s.
1) Error:
test_0007_adds entries to the manifest file(CustomSourceControl::when we take a snapshot):
NoMethodError: undefined method `manifest_contents' for #<CustomSourceControl:0x007f91e08542f0>
custom_source_control.rb:146:in `block (3 levels) in <main>'
We’re going to need to read that manifest file back out, so let’s add that method.
class CustomSourceControl
...
def manifest_contents(manifest = nil)
manifest ||= '__manifest__'
File.open(".esc/#{manifest}", 'r') { |f| f.read }
end
...
end
(retest)
.........
Finished tests in 0.006487s, 1387.3902 tests/s, 1695.6991 assertions/s.
If we take another look at this test we see @csc.write_manifest
method call. Really this should be happening in the snapshot
method itself. So let’s make that call in snapshot
and remove it from the test.
class CustomSourceControl
...
def snapshot
File.new(File.join('.esc', '__metadata__'), 'w')
File.new(File.join('.esc', '__manifest__'), 'w')
write_manifest
end
...
end
(retest)
.........
Finished tests in 0.014500s, 620.6897 tests/s, 758.6207 assertions/s.
Next we need to copy the files in the manifest to the repository directory
describe 'when we take a snapshot' do
...
it 'copies files listed in the manifest to the repository' do
@csc.copy_manifest_files_to_repository
@csc.verify_manifest('__manifest__').must_equal true
end
end
(test)
..E.......
Finished tests in 0.015868s, 630.1991 tests/s, 693.2191 assertions/s.
1) Error:
test_0008_copies files listed in the manifest to the repository(CustomSourceControl::when we take a snapshot):
NoMethodError: undefined method `copy_manifest_files_to_repository' for #<CustomSourceControl:0x007ff26103d120>
custom_source_control.rb:155:in `block (3 levels) in <main>'
We’re introducing a few methods here so let’s take our time with this.
class CustomSourceControl
...
def copy_manifest_files_to_repository
end
...
end
(retest)
..E.......
Finished tests in 0.015557s, 642.7975 tests/s, 707.0772 assertions/s.
1) Error:
test_0008_copies files listed in the manifest to the repository(CustomSourceControl::when we take a snapshot):
NoMethodError: undefined method `verify_manifest' for #<CustomSourceControl:0x007fca9b03d1b8>
custom_source_control.rb:159:in `block (3 levels) in <main>'
Add the verify_manifest
helper method.
class CustomSourceControl
...
def verify_manifest(manifest = nil)
end
...
end
(retest)
.........F
Finished tests in 0.024386s, 410.0714 tests/s, 492.0856 assertions/s.
1) Failure:
test_0008_copies files listed in the manifest to the repository(CustomSourceControl::when we take a snapshot) [custom_source_control.rb:162]:
Expected: true
Actual: nil
Now that we’re failing, let’s start implementing the verify_manifest
method.
class CustomSourceControl
...
def verify_manifest(manifest = nil)
manifest ||= '__manifest__'
File.open(File.join('.esc', manifest)) do |file|
repo_files = repository_file_list()
file.readlines.each do |entry|
return false unless repo_files.include? entry[0...40]
end
end
true
end
...
end
Here we’re reading the __manifest__
file, and for each entry we get the 40 character hash (entry[0...40]
) and checking the repo_files
array (file names) for it.
(retest)
....F.....
Finished tests in 0.030663s, 326.1259 tests/s, 391.3511 assertions/s.
1) Failure:
test_0008_copies files listed in the manifest to the repository(CustomSourceControl::when we take a snapshot) [custom_source_control.rb:170]:
Expected: true
Actual: false
This time we’re returning false
, and it makes sense since we’re not actually copying the files just yet. So let’s work on the implementing the copy_manifest_files_to_repository
method.
class CustomSourceControl
...
def copy_entry_to_repository(manifest_entry)
FileUtils.cp(manifest_entry[:pathname], File.join('.esc', manifest_entry[:hash]), { :preserve => true })
end
def copy_manifest_files_to_repository
entries = []
File.open(File.join('.esc', '__manifest__'), 'r') do |file|
file.readlines.each do |entry|
entry =~ /(.*?) => (.*?) \(new\)\n?/
entries << { :hash => $1, :pathname => $2 } if $1
end
end
entries.each do |entry|
copy_entry_to_repository entry
end
end
...
end
There is quite a bit going here. First, we open the __manifest__
file and break down each entry (line). What’s the deal with =~
, and what are $1
and $2
? =~
is the match operator in ruby. It will match the variable on the left (string or regular expression) to a regular expression on the right. It returns nil
if a match is not found, and the position of the match if found. Also if there is a match the $1
, $2
, …, $9
will represent the capture blocks (whatever is enclosed in the ()
). That is how we break down the entry into a hash
and pathname
. For the actual copying we created a helper method copy_entry_to_repository
.
(retest)
..........
Finished tests in 0.016832s, 594.1065 tests/s, 712.9278 assertions/s.
That was fun, wasn’t it? Let’s take the same approach and add the copy_manifest_files_to_repository
call to the snapshot
method. This will allow us to remove it from the test as well. Make sure you’re test still passes before moving on.
…it didn’t pass did it? We’re you able to figure out why? Did you attempt to fix it? Here is what I did. Based on the failure:
........F.
Finished tests in 0.039832s, 251.0544 tests/s, 301.2653 assertions/s.
1) Failure:
test_0005_gets a list of files in the current working directory(CustomSourceControl::when we take a snapshot) [custom_source_control.rb:168]:
--- expected
+++ actual
@@ -1 +1 @@
-["__manifest__", "__metadata__", "HEAD"]
+["3b0930f8589a4eb37a1dbb9cbf355391781c2bba", "5d3140359919315ea06e3755cdc81860e9d7c556", "__manifest__", "__metadata__", "bb4d8995cfa843effc83d6ddcea1a8351c09497f", "HEAD"]
I went right to the gets a list of files in the current working directory
test and saw that we’re only accounting for the HEAD
file, which should always be in an initialized repository, and then the current working __manifest__
& __metadata__
files. This isn’t the case anymore since our snapshot
method is doing more at this point. So what we really want is to make sure that at the point of this test at least those files exist. The must_include
assertion provided by minitest
is perfect for this.
Let’s update our gets a list of files in the current working directory
test to the following:
describe 'when we take a snapshot' do
...
it 'gets a list of files in the current working directory' do
['__manifest__', '__metadata__', 'HEAD'].each do |filename|
@csc.repository_file_list.must_include filename
end
end
...
end
(retest)
..........
Finished tests in 0.026117s, 382.8924 tests/s, 650.9170 assertions/s.
Now we have to calculate the hash of the manifest file and rename it to the hash.
describe 'when we take a snapshot' do
...
it 'calculates the hash of the manifest file and renames it to the hash' do
manifest_hash = @csc.hash_for_file File.join('.esc', '__manifest__')
@csc.hash_and_copy_manifest
@csc.repository_file_exists?(manifest_hash).must_equal true
end
end
(test)
.......E...
Finished tests in 0.026070s, 421.9409 tests/s, 652.0905 assertions/s.
1) Error:
test_0009_calculates the hash of the manifest file and renames it to the hash(CustomSourceControl::when we take a snapshot):
NoMethodError: undefined method `hash_and_copy_manifest' for #<CustomSourceControl:0x007ffa4a0a7020>
class CustomSourceControl
...
def hash_and_copy_manifest
manifest_file = File.join('.esc', '__manifest__')
hash = hash_for_file(manifest_file)
FileUtils.cp(manifest_file, File.join('.esc', hash))
hash
end
...
def repository_file_exists?(filename = nil)
File.exists? File.join('.esc', filename)
end
...
end
We added another helper method repository_file_exists?
. It simply takes a file name and checks the repository for existence of the filename.
(retest)
...........
Finished tests in 0.039280s, 280.0407 tests/s, 458.2485 assertions/s.
Now that we’re passing, let’s add the hash_and_copy_manifest
method to the snapshot
method and remove @csc.hash_and_copy_manifest
from the test. Make sure you’re passing and move on.
We’re almost there. Next, we have to update the metadata file with the necessary info, then hash it.
describe 'when we take a snapshot' do
...
it 'adds the snapshot info to the metadata file, calculates its file hash, and renames it to the hash' do
manifest_hash = @csc.hash_for_file File.join('.esc', '__manifest__')
@csc.write_metadata manifest_hash
metadata_hash = @csc.hash_for_file File.join('.esc', '__metadata__')
@csc.hash_and_copy_metadata
@csc.repository_file_exists?(metadata_hash).must_equal true
end
end
(test)
..........E.
Finished tests in 0.033031s, 363.2951 tests/s, 544.9426 assertions/s.
1) Error:
test_0010_adds the snapshot info to the metadata file, calculates its file hash, and renames it to the hash(CustomSourceControl::when we take a snapshot):
NoMethodError: undefined method `write_metadata' for #<CustomSourceControl:0x007fc77c022f60>
class CustomSourceControl
...
def hash_and_copy_metadata
metadata_file = File.join('.esc', '__metadata__')
hash = hash_for_file(metadata_file)
FileUtils.cp(metadata_file, File.join('.esc', hash))
hash
end
...
def write_metadata(manifest_hash = nil)
File.open(File.join('.esc', '__metadata__'), 'w') do |file|
file.puts "Snapshot Manifest: #{manifest_hash}"
file.puts "Snapshot Parent: #{(head_contents.empty?) ? 'root' : head_contents}"
file.puts "Snapshot Taken: #{Time.now}"
end
end
...
end
(retest)
............
Finished tests in 0.034964s, 343.2102 tests/s, 543.4161 assertions/s.
Let’s clean up our test like we’ve done before. The @csc.write_metadata manifest_hash
& @csc.hash_and_copy_metadata
calls will happen in the snapshot
methods so let’s delete them.
...
it 'adds the snapshot info to the metadata file, calculates its file hash, and renames it to the hash' do
manifest_hash = @csc.hash_for_file File.join('.esc', '__manifest__')
metadata_hash = @csc.hash_for_file File.join('.esc', '__metadata__')
@csc.repository_file_exists?(metadata_hash).must_equal true
end
...
(test)
..F.........
Finished tests in 0.076299s, 157.2760 tests/s, 249.0203 assertions/s.
1) Failure:
test_0010_adds the snapshot info to the metadata file, calculates its file hash, and renames it to the hash(CustomSourceControl::when we take a snapshot) [custom_source_control.rb:224]:
Expected: true
Actual: false
Now that we’re failing lets add the necessary calls to the snapshot
method.
...
def snapshot
File.new(File.join('.esc', '__metadata__'), 'w')
File.new(File.join('.esc', '__manifest__'), 'w')
write_manifest
copy_manifest_files_to_repository
manifest_hash = hash_and_copy_manifest
write_metadata manifest_hash
hash_and_copy_metadata
end
...
(retest)
............
Finished tests in 0.079316s, 151.2936 tests/s, 239.5481 assertions/s.
The last step for our snapshot story is to update HEAD to point to this snapshot (metadata filename)
describe 'when we take a snapshot' do
...
it 'updates HEAD to the latest snapshot' do
metadata_hash = @csc.hash_for_file File.join('.esc', '__metadata__')
@csc.update_head metadata_hash
@csc.head_contents.must_equal metadata_hash
end
end
(test)
......E......
Finished tests in 0.042301s, 307.3213 tests/s, 449.1620 assertions/s.
1) Error:
test_0011_updates HEAD to the latest snapshot(CustomSourceControl::when we take a snapshot):
NoMethodError: undefined method `update_head' for #<CustomSourceControl:0x007ff5908eea58>
custom_source_control.rb:231:in `block (3 levels) in <main>'
class CustomSourceControl
...
def update_head(metadata_hash = nil)
File.open(File.join('.esc', 'HEAD'), 'w') do |file|
file.write metadata_hash
end
end
...
end
(retest)
.............
Finished tests in 0.040271s, 322.8129 tests/s, 496.6353 assertions/s.
Now refactor the update_head
out of the test.
it 'updates HEAD to the latest snapshot' do
metadata_hash = @csc.hash_for_file File.join('.esc', '__metadata__')
@csc.head_contents.must_equal metadata_hash
end
(retest)
..F..........
Finished tests in 0.164686s, 78.9381 tests/s, 121.4432 assertions/s.
1) Failure:
test_0011_updates HEAD to the latest snapshot(CustomSourceControl::when we take a snapshot) [custom_source_control.rb:237]:
--- expected
+++ actual
@@ -1 +1 @@
-"7a294d8483262c77b5d762a5325efb920e9948b6"
+""
def snapshot
File.new(File.join('.esc', '__metadata__'), 'w')
File.new(File.join('.esc', '__manifest__'), 'w')
write_manifest
copy_manifest_files_to_repository
manifest_hash = hash_and_copy_manifest
write_metadata manifest_hash
metadata_hash = hash_and_copy_metadata
update_head metadata_hash
end
(retest)
.F...........
Finished tests in 0.048889s, 265.9085 tests/s, 409.0900 assertions/s.
1) Failure:
test_0002_must create an empty HEAD file(CustomSourceControl::when a repository is initialized) [custom_source_control.rb:177]:
Expected: true
Actual: false
We expected the previous because we removed the update_head
call and didn’t add it to snapshot
. Then, we added the update_head
to the snapshot
method, but since that file is not empty anymore we’re failing our must create an empty HEAD file
test. It looks like we’re going to have to refactor a bit more.
Let’s refactor the before
block. We know all of our tests depend on @csc
to be an instance of CustomSourceControl
and they all need an initialized repository. The thing is our when a repository is initialized
tests don’t require a snapshot
. So let’s move that out and into a before
block inside the when we take a snapshot
tests.
describe CustomSourceControl do
...
before do
@csc = CustomSourceControl.new
@csc.initialize_repository
end
...
describe 'when we take a snapshot' do
before do
@csc.snapshot
end
...
end
end
(retest)
.............
Finished tests in 0.040347s, 322.2049 tests/s, 495.6998 assertions/s.
…and our snapshot story is complete! On to our final story!
Checkout
Checkout previous snapshots
As a user
I want to checkout a previous snapshot
So that I can fix issues or get back to a working state
At some point we’re going to need a way to list all the snapshots csc
knows about. One way to do this would be to get the HEAD
snapshot then recursively scan through the metadata files and their parents all the way up to root, then just list them out. This might end up in a log
subcommand. For now, I am trying to keep the functionality really basic. I am going to manually build up the repository with 2 snapshots, then pick the first snapshot to checkout.
To keep this testable, we’ll do this with a before block for this set of tests.
describe CustomSourceControl do
...
describe 'when we checkout a previous snapshot' do
before do
@csc.snapshot
end
end
end
So that’s going to create the first snapshot. I am going to use the pry
gem to suspend the test so that I can manually inspect the .esc
directory. If you use it, make sure to type quit
when you’re finished inspecting things.
describe 'when we checkout a previous snapshot' do
...
it 'suspends the test so we can inspect the .esc directory' do
require 'pry'; binding.pry
skip
end
end
There is a way you can accomplish this without having to install a gem. Add a call to ruby’s sleep
method with a time of something long enough for you to carry the tasks out for yourself. That would look like this:
describe 'when we checkout a previous snapshot' do
...
it 'suspends the test so we can inspect the .esc directory' do
sleep 60 * 60
skip
end
end
* Be sure to clean up after yourself by deleting the suspends the test so we can inspect the .esc directory
test when you are finished.
Getting the hashes:
ls -ltr .esc
-rw-r--r-- 1 mattweppler staff 25 Mar 8 11:01 bb4d8995cfa843effc83d6ddcea1a8351c09497f
-rw-r--r-- 1 mattweppler staff 25 Mar 8 11:02 5d3140359919315ea06e3755cdc81860e9d7c556
-rwxr-xr-x 1 mattweppler staff 7351 Mar 8 14:43 2c114f49e4935865fb00903dba1df2ba70283748
-rw-r--r-- 1 mattweppler staff 129 Mar 8 14:44 b696627a90073c3d3870a30ac5c8140f853a6b3e
-rw-r--r-- 1 mattweppler staff 129 Mar 8 14:44 __metadata__
-rw-r--r-- 1 mattweppler staff 207 Mar 8 14:44 __manifest__
-rw-r--r-- 1 mattweppler staff 40 Mar 8 14:44 HEAD
-rw-r--r-- 1 mattweppler staff 207 Mar 8 14:44 87b17efdc68c9c1d806c4bd05ce70d9baacd22bf
So if we inspect HEAD
we see the metadata file hash.
cat .esc/HEAD
b696627a90073c3d3870a30ac5c8140f853a6b3e
We then take that file hash, which is the metadata file and inspect that:
cat .esc/b696627a90073c3d3870a30ac5c8140f853a6b3e
Snapshot Manifest: 87b17efdc68c9c1d806c4bd05ce70d9baacd22bf
Snapshot Parent: root
Snapshot Taken: 2014-03-08 14:44:10 -0800
This shows that this is the first snapshot as denoted by the Snapshot Parent: root
. So let’s take a look at the manifest next Snapshot Manifest: 87b17efdc68c9c1d806c4bd05ce70d9baacd22bf
cat .esc/87b17efdc68c9c1d806c4bd05ce70d9baacd22bf
2c114f49e4935865fb00903dba1df2ba70283748 => custom_source_control.rb (new)
5d3140359919315ea06e3755cdc81860e9d7c556 => test_file_2.txt (new)
bb4d8995cfa843effc83d6ddcea1a8351c09497f => test_file_1.txt (new)
You can quit
pry
now. If you used the sleep
method do these tasks and it still hasn’t woke up and finished just hit ctrl+c
to kill the tests. Let’s add a new file test_file_3.txt
and update an existing one test_file_2.txt
:
...
before do
@csc.snapshot
# make some edits
File.open('test_file_3.txt', 'w') do |file|
file.write "this is test_file_3.text\n"
end
File.open('test_file_2.txt', 'a') do |file|
file.write "this is an update to test_file_2.text\n"
end
# take our second snapshot
@csc.snapshot
end
...
We’re also going to want to clean up after ourselves again:
...
after do
File.open('test_file_2.txt', 'w') do |file|
file.write "this is test_file_2.text\n"
end
File.delete('test_file_3.txt')
end
...
Ok let’s run the test again and this time make note of the hashes. For me, HEAD
is 35d91f744401d8d4828c65bd65029dc07119d5a7
. The metadata file (35d91f744401d8d4828c65bd65029dc07119d5a7
) shows:
Snapshot Manifest: 14c86c6a758c197367d417b32d57433446fa63e4
Snapshot Parent: 36e0583c25d5e8107538afa345122e9529b9d6fd
Snapshot Taken: 2014-03-08 14:50:32 -0800
So let’s take the parent metadata 36e0583c25d5e8107538afa345122e9529b9d6fd
and take a look:
Snapshot Manifest: 1b1ab1bef308608786e9a1ae2e30e370dd032939
Snapshot Parent: root
Snapshot Taken: 2014-03-08 14:50:32 -0800
Yep, 1b1ab1bef308608786e9a1ae2e30e370dd032939
that’s the one we want. Just to be sure, I ran through this process a few more times. I noticed that 2 of the hashes were changing, while the remaining files stayed the same. So I opened one of the files where the file hash had changed. I spotted the issue right away… The timestamp! Since the timestamp changed each time I ran it I was not getting a consistent set of hashes. In the spirit of keeping it simple, I am just going to change the timestamp to a constant value ‘2014-03-07 23:59:59 -0800’. This may seem hacky, and it is. :)
class CustomSourceControl
...
def write_metadata(manifest_hash = nil)
File.open(File.join('.esc', '__metadata__'), 'w') do |file|
file.puts "Snapshot Manifest: #{manifest_hash}"
file.puts "Snapshot Parent: #{(head_contents.empty?) ? 'root' : head_contents}"
file.puts "Snapshot Taken: #{'2014-03-07 23:59:59 -0800' || Time.now}"
end
end
...
end
This time, our consistent hash is 3b9158d6cd90b07811496330d873d8a71651cd8b
.
Snapshot Manifest: fba7362f34266e6d491c48396f936a8ed4bc5c72
Snapshot Parent: root
Snapshot Taken: 2014-03-03 23:59:59 -0800
We can remove the suspends the test so we can inspect the .esc directory
test.
...
it 'copies files from the manifest into the current working directory' do
@csc.checkout '3b9158d6cd90b07811496330d873d8a71651cd8b'
restored_hash = @csc.hash_for_file 'test_file_2.txt'
restored_hash.must_equal '5d3140359919315ea06e3755cdc81860e9d7c556'
end
...
You know the drill by now.
(retest)
..E...........
Finished tests in 0.052594s, 266.1901 tests/s, 380.2715 assertions/s.
1) Error:
test_0001_copies files from the manifest into the current working directory(CustomSourceControl::when we checkout a previous snapshot):
NoMethodError: undefined method `checkout' for #<CustomSourceControl:0x007feb59034bb0>
class CustomSourceControl
...
def checkout(snapshot = nil)
end
...
end
(retest)
..F...........
Finished tests in 0.056402s, 248.2181 tests/s, 372.3272 assertions/s.
1) Failure:
test_0001_copies files from the manifest into the current working directory(CustomSourceControl::when we checkout a previous snapshot) [custom_source_control.rb:274]:
--- expected
+++ actual
@@ -1 +1 @@
-"5d3140359919315ea06e3755cdc81860e9d7c556"
+"89cdd7bfa8c31d8a23a61d6b695b762c7f588bee"
class CustomSourceControl
...
def checkout(snapshot = nil)
manifest_hash = ''
File.open(File.join('.esc', snapshot), 'r') do |file|
file.readlines.each do |entry|
manifest_hash = $1 if entry =~ /Snapshot Manifest: (\w{40})/
end
end
entries = []
File.open(File.join('.esc', manifest_hash), 'r') do |file|
file.readlines.each do |entry|
entry =~ /(.*?) => (.*?) \(new\)\n?/
entries << { :hash => $1, :pathname => $2 } if $1
end
end
entries.each do |entry|
copy_entry_to_working_directory entry
end
end
...
def copy_entry_to_working_directory(entry)
FileUtils.cp(File.join('.esc', entry[:hash]), entry[:pathname], { :preserve => true })
end
...
end
Here we’re reading the metadata file and getting manifest hash. Then we’re reading the manifest file and breaking down the entries again, this time calling copy_entry_to_working_directory
method to copy the files from the repository to the current working directory.
(retest)
..E...........
Finished tests in 0.041946s, 333.7625 tests/s, 476.8035 assertions/s.
1) Error:
test_0001_copies files from the manifest into the current working directory(CustomSourceControl::when we checkout a previous snapshot):
Errno::ENOENT: No such file or directory - .esc/3b9158d6cd90b07811496330d873d8a71651cd8b
Ugh… another issue with the hashing. Not so much the hashing actually, but the fact that we’re editing the very file we’re trying to code/test custom_source_control.rb
. This test can never pass. So what are, our options? Well, the first that comes to mind is to just run the script from another directory. We can do this by adding the directory we’re working in to our path. This actually would’ve solved an issue we faced earlier as well. However, I tried to avoid it to keep things simple.
First, let’s skip the current failing test.
(retest)
..S...........
Finished tests in 0.115535s, 121.1754 tests/s, 173.1077 assertions/s.
Ok, we’re passing again so we can restructure a bit. Let’s get the current working directory.
pwd
/Users/mattweppler/developer/projects/custom_source_control
Now let’s add it to our path so we can execute it from a different directory. We can even bring our command a little closer to the command Jim Weirich mentions: csc
, by creating a symlink. Create a new directory (it can even be within the current directory), I am calling it test_dir
. Then, let’s move the test files into the test_dir
and change into that directory.
export PATH=$PATH:/Users/mattweppler/developer/projects/custom_source_control
ln -s $HOME/developer/projects/custom_source_control/custom_source_control.rb $HOME/developer/projects/custom_source_control/csc
mkdir test_dir
mv test_file_* test_dir
cd test_dir
Some of our tests will fail since I did a few hackish things here and there. Again, I was trying to cut down on the amount of possible new concepts. Oh well… Rerun the tests and let’s see what we get.
csc
(retest)
..S....F.E....
Finished tests in 0.178928s, 78.2438 tests/s, 106.1880 assertions/s.
1) Failure:
test_0003_gets_a_list_of_files_in_the_current_working_directory(CustomSourceControl::when we take a snapshot) [/Users/mattweppler/developer/projects/custom_source_control/csc:218]:
--- expected
+++ actual
@@ -1 +1 @@
-["custom_source_control.rb", "test_file_1.txt", "test_file_2.txt"]
+["test_file_1.txt", "test_file_2.txt"]
2) Error:
test_0007_adds_entries_to_the_manifest_file(CustomSourceControl::when we take a snapshot):
NoMethodError: undefined method `chomp' for nil:NilClass
/Users/mattweppler/developer/projects/custom_source_control/csc:245:in `block (3 levels) in <main>'
Ok so what are we working with here. Well we no longer need to account for the custom_source_control.rb file. Let’s go update that. So this:
...
it 'gets a list of files in the current working directory' do
@csc.cwd_files.must_equal ['custom_source_control.rb', 'test_file_1.txt', 'test_file_2.txt']
end
...
becomes this:
...
it 'gets a list of files in the current working directory' do
@csc.cwd_files.must_equal ['test_file_1.txt', 'test_file_2.txt']
end
...
We can also remove the keep_if
’s since we were trying to guard against anything but our control files (test_file_1.txt
, test_file_2.txt
). So this:
...
it 'creates a file hash for all files in the current working directory' do
actual_hashes = {
'test_file_1.txt' => 'bb4d8995cfa843effc83d6ddcea1a8351c09497f',
'test_file_2.txt' => '5d3140359919315ea06e3755cdc81860e9d7c556'
}
expected_hashes = @csc.cwd_hashes.keep_if { |key, value| key == 'test_file_1.txt' || key == 'test_file_2.txt' }
expected_hashes.must_equal actual_hashes
end
...
becomes this:
...
it 'creates a file hash for all files in the current working directory' do
actual_hashes = {
'test_file_1.txt' => 'bb4d8995cfa843effc83d6ddcea1a8351c09497f',
'test_file_2.txt' => '5d3140359919315ea06e3755cdc81860e9d7c556'
}
@csc.cwd_hashes.must_equal actual_hashes
end
...
and this:
...
it 'returns a list of new and existing files' do
deltas = @csc.deltas
deltas[:new].keep_if { |key, value| key == 'test_file_1.txt' || key == 'test_file_2.txt' }
deltas[:new].must_equal ['test_file_1.txt', 'test_file_2.txt']
deltas[:existing].must_equal []
end
...
becomes this:
...
it 'returns a list of new and existing files' do
deltas = @csc.deltas
deltas[:new].must_equal ['test_file_1.txt', 'test_file_2.txt']
deltas[:existing].must_equal []
end
...
Lastly this:
...
it 'adds entries to the manifest file' do
expected_content = %Q{5d3140359919315ea06e3755cdc81860e9d7c556 => test_file_2.txt (new)\nbb4d8995cfa843effc83d6ddcea1a8351c09497f => test_file_1.txt (new)}
@csc.manifest_contents.gsub!(/.*? => custom_source_control\.rb \(new\)\n?/, '').chomp.must_equal expected_content
end
...
becomes this:
...
it 'adds entries to the manifest file' do
expected_content = %Q{5d3140359919315ea06e3755cdc81860e9d7c556 => test_file_2.txt (new)\nbb4d8995cfa843effc83d6ddcea1a8351c09497f => test_file_1.txt (new)}
manifest_contents = @csc.manifest_contents
if manifest_contents =~ /.*? => custom_source_control\.(?:md|rb) \(new\)\n?/
manifest_contents.gsub!(/.*? => custom_source_control\.(?:md|rb) \(new\)\n?/, '').chomp.must_equal expected_content
else
manifest_contents.chomp.must_equal expected_content
end
end
...
(retest)
..S...........
Finished tests in 0.032450s, 431.4330 tests/s, 616.3328 assertions/s.
…and we’re back to passing! Let’s remove that skip statement and continue working on that last test. If you recall, we need to suspend the test long enough so that we can go through the metadata files and get our root snapshot.
For me, that hash is 485ac882b4e89e929584acdfed522499f0a45464
. With that let’s update the test and run it.
...
it 'copies files from the manifest into the current working directory' do
@csc.checkout '485ac882b4e89e929584acdfed522499f0a45464'
restored_hash = @csc.hash_for_file 'test_file_2.txt'
restored_hash.must_equal '5d3140359919315ea06e3755cdc81860e9d7c556'
end
...
For the win…
(retest)
..............
Finished tests in 0.040760s, 343.4740 tests/s, 515.2110 assertions/s.
We… are… passing! Good job! I really enjoyed writing this post. I hope this was helpful for you. Just a few last things before you go.
How do I use this thing now that it’s built?
Well, while we have the methods to handle some of the functionality, we haven’t added the ability to pass arguments on the command line. You can add something very simple like the following code:
First change the require 'minitest/autorun'
to require 'minitest/spec'
and add the following to the bottom of the file.
...
def main
unless ['initialize', 'snapshot', 'checkout'].include? ARGV[0]
puts "#{ARGV[0]} is not a subcommand."
exit 1
end
csc = CustomSourceControl.new
case ARGV[0]
when 'initialize'
csc.initialize_repository
when 'snapshot'
csc.snapshot
when 'checkout'
if ARGV[1]
csc.checkout ARGV[1]
else
puts "'checkout' subcommand takes a second argument, SHA1 of the metadata file to checkout."
exit 1
end
end
end
main if __FILE__ == $0
You would then be able to call it from the command line like this:
csc initialize
tree
.
├── .esc/
│ └── HEAD
├── test_file_1.txt
└── test_file_2.txt
csc snapshot
tree
.
├── .esc/
│ ├── 485ac882b4e89e929584acdfed522499f0a45464
│ ├── 5d3140359919315ea06e3755cdc81860e9d7c556
│ ├── 74aec4d0ab199369fc8fe3fd38a9e1459678b2ea
│ ├── HEAD
│ ├── __manifest__
│ ├── __metadata__
│ └── bb4d8995cfa843effc83d6ddcea1a8351c09497f
├── test_file_1.txt
└── test_file_2.txt
cat > test_file_3.txt
this is test_file_3.text
cat >> test_file_2.txt
this is an update to test_file_2.text
cat test_file_2.txt
this is test_file_2.text
this is an update to test_file_2.text
tree
.
├── .esc/
│ ├── 485ac882b4e89e929584acdfed522499f0a45464
│ ├── 5d3140359919315ea06e3755cdc81860e9d7c556
│ ├── 74aec4d0ab199369fc8fe3fd38a9e1459678b2ea
│ ├── HEAD
│ ├── __manifest__
│ ├── __metadata__
│ └── bb4d8995cfa843effc83d6ddcea1a8351c09497f
├── test_file_1.txt
├── test_file_2.txt
└── test_file_3.txt
csc snapshot
tree
.
├── .esc/
│ ├── 485ac882b4e89e929584acdfed522499f0a45464
│ ├── 5d3140359919315ea06e3755cdc81860e9d7c556
│ ├── 6ace300838a9818ec987a0e483b7a3ae598afe7f
│ ├── 74aec4d0ab199369fc8fe3fd38a9e1459678b2ea
│ ├── 89cdd7bfa8c31d8a23a61d6b695b762c7f588bee
│ ├── HEAD
│ ├── __manifest__
│ ├── __metadata__
│ ├── bb4d8995cfa843effc83d6ddcea1a8351c09497f
│ ├── bdb94f80a11a03f2f739faa297b3dc219df93e0c
│ └── e9e0ddd6a9d8f998c41fb83978942e1021f21cac
├── test_file_1.txt
├── test_file_2.txt
└── test_file_3.txt
cat test_file_2.txt
this is test_file_2.text
this is an update to test_file_2.text
cat .esc/HEAD
e9e0ddd6a9d8f998c41fb83978942e1021f21cac
cat .esc/e9e0ddd6a9d8f998c41fb83978942e1021f21cac
Snapshot Manifest: bdb94f80a11a03f2f739faa297b3dc219df93e0c
Snapshot Parent: 485ac882b4e89e929584acdfed522499f0a45464
Snapshot Taken: 2014-03-07 23:59:59 -0800
csc checkout 485ac882b4e89e929584acdfed522499f0a45464
cat test_file_2.txt
this is test_file_2.text
- You should notice the timestamp still shows
2014-03-07 23:59:59 -0800
. You can remove that line of code, but the tests will fail again. - We don’t really clean up after ourselves so that functionality needs to be added.
- Getting the checkout hash is also a manual process so that
csc log
functionality we talked about would come in handy. - We are not handling any types of errors mind you
…so its not quite production ready.
What do I do next?
Some of the things I’d like to address in a future post include:
- Separating the tests from the actual implementation code.
- DRY’ing out our code. Many times I have had to fight the urge to do it in this post. I really wanted this information to be approachable by anyone though, so I didn’t use any gems, even
minitest/given
which was created by Jim Weirich. - Testing for more edge cases, and fixing any bugs we find.
- Adding code coverage.
- Adding the ability to handle command line arguments with
OptionParser
. - Adding tests and functionality to diff checkins.
- Adding tests and functionality to list the history (metadata file hashes from head all the way back to root)
- Possibly turning this into a gem.
Lastly I’d like to thank a few people for helping with this post. Austin Puri, thanks for running through this as a developer and giving some great feedback. Devon Mahnken, for catching a lot of spelling and English grammar mistakes. After all your corrections, for the first time I think my father was right about me being a robot. Really appreciate the help guys!
You can double check your work with the project I have hosted on github
You may have some questions that this didn’t quite answer. Feel free to email me or leave a comment.