stevieb's tech musings: Using a DVCS for your code and documents: Part 1

This document is the first in a series that outlines the very basics of using a Distributed Revision Control System (DVCS) to manage and store changes and updates to documents you write. Used primarily for software code, I've come to use it for my blog posts, poetry, documentation etc. If you've never heard of revision control systems before, you might want to do a quick search online for what they do, and what they are for.

Back in the day, I used software such as CVS and SVN to house my code changes. This was tedious though. I had to set up a server, keep it running 24x7, ensure it was properly backed-up and maintained. I worked at an ISP, so these things weren't a problem for me. However, for others, having a dedicated server doesn't make a lot of sense.

Recently, I decided to give Bitbucket a try. It, like GitHub, provides free hosting of your document repositories. The most interesting and useful feature of DVCS is that they are indeed distributed. If my CVS or SVN server went down, that was it... work stopped. With DVCS, I can clone my repository, and if the remote server goes down, I can use my current clone just as if it was the original repository. I can clone it again, or even use it to store new changes.

Go on over to Bitbucket and set yourself up an account. Once you're done that, navigate to where you can create a new repository. I'm going to name mine "Test" for this example. I am going to make it public, so that you can see my repository at the end of this post. I don't need a bug tracker, so I'm leaving that option unchecked, as well as the Wiki option. Although I've written patches against Git repositories, I like Mercurial, so I'm going to use that. (The command "hg" is for Mecurial, so you'll see it often in my examples.)

Once you've created the repository (hereinafter: repo), click the link that states "I'm starting from scratch".

Create a repo directory on your computer, and change into it:

$ cd ~
$ mkdir repos
$ cd repos

Now copy the 'clone' line that you see on the Bitbucket page, and paste it on your command line:

steve@ub:~/repos$ hg clone https://bitbucket.org/spek/test

Output:

destination directory: test
no changes found
updating to branch default
0 files updated, 0 files merged, 0 files removed, 0 files unresolved

We've cloned our new, empty repository. It created a new sub-directory called "test". Change into this new directory:

steve@ub:~/repos$ cd test

steve@ub:~/repos/test$ ls -la
total 12
drwxrwxr-x 3 steve steve 4096 2012-04-13 17:59 .
drwxrwxr-x 3 steve steve 4096 2012-04-13 17:59 ..
drwxrwxr-x 3 steve steve 4096 2012-04-13 17:59 .hg

The '.hg' directory is where the important information is stored.

Let's get right into using your repository. I will go through the basic commands as we encounter them.

Start by creating a new file, and adding some text to it. I use vim, but you can of course use any editor of your choosing.

steve@ub:~/repos/test$ vim test.pl

Save the file. Here's what my new file looks like:

#!/usr/bin/perl

use warnings;
use strict;

print "Hello, world!\n";

Ok, we have a new file with some text in it. Let's check the status of the file in relation to our repository. The command 'hg' is for Mercurial:

steve@ub:~/repos/test$ hg status
? test.pl

The "?" before the filename means that this file is unknown to the repository. There will be many cases where you won't want to add certain files to a repository, but we'll deal with that in a later post. For now, we want to add this file:

steve@ub:~/repos/test$ hg add test.pl

Note that you can also call "hg add" with no filenames. This will include ALL files (recursively). Now let's re-check the status of our repository:

steve@ub:~/repos/test$ hg status
A test.pl

The "A" before the filename means that you have added a new file. It has not been committed to the repository yet. Let's do this now:

steve@ub:~/repos/test$ hg commit -m "-initial import"

Output:

abort: no username supplied (see "hg help config")

Whoops! What happened? Well, Mercurial (hg) needs to know authentication information before you send up changes back to your master repository. We'll discuss how to do this momentarily. First, lets focus on the "commit" command to hg. The "-m" flag tells hg that you want to add an inline message for this change. If you omit the -m and the following message, you will be dropped into your default editor to write one out there. You can cancel a commit simply by exiting your editor without saving. Now, back to adding auth information. While in your repository, create a new file named "hgrc", and add your information. Mine looks like this:

steve@ub:~/repos/test$ cat hgrc 
[paths]
default = https://spek@bitbucket.org/spek/test
[ui]
username = steveb <steveb@cpan.org>

The "default" directive under the [paths] category is the link to your repository on Bitbucket. Under the [ui] section, the "username" is the email address/account you signed up to Bitbucket with. I don't add anything further... I prefer to just type my password out manually when I need to. Once your 'hgrc' file is created, move it into the .hg directory:

steve@ub:~/repos/test$ mv hgrc .hg/

Now rerun your commit:

steve@ub:~/repos/test$ hg commit -m "-initial import"

Went off without a hitch. Committing saves your changes in a changeset in your local working copy. To push them to the master (in this case, Bitbucket), we use "push". Let's upload the local commits now:

steve@ub:~/repos/test$ hg push

Output:

pushing to https://spek@bitbucket.org/spek/test
searching for changes
http authorization required
realm: Bitbucket.org HTTP
user: spek
password: 
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
remote: bb/acl: spek is allowed. accepted payload.

Done. We added a file with "hg add", committed the changes via "hg commit", and uploaded the single changeset with "hg push". There's a problem though. My program was supposed to say hello to the universe, not just the world! Edit the test.pl file to print "Hello, universe!\n"; instead of "Hello, world!\n";, and then save the changes.

Now commit this update ("hg commit"), this time without the '-m' flag so it opens your editor. Add the following line in the commit message, and then save:

- replaced world with universe in print statement

Oh, man! I wanted to insert a comment saying what the print line is doing, but I forgot. Edit test.pl so it looks like this:

#!/usr/bin/perl

use warnings;
use strict;

# say "hi" to the universe
print "Hello, universe!\n";

Let's check the status again:

steve@ub:~/repos/test$ hg status
M test.pl

The 'M' prior to the filename signifies that we have a Modified file that hasn't been committed yet. Do that now:

steve@ub:~/repos/test$ hg commit -m "- added comment for print universe"

We committed two changes (which created two changesets), but these changes are local only. Let's push them up to our master repository:

steve@ub:~/repos/test$ hg push
pushing to https://spek@bitbucket.org/spek/test
searching for changes
http authorization required
realm: Bitbucket.org HTTP
user: spek
password: 
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 2 changesets with 2 changes to 1 files
remote: bb/acl: spek is allowed. accepted payload.

Notice this time the output found two changesets. This is because we committed two changes prior to pushing the first one. A rule of thumb is "commit early, commit often". I follow the same rule with push.

So, we have our program created, and it runs great. We have made changes, and saved these changes. Let's see the basics on viewing the changes we've made. "hg log" shows you a list with a brief set of details for all the changesets you've committed. They appear in reverse chronological order. The hexidecimal string next to the "N:" in the "changeset:" line represents the specific changeset. This is much more complicated than how I'm describing it, so we'll focus on these details in a later post.

steve@ub:~/repos/test$ hg log

changeset:   2:52eec25c7d12
tag:         tip
user:        steveb@cpan.org
date:        Fri Apr 13 19:30:58 2012 -0400
summary:     - added comment for print universe

changeset:   1:b558f5695e13
user:        steveb@cpan.org
date:        Fri Apr 13 19:29:09 2012 -0400
summary:     - replaced world with universe in print statement

changeset:   0:739f47eadd48
user:        steveb@cpan.org
date:        Fri Apr 13 19:14:08 2012 -0400
summary:     -initial import

The log is great for history, but what if we need to see more information... such as the list of files changed, and all the lines in the commit message as opposed to just the first? Adding the "-v" flag to 'hg log' will show you the files changed, as well as all the lines you added to your commit message. Here's an example from one of my real repositories:

steve@ub:~/devel/repos/devel-trace-method$ hg log -v | more

changeset:   13:2ca86cf74c83
user:        steveb 
date:        Sat Mar 03 10:54:36 2012 -0500
files:       Changes Makefile.PL README lib/Devel/Trace/Method.pm
description:
- 0.08 POD cleanup, Makefile.PL fix
- added meta section to Makefile.PL allowing us to tell CPAN
  that we use a different tracker than rt.cpan.org
- cleaned up POD so that LICENSE would appear correctly on
  CPAN

Within each commit, we can now see what files we changed, and the list of comments we made per changeset. What if we need to see the actual changes themselves? No problem... add the "-p" (patch) flag to 'hg log':

steve@ub:~/repos/test$ hg log -p

...wait! That lists ALL of our changesets (commits). That's too much information for what we want. I want to know about the last commit only right now. In Mercurial, we are currently working in the "tip" branch. Other *revision control systems may refer to this as HEAD. Let's check out the actual changes like we tried above, but only the most recent change. Again, the '-p' flag means "patch". The '-r' flag means "revision". We want to see the actual physical changes (-p) to the most recent revision (-r):

steve@ub:~/repos/test$ hg log -p -r tip

changeset:   2:52eec25c7d12
tag:         tip
user:        steveb@cpan.org
date:        Fri Apr 13 19:30:58 2012 -0400
summary:     - added comment for print universe

diff -r b558f5695e13 -r 52eec25c7d12 test.pl
--- a/test.pl Fri Apr 13 19:29:09 2012 -0400
+++ b/test.pl Fri Apr 13 19:30:58 2012 -0400
@@ -3,4 +3,5 @@
 use warnings;
 use strict;
 
+# # say "hi" to the universe
 print "Hello, universe!\n";

What if I want to see the more verbose changeset information (all files and all comments). Can I? Youbetcha!:

steve@ub:~/repos/test$ hg log -p -v

Here are a couple "howtos" regarding the "hg log" command. You can add the verbose (-v) and patch (-p) flags to either of these:

# view the first commit
steve@ub:~/repos/test$ hg log -r 0

# review the 1st and 3rd commit
steve@ub:~/repos/test$ hg log -r 0 -r 2

# review the most recent commit
steve@ub@:~/repos/test$ hg log -r tip

This tutorial series is primarily designed to describe the command-line usage of a DVCS application. The web-based display of the storage facility is outside the scope of this document, but it can be very handy. Here is what my online repo looks like after completing the examples in this post.

I'll end this post here. You've learnt the very basics on how to clone a Mercurial repository from your free Bitbucket account, how to commit changes into changesets, how to push the changesets back into the master repository, and how to do some basic review of the changes that you've made. In the next episode, we'll delve into how you can perform more advanced reviews of your changes, revert your working directory to a previous change, creating branches to manage different change tracks and an explanation and examples of how DVCS differs from non-distributed versioning systems. We'll also touch on the ".hgignore" file, which allows you to use "hg add" without adding files you don't want included.

Thanks for reading. If you've read any of my other posts, you know I appreciate all feedback, good and/or bad in either the comments below, or privately via email.

stevieb's tech musings

2012/04/13

Using a DVCS for your code and documents: Part 1

2 comments:

stevieb's

stevieb's