Using Git as a "Poor Man's" Time Machine

Audience: Doesn’t cry when installing new software and using the command line, interested in reading a long rambling post about Garry Winogrand, Chuck Norris and, eventually, Git.

In recent versions of Apple’s OS X there is a feature called “Time Machine”. In short, it lets you step back in time, revisit your file system in previous states, and copy files from those earlier times. “Time Machine” is a simplified GUI over periodic snapshots of your files, similar in spirit to the snapshot features of advanced filesystems such as Sun’s ZFS. In this series of articles, we will attempt to mimic some of these features in our own creation, powered by Git.

Note: These directions are a little Windows-centric, though it should be easy to adapt them to any operating system.

It Finally Happened

So the day has finally come. All those hours and hours of blood, sweat and tears went into the All Important Spreadsheet, or perhaps it was a colossal document of your famous stamp collection. Maybe it was your latest digital masterpiece, a homage to Garry Winogrand’s “Park Avenue, New York” done entirely in MS Paint?

You’ve checked once…twice…three times and you are finally coming to grips with the fact that your file is missing. Well, the hope is that it’s just missing; maybe it was misplaced or accidentally saved to a different folder. After a frantic search, nothing has turned up.

After the panic subsides, you start thinking of what to do next. After searching in other folders, the next logical step would be to retrieve it from yesterday’s backup (you are backing up, right?). So you pull out yesterday’s backup and realize you have just lost hours of productive time. Not only have you spent all morning searching for the file, now you have to somehow recreate all the work that was lost. There has to be a better way!

You Silly Git

Well, it’s your lucky day: über hacker Linus Torvalds already invented it, for fun, while beating Bruce Schneier at chess and punching Chuck Norris in the face. It’s called Git.

The Playschool definition of Git is as follows:

Super-duper undo for files - with cheatcodes. With its amazing branching, merging and other hoopla it is like the [Konami Code](http://en.wikipedia.org/wiki/Konami_Code) crossed with [Portal](http://www.youtube.com/watch?v=iFhPFSjNovA&feature=related) crossed with [Your Mom](http://beltespenner.com/oscommerce/images/i%20love%20your%20mom.jpg) (because she really does try to do what’s best for you even if you don’t understand it at the time).

The Git Website says the following:

> ## Git is...
>
> Git is a **free & open source, distributed version control system** designed to handle everything from small to very large projects with speed and efficiency. **Every Git clone is a full-fledged repository** with complete history and full revision tracking capabilities, not dependent on network access or a central server. **Branching and merging are fast** and easy to do.

Say What?

The short of it is this: Git (and other distributed revision control software) lets you record changes to your files in a meaningful way (you get to tag the changes with a name). Then you can arbitrarily pick and choose, roll back and forward, and branch and merge these changes. We are going to take the baby steps necessary to get you up and running with an “automatic” record (or commit) instead of the “real” way, which is to record these changes as you go.

This is not the intended use for Git. Git was created to be used by somewhat technical people, working independently, with text files. We will be abusing it by using it with/for non-technical people (well, you might be a smarty pants, but we are using a very dumb approach of “fire and forget” for the commits).

Also, we will be recording the pool (repository) of everyone’s work. Normally each person would have their own copy of all the files to work on, and when they are done working, everyone’s changes are merged into one repository.

Lastly, Git (and most other versioning software) is made to work on text files. “Why?” you may ask. Because diffing (finding the differences between two versions of the same file) text files is easy peasy, while diffing binary files (images, Word documents, programs, etc.) is hard stuff. Every type of binary file has its own format, which means whoever invented the format had their own vision of how the bits should be ordered and what they really mean.

In some binary formats, to be more efficient, everything gets rewritten, not just the stuff that changed. For example, say you are working on an image and remove the background. If you compare the images side by side, it is obvious to you what has changed. But all the computer knows is bits, and half of the file may have changed for more efficient storage. But I digress…
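To make the text-versus-binary point concrete, here is a small sketch you can run in Git Bash (or any POSIX shell). The scratch folder, the file names and the demo identity are all made up for illustration. Git decides a file is binary when it spots a NUL byte near the start, so the fake “picture” below is just a few bytes with a NUL in the middle.

```shell
#!/bin/sh
# Sketch: compare how Git reports a change to a text file vs. a binary file.
# The folder, file names and identity below are invented for this demo.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
# A throwaway identity so the commit works on a machine with no Git setup.
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'one\ntwo\n' > notes.txt       # plain text
printf 'abc\000def' > picture.bin     # contains a NUL byte, so Git treats it as binary
git add .
git commit -q -m "First snapshot"

printf 'one\nTWO\n' > notes.txt       # change one line of text
printf 'abc\000xyz' > picture.bin     # change a few bytes of the binary file

git diff notes.txt      # shows the old line with "-" and the new line with "+"
git diff picture.bin    # only reports that the two binary versions differ
```

Git still stores every version of the binary file, so you can get any of them back. It just cannot tell you in a readable way what changed between them.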

The point is that even though we are using Git in a way its creator hadn’t intended, it is flexible enough for the job. This is a testament to the philosophy of doing one thing and doing it well. Our project today is not an end-all-be-all; it is merely a stop-gap for situations where you don’t have regular, consistent backups and/or you want some of the benefits of version control.

Enough Already, Let’s Get To It

You’re going to hate me. Only a little though.

In the amount of time you spent reading the above drivel you could have already implemented our little project. Well, we laughed, we cried, good times…

Anyway on to it! On to…

Building a “Poor Man’s Time Machine” with Git

It’s not really the worst ever but it’s no Tardis. We will be able to move backward in time, kinda forward-ish, depending on your perspective, and sideways as well. We are mostly going to be concerned with the preventing-the-JFK-assassination-and-returning-to-the-present-day-with-nothing-else-changed rather than the Bill-and-Ted-travel-back-in-time-and-totally-screw-with-the-present-errm-future-err-whatever-dude.

When everything is said and done you will be able to: see what files have been added or changed in the previous day (or hour, or whatever interval you choose) and be able to arbitrarily grab any previous version of any file.

What you need to download (assuming you are running Windows)

What you need to know

What else

Time Machine Go

Okay, so install everything, I’ll wait…

Good, now let’s tell Git what folder to work on and “initialize” it. And no, don’t worry, “initialize” has nothing to do with “erase”. The easiest way to show you is from the command line, so roll up your sleeves.

During our example we will be using a folder called “SharedFolder”. This will represent the common file-share on our hypothetical server.

C:\>cd SharedFolder

C:\SharedFolder>dir
 Volume in drive C has no label.
 Volume Serial Number is 18A3-D0C5

 Directory of C:\SharedFolder

02/13/2010  12:12 PM    <dir>          .
02/13/2010  12:12 PM    <dir>          ..
02/13/2010  12:02 PM                 5 FileOne.txt
02/13/2010  12:02 PM                 5 FileTwo.txt
               2 File(s)             10 bytes
               2 Dir(s)     984,203,264 bytes free

C:\SharedFolder>git status
fatal: Not a git repository (or any of the parent directories): .git

C:\SharedFolder>git init
Initialized empty Git repository in C:/SharedFolder/.git/

C:\SharedFolder>git status
# On branch master
#
# Initial commit
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       FileOne.txt
#       FileTwo.txt
nothing added to commit but untracked files present (use "git add" to track)

In this example we have initialized the directory (telling git this is where we want to work). Since we have not added any files yet, we have not told git to actually track them. So now we will do just that.

C:\SharedFolder>git add FileOne.txt

C:\SharedFolder>git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#       new file:   FileOne.txt
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       FileTwo.txt

C:\SharedFolder>git add .

C:\SharedFolder>git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#       new file:   FileOne.txt
#       new file:   FileTwo.txt
#

As you have seen, we can use “git add” to be picky about what files we include. In advanced usage you can even tell git what part of which files to include. The last command, “git add .”, is a shortcut telling git to add everything in the current folder: every new file that is not already being tracked, plus any changes to files that are.

Lastly, we are going to commit our changes to git. In effect, this is creating a checkpoint within git. Now any time in the future we can roll back to exactly this state, regardless of how many changes we have made, even if we have deleted the files entirely. As long as the hidden “.git” directory is there, all our changes are there too.

C:\SharedFolder>git commit -am "Initial Commit"
[master (root-commit) 436162e] Initial Commit
 2 files changed, 2 insertions(+), 0 deletions(-)
 create mode 100644 FileOne.txt
 create mode 100644 FileTwo.txt

C:\SharedFolder>git status
# On branch master
nothing to commit (working directory clean)

C:\SharedFolder>git log
commit 436162e50d2075366634064793ef7ef8051da871
Author: unknown <root@.(none)>
Date:   Sat Feb 13 12:12:41 2010 -0600

    Initial Commit

C:\SharedFolder>
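Two things from the transcript above are worth trying for yourself. First, the `Author: unknown <root@.(none)>` line shows up because Git was never told who is committing; newer versions of Git may refuse to commit at all until a name and email are set. Second, the claim that a committed file can be brought back after it is deleted is easy to check. The following is a minimal sketch for Git Bash or any POSIX shell; the scratch folder stands in for `C:\SharedFolder`, and the name and email are placeholders.

```shell
#!/bin/sh
# Sketch: commit a checkpoint, lose a file, and get it back.
# The temporary folder stands in for C:\SharedFolder; the identity is a placeholder.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Tell Git who is committing (stored in this repository's .git/config only).
git config user.name "Office Backup"
git config user.email "backup@example.com"

echo "one" > FileOne.txt
echo "two" > FileTwo.txt
git add .
git commit -q -m "Initial Commit"

rm FileOne.txt                    # the "missing file" moment
git status --short                # lists FileOne.txt as deleted

git checkout -- FileOne.txt       # restore the committed version
cat FileOne.txt                   # the original contents are back
git log -1 --format='%an <%ae>'   # the author recorded on the commit
```

Using `git config` without `--global` keeps the identity local to this one repository, which seems a reasonable fit for a shared folder that several people write to.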

Implementation

Like the flux capacitor in Dr. Brown’s DeLorean, git is doing most of the work in our little “time machine”. Since all of the hard work has already been done, there is only a small script we need to write to “steer” git.

rem Switch drive and folder (/d), stage everything, then record a snapshot
cd /d C:\SharedFolder
git add . && git commit -am "Daily Update"

Yep, that’s really all there is to it.

So go ahead and save this code as a batch file. Test it a few times. After you run it you should be able to do a “git log” and see a new revision (assuming that changes have been made).
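One wrinkle you may notice while testing: when nothing has changed, `git commit` reports that there is nothing to commit and exits with an error status, which can make a scheduled job look like it failed. Below is a sketch of a slightly more careful variant, written as a POSIX shell script for Git Bash rather than as a batch file. The `snapshot` function name, the scratch folder and the identity are all hypothetical; point the script at your own shared folder instead.

```shell
#!/bin/sh
# Sketch of a snapshot script that skips the commit when nothing changed.
# The scratch folder, identity and function name are made up for this demo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Snapshot Script"
git config user.email "snapshot@example.com"

snapshot() {
    git add -A                          # stage new, changed, and deleted files
    if git diff --cached --quiet; then  # exit status 0 means nothing is staged
        echo "No changes to record"
    else
        git commit -q -m "Snapshot $(date '+%Y-%m-%d %H:%M')"
        echo "Snapshot recorded"
    fi
}

echo "draft" > Report.txt
snapshot        # records the new file
snapshot        # nothing changed, so no commit is made
git log --oneline
```

Putting the date in the commit message is optional, since Git records a timestamp on every commit anyway, but it makes the one-line log easier to scan.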

At this point all that is left to do is to set up your batch file as a scheduled task and wait…

Review

Further Reading

Coming Soon…

In part two we will be discussing exactly what you can do with this wonderful contraption we have built. We will learn how to compare changes, see new files that have been added, and bring back old files that have been deleted or changed. See you…in the future.