Thursday, August 17, 2006

Word Document management using SVN

Setting Up a Microsoft Word Document Management System
using SVN and TortoiseSVN

Any form of writing is an iterative refinement of thoughts, language, and vocabulary. But Microsoft Word, one of the most popular editors in the Windows world, does little to facilitate this iterative refinement. Although Word provides a "review" feature that keeps track of changes, using it makes the document cluttered and messy. Further, Word offers no convenient way to keep multiple versions of a document and compare, merge, or identify the differences between them.

Managing multiple versions of software code (written in Java, C, etc.) is a well-established practice in the software development world. One tool that facilitates this is SVN (Subversion). Yesterday, after getting frustrated with Word's limited ability to manage multiple versions of a document, I thought of using SVN. After Googling and searching blogs, I found that SVN together with TortoiseSVN can be used to set up a free, home-made document management system for Word documents. To give you an idea of the power of this free document management system, below is an image of Microsoft Word (2007) showing different versions of the same document. Notice "new addition" in the title.

The image shows:
1. The current document on which I am working (lower right panel)
2. The previous version of the same document (upper right panel)
3. Changes in the document, with new additions shown in red (middle panel)
4. Only the changes (left panel)

Step-by-step instructions to set up the Microsoft Word document management system
1. Download the latest SVN.
2. Install SVN with the default settings.
3. Download TortoiseSVN.
4. Install TortoiseSVN with the default settings.
5. After restarting the computer, create a folder named SVN on the C:\ drive, i.e. C:\SVN.
6. Right-click the "SVN" folder and select TortoiseSVN > Create Repository here (see fig 2). Select the default Native filesystem (FSFS) and click OK. This creates the repository where SVN stores all versions.

7. Now import the files. Right-click the folder that contains the files and select TortoiseSVN > Import. In the popup box, point "URL of repository" at the repository created in step 6 (C:\SVN, written as the URL file:///C:/SVN) and click OK.
8. Now create another folder, "svn_doc", wherever you want. This folder will hold a checked-out copy of all the files. Right-click the folder and select "SVN Checkout" from the context menu.
9. All the files that were imported should now be available in this folder.
10. From now on, always work on the files in this new folder. Whenever you edit a file, commit the changes to SVN by right-clicking the file and selecting "SVN Commit". To see the difference between two versions, right-click the file and select TortoiseSVN > Diff. Microsoft Word should start and show the current file, the last saved version, and the differences, as shown in fig 1.
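
For readers who prefer the command line, steps 6 to 10 roughly correspond to the Subversion commands assembled below. This is only a sketch under a few assumptions: the svn and svnadmin command-line tools are installed and on the PATH, and the folders C:\Docs (the documents to import) and C:\svn_doc (the working copy) are made-up names for illustration. The script only builds and prints the commands; it does not run them. The side-by-side comparison in Word comes from TortoiseSVN's own diff scripts, so it has no counterpart here.

```python
from __future__ import annotations

import subprocess
from pathlib import PureWindowsPath


def repo_url(repo_dir: str) -> str:
    """Turn a Windows folder such as C:\\SVN into the file:// URL svn expects."""
    return PureWindowsPath(repo_dir).as_uri()


def setup_commands(repo_dir: str, source_dir: str, working_dir: str) -> list[list[str]]:
    """Command-line counterparts of steps 6 to 8: create, import, check out."""
    url = repo_url(repo_dir)
    return [
        # Step 6: create the repository that stores every version.
        ["svnadmin", "create", repo_dir],
        # Step 7: import the existing documents into the repository.
        ["svn", "import", source_dir, url, "-m", "Initial import of documents"],
        # Step 8: check out a working copy to edit from now on.
        ["svn", "checkout", url, working_dir],
    ]


def commit_command(document: str, message: str) -> list[str]:
    """Counterpart of step 10: commit an edited file with a log message."""
    return ["svn", "commit", document, "-m", message]


if __name__ == "__main__":
    # C:\Docs and C:\svn_doc are hypothetical folder names.
    commands = setup_commands(r"C:\SVN", r"C:\Docs", r"C:\svn_doc")
    commands.append(commit_command("essay.docx", "Reworked the introduction"))
    for cmd in commands:
        # list2cmdline quotes arguments the way the Windows command prompt expects.
        print(subprocess.list2cmdline(cmd))
```

Each command is kept as a list of arguments rather than one string, so it could later be passed to subprocess.run unchanged if you decide to execute it.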

This step is required only for Microsoft Office 2007
1. To be able to see the differences in *.docx files, right-click any file and select TortoiseSVN > Settings. In the tree view, select Diff Viewer and click the "Advanced" button. Select the extension .doc and click Edit. Change ".doc" to ".docx". Alternatively, you can create a new extension ".docx" and copy the diff viewer location from ".doc".

Enjoy your free document management system.


Abhishek said...

This is damn cool, could have used it while I was editing my MBA essays.

Anonymous said...

Shouldn't it be docx instead of docs?

Roachy said...

Fantastic :)

Great post

iyusaf said...

This is useful; thanks.

Anonymous said...

This is a great idea, but it does not work for me (docs or docx).

Anonymous said...

Yes. I actually implemented that with Word 2007

Thomas said...

This is a very cool idea.

Concerning .docx:
- use .docx instead of .docs
- switch %base and %mine (if you don't, the changed and original versions are swapped)

Nathan VanHoudnos said...

If you want to just version your Documents folder, instead of checking out svn_doc, you can do an in place import. This is talked about here:

I've written up some sloppy, though hopefully helpful, directions for doing it TortoiseSVN-style below. Disclaimer: my directions are for Vista, adapt as needed, YMMV.

After step 6, do the following.

7. Go to c:/users/vanhoudn and right mouse click on the Documents folder. Choose TortoiseSVN -> Repo Browser.

8. Select file:///C:/svn (or wherever you put your repo), hit OK.

9. In the left hand pane, right mouse click "file:///C:/svn" and choose "Create Folder". Type in "Documents", hit okay, and close out of the Repo Browser. (This needs to be the *exact same* name as the directory you want to do the in place import for. E.g. for WinXP "My Documents")

10. Go back to c:/users/ in explorer, right mouse click on the vanhoudn directory, and choose SVN Checkout. Click OK, and click okay to the warning about the directory being non-empty. What this will do is checkout the empty Documents directory on top of your full Documents directory. Since svn is smart, it won't blow away your files.

11. You will notice at this point that c:/users/vanhoudn/Documents is under source control (it's got a little green checkmark on it). However, none of the files have been added to the repository. You can either add them one-at-a-time, or en masse. If you choose to add them one-at-a-time, you have to do it for all of the parent directories. To do it en masse, simply select all the files, right mouse click on your selection, choose TortoiseSVN -> Add, and click OK. That will *mark them* for addition. You're not done yet.

12. To finish, commit your changes. Right mouse click on c:/users/vanhoudn/Documents/, choose SVN Commit. For your import message, say something like "Initial import of Documents folder."

Tiago Franco said...


I already do document management with SVN, but I'm having trouble creating unique reference numbers for each document. SVN, unlike document management systems, can't be used to create unique document reference numbers.

Any idea?

Tiketiketik said...

This post is getting old but it helped me three years later.

To update some of the instructions, you don't need to install SVN. Just install TortoiseSVN and you should have all you need.

Today, TortoiseSVN has built-in support for Office 2007, so you don't need to worry about the "docx" extension. It works out of the box.

Anonymous said...

The Word diff function combined with Tortoise integration is jaw-dropping.
MS should package Tortoise with Office ;)

Anonymous said...

Indeed, I just learned that TortoiseSVN, which we use a lot, can do the word-diff thing... it's great!

Does anyone know how to include SVN version numbers INTO the word document, so that you can see the version of the doc in the printout?

tst said...

Hi, just one question about the screenshot: is the triple-window layout shown for the Word document comparison the default view after starting the SVN diff? Or do you have to do some additional operations to achieve that?

Anonymous said...

Version 1.7.12 of TortoiseSVN has a bug in the script "c:\Program Files\TortoiseSVN\Diff-Scripts\diff-doc.js". The constant vOffice2013 isn't defined.

Inserting "var vOffice2013 = 13;" at line 25 will fix it.
