Steps to migrate a code repository from CVS to Git

CVS has served us well for many years. The interface in Eclipse is so good that it was really hard to justify switching to something else.

That day has come. I did this tutorial on a Mac from the command line. Git now ships with Mac OS, and cvs2git was available through the brew install system.
The same process should work from a Linux command line.

Requirements:

  • A Linux/Unix computer.
  • Install git http://git-scm.com/downloads
  • Install cvs2svn (it includes cvs2git) http://cvs2svn.tigris.org/cvs2git.html
  • Direct access to CVS repository files.
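
Before going further, it can help to confirm that both tools are actually on your PATH. This is a minimal sketch; the function name check_tools is made up for illustration.

```shell
#!/bin/sh
# Report whether each named tool is on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_tools git cvs2git || echo "Install the missing tools before continuing."
```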

Step 1: Make a copy of your raw CVS repository.
cvs2git needs the repository itself (the ,v history files), not a checked-out working copy.
For this example, it's on my desktop.

~/Desktop/cvs-local
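
One way to make that copy and sanity-check it is sketched below. The function name copy_cvs_repo is hypothetical, and the source path in the usage line is an assumption (it is the CVS root used in the deployment example later in this post); substitute wherever your repository actually lives.

```shell
#!/bin/sh
# Copy a raw CVS repository and confirm the copy contains RCS history files.
# Succeeds only if at least one ",v" file ends up in the destination.
copy_cvs_repo() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Copy the contents of src, preserving timestamps and permissions
    cp -Rp "$src"/. "$dest"/ || return 1
    # A raw repository keeps each file's history in a file ending in ",v";
    # a working copy has none of these
    count=$(find "$dest" -type f -name '*,v' | wc -l | tr -d ' ')
    echo "RCS files found: $count"
    [ "$count" -gt 0 ]
}

# Usage (paths are assumptions, adjust to your setup):
# copy_cvs_repo /home/cvs/devel "$HOME/Desktop/cvs-local"
```

If the count comes back as zero, you most likely copied a working copy rather than the repository.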

Step 2: Let cvs2git process the repository.

cvs2git --blobfile="$HOME/Desktop/blob.dat" --dumpfile="$HOME/Desktop/dump.dat" --username="dkilgo" ~/Desktop/cvs-local/my-sample-project

If your CVS repository is corrupt, you may receive errors from cvs2git like the following:

ERROR: A CVS repository cannot contain both ~/Desktop/cvs-local/my-sample-project/help/internal/compile/Release_text0.htm,v and ~/Desktop/cvs-local/my-sample-project/help/internal/compile/Attic/Release_text0.htm,v
ERROR: A CVS repository cannot contain both ~/Desktop/cvs-local/my-sample-project/help/internal/compile/Release_text1.htm,v and ~/Desktop/cvs-local/my-sample-project/help/internal/compile/Attic/Release_text1.htm,v

When a file is deleted in CVS, its history file moves into the Attic folder. It's not technically correct for both files to exist. You'll have to examine both files with a text editor, see which one has the highest revision number, and delete the older file.
All the times I encountered this error, the Attic version was older.
Once you've resolved all the errors, re-run the cvs2git command.
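
If there are many of these, a small script can list every conflicting pair up front. This is a rough sketch, not part of cvs2git; the function name find_attic_duplicates is made up. It relies on the "head" line near the top of each ,v file, which names the newest trunk revision, so treat its output as a hint and still look at the files before deleting anything.

```shell
#!/bin/sh
# List every ,v file that exists both in a directory and in that
# directory's Attic, followed by the "head" revision line of each copy.
find_attic_duplicates() {
    repo=$1
    find "$repo" -type d -name Attic | while IFS= read -r attic_dir; do
        parent=$(dirname "$attic_dir")
        for attic_file in "$attic_dir"/*,v; do
            [ -f "$attic_file" ] || continue   # glob matched nothing
            live_file="$parent/$(basename "$attic_file")"
            if [ -f "$live_file" ]; then
                printf '%s\n' "$live_file"
                # Prints "path:head<TAB>revision;" for each of the two files
                grep -m 1 '^head' "$live_file" "$attic_file"
            fi
        done
    done
}

# Usage:
# find_attic_duplicates "$HOME/Desktop/cvs-local/my-sample-project"
```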

A successful run will provide something like the following output:

Time for pass1 (CollectRevsPass): 5.119 seconds.
----- pass 2 (CleanMetadataPass) -----
Converting metadata to UTF8...
Done
Time for pass2 (CleanMetadataPass): 0.026 seconds.
----- pass 3 (CollateSymbolsPass) -----
Checking for forced tags with commits...
Done
Time for pass3 (CollateSymbolsPass): 0.011 seconds.
----- pass 4 (FilterSymbolsPass) -----
Filtering out excluded symbols and summarizing items...
Done
Time for pass4 (FilterSymbolsPass): 74.65 seconds.
----- pass 5 (SortRevisionsPass) -----
Sorting CVS revision summaries...
Done
Time for pass5 (SortRevisionsPass): 0.179 seconds.
----- pass 6 (SortSymbolsPass) -----
Sorting CVS symbol summaries...
Done
Time for pass6 (SortSymbolsPass): 0.110 seconds.
----- pass 7 (InitializeChangesetsPass) -----
Creating preliminary commit sets...
Done
Time for pass7 (InitializeChangesetsPass): 1.975 seconds.
----- pass 8 (BreakRevisionChangesetCyclesPass) -----
Breaking revision changeset dependency cycles...
Done
Time for pass8 (BreakRevisionChangesetCyclesPass): 0.727 seconds.
----- pass 9 (RevisionTopologicalSortPass) -----
Generating CVSRevisions in commit order...
Done
Time for pass9 (RevisionTopologicalSortPass): 0.547 seconds.
----- pass 10 (BreakSymbolChangesetCyclesPass) -----
Breaking symbol changeset dependency cycles...
Done
Time for pass10 (BreakSymbolChangesetCyclesPass): 1.134 seconds.
----- pass 11 (BreakAllChangesetCyclesPass) -----
Breaking CVSSymbol dependency loops...
Done
Time for pass11 (BreakAllChangesetCyclesPass): 1.681 seconds.
----- pass 12 (TopologicalSortPass) -----
Generating CVSRevisions in commit order...
Done
Time for pass12 (TopologicalSortPass): 1.520 seconds.
----- pass 13 (CreateRevsPass) -----
Mapping CVS revisions to Subversion commits...
Creating Subversion r1 (Project initialization)
Creating Subversion r2 (commit)
Creating Subversion r3 (commit)
Creating Subversion r4 (commit)



Done
Time for pass13 (CreateRevsPass): 1.538 seconds.
----- pass 14 (SortSymbolOpeningsClosingsPass) -----
Sorting symbolic name source revisions...
Done
Time for pass14 (SortSymbolOpeningsClosingsPass): 0.854 seconds.
----- pass 15 (IndexSymbolsPass) -----
Determining offsets for all symbolic names...
Done.
Time for pass15 (IndexSymbolsPass): 0.373 seconds.
----- pass 16 (OutputPass) -----
Time for pass16 (OutputPass): 1.455 seconds.

cvs2svn Statistics:
------------------
Total CVS Files:              1417
Total CVS Revisions:          2982
Total CVS Branches:           1482
Total CVS Tags:              36376
Total Unique Tags:              59
Total Unique Branches:           2
CVS Repos Size in KB:        25792
Total SVN Commits:             579
First Revision Date:    Thu May 17 08:25:14 2007
Last Revision Date:     Fri Aug 17 11:13:14 2013
------------------
Timings (seconds):
------------------
 5.12   pass1    CollectRevsPass
 0.03   pass2    CleanMetadataPass
 0.01   pass3    CollateSymbolsPass
74.65   pass4    FilterSymbolsPass
 0.18   pass5    SortRevisionsPass
 0.11   pass6    SortSymbolsPass
 1.97   pass7    InitializeChangesetsPass
 0.73   pass8    BreakRevisionChangesetCyclesPass
 0.55   pass9    RevisionTopologicalSortPass
 1.13   pass10   BreakSymbolChangesetCyclesPass
 1.68   pass11   BreakAllChangesetCyclesPass
 1.52   pass12   TopologicalSortPass
 1.54   pass13   CreateRevsPass
 0.85   pass14   SortSymbolOpeningsClosingsPass
 0.37   pass15   IndexSymbolsPass
 1.46   pass16   OutputPass
91.90   total

Step 3: Create a git repository and import the data into it.

mkdir sample-git
cd sample-git
git init
cat ../blob.dat ../dump.dat | git fast-import

This produces output like the following:

git-fast-import statistics:
---------------------------------------------------------------------
Alloc'd objects:       5000
Total objects:         3057 (      1285 duplicates                  )
      blobs  :         1754 (      1270 duplicates       1254 deltas of       1668 attempts)
      trees  :          956 (        15 duplicates        758 deltas of        892 attempts)
      commits:          347 (         0 duplicates          0 deltas of          0 attempts)
      tags   :            0 (         0 duplicates          0 deltas of          0 attempts)
Total branches:          44 (         6 loads     )
      marks:     1073741824 (      3371 unique    )
      atoms:            997
Memory total:          2594 KiB
       pools:          2360 KiB
     objects:           234 KiB
---------------------------------------------------------------------
pack_report: getpagesize()            =       4096
pack_report: core.packedGitWindowSize = 1073741824
pack_report: core.packedGitLimit      = 8589934592
pack_report: pack_used_ctr            =        802
pack_report: pack_mmap_calls          =         41
pack_report: pack_open_windows        =          1 /          1
pack_report: pack_mapped              =   31753114 /   31753114
---------------------------------------------------------------------

Have a look in the git repo directory and you'll find it's empty. Where's all your stuff?
It's in the .git directory. You don't have a working copy checked out yet.

Run "git checkout" to get a working copy of your repo.

Now you have a functional git repository with all your CVS comments, tags and branches.
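
To check that claim against the cvs2git statistics, you can count what actually landed in the new repository. A minimal sketch follows; the function name summarize_repo is hypothetical. The numbers won't necessarily match the CVS totals one for one, since cvs2git groups per-file revisions into commits, but a zero anywhere is a sign something went wrong.

```shell
#!/bin/sh
# Print how many commits, branches, and tags a git repository contains.
summarize_repo() {
    repo=$1
    # Commits reachable from any ref
    commits=$(git -C "$repo" rev-list --all --count)
    # One line per ref under refs/heads and refs/tags respectively
    branches=$(git -C "$repo" for-each-ref refs/heads | wc -l | tr -d ' ')
    tags=$(git -C "$repo" for-each-ref refs/tags | wc -l | tr -d ' ')
    echo "commits=$commits branches=$branches tags=$tags"
}

# Usage:
# summarize_repo "$HOME/Desktop/sample-git"
```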

Side note:

Changing to git meant updating our deployment scripts as well.
The old way with CVS looked like this:

cvs -d /home/cvs/devel -q export -r $tag -d $stageDir/sample-project sample-project

Git has a little more overhead in that by default the cloned repo contains all the branches and commits, not just the tagged/current version.
That means what it downloads is usually a bit bigger than what would be picked up by the CVS export.
To work around that, you can pass --depth 1, which makes a shallow clone holding only the most recent commit of the branch or tag you ask for.

/usr/local/bin/git clone --depth 1 --branch $tag git\@myserver.mydomain.com:sample-project.git $stageDir/sample-project/

I always remove the revision control files before deployment. It's a pain with CVS and SVN. With git, it's a snap because they're all in the same folder.

rm -f -r $stageDir/sample-project/.git
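
The clone and the cleanup can be wrapped into one step that behaves roughly like the old cvs export. This is only a sketch of the idea, not our actual deployment script; the function name export_tag and its argument order are made up for illustration.

```shell
#!/bin/sh
# Export a single tag (or branch) of a repository into dest,
# leaving no .git directory behind.
export_tag() {
    repo_url=$1
    tag=$2
    dest=$3
    # Shallow clone: only the most recent commit of the requested ref
    git clone --quiet --depth 1 --branch "$tag" "$repo_url" "$dest" || return 1
    # All of git's bookkeeping lives in this one directory
    rm -rf "$dest/.git"
}

# Usage (server name taken from the example above):
# export_tag git@myserver.mydomain.com:sample-project.git "$tag" "$stageDir/sample-project"
```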

 

derak

 
