Moving from gitolite to gogs

Problem: You have a ton of git repositories in gitolite and you’d like to switch to the GitHub-esque GUI provided by Gogs.

Gogs is super easy to get set up and has thoughtfully added tools which make it useful for a private intranet setup. It does not, however, provide a great way of mass-importing git repositories from another tool.

The web interface includes a “migration” tool, but it only handles one repository at a time. I had 150 git repos to migrate, so I added one repo through the UI and then poked at the database to see what changed.

Gogs also uses bare repositories, just like gitolite. Loading them into Gogs is as easy as rsync’ing them into the gogs-repo directory and adding some rows to the Gogs database so Gogs knows how to administer them.

I made a script to help me with that task. Note that I’m using MySQL as my database and my repos default to private.
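The approach boils down to copying each bare repo into place and generating one INSERT per repo for the Gogs database. Here’s a sketch of that idea; the directory paths, the owner id, and the repository table/column names (owner_id, lower_name, name, is_private) are assumptions you should verify against your own Gogs version and database before running anything:

```shell
#!/usr/bin/env bash
# Sketch only: copy each bare gitolite repo into the gogs repo directory
# and emit one INSERT per repo for the gogs database. The table/column
# names (repository, owner_id, lower_name, name, is_private) and the
# directory layout are assumptions -- check them against your gogs
# install before piping anything into mysql.
gogs_import_sql() {
    local src="$1" dst="$2" owner_id="$3" repo name
    for repo in "$src"/*.git; do
        name="$(basename "$repo" .git)"
        cp -a "$repo" "$dst/"   # gogs stores bare repos, same as gitolite
        printf "INSERT INTO repository (owner_id, lower_name, name, is_private) VALUES (%s, '%s', '%s', 1);\n" \
            "$owner_id" "$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')" "$name"
    done
}

# Example: review the SQL first, then pipe it into the gogs database:
# gogs_import_sql /home/git/repositories /home/gogs/gogs-repositories/admin 1 | mysql gogs
```

is_private is hard-coded to 1 to match my private-by-default setup; drop it or change it if your repos are public.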

With this, it took about a second to import all those repos. I did find one other person who had batch-imported repos and chose to do it with curl, but I couldn’t get that to work. All in all, it took about 4 days to figure out how to get Gogs set up and get all the repos into it. Hopefully this script makes that process much quicker for you.

Using EFF’s CertBot with Apache 2.2 and CentOS 6

EFF has created a wonderful tool called CertBot to automate the retrieval and installation of Let’s Encrypt certificates. The documentation is really good, but it did require a little trial and error to get things working. Here’s a walkthrough with some of the gotchas I encountered.

Why do I need CertBot?

Let’s Encrypt certificates are only valid for 90 days, and the verification and renewal process is tedious to complete manually. CertBot automates retrieving and renewing certificates, it has a built-in web server which can stand in during the verification step, and it can modify your web server config and install the certificates if you happen to be using nginx or Apache 2.4.


Before you start, you should apply all software updates for your OS and restart. In my case, one of the updates overwrote a config file I had customized, which made the web server fail to start.

I was using CentOS 6, which doesn’t have a built-in package for CertBot. EFF also provides a version called “certbot-auto” which can install its own dependencies.

Here’s how to install that:

[email protected]:~$ wget
[email protected]:~$ chmod a=rx ./certbot-auto
[email protected]:~$ ./certbot-auto --help

I’d also recommend moving “certbot-auto” into “/usr/local/bin” so it’s available to other users or cron scripts.

[email protected]:~$ mv ./certbot-auto /usr/local/bin/certbot


The docs are great. Have a look.

My site redirects all traffic to HTTPS, which breaks the verification process because it needs to happen over HTTP. The “standalone” mode solves this: you stop the web server, and certbot then answers on port 80 to complete the registration process. The CLI also accepts pre- and post-run “hooks” which can be used to stop and start your web server automatically to minimize downtime.

For my domain this looked something like:

[email protected]:~$ certbot certonly --standalone -d --pre-hook "service httpd stop" --post-hook "service httpd start"

If your setup includes both HTTP and HTTPS, you can use Apache to serve the verification files.

sudo certbot-auto certonly --webroot -w /var/www/vhosts/default/html -d -d

You can include up to 10 domains on each certificate.

Where does your certificate get stored?

Good question. On CentOS, certbot creates a /etc/letsencrypt folder. Each certificate you generate gets a folder under “live”. The commands above created a /etc/letsencrypt/live/ folder with your certificate files.

Here’s what you’ll need to add to your httpd.conf or virtual host to use this new certificate.

#Using letsencrypt certificates.
SSLCertificateFile /etc/letsencrypt/live/
SSLCertificateKeyFile /etc/letsencrypt/live/
SSLCertificateChainFile /etc/letsencrypt/live/
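For reference, certbot drops four files into each live folder: cert.pem, privkey.pem, chain.pem and fullchain.pem. With a hypothetical domain of example.com, the directives would look like:

```apache
# "example.com" is a placeholder -- substitute your own live folder name.
SSLCertificateFile      /etc/letsencrypt/live/example.com/cert.pem
SSLCertificateKeyFile   /etc/letsencrypt/live/example.com/privkey.pem
SSLCertificateChainFile /etc/letsencrypt/live/example.com/chain.pem
```

On Apache 2.4.8+ you’d instead point SSLCertificateFile at fullchain.pem, but on 2.2 the three-directive form above is the one to use.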


The last thing to set up is a cron task to automate renewal of your certificate(s).
“certbot renew -n” attempts to renew any certificates expiring in less than 30 days. The -n flag makes it non-interactive, which is ideal for a cron task.
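A crontab entry along these lines takes care of it; the schedule is an arbitrary choice, and the hooks mirror the standalone registration above:

```crontab
30 3 * * * /usr/local/bin/certbot renew -n --pre-hook "service httpd stop" --post-hook "service httpd start"
```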
And that’s it. You’re all set up to take advantage of a Let’s Encrypt certificate with CentOS and Apache 2.2.
It was much easier than I thought, and after a lot of poking through the documentation I was able to piece together what needed to be done. Hopefully this saves you a few hours and a couple of server alerts.

Just one more thing. Ever heard of CAA?

One more wrinkle here is CAA checking. These freely available certificates mean that it’s not that hard to create a certificate which appears valid for a popular website. A CAA record is a DNS record which tells certificate authorities which of them are allowed to issue certificates for a given domain. All certificate authorities, Let’s Encrypt included, are required to check CAA records as of September 2017. Because this requires a new DNS record type, not every host or registrar is ready for it. I’m using Namecheap for my .site address and it’s currently unsupported there. Check whether your domain registrar, or whoever is providing your DNS service, supports CAA records. If they don’t, you might have to choose a new DNS provider if you want to continue using Let’s Encrypt certificates.
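If your provider does support them, the record itself is tiny. For a hypothetical domain example.com, authorizing only Let’s Encrypt to issue certificates looks like this in zone-file form:

```zone
example.com.    IN    CAA    0 issue "letsencrypt.org"
```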

mod_mime + php = hacked site.

So I learned something about mod_mime today that made my jaw drop.

The default way of telling apache to parse a php file looks something like this:

AddHandler php5-script .php

If you install PHP via the command line on RHEL 4, 5 or 6, this is how it sets it up.

What I didn't know is that mod_mime matches the extension (.php) anywhere in the file name, not just at the end.
So test.php, test.php.csv or test.php.jpg would all be passed to the PHP handler and executed.


That's a big deal when your application accepts file uploads and only type-checks the last file extension.
Magento, ExpressionEngine, WordPress, etc…
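To make the trap concrete, here’s a sketch (the filename is hypothetical): a last-extension check sees “jpg” and waves the upload through, yet the .php in the middle of the name still matches the AddHandler rule.

```shell
# A naive upload filter that looks only at the final extension
# accepts this name, yet "AddHandler php5-script .php" would still
# route it to PHP because ".php" matches anywhere in the filename.
name="avatar.php.jpg"      # hypothetical uploaded filename
ext="${name##*.}"          # -> "jpg", so a jpg/png whitelist passes it
echo "last extension: $ext"
case "$name" in
  *.php|*.php.*) echo "mod_mime would still hand this to PHP" ;;
esac
```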

The workaround is to apply the PHP handler only to files which end in ".php":

<FilesMatch "\.php$">
    SetHandler php5-script
</FilesMatch>

And for a little extra 'security', disable php for a directory if you're accepting uploads.

<Directory "/var/www/html/example/uploads">
    php_flag engine off
</Directory>

Which I'm going to go back and change on any server I've ever set up.
I learned this tidbit from a security advisory from Magento.

UPDATE: You may also see a lot of folks who recommend turning on "open_basedir" in PHP to lock things down.
There is a caveat there too. When "open_basedir" is in use, PHP disables the realpath cache.
This makes loading/including files very slow.

Steps to migrate a code repository from CVS to GIT

CVS has served us well for many years. The interface in Eclipse is so good it was really hard to justify switching to something else.

That day has come. This tutorial was done on a Mac from the command line. Git ships with macOS now, and cvs2git was available via Homebrew.
This process should work on a Linux command line too.


  • A Linux/Unix computer.
  • Install git
  • Install cvs2svn (which provides cvs2git)
  • Direct access to the CVS repository files.

Step 1: Make a copy of your raw CVS repository.
This doesn't work correctly with a working copy; you need the raw repository (the ,v files).
For this example it's on my desktop.


Step 2: Let cvs2git process the repository.

cvs2git --blobfile=~/Desktop/blob.dat --dumpfile=~/Desktop/dump.dat --username="dkilgo" ~/Desktop/cvs-local/my-sample-project

If your CVS repository is corrupt, you may receive errors from cvs2git like the following:

ERROR: A CVS repository cannot contain both ~/Desktop/cvs-local/my-sample-project/help/internal/compile/Release_text0.htm,v and ~/Desktop/cvs-local/my-sample-project/help/internal/compile/Attic/Release_text0.htm,v
ERROR: A CVS repository cannot contain both ~/Desktop/cvs-local/my-sample-project/help/internal/compile/Release_text1.htm,v and ~/Desktop/cvs-local/my-sample-project/help/internal/compile/Attic/Release_text1.htm,v

In CVS, when a file is deleted it moves into the Attic folder. It's not technically correct for both files to exist. You'll have to examine both files with a text editor, see which one has the higher revision number, and delete the older file.
Every time I encountered this error, the Attic version was older.
Once you've resolved all the errors, re-run the cvs2git command.

A successful run will provide something like the following output:

Time for pass1 (CollectRevsPass): 5.119 seconds.
----- pass 2 (CleanMetadataPass) -----
Converting metadata to UTF8...
Time for pass2 (CleanMetadataPass): 0.026 seconds.
----- pass 3 (CollateSymbolsPass) -----
Checking for forced tags with commits...
Time for pass3 (CollateSymbolsPass): 0.011 seconds.
----- pass 4 (FilterSymbolsPass) -----
Filtering out excluded symbols and summarizing items...
Time for pass4 (FilterSymbolsPass): 74.65 seconds.
----- pass 5 (SortRevisionsPass) -----
Sorting CVS revision summaries...
Time for pass5 (SortRevisionsPass): 0.179 seconds.
----- pass 6 (SortSymbolsPass) -----
Sorting CVS symbol summaries...
Time for pass6 (SortSymbolsPass): 0.110 seconds.
----- pass 7 (InitializeChangesetsPass) -----
Creating preliminary commit sets...
Time for pass7 (InitializeChangesetsPass): 1.975 seconds.
----- pass 8 (BreakRevisionChangesetCyclesPass) -----
Breaking revision changeset dependency cycles...
Time for pass8 (BreakRevisionChangesetCyclesPass): 0.727 seconds.
----- pass 9 (RevisionTopologicalSortPass) -----
Generating CVSRevisions in commit order...
Time for pass9 (RevisionTopologicalSortPass): 0.547 seconds.
----- pass 10 (BreakSymbolChangesetCyclesPass) -----
Breaking symbol changeset dependency cycles...
Time for pass10 (BreakSymbolChangesetCyclesPass): 1.134 seconds.
----- pass 11 (BreakAllChangesetCyclesPass) -----
Breaking CVSSymbol dependency loops...
Time for pass11 (BreakAllChangesetCyclesPass): 1.681 seconds.
----- pass 12 (TopologicalSortPass) -----
Generating CVSRevisions in commit order...
Time for pass12 (TopologicalSortPass): 1.520 seconds.
----- pass 13 (CreateRevsPass) -----
Mapping CVS revisions to Subversion commits...
Creating Subversion r1 (Project initialization)
Creating Subversion r2 (commit)
Creating Subversion r3 (commit)
Creating Subversion r4 (commit)

Time for pass13 (CreateRevsPass): 1.538 seconds.
----- pass 14 (SortSymbolOpeningsClosingsPass) -----
Sorting symbolic name source revisions...
Time for pass14 (SortSymbolOpeningsClosingsPass): 0.854 seconds.
----- pass 15 (IndexSymbolsPass) -----
Determining offsets for all symbolic names...
Time for pass15 (IndexSymbolsPass): 0.373 seconds.
----- pass 16 (OutputPass) -----
Time for pass16 (OutputPass): 1.455 seconds.

cvs2svn Statistics:
Total CVS Files:              1417
Total CVS Revisions:          2982
Total CVS Branches:           1482
Total CVS Tags:              36376
Total Unique Tags:              59
Total Unique Branches:           2
CVS Repos Size in KB:        25792
Total SVN Commits:             579
First Revision Date:    Thu May 17 08:25:14 2007
Last Revision Date:     Fri Aug 17 11:13:14 2013
Timings (seconds):
 5.12   pass1    CollectRevsPass
 0.03   pass2    CleanMetadataPass
 0.01   pass3    CollateSymbolsPass
74.65   pass4    FilterSymbolsPass
 0.18   pass5    SortRevisionsPass
 0.11   pass6    SortSymbolsPass
 1.97   pass7    InitializeChangesetsPass
 0.73   pass8    BreakRevisionChangesetCyclesPass
 0.55   pass9    RevisionTopologicalSortPass
 1.13   pass10   BreakSymbolChangesetCyclesPass
 1.68   pass11   BreakAllChangesetCyclesPass
 1.52   pass12   TopologicalSortPass
 1.54   pass13   CreateRevsPass
 0.85   pass14   SortSymbolOpeningsClosingsPass
 0.37   pass15   IndexSymbolsPass
 1.46   pass16   OutputPass
91.90   total

Step 3: Create a git repository and import the data into it.

mkdir sample-git
cd sample-git
git init
cat ../blob.dat ../dump.dat | git fast-import

Which will create the following output:

git-fast-import statistics:
Alloc'd objects:       5000
Total objects:         3057 (      1285 duplicates                  )
      blobs  :         1754 (      1270 duplicates       1254 deltas of       1668 attempts)
      trees  :          956 (        15 duplicates        758 deltas of        892 attempts)
      commits:          347 (         0 duplicates          0 deltas of          0 attempts)
      tags   :            0 (         0 duplicates          0 deltas of          0 attempts)
Total branches:          44 (         6 loads     )
      marks:     1073741824 (      3371 unique    )
      atoms:            997
Memory total:          2594 KiB
       pools:          2360 KiB
     objects:           234 KiB
pack_report: getpagesize()            =       4096
pack_report: core.packedGitWindowSize = 1073741824
pack_report: core.packedGitLimit      = 8589934592
pack_report: pack_used_ctr            =        802
pack_report: pack_mmap_calls          =         41
pack_report: pack_open_windows        =          1 /          1
pack_report: pack_mapped              =   31753114 /   31753114

Have a look in the git repo directory and you'll find it's empty. Where's all your stuff?
It's in the .git directory. You don't have a working copy checked out yet.

Do "git checkout" to get a working copy of your repo.

Now you have a functional git repository with all your CVS commits, tags and branches.

Side note:

Changing to git meant updating our deployment scripts as well.
The old way with CVS looked like this:

cvs -d /home/cvs/devel -q export -r $tag -d $stageDir/sample-project sample-project

Git has a little more overhead in that, by default, the cloned repo contains all the branches and commits, not just the tagged/current version.
That means what it checks out is usually a bit bigger than what would be picked up by the CVS export.
To work around that, you can pass the --depth param so git only fetches the latest version of each file and folder.

/usr/local/bin/git clone --depth 1 --branch $tag <repository-url> $stageDir/sample-project/

I always remove the revision control files before deployment. It's a pain with CVS and SVN. With git, it's a snap because they're all in the same folder.

rm -f -r $stageDir/sample-project/.git

Open Webalizer DNS cache with PHP


Webalizer produces a cache file to speed up resolving addresses to names, but no utilities exist to browse this kind of file.


Webalizer is a log file analyser which can provide basic site usage stats by processing your Apache log files. It's a standard utility in a RHEL environment. One of its features is caching DNS information to speed up resolving addresses for reporting purposes. It uses a Berkeley DB to store these key/value pairs for quick lookup. Unfortunately there really aren't any GUI tools for getting a look at this data. On my machine it's a 70MB file, and I'd really like to get a look at that data.

So there are a couple of ways to do this.
MySQL comes with a BDB storage engine which can read this kind of file.

Or, you could use PHP and its dba extension to load this information.
I chose the latter.

I’m starting with a base install of Ubuntu 11 and Zend Server.
Using Zend Server makes the process more complicated, but it also makes it more standardized: ZS uses the same file paths on every OS it installs on.

Install dependent software.

Zend Server doesn’t come with this module by default, so we’ll need to compile it.
Make sure you’ve got the basics needed to compile under Ubuntu installed.

aptitude install build-essential checkinstall autoconf

Next we need to make sure that libdb4 is installed so we can reference it in php.

aptitude install libdb4.8-dev db4.8-util db4.8-doc libdb4.8

And finally, we need the header files for Zend Server’s PHP.

sudo aptitude install php-5.3-source-zend-server

Sanity check: let’s make sure that the module isn’t installed.

php --ri dba
Extension 'dba' not present.

Great. Let’s build it.

cd /usr/local/zend/share/php-source/php-5.3.9/ext/dba/
./configure --with-php-config=/usr/local/zend/bin/php-config --with-db4
make install
echo "extension=dba.so" > /usr/local/zend/etc/conf.d/dba.ini

Let’s check that again:

php --ri dba

DBA support => enabled
Supported handlers => gdbm cdb cdb_make db4 inifile flatfile

Directive => Local Value => Master Value
dba.default_handler => flatfile => flatfile

Done. We can now access the dba functions from the command line.
They’ll also get picked up by apache the next time you restart.

Build a quick script to read the DNS cache and output it as a comma-separated list.

<?php
/**
 * Requires the dba module be enabled with db4 support.
 */
$path = dirname(__FILE__) . '/dns_cache.db';
$logPath = $path . '.txt';
echo 'Loading ' . $path . chr(10);

$row = 0;
$h = dba_open($path, 'r', 'db4');
$fh = fopen($logPath, 'w+');

if ($h === false) {
    die('Failed to open db');
}

echo "Exporting Key/Values pairs...\n";
// Berkeley DB iteration: grab the first key, then walk the rest.
$key = dba_firstkey($h);
$value = dba_fetch($key, $h);
$value = filter_var($value, FILTER_SANITIZE_URL);
$key = filter_var($key, FILTER_SANITIZE_URL);
fwrite($fh, "$key,$value\n");
$row++;

while ($key = dba_nextkey($h)) {
    $value = dba_fetch($key, $h);
    $value = filter_var($value, FILTER_SANITIZE_URL);
    $key = filter_var($key, FILTER_SANITIZE_URL);
    fwrite($fh, "$key,$value\n");
    $row++;
}

fclose($fh);
dba_close($h);
echo "Printed $row key/value pairs." . chr(10);
echo 'Done' . chr(10);

I called my script test.php.
Then run it from the command line.

php ./test.php

Loading /home/{MASKED}/dns_cache.db
Exporting Key/Values pairs...
Printed 703268 key/value pairs.

Not sure if this is the best way to get at this data, but it’s the route I chose and it worked.
Your mileage may vary.