Cluster synchronization with Csync2
Clifford Wolf, http://www.clifford.at/
November 25, 2005

1 Introduction

Csync2 [1] is a tool for asynchronous file synchronization in clusters. Asynchronous file synchronization works well for files that are seldom modified, such as configuration files or application images, but it is not adequate for some other types of data. For instance, a database with continuous write accesses should be synced synchronously in order to ensure data integrity. That does not automatically mean that synchronous synchronization is better; it is simply different, and there are many cases where asynchronous synchronization is favored over synchronous synchronization. Some pros of asynchronous synchronization are:

1. Most asynchronous synchronization tools (including Csync2) are implemented as single-shot commands which need to be executed each time in order to run one synchronization cycle. Therefore it is possible to test changes on one host before deploying them on the others (and also to return to the old state if the changes turn out to be bogus).
2. The synchronization algorithms are much simpler and thus less error-prone.
3. Asynchronous synchronization tools can be (and usually are) implemented as normal user mode programs. Synchronous synchronization tools need to be implemented as operating system extensions. Therefore asynchronous tools are easier to deploy and more portable.
4. It is much easier to build systems which allow setups with many hosts and complex replication rules.
But most asynchronous synchronization tools are pretty primitive and do not even cover a small portion of the issues found in real-world environments. I have developed Csync2 because I found none of the existing tools for asynchronous synchronization satisfying. The development of Csync2 has been sponsored by LINBIT Information Technologies [2], the company which also sponsors the synchronous block device synchronization toolchain DRBD [3].

Note: I will simply use the term synchronization instead of the semi-oxymoron asynchronous synchronization in the rest of this paper.

1.1 Csync2 features

Most synchronization tools are very simple wrappers for remote-copy tools such as rsync or scp. These solutions work well in most cases but still leave a big gap for more sophisticated tools such as Csync2. The most important features of Csync2 are described in the following sections.

1.1.1 Conflict detection

Most of the trivial synchronization tools just copy the newer file over the older one. This can be very dangerous if the same file has been changed on more than one host. Csync2 detects such a situation as a conflict and will not synchronize the file. Those conflicts then need to be resolved manually by the cluster administrator.

Csync2 does not consider it a conflict when the same change has been performed on two hosts (e.g. because it has already been synchronized with another tool).

It is also possible to let Csync2 resolve conflicts automatically for some or all files using one of the pre-defined auto-resolve methods. The available methods are: none (the default behavior), first (the host on which Csync2 is executed first wins), younger and older (the younger or older file wins), bigger and smaller (the bigger or smaller file wins), left and right (the host on the left side or the right side in the host list wins).

The younger, older, bigger and smaller methods let the remote side win the conflict if the file has been removed on the local side.
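The auto-resolve methods can be pictured as a function that picks a winner between the local and the remote copy of a conflicting file. The following Python sketch is only an illustration and not Csync2 code: FileState and resolve are hypothetical names, first is modelled as "the local host wins", left and right are left out, and the handling of remote removals and of ties is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileState:
    """Hypothetical view of one host's copy of a file."""
    mtime: int
    size: int


def resolve(method: str, local: Optional[FileState],
            remote: Optional[FileState]) -> str:
    """Return 'local', 'remote' or 'conflict'. None means 'file removed'."""
    if method == "none":
        return "conflict"      # default: the administrator has to decide
    if method == "first":
        return "local"         # the host that runs the sync first wins
    if method in ("younger", "older", "bigger", "smaller"):
        if local is None:
            return "remote"    # removed locally: the remote side wins
        if remote is None:
            return "local"     # assumption: handled symmetrically
        key = "mtime" if method in ("younger", "older") else "size"
        a, b = getattr(local, key), getattr(remote, key)
        if a == b:
            return "conflict"  # assumption: ties stay unresolved
        wants_greater = method in ("younger", "bigger")
        return "local" if (a > b) == wants_greater else "remote"
    raise ValueError(f"unsupported method: {method}")
```

With a local copy that is newer but smaller than the remote one, younger and smaller pick the local side while older and bigger pick the remote side.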
1.1.2 Replicating file removals

Many synchronization tools cannot synchronize file removals because they cannot distinguish between the file being removed on one host and being created on the other one. So instead of removing the file on the second host they recreate it on the first one.

Csync2 detects file removals as such and can synchronize them correctly.
1.1.3 Complex setups

Many synchronization tools are strictly designed for two-host setups. This is an inadequate restriction, and so Csync2 can handle any number of hosts.

Csync2 can even handle complex setups where e.g. all hosts in a cluster share the /etc/hosts file, but one /etc/passwd file is only shared among the members of a small sub-group of hosts and another /etc/passwd file is shared among the other hosts in the cluster.
1.1.4 Reacting to updates

In many cases it is not enough to simply synchronize a file between cluster nodes. It is also important to tell the applications using the synchronized file that the underlying file has been changed, e.g. by restarting the application.

Csync2 can be configured to execute arbitrary commands when files matching an arbitrary set of shell patterns are synchronized.

2 The Csync2 algorithm

Many other synchronization tools compare the hosts, try to figure out which host is the most up-to-date one and then synchronize the state from this host to all other hosts. This algorithm cannot detect conflicts and cannot distinguish between file removals and file creations, and therefore it is not used in Csync2.

Csync2 creates a little database with filesystem metadata on each host. This database (/var/lib/csync2/hostname.db) contains a list of the local files under the control of Csync2. The database also contains information such as the file modification timestamps and file sizes.

This database is used by Csync2 to detect changes by comparison with the local filesystem. The synchronization itself is then performed using the Csync2 protocol (TCP port 30865).

Note that this approach implies that Csync2 can only push changes from the machine on which the changes have been performed to the other machines in the cluster. Running Csync2 on any other machine in the cluster cannot detect and so cannot synchronize the changes.
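The database comparison described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Csync2 implementation: the database and the filesystem are both represented as plain dictionaries mapping a filename to an (mtime, size) pair, and detect_changes is a hypothetical helper name.

```python
def detect_changes(db, current):
    """Compare the stored metadata (db) with the current filesystem state.

    Both arguments map filename -> (mtime, size). Because the database
    remembers which files existed before, a removal (in db, not on disk)
    can be told apart from a creation (on disk, not in db).
    """
    created = sorted(p for p in current if p not in db)
    removed = sorted(p for p in db if p not in current)
    modified = sorted(p for p in current if p in db and db[p] != current[p])
    return created, removed, modified


db = {"/etc/hosts": (100, 10), "/etc/motd": (100, 5)}
now = {"/etc/hosts": (200, 12), "/etc/issue": (150, 7)}
created, removed, modified = detect_changes(db, now)
```

Here /etc/issue is reported as created, /etc/motd as removed and /etc/hosts as modified. A host that has no record of the earlier state could not make the first two distinctions, which is why changes can only be pushed from the host on which they were made.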
Librsync [4] is used for bandwidth-saving file synchronization and SSL is used for encrypting the network traffic. The sqlite library [5] (version 2) is used for managing the Csync2 database files. Authentication is performed using auto-generated pre-shared-keys in combination with the peer IP address and the peer SSL certificate.

3 Setting up Csync2

3.1 Building Csync2 from source

Simply download the latest Csync2 source tar.gz from http://oss.linbit.com/csync2/, extract it and run the usual ./configure, make, make install trio.

Csync2 has a few prerequisites in addition to a C compiler, the standard system libraries and headers and the usual GNU toolchain (make, etc.):

1. You need librsync, libsqlite (version 2) and libssl installed (including development headers).
2. Bison and flex are needed to build the configuration file parser.

3.2 Csync2 in Linux distributions

As of this writing there are no official Debian, RedHat or SuSE packages for Csync2. Gentoo has a Csync2 package, but it has not been updated for a year now. As far as I know, ROCK Linux [6] is the only system with an up-to-date Csync2 package. So I recommend that all users of non-ROCK distributions build the package from source.

The Csync2 source package contains an RPM .spec file as well as a debian/ directory. So it is possible to use rpmbuild or debuild to build Csync2.

3.3 Post installation

Next you need to create an SSL certificate for the local Csync2 server. Simply running make cert in the Csync2 source directory will create and install a self-signed SSL certificate for you. Alternatively, if you have no source around, run the following commands:

    openssl genrsa \
        -out /etc/csync2_ssl_key.pem 1024
    openssl req -new \
        -key /etc/csync2_ssl_key.pem \
        -out /etc/csync2_ssl_cert.csr
    openssl x509 -req -days 600 \
        -in /etc/csync2_ssl_cert.csr \
        -signkey /etc/csync2_ssl_key.pem \
        -out /etc/csync2_ssl_cert.pem

You have to do that on each host you are running csync2 on. When servers are talking with each other for the first time, they add each other to the database.

The Csync2 TCP port 30865 needs to be added to the /etc/services file and inetd needs to be told about Csync2 by adding

    csync2 stream tcp nowait root \
        /usr/local/sbin/csync2 csync2 -i

to /etc/inetd.conf.

3.4 Configuration File

Figure 1 shows a simple Csync2 configuration file.

3.4.1 Synchronization Groups

In the example configuration file you will find the declaration of a synchronization group called mygroup. A Csync2 setup can have any number of synchronization groups. Each group has its own list of member hosts and include/exclude rules.

Csync2 automatically ignores all groups which do not contain the local hostname in the host list. This way you can use one big Csync2 configuration file for the entire cluster.
3.4.2 Host Lists

Host lists are specified using the host keyword. You can either specify the hosts in a whitespace-separated list or use an extra host statement for each host.

The hostnames used here must be the local hostnames of the cluster nodes. That means you must use exactly the same string as printed out by the hostname command. Otherwise csync2 would be unable to associate the hostnames in the configuration file with the cluster nodes.

Sometimes it is desired that a host receives Csync2 connections on an IP address which is not the IP address its DNS entry resolves to, e.g. when a crossover cable is used to directly connect the hosts or an extra synchronization network should be used. In these cases the syntax hostname@interfacename has to be used for the host records (see host4 in the example config file).

Sometimes a host shall only receive updates from other hosts in the synchronization group but shall not be allowed to send updates to the other hosts. Such hosts (so-called slave hosts) must be specified in brackets, such as host3 in the example config file.
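The host record syntax can be made concrete with a small parser. The Python sketch below is illustrative only: HostRecord and parse_host_list are hypothetical names, the parser handles just the arguments of a single host statement, and it assumes that a record without @ is contacted under its own hostname.

```python
from typing import List, NamedTuple


class HostRecord(NamedTuple):
    name: str        # must equal the output of the hostname command
    interface: str   # DNS name used when connecting to this host
    slave: bool      # True if the host only receives updates


def parse_host_list(text: str) -> List[HostRecord]:
    """Parse the whitespace-separated arguments of one host statement."""
    records = []
    for token in text.split():
        slave = token.startswith("(") and token.endswith(")")
        if slave:
            token = token[1:-1]          # strip the brackets
        name, _, interface = token.partition("@")
        records.append(HostRecord(name, interface or name, slave))
    return records


hosts = parse_host_list("host1 host2 (host3) host4@host4-eth2")
```

For the two host statements of the example configuration file this yields host3 as the only slave host and host4-eth2 as the interface name for host4.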
3.4.3 Pre-Shared-Keys

Authentication is performed using the IP addresses and pre-shared-keys in Csync2. Each synchronization group in the config file must have exactly one key record specifying the file containing the pre-shared-key for this group. It is recommended to use a separate key for each synchronization group and to place a key file only on those hosts which actually are members of the corresponding synchronization group.

The key files can be generated with csync2 -k filename.
3.4.4 Include/Exclude Patterns

The include and exclude patterns are used to specify which files should be synced in the synchronization group.

There are two kinds of patterns: pathname patterns, which start with a slash character (or a prefix such as the %homedir% in the example; prefixes are explained in a later section), and basename patterns, which do not.

For each of the two categories, the last matching pattern is chosen. The file will be synchronized only if the chosen pattern is an include pattern in both categories.

    group mygroup                          # A synchronization group (see 3.4.1)
    {
        host host1 host2 (host3);          # host list (see 3.4.2)
        host host4@host4-eth2;

        key /etc/csync2.key_mygroup;       # pre-shared-key (see 3.4.3)

        include /etc/apache;               # include/exclude patterns (see 3.4.4)
        include %homedir%/bob;
        exclude %homedir%/bob/temp;
        exclude *~ .*;

        action                             # an action section (see 3.4.5)
        {
            pattern /etc/apache/httpd.conf;
            pattern /etc/apache/sites-available/*;
            exec "/usr/sbin/apache2ctl graceful";
            logfile "/var/log/csync2_action.log";
            do-local;
        }

        backup-directory /var/backups/csync2;
        backup-generations 3;              # backup old files (see 3.4.11)

        auto none;                         # auto resolving mode (see 3.4.6)
    }

    prefix homedir                         # a prefix declaration (see 3.4.7)
    {
        on host[12]: /export/users;
        on *:        /home;
    }

Figure 1: Example Csync2 configuration file

The pathname patterns are matched against the beginning of the filename. So they must either match the full absolute filename or must match a directory in the path to the file. The file will not be synchronized if no matching include or exclude pathname pattern is found (i.e. the default pathname pattern is an exclude pattern).

The basename patterns are matched against the base filename without the path. So they can e.g. be used to include or exclude files by their filename extensions. The default basename pattern is an include pattern.

In our example config file that means that all files from /etc/apache and %homedir%/bob are synced, except the dot files, files with a tilde character at the end of the filename, and files from %homedir%/bob/temp.
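The matching rules can be sketched in Python as follows. This is a minimal model and not the Csync2 matcher: is_synced and RULES are hypothetical names, the %homedir% prefix is assumed to be already expanded to /home, and Python's fnmatch is used as a stand-in for the shell pattern matching (its * also matches slashes, which may differ from what Csync2 does).

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Rules in configuration-file order, with %homedir% expanded to /home.
RULES = [
    ("include", "/etc/apache"),
    ("include", "/home/bob"),
    ("exclude", "/home/bob/temp"),
    ("exclude", "*~"),
    ("exclude", ".*"),
]


def is_synced(filename: str, rules=RULES) -> bool:
    path = PurePosixPath(filename)
    # A pathname pattern may match the file itself or any directory above it.
    candidates = [str(path)] + [str(parent) for parent in path.parents]
    path_verdict = "exclude"   # default pathname pattern
    base_verdict = "include"   # default basename pattern
    for verdict, pattern in rules:
        if pattern.startswith("/"):
            if any(fnmatchcase(c, pattern) for c in candidates):
                path_verdict = verdict     # the last matching pattern wins
        elif fnmatchcase(path.name, pattern):
            base_verdict = verdict         # the last matching pattern wins
    return path_verdict == "include" and base_verdict == "include"
```

Under these assumptions /etc/apache/httpd.conf and /home/bob/notes.txt are synced, while /home/bob/temp/cache.dat, /home/bob/.bashrc, /etc/apache/httpd.conf~ and /etc/passwd are not.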
3.4.5 Actions

Each synchronization group may have any number of action sections. These action sections are used to specify shell commands which should be executed after a file is synchronized that matches any of the specified patterns.

The exec statement is used to specify the command which should be executed. Note that if multiple files matching the pattern are synced in one run, this command will only be executed once. The special token %% in the command string is substituted with the list of files which triggered the command execution.

    csync2 -cr /
    if csync2 -M; then
        echo "!!"
        echo "!! There are unsynced changes! Type 'yes' if you still want to"
        echo "!! exit (or press ctrl-c) and anything else if you want to start"
        echo "!! a new login shell instead."
        echo "!!"
        if read -p "Do you really want to logout? " in &&
           [ ".$in" != ".yes" ]; then
            exec bash --login
        fi
    fi

Figure 2: The csync2_locheck.sh script

The output of the command is appended to the specified logfile, or to /dev/null if the logfile statement is omitted.

Usually the action is only triggered on the target hosts, not on the host on which the file modification has been detected in the first place. The do-local statement can be used to change this behavior and let Csync2 also execute the command on the host from which the modification originated.
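The "executed once per run" rule and the %% substitution can be illustrated with a short Python sketch. It is not taken from the Csync2 sources: build_commands is a hypothetical helper, /usr/local/bin/notify is a made-up command, and joining the triggering files with single spaces is an assumption about how the list is formatted.

```python
from fnmatch import fnmatchcase


def build_commands(synced_files, patterns, exec_template):
    """Return the commands to run for one action section (at most one)."""
    triggered = [f for f in synced_files
                 if any(fnmatchcase(f, p) for p in patterns)]
    if not triggered:
        return []
    # %% is replaced by the list of files that triggered the action.
    return [exec_template.replace("%%", " ".join(triggered))]


patterns = ["/etc/apache/httpd.conf", "/etc/apache/sites-available/*"]
synced = [
    "/etc/apache/httpd.conf",
    "/etc/apache/sites-available/www",
    "/etc/hosts",
]
commands = build_commands(synced, patterns, "/usr/local/bin/notify %%")
```

Two of the three synced files match a pattern, yet only a single command is produced, with both filenames substituted for %%.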

3.4.6 Conflict Auto-resolving

The auto statement is used to specify the conflict auto-resolving mechanism for this synchronization group. The default value is auto none.

3.4.7 Prefix Declarations

Prefixes (such as the %homedir% prefix in the example configuration file) can be used to synchronize directories which are named differently on the cluster nodes. In the example configuration file the directory for the user home directories is /export/users on the hosts host1 and host2 and /home on the other hosts.

The prefix value must be an absolute path name and must not contain any wildcard characters.
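How a prefix resolves to a different directory on each host can be sketched as follows. The Python code is an illustration under assumptions: expand_prefix and HOMEDIR_RULES are hypothetical names, and the sketch assumes that the first on rule whose host pattern matches the local hostname is the one that applies.

```python
from fnmatch import fnmatchcase

# Mirrors the "prefix homedir" block of the example configuration file.
HOMEDIR_RULES = [("host[12]", "/export/users"), ("*", "/home")]


def expand_prefix(pattern: str, hostname: str,
                  name: str = "homedir", rules=HOMEDIR_RULES) -> str:
    """Replace a leading %name% with the directory valid on this host."""
    token = f"%{name}%"
    if not pattern.startswith(token):
        return pattern
    for host_pattern, directory in rules:
        if fnmatchcase(hostname, host_pattern):   # assumption: first match wins
            return directory + pattern[len(token):]
    raise LookupError(f"no prefix rule for host {hostname}")
```

With these rules, %homedir%/bob expands to /export/users/bob on host1 and host2 and to /home/bob on every other host.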

3.4.8 The nossl statement

Usually all Csync2 network communication is encrypted using SSL. This can be changed with the nossl statement. This statement may only occur in the root context (not in a group or prefix section) and has two parameters. The first one is a shell pattern matching the source DNS name for the TCP connection and the second one is a shell pattern matching the destination DNS name.

So if e.g. a secure synchronization network is used between some hosts and all the interface DNS names end with -sync, a simple

    nossl *-sync *-sync;

will disable the encryption overhead on the synchronization network. All other traffic will stay SSL encrypted.
3.4.9 The config statement

The config statement is nothing more than an include statement and can be used to include other config files. This can be used to modularize the configuration file.

3.4.10 The ignore statement

The ignore statement can be used to tell Csync2 to not check and not sync the file user-id, the file group-id and/or the file permissions. The statement is only valid in the root context and accepts the parameters uid, gid and mod to turn off handling of user-ids, group-ids and file permissions.

CREATE TABLE file (
filename, checktxt,
UNIQUE ( filename ) ON CONFLICT REPLACE
);
CREATE TABLE dirty (
filename, force, myname, peername,
UNIQUE ( filename, peername ) ON CONFLICT IGNORE
);
CREATE TABLE hint (
filename, recursive,
UNIQUE ( filename, recursive ) ON CONFLICT IGNORE
);
CREATE TABLE action (
filename, command, logfile,
UNIQUE ( filename, command ) ON CONFLICT IGNORE
);
CREATE TABLE x509_sha1 (
peername, hash,
UNIQUE ( peername ) ON CONFLICT IGNORE
);
Figure 3: The Csync2 database schema
3.4.11 Backing up

Csync2 can back up the files it modifies. This may be useful for scenarios where one is afraid of accidentally syncing files in the wrong direction.

The backup-directory statement is used to tell Csync2 in which directory it should create the backup files, and the backup-generations statement is used to tell Csync2 how many old versions of the files should be kept in the backup directory.

The files in the backup directory are named like the file they back up, with all slashes substituted by underscores and a generation counter appended. Note that only the file content is backed up, not metadata such as ownership and permissions.

By default Csync2 does not back up the files it modifies. The default value for backup-generations is 3.

3.5 Activating the Logout Check

The Csync2 sources contain a little script called csync2_locheck.sh (Figure 2).

If you copy that script into your ~/.bash_logout script (or include it using the source shell command), the shell will not let you log out if there are any unsynced changes.

4 Database Schema

Figure 3 shows the Csync2 database schema. The database can be accessed using the sqlite command line shell. All string values are URL encoded in the database.

The file table contains a list of all local files under Csync2 control. The checktxt attribute contains a special string with information about file type, size, modification time and more. It looks like this:

    v1:mtime=1103471832:mode=33152:uid=1001:gid=111:type=reg:size=301

This checktxt attribute is used to check if a file has been changed on the local host.
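To make the layout of the checktxt string tangible, the following Python sketch splits the example above into its fields. It only reflects what can be read off that one example (a version tag followed by colon-separated key=value pairs); parse_checktxt is a hypothetical helper, and other field types or any URL encoding of values are not handled.

```python
import stat


def parse_checktxt(text: str) -> dict:
    """Split a checktxt string into its version tag and key=value fields."""
    version, *fields = text.rstrip(":").split(":")
    parsed = {"version": version}
    for field in fields:
        key, _, value = field.partition("=")
        parsed[key] = int(value) if value.isdigit() else value
    return parsed


example = "v1:mtime=1103471832:mode=33152:uid=1001:gid=111:type=reg:size=301"
info = parse_checktxt(example)

# The mode field is the numeric st_mode value: file type plus permissions.
is_regular = stat.S_ISREG(info["mode"])
permissions = oct(stat.S_IMODE(info["mode"]))
```

The mode value 33152 decodes to a regular file with permissions 0600, which agrees with the type=reg field. A change on the local host then shows up as a difference between the stored string and one freshly built from the file's current metadata.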
If a local change has been detected, the entry in the file table is updated and entries in the dirty table are created for all peer hosts which should be updated. This way the information that a host should be updated does not get lost, even if the host in question is unreachable right now. The force attribute is set to 0 by default and to 1 when the cluster administrator marks one side as the right one in a synchronization conflict.
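The role of the dirty table as a persistent per-peer queue can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: Csync2 itself uses SQLite version 2 while Python ships SQLite 3, only the dirty table from Figure 3 is created, and mark_dirty is a hypothetical helper rather than a Csync2 function.

```python
import sqlite3


def mark_dirty(db, filename, myname, peers, force=0):
    """Schedule filename for an update on every peer."""
    db.executemany(
        "INSERT INTO dirty (filename, force, myname, peername) "
        "VALUES (?, ?, ?, ?)",
        [(filename, force, myname, peer) for peer in peers],
    )


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE dirty (filename, force, myname, peername, "
    "UNIQUE (filename, peername) ON CONFLICT IGNORE)"
)
mark_dirty(db, "/etc/hosts", "host1", ["host2", "host3"])
# A second detection of the same file is ignored: it is already queued.
mark_dirty(db, "/etc/hosts", "host1", ["host2", "host3"])
pending = db.execute(
    "SELECT peername FROM dirty ORDER BY peername"
).fetchall()
```

Because of the UNIQUE ... ON CONFLICT IGNORE constraint, each (filename, peername) pair is queued at most once, and the entry stays in the table until that peer has actually been updated.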
The hint table is usually not used. In large setups this table can be filled by a daemon listening on the inotify API. It is possible to tell Csync2 to not check all files it is responsible for but only those which have entries in the hint table. However, the Linux syscall API is so fast that this only makes sense for really huge setups.

The action table is used for scheduling actions. Usually this table is empty after Csync2 has been terminated. However, it is possible that Csync2 gets interrupted in the middle of the synchronization process. In this case the records in the action table are processed when Csync2 is executed the next time.

The x509_sha1 table is used to cache SHA1 hashes of the SSL certificates used by the other hosts in the csync2 cluster (like the SSH known_hosts file).

5 Running Csync2

Simply calling csync2 without any additional arguments prints out a help message (Figure 4). A more detailed description of the most important usage scenarios is given in the next sections.

5.1 Just synchronizing the files

The command csync2 -x (or csync2 -xv) checks for local changes and tries to synchronize them to the other hosts. The option -d (dry-run) can be used to do everything but the actual synchronization.

When you start Csync2 the first time it compares its empty database with the filesystem and sees that all files have just been created. It then will try to synchronize the files. If a file is not present on the remote hosts it will simply be copied to the other host. There is also no problem if the file is already present on the remote host and has the same content. But if the file already exists on the remote host and has a different content, you have your first conflict.

5.2 Resolving a conflict

When two or more hosts in a Csync2 synchronization group have detected changes for the same file we run into a conflict: Csync2 cannot know which version is the right one (unless an auto-resolving method has been specified in the configuration file). The cluster administrator needs to tell Csync2 which version is the correct one. This can be done using csync2 -f, e.g.:

    # csync2 -x
    While syncing file /etc/hosts:
    ERROR from peer apollo:
    File is also marked dirty here!
    Finished with 1 errors.
    # csync2 -f /etc/hosts
    # csync2 -xv
    Connecting to host apollo (PLAIN) ...
    Updating /etc/hosts on apollo ...
    Finished with 0 errors.

5.3 Checking without syncing

It is also possible to just check the local filesystem without making any connections to remote hosts: csync2 -cr / (the -r modifier tells Csync2 to do a recursive check).

csync2 -c without any additional parameters checks all files listed in the hint table.

The command csync2 -M can be used to print the list of files marked dirty and therefore scheduled for synchronization.

5.4 Comparing the hosts

The csync2 -T command can be used to compare the local database with the databases of the remote hosts. Note that this command compares the databases and not the filesystems, so make sure that the databases are up-to-date on all hosts before running csync2 -T, and run csync2 -cr / if you are unsure.

    csync2 1.26 - cluster synchronization tool, 2nd generation
    LINBIT Information Technologies GmbH <http://www.linbit.com>
    Copyright (C) 2004, 2005 Clifford Wolf <clifford@clifford.at>
    This program is free software under the terms of the GNU GPL.

    Usage: csync2 [-v..] [-C config-name] \
                    [-D database-dir] [-N hostname] [-p port] ..

    With file parameters:
        -h [-r] file..          Add (recursive) hints for check to db
        -c [-r] file..          Check files and maybe add to dirty db
        -u [-d] [-r] file..     Updates files if listed in dirty db
        -f file..               Force this file in sync (resolve conflict)
        -m file..               Mark files in database as dirty

    Simple mode:
        -x [-d] [[-r] file..]   Run checks for all given files and update
                                remote hosts.

    Without file parameters:
        -c      Check all hints in db and eventually mark files as dirty
        -u [-d] Update (transfer dirty files to peers and mark as clear)

        -H      List all pending hints from status db
        -L      List all file-entries from status db
        -M      List all dirty files from status db

        -S myname peername      List file-entries from status db for this
                                synchronization pair.

        -T                      Test if everything is in sync with all peers.

        -T filename             Test if this file is in sync with all peers.

        -T myname peername      Test if this synchronization pair is in sync.

        -T myname peer file     Test only this file in this sync pair.

        -TT     As -T, but print the unified diffs.

        The modes -H, -L, -M and -S return 2 if the requested db is empty.
        The mode -T returns 2 if both hosts are in sync.

        -i      Run in inetd server mode.
        -ii     Run in stand-alone server mode.
        -iii    Run in stand-alone server mode (one connect only).

        -R      Remove files from database which do not match config entries.

    Modifiers:
        -r      Recursive operation over subdirectories
        -d      Dry-run on all remote update operations

        -B      Do not block everything into big SQL transactions. This
                slows down csync2 but allows multiple csync2 processes to
                access the database at the same time. Use e.g. when slow
                lines are used or huge files are transferred.

        -A      Open database in asynchronous mode. This will cause data
                corruption if the operating system crashes or the computer
                loses power.

        -I      Init-run. Use with care and read the documentation first!
                You usually do not need this option unless you are
                initializing groups with really large file lists.

        -X      Also add removals to dirty db when doing a -TI run.
        -U      Don't mark all other peers as dirty when doing a -TI run.

        -G Group1,Group2,Group3,...
                Only use this groups from config-file.

        -P peer1,peer1,...
                Only update this peers (still mark all as dirty).

        -F      Add new entries to dirty database with force flag set.

        -t      Print timestamps to debug output (e.g. for profiling).

    Creating key file:
        csync2 -k filename

    Csync2 will refuse to do anything when a /etc/csync2.lock file is found.

Figure 4: The Csync2 help message

The output of csync2 -T is a table with 4 columns:

1. The type of the found difference: X means that the file exists on both hosts but is different, L that the file is only present on the local host and R that the file is only present on the remote host.
2. The local interface DNS name (usually just the local hostname).
3. The remote interface DNS name (usually just the remote hostname).
4. The filename.

The csync2 -TT filename command can be used for displaying unified diffs between a local file and the remote hosts.
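The classification into X, L and R rows can be sketched in Python. This is a simplified model rather than the actual Csync2 comparison: each database is represented as a dictionary mapping a filename to its checktxt string, and compare_databases is a hypothetical helper name.

```python
def compare_databases(local, remote, local_name, remote_name):
    """Return (type, local_name, remote_name, filename) rows for differences."""
    rows = []
    for filename in sorted(set(local) | set(remote)):
        if filename not in remote:
            kind = "L"      # only present on the local host
        elif filename not in local:
            kind = "R"      # only present on the remote host
        elif local[filename] != remote[filename]:
            kind = "X"      # present on both hosts but different
        else:
            continue        # in sync: no row is reported
        rows.append((kind, local_name, remote_name, filename))
    return rows


local_db = {"/etc/hosts": "v1:size=301", "/etc/motd": "v1:size=12"}
remote_db = {"/etc/hosts": "v1:size=305", "/etc/issue": "v1:size=40"}
rows = compare_databases(local_db, remote_db, "host1", "host2")
```

The sketch only looks at the database contents, which mirrors the remark above that csync2 -T compares the databases and not the filesystems.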

5.5 Bootstrapping large setups

The -I option is a nice tool for bootstrapping larger Csync2 installations on slower networks. In such scenarios one usually wants to replicate the data initially in a more efficient way and then use Csync2 to synchronize the changes on a regular basis.

The problem here is that when you start Csync2 the first time it detects a lot of newly created files and wants to synchronize them, just to find out that they are already in sync with the peers.

The -I option modifies the behavior of -c so that it only updates the file table but does not create entries in the dirty table. So you can simply use csync2 -cIr / to initially create the Csync2 database on the cluster nodes when you know for sure that the hosts are already in sync.

The -I option may also be used with -T to add the detected differences to the dirty table and so induce Csync2 to synchronize the local status of the files in question to the remote host.

Usually -TI only adds local files which exist to the dirty table. That means that it does not induce Csync2 to remove a file on a remote host if it does not exist on the local host. That behavior can be changed using the -X option.

The files scheduled to be synced by -TI are usually scheduled to be synced to all peers, not just the one peer which has been used in the -TI run. This behavior can be changed using the -U option.

5.6 Cleaning up the database

It can happen that old data is left over in the Csync2 database after a configuration change (e.g. files and hosts which are no longer referred to by the configuration file). Running csync2 -R cleans up such old entries in the Csync2 database.

5.7 Multiple Configurations

Sometimes a higher abstraction level than simply having different synchronization groups is needed. For such cases it is possible to use multiple configuration files (and databases) side by side.

The additional configurations must have a unique name. The configuration file is then named /etc/csync2_myname.cfg and the database is named /var/lib/csync2/hostname_myname.db. Accordingly Csync2 must be called with the -C myname option.

But there is no need for multiple Csync2 daemons. The Csync2 protocol allows the client to tell the server which configuration should be used for the current TCP connection.

6 Performance

In most cases Csync2 is used for syncing just some (up to a few hundred) system configuration files. In these cases all Csync2 calls are processed in less than one second, even on slow hardware. So a performance analysis is not interesting for these cases but only for setups where a huge number of files is synced, e.g. when syncing entire application images with Csync2.

A well-founded performance analysis which would allow meaningful comparisons with other synchronization tools would be beyond the scope of this paper. So here are just some quick and dirty numbers from a production 2-node cluster (2.40 GHz dual-Xeon, 7200 RPM ATA HD, 1 GB RAM). The machines had an average load of 0.3 (web and mail) during my tests.

I have about 128,000 files (1.7 GB) of Linux kernel sources and object files on an ext3 filesystem under Csync2 control on the machines.

Checking for changes (csync2 -cr /) took 13.7 seconds wall clock time, 9.1 seconds in user mode and 4.1 seconds in kernel mode. The remaining 0.5 seconds were spent in other processes.

Recreating the local database without adding the files to the dirty table (csync2 -cIr after removing the database file) took 28.5 seconds (18.6 sec user mode and 2.6 sec kernel mode).

Comparing the Csync2 databases of both hosts (csync2 -T) took 3 seconds wall clock time.

Running csync2 -u after adding all 128,000 files took 10 minutes wall clock time. That means that Csync2 tried to sync all 128,000 files and then recognized, after comparing the checksums, that the remote side already had the most up-to-date version of each file.

All numbers are the average values of 10 iterations.
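For a rough feeling of what these timings mean per file, the wall clock figures can be converted into throughput. The short Python calculation below uses only the numbers quoted above and the rounded count of 128,000 files; the variable names are chosen for this illustration.

```python
FILES = 128_000


def files_per_second(seconds: float) -> float:
    """Average number of files handled per second of wall clock time."""
    return FILES / seconds


check_rate = files_per_second(13.7)       # csync2 -cr /
init_rate = files_per_second(28.5)        # csync2 -cIr with a fresh database
update_rate = files_per_second(10 * 60)   # csync2 -u over all files
```

This gives roughly 9,300 files per second for the local check, about 4,500 files per second for recreating the database, and about 213 files per second for the update run, which is the only one of the three steps that has to talk to the peer for every file.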

7 Security Notes

As stated earlier, authentication is performed using the peer IP address and a pre-shared-key. The traffic is SSL encrypted and the SSL certificate of the peer is checked when there has already been an SSL connection to that peer in the past (i.e. the peer certificate is already cached in the database).

All DNS names used in the Csync2 configuration file (the host records) should be resolvable via the /etc/hosts file to guard against DNS spoofing attacks.

Depending on the list of files being managed by Csync2, an intruder on one of the cluster nodes can also modify the files under Csync2 control on the other cluster nodes and so might also gain access to them. However, an intruder cannot modify any other files on the other hosts because Csync2 checks on the receiving side whether all updates are OK according to the configuration file.

Of course, an intruder would be able to work around these security checks when Csync2 is also used to sync the Csync2 configuration files.

Csync2 only syncs the standard UNIX permissions (uid, gid and file mode). ACLs, Linux ext2fs/ext3fs attributes and other extended filesystem permissions are neither synced nor flushed (e.g. if they are set automatically when the file is created).

8 Alternatives

Csync2 is not the only file synchronization tool. Some of the other free software file synchronization tools are:

8.1 Rsync

Rsync [7] is a tool for fast incremental file transfers, but it is not a synchronization tool in the context of this paper. Actually Csync2 is using the rsync algorithm for file transfers. A variety of synchronization tools have been written on top of rsync. Most of them are tiny shell scripts.

8.2 Unison

Unison [8] uses an algorithm similar to the one used by Csync2, but is limited to two-host setups. Its focus is on interactive syncs (there are even graphical user interfaces) and it targets syncing home directories between a laptop and a workstation. Unison is pretty intuitive to use, among other things because of its limitations.

8.3 Version Control Systems

Version control systems such as Subversion [9] can also be used to synchronize configuration files or application images. The advantage of version control systems is that they can do three-way merges and preserve the entire history of a repository. The disadvantage is that they are much slower and require more disk space than plain synchronization tools.

9 References

[1] Csync2, http://oss.linbit.com/csync2/
[2] LINBIT Information Technologies, http://www.linbit.com/
[3] DRBD, http://www.drbd.org/
[4] Librsync, http://librsync.sourceforge.net/
[5] SQLite, http://www.sqlite.org/
[6] ROCK Linux, http://www.rocklinux.org/
[7] Rsync, http://samba.anu.edu.au/rsync/
[8] Unison, http://www.cis.upenn.edu/~bcpierce/unison/
[9] Subversion, http://subversion.tigris.org/

