HOWTO_use_OpenGFS_with_OpenDLM (Jun 16 2004)



     HOWTO use OpenDLM as the locking manager for OpenGFS (V0.04)

Authors:
Ben Cahill
Stanley Wang

This document describes how to use OpenDLM as the locking manager in
an OpenGFS cluster.  It provides a simple example configuration for 
a 2-node cluster.

Within this document, we'll try to provide the basics of getting started,
without the need to study the various components before setting up OpenGFS.
However, you *should* study the projects sometime!  Recommended reading:

OpenGFS:  WHATIS-opengfs, HOWTO-nopool (some steps are required reading!)
OpenDLM:  WHATIS-opendlm, dlmbook_final.pdf (Programmer's Guide)
	HOWTO (Build, Install, and Configure OpenDLM ... required reading!)

You can find OpenDLM docs at:

	http://opendlm.sourceforge.net/docs.php


SOFTWARE COMPONENTS:
--------------------

OpenDLM is a Distributed Lock Manager, and provides an alternative to the
single point of failure characteristic of OpenGFS' legacy "memexp" locking
protocol; even though memexp's lock management is distributed among the
computer nodes, memexp has only a single lock storage server.  In contrast,
OpenDLM distributes both lock management *and* lock storage among all of
the computer nodes in the cluster.  If one of the nodes crashes, recovery of
relevant lock state is possible by the surviving nodes.

See the HOWTO doc on the OpenDLM project site for information on the
software components it depends on (linux-ha heartbeat or ccm, and libnet).



BUILDING AND INSTALLING OPENDLM AS LOCK SERVICE FOR OPENGFS
-----------------------------------------------------------

The following instructions should cover all types of Linux distributions,
since they describe how to download source code tarballs and build from scratch.
For best results, we recommend following this download/build procedure on
each machine in the cluster (rather than building on a single build machine,
then installing in the cluster machines).


1. Patch your kernel for OpenGFS, and build the kernel:
	See steps 1. and 2. in OpenGFS' HOWTO-nopool doc, then return here.

	IMPORTANT:  Use OpenGFS *CVS* code base.  OpenDLM is not yet supported
	in any OpenGFS release.  We're still working on code stability,
	so you should use the latest CVS!

	HINT:  Don't build the OpenGFS code yet (step 3 in HOWTO-nopool).
	You'll need OpenDLM source before doing that.

	HINT:  Once you're done with this step, you should have rebooted and be
	running the patched kernel!


2. Build, install, and configure OpenDLM:
	Because the OpenDLM build requires kernel source, we build OpenDLM
	only *after* patching the kernel, just to be safe (although the
	order probably doesn't make a difference).  Follow the instructions
	in the OpenDLM HOWTO doc, at:

	http://opendlm.sourceforge.net/docs.php

	You will need to follow all instructions up to, but not including
	"Start Locking Service".  If you want to, you could even start the
	locking service at this point (and finish the OpenDLM HOWTO), but
	it won't be needed quite yet.


3. Build OpenGFS:
	Now that you've obtained the OpenDLM source code, you can build OpenGFS.
	See step 3 in OpenGFS' HOWTO-nopool doc, using (at least) the
	following options for ./configure:

		--enable-opendlm
		--with-opendlm_includes=/your/path/to/opendlm/src/include

	then return here.

	IMPORTANT:  Use OpenGFS CVS code base.  OpenDLM is not yet supported
	in any OpenGFS release.  We're still working on code stability,
	so you should use the latest CVS!
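
	The configure options above can be wrapped in a small shell helper,
	so that the same include path is used on every machine.  This is only
	a sketch:  the function name ogfs_configure_args and the example path
	/usr/src/opendlm are hypothetical; only the two option names come
	from this document.

```shell
# Print the ./configure options that enable OpenDLM support in OpenGFS.
# $1 = top of your OpenDLM source tree (the example path is hypothetical)
ogfs_configure_args() {
	printf '%s %s\n' \
		"--enable-opendlm" \
		"--with-opendlm_includes=$1/src/include"
}

# Example (run from the top of the OpenGFS CVS tree):
#   ./configure $(ogfs_configure_args /usr/src/opendlm)
```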


4. If you have a pre-existing OpenGFS filesystem, you do *not* need to lose
	all of your data by re-partitioning and re-making your filesystem!!
	You can switch back and forth between using the legacy memexp and the
	new OpenDLM lock protocols, without changing anything on disk.

	To specify OpenDLM as your lock protocol, add the following option
	to the mount command line during step 9 (later in this document):

		-o lockproto=opendlm

	Alternatively, you may set opendlm as the default locking protocol,
	by using the ogfs_tool utility (see man page for ogfs_tool).  This
	writes "opendlm" into the filesystem superblock, so you will not need
	to specify opendlm in the mount command line.

	In either case, you will not need to partition a drive, or make
	the filesystem, so skip the next two steps, and continue with
	step 7, Start Locking Service.


5. Partition the shared drive into 2 partitions (using only one computer):
	(based on HOWTO-nopool, but using only 2 partitions, no cidev)

	HINT:  If you are creating a new filesystem, and want to be able
	to switch back and forth between the legacy memexp, and the new
	OpenDLM lock protocols, see HOWTO-nopool for information on
	creating a cluster information device (cidev), which requires its
	own small partition in addition to the filesystem partition and
	any external journal partitions.  HOWTO-nopool also describes how to
	make the filesystem.  Once done, return to this document at step 7,
	Start Locking Service.

	For a strictly OpenDLM (no option to switch to memexp) setup,
	we'll create an example configuration using 2 partitions.
	A medium partition (~128MB) will be used for an external journal.
	A large partition (the rest of the drive) will be used for the
	filesystem data and an internal journal.

	These instructions assume you are using a "real" device, but you
	may instead want to use a volume manager to partition your drive.
	See OpenGFS HOWTO-evms for information on doing this with EVMS.

	Use the dev name of your shared drive in place of "sdX" below.
	See the man page for sfdisk for more information:

	# sfdisk -R /dev/sdX  (make sure no drive partitions are in use)
	# sfdisk /dev/sdX     (this partitions the disk; follow the prompts)

	Hint:  sfdisk works in units of "cylinders".  When partitioning
	my drive, sfdisk showed that each cylinder was 1048576 bytes.
	So, for the first (medium) partition, I entered 0,128 to start at
	cylinder 0, with a size of 128 cylinders (~128MB).
	For the second (large) partition, I entered nothing (except the
	Enter key), which defaulted to use the rest of the drive.
	For the next two (sfdisk asks for 4 partitions), I also entered
	nothing (except the Enter key).  These last 2 "partitions", of course,
	are empty.
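
	The cylinder arithmetic in the hint above may differ on your drive,
	because the cylinder size depends on the drive geometry that sfdisk
	reports.  The following is a minimal sketch for working out the size
	to enter; the function name cyls_for_mb is hypothetical, and it
	assumes 1 MB = 1048576 bytes.

```shell
# Number of cylinders needed to hold at least SIZE_MB megabytes.
# $1 = desired size in MB (1 MB = 1048576 bytes)
# $2 = bytes per cylinder, as reported by sfdisk for your drive
cyls_for_mb() {
	# Integer division, rounded up so the partition is never too small.
	echo $(( ($1 * 1048576 + $2 - 1) / $2 ))
}

# Example: 128 MB on a drive with 1048576-byte cylinders
#   cyls_for_mb 128 1048576
```

	For the drive described above this gives 128, which matches the
	0,128 entry.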

	After you enter all 4 partitions, sfdisk asks you if you really
	want to write to the disk, so you can experiment a bit
	before committing.

	Check for success:  cat /proc/partitions shows the new partitions

	HINT:  These partitions must show up on all computers in the cluster,
	and must be named consistently.  After you create new partitions using
	one machine, you may need to find a way for the other machines to
	re-scan for partitions, or simply reboot the other machines.

	HINT:  If you find that the other computers see the partitions with
	different names, or if there is any chance you will be re-configuring
	your cluster computers and thereby affecting the device names, you will
	*need* to use a volume manager such as EVMS (see OpenGFS HOWTO-evms
	for information).  If using EVMS, you will need to create "native
	volumes".  These provide consistent naming from machine to machine.

	Check for success:  cat /proc/partitions shows the new partitions
		on every machine, with identical names

	In the following instructions, we'll call the two partitions
	/dev/sdx1 and /dev/sdx2, but you will need to substitute appropriate
	names in their stead.
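
	The "Check for success" steps above can be scripted and run on each
	machine in turn.  This is a sketch only:  the function name
	has_partition is hypothetical, and it assumes the usual
	/proc/partitions layout, with the partition name in the fourth column.

```shell
# Succeed if partition NAME is listed in a /proc/partitions-style file.
# $1 = partition name without /dev/ (e.g. sdx1)
# $2 = file to read (defaults to /proc/partitions)
has_partition() {
	awk -v name="$1" '$4 == name { found = 1 } END { exit !found }' \
		"${2:-/proc/partitions}"
}

# Example: check both partitions used in this document
#   for p in sdx1 sdx2; do
#       has_partition "$p" || echo "missing: $p"
#   done
```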


6. Make the OpenGFS Filesystem (using only one computer):
	(copied from HOWTO-nopool and edited for OpenDLM)

	Use the OpenGFS tool "mkfs.ogfs" to create the file system on disk.
	This step writes one superblock and a number of resource group
	(a.k.a. block group) headers for the filesystem, and creates journals.
	In the superblock, it writes strings indicating the name of the default
	locking protocol (e.g. "opendlm") and a cluster-wide filesystem
	identifier (a.k.a. lock namespace or lockspace), as specified in the
	mkfs.ogfs command line.  See the man page for mkfs.ogfs for more
	information on options.

	A.  Edit a configuration file named journal.cf to read like the
		following.  You can find a copy of this file, with comments,
		as opengfs/docs/journal.cf.  This file, and all other opengfs
		configuration files, may reside anywhere on your computer.

		Remember, you will need to substitute appropriate names for
		sdx2 (the filesystem device, large partition) and sdx1
		(the external journal device, medium partition).

fsdev  /dev/sdx2

journals  2

journal  0  int 256
journal  1  ext /dev/sdx1


	B.  Run the following command to make sure everything is okay.  The
		-v prints extra information, and -n prevents mkfs.ogfs from
		writing anything to disk.  Check the output to verify device
		sizes, etc.  The external journal should start at a *very*
		high number.

		The -t option gives the lock protocol a string that identifies
		the cluster-wide lock namespace of the filesystem you are
		mounting.  For the legacy memexp protocol, this string was a
		path to a device (the "cidev") that contained "cluster
		information".  Such a device is not used for OpenDLM, but if
		you've been using memexp, you may continue to use the same
		identifier string.

		Or, if you're just going to use OpenDLM (and never memexp), just
		make one up yourself.  You *must* have a unique name for each
		OpenGFS filesystem that you mount.  For now, let's just use
		"/dev/sdx001" (an arbitrary, meaningless name) for this
		filesystem.

		HINT:  If you want to be able to switch back and forth between
		OpenDLM and memexp, you *must* use the cidev identifier.  See
		HOWTO-nopool for info on creating the cidev.  Remember, for
		memexp, you will need to substitute an appropriate name for
		sdx001 (the cluster information device, small partition).

	# mkfs.ogfs -p opendlm -t /dev/sdx001 -c journal.cf -v -n

		HINT:  If you want even more output, try the -d option.


	C.  Run the following command to write the filesystem and internal
		journal onto the filesystem device, and the external journal
		onto the external journal device.

	# mkfs.ogfs -p opendlm -t /dev/sdx001 -c journal.cf

		HINT:  This can take some time to write to disk, and shows
		no output until it is done.  Be patient.
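
	If you make filesystems more than once, parts A through C can be
	scripted.  The following sketch generates the same two-journal
	journal.cf shown in part A; the function name make_journal_cf is
	hypothetical, and the device names and the /dev/sdx001 lockspace
	string are the placeholders used throughout this step.

```shell
# Write a two-journal configuration (one internal, one external) to stdout.
# $1 = filesystem device, $2 = external journal device
make_journal_cf() {
	printf 'fsdev  %s\n\n' "$1"
	printf 'journals  2\n\n'
	printf 'journal  0  int 256\n'
	printf 'journal  1  ext %s\n' "$2"
}

# Example:
#   make_journal_cf /dev/sdx2 /dev/sdx1 > journal.cf
#   mkfs.ogfs -p opendlm -t /dev/sdx001 -c journal.cf -v -n   # dry run
#   mkfs.ogfs -p opendlm -t /dev/sdx001 -c journal.cf         # writes to disk
```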


7.  Start locking service (on each computer):
	Follow instructions in the OpenDLM HOWTO doc, at:

	http://opendlm.sourceforge.net/docs.php

	You will need to follow all instructions in the step labeled
	"Start Locking Service" (unless you already did this in step 2, above).


8. Update /lib/modules/[version]/modules.dep, and insert OpenGFS modules:
	Root privileges are required for all of the following operations:

	# depmod -a
	# modprobe ogfs
	# modprobe opendlm

	HINT:  For debug output from opendlm lock module, use:
		# modprobe opendlm debug=1

	Check for success:  cat /proc/filesystems shows "ogfs", among others
	Check for success:  cat /proc/modules shows "ogfs", among others
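
	The two checks above can also be done from a script.  This is a
	sketch only:  the function names module_loaded and fs_registered are
	hypothetical, and they assume the module name is the first field in
	/proc/modules and the filesystem type is the last field on each line
	of /proc/filesystems.

```shell
# Succeed if kernel module NAME appears in a /proc/modules-style file.
# $1 = module name, $2 = file to read (defaults to /proc/modules)
module_loaded() {
	awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' \
		"${2:-/proc/modules}"
}

# Succeed if filesystem type NAME appears in a /proc/filesystems-style file.
# The type is the last field on each line ("nodev" may precede it).
fs_registered() {
	awk -v name="$1" '$NF == name { found = 1 } END { exit !found }' \
		"${2:-/proc/filesystems}"
}

# Example:
#   module_loaded ogfs && module_loaded opendlm && fs_registered ogfs \
#       && echo "OpenGFS modules are ready"
```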


9. Mount the Filesystem (on each computer).
	The following commands mount the ogfs file system at the /ogfs
	mount point.  You will need to substitute an appropriate name
	for sdx2 (the filesystem device, large partition).

	# mkdir /ogfs
	# mount -t ogfs /dev/sdx2 /ogfs

	HINT:  If you are using OpenDLM with a pre-existing OpenGFS filesystem,
		use the following additional option to use opendlm instead
		of your pre-existing default lock protocol (memexp):

		-o lockproto=opendlm

		The -o hostdata=192.168.0.x option, required when using memexp,
		is not needed for OpenDLM.


	HINT:  If you see an error from mount like:
	"mount:  dev/sdc3 is not a valid block device",
	a)  make sure you are using a valid /dev/* in the command line, or
	b)  you may be trying the mount from a machine *other* than the one you
	used for creating the new partitions, and this machine hasn't seen the
	new partitions yet.  Try rebooting so *this* machine can detect the
	new partitions.

	HINT:  If the mount hangs, check to make sure that your
	/etc/dlm.conf files are correct (especially the lines describing
	the nodes).  See OpenDLM HOWTO.

	HINT:  If you encounter an assertion regarding LVB size, check to
	make sure that you verified the value of MAXLOCKVAL in dlm.h.  See
	OpenDLM HOWTO.

	HINT:  If mount fails, check your syslog (e.g. /var/log/messages) for
	messages about the cause.  A mount will fail if OpenDLM has not put
	its node table together yet (as of this writing, it seems to take
	10 - 20 seconds or more).  You may be able to simply try mounting
	again, and be successful.
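
	Since a first mount attempt may fail while OpenDLM is still putting
	its node table together, it can help to retry the mount a few times.
	The following is a minimal sketch; the function name retry is
	hypothetical, and the attempt count and delay in the example are
	guesses based on the 10 - 20 seconds mentioned above.

```shell
# Run a command up to TRIES times, sleeping DELAY seconds between failures.
# $1 = maximum number of attempts, $2 = delay in seconds, rest = command
retry() {
	tries=$1
	delay=$2
	shift 2
	attempt=1
	while ! "$@"; do
		# Give up once the last allowed attempt has failed.
		[ "$attempt" -ge "$tries" ] && return 1
		attempt=$((attempt + 1))
		sleep "$delay"
	done
	return 0
}

# Example: try the mount up to 6 times, 10 seconds apart
#   retry 6 10 mount -t ogfs /dev/sdx2 /ogfs
```

	Retrying only helps with the node table delay.  If every attempt
	fails, check the syslog as described above.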


That's it, you are done!
You should now be able to use this like any other filesystem.


SHUTTING DOWN CLEANLY
---------------------

1. Unmount file system:
	# umount /ogfs

2. Stop OpenDLM and HA heartbeat:
	# killall dlmdu
	# /etc/init.d/heartbeat stop (this also kills ccm, if you're using it)

3. Unload the modules:
	# modprobe -r opendlm
	# modprobe -r ogfs
	# modprobe -r libdlmk
	# modprobe -r dlmdk.core
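
The three steps above can be collected into one shell function.  This is
only a sketch:  the names run, ogfs_shutdown, and DRY_RUN are hypothetical,
and the commands are exactly the ones listed above.  Stopping at the first
failure is a design choice, so that modules are not unloaded while the
filesystem is still mounted.

```shell
# Run a command, or just print it when DRY_RUN is set (for rehearsal).
run() {
	if [ -n "${DRY_RUN:-}" ]; then
		echo "$*"
	else
		"$@"
	fi
}

# Shut down in the order given above, stopping at the first failure.
ogfs_shutdown() {
	run umount /ogfs &&
	run killall dlmdu &&
	run /etc/init.d/heartbeat stop &&
	run modprobe -r opendlm &&
	run modprobe -r ogfs &&
	run modprobe -r libdlmk &&
	run modprobe -r dlmdk.core
}

# Example:
#   DRY_RUN=1 ogfs_shutdown    # print the commands without running them
#   ogfs_shutdown              # run them (as root)
```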


STARTING OpenGFS (e.g. after boot-up)
-------------------------------------
As an alternative to manually executing the steps below, look in
opengfs/scripts for pool.* and ogfs.* startup scripts for Debian and Red Hat
distributions.  These will require modification for your particular setup
(especially since they were written before OpenDLM was an option!).

Once OpenGFS has been installed on your computers and storage media, only a
few steps are needed to get it going after a boot-up.  The following steps
assume that your storage media hardware and drivers are installed and visible
by all computers in the cluster.

Check for success:  cat /proc/partitions shows shared drive(s)

You will need root privilege for all steps below.  See the OpenDLM HOWTO
for information on the first 3 steps:

1.  Start heartbeat (on each computer).

	# /etc/init.d/heartbeat start

2.  If you're using CCM for OpenDLM membership, start it (on each computer).

	# /usr/lib/heartbeat/ccm &

3.  Start OpenDLM (on each computer, after all nodes' heartbeats are started).

	# /usr/local/sbin/dlmdu -C /etc/dlm.conf

4.  Load OpenGFS and OpenDLM kernel modules (on each computer).

	# modprobe ogfs
	# modprobe libdlmk
	# modprobe opendlm

	Check for success:  cat /proc/filesystems shows "ogfs", among others
	Check for success:  cat /proc/modules shows "opendlm", "ogfs", "libdlmk"
			among others


5.  Mount the Filesystem (on each computer).
	Remember, you will need to substitute an appropriate name
	for sdx2 (the filesystem device, large partition).

	# mount -t ogfs /dev/sdx2 /ogfs -o lockproto=opendlm

	(The -o lockproto= option is needed only if opendlm is *not* the default
	lock protocol for your OpenGFS filesystem.  See steps 4 and 9, above.)
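
If you do not want to adapt the scripts in opengfs/scripts, the five steps
above can be sketched as one shell function.  The names run, ogfs_startup,
DRY_RUN, and USE_CCM are hypothetical; the commands and paths are the ones
shown above, and may differ on your installation.  The sketch cannot tell
whether heartbeat is already running on the *other* nodes, so you still
need to make sure of that before OpenDLM starts.

```shell
# Run a command, or just print it when DRY_RUN is set (for rehearsal).
run() {
	if [ -n "${DRY_RUN:-}" ]; then
		echo "$*"
	else
		"$@"
	fi
}

# Bring up OpenGFS on this node.
# $1 = filesystem device, $2 = mount point
# Set USE_CCM to a non-empty value if you use ccm for OpenDLM membership.
ogfs_startup() {
	run /etc/init.d/heartbeat start || return 1
	if [ -n "${USE_CCM:-}" ]; then
		if [ -n "${DRY_RUN:-}" ]; then
			echo "/usr/lib/heartbeat/ccm &"
		else
			/usr/lib/heartbeat/ccm &
		fi
	fi
	run /usr/local/sbin/dlmdu -C /etc/dlm.conf &&
	run modprobe ogfs &&
	run modprobe libdlmk &&
	run modprobe opendlm &&
	run mount -t ogfs "$1" "$2" -o lockproto=opendlm
}

# Example:
#   DRY_RUN=1 USE_CCM=1 ogfs_startup /dev/sdx2 /ogfs   # print only
#   USE_CCM=1 ogfs_startup /dev/sdx2 /ogfs             # run (as root)
```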


That's it, you are done!
You should now be able to use this like any other filesystem.

Copyright 2002-2004 The OpenGFS Project