26 May 2011

Replicating a ZFS FS Between Hosts

Occasionally, it can be useful to fully replicate a volume of data from
one host to another.  Perhaps you need to mirror a production filesystem
(FS) for use in development, need a sane backup, etc.  The following
describes one way of doing this using ZFS.  Additionally, we're going
to do so across disparate OSes (though the same OS on both ends would
work just as well).  Here are the setup details:
        HOSTS:          berkeley, sunspot
        PROMPT:         host [0]
        OSes:           FreeBSD 8.1 (berkeley)
                        Solaris 10 u8 (sunspot)
        FREEBSD DISK:   da1
        FREEBSD POOL:   mypool01
        SOLARIS DISK:   c1t1d0
        SOLARIS POOL:   bsdpool01
The following assumes no ZPools or ZFS FSes are set up yet, though the
pertinent details are the same either way.  Also, while the end result
is the replication of an FS, there's some discussion of ZFS snapshots
along the way.  Now, without further ado, on berkeley (our FreeBSD host),
we check the installed disks, looking for our new 512 MB disk.  After
noting that it's "da1", we create ZPool "mypool01" and FS "mypool01/configs"
and follow up with a check of their versions:
        berkeley [0] /usr/bin/egrep '^(ad|da|fla|aacd|mlxd|amrd|idad|twed)' /var/run/dmesg.boot
        da0 at mpt0 bus 0 scbus0 target 0 lun 0
        da0: <VBOX HARDDISK 1.0> Fixed Direct Access SCSI-5 device
        da0: 300.000MB/s transfers
        da0: Command Queueing enabled
        da0: 8192MB (16777216 512 byte sectors: 255H 63S/T 1044C)
        da1 at mpt0 bus 0 scbus0 target 1 lun 0
        da1: <VBOX HARDDISK 1.0> Fixed Direct Access SCSI-5 device
        da1: 300.000MB/s transfers
        da1: Command Queueing enabled
        da1: 512MB (1048576 512 byte sectors: 64H 32S/T 512C)
        berkeley [0] /sbin/zpool create mypool01 da1
        berkeley [0] zpool list
        NAME       SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
        mypool01   504M  73.5K   504M     0%  ONLINE  -
        berkeley [0] /sbin/zpool status
          pool: mypool01
         state: ONLINE
         scrub: none requested
        config:

                NAME        STATE     READ WRITE CKSUM
                mypool01    ONLINE       0     0     0
                  da1       ONLINE       0     0     0

        errors: No known data errors
        berkeley [0] /sbin/zfs create mypool01/configs
        berkeley [0] /sbin/zpool get all mypool01 | /usr/bin/grep version
        mypool01  version        14          default
        berkeley [0] /sbin/zfs get all mypool01 | /usr/bin/grep version
        mypool01  version               3                      -
        berkeley [0] /sbin/zfs get all mypool01/configs | /usr/bin/grep version
        mypool01/configs  version               3                      -
A quick 'df' to ensure that the volume is online before copying in some
data, a check of the data, and a subsequent 'df' for the space now used:
        berkeley [0] /bin/df -h
        Filesystem          Size    Used   Avail Capacity  Mounted on
        /dev/da0s1a         5.8G    1.2G    4.1G    22%    /
        devfs               1.0K    1.0K      0B   100%    /dev
        /dev/da0s1d         997M    584K    916M     0%    /var
        mypool01            472M     21K    472M     0%    /mypool01
        mypool01/configs    472M     18K    472M     0%    /mypool01/configs
        berkeley [0] /bin/cp -r /usr/local/etc/. /mypool01/configs/local_etc
        berkeley [0] /usr/bin/find /mypool01/configs -print | /usr/bin/head -5
        /mypool01/configs
        /mypool01/configs/local_etc
        /mypool01/configs/local_etc/rc.d
        /mypool01/configs/local_etc/rc.d/sshd
        /mypool01/configs/local_etc/rc.d/fsck
        berkeley [0] /usr/bin/find /mypool01/configs -print | /usr/bin/wc -l
             153
        berkeley [0] /bin/df -h /mypool01/configs
        Filesystem          Size    Used   Avail Capacity  Mounted on
        mypool01/configs    472M    303K    472M     0%    /mypool01/configs
At this point, we take a ZFS snapshot of "mypool01/configs", establishing
a point-in-time data set that we could later use to roll back the FS, to
create a new FS from, for backup purposes, and so on (the first two uses
are sketched a bit further below):
        berkeley [0] /sbin/zfs snapshot mypool01/configs@20:24
        berkeley [0] /sbin/zfs list -t snapshot
        NAME                     USED  AVAIL  REFER  MOUNTPOINT
        mypool01/configs@20:24      0      -   302K  -
        berkeley [0] /sbin/zfs get all mypool01/configs@20:24
        NAME                    PROPERTY              VALUE                  SOURCE
        mypool01/configs@20:24  type                  snapshot               -
        mypool01/configs@20:24  creation              Thu May 26 20:24 2011  -
        mypool01/configs@20:24  used                  0                      -
        mypool01/configs@20:24  referenced            302K                   -
        mypool01/configs@20:24  compressratio         1.00x                  -
        mypool01/configs@20:24  devices               on                     default
        mypool01/configs@20:24  exec                  on                     default
        mypool01/configs@20:24  setuid                on                     default
        mypool01/configs@20:24  shareiscsi            off                    default
        mypool01/configs@20:24  xattr                 on                     default
        mypool01/configs@20:24  version               3                      -
        mypool01/configs@20:24  utf8only              off                    -
        mypool01/configs@20:24  normalization         none                   -
        mypool01/configs@20:24  casesensitivity       sensitive              -
        mypool01/configs@20:24  nbmand                off                    default
        mypool01/configs@20:24  primarycache          all                    default
        mypool01/configs@20:24  secondarycache        all                    default
        berkeley [0] /sbin/zfs get all mypool01/configs | /usr/bin/grep snapshot
        mypool01/configs  usedbysnapshots       0                      -
With the snapshot created above, we see that it currently uses no space
in ZPool "mypool01" and that it's a "version 3" snapshot referencing
about 302 KB of space for FS "mypool01/configs".  Below, we copy more
data into FS "mypool01/configs", check the space used via 'df' and
'zfs list', and now see that our snapshot uses 21 KB of space.  (As the
active data set changes, a snapshot consumes more space):
        berkeley [0] /bin/cp -r /etc/. /mypool01/configs/etc
        berkeley [0] /usr/bin/find /mypool01/configs -print |  /usr/bin/wc -l
             535
        berkeley [0] /bin/df -h /mypool01/configs
        Filesystem          Size    Used   Avail Capacity  Mounted on
        mypool01/configs    472M    2.1M    470M     0%    /mypool01/configs
        berkeley [0] /sbin/zfs list mypool01/configs
        NAME               USED  AVAIL  REFER  MOUNTPOINT
        mypool01/configs  2.08M   470M  2.06M  /mypool01/configs
        berkeley [0] /sbin/zfs get all mypool01/configs | /usr/bin/grep snapshot
        mypool01/configs  usedbysnapshots       21K                    -
        berkeley [0] /sbin/zfs get all mypool01/configs@20:24 | /usr/bin/grep used
        mypool01/configs@20:24  used                  21K                    -
        berkeley [0] /sbin/zfs list -t all -r mypool01
        NAME                     USED  AVAIL  REFER  MOUNTPOINT
        mypool01                2.18M   470M    21K  /mypool01
        mypool01/configs        2.08M   470M  2.06M  /mypool01/configs
        mypool01/configs@20:24    21K      -   302K  -
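As a point of reference, here's roughly what the other snapshot uses
mentioned earlier might look like.  This is a sketch only (not run in
this session), and "mypool01/configs_dev" is just a hypothetical name
for a clone:
        # roll the FS back to the snapshot, discarding changes made after 20:24
        /sbin/zfs rollback mypool01/configs@20:24
        # or create a new, writable FS (a clone) backed by the snapshot
        /sbin/zfs clone mypool01/configs@20:24 mypool01/configs_dev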
Of note, ZFS snapshots can be accessed under the FS mount point in
".zfs/snapshot".  Seen below are details of snapshot "mypool01/configs@20:24".
The snapshot appears as a directory named after whatever snapshot name
you chose, so for ours it is "20:24".  The files inside the snapshot
directory are not hardlinks to the live files, but references to the data
as it existed at snapshot time.  They display the same inode number as
their counterparts in the active data set, even though the contents may
have since changed there, as seen in the inode listings for "tmp" under
"local_etc/rc.d":
        berkeley [0] /bin/ls -l /mypool01/configs/.zfs/snapshot
        total 2
        drwxr-xr-x  3 root  wheel  3 May 26 20:21 20:24
        berkeley [0] /bin/ls -l /mypool01/configs/.zfs/snapshot/20:24/
        total 2
        drwxr-xr-x  4 root  wheel  4 May 26 20:21 local_etc
        berkeley [0] /bin/ls -lid /mypool01/configs/.zfs/snapshot/20:24/local_etc/rc.d/tmp
        902 -r-xr-xr-x  1 root  wheel  2173 May 26 20:23 /mypool01/configs/.zfs/snapshot/20:24/local_etc/rc.d/tmp
        berkeley [0] /bin/ls -lid /mypool01/configs/local_etc/rc.d/tmp
        902 -r-xr-xr-x  1 root  wheel  2181 May 26 20:25 /mypool01/configs/local_etc/rc.d/tmp
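Because the snapshot is browsable like any other directory, restoring a
single file is just a copy back into the active FS.  A sketch (not run
in this session), using the "tmp" script above:
        # pull the snapshot's copy of 'tmp' back over the live, modified copy
        /bin/cp /mypool01/configs/.zfs/snapshot/20:24/local_etc/rc.d/tmp \
            /mypool01/configs/local_etc/rc.d/tmp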
With our snapshot created, let's move over to our secondary host,
sunspot (Solaris).  Below, we use 'format' to see what disks are
available, then use "c1t1d0" to create ZPool "bsdpool01":
        sunspot [0] echo | /usr/sbin/format
        Searching for disks...done


        AVAILABLE DISK SELECTIONS:
               0. c1t0d0 <DEFAULT cyl 4092 alt 2 hd 128 sec 32>
                  /pci@0,0/pci1000,8000@16/sd@0,0
               1. c1t1d0 <DEFAULT cyl 509 alt 2 hd 64 sec 32>
                  /pci@0,0/pci1000,8000@16/sd@1,0
        Specify disk (enter its number): Specify disk (enter its number):
        sunspot [1] /usr/sbin/zpool create bsdpool01 c1t1d0
        sunspot [0] echo "Just a place to keep bsd data" >> /bsdpool01/README
        sunspot [0] /bin/ls -li /bsdpool01/README
                 6 -rw-r--r--   1 root     root          30 May 26 20:56 /bsdpool01/README
        sunspot [0]
Now that we have somewhere to receive the snapshot from berkeley, we
use 'zfs send' on berkeley piped through 'ssh', which runs 'zfs recv'
on sunspot to transfer and import the snapshot:
        berkeley [0] zfs send mypool01/configs@20:24 |
        > /usr/bin/ssh -l root sunspot-int /usr/sbin/zfs recv bsdpool01/configs
        Password:
        berkeley [0] 
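As an aside, once an initial full stream like this has been received,
later syncs would typically use an incremental send, shipping only the
changes made since a snapshot both sides already have.  Roughly, assuming
a newer snapshot "@20:30" had been taken on berkeley (sketch only, not
run in this session):
        # send only the delta between @20:24 and @20:30 over to sunspot
        /sbin/zfs send -i mypool01/configs@20:24 mypool01/configs@20:30 |
            /usr/bin/ssh -l root sunspot-int /usr/sbin/zfs recv bsdpool01/configs
        # (the receiving FS must be unchanged since @20:24, or 'zfs recv -F' is needed)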
Once the snapshot has been transferred to sunspot, it is incorporated
into ZPool "bsdpool01" as "bsdpool01/configs".  It's worth noting that,
had ZPool "bsdpool01" existed on berkeley, we could just as easily have
dropped the 'ssh' and used 'zfs send / recv' locally to recreate the
"mypool01/configs" FS in that pool (a sketch of this follows the next
listing).  Below, we check out "bsdpool01" on sunspot via 'df' and see
that it's mounted.  We can also see that the receive retained the
original snapshot:
        sunspot [0] /bin/df -h | grep bsdpool01
        bsdpool01              464M    22K   464M     1%    /bsdpool01
        bsdpool01/configs      464M   303K   464M     1%    /bsdpool01/configs
        sunspot [0] /usr/sbin/zfs list -t all -r bsdpool01
        NAME                      USED  AVAIL  REFER  MOUNTPOINT
        bsdpool01                 388K   464M  22.5K  /bsdpool01
        bsdpool01/configs         304K   464M   304K  /bsdpool01/configs
        bsdpool01/configs@20:24      0      -   304K  -
        sunspot [0] /usr/sbin/zpool get all bsdpool01 | /bin/grep version
        bsdpool01  version        15          default
        sunspot [0] /usr/sbin/zfs get all bsdpool01 | /bin/grep version
        bsdpool01  version               4                      -
        sunspot [0] /usr/sbin/zfs get all bsdpool01/configs | /bin/grep version
        bsdpool01/configs  version               3                      -
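For reference, the local variant just mentioned, with no 'ssh' in the
pipeline, would look something like the following on berkeley, assuming
a second pool existed there to receive into (sketch only, not run in
this session):
        # replicate the snapshot into another pool on the same host
        /sbin/zfs send mypool01/configs@20:24 | /sbin/zfs recv bsdpool01/configs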
In the 'zpool' and 'zfs' output above, we additionally see that ZPool
"bsdpool01" is at pool version 15 and its root FS at version 4, whereas
on berkeley, ZPool "mypool01" was at version 14 and its FS at version 3.
On both hosts, the "configs" FS is version 3; the receive preserved the
version it had on berkeley.  Below, we simply verify that we have the
same files as seen on berkeley, and also validate the md5 checksum of a
file on both sunspot and berkeley:
        sunspot [0] /usr/bin/find /bsdpool01/configs -print | head -5
        /bsdpool01/configs
        /bsdpool01/configs/local_etc
        /bsdpool01/configs/local_etc/rc.d
        /bsdpool01/configs/local_etc/rc.d/sshd
        /bsdpool01/configs/local_etc/rc.d/fsck
        sunspot [0] /usr/bin/find /bsdpool01/configs -print | /usr/bin/wc -l
             153
        sunspot [0] /usr/sfw/bin/openssl md5 /bsdpool01/configs/local_etc/rc.d/sshd
        MD5(/bsdpool01/configs/local_etc/rc.d/sshd)= abcaf51cdbd8fb8b84dcd56e4ca56279
        berkeley [0] /usr/bin/openssl md5 /mypool01/configs/local_etc/rc.d/sshd
        MD5(/mypool01/configs/local_etc/rc.d/sshd)= abcaf51cdbd8fb8b84dcd56e4ca56279
This is very cool: not only can we replicate an FS between hosts, but
we can do so between two different OSes quite simply.  Our work could
be considered done, but where's the fun in that?  On sunspot, below,
we upgrade FS "bsdpool01/configs" to version 4, even though ZFS on
berkeley only supports up to FS version 3.  After upgrading the FS, we
destroy the original snapshot and create a new one on the now version 4
FS:
        sunspot [0] /usr/sbin/zfs upgrade bsdpool01/configs
        1 filesystems upgraded
        sunspot [0] /usr/sbin/zfs get all bsdpool01/configs | /bin/grep version
        bsdpool01/configs  version               4                      -
        sunspot [0] /usr/sbin/zfs destroy bsdpool01/configs@20:24
        sunspot [0] /usr/sbin/zfs snapshot bsdpool01/configs@21:11
        sunspot [0] /usr/sbin/zfs list -t all -r bsdpool01
        NAME                      USED  AVAIL  REFER  MOUNTPOINT
        bsdpool01                 390K   464M  23.5K  /bsdpool01
        bsdpool01/configs         304K   464M   304K  /bsdpool01/configs
        bsdpool01/configs@21:11      0      -   304K  -
From here, we send the new snapshot back to berkeley, only to see there
are issues:
        sunspot [0] /usr/sbin/zfs send bsdpool01/configs@21:11 |
        > /usr/bin/ssh -l root berkeley-int /sbin/zfs recv mypool01/newconfigs
        Password:
        cannot mount 'mypool01/newconfigs': Operation not supported
        sunspot [1]
On berkeley, 'zfs list' shows the new FS as "mypool01/newconfigs", even
consuming space, but we cannot query its FS version, and 'ls' shows an
empty filesystem:
        berkeley [0] /sbin/zfs list -t all -r mypool01
        NAME                        USED  AVAIL  REFER  MOUNTPOINT
        mypool01                   2.49M   470M    22K  /mypool01
        mypool01/configs           2.09M   470M  2.06M  /mypool01/configs
        mypool01/configs@20:24       30K      -   302K  -
        mypool01/newconfigs         302K   470M   302K  /mypool01/newconfigs
        mypool01/newconfigs@21:11      0      -   302K  -
        berkeley [0] /sbin/zfs get all mypool01/newconfigs | /usr/bin/grep version
        berkeley [1] /bin/ls -a /mypool01/newconfigs
        .       ..
Trying a 'zfs umount' suggests the FS isn't mounted, and trying a
'zfs mount' gives back the same error seen during the transfer from
sunspot:
        berkeley [0] /sbin/zfs umount mypool01/newconfigs
        cannot unmount 'mypool01/newconfigs': not currently mounted 
        berkeley [1] /sbin/zfs mount mypool01/newconfigs 
        cannot mount 'mypool01/newconfigs': Operation not supported
Our issue becomes somewhat apparent if we try to upgrade the FS:
        berkeley [1] /sbin/zfs upgrade mypool01/newconfigs
        mypool01/newconfigs: can not be downgraded; it is already at version 4
        0 filesystems upgraded
We upgraded the FS on sunspot to version 4, though berkeley only supports
up to version 3.  Given this, while we can transfer the snapshot over,
the resulting FS is unusable.  The takeaway is to verify, before sending,
that the receiving host supports the FS version coming from the sending
host; otherwise, the data will be inaccessible on the receiving side
(a quick check is sketched after the final listing).  To finish out, in
attempting to destroy "mypool01/newconfigs", we see that ZFS protects us
when there are dependencies (child datasets).  Simply passing '-r' to
'zfs destroy' resolves the issue, removing "mypool01/newconfigs" and its
snapshot:
        berkeley [1] /sbin/zfs destroy mypool01/newconfigs
        cannot destroy 'mypool01/newconfigs': filesystem has children
        use '-r' to destroy the following datasets:
        mypool01/newconfigs@21:11
        berkeley [1] /sbin/zfs destroy -r mypool01/newconfigs
        berkeley [0]
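As a closing note, the version mismatch above could have been caught
before sending by comparing what the receiver supports against the
sender's FS version.  A quick sketch (not from this session):
        # on berkeley (the receiver): list the FS and pool versions it supports
        /sbin/zfs upgrade -v
        /sbin/zpool upgrade -v
        # on sunspot (the sender): check the version of the FS to be sent
        /usr/sbin/zfs get -H -o value version bsdpool01/configs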