[Novalug] GFS?

Brian Steisslinger brian.steisslinger at gmail.com
Wed May 2 21:24:05 EDT 2007


Well, data replication could fill an entire book.  Assuming you don't have a
million dollars, an OC3, and significant knowledge of storage arrays and
block-based replication, I can safely assume you want to set up a cheap data
replication solution across the WAN.  So, for a bit of background information:


I generally categorize replication into three classes: block based, file
based, and application based.

Block based can be done within the storage array itself or via a host-based
replication solution.  As a block changes, the change is written to a remote
disk.
Array based replication most commonly happens across Fibre Channel with
dedicated links.  It can be synchronous or asynchronous depending on
configuration and requirements.  Array based is really good when you are
coordinating replication from multiple boxes and need to maintain write-order
consistency across LUNs in different boxes, as with a multi-tier application.
Traffic shaping on the network is also easier because you have fewer points
of replication.

Host based is cheaper, but it operates on the same principle: the host has a
pseudo device with replication enabled for it.  When you write to the disk,
the write is mirrored to the other side.  Most block based replication
requires the mirrored disk to be unmounted at the remote site while it is
being replicated to.

Most enterprise storage arrays support array based replication, and Symantec
Storage Foundation HA and Softek (IBM) TDMF are pretty common host based
block replication solutions.  On the open source side DRBD looks promising,
but I don't see any details on running it across a WAN link.
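
To make the synchronous/asynchronous distinction a bit more concrete, here is
a minimal Python sketch of a host based block mirror.  It is a toy model
only, not how DRBD or any vendor product actually works: the class name
MirroredDevice and its methods are made up for illustration, and the "remote
disk" is just a dictionary.  The one point it tries to show is that an
asynchronous mirror has to ship queued writes in their original order.

```python
from collections import deque


class MirroredDevice:
    """Toy model of a host based block mirror (hypothetical, for illustration)."""

    def __init__(self, synchronous=True):
        self.synchronous = synchronous
        self.local = {}          # block number -> data on the local disk
        self.remote = {}         # block number -> data on the remote disk
        self.pending = deque()   # writes not yet sent, kept in write order

    def write(self, block, data):
        self.local[block] = data
        if self.synchronous:
            # Synchronous: the write is not complete until the remote has it.
            self.remote[block] = data
        else:
            # Asynchronous: acknowledge now, ship later in the original order.
            self.pending.append((block, data))

    def drain(self, max_writes=None):
        """Send queued writes to the remote side, oldest first."""
        sent = 0
        while self.pending and (max_writes is None or sent < max_writes):
            block, data = self.pending.popleft()
            self.remote[block] = data
            sent += 1
        return sent


dev = MirroredDevice(synchronous=False)
dev.write(0, b"journal entry")
dev.write(7, b"data block")
dev.drain(max_writes=1)   # pretend the WAN link only kept up with one write
print(dev.remote)         # the remote has block 0 but not yet block 7
```

In synchronous mode every write pays the round trip to the other site, which
is why link latency matters so much; in asynchronous mode the remote copy
lags behind, but it is still a consistent older picture as long as the order
is preserved.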

File based replication is all host based and uses a file system driver to
replicate file changes from one server to the other.  The advantage is that
the software can transfer just the file differences, minimizing bandwidth
requirements.  The remote site can be up and operational, and even
accessible, during replication.  Replication can be synchronous or
asynchronous.  Symantec Storage Migrator and DoubleTake are the two I am most
familiar with.  Rsync works on a similar principle; it's not realtime, but it
can be used for a similar end goal.
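
A rough Python sketch of what "file differential transfer" means follows.
This is a simplified fixed-offset block comparison, not rsync's real
algorithm (rsync uses a rolling checksum so it can find matching data at any
offset), and the function names and the tiny block size are made up for the
example.  The remote side sends digests of its blocks, and the sender ships
only the blocks whose digests differ.

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose so the example is easy to follow


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def compute_delta(new_data, remote_digests, block_size=BLOCK_SIZE):
    """Return {block index: bytes} for blocks the remote lacks or has stale."""
    delta = {}
    for index, digest in enumerate(block_digests(new_data, block_size)):
        if index >= len(remote_digests) or remote_digests[index] != digest:
            start = index * block_size
            delta[index] = new_data[start:start + block_size]
    return delta


def apply_delta(old_data, delta, new_length, block_size=BLOCK_SIZE):
    """Rebuild the new file on the remote side from its old copy plus the delta."""
    # Start from the old copy, truncated or zero-padded to the new length.
    result = bytearray(old_data[:new_length].ljust(new_length, b"\0"))
    for index, chunk in delta.items():
        start = index * block_size
        result[start:start + len(chunk)] = chunk
    return bytes(result)


old = b"the quick brown fox!"
new = b"the quick green fox!"
delta = compute_delta(new, block_digests(old))
print(sorted(delta))   # indices of the blocks that had to cross the WAN
assert apply_delta(old, delta, len(new)) == new
```

Only the changed blocks travel over the link, which is where the bandwidth
saving comes from.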

Application based is what I call things like Oracle Data Guard, SQL Server
log shipping, and some of the new replication stuff in Exchange 2007.
Basically it just means you use a native solution specific to your
application.  RDBMSs are unusual in that most of the time just the SQL
statements or transaction log records are transferred, not the actual file or
block changes.
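
The idea of shipping a log of changes instead of file or block changes can be
sketched with Python's built-in sqlite3 module.  This is only an
illustration under simple assumptions: the Primary and Standby classes are
hypothetical, the "log" is a plain Python list of statements, and real
products ship their own redo or transaction log formats rather than anything
like this.

```python
import sqlite3


class Primary:
    """Primary database that records every change statement in a log."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.log = []  # list of (sql, params) in commit order

    def execute(self, sql, params=()):
        self.db.execute(sql, params)
        self.db.commit()
        self.log.append((sql, params))


class Standby:
    """Standby that replays shipped log entries it has not applied yet."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.applied = 0  # number of log entries already replayed

    def apply(self, log):
        for sql, params in log[self.applied:]:
            self.db.execute(sql, params)
        self.db.commit()
        self.applied = len(log)


primary, standby = Primary(), Standby()
primary.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
primary.execute("INSERT INTO orders (item) VALUES (?)", ("disk shelf",))
standby.apply(primary.log)  # "ship" the log across the WAN and replay it
print(standby.db.execute("SELECT item FROM orders").fetchall())
```

The standby only needs the log entries it has not seen yet, so the traffic
is proportional to the changes made, not to the size of the data files.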

GFS is really an example of a parallel access file system or cluster file
system, like OCFS with Oracle RAC or AdvFS on Tru64.  Typically you use this
for clusters or grid based solutions, like some of the new stuff you can do
with NFSv4.  It relies on a shared storage backend such as an iSCSI or Fibre
Channel SAN.

Hope that helps a bit.  Feel free to ask if you have any questions.
Storage replication/DR/COOP can be very tough to get a handle on, and vendors
will tell you their solution works and sell it to you even if the environment
you have won't support it.

-Brian



On 5/2/07, Smith, Michael J. <Michael.J.Smith at unisys.com> wrote:
>
> I own a couple of large-ish SANs.  There isn't a good way at the block
> level to do this cheaply (both in the money sense and in the computer
> science sense) but there are a couple of vendors who will want to sell
> you something that does it--Symantec, EMC, FalconStor, and we have an
> in-house solution.  They all cost more than your pieces of disk do, so
> they're probably not cost-effective.
>
>
>
>
>
> Michael J Smith, CISSP-ISSEP michael.j.smith at unisys.com
> CISO, Unisys Federal Service Delivery Center
> 703.579.2271 O
> 703.855.0890 C
> "Those who do not understand Unix are condemned to reinvent it, poorly."
> --Henry Spencer
>
>
>
> > -----Original Message-----
> > From: novalug-bounces at calypso.tux.org
> > [mailto: novalug-bounces at calypso.tux.org] On Behalf Of Brian
> > Steisslinger
> > Sent: Wednesday, May 02, 2007 5:36 PM
> > To: John Franklin
> > Cc: NOVALUG
> > Subject: Re: [Novalug] GFS?
> >
> > Mirroring across the WAN is bad!
> >
> > You need a file-level replication solution.
> > What is your link speed and latency? Do you need synchronous or
> > asynchronous replication? Rsync may work. Do both sides need
> > to process data simultaneously, or are you doing this for DR/COOP?
> >
> > On 5/2/07, John Franklin <franklin at elfie.org> wrote:
> > > On Wed, 2007-05-02 at 16:13 -0400, Nick Danger wrote:
> > >
> > > > I think I misunderstood the purpose of GFS :-) What I'm looking for
> > > > is to have two geographically separated NAS units. NAS units are
> > > > cheap in single form, 3 terabytes for less than 10 grand.  The
> > > > question is, how can I mirror the two file systems for failover? I
> > > > know how to do it at the application/network level, just not at the
> > > > data/FS level.  I kept thinking GFS but that seems more like making
> > > > lots of disks appear as one, not for mirroring. Unless I'm reading
> > > > it wrong.
> > > >
> > > > So, pointers? Links? Case studies? I'll summarize what I find and
> > > > send it back out to the list.
> > > >
> > >
> > >
> > > If the NAS boxes support iSCSI, you can set up a software RAID1 with
> > > them.  Make NAS1 and NAS2 two iSCSI targets that map to /dev/sda and
> > > /dev/sdb, then use md.conf to connect them. That said, I have no idea
> > > how fault-tolerant software RAID is, nor how much the lag between the
> > > two would affect performance on a day-to-day basis.
> > >
> > > jf
> > > --
> > > John Franklin <franklin at elfie.org>
> > >