[Ma-linux] alternatives to NFS
Jay Berkenbilt
ejb at ql.org
Thu Jan 10 17:40:08 EST 2008
I'm curious to find out what others are doing to support network file
sharing in a medium to large scale Linux/UNIX environment. I'm
particularly interested in solutions that result in a uniform file
namespace across multiple systems. That is, one user should be able
to log into any Linux (or even other UNIX) workstation and see the
same files in the same place, transparently, regardless of whether the
files are local or remote. The solution does not have to be low cost
or use free software. Though I always prefer such solutions, in this
instance performance trumps cost and licensing.
The "obvious" way to do this is with the NFS automounter. I've used
the automounter in various flavors since 1991. It's relatively simple
to meet the above-stated requirements using the NFS automounter, but
there is one significant problem with this approach: it involves using
NFS. (If you disagree that this is a problem, I'd like to hear that
too!) NFS, in addition to actually standing for "No File Security",
doesn't scale very well. In an environment with hundreds of users,
I haven't found that any amount of optimization or tuning can really
result in acceptable performance. NFS has a number of reliability
issues as well, but this isn't really a treatise about NFS. If
someone has successfully deployed NFS in an environment with more than
about 100 users trying to access the same set of files or files from
the same server, I'd be interested to hear how you did it. Even with
optimal performance (one client, one server, no other network
traffic), the fastest networks are not as fast as local disk. NFS may
be fine for basic office tasks, but it's not really a good platform
for doing software builds or other heavily I/O-intensive tasks, as its
overhead is pretty high. (Writing a file over NFS even under optimal
conditions is slower than transferring the file with a very
low-overhead protocol like FTP. Years ago, NFS took about three times
as long. I haven't measured it recently. I'm sure it's better than
that now, but
it's still not great as far as I know.)
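
A minimal sketch of how the write comparison above could be re-measured, assuming Python is available on the client. The function name, the 8 MB file size, and the directories passed on the command line are illustrative choices, not anything from an existing tool. The timing includes fsync and close, because that is where a network file system typically has to push the data to the server.

```python
import os
import sys
import tempfile
import time


def time_write(directory, size_bytes=8 * 1024 * 1024, chunk_bytes=64 * 1024):
    """Write size_bytes to a scratch file in directory.

    Returns (elapsed_seconds, megabytes_per_second). The clock stops only
    after the file has been fsync'ed and closed.
    """
    chunk = b"\0" * chunk_bytes
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            written = 0
            while written < size_bytes:
                n = min(chunk_bytes, size_bytes - written)
                f.write(chunk[:n])
                written += n
            f.flush()
            os.fsync(f.fileno())
        # Leaving the with-block closes the file, so close time is included.
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)
    return elapsed, (size_bytes / (1024 * 1024)) / elapsed


if __name__ == "__main__":
    # Pass one directory per file system to compare, e.g. a local
    # directory and a directory on an NFS mount.
    for directory in sys.argv[1:]:
        seconds, rate = time_write(directory)
        print(f"{directory}: {seconds:.3f} s, {rate:.1f} MB/s")
```

A single run like this is noisy. Repeating it several times per directory, and at different times of day, gives a fairer picture than one number.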
I'm not really sure how SANs change this picture. It seems to me that
there's no escaping that the network is going to be a bottleneck, but
perhaps a well-configured SAN can handle more simultaneous users a
little better.
Another solution that springs to mind is AFS. I used AFS back in its
infancy in the late '80s, and I think it's a great file system. Is
anyone out there using AFS in a production environment? AFS, with its
replication capabilities and local caching, is a big win for "read
mostly" file systems. Does anyone have any recent experience with
using it for operations that require a lot of writes such as software
development? With a sufficiently large cache, how does writing to AFS
compare to writing to local disk? How much does the network speed
matter? By default, AFS saves files synchronously to the server on
close. (You can change this, but doing so is inadvisable when data
integrity is a concern.) This means that for operations like
software compilation, writes will still be bound by the speed of the
network, but reads may be faster because of the local cache.
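
The reasoning above can be turned into a back-of-the-envelope estimate. This is only a sketch under simple assumptions: reads that hit the local cache run at local disk speed, cache misses and all writes run at network speed, and nothing overlaps. The function name and every number in the example are hypothetical, not measurements of AFS.

```python
def estimate_io_seconds(read_mb, write_mb, cache_hit_ratio,
                        local_mb_per_s, network_mb_per_s):
    """Rough I/O time for a job on a caching network file system.

    Assumes cached reads cost local-disk time, while cache misses and
    all writes (stored to the server on close) cost network time.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    cached_reads = read_mb * cache_hit_ratio / local_mb_per_s
    missed_reads = read_mb * (1.0 - cache_hit_ratio) / network_mb_per_s
    writes = write_mb / network_mb_per_s
    return cached_reads + missed_reads + writes


if __name__ == "__main__":
    # Hypothetical build: 1000 MB read, 200 MB written,
    # 100 MB/s local disk, 10 MB/s effective network throughput.
    for hit_ratio in (0.0, 0.5, 1.0):
        seconds = estimate_io_seconds(1000, 200, hit_ratio, 100, 10)
        print(f"hit ratio {hit_ratio:.1f}: about {seconds:.0f} s")
```

Even with a perfect cache hit ratio, the write term stays fixed at network speed in this model, which is why write-heavy work such as compilation is the case worth benchmarking.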
I know that there have been other AFS-like distributed file systems
such as DFS and Coda, but they don't seem to have the same degree of
support or ongoing maintenance that AFS does.
So far, the main options I'm considering are:
(1) sticking with NFS but coming up with some kind of replication
strategy and distribution of files across various file systems to
reduce the load on individual servers;
(2) doing some benchmarking on AFS vs. NFS to see if it's really a
better option; or
(3) coming up with some strategy where people can work mostly off
their local disks with periodic rsyncs to a central location for
backup.
It may end up being a combination of these.
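
As a sketch of the "local disk plus periodic rsync" option, the following assumes rsync is installed on the workstation and that the central host is reachable over rsync's default remote shell. The helper names, the host backup.example.org, and the paths are all hypothetical. The script would be run periodically, for example from cron.

```python
import subprocess


def build_rsync_command(local_dir, remote_target, delete=False):
    """Build an rsync command that mirrors local_dir to remote_target.

    -a preserves permissions, times, and symlinks. --delete is opt-in,
    because it removes files on the backup that are gone locally.
    """
    # A trailing slash makes rsync copy the directory's contents rather
    # than creating an extra directory level on the destination.
    source = local_dir.rstrip("/") + "/"
    command = ["rsync", "-a"]
    if delete:
        command.append("--delete")
    command += [source, remote_target]
    return command


def sync(local_dir, remote_target, delete=False):
    """Run rsync once and return its exit status (0 means success)."""
    command = build_rsync_command(local_dir, remote_target, delete)
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    # Hypothetical workstation directory and central backup location.
    status = sync("/home/alice/work", "backup.example.org:/backups/alice/")
    print("rsync exit status:", status)
```

Leaving --delete off by default means an accidental local deletion does not immediately propagate to the backup, at the cost of the backup accumulating stale files.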
In any case, if anyone out there has something a lot better than any
of this, I'd be very interested in hearing about it. Thanks.
--Jay B
--
Jay Berkenbilt <ejb at ql.org>