[Dclug] Question about Speed: data transmission vs write-to-disk
Jason
dclug at jasons.us
Mon Dec 17 15:44:49 EST 2007
On Sun, 16 Dec 2007, Tim wrote:
> Depending on your definition of RAID (originally "Redundant Array of
> Independent/Inexpensive Disks", though many people include non-redundant
> arrays in this category), it certainly can speed up writes.
>
> While your commonly used IDE/ATA, SCSI, SATA, or SAS drive interfaces
> will advertise maximum transfer rates in the hundreds of MB/sec, you'll
> find even sustained writes to a single disk are often much less than
> this. The speed of the interface is often designed to handle access to
> multiple disks in parallel with each disk being individually much slower
> than the interfaces. (This is not necessarily true for reads, since
> disks have big caches nowadays and can cough up a small portion of the
> disk quickly.)
True, but the cache is quickly overrun when doing large sequential reads.
Most RAID controllers have their own cache that tends to be significantly
larger than what's on the disks, but none of it helps for virgin reads,
where the data has never been through the cache.
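
To put rough numbers on the cache point, here is a minimal Python sketch. It
assumes a simple two-speed model: bytes already in cache come back at the
interface rate, everything else at the platter rate. The function name and
the sizes and rates (16 MB cache, 300 MB/s, 60 MB/s) are made up for
illustration, not measurements of any particular drive or controller.

```python
def effective_read_rate(read_mb, cache_mb, cache_rate, disk_rate, warm=True):
    """Average MB/s for one sequential read of read_mb megabytes.

    Illustrative model only: up to cache_mb megabytes are served at
    cache_rate if the cache is warm; the rest comes off the platters
    at disk_rate. A cold ("virgin") read gets nothing from the cache.
    """
    cached = min(read_mb, cache_mb) if warm else 0.0
    uncached = read_mb - cached
    seconds = cached / cache_rate + uncached / disk_rate
    return read_mb / seconds


if __name__ == "__main__":
    # Hypothetical figures: 16 MB cache, 300 MB/s interface, 60 MB/s media.
    for size in (8, 64, 1024, 16384):
        warm = effective_read_rate(size, 16, 300.0, 60.0)
        cold = effective_read_rate(size, 16, 300.0, 60.0, warm=False)
        print(f"{size:>6} MB read: warm {warm:6.1f} MB/s, cold {cold:6.1f} MB/s")
```

In this model the warm figure slides toward the platter rate as the read
grows past the cache size, and the cold figure never leaves it.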
> With a RAID 5 set, or something similar, you could get a speedup in
> writes if you're using 4 or more disks. However the logic is more
> complex and the write speed gains may vary greatly based on the kind of
> RAID controller you use.
True, though most RAID controllers these days have enough horsepower that
the "RAID penalty" is largely nil. If you're pushing things to the hairy
edge where you start to notice the performance hit, you should probably be
looking at a bigger system. As powerful as modern CPUs are, using software
RAID5 or RAID6 (i.e., letting your system's CPU do the RAID calculations)
will almost certainly net you a boost in disk performance over a single
drive, assuming you have enough disks. The days of dedicated parity chips
are behind us for most intents and purposes. That's where large NAS or
SAN systems are handy: put lots of disks in a place where lots of users
can get to them. Sure, a couple of users can swamp it, but at that point
it's a sizing exercise.
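
For anyone wondering what the "RAID calculations" actually are in the RAID5
case: the parity block is the byte-wise XOR of the data blocks in a stripe,
and a lost block is rebuilt by XORing the survivors with the parity. Below is
a minimal Python sketch of just that arithmetic. The function name and the
toy 8-byte blocks are made up for illustration; real implementations work on
much larger chunks with optimized code, and RAID6 adds a second, more
involved syndrome that is not shown here.

```python
def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized byte blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        if len(block) != len(out):
            raise ValueError("all blocks in a stripe must be the same size")
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


if __name__ == "__main__":
    # Three data blocks in one stripe (toy sizes, hypothetical contents).
    data = [b"AAAAAAAA", b"BBBBBBBB", b"CCCCCCCC"]
    parity = xor_blocks(data)

    # Pretend the disk holding data[1] died: rebuild it from the rest.
    rebuilt = xor_blocks([data[0], data[2], parity])
    print("rebuilt matches original:", rebuilt == data[1])
```

The per-byte work is a single XOR, which is why a general-purpose CPU can
keep up with it.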
> Also something to note is that from what I understand, the beginning of
> disks is faster than the end. The first bytes in the drive are stored
> on the outside edge of the platters which have a higher linear speed and
> therefore faster access. I've heard stories of server farms where folks
> really wanted the fastest storage possible so they'd run SCSI striped
> arrays and only partition off the first 1/4 or 1/2 of each drive for
> this purpose. (Someone correct me if this is no longer the case with
> modern drives.)
Technically it's still the case, but the performance gain is so minor
compared to the added complexity and housekeeping required that no one
does it. Pillar Data was working on this for a while, but last I'd heard
they abandoned it as too much work for too little gain.
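
The geometry behind the outer-edge argument can be sketched in a few lines of
Python. This is a back-of-the-envelope model, not a benchmark: it assumes
constant RPM, uniform areal density (so capacity is proportional to platter
area and sequential rate is proportional to track radius), and made-up
radii of 46 mm outer and 20 mm inner. Real drives use discrete zones and
will differ.

```python
import math


def radius_at_fraction(fraction, outer_mm=46.0, inner_mm=20.0):
    """Track radius reached after filling `fraction` of capacity from the
    outer edge inward, assuming capacity is proportional to platter area.
    The default radii are illustrative guesses, not a drive spec."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    outer_sq = outer_mm ** 2
    inner_sq = inner_mm ** 2
    return math.sqrt(outer_sq - fraction * (outer_sq - inner_sq))


def relative_rate(fraction, outer_mm=46.0, inner_mm=20.0):
    """Sequential rate at that point relative to the outermost track,
    assuming rate scales linearly with radius at constant RPM."""
    return radius_at_fraction(fraction, outer_mm, inner_mm) / outer_mm


if __name__ == "__main__":
    for frac in (0.0, 0.25, 0.5, 1.0):
        print(f"{frac:4.0%} of capacity used: "
              f"{relative_rate(frac):.2f}x the outer-track rate")
```

Whatever the model says about raw rates, it says nothing about the extra
partitioning and housekeeping, which is the cost side of the trade-off.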
-Jason
-----
--- There are no absolute statements. I'm very probably wrong. ---
"The difference between genius and stupidity is that genius has its limits."
- Albert Einstein