Author Topic: Hard Drive Reliability  (Read 21869 times)
mtselman
Newbie
Posts: 29
« Reply #40 on: January 30, 2007, 08:24:36 PM »

Quote
They say that misery loves company so just as a note/warning to those using these large capacity LaCie drives (I have a pile of them):

The way LaCie creates these large drives is by installing two or more smaller drives and running them as RAID 0, striping data across the drives so that they appear as one drive with the combined capacity, i.e. 2x250GB drives appear to be a 500GB drive. Unfortunately this has a serious side effect: the chance of a drive or controller failure costing you data is much worse (2x or more) than with a single high-capacity drive, since either drive failing, or the controller, will leave you with lost data.

I had two of the LaCie BigDisk 500GB drives fail. As it happens I live very close to the LaCie headquarters in Oregon so I was able to get both drives replaced under warranty in person. In my case, all data was lost when the drives failed but luckily they failed at different times and were my backup mirror drives. One drive failed within a few days, the other was probably 6 months old. You can probably find posts on the internet about these drives & reliability. They are competitively priced because they are using commodity drives.

I'm sure that for every customer like me (or Ray!) there are probably thousands who never have a failure. If that's you, great! The only thing I would suggest is that you have a good backup strategy just in case. I managed to have a 50% failure rate with my 4 drives, which isn't a great statistic however you calculate it. I finally ended up investing in a 3Gb RAID 5 array & controller using the best SATA disks I can buy at this time. With the benefit of hindsight I'd have gone this route from the very beginning.

On the bright side, LaCie were very accommodating and replaced my drives without question. I still use them but I don't trust them for critical content. Twice bitten etc etc ...

LaCie BigDisks seem to get really poor customer ratings on several sites. Amazon shows many people complaining. The same on Newegg: http://www.newegg.com/Product/CustratingReview.asp?item=N82E16822154070
You may do better and cheaper with a third-party enclosure and a good hard drive.
As for RAID 0, as an IT person, I'll say that it's a totally ridiculous idea to use RAID 0 for anything but transient data that requires high-performance reads/writes. I would not use other RAID setups for a backup either, but rather duplicate onto separate drives. A RAID 1 or 5 setup does protect you from a mechanical failure of one of the drives; however, it does not protect you from user error (oops, I just deleted that folder/reformatted/etc.), does not protect you from a virus, does not let you keep drives in separate locations, and does not protect you from RAID controller failure (and if it fails, good luck recovering your data, except maybe for a RAID 1 setup).

  --Misha
Logged
Ray
Sr. Member
Posts: 8939
« Reply #41 on: January 30, 2007, 08:33:46 PM »

Quote
I've seen this.  The neat thing is that it tends to get worse as you use the DVD.  So if you see your DVD start to delaminate you want to copy that data soon because it will only get worse.

Had an entire spindle of Optorite discs do this.  You could see that some started to fail on the initial burn (around the center spindle).


I live in a humid, tropical environment. I've never had experience of this. I'm still convinced some of you guys are buying substandard rejects at either a bargain price, or a price that gives the seller a huge profit margin.

It's the responsibility of the consumer to complain like hell. If I ever bought some CDs or DVDs that started delaminating, I'd complain like hell. But if some of you guys are buying nondescript discs from nondescript suppliers who vanish by night, then I guess you have no recourse. Who do you complain to? Luminous Landscape?
Logged
Ray
Sr. Member
Posts: 8939
« Reply #42 on: January 30, 2007, 08:47:35 PM »

Quote
Two points offered only FWIW... My IT guru tells me that hard drive failure is more apt to occur in the first 10 hours of run on a new drive than the next 10,000, so it's always a good idea to stress drives a bit before using them for critical data (he has a program he runs that exercises the drives overnight).

Jack,
I used this drive for many more than 10 hours over a 3 week period, and even occasionally left it on all night. It doesn't switch off automatically with the computer.

Quote
Also, I understand regular DVD's (the $30 per spindle of 50 type) are not considered archival as gold DVD's are, which run about $3.00 each and the number I used for my cost comparison.  I think it has to do with the fact DVD/CD can delaminate and moisture can then enter and oxidise the aluminum foil rendering it unreadable, coupled with the cheaper dye layer being able to run if exposed to excessive heat which also will corrupt the read.

I have no CDs or DVDs, ranging from the cheapest I could buy (from a reputable store) to Kodak Gold, that have delaminated or given any trouble at all that wasn't related to a substandard CD-ROM drive.

If I was in the business of producing recordable CD/DVDs and the public became aware of a number of failures in the medium that were highly exaggerated by the press, I would certainly think up some bullshit about increased protective layers or gold plating, and charge a premium for it. Wouldn't you? This is all politics... sorry! politics and business.
« Last Edit: January 30, 2007, 08:51:09 PM by Ray » Logged
Jack Flesher
Sr. Member
Posts: 2595
« Reply #43 on: January 30, 2007, 08:50:31 PM »

Quote
It's the responsibility of the consumer to complain like hell.

It isn't an isolated problem, but endemic; it's simply the nature of the beast. And from what I understand, it usually takes time for the delamination to occur -- I've heard 7 years is considered the modal life-span for non-archival DVD (compared to 25 years for archival gold). What is the "warranty" period on the DVD's you are buying? Regardless of what it is, what do they do if a disc fails during the warranty period? Probably replace the disc, and not likely the data on it. So when it happens, for sure complain away -- but I doubt that will result in your lost data being replaced.

Cheers,
Logged

Ray
Sr. Member
Posts: 8939
« Reply #44 on: January 30, 2007, 08:58:18 PM »

Quote
It isn't an isolated problem, but endemic; it's simply the nature of the beast. And from what I understand, it usually takes time for the delamination to occur -- I've heard 7 years is considered the modal life-span for non-archival DVD (compared to 25 years for archival gold). What is the "warranty" period on the DVD's you are buying? Regardless of what it is, what do they do if a disc fails during the warranty period? Probably replace the disc, and not likely the data on it. So when it happens, for sure complain away -- but I doubt that will result in your lost data being replaced.

Cheers,

You're not suggesting, Jack, are you, that one day I might wake up and find that 140GB of DVD storage has become corrupted? I expect if I'm still contributing to this site in 10 years' time, I might start a new thread: "Hey, I've come across a 22 year old CD that's unreadable. I've tried every drive and every data recovery program but to no avail. I guess Jack Flesher was right after all. Serves me right!"

I should add that in 10 years time it's highly unlikely I shall have valued images that are only archived on CD. In 10 years time they will probably all have been transferred to the successor of Blu-ray. But it will certainly be interesting to see just how long these early CD recordings last.
« Last Edit: January 30, 2007, 09:16:25 PM by Ray » Logged
Ray
Sr. Member
Posts: 8939
« Reply #45 on: January 30, 2007, 09:45:28 PM »

I just had further thoughts on this issue, but my internet connection was severed at the time I posted and I lost my post.

But, not to be deterred, here it is.

I'm deeply worried about accountability in America. Last night I watched a program on SBS in Australia (Special Broadcasting Services which is usually hard hitting), which outlined the total fiasco with regard to the Iraqi reconstruction fund. Here you have a case of billions of dollars being misspent, misappropriated and plain stolen.

There appears to be an enormous rip-off that has taken place.

When I see reports on this site of CDs and DVDs suffering from 'bit rot', what am I to think? I've not experienced it in Australia. Is America the recipient of all scams, I wonder.
Logged
jani
Sr. Member
Posts: 1604
« Reply #46 on: January 31, 2007, 04:01:45 AM »

Quote
I live in a humid, tropical environment.
Well, that certainly increases the risk of HDD (hard disk drive) failure.

I've stated my opinions regarding backups before, and I'll only reiterate one point:

If anyone chooses to go with HDD-based backups, keep in mind that when one of your HDDs fails, you'll have a time period when the duplicate is at risk without redundancy, unless you have triplicates instead of duplicates.

As an IT professional, such temporary lack of redundancy makes me more than a bit queasy.

So if someone is going for purely HDD-based backups, use triplicates, or at the very least make sure that you have the replacement drive(s) ready for data transfer at the time of failure, not later. Sod's Law implies that the second drive will fail within a short time of the first, especially if you don't have a backup of the second, too.
Logged

Jan
Jonathan Wienke
Sr. Member
Posts: 5759
« Reply #47 on: January 31, 2007, 10:05:40 AM »

RAID 0 2-drive arrays do not double the risk of failure, they square it. So using them for archival purposes is dubious at best. RAID 5 is slower, but the stability and fail-safeness are well worth the hassle of waiting a few extra seconds for a file to save to the array.
Logged

mtselman
Newbie
Posts: 29
« Reply #48 on: January 31, 2007, 11:51:46 AM »

Quote
RAID 0 2-drive arrays do not double the risk of failure, they square it. So using them for archival purposes is dubious at best. RAID 5 is slower, but the stability and fail-safeness are well worth the hassle of waiting a few extra seconds for a file to save to the array.
Jonathan, with all due respect, the risk of failure for a 2-disk RAID 0 is double (almost), not square (if by risk you mean probability of failure).
Assume x is the probability of failure of one disk in the next year.
Then the probability of a disk working properly for a year is (1-x). The probability of two disks working properly for a year would then be (1-x)(1-x).
Then the probability of failure of a 2-disk RAID 0 is:
 1-(1-x)(1-x) = 1 - 1 + 2x - x^2 = 2x - x^2. As x is a small value, much less than 1, the 2x term in this formula dominates and x^2 is of a smaller order of magnitude.
So, if x = 0.05 (i.e. a probability of failure of 5%), then the probability of failure of RAID 0 is 2x - x^2 = 0.10 - 0.0025 = 0.0975 (i.e. just below 10%)
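
For anyone who would rather check this numerically than algebraically, here is a minimal Python sketch. It assumes the two disks fail independently with the same yearly probability x; the function name is just illustrative.

```python
def two_disk_raid0_failure(x: float) -> float:
    """Chance that at least one of two independent disks fails.

    x is the assumed yearly failure probability of a single disk.
    """
    survive_both = (1.0 - x) * (1.0 - x)
    return 1.0 - survive_both


for x in (0.01, 0.05, 0.10):
    p = two_disk_raid0_failure(x)
    expanded = 2 * x - x * x  # the expanded form, 2x - x^2
    # The ratio p / x stays a little under 2 for small x.
    print(f"x={x:.2f}  array={p:.4f}  2x-x^2={expanded:.4f}  ratio={p / x:.2f}")
```

The ratio column is the point of the exercise: for small x the array risk is close to twice the single-disk risk, and the x^2 term only shaves a little off.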

  --Misha

PS. As for your point regarding RAID 5, agree about reliability and stability, but look at my post a few posts above regarding general risks involved in relying on RAID setups. RAID has its benefits but I do not see them in the back-up/offline storage domain.
Logged
feppe
Sr. Member
Posts: 2909
« Reply #49 on: January 31, 2007, 12:25:07 PM »

Quote
Well, that certainly increases the risk of HDD (hard disk drive) failure.

I've stated my opinions regarding backups before, and I'll only reiterate one point:

If anyone chooses to go with HDD-based backups, keep in mind that when one of your HDDs fails, you'll have a time period when the duplicate is at risk without redundancy, unless you have triplicates instead of duplicates.

As an IT professional, such temporary lack of redundancy makes me more than a bit queasy.

So if someone is going for purely HDD-based backups, use triplicates, or at the very least make sure that you have the replacement drive(s) ready for data transfer at the time of failure, not later. Sod's Law implies that the second drive will fail within a short time of the first, especially if you don't have a backup of the second, too.

This is not unique to HDDs, or to any backup media. Quite the contrary, actually. I'd argue this issue is even more prevalent for DVDs/CD-ROMs, as people tend to burn a DVD, tuck it in a pouch and forget it. If they need that backup years later they might see bitrot has taken its toll - this might be the case even with redundant DVD backups.

With HDDs you are generally copying data to the same HDD, thus ensuring its integrity every time you do so. How many people backing up on DVDs check their integrity periodically?
« Last Edit: January 31, 2007, 12:26:07 PM by feppe » Logged

Jonathan Wienke
Sr. Member
Posts: 5759
« Reply #50 on: January 31, 2007, 12:50:20 PM »

Quote
So, if x = 0.05 (i.e. a probability of failure of 5%), then the probability of failure of RAID 0 is 2x - x^2 = 0.10 - 0.0025 = 0.0975 (i.e. just below 10%)

  --Misha

Sorry, not so, your formula is wrong. The correct formula is (1-x)^(number of drives). Let's say a drive has a 5% probability of failure in a given year. That means it has a 95% chance of operating failure-free. If you have 2 drives in a RAID0 array, the probability of the array NOT experiencing a drive failure in the next year is (.95 * .95), or 90.25%, which means that your odds of 100% data loss have jumped from 5% to 9.75%. Adding a third drive increases your failure odds to 14.26%, and so on.

On the other hand, in a RAID5 configuration, the likelihood of total data loss due to drive failure is x ^ (# of simultaneous failed drives required to fail the array, usually 2) * (repair time factor). The repair time factor is the time interval between a drive fault and the replacement of the defective drive and the rebuild of the array, expressed as a fraction of a year. If that time interval is one day, then your factor would be 1/365. If it was a year, it would be 1. Given an x of 0.05, a 4-drive RAID0 array would have an 18.55% chance of total data loss in one year, while a 4-drive RAID5 array has a 0.25% chance of failure, if you don't fix it. If you repair and rebuild within 24 hours of a drive failure, the array failure probability (at least due to disk failure) drops to .000685%.
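
The figures in this post can be reproduced with a few lines of Python. This is only a sketch of the simple model described above, assuming independent drives with the same yearly failure probability x; the function names are made up for illustration, and the RAID5 estimate is the rough x^2 * (repair factor) approximation, not a full reliability model.

```python
def raid0_loss(x: float, drives: int) -> float:
    """RAID0 loses everything if any one of the drives fails."""
    return 1.0 - (1.0 - x) ** drives


def raid5_loss_simple(x: float, repair_factor: float = 1.0,
                      failures_needed: int = 2) -> float:
    """Rough estimate: x ^ (failures needed) * (repair time factor)."""
    return x ** failures_needed * repair_factor


x = 0.05  # assumed yearly failure probability of one drive
for n in (2, 3, 4):
    print(f"RAID0, {n} drives: {raid0_loss(x, n):.2%}")

print(f"RAID5, never repaired:  {raid5_loss_simple(x):.4%}")
print(f"RAID5, repaired in 24h: {raid5_loss_simple(x, 1 / 365):.6%}")
```

Changing x or the repair factor shows how quickly the RAID5 estimate improves when a failed drive is replaced promptly, which is the main argument of the post.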

Of course, there are other risk factors involved, like controller failure, human error, viruses, fire, flood, power surges, etc that raise the probability of data loss considerably. But if you back up your data to two separate RAID5 devices that are not in the same building, on different branch circuits, etc. your data is about as safe as you can get it, but is still easily accessible.
« Last Edit: January 31, 2007, 12:53:24 PM by Jonathan Wienke » Logged

mtselman
Newbie
Posts: 29
« Reply #51 on: January 31, 2007, 01:09:04 PM »

Quote
Sorry, not so, your formula is wrong. The correct formula is (1-x)^(number of drives). Let's say a drive has a 5% probability of failure in a given year. That means it has a 95% chance of operating failure-free. If you have 2 drives in a RAID0 array, the probability of the array NOT experiencing a drive failure in the next year is (.95 * .95), or 90.25%, which means that your odds of 100% data loss have jumped from 5% to 9.75%. Adding a third drive increases your failure odds to 14.26%, and so on.
........
Jonathan, you say my formula is wrong and then use exactly the same formula and arrive at exactly the same result. At least we agree in math.  As you can see above - 9.75% is exactly the number I arrived at and it is roughly double that of 5%, not square of 5%, which is what I was trying to prove since you stated:
Quote
RAID 0 2-drive arrays do not double the risk of failure, they square it.

With respect,

  --Misha
Logged
Jack Flesher
Sr. Member
Posts: 2595
« Reply #52 on: January 31, 2007, 06:40:58 PM »

Quote
Jonathan, you say my formula is wrong and then use exactly the same formula and arrive at exactly the same result. At least we agree in math.

Not to start an argument, but I think perhaps you're both wrong if you look at combinational probabilities...

So, I think the better formula would be: (n!/((k!)(n-k)!)) * (p^k) * ((1-p)^(n-k)) where n = number of drives (2), k = number that would need to fail to cause an event (1), p = probability of failure of any drive (using .05).

Sooo...  2!/((1!)(1!)) * .05^1 * (.95)^(2-1) = 2/1 * .05 * .95 = 2 * .0475 = .095, or a 9.5% probability of failure in one year.  (Okay, not appreciably different than your results, but I think more technically correct)
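
One thing worth checking here: the binomial term with k = 1 gives the chance that exactly one drive fails. A RAID 0 array is also lost when both drives fail, so the chance of at least one failure is the sum of the k = 1 and k = 2 terms. A minimal Python sketch, assuming independent drives (the function name is illustrative):

```python
from math import comb


def binomial_pmf(n: int, k: int, p: float) -> float:
    """Probability that exactly k of n independent drives fail."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


n, p = 2, 0.05
exactly_one = binomial_pmf(n, 1, p)   # one drive fails, the other survives
both = binomial_pmf(n, 2, p)          # both drives fail
at_least_one = exactly_one + both     # any failure kills a RAID 0 array

print(f"exactly one fails: {exactly_one:.4f}")
print(f"both fail:         {both:.4f}")
print(f"at least one:      {at_least_one:.4f}")
print(f"1 - (1-p)^n:       {1 - (1 - p) ** n:.4f}")
```

The last two lines agree, so the 9.5% and 9.75% figures differ only by the both-drives-fail case (0.25%).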

Ray, you're a statistics prof, right?  Maybe you can shed some light on this for us...

Cheers,
« Last Edit: January 31, 2007, 07:03:22 PM by Jack Flesher » Logged

jani
Sr. Member
Posts: 1604
« Reply #53 on: January 31, 2007, 07:17:59 PM »

Quote
This is not unique to HDDs, or to any backup media. Quite the contrary, actually.
However, when people use HDD "backups", they seem to stick to the second half of their mirror as the backup. So, essentially, you have one backup medium for your data, not two. That is, original HDD and mirror HDD. Dual DVDs means that you have the original HDD plus two DVD copies. I admit that this wasn't clear from my post.

But there's another important difference between HDDs and optical media: HDDs are more exposed to OS write errors, or problems from a faulty power supply, motherboard, controller, electrical black- or brown-outs, etc., since they're not WORM (write once, read many) media.

When I wrote that HDDs have a high probability of failing within a short time of each other, that was not a joke. This has to do with the drives having very similar runtimes and working conditions. It is also likely that the drives were purchased at the same time, and therefore from the same production lot, which increases the risk of similar production anomalies.

With WORM media, it is common for you to, well, write once, verify the integrity, and then "forget it". This increases the likelihood of detecting similar production anomalies in the media, compared to HDDs. What you cannot detect, of course, is how the media deteriorates with time; is the plastic acid free and does it seal well enough, is the dye pure and of a time-resistant type (e.g. advanced metal AZO) or not (cyanine or phthalocyanine), and so on.

Quote
I'd argue this issue is even more prevalent for DVDs/CD-ROMs, as people tend to burn a DVD, tuck it in a pouch and forget it. If they need that backup years later they might see bitrot has taken its toll - this might be the case even with redundant DVD backups.

With HDDs you are generally copying data to the same HDD, thus ensuring its integrity every time you do so. How many people backing up on DVDs check their integrity periodically?
How many people backing up to HDD check their integrity periodically? I think the proportion of people verifying their backups is similar, regardless of media.
Logged

Jan
Ray
Sr. Member
Posts: 8939
« Reply #54 on: January 31, 2007, 09:08:58 PM »

I find the issue interesting because it highlights just how difficult it is to get to the truth of a matter like this; ie, reliability of optical media versus hard drive storage.

Generally, people are only motivated to speak out when they are angry or outraged and/or feel cheated or misled in some way.

If you were to conduct a poll on such an issue; example, how many of you have experienced CD/DVD failure, how many have not etc, those who felt aggrieved would be more likely to respond to the poll, so the result would be skewed in that respect, but perhaps even more significant, there would be no way of verifying whether or not the failed optical disc had, for example, been left baking in the sun in a parked car; whether it had been heavily scratched and/or subjected to extremes of environmental conditions, or, in the event of it being just tucked away in a sleeve, whether or not it had been properly recorded in the first instance.

Nobody likes to admit they are incompetent. It's very easy to forget that, say 5 years ago a particular disc was left on a car seat all day, baking in the sun, or that, despite a verification process being used at the time of burning, the disc was perhaps still not recorded properly and such verification was not confirmed after burning by opening a few images.

It's also too easy to confuse DVD drive/software incompatibilities with physical deterioration of the disc. I've had that problem myself too often. Occasionally, a DVD disc I've recently burned on my laptop cannot be read on one of my desktop computers, or perhaps just one folder on that disc causes Adobe Bridge to 'not respond', yet the same disc is perfectly readable on another computer. I don't pretend to understand why this should happen.
Logged
Ray
Sr. Member
Posts: 8939
« Reply #55 on: January 31, 2007, 09:23:20 PM »

Quote
Ray, you're a statistics prof, right?  Maybe you can shed some light on this for us...

Cheers,

Jack,
I'm not a professor of anything. Can I therefore claim to be relatively free of brainwashing, perhaps?  

However, it seems to me it doesn't really matter which formula is more precise if the MTBF data you are using in the formulas are not precise, and as Jani has pointed out, even if the MTBF data supplied by the manufacturer was broadly correct, it wouldn't account for the changed odds due to the fact you might have 2 or more drives from the same batch, a slightly substandard batch which doesn't meet the manufacturer's published MTBF specs.
Logged
Ray
Sr. Member
Posts: 8939
« Reply #56 on: January 31, 2007, 10:29:05 PM »

Jack,
Perhaps the 3 examples of calculating the probability of a second hard drive failure could be considered as a type of formulaic pixel-peeping   .
Logged
Jonathan Wienke
Sr. Member
Posts: 5759
« Reply #57 on: February 01, 2007, 01:15:32 AM »

Quote
Jonathan, you say my formula is wrong and then use exactly the same formula and arrive at exactly the same result. At least we agree in math.  As you can see above - 9.75% is exactly the number I arrived at and it is roughly double that of 5%, not square of 5%, which is what I was trying to prove since you stated:
Quote
RAID 0 2-drive arrays do not double the risk of failure, they square it.

With respect,

  --Misha

Sorry, I'd just gotten off a 24-hour shift. You are squaring the (1-x) factor, at least for a 2-drive RAID 0 array: .95 * .95, etc. Jack, your question is valid for RAID5, since 2 drives have to fail simultaneously for the array to fail, but not for RAID0, where as soon as the first drive fails, you're screwed. Given the agreed meanings for x and n, and r as the repair/rebuild time factor, the correct formula would be:

(1-(1-x)^n) * (1-(1-x)^(n-1)) * r

When one drive has failed, there is one less drive to create an additional failure. So the correct formula for a 4-drive RAID5 with a 24-hour repair/rebuild time is .1855*.1426*(1/365), or about 0.0072%. Jani raises some valid points about simultaneous failures, but calculating the risk factors for a power surge blowing several drives in the array simultaneously, or damage due to fire, flood, earthquake, terrorism, or human stupidity is a bit harder, as the risk factors are harder to quantify and may not even be known. So the 0.0072% array failure risk is kind of a best-case scenario based on competent setup, placement, maintenance, etc.
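
Since the arithmetic here is easy to slip on by hand, a short Python sketch of the formula may help. It assumes independent drives with the same yearly failure probability x, and r is the repair window as a fraction of a year; the function and parameter names are illustrative only.

```python
def raid5_loss(x: float, n: int, r: float) -> float:
    """Estimate of RAID5 array loss from the two-step formula.

    first  = chance that any one of the n drives fails
    second = chance that any one of the remaining n-1 drives fails
    r      = repair/rebuild window as a fraction of a year
    """
    first = 1.0 - (1.0 - x) ** n
    second = 1.0 - (1.0 - x) ** (n - 1)
    return first * second * r


x, n = 0.05, 4
print(f"first failure (any of {n}):    {1 - (1 - x) ** n:.4f}")
print(f"second failure (any of {n - 1}):   {1 - (1 - x) ** (n - 1):.4f}")
print(f"array loss, 24h repair:      {raid5_loss(x, n, 1 / 365):.4%}")
print(f"array loss, never repaired:  {raid5_loss(x, n, 1.0):.4%}")
```

Printing the two factors separately makes it easy to compare each step against the hand calculation.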
« Last Edit: February 01, 2007, 01:16:38 AM by Jonathan Wienke » Logged

jani
Sr. Member
Posts: 1604
« Reply #58 on: February 01, 2007, 03:02:48 AM »

Quote
However, it seems to me it doesn't really matter which formula is more precise if the MTBF data you are using in the formulas are not precise, and as Jani has pointed out, even if the MTBF data supplied by the manufacturer was broadly correct, it wouldn't account for the changed odds due to the fact you might have 2 or more drives from the same batch, a slightly substandard batch which doesn't meet the manufacturer's published MTBF specs.
In case anyone is confused about MTBF, I think it's worth pointing out that MTBF is not about individual drive reliability.

archive.org has a copy of IBM's explanation of how IBM calculated MTBF while they manufactured HDDs

It's a pretty good read.
Logged

Jan
Ray
Sr. Member
Posts: 8939
« Reply #59 on: February 01, 2007, 05:36:31 PM »

Quote
In case anyone is confused about MTBF, I think it's worth pointing out that MTBF is not about individual drive reliability.

But surely in a sense it is. Perhaps I haven't understood the concept, so put me right if I haven't. The MTBF figures appear to be saying something like, 'If I use 100 drives under the same conditions and to the same extent for, say, half a million hours in total, then one of them can be expected to fail.'

Another manufacturer might claim, 'If you use 100 of our drives under the same conditions, then they should last in total, one million hours before you would expect a drive to fail'.

The second scenario, with a quoted MTBF of 1 million hours would imply that the individual drives must be more reliable than in the first scenario where the MTBF is only half a million hours. Is this not the case?
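
To put rough numbers on the two scenarios, here is a small Python sketch. It rests on an assumption that is not from any manufacturer's documentation quoted in this thread: that drives fail at a constant rate (the usual exponential model) and run 24 hours a day. Under that assumption an MTBF figure converts to a yearly failure probability for a population of drives; it says nothing about batch defects or wear-out.

```python
import math

HOURS_PER_YEAR = 24 * 365  # assumes the drive is powered on all year


def yearly_failure_probability(mtbf_hours: float,
                               hours: float = HOURS_PER_YEAR) -> float:
    """Failure probability over `hours`, ASSUMING a constant failure rate."""
    return 1.0 - math.exp(-hours / mtbf_hours)


for mtbf in (500_000, 1_000_000):
    p = yearly_failure_probability(mtbf)
    # Expected failures per year in a hypothetical pool of 100 such drives.
    print(f"MTBF {mtbf:>9,} h: {p:.2%} per drive-year, "
          f"about {100 * p:.1f} failures per 100 drives")
```

On this model, doubling the MTBF roughly halves the yearly failure probability, which matches the intuition in the post: a higher quoted MTBF does imply a more reliable population of drives, even though it does not predict how long any single drive will last.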
Logged