[Grml] How to use GRML to check whether a hard disk is failing

Michael Whapples mwhapples at aim.com
Wed Dec 23 23:12:46 CET 2009


Thanks to everyone who replied. It actually turned out not to be the
hard disk failing; I had misunderstood what the person meant when I
spoke to them on the phone. They are visually impaired as well and have
a software package called guide, which is designed for simplicity and to
lead them through everything they may want to do on a computer (they are
certainly not experienced with computer usage). The problem was that
guide wasn't starting (well, actually guide seems to have disappeared,
but that's another matter and I should be able to deal with that).

This and some messages I have seen on mailing lists (they tend to come
from Ubuntu users first trying Linux) make me wonder how much value
there is in making things really simple. I know that sounds
controversial, but what I mean is: is it better to teach/explain
computers fully to people than to give them something simple to use,
which makes it hard for them to deal with unexpected problems due to
their lack of computer literacy?

Maybe the above is why I like GRML: plenty of handy scripts to get
things set up (e.g. grml-x, grml-network, etc.) but it doesn't isolate
me from what is actually going on. Good work Mika and the rest of the
GRML contributors; it's probably something not said enough, mainly
because good software doesn't get in the way and so is less noticeable
to the user.

Michael Whapples
On 22/12/09 19:06, David Maus wrote:
> At Tue, 22 Dec 2009 15:33:30 +0000,
> Michael Whapples wrote:
>    
>> Hello,
>> I am wondering whether GRML can help me here. I have agreed to check a
>> computer (tomorrow) for someone as it isn't booting properly (it's a
>> Windows XP computer). By the sound of it I suspect the hard disk is
>> failing or has totally failed, or Windows has become corrupted to the
>> point it won't boot.
>>      
> IMHO the first question should be how important the data on the hard
> disk is. If the hdd sounds unhealthy, chances are good that it is a
> mechanical defect that gets worse and destroys data simply by spinning
> the disks. I personally refuse to check computers whose hard disks
> make unhealthy noises.
>
> If you decide to check it, my second step would be booting grml and
> making a backup of the drive using ddrescue.
>
> To check the hdd I would use smartmontools, which queries the internal
> log of the hard disk.
>
> smartctl -a /dev/<disk>
>
> Displays an overview of the hard disk's state. I normally check the line
>
> SMART overall-health self-assessment test result:
>
> and, among the SMART attributes,
>
>    - 196: Reallocated_Event_Count
>
>      Physically damaged sectors are reallocated; it's okay if this
>      happens sometimes, but an increasing number of reallocated
>      sectors means trouble ahead.
>
>    - 197: Current_Pending_Sector
>
>      Pending sectors are sectors that are marked for reallocation but
>      can't be reallocated for some reason.
>
> Please be aware that the attribute table is hard to interpret because
> what most of the values actually /mean/ depends on the hard disk
> manufacturer. It is for instance normal for a "Seagate Barracuda
> 7200.10 family" that the raw value of attribute 1: Raw_Read_Error_Rate
> is about 124438548 etc.
>
> It's my practical experience as a sysadmin that the attributes 196 and
> 197 are good indicators of failing hdds.
>
> You may also start an internal self-test of the hdd (smartctl -t) --
> the possible test routines depend on the hdd model but I would try a
> long selftest (smartctl -t long).
>
> As I had to debug a failing hdd recently, I can only stress that
> whatever you do, you should check the SMART values occasionally. In my
> case I noticed an increasing rate of reallocated sectors while trying
> to fix the filesystem.
>
> On the question of how to check and/or fix a broken NTFS filesystem,
> I am lost.
>
> HTH
>
>   -- David
>
>    
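
The attribute check David describes above could be automated roughly as
follows. This is only a minimal sketch, not part of GRML or smartmontools:
the function names are made up for illustration, the sample rows are
invented rather than taken from a real disk, and it assumes the usual
ten-column attribute table printed by "smartctl -A" with a plain number
in the raw value column.

```python
import subprocess

# The two attributes the message singles out as good failure indicators.
WATCHED_IDS = (196, 197)


def parse_attributes(report):
    """Map attribute ID -> (name, raw value) from `smartctl -A` style text."""
    attributes = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID; the raw value is column 10.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attributes[int(fields[0])] = (fields[1], int(fields[9]))
    return attributes


def suspicious(attributes, watched=WATCHED_IDS):
    """Return the watched attributes whose raw value is above zero."""
    return {i: attributes[i] for i in watched
            if i in attributes and attributes[i][1] > 0}


def read_report(device):
    """Run `smartctl -A` on a device such as /dev/sda (usually needs root)."""
    # smartctl's exit status is a bit mask, so a non-zero status is not
    # treated as an error here.
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return result.stdout


# Made-up sample rows for illustration only, not output from a real disk.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       124438548
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       3
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    for attr_id, (name, raw) in suspicious(parse_attributes(SAMPLE)).items():
        print(f"{attr_id} {name}: raw value {raw}")
```

Flagging any raw value above zero is a simplification. As noted above, an
occasional reallocation can be fine, so it is probably more useful to run
the check repeatedly (for example via read_report on the real device) and
watch whether the numbers grow.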



