Bad Block Howto
Revision as of 10:38, 5 October 2020
Your hard disk fails on Linux - now what?
It happened to me a few times in the last few weeks: a filesystem check for a hard disk would take forever and my system would not boot. One of the hard disks was reporting bad block errors. At this time I have four different hard disks that are giving me this kind of trouble. All broken hard disks have been taken out of the system, and I am analyzing the problem using a virtual machine and a USB-SATA bridge.
First step: remove the hard disk from the system and put it into a SATA-USB docking station
Hopefully your disk is not needed to boot or fully operate your system. If it is, you might want to boot from a USB stick or other media. In any case, my procedure is to remove the disk from the original system and use a different system for analysis. In my case I am using an Ubuntu-based virtual machine and connect the drive via a USB-SATA bridge. With USB 2 devices the performance is poor; for this reason I no longer use my old Logilink QP002 SATA docking station.
The USB 3 device I bought in 2015 still serves me well.
Second step: check for problems
Tools needed
- A Linux virtual machine
- smartctl
- hdparm
- debugfs
- mount
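Before starting, it may help to confirm that the tools listed above are actually installed in the analysis VM. The following is a minimal sketch, not part of the original procedure; the function name `missing_tools` is made up for illustration.

```shell
#!/bin/sh
# Report which of the given commands cannot be found on PATH.
# Prints one missing command name per line; prints nothing if all are present.
# (missing_tools is a hypothetical helper name, not a standard utility.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Check the tools used on this page.
missing=$(missing_tools smartctl hdparm debugfs mount)
if [ -n "$missing" ]; then
    # $missing is deliberately unquoted so the names end up on one line.
    echo "Missing tools:" $missing
else
    echo "All tools found"
fi
```

On Ubuntu, smartctl is provided by the smartmontools package, hdparm by hdparm, and debugfs by e2fsprogs. Some of these tools live in /sbin or /usr/sbin, so a check run as a normal user may report them missing even though they are installed.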
Examples
disk Digda
Where is it mounted?
sudo mount | grep Digda
/dev/sdb1 on /media/wf/Digda type ext2 (rw,nosuid,nodev,relatime,uhelper=udisks2)
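If the device name is needed in a script, it can be pulled out of the mount output instead of being read off by eye. This is a small sketch that assumes the mount point contains no spaces; `device_for_label` is a hypothetical helper name.

```shell
# Print the device of the first mount entry whose mount point ends in the
# given directory name. Reads `mount`-style lines of the form
#   DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)
# from stdin. Assumes the mount point contains no spaces.
device_for_label() {
    awk -v name="$1" '
        $2 == "on" {
            n = split($3, parts, "/")      # last path component of the mount point
            if (parts[n] == name) { print $1; exit }
        }'
}

# Usage on a live system:
#   mount | device_for_label Digda
```

For the disk above this prints /dev/sdb1, the partition; the whole-disk device used in the following steps is /dev/sdb. Where util-linux is available, `findmnt -n -o SOURCE /media/wf/Digda` should give the same answer without any parsing.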
Get basic info
hdparm
sudo hdparm -I /dev/sdb
/dev/sdb:
ATA device, with non-removable media
Model Number: SAMSUNG HD204UI
Serial Number: S2H7J9EZC04171
Firmware Revision: 1AQ10001
Transport: Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6
Standards:
Used: unknown (minor revision code 0x0028)
Supported: 8 7 6 5
Likely used: 8
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 3907029168
Logical Sector size: 512 bytes
Physical Sector size: 512 bytes
device size with M = 1024*1024: 1907729 MBytes
device size with M = 1000*1000: 2000398 MBytes (2000 GB)
cache/buffer size = unknown
Form Factor: 3.5 inch
Nominal Media Rotation Rate: 5400
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, no device specific minimum
R/W multiple sector transfer: Max = 16 Current = ?
Advanced power management level: disabled
Recommended acoustic management value: 254, current value: 0
DMA: mdma0 mdma1 *mdma2 udma0 udma1 udma2 udma3 udma4 udma5 udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
Advanced Power Management feature set
Power-Up In Standby feature set
* SET_FEATURES required to spinup after power up
SET_MAX security extension
Automatic Acoustic Management feature set
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* 64-bit World wide name
* WRITE_UNCORRECTABLE_EXT command
* {READ,WRITE}_DMA_EXT_GPL commands
* Segmented DOWNLOAD_MICROCODE
* Gen1 signaling speed (1.5Gb/s)
* Gen2 signaling speed (3.0Gb/s)
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
* NCQ priority information
DMA Setup Auto-Activate optimization
Device-initiated interface power management
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT Read/Write Long (AC1), obsolete
* SCT Write Same (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
supported: enhanced erase
344min for SECURITY ERASE UNIT. 344min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 50024e9203f40fb7
NAA : 5
IEEE OUI : 0024e9
Unique ID : 203f40fb7
Checksum: correct
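Most of this output is not needed for bad block analysis. The sketch below keeps only the model, the serial number, and the capacity, which it computes as LBA48 sectors times logical sector size. It assumes the field labels look like the ones in the output above; `hdparm_summary` is a hypothetical helper name.

```shell
# Summarise `hdparm -I` output read from stdin: model, serial number, and
# capacity in bytes, computed as (LBA48 sectors) x (logical sector size).
# Assumes the field labels match the output shown on this page.
hdparm_summary() {
    awk '
        /Model Number:/  { sub(/^.*Model Number:[ \t]*/, "");  sub(/[ \t]+$/, ""); model = $0 }
        /Serial Number:/ { sub(/^.*Serial Number:[ \t]*/, ""); sub(/[ \t]+$/, ""); serial = $0 }
        /LBA48 +user addressable sectors:/ { sectors = $NF }
        /Logical +Sector size:/            { size = $(NF-1) }
        END {
            printf "model=%s\nserial=%s\nbytes=%.0f\n", model, serial, sectors * size
        }'
}

# Usage on a live system:
#   sudo hdparm -I /dev/sdb | hdparm_summary
```

For the disk above, 3907029168 sectors of 512 bytes give 2000398934016 bytes, which agrees with the 2000398 MBytes (M = 1000*1000) that hdparm reports.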
Links
- https://www.smartmontools.org/wiki/BadBlockHowto
- https://github.com/hradec/fix_smart_last_bad_sector
- http://dcere.com/hardware/2016/09/18/hard-disk.html
- https://serverfault.com/questions/461203/how-to-use-hdparm-to-fix-a-pending-sector
- https://serverfault.com/a/641135/162693
- https://linux.die.net/man/8/smartctl