This page is an archive of www.
Seagate's Seek Error Rate, Raw Read Error Rate, and Hardware ECC Recovered SMART attributes
Seagate's Seek Error Rate, Raw Read Error Rate, and Hardware ECC
Recovered SMART attributes create a lot of anxiety amongst Seagate
users. This is because the raw values are typically very high, and the
normalised values (Current / Worst / Threshold) are usually quite low.
Despite this, the numbers in most cases are perfectly OK.
The anxiety arises because we intuitively expect that the normalised
values should reflect a "health" score, with 100 being the ideal value.
Similarly, we would expect that the raw values should reflect an error
count, in which case a value of 0 would be most desirable. However,
Seagate calculates and applies these attribute values in a
counterintuitive way.
In fact the normalised values of Seagate's Seek Error Rate, Raw Read
Error Rate, and Hardware ECC Recovered attributes are logarithmic, not
linear, and the raw values are sector counts or seek counts, not error
counts.
Seagate's SMART documentation is not publicly available. The following
information has not been gleaned from any official source, but is based
on my own testing and observation, and on testing by others. Therefore
it may contain errors.
Seek Error Rate
The raw value of each SMART attribute occupies 48 bits.
Seagate's Seek Error Rate attribute consists of two parts -- a 16-bit
count of seek errors in the uppermost 4 nibbles, and a 32-bit count of
seeks in the lowermost 8 nibbles. In order to see these data, we will
need a SMART utility that reports all 48 bits, preferably in
hexadecimal. Two such utilities are HD Sentinel and HDDScan.
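
The split described above can be sketched in a few lines of Python. This is only an illustration of the bit layout as I understand it; the function name split_seek_error_raw is hypothetical, and the sample raw value is the one reported by the 13GB drive discussed in this article.

```python
def split_seek_error_raw(raw_hex):
    """Split a 48-bit Seek Error Rate raw value into (seek_errors, seeks).

    Assumes the layout described in the text: the uppermost 16 bits hold
    the seek error count, the lowermost 32 bits hold the seek count.
    """
    raw = int(raw_hex, 16)
    if not 0 <= raw < (1 << 48):
        raise ValueError("raw value must fit in 48 bits")
    seek_errors = raw >> 32        # uppermost 4 nibbles
    seeks = raw & 0xFFFFFFFF       # lowermost 8 nibbles
    return seek_errors, seeks


print(split_seek_error_raw("052E0E3000EC"))  # (1326, 238026988)
```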
I believe the relationship between the raw and normalised values of the
SER attribute is given by ...
normalised SER = -10 log10 (lifetime seek errors / lifetime seeks)
In the above formula, if the drive has recorded no errors, then we
would still need to set the number of errors to 1, otherwise the result
would be undefined (the logarithm of 0 does not exist).
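
As a minimal sketch, the formula and its zero-error special case could be written as follows. The function name normalised_ser is my own, and the formula itself is a hypothesis rather than anything documented by Seagate.

```python
import math


def normalised_ser(seek_errors, seeks):
    """Hypothesised normalised Seek Error Rate: -10 * log10(errors / seeks).

    A zero error count is replaced by 1, because log10(0) is undefined.
    """
    if seeks <= 0:
        raise ValueError("seek count must be positive")
    errors = max(seek_errors, 1)
    return -10 * math.log10(errors / seeks)


# An error-free drive that has just reached 1 million seeks.
print(round(normalised_ser(0, 1_000_000), 2))  # 60.0
```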
The following table correlates the normalised SER against the actual
error rate:
Normalised SER   Error rate
==================================================
90               <= 1 error per 1000 million seeks
80               <= 1 error per 100 million
70               <= 1 error per 10 million
60               <= 1 error per million
50               10 errors per million
40               100 errors per million
30               1000 errors per million
20               10 errors per thousand
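
The table is simply the formula read backwards: every 10-point drop in the normalised value means ten times as many errors. A rough sketch of that inversion, assuming the hypothesised formula holds (the function name seeks_per_error is hypothetical):

```python
def seeks_per_error(normalised_ser):
    """Invert the hypothesised formula: seeks per error = 10^(SER / 10)."""
    return 10 ** (normalised_ser / 10)


for value in (90, 60, 30, 20):
    print(value, "-> about 1 error per", round(seeks_per_error(value)), "seeks")
```

For example, a normalised value of 30 corresponds to about 1 error per 1000 seeks, which is the same as 1000 errors per million.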
A drive that has not yet recorded 1 million seeks will show 100 and 253
for the Current and Worst values. I believe this is because the data
are not considered to be statistically significant until the drive has
recorded 1 million seeks. When this target is reached, the values drop
to 60 and 60, assuming there have been no errors.
By way of example, here are the SMART data for my 13GB Seagate HDD:
http://www.users.on.net/~fzabkar/SmartUDM/13GB.RPT
Attribute         ID   Threshold   Value   Worst   Raw
==============================================================
Seek Error Rate    7      30         53      38    052E0E3000EC
The number of lifetime seek errors = 0x052E (uppermost 4 nibbles)
The number of lifetime seeks = 0x0E3000EC (lowermost 8 nibbles)
Using Google's calculator ...
0x052E = 1326
0x0E3000EC = 238 026 988
http://www.google.com/search?q=0x052E+in+decimal
http://www.google.com/search?q=0x0E3000EC+in+decimal
Applying the formula ...
normalised SER = -10 log10 (0x052E / 0x0E3000EC)
http://www.google.com/search?q=-10+log+(0x052E+/+0x0E3000EC)
... we get a result of 52.54.
Here is a second example:
http://www.users.on.net/~fzabkar/SmartUDM/120GB.RPT
Attribute         ID   Threshold   Value   Worst   Raw
==============================================================
Seek Error Rate    7      30         79      60    00000580A6AC
The above drive is in fact error free. It has recorded 0x0580A6AC
seeks (= 92 million) without error.
Applying the formula ...
normalised SER = -10 log10 (1 / 0x0580A6AC)
... we get a result of 79.65.
Note that we have used 1 instead of 0 for the error count (because log
0 is undefined).
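
Both worked examples can be checked from the raw values alone. The following is a sketch under the same assumptions as the text (16-bit error count above a 32-bit seek count, base-10 logarithm, an error count of 0 treated as 1); the function name ser_from_raw is hypothetical.

```python
import math


def ser_from_raw(raw_hex):
    """Compute the hypothesised normalised SER from a 48-bit raw hex string."""
    raw = int(raw_hex, 16)
    errors = raw >> 32            # uppermost 4 nibbles
    seeks = raw & 0xFFFFFFFF      # lowermost 8 nibbles
    return -10 * math.log10(max(errors, 1) / seeks)


for raw_hex in ("052E0E3000EC", "00000580A6AC"):
    print(raw_hex, round(ser_from_raw(raw_hex), 2))
```

This prints 52.54 for the 13GB drive and 79.65 for the 120GB drive, against reported normalised values of 53 and 79 respectively.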
Raw Read Error Rate and Hardware ECC Recovered
The raw values of the RRER and HER attributes represent a sector count,
not an error count. This figure rolls over to 0 once the count reaches
about 250 million. I suspect that the drive records the total number of
errors in each block of 250 million sectors, and then recalculates the
normalised values of each attribute accordingly. This means that RRER
and HER would be updated according to a rolling average rather than on
a lifetime basis. I'm almost certain that the normalised values are
also logarithmic, but I'm not sure how they are calculated.
The above figure of 250 million sectors applies to the 7200.11 and
DiamondMax 22 models, but may not apply to all.
While writing this article I came upon a Seagate document entitled
"Diagnostic Commands". It doesn't discuss SMART attributes, but it
refers to "Error Recovery Usage Rate" and defines it as ...
Error Recovery Usage Rate = -log10 {(Number of sectors in which
controller invoked specified error recovery scheme) / [(Number of
sectors transferred) * (512 bytes/sector) * (8 bits/byte)]}
This lends support to my Seek Error Rate formula, and suggests that
the RRER and HER attributes may be similarly calculated.
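
To make the quoted definition concrete, here is a small sketch of the Error Recovery Usage Rate calculation. The function and parameter names are my own, and the sample figures (1 recovered sector in 250 million sectors transferred) are illustrative only, not measured data.

```python
import math


def error_recovery_usage_rate(recovered_sectors, sectors_transferred,
                              bytes_per_sector=512):
    """-log10(recovered sectors / total bits transferred), per the definition."""
    if recovered_sectors <= 0 or sectors_transferred <= 0:
        raise ValueError("both counts must be positive")
    total_bits = sectors_transferred * bytes_per_sector * 8
    return -math.log10(recovered_sectors / total_bits)


# Illustrative input: 1 recovered sector in 250 million sectors transferred.
print(round(error_recovery_usage_rate(1, 250_000_000), 2))  # 12.01
```

Multiplying this result by 10 gives a figure on the same scale as the normalised SMART values discussed here.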
In fact the document mentions (but does not discuss) 5 different error
recovery schemes:
"On The Fly" means that errored data is corrected using the ECC bytes,
without an additional access of the platters.
Based on the abovementioned Error Recovery Usage Rate formula, I now
postulate that the normalised value of the Raw Read Error Rate
attribute could be calculated as follows:
normalised RRER = -10 log10 (number of errored sectors / total bits transferred)
The total number of bits is ...
(250 million sectors) x (512 bytes/sector) x (8 bits/byte) = 1.024 x 10^12
It seems to me that it makes more sense to use a round figure, say 10^12.
If we now let the number of errors equal 0 (or 1), then we have ...
max normalised RRER = -10 log10 (1 / 10^12) = 120
Similarly, if we let the number of errors equal 250 million (i.e. every
sector is errored), then the ratio becomes 250 million sectors /
(250 million x 4096) bits = 1 / 4096, and we have ...
min normalised RRER = -10 log10 (1 / 4096) = 36
Therefore, if my hypothesis is correct, we would expect that the
threshold value of the RRER attribute would be 36, and its maximum
possible value would be 120. In fact my Internet research tends to
confirm a maximum of 120 for 7200.11 models, but the threshold figure
is 6.
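
The two bounds can be reproduced with a short sketch. It assumes my postulated formula, the round figure of 10^12 bits for the best case, and the exact bit count of a 250 million sector block for the worst case; all names in the code are hypothetical.

```python
import math

SECTORS_PER_BLOCK = 250_000_000   # approximate rollover figure (7200.11)
BITS_PER_SECTOR = 512 * 8         # 4096 bits


def hypothetical_rrer(errored_sectors, total_bits):
    """Postulated normalised RRER; a zero error count is treated as 1."""
    return -10 * math.log10(max(errored_sectors, 1) / total_bits)


best = hypothetical_rrer(0, 10 ** 12)
worst = hypothetical_rrer(SECTORS_PER_BLOCK,
                          SECTORS_PER_BLOCK * BITS_PER_SECTOR)
print(round(best), round(worst))  # 120 36
```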
FWIW, here are the numbers for my own Seagate drives:
Attribute                 ID   Threshold   Value   Worst   Raw            Model
===============================================================================
Raw Read Error Rate        1       6        114     100    00000386EBBA   (ST3320620A)
Raw Read Error Rate        1       6         64      62    00000AFD20E3   (ST3120026A)
Raw Read Error Rate        1      34         77      66    000007820F8F   (ST340016A)
Raw Read Error Rate        1       0         79      78    00000753BA8E   (ST313021A)
Hardware ECC recovered   195       0        100      63    00000C62F66E   (ST3320620A)
Hardware ECC recovered   195       0         64      62    00000AFD20E3   (ST3120026A)
Hardware ECC recovered   195       0         77      66    000007820F8F   (ST340016A)
http://www.users.on.net/~fzabkar/SmartUDM/320GB.RPT
http://www.users.on.net/~fzabkar/SmartUDM/120GB.RPT
http://www.users.on.net/~fzabkar/SmartUDM/40GB.RPT
http://www.users.on.net/~fzabkar/SmartUDM/13GB.RPT
Nevertheless, if we ignore the threshold anomaly, then for each block of 10^12 bits read ...
Number of sectors requiring retries = 10^[(120 - normalised RRER) / 10]
Normalised RRER   Errored sectors in 10^12 bits read
====================================================
120               <= 1
110               10
100               100
90                1000
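
The relationship above can be sketched as a one-line function. It rests entirely on my hypothesis that 120 corresponds to at most 1 errored sector per 10^12 bits read; the function name is hypothetical.

```python
def errored_sectors_per_block(normalised_rrer):
    """Hypothesised errored sectors per 10^12 bits: 10^((120 - RRER) / 10)."""
    return 10 ** ((120 - normalised_rrer) / 10)


for value in (120, 110, 100, 90):
    print(value, round(errored_sectors_per_block(value)))
```

Under the same hypothesis, the normalised value of 114 reported by my ST3320620A would correspond to roughly 4 errored sectors per 10^12 bits read.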
References
Here are several Usenet discussions where I have posted the results of my experiments:
Seagate - SMART Raw Read Error Rate test:
http://groups.google.com/group/comp.sys.ibm.pc.hardware.storage/browse_thread/thread/b6eb8aa2476f9cac/030c515959145d44#030c515959145d44
SER, RRER, and HEC discussion:
http://groups.google.com/group/comp.sys.ibm.pc.hardware.storage/browse_thread/thread/54b8ad6d34549e95/ae6ca014b3ff211a#ae6ca014b3ff211a
Seek Error Rate discussion:
http://groups.google.com/group/comp.sys.ibm.pc.hardware.storage/browse_thread/thread/87001db5c567fb9a/63ccf100808bc3f6#63ccf100808bc3f6
A report from a Seagate user regarding the RRER attribute:
http://forums.seagate.com/t5/Barracuda-XT-Barracuda-Barracuda/New-Maxtor-STM3500320AS-500GB-S-M-A-R-T-Problem/m-p/22276
HD Sentinel (DOS / Windows / Linux):
http://www.hdsentinel.com/
HDDScan for Windows:
http://hddscan.com/
Explanation of SMART attributes:
http://en.wikipedia.org/wiki/S.M.A.R.T.
Kingston® SF-2000 Based SSD SMART Attributes:
http://hddguardian.googlecode.com/svn/docs/Kingston%20SMART%20attributes%20details.pdf