The use of ECC is strongly recommended with NAND flash parts because they
are prone to occasional bit flips. This is usually done with a variant of
a Hamming code that computes column and line parity.
The computed ECC is stored in the spare area of the page to which it relates.
The NAND library automatically computes and stores the ECC of data as it
is written to the chip. On read, the code is calculated for the data
actually read; this is compared with the stored code and the data
repaired if necessary.
The NAND library comes with a software ECC implementation named
linux_mtd_ecc. This is compatible with the
ECC used in the Linux MTD layer, hence its name. It calculates
a 3-byte ECC on a 256-byte data block.
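To make the column and line parity idea concrete, here is a minimal sketch of a Hamming-style code that produces 22 parity bits (packed into 3 bytes) over a 256-byte block. The function name and the bit layout are assumptions made for illustration; in particular this is not the layout used by linux_mtd_ecc, so its output is not interchangeable with that implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Parity (0 or 1) of the eight bits in b. */
static unsigned sketch_parity8(uint8_t b)
{
    b ^= (uint8_t)(b >> 4);
    b ^= (uint8_t)(b >> 2);
    b ^= (uint8_t)(b >> 1);
    return b & 1u;
}

/* Hypothetical layout (NOT the linux_mtd_ecc layout):
 *   ecc[0] bit k: parity of all bytes whose index has bit k set
 *   ecc[1] bit k: parity of all bytes whose index has bit k clear
 *   ecc[2] bits 2k+1 / 2k: parity of the bit columns whose
 *          position (0..7) has bit k set / clear, for k = 0..2
 */
void sketch_ecc_calc(const uint8_t *dat, uint8_t ecc[3])
{
    static const uint8_t col_mask[3] = { 0xAA, 0xCC, 0xF0 };
    uint8_t col = 0;     /* XOR of all bytes: one parity bit per column */
    uint8_t lp_set = 0;  /* line parity, index bit set */
    uint8_t lp_clr = 0;  /* line parity, index bit clear */
    uint8_t cp = 0;      /* packed column parity */
    unsigned i, k;

    for (i = 0; i < 256; i++) {
        col ^= dat[i];
        if (sketch_parity8(dat[i])) {
            /* An odd byte toggles every line-parity group it belongs to. */
            lp_set ^= (uint8_t)i;
            lp_clr ^= (uint8_t)~i;
        }
    }
    for (k = 0; k < 3; k++) {
        if (sketch_parity8((uint8_t)(col & col_mask[k])))
            cp |= (uint8_t)(1u << (2 * k + 1));
        if (sketch_parity8((uint8_t)(col & ~col_mask[k])))
            cp |= (uint8_t)(1u << (2 * k));
    }
    ecc[0] = lp_set;
    ecc[1] = lp_clr;
    ecc[2] = cp;
}
```

With this layout, a single flipped data bit changes exactly one bit of every parity pair, which is what makes the error position recoverable from the difference between the stored and recomputed codes.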
This algorithm is adequate for most circumstances, but it is
strongly recommended to use any hardware ECC support which may
be available because of the performance gains it yields.
(In testing, we observed that up to two thirds of the time
taken by every page read and program call was used in computing
ECC in software.)
This library draws a semantic distinction between hardware and software
ECC implementations.
A software ECC implementation will typically not require an
initialisation step. The calculation function will always be
called with a pointer to the data bytes to compute.
A hardware implementation is assumed to read and act upon the data
as it goes past.
Therefore, it will not be passed a pointer to the data
when its calculate step is invoked.
An ECC is defined by the following parameters:
The size of data block it handles, in bytes.
The size of ECC it calculates on those blocks, in bytes.
Whether the algorithm is hardware or software.
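As a rough illustration, the three parameters can be pictured as a small descriptor. The struct and helper below are hypothetical (they are not the library's own types or names); they only show how the block size and ECC size together determine how much of a page's spare area the ECC occupies.

```c
#include <stddef.h>

/* Hypothetical descriptor; field names are illustrative only. */
struct sketch_ecc_params {
    size_t data_size;   /* bytes of data covered by one ECC */
    size_t ecc_size;    /* bytes of ECC produced per data block */
    int    is_hardware; /* non-zero for a hardware implementation */
};

/* Spare-area bytes needed to hold the ECC for one page.
 * Returns 0 if the page is not a whole number of data blocks. */
size_t sketch_ecc_bytes_per_page(const struct sketch_ecc_params *p,
                                 size_t page_size)
{
    if (p->data_size == 0 || page_size % p->data_size != 0)
        return 0;
    return (page_size / p->data_size) * p->ecc_size;
}
```

For example, with a 3-byte ECC on 256-byte blocks and an assumed 2048-byte page, the ECC takes 8 x 3 = 24 bytes of spare area.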
An ECC algorithm must provide the following functions:
/* Initialises an ECC computation. May be NULL if not required. */
void my_ecc_init(struct _cyg_nand_device_t *dev);
/* Returns the ECC for the given data block.
 * If IS_HARDWARE:
 *  - dat and nbytes are ignored
 * If ! IS_HARDWARE:
 *  - dat and nbytes are required
 *  - if nbytes is less than the chunk size, the remaining bytes
 *    are assumed to be 0xff.
 */
void my_ecc_calc(struct _cyg_nand_device_t *dev,
                 const CYG_BYTE *dat, size_t nbytes, CYG_BYTE *ecc);
/* Repairs the ECC for the given data block, if needed.
 * Call this if your read-from-chip ECC doesn't match what you computed
 * over the data block. Both *dat and *ecc_read may be corrected.
 *
 * `nbytes' is the number of bytes we're interested in; if a correction
 * is indicated outside of that range, it will be ignored.
 *
 * Returns:
 *   0 for no errors
 *   1 for a corrected single bit error in the data
 *   2 for a corrected single bit error in the ECC
 *  -1 for an uncorrectable error (more than one bit)
 */
int my_ecc_repair(struct _cyg_nand_device_t *dev,
                  CYG_BYTE *dat, size_t nbytes,
                  CYG_BYTE *ecc_read, const CYG_BYTE *ecc_calc);
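The repair contract described above can be sketched as follows. This is an illustration under stated assumptions, not the library's repair routine: it omits the dev argument, uses uint8_t in place of CYG_BYTE, and assumes a hypothetical 3-byte layout in which ecc[0] and ecc[1] hold line parity over "index bit set" and "index bit clear" groups, and the low six bits of ecc[2] hold three (clear, set) column-parity pairs. A real implementation must decode whatever layout its calc function produces.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of set bits in b. */
static unsigned sketch_bits_set(uint8_t b)
{
    unsigned n = 0;
    while (b) {
        n += b & 1u;
        b >>= 1;
    }
    return n;
}

/* Returns 0, 1, 2 or -1 with the same meanings as listed above. */
int sketch_ecc_repair(uint8_t *dat, size_t nbytes,
                      uint8_t *ecc_read, const uint8_t *ecc_calc)
{
    /* Syndrome: which parity bits disagree between stored and computed. */
    uint8_t s0 = ecc_read[0] ^ ecc_calc[0];
    uint8_t s1 = ecc_read[1] ^ ecc_calc[1];
    uint8_t s2 = (uint8_t)((ecc_read[2] ^ ecc_calc[2]) & 0x3F);

    if ((s0 | s1 | s2) == 0)
        return 0;                       /* codes match: nothing to do */

    /* A single data bit error flips exactly one member of every pair. */
    if ((uint8_t)(s0 ^ s1) == 0xFF && ((s2 ^ (s2 >> 1)) & 0x15) == 0x15) {
        size_t byte = s0;               /* "set" line parities spell the index */
        unsigned bit = ((s2 >> 1) & 1u)
                     | (((s2 >> 3) & 1u) << 1)
                     | (((s2 >> 5) & 1u) << 2);
        if (byte < nbytes)              /* outside the range of interest: ignore */
            dat[byte] ^= (uint8_t)(1u << bit);
        return 1;
    }

    /* Exactly one disagreeing parity bit: the stored ECC itself was hit. */
    if (sketch_bits_set(s0) + sketch_bits_set(s1) + sketch_bits_set(s2) == 1) {
        ecc_read[0] = ecc_calc[0];
        ecc_read[1] = ecc_calc[1];
        ecc_read[2] = ecc_calc[2];
        return 2;
    }

    return -1;                          /* more than one bit in error */
}
```

Returning 1 when the indicated byte lies beyond nbytes is a choice made for this sketch; the description above only says that such a correction is ignored.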
In some cases - particularly where hardware assistance is in use -
it is necessary to specify different functions
for calculating the ECC depending on whether the operation at hand is
a page read or a page write. In that case, two calc
functions may be supplied, each with the same prototype.
The algorithm parameters and functions are then tied together
with one of the following macros:
Tip: It's OK to use software ECC while getting things going, but if you
later switch to a hardware implementation, you will probably need to erase
your entire NAND chip, including its Bad Block Table. The
nanderase utility may come in handy for this.
Warning
You must be sure that your ECC repair algorithm is correct.
This can be quite tricky to test.
However, it is often possible to hoodwink the controller into computing
ECCs for you even when the data will not be written to the NAND chip,
for example by sending it data without telling it to
program a page.
A variant of the sweccwalk test may come in handy
for this purpose.
An example implementation, including an ECC calculation and repair test
named eccwalk, may be found in the STM3210E
evaluation board platform HAL,
packages/hal/cortexm/stm32/stm3210e_eval.
The chip's NAND controller has on-board ECC calculation, but does not
undertake to repair data, so a repair function was written specially.