ECC implementation

The use of ECC is strongly recommended with NAND flash parts because they are prone to occasional bit flips. ECC is usually implemented as a variant of a Hamming code which calculates column and line parity. The computed ECC is stored in the spare area of the page to which it relates.

The NAND library automatically computes and stores the ECC of data as it is written to the chip. On read, the library calculates the code for the data actually read, compares it with the stored code, and repairs the data if necessary.

The NAND library comes with a software ECC implementation named linux_mtd_ecc. This is compatible with the ECC used in the Linux MTD layer, hence its name. It calculates a 3-byte ECC on a 256-byte data block. This algorithm is adequate for most circumstances, but it is strongly recommended to use any hardware ECC support which may be available because of the performance gains it yields. (In testing, we observed that up to two thirds of the time taken by every page read and program call was used in computing ECC in software.)
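To make the idea of column and line parity concrete, here is a minimal sketch of a Hamming-style code over a 256-byte block that yields 3 bytes (22 bits used). It is an illustration only: the names toy_ecc_calc and toy_ecc_repair are hypothetical, and the bit layout was chosen for readability, so it is not bit-compatible with linux_mtd_ecc or with the Linux MTD layer.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BLOCK_BYTES 256u      /* data bytes covered by one code */
#define TOY_ADDR_MASK   0x7ffu    /* 2048 data bits need 11 address bits */

/* Fold the addresses of all set data bits into two 11-bit parity words.
 * Address bits 0-2 select the bit within a byte (column parity);
 * address bits 3-10 select the byte (line parity). */
static uint32_t toy_parity(const uint8_t *dat)
{
    uint32_t odd = 0, even = 0;
    for (uint32_t addr = 0; addr < TOY_BLOCK_BYTES * 8u; addr++) {
        if ((dat[addr >> 3] >> (addr & 7u)) & 1u) {
            odd  ^= addr;                   /* bit k: set bits whose address has bit k = 1 */
            even ^= ~addr & TOY_ADDR_MASK;  /* bit k: set bits whose address has bit k = 0 */
        }
    }
    return odd | (even << 11);
}

/* Compute the 3-byte code for a full 256-byte block. */
void toy_ecc_calc(const uint8_t *dat, uint8_t ecc[3])
{
    uint32_t code = toy_parity(dat);
    ecc[0] = (uint8_t)code;
    ecc[1] = (uint8_t)(code >> 8);
    ecc[2] = (uint8_t)(code >> 16);
}

/* Compare the stored code with the one computed over the data as read.
 * Returns 0 (no error), 1 (data bit fixed), 2 (ECC bit fixed),
 * -1 (uncorrectable). */
int toy_ecc_repair(uint8_t *dat, uint8_t ecc_read[3], const uint8_t ecc_calc[3])
{
    uint32_t diff = 0;
    for (int i = 0; i < 3; i++)
        diff |= (uint32_t)(ecc_read[i] ^ ecc_calc[i]) << (8 * i);
    if (diff == 0)
        return 0;

    uint32_t odd = diff & TOY_ADDR_MASK;
    uint32_t even = (diff >> 11) & TOY_ADDR_MASK;
    if ((diff >> 22) == 0 && (odd ^ even) == TOY_ADDR_MASK) {
        /* Every parity pair disagrees: exactly one data bit flipped,
         * and the odd word is its address. */
        dat[odd >> 3] ^= (uint8_t)(1u << (odd & 7u));
        return 1;
    }
    if ((diff & (diff - 1u)) == 0) {
        /* Exactly one code bit differs: the stored ECC itself is damaged. */
        for (int i = 0; i < 3; i++)
            ecc_read[i] = ecc_calc[i];
        return 2;
    }
    return -1;
}
```

On a read, a caller would recompute the code over the data it received and pass both codes to the repair routine. The single-bit test works because each data bit contributes to exactly one member of every parity pair, so flipping one data bit changes all eleven pairs at once.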

The ECC interface

This library draws a semantic distinction between hardware and software ECC implementations.

  • A software ECC implementation will typically not require an initialisation step. The calculation function will always be called with a pointer to the data bytes over which the ECC is to be computed.

  • A hardware implementation is assumed to read and act upon the data as it goes past. Therefore, it will not be passed a pointer to the data when its calculate step is invoked.
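The difference in calling convention can be sketched with a simulated controller. Everything in this sketch is hypothetical (sim_ecc_init, sim_bus_transfer, sim_ecc_calc and the engine state); a real driver would reset and read its controller's ECC registers instead. The point is only that the init step clears the engine, the data is observed as it is transferred, and the calc step ignores its data arguments.

```c
#include <stddef.h>
#include <stdint.h>

/* Simulated ECC engine state; stands in for controller registers. */
static struct {
    uint32_t odd, even;   /* running parity words */
    uint32_t bit_addr;    /* address of the next data bit to go past */
} sim_ecc_engine;

/* Init step: reset the engine before a transfer begins. */
void sim_ecc_init(void)
{
    sim_ecc_engine.odd = 0;
    sim_ecc_engine.even = 0;
    sim_ecc_engine.bit_addr = 0;
}

/* Data transfer: the engine watches every byte as it goes past. */
void sim_bus_transfer(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        for (unsigned bit = 0; bit < 8; bit++, sim_ecc_engine.bit_addr++) {
            if ((buf[i] >> bit) & 1u) {
                sim_ecc_engine.odd  ^= sim_ecc_engine.bit_addr & 0x7ffu;
                sim_ecc_engine.even ^= ~sim_ecc_engine.bit_addr & 0x7ffu;
            }
        }
    }
}

/* Calc step: dat and nbytes are ignored, as for a hardware ECC;
 * the result comes from what the engine has already seen. */
void sim_ecc_calc(const uint8_t *dat, size_t nbytes, uint8_t ecc[3])
{
    (void)dat;
    (void)nbytes;
    uint32_t code = sim_ecc_engine.odd | (sim_ecc_engine.even << 11);
    ecc[0] = (uint8_t)code;
    ecc[1] = (uint8_t)(code >> 8);
    ecc[2] = (uint8_t)(code >> 16);
}
```

A software implementation has no such hidden state, which is why it can usually do without an init step and must be handed the data pointer directly.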

An ECC is defined by the following parameters:

  • The size of data block it handles, in bytes.

  • The size of ECC it calculates on those blocks, in bytes.

  • Whether the algorithm is hardware or software.
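As a rough illustration of how the first two parameters interact, the sketch below works out how many ECC bytes a page would need in its spare area. It assumes the page is split into whole data blocks with one code stored per block. The struct and function names are hypothetical and are not the library's own types.

```c
#include <stddef.h>

/* Hypothetical description of an ECC algorithm's parameters. */
struct toy_ecc_params {
    size_t data_size;    /* bytes of data covered by one code */
    size_t ecc_size;     /* bytes of ECC produced per data block */
    int    is_hardware;  /* non-zero for a hardware implementation */
};

/* ECC bytes needed for one page, or 0 if the page is not a whole
 * number of data blocks. */
size_t toy_ecc_bytes_per_page(const struct toy_ecc_params *p, size_t page_size)
{
    if (p->data_size == 0 || page_size % p->data_size != 0)
        return 0;
    return (page_size / p->data_size) * p->ecc_size;
}
```

With the 256-byte block and 3-byte code of linux_mtd_ecc, a 2048-byte page (taken here purely as an example size) would need 8 x 3 = 24 bytes of ECC under this assumption.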

An ECC algorithm must provide the following functions:

/* Initialises an ECC computation. May be NULL if not required. */
void my_ecc_init(struct _cyg_nand_device_t *dev);


/* Returns the ECC for the given data block.
 * If IS_HARDWARE:
 *  - dat and nbytes are ignored
 * If ! IS_HARDWARE:
 *  - dat and nbytes are required
 *  - if nbytes is less than the chunk size, the remainder are
 *    assumed to be 0xff.
 */
void my_ecc_calc(struct _cyg_nand_device_t *dev, 
                 const CYG_BYTE *dat, size_t nbytes, CYG_BYTE *ecc);


/* Repairs the ECC for the given data block, if needed.
 * Call this if your read-from-chip ECC doesn't match what you computed
 * over the data block. Both *dat and *ecc_read may be corrected.
 *
 * `nbytes' is the number of bytes we're interested in; if a correction
 * is indicated outside of that range, it will be ignored.
 *
 * Returns: 
 *       0 for no errors
 *       1 for a corrected single bit error in the data
 *       2 for a corrected single bit error in the ECC
 *      -1 for an uncorrectable error (more than one bit)
 */
int my_ecc_repair(struct _cyg_nand_device_t *dev,
                  CYG_BYTE *dat, size_t nbytes, 
                  CYG_BYTE *ecc_read, const CYG_BYTE *ecc_calc);
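The rule that a short buffer is treated as if its missing tail were 0xff (the value erased NAND cells read back as) can be sketched as follows. This is one simple way a software calc function might honour the rule, not the library's own code. The CYG_BYTE typedef is a stand-in so the sketch compiles outside eCos, the 256-byte chunk size is an assumption, and pad_demo_kernel is a deliberately trivial placeholder, not a real ECC.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned char CYG_BYTE;     /* stand-in for the eCos type */
#define PAD_CHUNK_BYTES 256u        /* assumed data block size */

/* Placeholder kernel: NOT a real ECC, just a byte-wise XOR so that the
 * padding logic has a full-block routine to call. */
void pad_demo_kernel(const CYG_BYTE *block, CYG_BYTE *ecc)
{
    CYG_BYTE x = 0;
    for (size_t i = 0; i < PAD_CHUNK_BYTES; i++)
        x ^= block[i];
    ecc[0] = x;
}

/* Compute the code for a possibly short buffer by filling the missing
 * tail with 0xff before running the full-block kernel. */
void pad_demo_calc(const CYG_BYTE *dat, size_t nbytes, CYG_BYTE *ecc)
{
    CYG_BYTE block[PAD_CHUNK_BYTES];

    if (nbytes > PAD_CHUNK_BYTES)
        nbytes = PAD_CHUNK_BYTES;
    memset(block, 0xff, sizeof block);
    if (nbytes > 0)
        memcpy(block, dat, nbytes);
    pad_demo_kernel(block, ecc);
}
```

Copying into a scratch buffer keeps the sketch short; an implementation concerned with speed might instead fold the contribution of the 0xff tail into the result without copying.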

In some cases, particularly where hardware assistance is in use, the ECC must be calculated by different functions depending on whether the operation at hand is a page read or a page write. In that case, separate calc functions for reading and writing may be supplied and, if required, separate init functions as well; each takes the same prototype as its single-function counterpart.

The algorithm parameters and functions are then tied together with one of the following macros:

CYG_NAND_ECC_ALG_SW(my_ecc, _datasize, _eccsize, my_ecc_init, my_ecc_calc, my_ecc_repair);

CYG_NAND_ECC_ALG_HW(my_ecc, _datasize, _eccsize, my_ecc_init, my_ecc_calc, my_ecc_repair);

CYG_NAND_ECC_ALG_HW2(my_ecc, _datasize, _eccsize, my_ecc_init, my_ecc_calc_read, my_ecc_calc_write, my_ecc_repair);

CYG_NAND_ECC_ALG_HW3(my_ecc, _datasize, _eccsize, my_ecc_init_read, my_ecc_init_write, my_ecc_calc_read, my_ecc_calc_write, my_ecc_repair);

Tip: It's OK to use software ECC while getting things going, but if you later switch to a hardware implementation, you will probably need to erase your entire NAND chip, including its Bad Block Table. The nanderase utility may come in handy for this.

Warning

You must be sure that your ECC repair algorithm is correct. This can be quite tricky to test. However, it is often possible to hoodwink the controller into computing ECCs for you even when the data you send will not affect what is stored on the NAND chip, for example by sending it data without telling it to program a page. A variant of the sweccwalk test may come in handy for this purpose.

An example implementation, including an ECC calculation and repair test named eccwalk, may be found in the STM3210E evaluation board platform HAL, packages/hal/cortexm/stm32/stm3210e_eval. The chip's NAND controller has on-board ECC calculation, but does not undertake to repair data; a repair function was written specially.
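The general shape of a walk-style test for a software ECC can be sketched as follows. This is not the source of sweccwalk or eccwalk; it is one plausible approach, assuming "walk" means flipping each bit in turn and checking that repair undoes the damage. The typedefs are stand-ins so the sketch compiles outside eCos, and ecc_walk, tiny_calc and tiny_repair are hypothetical names. The tiny one-byte code exists only so the harness has something to run against. A hardware variant would also need to push the data through the controller before each calc call.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned char CYG_BYTE;        /* stand-in for the eCos type */
struct _cyg_nand_device_t;             /* opaque stand-in; never dereferenced */

typedef void (*ecc_calc_fn)(struct _cyg_nand_device_t *dev,
                            const CYG_BYTE *dat, size_t nbytes, CYG_BYTE *ecc);
typedef int (*ecc_repair_fn)(struct _cyg_nand_device_t *dev,
                             CYG_BYTE *dat, size_t nbytes,
                             CYG_BYTE *ecc_read, const CYG_BYTE *ecc_calc);

#define WALK_MAX_DATA 512
#define WALK_MAX_ECC  16

/* Flip each data bit and each ECC bit in turn and check that repair
 * restores the original. Returns the failure count, or -1 if the
 * sizes exceed the scratch buffers. */
int ecc_walk(ecc_calc_fn calc, ecc_repair_fn repair,
             const CYG_BYTE *good, size_t datasize, size_t eccsize)
{
    CYG_BYTE dat[WALK_MAX_DATA], stored[WALK_MAX_ECC];
    CYG_BYTE ecc[WALK_MAX_ECC], now[WALK_MAX_ECC];
    int failures = 0;

    if (datasize > WALK_MAX_DATA || eccsize > WALK_MAX_ECC)
        return -1;
    calc(NULL, good, datasize, stored);        /* reference code for clean data */

    for (size_t bit = 0; bit < (datasize + eccsize) * 8; bit++) {
        int in_data = bit < datasize * 8;
        size_t pos = in_data ? bit : bit - datasize * 8;

        memcpy(dat, good, datasize);
        memcpy(ecc, stored, eccsize);
        (in_data ? dat : ecc)[pos / 8] ^= (CYG_BYTE)(1u << (pos % 8));

        calc(NULL, dat, datasize, now);        /* code over the data "as read" */
        int rc = repair(NULL, dat, datasize, ecc, now);
        if (rc != (in_data ? 1 : 2) ||
            memcmp(dat, good, datasize) != 0 || memcmp(ecc, stored, eccsize) != 0)
            failures++;
    }
    return failures;
}

/* Tiny stand-in code: 1 data byte, 1 ECC byte holding three parity pairs. */
void tiny_calc(struct _cyg_nand_device_t *dev,
               const CYG_BYTE *dat, size_t nbytes, CYG_BYTE *ecc)
{
    unsigned odd = 0, even = 0;
    (void)dev;
    (void)nbytes;
    for (unsigned b = 0; b < 8; b++)
        if ((dat[0] >> b) & 1u) { odd ^= b; even ^= ~b & 7u; }
    ecc[0] = (CYG_BYTE)(odd | (even << 3));
}

int tiny_repair(struct _cyg_nand_device_t *dev, CYG_BYTE *dat, size_t nbytes,
                CYG_BYTE *ecc_read, const CYG_BYTE *ecc_calc)
{
    unsigned diff = (unsigned)(ecc_read[0] ^ ecc_calc[0]);
    (void)dev;
    (void)nbytes;
    if (diff == 0)
        return 0;
    if (diff < 64u && ((diff ^ (diff >> 3)) & 7u) == 7u) {  /* all pairs disagree */
        dat[0] ^= (CYG_BYTE)(1u << (diff & 7u));
        return 1;
    }
    if ((diff & (diff - 1u)) == 0) {                        /* one ECC bit damaged */
        ecc_read[0] = ecc_calc[0];
        return 2;
    }
    return -1;
}
```

Calling ecc_walk(tiny_calc, tiny_repair, good, 1, 1) on any one-byte buffer should report no failures. The same harness could be pointed at a real calc and repair pair by passing that algorithm's data and ECC sizes.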

2017-02-09
Documentation license for this page: eCosPro License