SMP Support

eCos contains support for limited Symmetric Multi-Processing (SMP). This is only available on selected architectures and platforms.

Target Hardware Limitations

To allow a reasonable implementation of SMP, and to reduce the disruption to the existing source base, a number of assumptions have been made about the features of the target hardware.

  • Modest multiprocessing. The typical number of CPUs supported is two to four, with an upper limit around eight. While there are no inherent limits in the code, hardware and algorithmic limitations will probably become significant beyond this point.

  • SMP synchronization support. The hardware must supply a mechanism to allow software on two CPUs to synchronize. This is normally provided as part of the instruction set in the form of test-and-set, compare-and-swap or load-link/store-conditional instructions. An alternative approach is the provision of hardware semaphore registers which can be used to serialize implementations of these operations. Whatever hardware facilities are available, they are used in eCos to implement spinlocks.

  • Coherent caches. It is assumed that no extra effort will be required to access shared memory from any processor. This means that either there are no caches, or the caches are shared by all processors, or they are maintained in a coherent state by the hardware. It would be too disruptive to the eCos sources if every memory access had to be bracketed by cache load/flush operations. Any hardware that requires this is not supported.

  • Uniform addressing. It is assumed that all memory that is shared between CPUs is addressed at the same location from all CPUs. Like non-coherent caches, dealing with CPU-specific address translation is considered too disruptive to the eCos source base. This does not, however, preclude systems with non-uniform access costs for different CPUs.

  • Uniform device addressing. As with access to memory, it is assumed that all devices are equally accessible to all CPUs. Since device access is often made from thread contexts, it is not possible to restrict access to device control registers to certain CPUs.

  • Interrupt routing. The target hardware must have an interrupt controller that can route interrupts to specific CPUs. It is acceptable for all interrupts to be delivered to just one CPU, or for some interrupts to be bound to specific CPUs, or for some interrupts to be local to each CPU. At present dynamic routing, where a different CPU may be chosen each time an interrupt is delivered, is not supported. eCos cannot support hardware where all interrupts are delivered to all CPUs simultaneously with the expectation that software will resolve any conflicts.

  • Inter-CPU interrupts. A mechanism to allow one CPU to interrupt another is needed. This is necessary so that events on one CPU can cause rescheduling on other CPUs.

  • CPU Identifiers. Code running on a CPU must be able to determine which CPU it is running on. The CPU Id is usually provided either in a CPU status register, or in a register associated with the inter-CPU interrupt delivery subsystem. eCos expects CPU Ids to be small positive integers, although alternative representations, such as bitmaps, can be converted relatively easily. Complex mechanisms for getting the CPU Id cannot be supported. Getting the CPU Id must be a cheap operation, since it is done often, and in performance-critical places such as interrupt handlers and the scheduler.
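The conversion between a bitmap representation and a small integer CPU Id, mentioned in the last item above, might look something like the following. This is only a sketch: the function names are hypothetical and are not part of eCos, and a 32-bit one-hot bitmap (bit n set means CPU n) is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of eCos. They assume a one-hot 32-bit
 * bitmap in which bit n identifies CPU n. */

/* Returns the index of the single set bit, or -1 if the bitmap does not
 * have exactly one bit set. */
int sketch_cpu_bitmap_to_id(uint32_t bitmap)
{
    int id = 0;

    /* Reject zero and any value with more than one bit set. */
    if (bitmap == 0 || (bitmap & (bitmap - 1)) != 0)
        return -1;

    while ((bitmap & 1u) == 0) {
        bitmap >>= 1;
        id++;
    }
    return id;
}

/* Returns the one-hot bitmap for a CPU id in the range 0..31. */
uint32_t sketch_cpu_id_to_bitmap(int id)
{
    assert(id >= 0 && id < 32);
    return (uint32_t)1 << id;
}
```

A real HAL would more likely use a count-trailing-zeros instruction or a small lookup table, since obtaining the CPU Id has to be cheap.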

HAL Support

SMP support in any platform depends on the HAL supplying the appropriate operations. All HAL SMP support is defined in the cyg/hal/hal_smp.h header. Variant- and platform-specific definitions will be in cyg/hal/var_smp.h and cyg/hal/plf_smp.h respectively. These files are included automatically by this header, so need not be included explicitly.

SMP support falls into a number of functional groups.

CPU Control

This group consists of descriptive and control macros for managing the CPUs in an SMP system.


A type that can contain a CPU id. A CPU id is usually a small integer that is used to index arrays of variables that are managed on a per-CPU basis.


HAL_SMP_CPU_MASK

A type that can contain a bitmask of all CPUs in the system. In this mask, bit n corresponds to CPU n.


HAL_SMP_CPU_COUNT

The maximum number of CPUs that can be supported. This is used to provide the size of any arrays that have an element per CPU.


The maximum possible CPU ID. This should normally be one less than HAL_SMP_CPU_COUNT.


Returns the CPU id of the current CPU.


A value that does not match any real CPU id. This is used where a CPU type variable must be set to a null value.


A value for the HAL_SMP_CPU_MASK type that has a bit set for each CPU supported. This value can be derived from HAL_SMP_CPU_COUNT.
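As an illustration of how these values relate to one another, the following sketch derives a maximum id, a null id and an all-CPUs mask from a CPU count, and uses the count to size a per-CPU array. All SKETCH_ and sketch_ names are hypothetical stand-ins; a real HAL supplies its own definitions, and the choice of the count itself as the null id is an assumption made here for illustration. The mask derivation assumes fewer than 32 CPUs.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins only, not the definitions of any real HAL. */
#define SKETCH_CPU_COUNT 4                        /* CPUs supported      */
#define SKETCH_CPU_MAX   (SKETCH_CPU_COUNT - 1)   /* highest valid id    */
#define SKETCH_CPU_NONE  SKETCH_CPU_COUNT         /* matches no real CPU */
#define SKETCH_CPUS      ((uint32_t)((1u << SKETCH_CPU_COUNT) - 1u))

/* A per-CPU array, sized by the CPU count and indexed by CPU id. */
uint32_t sketch_tick_count[SKETCH_CPU_COUNT];

/* Record one tick against the given CPU. */
void sketch_count_tick(unsigned cpu)
{
    assert(cpu <= SKETCH_CPU_MAX);
    sketch_tick_count[cpu]++;
}

/* Returns 1 if bit 'cpu' is set in 'mask', else 0. */
int sketch_cpu_in_mask(uint32_t mask, unsigned cpu)
{
    assert(cpu < 32u);
    return (int)((mask >> cpu) & 1u);
}
```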


Starts the given CPU executing at a defined HAL entry point. After performing any HAL level initialization, the CPU calls up into the kernel at cyg_kernel_cpu_startup().


Sends the CPU a reschedule interrupt, and if wait is non-zero, waits for an acknowledgment. The interrupted CPU should call cyg_scheduler_set_need_reschedule() in its DSR to cause the reschedule to occur.


Sends the CPU a timeslice interrupt, and if wait is non-zero, waits for an acknowledgment. The interrupted CPU should call cyg_scheduler_timeslice_cpu() to cause the timeslice event to be processed.

Test-and-set Support

Test-and-set is the foundation of the SMP synchronization mechanisms.


The type for all test-and-set variables. The test-and-set macros only support operations on a single bit (usually the least significant bit) of this location. This allows for maximum flexibility in the implementation.

HAL_TAS_SET( tas, oldb )

Performs a test and set operation on the location tas. oldb will contain true if the location was already set, and false if it was clear.

HAL_TAS_CLEAR( tas, oldb )

Performs a test and clear operation on the location tas. oldb will contain true if the location was already set, and false if it was clear.
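The semantics of the two macros can be sketched in portable C11 using <stdatomic.h>. This is not how any particular eCos HAL implements them, since a HAL would normally use the architecture's own atomic instructions. The SKETCH_ names are hypothetical, and the use of the least significant bit follows the convention described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the test-and-set variable type. */
typedef atomic_uint sketch_tas_t;

/* Atomically set bit 0; oldb receives whether it was already set. */
#define SKETCH_TAS_SET(tas, oldb) \
    ((oldb) = ((atomic_fetch_or(&(tas), 1u) & 1u) != 0u))

/* Atomically clear bit 0; oldb receives whether it was previously set. */
#define SKETCH_TAS_CLEAR(tas, oldb) \
    ((oldb) = ((atomic_fetch_and(&(tas), ~1u) & 1u) != 0u))

/* Walks one location through set, set, clear, clear. */
void sketch_tas_demo(void)
{
    sketch_tas_t flag;
    bool was_set;

    atomic_init(&flag, 0u);
    SKETCH_TAS_SET(flag, was_set);   assert(!was_set); /* was clear   */
    SKETCH_TAS_SET(flag, was_set);   assert(was_set);  /* already set */
    SKETCH_TAS_CLEAR(flag, was_set); assert(was_set);  /* was set     */
    SKETCH_TAS_CLEAR(flag, was_set); assert(!was_set); /* was clear   */
}
```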


Spinlocks

Spinlocks provide inter-CPU locking. Normally they will be implemented on top of the test-and-set mechanism above, but may also be implemented by other means if, for example, the hardware has more direct support for spinlocks.


The type for all spinlock variables.


A value that may be assigned to a spinlock variable to initialize it to clear.


A value that may be assigned to a spinlock variable to initialize it to set.

HAL_SPINLOCK_INIT( lock, val )

A macro to initialize a spinlock at runtime. The current state of the spinlock is set according to val: zero for clear, non-zero for set.


The caller spins in a busy loop waiting for the lock to become clear. It then sets it and continues. This is all handled atomically, so that there are no race conditions between CPUs.


The caller clears the lock. One of any waiting spinners will then be able to proceed.

HAL_SPINLOCK_TRY( lock, val )

Attempts to set the lock. The value put in val will be true if the lock was claimed successfully, and false if it was not.

HAL_SPINLOCK_TEST( lock, val )

Tests the current value of the lock. The value put in val will be true if the lock is claimed and false if it is clear.
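The spinlock operations described in this group could be layered on an atomic exchange roughly as follows. This is a sketch under stated assumptions: it uses C11 atomics in place of HAL test-and-set macros, it is written as functions rather than macros, and every sketch_ name is hypothetical. The assertion in the clear operation is a check added for the sketch and is not part of the documented interface.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock type: 0 means clear, 1 means set. */
typedef atomic_uint sketch_spinlock_t;

/* Initialize at runtime: zero for clear, non-zero for set. */
void sketch_spinlock_init(sketch_spinlock_t *lock, unsigned val)
{
    atomic_init(lock, val ? 1u : 0u);
}

/* Returns true if the lock is currently set. */
bool sketch_spinlock_test(sketch_spinlock_t *lock)
{
    return atomic_load_explicit(lock, memory_order_relaxed) != 0u;
}

/* Busy-wait until this caller is the one that changes 0 to 1. */
void sketch_spinlock_spin(sketch_spinlock_t *lock)
{
    while (atomic_exchange_explicit(lock, 1u, memory_order_acquire) != 0u) {
        /* spin */
    }
}

/* Single attempt: returns true if the lock was claimed. */
bool sketch_spinlock_try(sketch_spinlock_t *lock)
{
    return atomic_exchange_explicit(lock, 1u, memory_order_acquire) == 0u;
}

/* Release the lock so that one waiting spinner can proceed. */
void sketch_spinlock_clear(sketch_spinlock_t *lock)
{
    assert(sketch_spinlock_test(lock)); /* sketch-only sanity check */
    atomic_store_explicit(lock, 0u, memory_order_release);
}
```

The acquire ordering on the claim and release ordering on the clear ensure that writes made while holding the lock are visible to the next CPU that claims it.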

Scheduler Lock

The scheduler lock is the main protection for all kernel data structures. By default the kernel implements the scheduler lock itself using a spinlock. However, if spinlocks cannot be supported by the hardware, or there is a more efficient implementation available, the HAL may provide macros to implement the scheduler lock.


HAL_SMP_SCHEDLOCK_DATA_TYPE

A data type, possibly a structure, that contains any data items needed by the scheduler lock implementation. A variable of this type will be instantiated as a static member of the Cyg_Scheduler_SchedLock class and passed to all the following macros.


Initialize the scheduler lock. The lock argument is the scheduler lock counter and the data argument is a variable of HAL_SMP_SCHEDLOCK_DATA_TYPE type.


Increment the scheduler lock. The first increment of the lock from zero to one for any CPU may cause it to wait until the lock is zeroed by another CPU. Subsequent increments should be less expensive since this CPU already holds the lock.


Zero the scheduler lock. This operation will also clear the lock so that other CPUs may claim it.

HAL_SMP_SCHEDLOCK_SET( lock, data, new )

Set the lock to a different value, in new. This is only called when the lock is already known to be owned by the current CPU. It is never called to zero the lock, or to increment it from zero.
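One plausible shape for a scheduler lock built on a spinlock plus a record of the holding CPU is sketched below. It is not the eCos kernel's implementation. All names are hypothetical, the current CPU is passed explicitly because no real HAL is available in the sketch, and the interrupt masking that a real implementation would need around these operations is omitted.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_CPU_NONE 0xFFu  /* hypothetical "no CPU" value */

/* Hypothetical data for the lock: a spinlock and the holding CPU. */
typedef struct {
    atomic_uint spinlock;  /* 0 = clear, 1 = set */
    atomic_uint holder;    /* holding CPU, or SKETCH_CPU_NONE */
} sketch_schedlock_data_t;

void sketch_schedlock_init(unsigned *lock, sketch_schedlock_data_t *data)
{
    *lock = 0;
    atomic_init(&data->spinlock, 0u);
    atomic_init(&data->holder, SKETCH_CPU_NONE);
}

/* The first increment on a CPU claims the spinlock and may wait for
 * another CPU to zero the lock; later increments only bump the counter. */
void sketch_schedlock_inc(unsigned *lock, sketch_schedlock_data_t *data,
                          unsigned this_cpu)
{
    if (atomic_load_explicit(&data->holder, memory_order_relaxed) != this_cpu) {
        while (atomic_exchange_explicit(&data->spinlock, 1u,
                                        memory_order_acquire) != 0u) {
            /* spin until the holder zeroes the lock */
        }
        atomic_store_explicit(&data->holder, this_cpu, memory_order_relaxed);
    }
    (*lock)++;
}

/* Zero the counter and release the spinlock for other CPUs. */
void sketch_schedlock_zero(unsigned *lock, sketch_schedlock_data_t *data)
{
    *lock = 0;
    atomic_store_explicit(&data->holder, SKETCH_CPU_NONE, memory_order_relaxed);
    atomic_store_explicit(&data->spinlock, 0u, memory_order_release);
}

/* Only valid while this CPU holds the lock; never used to zero it. */
void sketch_schedlock_set(unsigned *lock, sketch_schedlock_data_t *data,
                          unsigned this_cpu, unsigned new_value)
{
    assert(new_value != 0u);
    assert(atomic_load_explicit(&data->holder, memory_order_relaxed) == this_cpu);
    *lock = new_value;
}
```

Only the CPU that has claimed the spinlock ever stores its own id in holder, so a CPU that reads its own id there can skip the spinlock on nested increments.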

Interrupt Routing

The routing of interrupts to different CPUs is supported by two new interfaces in hal_intr.h.

Once an interrupt has been routed to a new CPU, the existing vector masking and configuration operations should take account of the CPU routing. For example, if the operation is not invoked on the destination CPU itself, then the HAL may need to arrange to transfer the operation to the destination CPU for correct application.

HAL_INTERRUPT_SET_CPU( vector, mask )

Route the interrupt for the given vector to any of the CPUs whose bit is set in mask.

HAL_INTERRUPT_GET_CPU( vector, mask )

Set mask to the set of CPUs to which this vector is routed.
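The bookkeeping behind these two operations can be modelled with a per-vector table of CPU masks, as in the sketch below. This is a software model only: a real HAL would program the interrupt controller's routing registers, and the vector count, CPU mask and all sketch_ names here are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_VECTOR_COUNT 32u   /* hypothetical number of vectors   */
#define SKETCH_CPUS         0x3u  /* hypothetical two-CPU system mask */

/* Routing table: entry v holds the CPU mask for vector v. */
uint32_t sketch_route[SKETCH_VECTOR_COUNT];

/* Route every vector to CPU 0, a common starting configuration. */
void sketch_interrupt_routing_init(void)
{
    unsigned v;
    for (v = 0; v < SKETCH_VECTOR_COUNT; v++)
        sketch_route[v] = 0x1u;
}

/* Route 'vector' to the CPUs whose bits are set in 'mask'. */
void sketch_interrupt_set_cpu(unsigned vector, uint32_t mask)
{
    assert(vector < SKETCH_VECTOR_COUNT);
    assert(mask != 0u && (mask & ~SKETCH_CPUS) == 0u);
    sketch_route[vector] = mask;
    /* A real HAL would write the interrupt controller here. */
}

/* Return the set of CPUs to which 'vector' is currently routed. */
uint32_t sketch_interrupt_get_cpu(unsigned vector)
{
    assert(vector < SKETCH_VECTOR_COUNT);
    return sketch_route[vector];
}
```

Unlike the macro form, which stores its result in the mask argument, the sketch returns the mask as a function result.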

Documentation license for this page: Open Publication License