Memory Debug Data

Description

Generally it is more difficult to debug an application that allocates memory dynamically than one that relies entirely on static allocation. Some problems such as buffer overflows can affect both. However the locations of static variables are readily determined from the linker map and debug information, so it is much easier to figure out which static buffer overflowed and then find the offending code. With dynamic allocation buffer overflows can still be detected, but it is much harder to figure out what each buffer is used for.

Another problem is excessive memory usage. A typical embedded system is designed with the smallest amount of memory that should suffice for the application. Often the application uses more memory than expected, and it is necessary to find out exactly where it is all going and where savings could be made. The alternative is a hardware redesign, associated delays, and increased manufacturing costs. A linker map gives details of the static data but not of dynamic allocations.

A third problem is memory leaks. If an application allocates memory that does not get freed then the heap will eventually run out. Usually this causes a system failure and means a reboot. It may take hours, days or even weeks, but any system failure is at best undesirable and at worst totally unacceptable.

The memory allocation package provides a debug data facility to assist developers faced with these problems. This involves storing additional metadata on the target for each allocated memory chunk, for example the function where the allocation occurred and the time that it happened. Configuration options control exactly what metadata gets collected. The debug data can be transferred from the target to the host in a gdb session, and then analyzed using the ecosmdd program. This provides a number of sub-commands: stats, dump, history and diff. It also provides various options for filtering, sorting and formatting the debug data.

Configuration Options

Memory debug data is not free. Collecting the debug data on the target requires extra memory and cpu cycles. To be useful the debug data has to be transferred to the host, and this can be time-consuming. If the application developer is tracking down problems with running out of memory then the debug data exacerbates the situation. Hence by default memory debug data is disabled, and there are configuration options to control exactly what gets enabled.

The first option to consider is CYGDBG_MEMALLOC_DEBUG_DEBUGDATA. This has to be explicitly enabled by the developer. If it is left disabled then no debug data functionality is available.

Once the main debug data option has been enabled the memory allocation code will collect information about all current allocations. The minimum information needed is a pointer to the allocated data, the number of bytes involved, a 32-bit sequence number to allow the host-side to identify and sort the allocations, plus another pointer for linked list management. This gives a minimum overhead of 16 bytes per allocated chunk (assuming a typical 32-bit processor). However this allows for only limited analysis. Additional fields are controlled by separate configuration options:

CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_ACTUAL_SIZE

When the application requests, say, 12 bytes of data the memory allocation code will actually allocate more than this. There is some unavoidable overhead to keep track of the various allocations. There may be alignment restrictions. Optional debug guards add to the overhead. If CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_ACTUAL_SIZE is enabled then the debug data will include the actual size of each allocation, not just the requested size. By default this option is enabled. The cost is an extra size_t, usually four bytes, in each allocation record.

CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_TIMESTAMP

Every allocation record in the debug data contains a unique sequence number, a simple 32-bit counter. Amongst other uses this allows host-side tools to sort allocation events in time-order. However a sequence number does not give any information about the time elapsed between allocations. More detailed time information can be very useful, for example to associate allocations with external events. When this option is enabled each allocation record also holds a timestamp, in the form of a cyg_tick_count_t as returned by the kernel function cyg_current_time(). The typical cost is an extra eight bytes in each allocation record.

CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_TIMESTAMP is enabled by default if the eCos kernel CYGPKG_KERNEL is present. It cannot be enabled if the configuration does not include the kernel.

CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_THREAD

In multi-threaded applications it can be useful to know which thread allocated which chunk of memory. For example if the application is structured as a set of mostly independent subsystems operating in separate threads then each subsystem's memory usage can be analyzed separately. Optionally the debug data can include thread information, consisting of a unique numerical thread id, the cyg_handle_t identifying the thread, and the thread name as passed to cyg_thread_create(). The overhead is a 32-bit integer in each allocation record, plus a small amount of extra memory to keep track of the threads that have performed memory allocations.

CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_THREAD is enabled by default if the eCos kernel CYGPKG_KERNEL is present. It cannot be enabled if the configuration does not include the kernel.

CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_BACKTRACE

Arguably the most useful information about each memory allocation is a partial backtrace, identifying the code responsible for each allocation. On the target side this is implemented using the support function __builtin_return_address() provided by the gcc compiler. On the host-side the executable can be disassembled to map a return address onto the calling function. If the executable contains -g debug information then it may also be possible to work out the corresponding source file name and line number, and hence the exact line of code that performed the allocation.

CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_BACKTRACE is enabled by default, with a value of 1. This means the debug data will contain a single level of backtrace, e.g. the function that called malloc(). The backtrace level can be increased up to a maximum of 8, giving more detailed information about each allocation. This is especially useful when allocations occur inside library code since it gives a closer association between application actions and memory allocations. Higher levels do involve extra memory overhead, a 32-bit integer per level per allocation record, and extra cpu cycles.
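As a rough illustration of the mechanism, the following sketch shows how a single level of backtrace could be captured with __builtin_return_address(). The wrapper debug_malloc(), the alloc_record structure and the last_record variable are hypothetical names used only for this example; this is not the eCos implementation, and it assumes a gcc-compatible compiler.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record; the real eCos debug data layout is not shown here. */
struct alloc_record {
    void   *ptr;     /* pointer returned to the caller */
    size_t  size;    /* number of bytes requested */
    void   *caller;  /* level 1 backtrace: return address into the caller */
};

/* Hypothetical storage for the most recent allocation. */
struct alloc_record last_record;

/* noinline keeps the wrapper a real function, so that the return
 * address identifies the code that called debug_malloc(). */
__attribute__((noinline))
void *debug_malloc(size_t size)
{
    void *ptr = malloc(size);

    last_record.ptr  = ptr;
    last_record.size = size;
    /* Level 0 is the return address of the current function. Higher
     * levels are not reliable on every architecture. */
    last_record.caller = __builtin_return_address(0);
    return ptr;
}
```

After a call such as debug_malloc(12), last_record.caller holds an address inside the calling function, which a host-side tool could map back to a function name by disassembling the executable.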

Important

On many architectures the GNU tools only provide limited backtrace functionality. Often only a single level of backtrace is available. If CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_BACKTRACE is set to a value greater than 1 the compiler will issue warnings when building the memory allocation package, and the extra debug data backtrace slots will just be filled with zeroes.

Even if backtrace information is available it is not always as useful as might be thought. Because of compiler optimizations the relation between the generated code and the original source is not always obvious, so when the host-side tools convert a return address to a source file and line number the results may not be exactly correct. For backtrace levels greater than 1 the results may even be completely wrong. The details will vary from architecture to architecture. When the code involves C++ template instantiation the compiler may not provide enough debug information to allow the backtrace pointers to be analyzed fully.

Depending on which options and how many backtrace levels are enabled, each allocation record will take up between 16 and 64 bytes of data on a 32-bit processor, and somewhat more on a 64-bit processor.
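The 16 to 64 byte range follows from adding up the per-field costs quoted above. The sketch below reproduces that arithmetic for a 32-bit target. The dd_options structure and dd_record_size() function are hypothetical helpers for this example, and the result ignores any padding a real compiler might add.

```c
#include <stddef.h>

/* Hypothetical mirror of the configuration options, for illustration only. */
struct dd_options {
    int actual_size;       /* CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_ACTUAL_SIZE enabled? */
    int timestamp;         /* CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_TIMESTAMP enabled? */
    int thread;            /* CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_THREAD enabled? */
    int backtrace_levels;  /* number of backtrace levels, at most 8 */
};

/* Approximate bytes per allocation record on a 32-bit target,
 * using the sizes quoted in the text. */
size_t dd_record_size(const struct dd_options *opt)
{
    size_t bytes = 16;  /* data pointer, size, sequence number, list pointer */

    if (opt->actual_size) bytes += 4;   /* size_t */
    if (opt->timestamp)   bytes += 8;   /* cyg_tick_count_t */
    if (opt->thread)      bytes += 4;   /* 32-bit thread id */
    bytes += 4 * (size_t)opt->backtrace_levels;  /* one 32-bit slot per level */
    return bytes;
}
```

With every optional field disabled this gives 16 bytes, and with every field enabled plus 8 backtrace levels it gives 64 bytes. The default settings with the kernel present (all fields, one backtrace level) come to 36 bytes under these assumptions.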

By default memory debug data is collected only for current allocations. This is sufficient for many debug purposes. For example if the problem is a buffer overflow then looking at the current allocations usually allows the developer to determine what the buffer and the surrounding allocations are used for. A complete dump of all current allocations can be used to figure out what all the memory is being used for. Examining two dumps separated in time can be used to track down memory leaks. However sometimes it is necessary to know about free operations as well as current allocations. A good example is identifying which thread freed a chunk that other threads still believe to be usable. To support this it is possible to collect historical debug data as well as the details of all current allocations.

There is a major problem with historical debug data. The number of current allocations is limited by the memory available on the target, so typically will be somewhere between 100 and 10000. The corresponding debug data will occupy between 2K and 640K of the available target-side memory, and there is an implicit upper bound. Historical data does not have an upper bound: an application may make millions of malloc() and free() calls yet never have more than 100 allocations at any one time. Those millions of history records would occupy many megabytes of target-side memory. Typical targets do not have such amounts of spare memory, and even if they do transferring the history to the host for analysis would be very time-consuming. Therefore it is not practical to keep a full history. Instead the history debug data goes into a circular buffer, so only the last n records are kept. Overflows are detected and the application developer can take action, if desired.

By default history is disabled, controlled by the configuration option CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_HISTORY. If enabled the number of entries in the history circular buffer is controlled by CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_HISTORY_RECORDS, with a default value of 2048. Each history record stores both allocation and free debug data, so is approximately twice the size of an allocation record. With default settings the history circular buffer will occupy approximately 100K of target-side memory.
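The behaviour of a circular history buffer with overflow detection can be sketched as follows. This is a generic ring buffer and not the eCos code: the names history, history_add() and history_oldest() are hypothetical, and the buffer is kept tiny so that the wrap-around is easy to follow.

```c
#include <stddef.h>
#include <stdint.h>

#define HISTORY_RECORDS 4   /* tiny for illustration; the eCos default is 2048 */

/* Hypothetical history record holding only a sequence number. */
struct history_record {
    uint32_t seqno;
};

struct history {
    struct history_record records[HISTORY_RECORDS];
    size_t   next;       /* slot that the next record will use */
    size_t   count;      /* valid records, at most HISTORY_RECORDS */
    unsigned overflows;  /* number of old records that were overwritten */
};

/* Append a record, overwriting the oldest one once the buffer is full. */
void history_add(struct history *h, uint32_t seqno)
{
    if (h->count == HISTORY_RECORDS) {
        h->overflows++;  /* the oldest record is about to be lost */
    } else {
        h->count++;
    }
    h->records[h->next].seqno = seqno;
    h->next = (h->next + 1) % HISTORY_RECORDS;
}

/* Oldest surviving sequence number; only meaningful when count > 0. */
uint32_t history_oldest(const struct history *h)
{
    size_t oldest = (h->next + HISTORY_RECORDS - h->count) % HISTORY_RECORDS;
    return h->records[oldest].seqno;
}
```

Adding six records with sequence numbers 0 to 5 to a zero-initialized history leaves the four most recent ones in the buffer, with two overflows recorded and sequence number 2 as the oldest survivor.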

Enabling memory debug data does not affect the memory allocation APIs: applications just call malloc() and free() as usual. Similarly C++ applications can use the new and delete operators, but to get the maximum benefits of the backtrace info it is desirable to enable CYGFUN_MEMALLOC_MALLOC_CXX_OPERATORS.

Dumping the Debug Data with GDB

When an application is linked with a suitable eCos configuration, the memory debug data will be collected automatically on the target-side. This debug data needs to be transferred to the host, and a number of gdb macros are provided for this purpose. The application is debugged in a gdb session as usual. At an appropriate time the target is halted and the appropriate gdb macro is invoked. This will transfer the current debug data to the host, generating a file mddout.0 which can then be fed into the ecosmdd analysis program.

The gdb macros can be found in the file mdd.gdb in the memory allocation package's host subdirectory. Typically this gdb script will be source'd by the user's own .gdbinit gdb initialization script, so that the macros are always available. Alternatively the macros can be copied directly into that file, albeit at the risk of complications if the macros get updated in a future version of this package. The host subdirectory also contains a program ecosmdd (actually a portable Tcl script). This must be installed in an appropriate location that is on the user's PATH. The gdb macros rely on being able to execute this program.

The main macro is mdd_dump. It does not take any arguments. Usually it will just transfer the memory debug data to the host. However there is a problem if the target-side code was in the middle of updating the debug data: that data may not be in an entirely consistent state. To avoid problems mdd_dump checks a target-side busy flag. If the flag is set it reports that a dump may currently be unsafe instead of proceeding with the dump. The function cyg_memalloc_dd_done will be called once the debug data has been updated, so an application developer can set a temporary breakpoint on that function and let the application continue briefly. Alternatively there is a separate macro mdd_dumpnow. This will ignore the busy flag and proceed with the dump, irrespective of what the target happened to be doing when it was halted. There is a very small possibility that the resulting dump file will have problems.

Note

The memory allocation code treats the actual allocation and the updating of the debug info as separate steps. Hence it is possible that a chunk of memory has just been allocated or freed, but the mddout.0 dump file will not yet show this. Usually this temporary discrepancy is not important: it can only matter if the application developer is analysing the debug data and the target-side state concurrently. However application developers should be aware of the possibility. An alternative implementation involving more locking would be possible, but at the cost of potentially significant changes in the application's behaviour.

The time taken to generate a dump file will depend both on how much debug data is collected and on the debug communication channel. It can take anywhere from several seconds to many minutes. Enabling the history circular buffer can significantly increase the time needed.

Sometimes it is desirable to generate more than one mddout dump file in a single debug session. For example the user may want to halt the application at two specific points in the run and find out what allocations have occurred between these points. The first invocation of mdd_dump or mdd_dumpnow will produce a dump file mddout.0. Subsequent invocations will produce dump files mddout.1, mddout.2, and so on. If desired the numbering can be reset using the mdd_reset macro. The next debug session will again produce files mddout.0, mddout.1 and so on, overwriting the previous run's results. The macro scripting facilities in gdb are rather limited, so the file naming is actually handled by invoking the ecosmdd program.

If the debug data includes the history circular buffer there is special support for handling overflows. This makes it possible to collect complete history information, spread over a number of mddout dump files, which can then be analyzed together. When an overflow occurs the target-side will call the function cyg_memalloc_dd_history_overflow(). Application developers can set a breakpoint on this function, and use mdd_dump whenever the breakpoint is hit to generate another dump file with a whole buffer's worth of history records. mdd_dump will automatically reset the circular buffer. CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_HISTORY_RECORDS can be increased to reduce the number of dump files that are needed, at the cost of target-side memory.

A similar technique can be used for other purposes. For example the application developer may want to know the state of the heap once it has reached approximately 80% full. One way of achieving this is to have a separate high-priority thread which calls mallinfo() at regular intervals. When it detects the desired condition it calls a special function. The developer sets a breakpoint on that function and can then take appropriate action when the condition is satisfied.
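A minimal sketch of the threshold check might look like the following. The heap_snapshot structure, heap_check() and the breakpoint hook heap_threshold_reached() are all hypothetical names. On the target the snapshot would be filled in from the heap information returned by the allocator, and heap_check() would be called from the monitoring thread's loop; neither is shown here.

```c
#include <stddef.h>

/* Hypothetical snapshot of heap usage. On the target these values
 * would be derived from the heap information the allocator reports. */
struct heap_snapshot {
    size_t total;  /* total heap size in bytes */
    size_t used;   /* bytes currently allocated */
};

/* Counts how often the hook has fired, purely for this example. */
unsigned heap_threshold_hits;

/* Hypothetical hook: it exists so that a gdb breakpoint can be set on it. */
__attribute__((noinline))
void heap_threshold_reached(void)
{
    heap_threshold_hits++;
}

/* Returns 1 and calls the hook when usage has reached the given
 * percentage of the heap, otherwise returns 0. */
int heap_check(const struct heap_snapshot *snap, unsigned percent)
{
    /* Widen before multiplying so the products cannot overflow
     * when size_t is only 32 bits. */
    unsigned long long used_scaled  = (unsigned long long)snap->used * 100u;
    unsigned long long limit_scaled = (unsigned long long)snap->total * percent;

    if (used_scaled >= limit_scaled) {
        heap_threshold_reached();
        return 1;
    }
    return 0;
}
```

With a breakpoint on heap_threshold_reached, the debugger stops the first time the check succeeds, at which point a dump can be taken.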

Extracting Statistics

The ecosmdd stats command is the simplest of the available analysis tools. It just takes a single argument, an mddout dump file:

$ ecosmdd stats mddout.0
mddout.0: statistics
Heap      : 0x00097d68 to 0x01ffffff, size 32160K (32932504 bytes)
History   : 132773 memory allocations, 130257 frees
Current   : 1268K (1298850 bytes) used in 2516 allocations
Actual    : approximately 1508K (1544784 bytes)
Overhead  : approximately 240K (245934 bytes), 15%
Debugdata : approximately 107K (110280 bytes) static, 92K (94628 bytes) dynamic
          : (debug data is in addition to other overheads)
Allocators:
malloc() 1009
new(nothrow) 788
new(nothrow)[] 451
calloc() 251
realloc() 17
Threads   :
        1 : handle 0x00075670, Idle Thread
        2 : handle 0x00093af8, main
        3 : handle 0x000739b0, thread_0
        4 : handle 0x00073a50, thread_1
        5 : handle 0x00073af0, thread_2
        6 : handle 0x00073b90, thread_3
Options   : actual_size enabled, time stamps enabled, thread info enabled
          : backtrace enabled, 1 levels
          : history enabled, 2048 records max

The fields in the output are as follows:

Heap: The start and end address of the heap and its size. This example is for a development board with a generous 32MB. Approximately 600K is used for application code and static data and for RedBoot, leaving most of the memory available for dynamic memory allocation.
History: Total numbers of past allocations and frees, with the difference corresponding to the number of current allocations. Note that the total size of past allocations is not recorded because of the likelihood of an overflow and hence misleading data.
Current: Totals for the current allocations, giving the size as requested by application code.
Actual: The actual amount of memory used for these allocations, allowing for overhead. This information is only available if CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_ACTUAL_SIZE is enabled. Note that the numbers are approximate: they only count per-allocation overhead; there may be additional costs for pool data structures and the like which are not included; usually these are sufficiently small that they can be ignored.
Overhead: The difference between the above two fields. For this example the overhead is comparatively high. The configuration included support for debug guards which adds an extra 12 bytes to each allocation plus whatever was needed by the allocation code itself. Most of the allocations were small, so the guards have a disproportionate effect.
Debugdata: Additional memory needed for the debug data, both static and dynamic. The configuration included a history circular buffer with default settings, accounting for most of the static cost. The debug data for 2516 current allocations account for most of the dynamic costs, and is not included in the earlier figures. The results of mallinfo() will include the dynamic debug data.
Allocators: Counts for the various types of dynamic memory allocation.
Threads: A list of the various threads: unique id, a cyg_handle_t handle, and the name passed to cyg_thread_create. This information is only available if CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_THREAD is enabled. The ids can be used in a filter to show only allocations performed by the specified thread. The code only keeps track of threads involved with dynamic memory allocation, not every thread in the system. It is actually unlikely that the idle thread allocated any memory. Instead allocations during system initialization, before the scheduler was started, will usually be ascribed to the idle thread.
Options: Details of the relevant configuration options. This can be useful when figuring out what filters, sort keys, or format specifiers are permitted, as an alternative to checking the configuration options.
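The relationships between the History, Current, Actual and Overhead figures can be checked with a little arithmetic. The sketch below uses a hypothetical mdd_stats structure and helper functions; it is not how ecosmdd is implemented. Rounding the percentage down relative to the actual size is an assumption that happens to match the sample output.

```c
#include <stdint.h>

/* Hypothetical container for the figures shown by ecosmdd stats. */
struct mdd_stats {
    uint32_t allocations;      /* History: total allocations */
    uint32_t frees;            /* History: total frees */
    uint32_t requested_bytes;  /* Current: bytes requested by the application */
    uint32_t actual_bytes;     /* Actual: bytes used including overhead */
};

/* Allocations that have not yet been freed. */
uint32_t mdd_current_allocations(const struct mdd_stats *s)
{
    return s->allocations - s->frees;
}

/* Per-allocation overhead: actual size minus requested size. */
uint32_t mdd_overhead_bytes(const struct mdd_stats *s)
{
    return s->actual_bytes - s->requested_bytes;
}

/* Overhead as a whole percentage of the actual size, rounded down. */
uint32_t mdd_overhead_percent(const struct mdd_stats *s)
{
    uint64_t scaled = (uint64_t)mdd_overhead_bytes(s) * 100u;
    return (uint32_t)(scaled / s->actual_bytes);
}
```

Plugging in the sample values (132773 allocations, 130257 frees, 1298850 bytes requested, 1544784 bytes actual) gives 2516 current allocations, 245934 bytes of overhead and 15%, in agreement with the output above.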

Dumping Current Allocations

The ecosmdd dump command can be used to analyze an mddout dump file and report on all current allocations.

$ ecosmdd dump consume mddout.0
0x00097d78 : malloc() 256 bytes, actual size 272 (+16), seqno 0, time 0
  By thread 1, 0x00075670 Idle Thread
  1) backtrace 0x0004da74 function Cyg_StdioStreamBuffer::set_buffer(unsigned, unsigned char*)
    /opt/ecos/packages/language/c/libc/stdio/current/src/common/streambuf.cxx:96
    "        malloced_buf = (cyg_uint8 * )malloc( size );"
0x000c0f50 : malloc() 13 bytes, actual size 32 (+19), seqno 229605, time 3960
  By thread 3, 0x000739b0 thread_0
  1) backtrace 0x00040ed8 function worker2()
    /tmp/mdd/consume.cxx:393
    "            allocs[index].data.c    = malloc(size);"
0x000c0f70 : new(nothrow) 1024 bytes, actual size 1040 (+16), seqno 251083, time 4329
  By thread 5, 0x00073af0 thread_2
  1) backtrace 0x00040c48 function worker1(int)
    /tmp/mdd/consume.cxx:315
    "            allocs[index].data.large    = new(std::nothrow) Large;"
…

consume is the executable. This output shows the first three allocation records, sorted in address order. The fields are as follows:

The address of the allocated chunk. This is the pointer that would be returned by e.g. malloc(). The memory allocation code may store some header information before this address, but that is transparent to the application. There is a big gap between the first and second records because the application freed a large buffer just before the dump file was generated.
The memory allocation function that was called to get this chunk. This may be a standard C library function or a C++ operator.
The allocation size requested by the application.
The actual allocation size and, in brackets, the overhead. This is provided only if CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_ACTUAL_SIZE is enabled.
A sequence number. The first record shows the very first dynamic memory allocation in this test run, performed by the standard I/O initialization code. Sequence numbers are generated using a simple incrementing counter and are unique within a test run. The counter can overflow, but that is only likely to happen if an application makes very intensive use of malloc() and runs for several days.
A timestamp. This is a kernel cyg_tick_count_t as returned by the kernel function cyg_current_time(). Usually it corresponds to a counter running at 100Hz, so the second record is for a malloc() that occurred about 40 seconds into the run. Timestamps are only listed if CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_TIMESTAMP is enabled.
A line of thread information showing the thread id, handle and name. This requires CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_THREAD.
The level 1 backtrace. The first line gives the return address and the calling function. The second line gives a source code file name and line number. The third line shows the actual source line. In the third record the source code shows a C++ Large object being created. If enabled, additional levels of backtrace will follow.
The function name is only available if the executable is specified on the command line. The file name and line number are only available if the executable contains -g debug information for the specified function. Usually this will be true for the application code itself and for eCos code, but not for other libraries supplied in binary format. The source line is only available if the file name and line number are known and the relevant file can be found on the current system. Again this may not be true for libraries supplied in binary format.

The executable does not have to be specified on the command line. Disassembling it can take considerable time, and serves only to provide more detailed backtrace information. Typical output without an executable would look like:

$ ecosmdd dump mddout.0 | more
0x00097d78 : malloc() 256 bytes, actual size 272 (+16), seqno 0, time 0
  By thread 1, 0x00075670 Idle Thread
  1) backtrace return address 0x0004da74
…

The dump subcommand accepts the standard options for architecture, ignoring certain files, sorting the output, applying filters, and formatting each record. For example to show only partial information for the allocations performed by thread 4 between approximately 40 and 42 seconds into the run, sorted by size with largest first, then by allocation time earliest first, the following can be used:

$ ecosmdd dump -Fthread=4 -Ftime_min=4000 -Ftime_max=4200 -SNs \
        -f '%p %a %n @ %T' mddout.0
0x002a43e8 malloc() 1553 @ 4079
0x000f9308 malloc() 1139 @ 4149
0x000c2a80 new(nothrow) 1024 @ 4104
0x00292428 new(nothrow)[] 388 @ 4194
0x000e0998 malloc() 240 @ 4147
0x00275678 new(nothrow) 128 @ 4013
0x00238c40 new(nothrow) 128 @ 4104
0x0023f7a8 new(nothrow) 128 @ 4194
0x000e1048 malloc() 18 @ 4014
0x0032cfe8 new(nothrow) 16 @ 4106
0x00131fa0 malloc() 8 @ 4107
0x0015d1a8 malloc() 8 @ 4125
0x002162d8 malloc() 7 @ 4129
0x001217c0 malloc() 7 @ 4190

The options should immediately follow the dump subcommand, before the executable or mddout file.
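The filtering and ordering used in that example (one thread, a time window in ticks, largest size first, then earliest time first) can be sketched in C as below. The alloc_info structure and both functions are hypothetical. Treating the time bounds as inclusive is an assumption, and the code illustrates the ordering only, not the ecosmdd implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical subset of the fields in an allocation record. */
struct alloc_info {
    unsigned thread;  /* thread id */
    uint32_t size;    /* requested size in bytes */
    uint64_t time;    /* timestamp in ticks */
};

/* Keep only the records from one thread inside a tick window
 * (bounds assumed inclusive). The kept records are compacted to
 * the front of the array and their number is returned. */
size_t filter_records(struct alloc_info *recs, size_t n, unsigned thread,
                      uint64_t time_min, uint64_t time_max)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (recs[i].thread == thread &&
            recs[i].time >= time_min && recs[i].time <= time_max) {
            recs[kept++] = recs[i];
        }
    }
    return kept;
}

/* qsort comparator: largest size first, then earliest time first. */
static int by_size_then_time(const void *a, const void *b)
{
    const struct alloc_info *x = a;
    const struct alloc_info *y = b;

    if (x->size != y->size) return (x->size > y->size) ? -1 : 1;
    if (x->time != y->time) return (x->time < y->time) ? -1 : 1;
    return 0;
}

void sort_records(struct alloc_info *recs, size_t n)
{
    qsort(recs, n, sizeof recs[0], by_size_then_time);
}
```

With a 100Hz tick, a window of 40 to 42 seconds corresponds to ticks 4000 to 4200, which is why those values appear in the filter options above.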

Showing the History

$ ecosmdd history consume mddout.0
Caution: history is incomplete.

malloc() 256 bytes: 0x00097d78 , actual size 272 (+16), seqno 0, time 0
  By thread 1, 0x00075670 Idle Thread
  1) backtrace 0x0004da74 function Cyg_StdioStreamBuffer::set_buffer(unsigned, unsigned char*)
    /opt/ecos/packages/language/c/libc/stdio/current/src/common/streambuf.cxx:96
    "        malloced_buf = (cyg_uint8 * )malloc( size );"
malloc() 131072 bytes: 0x00097e88 (freed) , actual size 131088 (+16), seqno 1, time 0
  By thread 2, 0x00093af8 main
  1) backtrace 0x000415b4 function main
    /tmp/mdd/consume.cxx:575
    "    spare   = malloc(128 * 1024);"
new(nothrow) 16 bytes: 0x00319270 (freed) , actual size 32 (+16), seqno 223425, time 3851
  By thread 3, 0x000739b0 thread_0
  1) backtrace 0x00040f4c function worker2()
    /tmp//consume.cxx:409
    "            allocs[index].data.small    = new(std::nothrow) Small;"
…
delete 16 bytes: 0x00156218 , actual size 40 (+24), seqno 258950, time 4461
  By thread 5, 0x000739b0 thread_0
  1) backtrace 0x00040cb8 function worker1(int)
    /tmp/mdd/consume.cxx:251
    "        break;"
free() 347 bytes: 0x001e6b08 , actual size 368 (+21), seqno 258951, time 4461
  By thread 6, 0x000739b0 thread_0
  1) backtrace 0x000409a4 function worker1(int)
    /tmp/mdd/consume.cxx:216
    "            free(allocs[index].data.c);"
…

Here ecosmdd has processed the executable and read in both the history data and the current allocation records from mddout.0. The file does not contain complete history information: there have been at least 258951 allocation and free operations, and the history buffer only stores the last 2048 frees. Each record is output in a similar format to ecosmdd dump. However history analysis is based around the order of events rather than the current state of the heap so the allocation function is shown before the address.

The first record shows the first allocation in the system, and it is still allocated. Next comes the second allocation, which has been freed. This information will have come from the history circular buffer, implying that the buffer was freed in one of the last 2048 free operations. The third record shows another buffer that has been freed recently. There are no records between sequence numbers 1 and 223425, so all memory that has been allocated in the interval has already been freed and the relevant records are no longer in the history buffer.

The next two records show delete and free() operations. The format is essentially the same. The sequence number, timestamp, thread and backtrace information correspond to the free operation, not the allocation. Note that for the delete operation ecosmdd failed to get the source line number right: the delete invocation actually occurred a couple of lines earlier. Unfortunately the debug information in the executable was not sufficiently precise.

By default the history records will be shown earliest first. This order can be reversed with a -r option. ecosmdd history also accepts the standard options for architecture, ignoring certain files, applying filters, and formatting each record. The standard sort option is not supported because history implies sorting in time order. For example:

$ ecosmdd history -r -f '%a %p, %n bytes, seqno %s' consume mddout.0
Caution: history is incomplete.

free() 0x00097e88, 131072 bytes, seqno 263029
free() 0x00302490, 11 bytes, seqno 263028
new(nothrow) 0x00182cc0, 128 bytes, seqno 263027
…

If the desired history information is spread over more than one mddout file then they can all be passed to ecosmdd history. For example:

$ ecosmdd history -r consume mddout.0 mddout.1 mddout.2 mddout.3
…

Options and the executable are handled as before. The mddout files should be listed in order of creation, and should correspond to a single test run. ecosmdd will extract both the history circular buffer and the current allocation data for the last file, but only the history buffers for the earlier ones - details of their current allocations can be found in later files. Obviously if eCos has been configured with CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_HISTORY disabled then only the last file will contain useful information.

Comparing Two mddout Files

Sometimes, especially when tracking down a memory leak, it is useful to compare two dump files taken at different times and see what has changed. This functionality is provided by ecosmdd diff:

$ ecosmdd diff consume mddout.1 mddout.2
File mddout.1 : 1496K (1532048 bytes) used in 2543 allocations.
File mddout.2 : 1488K (1523769 bytes) used in 2483 allocations.
1331 new allocations in mddout.2 but not in mddout.1
1391 allocations in mddout.1 but freed in mddout.2

New allocations in mddout.2 but not in mddout.1
0x000ba8a8 : new(nothrow)[] 228 bytes, actual size 256 (+28), seqno 11001, time 214
  By thread 3, 0x000739b0 thread_0
  1) backtrace 0x00040fc0 function worker2()
    /tmp/mdd/consume.cxx:417
    "            allocs[index].data.smallv   = new(std::nothrow) Small[count];"
0x000ba9a8 : new(nothrow) 128 bytes, actual size 144 (+16), seqno 11528, time 222
  By thread 5, 0x00073af0 thread_2
  1) backtrace 0x00040b78 function worker1(int)
    /tmp/mdd/consume.cxx:296
    "            allocs[index].data.medium    = new(std::nothrow) Medium;"
…
Allocations in mddout.1 but freed in mddout.2
0x000ba9a8 : new(nothrow) 128 bytes, actual size 144 (+16), seqno 8213, time 162
  By thread 5, 0x00073af0 thread_2
  1) backtrace 0x00040b78 function worker1(int)
    /tmp/mdd/consume.cxx:296
    "            allocs[index].data.medium    = new(std::nothrow) Medium;"
0x000baa68 : new(nothrow) 128 bytes, actual size 144 (+16), seqno 9539, time 185
  By thread 6, 0x00073b90 thread_3
  1) backtrace 0x00041028 function worker2()
    /tmp/mdd/consume.cxx:424
    "            allocs[index].data.medium    = new(std::nothrow) Medium;"
…

The output begins with some statistics about the two dump files. Next comes a list of all memory chunks allocated in the second file but not in the first, and of all chunks allocated in the first but not the second. The diff uses the unique sequence number so will not be fooled if a chunk is freed and then allocated again.
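Comparing by sequence number rather than by address can be sketched as a merge over two sorted lists. The function seqno_diff() below is a hypothetical illustration and not the ecosmdd implementation. It assumes both inputs are sorted in ascending order, that the output arrays are at least as large as the corresponding inputs, and that the sequence counter has not wrapped around.

```c
#include <stddef.h>
#include <stdint.h>

/* Compare the sequence numbers of the current allocations in two dumps.
 * Entries found only in `first` are chunks that were freed in between;
 * entries found only in `second` are new allocations. */
void seqno_diff(const uint32_t *first, size_t n_first,
                const uint32_t *second, size_t n_second,
                uint32_t *only_first, size_t *n_only_first,
                uint32_t *only_second, size_t *n_only_second)
{
    size_t i = 0, j = 0;

    *n_only_first = 0;
    *n_only_second = 0;

    while (i < n_first && j < n_second) {
        if (first[i] == second[j]) {
            i++;    /* present in both dumps: still allocated */
            j++;
        } else if (first[i] < second[j]) {
            only_first[(*n_only_first)++] = first[i++];
        } else {
            only_second[(*n_only_second)++] = second[j++];
        }
    }
    /* Whatever remains in either list has no counterpart in the other. */
    while (i < n_first) {
        only_first[(*n_only_first)++] = first[i++];
    }
    while (j < n_second) {
        only_second[(*n_only_second)++] = second[j++];
    }
}
```

In the sample output the address 0x000ba9a8 appears in both lists, once with sequence number 8213 (freed) and once with 11528 (new). Because the comparison uses sequence numbers, the two are correctly treated as different allocations even though the address was reused.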

ecosmdd diff accepts the standard options for architecture, ignoring certain files, sorting the output, applying filters, and formatting each record. These options can be followed by the executable, which is optional but is needed to get extended backtrace information. Finally there should be two mddout files:

$ ecosmdd diff -Fsize_min=10240 -f '%n bytes at %p by %f1' -SN \
        consume mddout.1 mddout.2
File mddout.1 : 1496K (1532048 bytes) used in 2543 allocations.
File mddout.2 : 1488K (1523769 bytes) used in 2483 allocations.
1331 new allocations in mddout.2 but not in mddout.1
1391 allocations in mddout.1 but freed in mddout.2

New allocations in mddout.2 but not in mddout.1
19691 bytes at 0x0025d498 by worker1(int)
19233 bytes at 0x00273758 by worker2()
…

Allocations in mddout.1 but freed in mddout.2
19858 bytes at 0x0025d498 by worker2()
18085 bytes at 0x001a2bf8 by worker2()
…

Standard Options

The various ecosmdd subcommands accept a number of standard options for specifying the architecture, ignoring certain source files, sorting and filtering the output, and formatting each record.

Specifying the Architecture

To provide extended backtrace information ecosmdd needs to disassemble the supplied executable. This involves running the appropriate objdump command, for example arm-elf-objdump or m68k-elf-objdump. ecosmdd reads in the executable's ELF header and uses this to work out the architecture. If it fails the architecture must instead be specified on the command line, for example:

$ ecosmdd dump -Adeepthought-elf …

ecosmdd will now try to run deepthought-elf-objdump to disassemble the executable.
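As a rough illustration of the lookup described above, the following Python sketch reads the `e_machine` field from an ELF header and maps it to an objdump command name. The ELF field offsets and the machine codes (4 for m68k, 40 for ARM) come from the ELF specification. The mapping table, the function name `objdump_for` and the `override` parameter are assumptions made for this sketch; the mapping actually used by ecosmdd may differ.

```python
import struct

# Hypothetical mapping from ELF e_machine codes to toolchain prefixes.
# EM_68K is 4 and EM_ARM is 40 in the ELF specification.
EM_TO_PREFIX = {4: "m68k-elf", 40: "arm-elf"}


def objdump_for(header, override=None):
    """Return the objdump command name for an ELF header (at least 20 bytes).

    'override' plays the role of the -A option: if given, it is used as is.
    """
    if override:
        return override + "-objdump"
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] is EI_DATA: 1 means little-endian, 2 means big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, after e_ident and e_type.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    prefix = EM_TO_PREFIX.get(machine)
    if prefix is None:
        raise ValueError("unknown architecture, specify it with -A")
    return prefix + "-objdump"


# A made-up little-endian ARM header: magic, class, data, version, padding,
# then e_type and e_machine.
arm_header = b"\x7fELF\x01\x01\x01" + bytes(9) + struct.pack("<HH", 2, 40)
print(objdump_for(arm_header))
print(objdump_for(arm_header, override="deepthought-elf"))
```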

Ignoring Selected Source Files

When the application makes extensive use of header files with inline functions, the backtrace information can get even more confused than usual. Consider a function tom() which invokes an inline function dick() in a header file <harry.h>, and dick() makes a memory allocation call. At run-time, because of the inlining, the return address will be inside function tom(). However the debug information for the return address will usually specify the header file, not the source file containing tom(). This can make it much more difficult to interpret the backtrace.

There is no perfect solution to this problem, but ecosmdd contains an attempt at a partial solution. When disassembling an executable it will, by default, ignore any debug info where the file name matches the glob pattern */include/*, provided more accurate information for the current function is already available. This should catch inline functions in eCos, gcc and libstdc++ headers, and hence the backtrace output should more closely match what is actually happening in the application.

The default behaviour can be suppressed using the -n option, for example:

$ ecosmdd dump -n consume mddout.0
…

Alternatively a different glob pattern can be specified with the -I option (taking care to stop the shell from expanding the glob pattern prematurely):

$ ecosmdd dump -I\*.h consume mddout.0
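To get a feel for which file names each pattern would catch, the glob matching can be approximated with Python's `fnmatch` module. This is a sketch under the assumption that ecosmdd's glob matching behaves like `fnmatch`, where `*` also matches `/` characters. The header path below is hypothetical, while /tmp/mdd/consume.cxx is taken from the earlier example output.

```python
from fnmatch import fnmatchcase

DEFAULT_IGNORE = "*/include/*"  # the default pattern described above


def is_ignored(path, pattern=DEFAULT_IGNORE):
    """Return True if debug info for this file name would be skipped."""
    return fnmatchcase(path, pattern)


paths = [
    "/opt/ecos/install/include/cyg/kernel/kapi.h",  # hypothetical header path
    "/tmp/mdd/harry.h",                             # hypothetical local header
    "/tmp/mdd/consume.cxx",
]
for path in paths:
    print(path, is_ignored(path), is_ignored(path, "*.h"))
```

With the default pattern only the first path is ignored. With the `*.h` pattern both headers are ignored, including the application's own harry.h.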

Sorting the Output

By default the dump and diff subcommands output their results sorted by increasing address. A different sort order can be specified using the -S option, for example:

$ ecosmdd dump -SNs consume mddout.0

The -S should be followed by one or more sort keys. In the above example the primary sort key is N, specifying sort by decreasing allocation size so the largest allocations come first. When two allocations are the same size the secondary sort key (if specified) comes into play. Here the secondary key is s, meaning by increasing sequence number, so two allocations of the same size will be shown in history order. Any number of sort keys can be specified but it does not make sense to repeat a sort key or its inverse. Sequence numbers are unique so it also does not make sense to specify another sort key after s or S. If two allocations remain unsorted after all the specified sort keys have been processed then the output order is undefined. The available sort keys are:

p	Sort by increasing address, so the lowest address comes first.
P	Sort by decreasing address, so the highest address comes first.
n	Sort by increasing allocation size, so the smallest allocations come first.
N	Sort by decreasing allocation size, so the largest allocations come first.
s	Sort by increasing sequence number, so the oldest allocations come first.
S	Sort by decreasing sequence number, so the newest allocations come first.
a	Sort by memory allocation function, so for example all `realloc()` allocations will be grouped together.
t	Sort by increasing thread id. ecosmdd stats can be used to get details of the various threads. This sort key is only available if `CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_THREAD` is enabled.
T	Sort by decreasing thread id. ecosmdd stats can be used to get details of the various threads. This sort key is only available if `CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_THREAD` is enabled.
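The way primary and secondary sort keys combine can be sketched in Python. This illustrates the ordering rules only and is not how ecosmdd is implemented. The record fields and the `sort_records` helper are assumptions made for this sketch. The sizes and sequence numbers are taken from the diff example earlier in this section, and the thread and function fields are left out for brevity.

```python
# Hypothetical mapping from sort key letters to (record field, descending?).
SORT_KEYS = {
    "p": ("ptr", False), "P": ("ptr", True),
    "n": ("size", False), "N": ("size", True),
    "s": ("seqno", False), "S": ("seqno", True),
    "t": ("thread", False), "T": ("thread", True),
    "a": ("func", False),
}


def sort_records(records, keys):
    """Sort by the given key letters; the first letter is the primary key."""
    result = list(records)
    # Python's sort is stable, so sorting by the least significant key first
    # and the primary key last gives the combined ordering.
    for key in reversed(keys):
        field, descending = SORT_KEYS[key]
        result.sort(key=lambda rec: rec[field], reverse=descending)
    return result


records = [
    {"ptr": 0x000BA8A8, "size": 228, "seqno": 11001},
    {"ptr": 0x000BA9A8, "size": 128, "seqno": 11528},
    {"ptr": 0x000BA9A8, "size": 128, "seqno": 8213},
    {"ptr": 0x000BAA68, "size": 128, "seqno": 9539},
]

# Equivalent of -SNs: largest first, equal sizes in history order.
ordered = sort_records(records, "Ns")
for rec in ordered:
    print("%d bytes, seqno %d" % (rec["size"], rec["seqno"]))
```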

Filtering out Unwanted Data

Non-trivial applications can result in very large amounts of memory debug data. ecosmdd provides a number of filters to eliminate unwanted data. For example, to show only allocations of 1K or larger:

$ ecosmdd dump -Fsize_min=1024 consume mddout.0
…

A filter takes the form -F<key>=<value>. The supported keys are:

thread=<id>	Only show allocations performed by the specified thread. This filter can only be used if `CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_THREAD` is enabled.
size_min=<size>	Ignore any allocations smaller than the specified size.
size_max=<size>	Ignore any allocations larger than the specified size.
seqno_min=<start>	Only show the event identified by the sequence number and subsequent ones.
seqno_max=<end>	Only show events up to and including the one identified by the sequence number.
time_min=<start>	Discard any records prior to the specified time.
time_max=<end>	Discard any records after the specified time.
ptr_min=<base>	Filter out allocations at addresses below the specified address.
ptr_max=<limit>	Filter out allocations at addresses above the specified address.

Multiple filters can be specified. For example to show only allocations performed by thread 6 which are larger than 4K and which occurred in a certain time interval:

$ ecosmdd dump -Fthread=6 -Fsize_min=4096 -Ftime_min=4000 -Ftime_max=5000 \
    consume mddout.0
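When several filters are given, a record is shown only if it passes all of them. The following Python sketch illustrates that behaviour. The record fields, the `parse_filter` and `apply_filters` helpers and the sample records are all assumptions made for this sketch. Treating every boundary as inclusive is also an assumption: the text above confirms it for size_min and the seqno filters, but not for the others.

```python
# Hypothetical predicate for each filter key; all bounds assumed inclusive.
FILTERS = {
    "thread": lambda rec, v: rec["thread"] == v,
    "size_min": lambda rec, v: rec["size"] >= v,
    "size_max": lambda rec, v: rec["size"] <= v,
    "seqno_min": lambda rec, v: rec["seqno"] >= v,
    "seqno_max": lambda rec, v: rec["seqno"] <= v,
    "time_min": lambda rec, v: rec["time"] >= v,
    "time_max": lambda rec, v: rec["time"] <= v,
    "ptr_min": lambda rec, v: rec["ptr"] >= v,
    "ptr_max": lambda rec, v: rec["ptr"] <= v,
}


def parse_filter(text):
    """Split 'key=value' into (key, int); base 0 accepts 4096 and 0x1000."""
    key, value = text.split("=", 1)
    if key not in FILTERS:
        raise ValueError("unknown filter key: " + key)
    return key, int(value, 0)


def apply_filters(records, filter_args):
    """Keep only the records that satisfy every filter."""
    parsed = [parse_filter(arg) for arg in filter_args]
    return [rec for rec in records
            if all(FILTERS[key](rec, value) for key, value in parsed)]


# Made-up records for illustration.
records = [
    {"thread": 6, "size": 8192, "seqno": 1, "time": 4500, "ptr": 0x1000},
    {"thread": 6, "size": 2048, "seqno": 2, "time": 4500, "ptr": 0x4000},
    {"thread": 5, "size": 8192, "seqno": 3, "time": 4500, "ptr": 0x5000},
    {"thread": 6, "size": 8192, "seqno": 4, "time": 6000, "ptr": 0x8000},
]

kept = apply_filters(
    records, ["thread=6", "size_min=4096", "time_min=4000", "time_max=5000"])
print([rec["seqno"] for rec in kept])
```

Only the first record survives: the second is too small, the third belongs to another thread, and the fourth falls outside the time interval.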

Formatting the Output

By default ecosmdd outputs all available information for each record. Sometimes it is better to see only some of the fields. At other times a different format may be preferred, for example to feed the ecosmdd output into some other tool. Hence it is possible to specify a custom format string, along similar lines to the C strftime and printf functions:

$ ecosmdd dump -f '%a for %n bytes -> %p'
malloc() for 256 bytes -> 0x00097d78
malloc() for 131072 bytes -> 0x00097e88
malloc() for 4 bytes -> 0x000b7e98
calloc() for 3724 bytes -> 0x000b7eb0
…

A % character introduces a conversion sequence. Other characters are just passed straight through. The supported conversion sequences are:

%%	A single % character.
%p	The address of the allocated chunk.
%n	The requested allocation size.
%m	The actual allocation size. This requires `CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_ACTUAL_SIZE`.
%o	The allocation overhead for this chunk. This requires `CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_ACTUAL_SIZE`.
%s	The sequence number.
%a	The allocating function, for example `malloc()`.
%T	A timestamp for the event. This requires `CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_TIMESTAMP`.
%t	The thread identifier. This requires `CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_THREAD`.
%h	The thread handle. This requires `CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_THREAD`.
%N	The thread name. This requires `CYGDBG_MEMALLOC_DEBUG_DEBUGDATA_THREAD`.
%b1 to %b8	The backtrace return address for the appropriate level. It is an error to specify a level greater than what is actually present in the `mddout` file.
%f1 to %f8	The backtrace function name for the appropriate level. This can only be used if the executable has been specified on the command line.
%w1 to %w8	The backtrace location for the appropriate level, in the form filename:linenumber. This can only be used if the executable has been specified on the command line, and even then the information is not always available.
%l1 to %l8	The backtrace source line for the appropriate level. This can only be used if the executable has been specified on the command line, and even then the information is not always available.

The usual format string for a dump operation, assuming default configuration settings, is: '%p : %a %n bytes, %m (+%o), seqno %s, time %T\n By thread %t, %h %N\n 1) backtrace %b1 function %f1\n %w1\n \"%l1\"'
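The expansion of a format string can be sketched in Python as follows. This is an illustration of the idea only and covers just a handful of the conversion sequences (%%, %p, %n, %s and %a). The record layout and the `format_record` helper are assumptions made for this sketch, and raising an error for an unsupported sequence is a choice made here, not documented ecosmdd behaviour.

```python
import re


def format_record(fmt, rec):
    """Expand a small subset of the conversion sequences for one record."""
    def expand(match):
        code = match.group(1)
        if code == "%":
            return "%"
        if code == "p":
            return "0x%08x" % rec["ptr"]
        if code == "n":
            return str(rec["size"])
        if code == "s":
            return str(rec["seqno"])
        if code == "a":
            return rec["func"]
        raise ValueError("unsupported conversion in this sketch: %" + code)

    # Everything that is not part of a conversion is passed straight through.
    return re.sub(r"%(.)", expand, fmt)


# Values taken from the first line of the example output above; the
# sequence number is made up.
record = {"ptr": 0x00097D78, "size": 256, "seqno": 1, "func": "malloc()"}
line = format_record("%a for %n bytes -> %p", record)
print(line)
```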

Name

Synopsis