Valgrind

	Information Source Code Documentation Contact How to Help Gallery \| Projects / Users \| Press / Media \| Awards \| Surveys \| Artwork / Clothing \|
COMPILE TIME Debugging Memory Problems by Steve Best May 2003 Courtesy of Linux Magazine Dynamic memory allocation seems straightforward enough: you allocate memory on demand -- using `malloc()` or one of its variants -- and free memory when it's no longer needed. Indeed, memory management would be that easy -- if only we programmers never made mistakes. Alas, we do make mistakes (from time to time) and memory management problems do occur. For example, a memory leak occurs when memory is allocated but never freed. Leaks can obviously be caused if you `malloc()` without a corresponding `free()`, but leaks can also be inadvertently caused if a pointer to dynamically-allocated memory is deleted, lost, or overwritten. Memory corruption can occur when allocated (and in use) memory is overwritten accidentally. Buffer overruns -- caused by writing past the end of a block of allocated memory -- frequently corrupt memory. Regardless of the root cause, memory management errors can have unexpected, even devastating effects on application and system behavior. With dwindling available memory, processes and entire systems can grind to a halt, while corrupted memory often leads to spurious crashes. System security is also susceptible to buffer overruns. Worse, it might take days before evidence of a real problem appears. Today, it's common for Linux systems to have a gigabyte of main memory. If a program leaks a small amount of memory, it'll take some time before the application and system show symptoms of a problem. Memory management errors can be quite insidious and very difficult to find and fix. This month, let's take a look at Valgrind, a tool that can help detect common memory management errors. We'll review the basics, write some "buggy" code, and then use Valgrind to find the mistakes. Valgrind was written by Julian Seward and is available under the GNU Public License. Dynamic Memory Functions Of all of the library calls in Linux, only four manage memory: `malloc()`, `calloc()`, `realloc()`, and `free()`. All of these functions have prototypes in the `stdlib.h` include file. `malloc()` allocates a memory block. It's prototype is `void* malloc(size_t size)` and the single argument is the number of bytes of memory to allocate. If the allocation is successful, `malloc` returns a pointer to the memory. If memory allocation fails for some reason (for example, if the system is out of memory), `malloc()` returns a `NULL` pointer. `calloc()` allocates an array in memory and initializes all of the memory to zero (with `malloc()`, the allocated memory is uninitialized). `void* calloc(size_t nmemb, size_t size)` is the prototype. The first argument is the number of elements in the array and the second argument is the size (in bytes) of each element. Like `malloc()`, `calloc()` returns a pointer if the memory allocation was successful, and `NULL` otherwise. `realloc()` is defined as `void* realloc (void ptr, size_t size)`. `realloc()` changes the size of the object referenced by the pointer to a new size specified by the second argument. `realloc()` returns a pointer to the moved block of memory. `free()` deallocates a memory block. It takes a pointer as an argument, as shown in its prototype, `void free (void ptr)`, and releases that memory. While the API for memory management is unusually small, the number and kind of memory errors that can occur is quite substantial, including reading and using uninitialized memory; reading/writing memory after it has been freed; reading/writing from memory past the allocated size; reading/writing inappropriate areas on the stack; and memory leaks. Luckily, Valgrind can detect all of those problems. When a program is run under Valgrind, all memory reads and writes are inspected and all calls to `malloc()/new()` and `free()/ delete()` are intercepted. Installing Valgrind Valgrind is closely tied to the architecture of the operating system. Currently, it's only supported on Linux x86 machines with kernels from the 2.2.x and 2.4.x series and glibc 2.1.x or 2.2.x. You can get the source for Valgrind at http://www.valgrind.org/. (At the time of this writing, the latest stable release of Valgrind is 1.0.4. The latest development release is 1.9.4.) Download the latest stable release (or the latest development release, depending on your sense of adventure) and build the software: % bunzip2 valgrind-1.0.4.tar.bz2 % tar xvf valgrind-1.0.4.tar % cd valgrind-1.0.4 % ./configure % make % make install One great feature of Valgrind is that it doesn't require you to build (or re-build) your application in any special way. Simply place valgrind right in front of the program you want to inspect. For example, the command % valgrind ls -all inspects and monitors the `ls` command. (Running this command on Red Hat Linux 8.0 showed no errors.) The output from Valgrind has the following format. ==20691== 8192 bytes in 1 blocks are definitely lost in loss record 1 of 1 ==20691== at 0x40048434: malloc (vg_clientfuncs.c:100) ==20691== by 0x806910C: fscklog_init (fsckwsp.c:2491) ==20691== by 0x806E7D0: initial_processing (xchkdsk.c:2101) ==20691== by 0x806C70D: main (xchkdsk.c:289) The `==I<xxxxx>==` string prefixes each line of Valgrind-specific output. (Application-specific output does not have the prefix.) In the sample output shown above, the `20691` is the process ID. The message indicates that there is a memory leak of 8192 bytes and provides diagnostics and a kind of trace to direct you to the error. The second and subsequent lines indicate that the memory is initially allocated on line 2491 in the routine `fscklog_init()` (in the file `fsckwsp.c`); `fscklog_init()` was called from `initial_ processing()` at line number 2101; and `main()` called `initial_processing()`. By the way, if `fscklog_init()` is called more than once in the `initial_processing()` routine, the line number clearly identifies which call caused the problem. Losing Your Memory Let's use Valgrind to find some common memory errors. The first sample program, sample1.c, shown in Listing One, has more than one memory leak. The code in Listing One allocates two 512-byte blocks of memory and then sets the pointer to the block to the second block. As a result, the address of the second block is lost, causing a memory leak. There are also 512 10-byte blocks of memory that are never freed. Listing One: sample1.c, a program with a memory leak #include <stdlib.h> #include <stdio.h> void get_mem() { char ptr; ptr = (char ) malloc (10); /* memory not freed / } int main(void) { int i; char ptr1, ptr2; ptr1 = (char ) malloc (512); ptr2 = (char ) malloc (512); ptr2 = ptr1; / causes the memory leak of ptr1 / free(ptr2); free(ptr1); for ( i = 0; i < 512; i++ ) { get_mem(); } } Compile and analyze sample1.c using the following commands: % gcc sample1.c -o sample1 % valgrind -v --leak-check=yes ./sample1 Valgrind produces the output shown in Figure One, correctly identifying the 512-byte and the 512 10-byte memory leaks. The `-v` provides verbose feedback and the `--leak-check=yes` option searches for memory leaks when the client program exits. Figure One: Valgrind output for Listing One* ==2009== 512 bytes in 1 blocks are definitely lost in loss record 1 of 2 ==2009== at 0x400483E4: malloc (vg_clientfuncs.c:100) ==2009== by 0x80484D4: main (in /jfs/article/sample1) ==2009== by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129) ==2009== by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample1) ==2009== ==2009== 5120 bytes in 512 blocks are definitely lost in loss record 2 of 2 ==2009== at 0x400483E4: malloc (vg_clientfuncs.c:100) ==2009== by 0x80484A0: get_mem (in /jfs/article/sample1) ==2009== by 0x8048519: main (in /jfs/article/sample1) ==2009== by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129) ==2009== ==2009== LEAK SUMMARY: ==2009== definitely lost: 5632 bytes in 513 blocks. sample2.c, shown in Listing Two, demonstrates another common memory error: reading beyond the end of an array of bytes. Again, build the sample code and run Valgrind to analyze it. Listing Two: sample2.c, a program that tries to access memory beyond the end of an array #include <stdlib.h> #include <stdio.h> int main(void) { char chptr; char chptr1; int i = 1; chptr = (char ) malloc(512); chptr1 = (char ) malloc (512); for ( i; i <= 513; i++ ) { chptr[i] = '?'; /* error when i = 513 invalid write / chptr1[i] = chptr[i]; / error when i = 513 invalid read and write / } free(chptr1); free(chptr); } % gcc sample2.c -o sample2 % valgrind ./sample2 As you can see from the output in Figure Two, references to element 513 in the two arrays cause a write error, a read error, and another write error. The message `Address 0x40CA0224 is 0 bytes after a block of size 512 alloc'd` indicates that there is no storage beyond the end of the array of 512 bytes. Figure Two: Valgrind output for Listing Two* ==3016== ... ==3016== Invalid write of size 1 ==3016== at 0x80484DA: main (in /jfs/article/sample2) ==3016== by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129) ==3016== by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2) ==3016== Address 0x40CA0224 is 0 bytes after a block of size 512 alloc'd ==3016== at 0x400483E4: malloc (vg_clientfuncs.c:100) ==3016== by 0x80484AA: main (in /jfs/article/sample2) ==3016== by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129) ==3016== by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2) ==3016== ==3016== Invalid read of size 1 ==3016== at 0x80484EB: main (in /jfs/article/sample2) ==3016== by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129) ==3016== by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2) ==3016== Address 0x40CA0224 is 0 bytes after a block of size 512 alloc'd ==3016== at 0x400483E4: malloc (vg_clientfuncs.c:100) ==3016== by 0x80484AA: main (in /jfs/article/sample2) ==3016== by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129) ==3016== by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2) ==3016== ==3016== Invalid write of size 1 ==3016== at 0x80484EB: main (in /jfs/article/sample2) ==3016== by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129) ==3016== by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2) ==3016== Address 0x40CA0454 is 0 bytes after a block of size 512 alloc'd ==3016== at 0x400483E4: malloc (vg_clientfuncs.c:100) ==3016== by 0x80484BF: main (in /jfs/article/sample2) ==3016== by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129) ==3016== by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2) Finally, to show how Valgrind finds invalid use of uninitialized memory, let's look at the results of analyzing the Journaled File System's (JFS) `fsck` utility. As before, we fun `fsck` under the auspices of Valgrind: % valgrind -v -leak-check=yes \ fsck.jfs /dev/hdb1 Figure Three shows a snippet of the output. Figure Three: Valgrind output for the Journaled File System utility fsck.jfs ... ==12903== Conditional jump or move depends on uninitialised value(s) ==12903== at 0x8079FCC: __divdi3 (in /sbin/fsck.jfs) ==12903== by 0x805CB0E: validate_super (fsckmeta.c:2331) ==12903== by 0x805C266: validate_repair_superblock (fsckmeta.c:1833) ==12903== by 0x806E2B5: initial_processing (xchkdsk.c:1968) The `validate_super()` routine can be found in the `jfsutils` package in `jfsutils-1.x.x/fsck/fsckmeta.c`. Listing Three shows a portion of the code: Listing Three: A code snippet from fsckmeta.c int validate_super( int which_super ) { int64_t bytes_on_device; /* get physical device size / vfs_rc = ujfs_get_dev_size(Dev_IOPort, &bytes_on_device); . . . dev_blks_on_device = bytes_on_device / Dev_blksize; / Line 2331 / if (sb_ptr->s_pbsize != Dev_blksize) { ... The output from Valgrind indicates that an uninitialized variable is used on line 2331 -- that's the line that says, `dev_blks_on_device = bytes_ on_device / Dev_blksize`. As you can see, bytes_on_device is not set before its used. Using Valgrind, this memory management problem was identified and fixed before an end-user ever came across it. Cache Profiling Valgrind can also perform cache simulations and annotate your source line-by-line with the number of cache misses. In particular, it records: L1 instruction cache reads and misses L1 data cache reads and read misses, and writes and write misses L2 unified cache reads and read misses, and writes and writes misses L1 is a small amount of SRAM memory that's used as a cache. L1 temporarily stores instructions and data, ensuring that the processor has a steady supply of data to process while memory catches up delivering new data. L1 is integrated or packaged within the same module as the processor. Level 2 caching is performed in L2. Valgrind's cachegrind tool is used to do cache profiling -- you use it just like valgrind. For example, the following command looks at the `fsck.jfs` program: % cachegrind fsck.jfs -n -v /dev/hdb1 The output of cachegrind is collected in the file cachegrind.out. Sample output from analyzing `fsck.jfs` is shown in Figure Four. Figure Four: cachegrind.out, cachegrind's analysis of fsck.jfs* ==11004== I refs: 99,813,615 ==11004== I1 misses: 4,301 ==11004== L2i misses: 3,210 ==11004== I1 miss rate: 0.0% ==11004== L2i miss rate: 0.0% ==11004== ==11004== D refs: 68,846,938 (65,916,678 rd + 2,930,260 wr) ==11004== D1 misses: 63,883 ( 37,768 rd + 26,115 wr) ==11004== L2d misses: 37,485 ( 14,330 rd + 23,155 wr) ==11004== D1 miss rate: 0.0% ( 0.0% + 0.8% ) ==11004== L2d miss rate: 0.0% ( 0.0% + 0.7% ) ==11004== ==11004== L2 refs: 68,184 ( 42,069 rd + 26,115 wr) ==11004== L2 misses: 40,695 ( 17,540 rd + 23,155 wr) ==11004== L2 miss rate: 0.0% ( 0.0% + 0.7% ) Events recorded abbreviations are: Ir : I cache reads (ie. instructions executed) I1mr: I1 cache read misses I2mr: L2 cache instruction read misses Dr : D cache reads (ie. memory reads) D1mr: D1 cache read misses D2mr: L2 cache data read misses Dw : D cache writes (ie. memory writes) D1mw: D1 cache write misses D2mw: L2 cache data write misses Next, you can annotate the output from cachegrind, by using `vg_annotate`: % vg_annotate `vg_annotate` produces output like that shown in Figure Five. The figure shows one annotation for the routine `dmap_pmap_ verify()`. The entry states that 88,405,584 instructions of 99,813,615 total instructions were spent in `dmap_pmap_ verify()`. This information is invaluable for deciding where to tune the program. You can also further annotate `dmap_pmap_ verify()` to find the actual instructions executed in that routine. Figure Five: annotation of one entry of cachegrind for fsck.jfs ------------------------------ Ir I1mr I2mr Dr D1mr D2mr Dw D1mw D2mw ------------------------------ 88,405,584 23 23 61,740,960 14,535 98 576,828 9 9 fsckbmap.c:dmap_pmap_verify END For a complete description of cachegrind, see the Valgrind User's Manual in the Valgrind distribution. Some Limitations Of Valgrind There are two issues that you should be aware of when analyzing an application with Valgrind. First, an application running under Valgrind consumes more memory. Second, your program will run slower. However, these two minor annoyances shouldn't stop you from using this powerful memory management debug tool. About the Author Steve Best works in the Linux Technology Center of IBM in Austin, Texas. He is currently working on the Journaled File System (JFS) for Linux project. Steve has done extensive work in operating system development with a focus in the areas of file systems, internationalization, and security. He can be reached at sbest@us.ibm.com. You can download the sample programs used in this article here.
Copyright © 2000-2023 Valgrind™ Developers Hosting kindly provided by sourceware.org Best Viewed With A(ny) Browser