Requirements

Introduction

What follows is a quick way to determine if a process is leaking heap memory. We use a tool called the "User Mode Dump Heap" or umdh.exe for short. This tool allows us to see the changes in a processes' heap over a period of time. Differences, in this case means memory allocation calls without matching de-allocation calls, not just the size.

Dumping a Heap

Environment

First, we must make sure the environment is correctly configured; otherwise the system symbols will be inaccessible. We only need to set _NT_SYMBOL_PATH to something appropriate:

  set _NT_SYMBOL_PATH=srv*C:\Symbols*http://msdl.microsoft.com/download/symbols

Also, the Debugging Tools for Windows should be in your path. That, for our purposes, will do the trick, but you can get fancier if you like.

Global Flags

Now, we must also enable the global flag to create a user mode stack trace database. This creates a registry value which is read when the process starts up that allows the system to keep track of the functions allocating memory inside the process.

  gflags.exe -i faucet.exe +ust

Where -i tells gflags.exe which image file we are interested in, and +ust enables the "Create user mode stack trace database" option. The output you should expect looks something like this:

  Current Registry Settings for faucet.exe executable are: 00001000
      ust - Create user mode stack trace database

You can also simply run gflags.exe and use the GUI to do your dirty work (you will also find many other options there under the "Image File" tab).

Dumping the Heap

We can liken the process of detecting a memory leak to that of taking a diff of two files: we take two snapshots of the heap and then ask for the difference--in terms of memory calls--between them. If there have been any serious leaks, we will see them.

Taking the snapshots themselves is easy, determining when to take them may be difficult, and highly dependent on how and when the leak presents itself.

Lets assume, for simplicity, that we have a constant and easily reproducible leak. This way we can present the procedures, which we can later generalize and use on more difficult problems.

First Snapshot

We first run the executable:

  > faucet.exe

And then we must acquire its Process ID (PID) through the Task Manager or via some other means. Using the PID we can tell umdh.exe to dump the state of the processes' heap at this junction:

  > umdh -p:1234 -f:faucet-1.txt

If we peek inside faucet-1.txt we will see multiple entries of the type:

  18 bytes + 18 at 1DBE810 by BackTrace1C62924
	7710A2A0
	4E7ED3
	4E72AA
	4A680E
	4A6BF6
	4A6D60
	4A6FC7
	4A722E
	4A732A
	4A7426
	4A74C7
	4A75B2
	49EB6F
	4A0486
	4BD884
	42DD88

These reflect the number of bytes allocated at a particular offset, and the stack frame that resulted in the allocation. It is possible at this point to resolve the symbols using the umdh.exe command, which will make the entry highly consumable by human eyes:

  > umdh -d faucet-2.txt -f:symbols-1.txt

Where symbols-1.txt will contain entries of the form (output has been slightly modified to ensure that it fits on one screen--full paths will always be given in a real dump):

  +      18 (    30 -    18)      2 allocs	BackTrace1C62924
  +       1 (     2 -     1)	BackTrace1C62924	allocations

	ntdll!RtlAllocateHeap+000001FE
	condor_schedd!malloc+00000079 (f:\...\crt\src\malloc.c, 163)
	condor_schedd!operator new+0000001F (f:\...\crt\src\new.cpp, 59)
	condor_schedd!ParseFactor+0000011D (c:\...\classad.old\parser.cpp, 181)
	condor_schedd!ParseMultOp+00000017 (c:\...\classad.old\parser.cpp, 361)
	condor_schedd!ParseAddOp+00000017 (c:\...\classad.old\parser.cpp, 422)
	condor_schedd!ParseEqualityOp+00000017 (c:\...\classad.old\parser.cpp, 509)
	condor_schedd!ParseSimpleExpr+00000017 (c:\...\classad.old\parser.cpp, 579)
	condor_schedd!ParseAndExpr+00000017 (c:\...\classad.old\parser.cpp, 616)
	condor_schedd!ParseExpr+00000017 (c:\...\classad.old\parser.cpp, 652)
	condor_schedd!ParseAssignExpr+00000024 (c:\...\classad.old\parser.cpp, 686)
	condor_schedd!Parse+00000026 (c:\...\classad.old\parser.cpp, 727)
	condor_schedd!AttrList::Insert+00000017 (c:\...\classad.old\attrlist.cpp, 839)
	condor_schedd!AttrList::Assign+00000062 (c:\...\classad.old\attrlist.cpp, 2767)
	condor_schedd!SelfMonitorData::ExportData+00000030 (c:\...\self_monitor.cpp, 113)
	condor_schedd!Scheduler::count_jobs+000006C1 (c:\...\schedd.cpp, 829)

From this output it is clearly helpful to have both the system symbols and the application symbols, as they help provide an incredible useful context from which to start debugging.

Second Snapshot

Now we wait for a sign of the leak, or exercise the process in a way that is known to cause the leak in question. Once we are sure that we have seen one or more instances of the leak, and then we take a second measurement:

  > umdh -p:1234 -f:faucet-2.txt

The leaky process can now be stopped and the analysis of the leak can begin.

Analysis

To understand a leak and where it is coming from can be a difficult exercise. Fortunately, having enabled the user mode stack trace, we will be given the context in which the leak is occurring--provided we supplied the executable's symbols.

To begin the analysis we must first determine what changes occurred between the periods defined by the two dumps we just made. To get this information, run the umdh.exe one last time, with a slight modification in its command-line:

  > umdh faucet-1.txt faucet-2.txt > faucet-diff.txt

Now the contents of faucet-diff.txt will contain all memory leaks plus a stack trace which reflects the location where memory was allocated, but never subsequently freed.

The output here looks similar to that of the output we saw in the first dump with its symbols resolved, but in this case it represent the difference between the two points of measurement (again, the output has been altered to fit the screen):

  +      18 (    30 -    18)      2 allocs	BackTrace1C62924
  +       1 (     2 -     1)	BackTrace1C62924	allocations

	ntdll!RtlAllocateHeap+000001FE
	condor_schedd!malloc+00000079 (f:\...\crt\src\malloc.c, 163)
	condor_schedd!operator new+0000001F (f:\...\crt\src\new.cpp, 59)
	condor_schedd!ParseFactor+0000011D (c:\...\classad.old\parser.cpp, 181)
	condor_schedd!ParseMultOp+00000017 (c:\...\classad.old\parser.cpp, 361)
	condor_schedd!ParseAddOp+00000017 (c:\...\classad.old\parser.cpp, 422)
	condor_schedd!ParseEqualityOp+00000017 (c:\...\classad.old\parser.cpp, 509)
	condor_schedd!ParseSimpleExpr+00000017 (c:\...\classad.old\parser.cpp, 579)
	condor_schedd!ParseAndExpr+00000017 (c:\...\classad.old\parser.cpp, 616)
	condor_schedd!ParseExpr+00000017 (c:\...\classad.old\parser.cpp, 652)
	condor_schedd!ParseAssignExpr+00000024 (c:\...\classad.old\parser.cpp, 686)
	condor_schedd!Parse+00000026 (c:\...\classad.old\parser.cpp, 727)
	condor_schedd!AttrList::Insert+00000017 (c:\...\classad.old\attrlist.cpp, 839)
	condor_schedd!AttrList::Assign+00000062 (c:\...\classad.old\attrlist.cpp, 2767)
	condor_schedd!SelfMonitorData::ExportData+00000030 (c:\...\self_monitor.cpp, 113)
	condor_schedd!Scheduler::count_jobs+000006C1 (c:\...\schedd.cpp, 829)

Here we have not found a leak, which is good, but this does illustrate how useful the umdh.exe command can be when one suspects a leak.

Conclusion

While short, this tutorial should enable any Windows programmer to identify, track down and eliminate any heap-based memory leaks. While the above instructions will not automajically fix or find leaks, they will give any programmer a good place to start.

TODO: Write a leaking application to show the actual benefit of this application.