Saturday, April 12, 2008

Garbage Collection

What is garbage collection?

Garbage collection is a heap-management strategy where a run-time component takes responsibility for managing the lifetime of the memory used by objects. This concept is not new to .NET - Java and many other languages/runtimes have used garbage collection for some time.

Is it true that objects don't always get destroyed immediately when the last reference goes away?

Yes. The garbage collector offers no guarantees about the time when an object will be destroyed and its memory reclaimed.

Why doesn't the .NET runtime offer deterministic destruction?

Because of the garbage collection algorithm. The .NET garbage collector works by periodically running through a list of all the objects that are currently being referenced by an application. All the objects that it doesn't find during this search are ready to be destroyed and the memory reclaimed. The implication of this algorithm is that the runtime doesn't get notified immediately when the final reference on an object goes away - it only finds out during the next sweep of the heap.Futhermore, this type of algorithm works best by performing the garbage collection sweep as rarely as possible. Normally heap exhaustion is the trigger for a collection sweep.

Is the lack of deterministic destruction in .NET a problem?

It's certainly an issue that affects component design. If you have objects that maintain expensive or scarce resources (e.g. database locks), you need to provide some way for the client to tell the object to release the resource when it is done. Microsoft recommend that you provide a method called Dispose() for this purpose. However, this causes problems for distributed objects - in a distributed system who calls the Dispose() method? Some form of reference-counting or ownership-management mechanism is needed to handle distributed objects - unfortunately the runtime offers no help with this.

Does non-deterministic destruction affect the usage of COM objects from managed code?

Yes. When using a COM object from managed code, you are effectively relying on the garbage collector to call the final release on your object. If your COM object holds onto an expensive resource which is only cleaned-up after the final release, you may need to provide a new interface on your object which supports an explicit Dispose() method.

Do I have any control over the garbage collection algorithm?

A little. For example, the System.GC class exposes a Collect method - this forces the garbage collector to collect all unreferenced objects immediately.

How can I find out what the garbage collector is doing?

Lots of interesting statistics are exported from the .NET runtime via the '.NET CLR xxx' performance counters. Use Performance Monitor to view them.

I've heard that Finalize methods should be avoided. Should I implement Finalize on my class?

An object with a Finalize method is more work for the garbage collector than an object without one. Also there are no guarantees about the order in which objects are Finalized, so there are issues surrounding access to other objects from the Finalize method. Finally, there is no guarantee that a Finalize method will get called on an object, so it should never be relied upon to do clean-up of an object's resources.Microsoft recommend the following pattern:

public class CTest : IDisposable

{

public void Dispose()

{... // Cleanup activities

GC.SuppressFinalize(this);

}

~CTest() // C# syntax hiding the Finalize() method

{

Dispose();

}

}

In the normal case the client calls Dispose(), the object's resources are freed, and the garbage collector is relieved of its Finalizing duties by the call to SuppressFinalize() In the worst case, ie, the client forgets to call Dispose(), there is a reasonable chance that the object's resources will eventually get freed by the garbage collector calling Finalize() Given the limitations of the garbage collection algorithm this seems like a pretty reasonable approach

What is the lapsed listener problem?

The lapsed listener problem is one of the primary causes of leaks in .NET applications. It occurs when a subscriber (or ‘listener’) signs up for a publisher’s event, but fails to unsubscribe. The failure to unsubscribe means that the publisher maintains a reference to the subscriber as long as the publisher is alive. For some publishers, this may be the duration of the application.
This situation causes two problems. The obvious problem is the leakage of the subscriber object. The other problem is the performance degredation due to the publisher sending redundant notifications to ‘zombie’ subscribers.
There are at least a couple of solutions to the problem. The simplest is to make sure the subscriber is unsubscribed from the publisher, typically by adding an Unsubscribe() method to the subscriber.

When do I need to use GC.KeepAlive?

It's very unintuitive, but the runtime can decide that an object is garbage much sooner than you expect. More specifically, an object can become garbage while a method is executing on the object, which is contrary to most developers' expectations.

Example:

using System;

using System.Runtime.InteropServices;

class Win32

{

[DllImport("kernel32.dll")]

public static extern IntPtr CreateEvent( IntPtr lpEventAttributes,bool bManualReset,bool bInitialState, string lpName);

[DllImport("kernel32.dll", SetLastError=true)]

public static extern bool CloseHandle(IntPtr hObject);

[DllImport("kernel32.dll")]

public static extern bool SetEvent(IntPtr hEvent);

}

class EventUser

{

public EventUser()

{

hEvent = Win32.CreateEvent( IntPtr.Zero, false, false, null );

}

~EventUser()

{

Win32.CloseHandle( hEvent );

Console.WriteLine("EventUser finalized");

}

public void UseEvent()

{

UseEventInStatic( this.hEvent );

}

static void UseEventInStatic( IntPtr hEvent )

{ //GC.Collect();

bool bSuccess = Win32.SetEvent( hEvent );

Console.WriteLine( "SetEvent " + (bSuccess ? "succeeded" : "FAILED!") );

}

IntPtr hEvent;

}

class App

{

static void Main(string[] args)

{

EventUser eventUser = new EventUser();

eventUser.UseEvent();

}

}
If you run this code, it'll probably work fine, and you'll get the following output:

SetEvent succeeded

EventDemo finalized
However, if you uncomment the GC.Collect() call in the UseEventInStatic() method, you'll get this output:

EventDemo finalized

SetEvent FAILED!
(Note that you need to use a release build to reproduce this problem.)
So what's happening here? Well, at the point where UseEvent() calls UseEventInStatic(), a copy is taken of the hEvent field, and there are no further references to the EventUser object anywhere in the code. So as far as the runtime is concerned, the EventUser object is garbage and can be collected. Normally of course the collection won't happen immediately, so you'll get away with it, but sooner or later a collection will occur at the wrong time, and your app will fail.

A solution to this problem is to add a call to GC.KeepAlive(this) to the end of the UseEvent method

No comments: