Friday, March 5, 2010

How Garbage Collector Works?

How, then, does the garbage collector achieve its magic? The basic idea is pretty simple: it examines how objects are laid out in memory and identifies all those objects that can be ‘reached’ by the running program by following some series of references.

When a garbage collection starts, it looks at a set of references called the ‘GC roots’. These are memory locations that are designated to be always reachable for some reason, and which contain references to objects created by the program. It marks these objects as ‘live’ and then looks at any objects that they reference; it marks these as being ‘live’ too. It continues in this manner, iterating through all of the objects it knows are ‘live’. It marks anything that they reference as also being used until it can find no further objects.

An object is identified, by the Garbage Collector, as referencing another object if it, or one of its superclasses, has a field that contains the other object.

Once all of these live objects are known, any remaining objects can be discarded and the space re-used for new objects. .NET compacts memory so that there are no gaps (effectively squashing the discarded objects out of existence) - this means that free memory is always located at the end of a heap and makes allocating new objects very fast.

GC roots are not objects in themselves but are instead references to objects. Any object referenced by a GC root will automatically survive the next garbage collection. There are four main kinds of root in .NET:

A local variable in a method that is currently running is considered to be a GC root. The objects referenced by these variables can always be accessed immediately by the method they are declared in, and so they must be kept around. The lifetime of these roots can depend on the way the program was built. In debug builds, a local variable lasts for as long as the method is on the stack. In release builds, the JIT is able to look at the program structure to work out the last point within the execution that a variable can be used by the method and will discard it when it is no longer required. This strategy isn’t always used and can be turned off, for example, by running the program in a debugger.

Static variables are also always considered GC roots. The objects they reference can be accessed at any time by the class that declared them (or the rest of the program if they are public), so .NET will always keep them around. Variables declared as ‘thread static’ will only last for as long as that thread is running.

If a managed object is passed to an unmanaged COM+ library through interop, then it will also become a GC root with a reference count. This is because COM+ doesn’t do garbage collection: It uses, instead, a reference counting system; once the COM+ library finishes with the object by setting the reference count to 0 it ceases to be a GC root and can be collected again.

If an object has a finalizer, it is not immediately removed when the garbage collector decides it is no longer ‘live’. Instead, it becomes a special kind of root until .NET has called the finalizer method. This means that these objects usually require more than one garbage collection to be removed from memory, as they will survive the first time they are found to be unused.

Ek zara si baat per voh yun rooth ker chal diye.
Jaise sadyo se unhe kisi bahane ki talash thi