Get Notified that an Object has Been Garbage Collected via WeakReferences

I learned something new while poring over the AspectJ source code, but it wasn’t what I was expecting to learn.

Background: AspectJ is a framework for “weaving” Java class files with additional functionality written in what are called aspect files. Classes can be woven as part of the compilation process (source weaving), after compilation (binary weaving), or even as late as when the class is being loaded by the JVM (load-time weaving). You can do some really cool stuff with AspectJ, including: adding fields or methods to existing classes (your own, or in a third-party library), intercepting method calls, replacing constructor calls with subclass constructor calls, and more. A full discussion of what’s possible is well beyond the scope of this post, so if you’re curious, check out the AspectJ Programming Guide and AspectJ 5 Developer’s Notebook to get the complete picture.

But enough about that. Back to my story.

I initially opened the AspectJ source code because I was curious about the Agent passed on the command line that handles the load-time weaving (LTW) of class files: the code behind the -javaagent:path/to/aspectjweaver.jar that you add to your app startup script. After a little bit of digging I found the relevant files at org.aspectj.weaver.loadtime.{Agent, ClassPreProcessorAgentAdapter}, but there wasn’t a lot there. However, ClassPreProcessorAgentAdapter creates an instance of a ClassPreProcessor, Aj (Aj.java), which is where things start to get interesting.

The purpose of Aj, as I understand it, is to maintain a mapping of ClassLoader => AspectJ weaver for each ClassLoader in use, and to delegate the weaving of any classes loaded by that ClassLoader to its associated weaver. A ClassLoader, though, is just like any other object in that under the right circumstances it becomes eligible for garbage collection (GC). If we just naively throw ClassLoaders into a HashMap as keys, the HashMap’s strong reference to each ClassLoader will keep it from being GCed and result in a memory leak over time. Wouldn’t it be convenient if we had a way to know when an object had been GCed and should be removed from the map?

It turns out Java does provide a mechanism to be notified when an object has been collected: a WeakReference registered with a ReferenceQueue.

The internet is full of literature on WeakReferences and their cousins SoftReference and PhantomReference. There’s a particularly good discussion of WeakReference vs. SoftReference on Stack Overflow. PhantomReferences are stranger, and not particularly relevant to this post.

WeakReferences have the property that their pointer to an object won’t prevent it from being garbage collected. That is, if at any time the only references to an object are through WeakReferences, that object is eligible to be reclaimed by the collector. By itself, this doesn’t help us, but Weak-, Soft-, and PhantomReferences also have a constructor that takes in a ReferenceQueue. When the collector clears a Reference because its referent is no longer reachable, the Reference itself is placed on the queue, either at that moment or some time later. This gives us all the tools we need to implement a mapping where the keys can be collected, and when they’re collected their associated entries can be removed from the map.
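To make the queue mechanics concrete, here is a minimal sketch using only java.lang.ref. The class name ReferenceQueueDemo and the helper drain() are made up for illustration. Note that System.gc() is only a request, so the demo waits on the queue with a timeout and I make no promise about what it prints on any particular JVM.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class ReferenceQueueDemo {

    /** Polls the queue until it is empty and returns how many references were on it. */
    static int drain(ReferenceQueue<?> queue) {
        int count = 0;
        while (queue.poll() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        // Registering the queue in the constructor is what enables notification.
        WeakReference<Object> ref = new WeakReference<>(referent, queue);

        // The referent is still strongly reachable, so nothing should be queued yet.
        System.out.println("queued before GC: " + drain(queue));

        referent = null; // drop the only strong reference
        System.gc();     // a hint to the JVM, not a guarantee

        // Block for up to one second waiting for the collector to enqueue the reference.
        Reference<?> notified = queue.remove(1000);
        System.out.println("got our reference back: " + (notified == ref));
        System.out.println("referent via ref.get(): " + ref.get());
    }
}
```

The thing that comes off the queue is the Reference object itself, not the collected referent, which is why any bookkeeping you need at cleanup time has to live on the Reference (or be keyed by it).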

The implementation used by Aj (roughly):

  1. Keep an instance of a reference queue around — adaptorQueue
  2. Subclass WeakReference to wrap a ClassLoader and provide implementations of hashCode() and equals() that make the object suitable for use as a key in a Map, and also work properly even if the referenced ClassLoader has been garbage collected.
  3. Any time a class is being loaded, loop through the items currently in the adaptorQueue (synchronizing for thread safety), and remove any entries from the map where the ClassLoader has already been collected: checkQ()
  4. In other classes, take care to ensure that any references to the ClassLoader from the map value are also WeakReferences or similar so that we don’t end up in a situation where code elsewhere is causing the memory leak.
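The first three steps above can be sketched roughly as follows. This is not AspectJ’s code: WeakLoaderRegistry, LoaderKey, and purge() are names I invented for illustration (purge() plays the role of checkQ(), and the queue field the role of adaptorQueue). The sketch assumes non-null ClassLoaders, and put() hands back the weak key purely so the behavior can be inspected.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

class WeakLoaderRegistry<V> {

    /** Weak key whose hashCode/equals keep working after the loader is collected. */
    static final class LoaderKey extends WeakReference<ClassLoader> {
        private final int hash;

        LoaderKey(ClassLoader loader, ReferenceQueue<ClassLoader> queue) {
            super(loader, queue); // a null queue means "do not register"
            this.hash = System.identityHashCode(loader); // cached while the loader is alive
        }

        @Override public int hashCode() { return hash; }

        @Override public boolean equals(Object other) {
            if (this == other) return true; // a cleared key still matches itself
            if (!(other instanceof LoaderKey)) return false;
            ClassLoader mine = get();
            return mine != null && mine == ((LoaderKey) other).get();
        }
    }

    private final ReferenceQueue<ClassLoader> queue = new ReferenceQueue<>();
    private final Map<LoaderKey, V> entries = new HashMap<>();

    /** Associates a value with the loader and returns the weak key that was stored. */
    public synchronized Reference<ClassLoader> put(ClassLoader loader, V value) {
        purge();
        LoaderKey key = new LoaderKey(loader, queue);
        entries.remove(key); // drop any older key for the same loader
        entries.put(key, value);
        return key;
    }

    public synchronized V get(ClassLoader loader) {
        purge();
        return entries.get(new LoaderKey(loader, null)); // unregistered lookup key
    }

    /** Removes entries whose keys have been enqueued; returns how many were removed. */
    public synchronized int purge() {
        int before = entries.size();
        Reference<? extends ClassLoader> dead;
        while ((dead = queue.poll()) != null) {
            entries.remove(dead); // matches by identity via LoaderKey.equals
        }
        return before - entries.size();
    }

    public synchronized int size() { return entries.size(); }

    public static void main(String[] args) {
        WeakLoaderRegistry<String> registry = new WeakLoaderRegistry<>();
        ClassLoader loader = new java.net.URLClassLoader(new java.net.URL[0]);
        registry.put(loader, "weaver");
        System.out.println(registry.get(loader));
    }
}
```

Caching the identity hash in the constructor is the important design choice here: once the loader is gone, get() returns null, so the key could no longer compute a hash from its referent, and the map would be unable to find the stale entry to remove it. Step 4 is a separate discipline: the sketch uses plain String values, and a real value type would also have to avoid strongly referencing its loader.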

Those who have some experience with WeakReferences might be wondering why AspectJ doesn’t just use WeakHashMap, which provides essentially all of the functionality above for free and has been around since JDK 1.2. I was certainly curious, so I did some digging and found that Aj.java was originally implemented with a WeakHashMap. It was replaced with the current implementation in early 2008 as part of a larger set of fixes for memory leaks (relevant: bug #210470; commit #255c5aa). The same commit changed the backing map from a WeakHashMap to a synchronized HashMap, which makes me wonder if it also addressed a thread-safety issue or synchronization bottleneck. The current implementation also gives clients of AspectJ code the ability to force a flush of GCed entries from the map, which WeakHashMap does not provide. Sadly, I spent all my time writing this blog post instead of posting the question to the AspectJ mailing list, so we may never find out the answer for sure.
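For comparison, this is roughly what the off-the-shelf alternative looks like. It is a sketch under my own assumptions, not a reconstruction of the old AspectJ code: WeakHashMapDemo and newWeakSynchronizedMap() are hypothetical names, and a plain Object stands in for a ClassLoader. WeakHashMap is not synchronized on its own, so the sketch wraps it with Collections.synchronizedMap.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

class WeakHashMapDemo {

    /** Builds a thread-safe map that holds its keys only weakly. */
    static <K, V> Map<K, V> newWeakSynchronizedMap() {
        return Collections.synchronizedMap(new WeakHashMap<K, V>());
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Object, String> weavers = newWeakSynchronizedMap();
        Object loader = new Object(); // stand-in for a ClassLoader
        weavers.put(loader, "weaver");
        System.out.println("size while key is reachable: " + weavers.size());

        loader = null; // drop the only strong reference to the key

        // System.gc() is only a hint, so retry a few times before giving up.
        for (int i = 0; i < 10 && !weavers.isEmpty(); i++) {
            System.gc();
            Thread.sleep(50);
        }
        System.out.println("size after GC attempts: " + weavers.size());
    }
}
```

Two differences from the hand-rolled approach are worth keeping in mind: WeakHashMap compares keys with equals() and hashCode() rather than by identity, and stale entries are cleared internally as a side effect of ordinary map operations rather than through a public method.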

Takeaway: not only can WeakReferences be used to keep a reference to an object without preventing it from being garbage collected, but you can also receive a notification, in the form of an entry in a ReferenceQueue, when the referenced object has been collected. This notification allows you to perform any cleanup you might need to do.
