Groovy Retrospective: An Addendum - Memory usage & PermGen

I can't really have a Groovy retrospective without mentioning memory.

Over the last four years I have spent more time than any sane person should have to investigating memory leaks in production Groovy code. The dynamic nature of Groovy, and it's dynamic meta-programming presents different considerations for memory management compared to Java, simply because perm gen is no longer a fixed size. Java has a fixed number of classes that would normally be loaded in to memory (hot-reloading in long living containers aside), where as Groovy can easily change or create new classes on the fly. As a result, permgen GC is not as sophisticated (pre moving permgen to the normal heap anyway) and largely in Java if you experienced an Out-Of-Memory permgen exception then you would just increase your permgen size to support the required number of classes being loaded by the application.

To be fair, the majority of the problems encountered were due to trying to hot-reload our code coupled the setup of the application container (in this case tomcat) and having Groovy on the server classpath rather than bundled with the application (much like you wouldn't bundle Java itself with an application, however, bundling Groovy with your application is recommended).

A couple of points of interest, if you are considering Groovy (especially relevant in a long running process where you attempt to reload your code):


The MetaClassRegistry

When you meta-program a Groovy class, e.g. dynamically add a method to a class,  its MetaClass is added to the MetaClassRegistry, which is a static object on the GroovySystem class. This means that any dynamically programmed class creates a tie back to the core Groovy classes.

The main consideration to keep in mind when meta programming in a Groovy environment is that if you want to reload your classes you now have a link between your custom code and the core Groovy code so you must either 1) explicitly clear out the MetaClassRegistry; 2) reload the core Groovy classes as well (throw everything out on reload)

I think coming from a Java environment, where you would likely use the JAVA_HOME on the server for long running applications, it can often seem logical to have a similar server classpath entry for Groovy also - but actually, the easiest approach is to bundle the groovy classes with your application so is a normal candidate for reloading.

If you decide not to reload Groovy, you can add explicit code to clear out the registry - this is pretty simple code, but a note of warning, without just throwing everything away there are still plenty of risks that you can leave links that stop your classes being collected (which was the case for me for a long time until just making all the third party classes (groovy included) candidates to be thrown away and reloaded.  Even with throwing away all third party libraries you can still be caught out if you use shutdown hooks (e.g. jvm shutdown hook to clean up connections will tie your classes back to your underlying JRE, meaning that no classes can be collected until you restart your application!)

Anyway, above is code to clear the registry, it assumes you have access to the GroovyClassLoader, but you can also follow the same approach by just grabbing the MetaClassRegistry from any arbitrary Groovy class and iterate through that. Give it a try, if you play around a little you will probably find its quite easy to create a leak if you want to!


Anonymous Classes

Another thing to keep in mind in Groovy applications is the generation of classes by your application. As you would expect, as you load in Groovy classes to your application they will be compiled to class files (or if you are pre-compiling into a JAR or something) which will be added to PermGen. However, in addition to this, your code being executed may also result in additional (possibly anonymous) classes also being created and added to PermGen - so without care, these can start to fill up that space and cause OOM exceptions (although generated classes will often be very little, so might take a while before it actually errors).

An example of what might do this is loading Groovy config files - if you are loading sensibly and just doing it once then it won't be an issue, but if you find yourself re-loading the config every request/execution then it can keep adding those to PermGen.  Another example of where this happens (surprisingly) is if you are using Groovy templating. Consider the following code:

(Taken from the SimpleTemplateEngine JavaDocs)

The example is a simple example of binding a Groovy template with a map of values - maybe something that you would do to send an email or create a customized document for someone - but behind the scenes Groovy will create a class that is added to permgen for each execution. Now this isn't a  lot, however, if you are dealing with high throughput it can certainly add up pretty quickly.


Lazy Garbage Collection

Another interesting behaviour that I observed over the last few years is that, in the JVM implementation I was using, the PermGen garbage collector collected lazily.  As I mentioned, because nothing interesting traditionally happened in the permanent generation in the JVM, the garbage collectors didn't do anything interesting. Further more, because it was always assumed that the contents of perm gen were fairly static (as the name suggests) the collection happens fairly in-frequently, and often only kicks in for a full collection (which is more costly).  What this means is that even if everytime you reload you free up lots of classes for collection (say, your entire application), it might not GC permgen for several reloads as a full collection isn't required, and the JVM will just lazily perform the collection when permgen is almost full.



If you look at the diagram above, it displays a common pattern I observed - each little step in the up swing of the chart represents a full application reload, but you will see that there are multiple reloads before the permgen usage approaches the limit, and it is only when the usage is close to the limit that it actually performs the collection.

The challenges this can present is that if you push the permgen close to the limit but still not triggering a full collection, then the following reload it can once again spike the permgen (because the entire application and its associated third party code is being reloaded into memory), this can push it over the limit and cause OOM exceptions.  This was not something I ever saw on production environments, but was fairly common in desktop/development environments where less resources were available.

(another interesting observation in this particular pattern is that the GC seems to be intermittently collecting fewer classes. I never got to the bottom of that question mark: there was no difference in activity between application reload each time, so there is no change in application behaviour to trigger a leak and the middle period of reloading maintains a constant level of memory usage patterns - which also shows no leak behaviour)



Hopefully someone smarter than me about all this stuff is reading this and can shed some better insights into it, otherwise, hopefully its helpful if you are about to start doing crazy things with perm gen..

0 comments: