2012-12-24

Another year is ending. Did you make a donation?


Another year. Time goes by faster and faster.

I usually wrap up the year by making some donations.

This time I donated to:

  • Wikimedia Foundation. The donation page is here (English) or here (Spanish).
  • The SETI Institute. The donation page is here.

Also, all this year I've been making monthly donations to the World Food Program (English or Spanish) and Nuru International (here).

So, end this year by doing something for others. That is money well spent and it actually makes you feel better about yourself.

This is enough to be my fourteenth post.

How much memory does a JVM need?


Update: fixed some typos pointed out by Ezequiel and Lecko.

At this point in IT affairs, we should update the old US saying to something like "Nothing is certain but death, taxes and a JVM Full GC".

Because, no matter what, one thing you can count on is that a JVM will eventually perform a Full Stop the World Garbage Collection. This is true with Concurrent Mark and Sweep (-XX:+UseParNewGC -XX:+UseConcMarkSweepGC), and with the relatively new G1 garbage collector on Java 6 (-XX:+UnlockExperimentalVMOptions -XX:+UseG1GC).

Hence, the right thing to do is to give the JVM as much memory as you can afford. Right? Well, not necessarily.

Some ground leveling


What follows is in line with the old CMS (Concurrent Mark Sweep) GC because of the memory structure I will discuss. The G1, as far as I know, uses a somewhat different structure that is more resilient, but in the end, it will still perform a Full Stop the World Garbage Collection, just less frequently.

The standard memory arrangement for a JVM is to divide the memory into three areas. You have the New memory, the Tenured memory (OldGen) and the Perm (permanent) memory. The Permanent memory is used to hold bytecode, class definitions and things like that.

The New Memory is itself divided in two: the Eden and the Survivor areas. Eden is where short lived objects are created (that is: when inside a method you create a string, an object is created in Eden and then it is destroyed when the method scope is destroyed). This Eden arrangement is what makes for a lightweight GC. How does it work? Well, if you come to think about it, when you have a server thread, you wait for requests and when you get one, you usually call a method to solve it. This method creates everything in Eden and when it terminates, almost every object residing in Eden has been destroyed (because of method scoping). So you can mostly reclaim all of Eden's memory without having to think about handling memory fragments. Empirically, most objects in a JVM have a very short lifespan (or high mortality rate, if you want) and Eden is arranged for that.
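
As a minimal sketch of that lifecycle (the class and method names below are made up just for illustration):

    // Hypothetical example: everything allocated inside handleRequest() starts in Eden.
    // When the method returns, the StringBuilder and its internal buffer become
    // unreachable garbage; only the String returned by toString() stays reachable
    // and is a candidate to be copied into a Survivor space on the next Minor GC.
    public class RequestHandler {
        public String handleRequest(String input) {
            StringBuilder sb = new StringBuilder();   // allocated in Eden
            for (int i = 0; i < 10; i++) {
                sb.append(input).append(i);
            }
            return sb.toString();                     // the only object that outlives the call
        }
    }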

Yes, I said almost every object in Eden has been destroyed. That is what the Survivor area is for. Objects that survive method scoping (e.g., the method's result value) will be sent (copied) to Survivor. When Eden is full, a Minor GC occurs. This Minor GC will do a few things:

  • reclaim all of Eden's dead objects,
  • move all of Eden's live objects to the Survivor area (by copying),
  • age the objects already in Survivor (each Minor GC survived adds a generation, akin to being X years old), and
  • send the oldest generation of Survivor objects to the Old Generation (in Tenured memory) by copying

Note: there are actually two Survivor areas, which are used alternately to make things more efficient.
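
If you want to see or tweak this aging behaviour, HotSpot exposes it through flags such as these (the values are just examples, not recommendations):
-XX:SurvivorRatio=8 (size of Eden relative to each Survivor space)
-XX:MaxTenuringThreshold=15 (how many Minor GCs an object must survive before being promoted to Old Gen)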

If you carefully think about what I've just said, you will realize that an object can be dead to the program (meaning: no longer accessible) but still be using memory and pretty much alive for all intended purposes. Objects are not really destroyed until they are Garbage Collected. This is what the Mark and Sweep does: decide what objects are reachable (alive) and what objects are not (to destroy them). And yes, this is why you should really, really think twice before using finalizers, because the finalizer is called when garbage is being collected, not when the object gets out of scope. This keeps the resources in use and slows down the GC process because it cannot just kill the object; it has to give the finalizer a chance to run, so the object survives the current GC cycle and memory is still occupied.
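
As a minimal sketch of why a finalizer keeps things around (the class is hypothetical, just to show the mechanism):

    // Even after the last reference to a NativeHandle is dropped, the collector
    // cannot reclaim it in the same cycle: it must first run finalize(), so the
    // object (and whatever it references) survives until a later GC cycle.
    public class NativeHandle {
        @Override
        protected void finalize() throws Throwable {
            try {
                System.out.println("releasing handle");   // runs at GC time, not at end of scope
            } finally {
                super.finalize();
            }
        }
    }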

See the following chart to fully grasp what happens on a Minor GC:


The question is: what happens when Old Gen is full? Does the program halt? Enter the Full Stop the World Garbage Collection.

When this happens, the GC stops all processing in the JVM and starts collecting memory from zombie objects (alive but unreachable). This encompasses objects in the Old Gen, as well as objects in Eden, Survivor and things in Perm memory. The Perm is included because some loaded or dynamically created classes might no longer be needed and that memory could be reclaimed.

As for what goes on with the rest of the objects, well, they are marked as dead or alive and the alive objects are copied to new positions in memory in order to create a large block of unused memory. This copying is expensive on its own, but what makes things even worse is that every pointer (reference) to a moved object must also be updated to point to the new object's address. And this is why the world is stopped: because we can't be changing these things while everything else keeps moving (see the section at the end about Azul's Zing JVM).

Note: this pointer fixing can be really taxing on the processor's L2/L3 cache lines, depending on how spread out in memory the pointers to be fixed are and how many times the object is pointed to, so a very linked structure (e.g., a highly connected graph) will probably slow down the process.

From all this, it follows that the amount of time it takes to do a Full GC is largely dependent on the number of objects still alive that must be copied, which largely depends on the application logic and the amount of memory to scan (AKA: the Old Gen memory size).

Some Caveats


It is common practice to set the memory sizes by specifying the same value for two parameters. The JVM accepts a Max size and a Start size. If you set Start to less than Max, then the JVM allocates the Start size and grows when needed. This growing (and shrinking) is expensive, so most people set Max and Start to the same value. For example:

  • the Perm size can be set to 512Mb with -XX:PermSize=512m -XX:MaxPermSize=512m
  • the total heap size can be set to 8Gb using -Xms8000m -Xmx8000m
  • the New total size can be set to 128Mb with -XX:NewSize=128m -XX:MaxNewSize=128m
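
Putting those together, a startup line could look roughly like this (the jar name is a placeholder and the sizes are just the examples above, not recommendations):
java -Xms8000m -Xmx8000m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:PermSize=512m -XX:MaxPermSize=512m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -jar myapp.jar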

How to get information on Garbage Collection


In order to know what is going on at the GC level, you can add the following parameters to your JVM on startup:
-XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -Xloggc:<file path>
Also, you can use the jstat command to see what's going on in real time on a running JVM (for remote JVMs it requires jstatd and probably some configuration settings). The command takes the form:
jstat -gcutil <pid> 1s 1000
where 1s is 1 second intervals and 1000 is the number of samples to take. Full jstat command line arguments for Java 5 can be found in this link.

This command is interesting because it shows memory usage in percentages and also the number of Minor and Full GCs performed on the system, with the total accumulated time used for each type of GC. You can then use this to figure out how long it takes to perform each type of Garbage Collection.
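
For example, with made up numbers: if the FGC column reports 12 Full GCs and the FGCT column reports 30 seconds of accumulated Full GC time, then each Full GC is taking on average 30 / 12 = 2.5 seconds.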

So, how much memory does a JVM need?


This actually depends, as it always does. There is, of course, a minimum below which your application will not even start. Obviously, you need more than that minimum.

Now, if you are a lucky bastard, then you probably can get away with having NO Full Stop the World GC. Yes, I said before that you can count on death, taxes and a Full GC. So, how is it possible to get away with no Full GC?

Well, it has to do with the definition of lucky bastard. You fall into that category if you can manage to satisfy these two conditions:

  1. have an application activity cycle that has a very low activity period. For example: you serve customers in just one country and you have near zero activity at 3 am.
  2. have an application that, measured between two adjacent low activity cycles and without a Full GC, consumes less memory than your server's real memory. For example: between today at 3 am and tomorrow at 3 am, your app requires 4 GB total memory and your server has 8 GB physical memory.
Update: I will clarify rule 2. Let's say you realize the JVM executes a Full GC every hour and that GC releases 1Gb of OldGen memory every run. Then, on a 24 hour cycle, that JVM will require 24Gb of memory, plus the minimum required memory, plus some extra memory for safety to avoid executing a Full GC. If your server has 48Gb of memory, then allocating, say, 32Gb to the JVM should do the trick.
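Stated as a rough formula (the names are just mine, for illustration): required JVM memory ≈ (OldGen growth per hour × hours between low activity points) + minimum required memory + safety margin. In the example above that is 1Gb × 24 + minimum + margin, which is why 32Gb out of the server's 48Gb does the trick.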

In other words, lucky means that you can allocate 6 GB of memory to your JVM (which avoids a Full GC during working hours) and you also restart your app at 3 am, before the Old Gen gets full.

Yes! You get away without a Full GC because you kill your JVM before it actually needs it. This approach is used by many financial institutions (especially high frequency traders) to avoid the latency of a Stop the World GC.

Now, if you are an average mortal, your JVM will actually stop to GC at some point or another. So what's the size to use for the Old Gen? It depends on your processing requirements and requires trial and error. What you must embrace is the fact that the Full GC cannot be avoided. What you can do, however, is control how long the Full GC stops the world. As said, that depends on the total number of live objects, which in turn depends on the total memory that the GC must scan and copy.

So the value that you must set can actually be tuned to your needs, but the unexpected thing is that the memory size might have to go down for you to hit your target SLA requirements (e.g., all requests must be answered in less than a second).

I know this is counterintuitive, and frightening. A few days ago, I was talking to a customer and said that the total memory configured for a JVM should probably go down from 4.5Gb to 3Gb to reduce the Full GC stop time from 2.5 seconds to less than 1s. A few eyebrows were raised and some fearful looks crossed the room.

Nobody is happy with the idea of reducing the memory allocated to a process, especially in these days of Moore's law, when cheap memory translates into a default of "more is better".

But the GC algorithm's running time is roughly linear in the memory it has to scan, and perhaps it's better to have more frequent, shorter stops than less frequent but longer ones. Your SLA might call for this.
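
As a rough, purely illustrative calculation: if a Full GC over a 6Gb Old Gen takes about 3 seconds, shrinking the Old Gen to 3Gb should bring the pause down toward roughly 1.5 seconds, at the price of Full GCs happening about twice as often (assuming the application promotes objects at the same rate).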

You see, sometimes less is better.

So, what is your target Memory value?


Well, it's trial and error, but now you know how to tune the value. You check how frequent the Full GC is and how long it takes, and figure out if you are a lucky bastard. If you are, then you are all set.

If you are not, then the duration of the Full GC should tell you if you have to add or remove Old Gen memory and then tune the size slowly until you get the expected results.

There is also some good news for those lucky bastard wannabes. If you are elastic on the number of CPUs/cores/memory available, then you can get away with no Full GCs, even if your app has significant activity all day long. All you have to do is monitor when your JVMs are approaching the Full GC mark and restart them. An Old Gen at 95% utilization can be a good point at which to recycle a JVM, but this also needs tuning.
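
A minimal sketch of that kind of check, done from inside the JVM with the standard management beans (the pool name matching and the 95% threshold are assumptions; the exact pool name depends on the collector in use):

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;
    import java.lang.management.MemoryUsage;

    public class OldGenWatcher {
        // Returns true when the Old Gen pool is above the given threshold,
        // i.e. when it is probably a good moment to recycle this JVM.
        public static boolean oldGenAbove(double threshold) {
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                String name = pool.getName();
                if (name.contains("Old Gen") || name.contains("Tenured")) {
                    MemoryUsage usage = pool.getUsage();
                    return usage.getMax() > 0
                            && (double) usage.getUsed() / usage.getMax() > threshold;
                }
            }
            return false;   // pool not found: do not trigger a recycle
        }

        public static void main(String[] args) {
            System.out.println("Time to recycle? " + oldGenAbove(0.95));
        }
    }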

Whether you are tuning the NewMem size or the OldGen size, the two most important things to remember are:

  • that the value you set has direct impact on the duration of the GC cycle, and
  • that the type of garbage created depends on your application, so the values for one app will not necessarily work for a different app.

What about Application Logic?


As I mentioned before, two factors affect the Full GC time. The first one is the memory size, which we just covered.

The second factor is Application Logic. More precisely, the reference structure and the pattern of change of Long Lived objects.

The more intricate the links between objects, the more expensive it becomes to move an object around in memory, due to the pointer fixing. So you probably don't want a big mesh of objects referring to each other. A simple reference structure is probably the best choice here, unless you absolutely need something more complex.
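
To make that concrete with a made up example: every node below can be referenced by many neighbours, so when the collector relocates one node it has to fix a pointer in each neighbour that refers to it, while a flat primitive array is a single object with no outgoing references and gives the collector almost nothing to fix.

    // Hypothetical example of a structure that is expensive to compact:
    // moving one GraphNode means updating the reference held by every
    // neighbour that points to it.
    class GraphNode {
        long id;
        java.util.List<GraphNode> neighbours = new java.util.ArrayList<GraphNode>();
    }

    // By contrast, a primitive array holds no references to other objects,
    // so relocating it requires fixing very few pointers.
    class FlatData {
        long[] ids = new long[1000000];
    }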

The other issue to control is the pattern of change. It is generally best if your long-term objects (e.g., cached options and configuration) stay mostly stable over long periods of time. The reason is that a destroyed long-term object leaves a hole in memory that must be reused, which in turn requires moving other long-term objects around.

Are we stuck with Full GC?


There is also a bit of other good news, if you can afford to spend some green currency bills.

There is a company called Azul Systems that created a special brand of JVM called Zing. One of the biggest features of Zing is that it has a GC that performs no Full Stop the World GCs.

I haven't used it and I have no relation with Azul Systems, so I can't really say much more about it, but it sure sounds cool.

Some extra info


If you want to know more about Garbage Collectors, you can watch this InfoQ presentation by Gil Tene (CTO and co-founder of Azul Systems). It's about Garbage Collectors in general. Again, I have no relation to Azul Systems in any way, but the information in this presentation is very good.

This is enough to be my thirteenth post.