Sunday, July 13, 2008

Java Code Optimization

Update: I now have a greater understanding of this topic, and I may make another post on it sometime to make up for this poorly-informed post. The general point is that knowing how adaptive JIT compilation works can be beneficial. Put code that you know will run many times in its own method and it will be optimized better (depending, of course, on your JIT compiler).

Just another quick word today, this one about writing efficient Java code. I'm sure this is somewhat well-known to some, but I've never actually seen it written anywhere (much less explained in a programming course). However, before I say anything, know that you should always test to make sure it actually yields you better results. Depending on your compiler and JVM, you may or may not see improvement. It also largely depends on your code.

Note that this doesn't just affect Java, it has an effect on many/all JIT (Just In Time) compilers.

Take this bit of code:



public void func1(){
int sum1 = 0; // cold
double sqrtsum1 = 0; // cold
for(int i=0; i<10000; i++){ // hot
sum += i; // hot
sqrtsum += Math.sqrt(i); // hot
for(int j=0; j<10000; j++){// very hot
int sumplusj = j+i; // very hot
} // very hot
} // hot
} // cold



Cold means the code is executed very infrequently, while hot code is executed often. Assuming this "func1()" is called very few times, the method as a whole is cold.

The thing to notice is the hot/very hot section in the cold method. Most modern JIT compilers compile/optimize on the method level: if a method is determined to be hot (after it is executed numerous times), it is compiled and optimized. Before that, it's just interpreted, which is much slower.

The issue is having hot code in cold methods. Since the method is cold, it isn't compiled to native and optimized very quickly (if at all). Note that, depending on your specific application and setup, the hot code may not be either - in which case this won't help you.

So what can one do to remedy the situation? Extract hot code into separate methods.




int sum1 = 0; // cold
double sqrtsum1 = 0; // cold

public void func1(){ // cold
for(int i=0; i<10000; i++){ // cold
func2(i); // hot
} // cold
} // cold

public void func2(int i){ // hot
sum += i; // hot
sqrtsum += Math.sqrt(i); // hot
for(int j=0; j<10000; j++){// very hot
int sumplusj = j+i; // very hot
} // very hot
}



Normally one may say "but isn't there a method call overhead that makes it slower?" - if this were C or C++, perhaps. Yes, there is an overhead, but extracting the hot code out of the cold method overcomes that by a large margin.

As for the "very hot" code in the second method, it is debatable as to when to extract that as well. It is my own heuristic that the more operations in the hot section, the more the benefit. With just that one statement, there may not be much benefit (testing would be required to know for sure).

In general though, keep cold code in cold methods and hot code in hot methods. Cursory tests similar to (though more intensive) the above show a performance increase of 200%, but it's largely going to depend on your JIT compiler.

As JIT compilation evolves, this will most likely become less of an issue - I expect that it will rely less on method boundaries in the future. For now though, keep those methods hot.

2 comments:

Unknown said...

So tell me o wise one??? How does the compiler know that it should compile the extracted function instead of interpreting it if it was only being called once in a while from inside the cold function anyways???? Won't this have the same effect (not to mention the function call overhead you mentioned)?

Solnus said...

If it's only called once ever inside the cold function, then it's cold as well and won't be compiled quickly (if ever). If it's in a while loop and is called hundreds of times, it will be compiled because it is therefore hot. If the called function is small as well as hot, it will not only be compiled but inlined as well, so there won't be any overhead.