inaccurate timing of native method calls

Questions about YourKit Java Profiler
Post Reply
knutwannheden
Posts: 13
Joined: Wed Dec 07, 2011 3:18 pm

inaccurate timing of native method calls

Post by knutwannheden »

Hi,

As I also found mentioned in another thread (http://forums.yourkit.com/viewtopic.php?f=3&t=912) the CPU sampling is not accurate for native methods. I performed my tests using YourKit 9.5.4.

Specifically, our application quite frequently indirectly calls String#intern() (typically only for a handful of distinct Strings; and the call is part of a 3rd party library and is thus out of our control). When CPU profiling our application with the sampling method the String#intern() method shows up as using around 17% of the entire CPU time, which is clearly way too much. (I found out about the inaccurate reporting the hard way by investing quite some effort into patching the 3rd party code to *not* call String#intern() as often, only to find out that it made a very small difference in the end.)

If this problem cannot be fixed in YourKit, is there maybe a way to factor out the impact of it in the reporting?

Also note that CPU profiling using the tracing method is not always an option, as it would take way too long to be practical.

Thanks,

Knut Wannheden
Anton Katilin
Posts: 6172
Joined: Wed Aug 11, 2004 8:37 am

Re: inaccurate timing of native method calls

Post by Anton Katilin »

Hello Knut

Sampling cannot accurately measure methods smaller than the sampling period. The problem is with all short methods, not only the native ones. It's a fundamental limitation of sampling approach.
http://www.yourkit.com/docs/10/help/cpu_intro.jsp

To measure short methods please use tracing. Yes, its overhead can be very big, but you can reduce it by specifying appropriate filters:
http://www.yourkit.com/docs/10/help/filters.jsp

Also, to track execution of particular methods you can use probes:
http://www.yourkit.com/docs/10/help/probes.jsp
or triggers:
http://www.yourkit.com/docs/10/help/triggers.jsp

Best regards,
Anton
Gibson
Posts: 181
Joined: Mon Apr 11, 2005 10:38 am

Re: inaccurate timing of native method calls

Post by Gibson »

Try taking random stack traces from your app (using ctrl-\ or jstack). If you find that the native method is disproportionately represented at the bottom of the stack, you might be running into this bug.

A vote may help to get it looked at!
R
knutwannheden
Posts: 13
Joined: Wed Dec 07, 2011 3:18 pm

Re: inaccurate timing of native method calls

Post by knutwannheden »

Hi Robert,

Interesting. Thanks for the pointer!

It indeed sounds very much like the problem I encountered. Given that the report is over 2 years old, has a priority of low, and 0 votes I don't expect much to happen here... But if I manage to confirm this I will certainly give the bug my vote.

Regards,
Knut
Gibson
Posts: 181
Joined: Mon Apr 11, 2005 10:38 am

Re: inaccurate timing of native method calls

Post by Gibson »

Anton Katilin wrote:Sampling cannot accurately measure methods smaller than the sampling period. The problem is with all short methods, not only the native ones. It's a fundamental limitation of sampling approach.
http://www.yourkit.com/docs/10/help/cpu_intro.jsp
Actually Anton, I don't quite agree. It's true that if the sampling frequency of the profiler is less than twice (1/time in short-lived method) then the profiler doesn't have enough information to correctly regenerate the original signal (thanks Messrs Nyquist, Shannon, Kotelnikov and Whittaker!). But if anything this should under-estimate the time in the short method, not over-estimate.

You can get empirical confirmation of this by taking the small test case in the bug referenced above and playing with the method doLog. If you replace the call to StrictMath.log by a call to, for example, StrictMath.sin, or another short-lived but native method, that method shows up as a hot spot in the profiler; however if you replace the call to StrictMath.log by a call to StrictMath.toRadians, or another short-lived but non-native method, that method drops off the radar completely and it is nextDouble and nextGaussian which show up as hot spots.
Gibson
Posts: 181
Joined: Mon Apr 11, 2005 10:38 am

Re: inaccurate timing of native method calls

Post by Gibson »

knutwannheden wrote:Given that the report is over 2 years old, has a priority of low, and 0 votes I don't expect much to happen here
That's because I haven't spent enough time trolling for votes on the YourKit forums! And maybe somebody with a super-platinum support contract with Oracle will be bitten by the bug and find this post ...

The "good news" is that it isn't a YourKit problem since the Oracle VM is reporting the wrong info so whatever profiler you use you will get the wrong results. (At least I suppose, I only tested with hprof and YourKit and not at all with other VMs.)
Anton Katilin
Posts: 6172
Joined: Wed Aug 11, 2004 8:37 am

Re: inaccurate timing of native method calls

Post by Anton Katilin »

Hello,

We confirm the problem, but what can we do from our side?

We use the stacks which JVM reports.
How can we understand that some particular stack is incorrect? And which stack should we use instead?

We see no solution for this problem at the moment.

Best regards,
Anton
knutwannheden
Posts: 13
Joined: Wed Dec 07, 2011 3:18 pm

Re: inaccurate timing of native method calls

Post by knutwannheden »

Is this problem only relevant when profiling in the sampling mode or does it also affect the tracing mode?

I think the best thing you can do (apart from voting for this JVM bug :)) is documenting it. I spent considerable time trying to find the source of this problem and I imagine other users may run into the same problem. Maybe you could even add some kind of decorator to native methods in the profiler with a link to this documentation.

Even better would of course be if you find some way of compensating for this effect. Or could the user could be given the option to filter these out?
Anton Katilin
Posts: 6172
Joined: Wed Aug 11, 2004 8:37 am

Re: inaccurate timing of native method calls

Post by Anton Katilin »

Is this problem only relevant when profiling in the sampling mode or does it also affect the tracing mode?
This problem is in sampling mode only. In tracing all stacks are exact. Tracing results can be non adequate if there are many calls to short running methods traced, but that's completely different problem which can be solved by properly choosing filters.
Or could the user could be given the option to filter these out?
Do you offer to ignore particular methods from given list, if they occur at the top of sampled stack, and use their caller instead?

By the way, have you tried version 11 EAP? We have excluded some JVM intrinsic methods from instrumentation to reduce overhead. Perhaps it can somehow change the situation with wrong sampled stacks too.
knutwannheden
Posts: 13
Joined: Wed Dec 07, 2011 3:18 pm

Re: inaccurate timing of native method calls

Post by knutwannheden »

Anton Katilin wrote:Do you offer to ignore particular methods from given list, if they occur at the top of sampled stack, and use their caller instead?
Honestly I don't know how this exactly should work. Would it be possible to just use the filters normally used when tracing?

I think adding a note to the documentation would be the most valuable change.
By the way, have you tried version 11 EAP? We have excluded some JVM intrinsic methods from instrumentation to reduce overhead. Perhaps it can somehow change the situation with wrong sampled stacks too.
I will give that a try. Thanks for the tip.
Gibson
Posts: 181
Joined: Mon Apr 11, 2005 10:38 am

Re: inaccurate timing of native method calls

Post by Gibson »

Anton Katilin wrote:By the way, have you tried version 11 EAP? We have excluded some JVM intrinsic methods from instrumentation to reduce overhead. Perhaps it can somehow change the situation with wrong sampled stacks too.
Interesting remark, I didn't think about the distinction between intrinsic and ordinary native methods. I did a small test to confirm, looks like the problem affects all native methods, not just intrinsic ones.

Having said that, I'd love to check the latest build but don't have a license since [email protected] aren't checking their email (hint, hint).
Anton Katilin
Posts: 6172
Joined: Wed Aug 11, 2004 8:37 am

Re: inaccurate timing of native method calls

Post by Anton Katilin »

Please just download and install version 11 EAP:
http://www.yourkit.com/eap

It will not ask to enter a key. It will run out of the box.
Anton Katilin
Posts: 6172
Joined: Wed Aug 11, 2004 8:37 am

Re: inaccurate timing of native method calls

Post by Anton Katilin »

P.S.
Please also try startup option "disablenatives" with version 11 or with an older one, in order to prevent instrumentation of all native methods. It is interesting whether it affects the results.

How to specify it: http://www.yourkit.com/docs/10/help/startup_options.jsp
Gibson
Posts: 181
Joined: Mon Apr 11, 2005 10:38 am

Re: inaccurate timing of native method calls

Post by Gibson »

I tested with 9.0.9 and 11-EAPb11008 with and without the option disablenatives and didn't notice any significant differences.
Anton Katilin
Posts: 6172
Joined: Wed Aug 11, 2004 8:37 am

Re: inaccurate timing of native method calls

Post by Anton Katilin »

Interesting. Thanks for sharing the results.

If possible, could you please also run the same test on a non-HotSpot VM, e.g. JRockit or IBM.
Post Reply