inaccurate timing of native method calls
-
- Posts: 13
- Joined: Wed Dec 07, 2011 3:18 pm
inaccurate timing of native method calls
Hi,
As I also found mentioned in another thread (http://forums.yourkit.com/viewtopic.php?f=3&t=912) the CPU sampling is not accurate for native methods. I performed my tests using YourKit 9.5.4.
Specifically, our application quite frequently indirectly calls String#intern() (typically only for a handful of distinct Strings; and the call is part of a 3rd party library and is thus out of our control). When CPU profiling our application with the sampling method the String#intern() method shows up as using around 17% of the entire CPU time, which is clearly way too much. (I found out about the inaccurate reporting the hard way by investing quite some effort into patching the 3rd party code to *not* call String#intern() as often, only to find out that it made a very small difference in the end.)
If this problem cannot be fixed in YourKit, is there maybe a way to factor out the impact of it in the reporting?
Also note that CPU profiling using the tracing method is not always an option, as it would take way too long to be practical.
Thanks,
Knut Wannheden
As I also found mentioned in another thread (http://forums.yourkit.com/viewtopic.php?f=3&t=912) the CPU sampling is not accurate for native methods. I performed my tests using YourKit 9.5.4.
Specifically, our application quite frequently indirectly calls String#intern() (typically only for a handful of distinct Strings; and the call is part of a 3rd party library and is thus out of our control). When CPU profiling our application with the sampling method the String#intern() method shows up as using around 17% of the entire CPU time, which is clearly way too much. (I found out about the inaccurate reporting the hard way by investing quite some effort into patching the 3rd party code to *not* call String#intern() as often, only to find out that it made a very small difference in the end.)
If this problem cannot be fixed in YourKit, is there maybe a way to factor out the impact of it in the reporting?
Also note that CPU profiling using the tracing method is not always an option, as it would take way too long to be practical.
Thanks,
Knut Wannheden
-
- Posts: 6172
- Joined: Wed Aug 11, 2004 8:37 am
Re: inaccurate timing of native method calls
Hello Knut
Sampling cannot accurately measure methods smaller than the sampling period. The problem is with all short methods, not only the native ones. It's a fundamental limitation of sampling approach.
http://www.yourkit.com/docs/10/help/cpu_intro.jsp
To measure short methods please use tracing. Yes, its overhead can be very big, but you can reduce it by specifying appropriate filters:
http://www.yourkit.com/docs/10/help/filters.jsp
Also, to track execution of particular methods you can use probes:
http://www.yourkit.com/docs/10/help/probes.jsp
or triggers:
http://www.yourkit.com/docs/10/help/triggers.jsp
Best regards,
Anton
Sampling cannot accurately measure methods smaller than the sampling period. The problem is with all short methods, not only the native ones. It's a fundamental limitation of sampling approach.
http://www.yourkit.com/docs/10/help/cpu_intro.jsp
To measure short methods please use tracing. Yes, its overhead can be very big, but you can reduce it by specifying appropriate filters:
http://www.yourkit.com/docs/10/help/filters.jsp
Also, to track execution of particular methods you can use probes:
http://www.yourkit.com/docs/10/help/probes.jsp
or triggers:
http://www.yourkit.com/docs/10/help/triggers.jsp
Best regards,
Anton
-
- Posts: 181
- Joined: Mon Apr 11, 2005 10:38 am
Re: inaccurate timing of native method calls
Try taking random stack traces from your app (using ctrl-\ or jstack). If you find that the native method is disproportionately represented at the bottom of the stack, you might be running into this bug.
A vote may help to get it looked at!
R
A vote may help to get it looked at!
R
-
- Posts: 13
- Joined: Wed Dec 07, 2011 3:18 pm
Re: inaccurate timing of native method calls
Hi Robert,
Interesting. Thanks for the pointer!
It indeed sounds very much like the problem I encountered. Given that the report is over 2 years old, has a priority of low, and 0 votes I don't expect much to happen here... But if I manage to confirm this I will certainly give the bug my vote.
Regards,
Knut
Interesting. Thanks for the pointer!
It indeed sounds very much like the problem I encountered. Given that the report is over 2 years old, has a priority of low, and 0 votes I don't expect much to happen here... But if I manage to confirm this I will certainly give the bug my vote.
Regards,
Knut
-
- Posts: 181
- Joined: Mon Apr 11, 2005 10:38 am
Re: inaccurate timing of native method calls
Actually Anton, I don't quite agree. It's true that if the sampling frequency of the profiler is less than twice (1/time in short-lived method) then the profiler doesn't have enough information to correctly regenerate the original signal (thanks Messrs Nyquist, Shannon, Kotelnikov and Whittaker!). But if anything this should under-estimate the time in the short method, not over-estimate.Anton Katilin wrote:Sampling cannot accurately measure methods smaller than the sampling period. The problem is with all short methods, not only the native ones. It's a fundamental limitation of sampling approach.
http://www.yourkit.com/docs/10/help/cpu_intro.jsp
You can get empirical confirmation of this by taking the small test case in the bug referenced above and playing with the method doLog. If you replace the call to StrictMath.log by a call to, for example, StrictMath.sin, or another short-lived but native method, that method shows up as a hot spot in the profiler; however if you replace the call to StrictMath.log by a call to StrictMath.toRadians, or another short-lived but non-native method, that method drops off the radar completely and it is nextDouble and nextGaussian which show up as hot spots.
-
- Posts: 181
- Joined: Mon Apr 11, 2005 10:38 am
Re: inaccurate timing of native method calls
That's because I haven't spent enough time trolling for votes on the YourKit forums! And maybe somebody with a super-platinum support contract with Oracle will be bitten by the bug and find this post ...knutwannheden wrote:Given that the report is over 2 years old, has a priority of low, and 0 votes I don't expect much to happen here
The "good news" is that it isn't a YourKit problem since the Oracle VM is reporting the wrong info so whatever profiler you use you will get the wrong results. (At least I suppose, I only tested with hprof and YourKit and not at all with other VMs.)
-
- Posts: 6172
- Joined: Wed Aug 11, 2004 8:37 am
Re: inaccurate timing of native method calls
Hello,
We confirm the problem, but what can we do from our side?
We use the stacks which JVM reports.
How can we understand that some particular stack is incorrect? And which stack should we use instead?
We see no solution for this problem at the moment.
Best regards,
Anton
We confirm the problem, but what can we do from our side?
We use the stacks which JVM reports.
How can we understand that some particular stack is incorrect? And which stack should we use instead?
We see no solution for this problem at the moment.
Best regards,
Anton
-
- Posts: 13
- Joined: Wed Dec 07, 2011 3:18 pm
Re: inaccurate timing of native method calls
Is this problem only relevant when profiling in the sampling mode or does it also affect the tracing mode?
I think the best thing you can do (apart from voting for this JVM bug ) is documenting it. I spent considerable time trying to find the source of this problem and I imagine other users may run into the same problem. Maybe you could even add some kind of decorator to native methods in the profiler with a link to this documentation.
Even better would of course be if you find some way of compensating for this effect. Or could the user could be given the option to filter these out?
I think the best thing you can do (apart from voting for this JVM bug ) is documenting it. I spent considerable time trying to find the source of this problem and I imagine other users may run into the same problem. Maybe you could even add some kind of decorator to native methods in the profiler with a link to this documentation.
Even better would of course be if you find some way of compensating for this effect. Or could the user could be given the option to filter these out?
-
- Posts: 6172
- Joined: Wed Aug 11, 2004 8:37 am
Re: inaccurate timing of native method calls
This problem is in sampling mode only. In tracing all stacks are exact. Tracing results can be non adequate if there are many calls to short running methods traced, but that's completely different problem which can be solved by properly choosing filters.Is this problem only relevant when profiling in the sampling mode or does it also affect the tracing mode?
Do you offer to ignore particular methods from given list, if they occur at the top of sampled stack, and use their caller instead?Or could the user could be given the option to filter these out?
By the way, have you tried version 11 EAP? We have excluded some JVM intrinsic methods from instrumentation to reduce overhead. Perhaps it can somehow change the situation with wrong sampled stacks too.
-
- Posts: 13
- Joined: Wed Dec 07, 2011 3:18 pm
Re: inaccurate timing of native method calls
Honestly I don't know how this exactly should work. Would it be possible to just use the filters normally used when tracing?Anton Katilin wrote:Do you offer to ignore particular methods from given list, if they occur at the top of sampled stack, and use their caller instead?
I think adding a note to the documentation would be the most valuable change.
I will give that a try. Thanks for the tip.By the way, have you tried version 11 EAP? We have excluded some JVM intrinsic methods from instrumentation to reduce overhead. Perhaps it can somehow change the situation with wrong sampled stacks too.
-
- Posts: 181
- Joined: Mon Apr 11, 2005 10:38 am
Re: inaccurate timing of native method calls
Interesting remark, I didn't think about the distinction between intrinsic and ordinary native methods. I did a small test to confirm, looks like the problem affects all native methods, not just intrinsic ones.Anton Katilin wrote:By the way, have you tried version 11 EAP? We have excluded some JVM intrinsic methods from instrumentation to reduce overhead. Perhaps it can somehow change the situation with wrong sampled stacks too.
Having said that, I'd love to check the latest build but don't have a license since [email protected] aren't checking their email (hint, hint).
-
- Posts: 6172
- Joined: Wed Aug 11, 2004 8:37 am
Re: inaccurate timing of native method calls
Please just download and install version 11 EAP:
http://www.yourkit.com/eap
It will not ask to enter a key. It will run out of the box.
http://www.yourkit.com/eap
It will not ask to enter a key. It will run out of the box.
-
- Posts: 6172
- Joined: Wed Aug 11, 2004 8:37 am
Re: inaccurate timing of native method calls
P.S.
Please also try startup option "disablenatives" with version 11 or with an older one, in order to prevent instrumentation of all native methods. It is interesting whether it affects the results.
How to specify it: http://www.yourkit.com/docs/10/help/startup_options.jsp
Please also try startup option "disablenatives" with version 11 or with an older one, in order to prevent instrumentation of all native methods. It is interesting whether it affects the results.
How to specify it: http://www.yourkit.com/docs/10/help/startup_options.jsp
-
- Posts: 181
- Joined: Mon Apr 11, 2005 10:38 am
Re: inaccurate timing of native method calls
I tested with 9.0.9 and 11-EAPb11008 with and without the option disablenatives and didn't notice any significant differences.
-
- Posts: 6172
- Joined: Wed Aug 11, 2004 8:37 am
Re: inaccurate timing of native method calls
Interesting. Thanks for sharing the results.
If possible, could you please also run the same test on a non-HotSpot VM, e.g. JRockit or IBM.
If possible, could you please also run the same test on a non-HotSpot VM, e.g. JRockit or IBM.