Profiling overhead: how to reduce or avoid it

The profiler may slow down the applications you profile. This overhead ranges from virtually zero to significant, depending on the conditions described below.

Overhead of running an application with the profiler

To enable features such as object allocation recording and CPU tracing, the profiler uses bytecode instrumentation to insert supporting code into the bytecode of the profiled application. When object allocation recording and CPU tracing are not being performed, the inserted code is inactive but still adds a small overhead to instrumented methods (1-5%, depending on the application). The instrumentation process itself also takes a one-time cost that depends on the number of loaded classes and their methods.

In most cases, such overhead is more than acceptable.

When maximum performance is needed, for example when profiling in production, this overhead can be eliminated entirely by avoiding bytecode instrumentation. The price you pay is that the features that depend on instrumentation are disabled. Even then, you can still capture memory snapshots and perform CPU sampling, which is enough in many cases (see Solving performance problems).

You can disable bytecode instrumentation by specifying the "disabletracing", "disablealloc" and "probe_disable=*" startup options.

Since the greatest share of the overhead described above is caused by instrumentation needed for tracing, as a compromise you can disable this feature alone, keeping the ability to record object allocations on demand.
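As a rough sketch of how the two configurations above might look on the JVM command line, the helper below assembles an -agentpath argument. The agent library path is a placeholder, and the class and method names are hypothetical. Only the option names come from the text above. The sketch assumes the agent accepts its startup options as a comma-separated list after "=".

```java
import java.util.Arrays;
import java.util.List;

public class AgentOptions {
    // Placeholder path (assumption): substitute the real location of the profiler agent library.
    public static final String AGENT_PATH = "/path/to/profiler-agent.so";

    // Joins startup options into a single -agentpath JVM argument.
    // Assumes the agent takes a comma-separated option list after '='.
    public static String agentArgument(List<String> options) {
        if (options.isEmpty()) {
            return "-agentpath:" + AGENT_PATH;
        }
        return "-agentpath:" + AGENT_PATH + "=" + String.join(",", options);
    }

    public static void main(String[] args) {
        // No bytecode instrumentation at all: tracing, allocation recording and probes are off.
        System.out.println(agentArgument(
                Arrays.asList("disabletracing", "disablealloc", "probe_disable=*")));
        // Compromise: only tracing is off, allocation recording stays available on demand.
        System.out.println(agentArgument(Arrays.asList("disabletracing")));
    }
}
```

When the resulting argument is typed directly into a shell, quote it so that the "*" in "probe_disable=*" is not expanded by the shell.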

There is another, almost negligible, issue: if the JVM loads an agent that is capable of profiling heap memory, JVM class data sharing is disabled. This may slightly increase startup time, i.e. the time the JVM needs to load its core classes from rt.jar. For details about class data sharing, refer to this page on the Sun website: http://java.sun.com/j2se/1.5.0/docs/guide/vm/class-data-sharing.html

Overhead when measuring is performed

While CPU profiling and/or object allocation recording is being performed, the profiler adds extra overhead. Once measuring is turned off, the overhead should drop back to the level described above in "Overhead of running an application with the profiler".

Snapshot capture

During the capture, the profiled application is paused. The time it takes to capture a memory snapshot depends on the heap size. Capturing a snapshot of a huge heap takes even longer when little free physical memory is available, because the system swap file is then used intensively.
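Because the pause grows with the heap size, it can be useful to check how large the heap currently is before capturing a snapshot of a busy system. The sketch below reads the heap figures through the standard java.lang.management API. The class name HeapSizeCheck and its use as a pre-capture check are illustrative assumptions, not a profiler feature, and the code reports the heap of the JVM it runs in, so it would have to run inside the profiled application.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapSizeCheck {
    // Converts a byte count to whole mebibytes (rounds down).
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Heap usage of the current JVM, as reported by the standard memory MXBean.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println("heap used:      " + toMiB(heap.getUsed()) + " MiB");
        System.out.println("heap committed: " + toMiB(heap.getCommitted()) + " MiB");
    }
}
```

The used heap is only a rough proxy for capture time. The actual pause also depends on how much free physical memory the machine has.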

Thread stack and status telemetry

Thread stack and status information is shown in Thread view as well as in other telemetry views. This information can be very useful because it allows you to connect to the profiled application on demand and discover how the application behaved in the past. In most cases, collecting this information adds no significant overhead.

However, it makes sense to disable it on production Java EE servers to keep profiling overhead to a minimum. This can be done with the "disablestacktelemetry" startup option.
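On a production server it may be worth confirming that the option really reached the JVM. The sketch below reads the JVM input arguments through the standard RuntimeMXBean and looks for "disablestacktelemetry" among the agent options. The class and method names are hypothetical. The sketch also assumes that the profiler is the first agent loaded with -agentpath, that its options are comma-separated after "=", and that the agent path itself contains no "=".

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AgentOptionCheck {
    // Returns the options of the first -agentpath argument, or an empty list if there are none.
    public static List<String> agentOptions(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith("-agentpath:")) {
                int eq = arg.indexOf('=');
                if (eq < 0 || eq == arg.length() - 1) {
                    return new ArrayList<>(); // agent loaded without options
                }
                return new ArrayList<>(Arrays.asList(arg.substring(eq + 1).split(",")));
            }
        }
        return new ArrayList<>(); // no -agentpath argument found
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with (standard java.lang.management API).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        boolean disabled = agentOptions(jvmArgs).contains("disablestacktelemetry");
        System.out.println("stack telemetry disabled: " + disabled);
    }
}
```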

Exception telemetry

Exception telemetry helps discover performance issues and logic errors. In most cases, collecting this information adds no significant overhead.

However, it makes sense to disable it on production Java EE servers to keep profiling overhead to a minimum. This can be done with the "disableexceptiontelemetry" startup option.