Object allocation recording
YourKit Java Profiler can optionally record object allocations, that is, track method calls where objects are created.
To start allocation recording, connect to the profiled application and use the corresponding toolbar button.
Object allocation recording modes
Recording allocations adds performance overhead, so allocations should not be recorded permanently. Record them only when you really need them.
You can also choose among the available recording modes to balance the accuracy and completeness of the results against profiling overhead.
The classic mode provides full detail: the stack and thread where each recorded object is created are captured and stored. Allocation information for particular objects is available in a memory snapshot. This mode also enables comprehensive analysis of excessive garbage allocation.
Memory snapshots captured when allocations are being recorded, or after object allocation recording has been stopped, contain allocation information.
If an object is created while allocations are not being recorded, or recording is restarted after the object has been created, or allocation results are explicitly cleared after the object has been created, the snapshot will contain no allocation information for that object.
To keep overhead moderate, it is reasonable to skip allocation events for some percentage of objects. This approach is useful for finding the causes of excessive garbage collection.
You can also record allocations for every object whose size exceeds a certain threshold. It is valuable to know where the biggest objects are allocated. Normally there are few such big objects, so recording their allocation should not add significant overhead.
In rare cases you may need to record every created object, for example when allocation information for one particular object must be obtained. To achieve this, set "Record each" to 1.
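The two classic-mode rules above (record every Nth object, and always record objects above a size threshold) can be sketched as a simple decision function. This is only an illustration of the policy, not the profiler's implementation: the class `AllocationRecordingPolicy` and its method are hypothetical, and the field names merely echo the "Record each" setting and the size limit mentioned in the text.

```java
// Illustrative sketch only; AllocationRecordingPolicy is a hypothetical class,
// not part of the YourKit agent or API.
final class AllocationRecordingPolicy {
    private final int recordEach;       // record every Nth allocation (1 = every object)
    private final long sizeLimitBytes;  // always record objects bigger than this
    private long seen;                  // number of allocations observed so far

    AllocationRecordingPolicy(int recordEach, long sizeLimitBytes) {
        if (recordEach < 1) {
            throw new IllegalArgumentException("recordEach must be >= 1");
        }
        this.recordEach = recordEach;
        this.sizeLimitBytes = sizeLimitBytes;
    }

    /** Returns true if the allocation of an object of the given size should be recorded. */
    boolean shouldRecord(long objectSizeBytes) {
        seen++;
        boolean isNth = seen % recordEach == 0;             // periodic rule
        boolean isBig = objectSizeBytes > sizeLimitBytes;   // size threshold rule
        return isNth || isBig;
    }

    public static void main(String[] args) {
        AllocationRecordingPolicy policy = new AllocationRecordingPolicy(10, 4096);
        int recorded = 0;
        for (int i = 0; i < 100; i++) {
            if (policy.shouldRecord(64)) {
                recorded++;
            }
        }
        System.out.println("small objects recorded: " + recorded);
        System.out.println("1 MB object recorded: " + policy.shouldRecord(1 << 20));
    }
}
```

With `recordEach` set to 1 the periodic rule matches every allocation, which corresponds to the "record each created object" case.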
The heap sampling mode provides the same level of detail as the classic mode and usually has lower profiling overhead.
Heap sampling uses the JVM heap sampling event available in Java 11+ to record an object each time N bytes, on average, have been allocated.
Unlike the classic mode controlled with the sizeLimit parameters, there is absolutely no profiling overhead for creation of the objects not being recorded.
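To make "record objects after allocating each N bytes" concrete, here is a minimal simulation of byte-interval sampling. It is a sketch under simplifying assumptions: the class `ByteIntervalSampler` is hypothetical, it runs in plain Java rather than inside the JVM, and it uses a fixed interval, whereas the JVM's heap sampling interval is documented as only an average, which this sketch does not model.

```java
// Illustrative simulation only; ByteIntervalSampler is a hypothetical class
// and does not use the real JVM heap sampling event.
final class ByteIntervalSampler {
    private final long intervalBytes;   // sample once per this many allocated bytes
    private long bytesUntilNextSample;  // countdown to the next sampled allocation

    ByteIntervalSampler(long intervalBytes) {
        if (intervalBytes < 1) {
            throw new IllegalArgumentException("intervalBytes must be >= 1");
        }
        this.intervalBytes = intervalBytes;
        this.bytesUntilNextSample = intervalBytes;
    }

    /** Called for each allocation; returns true if this allocation is sampled. */
    boolean onAllocation(long sizeBytes) {
        bytesUntilNextSample -= sizeBytes;
        if (bytesUntilNextSample > 0) {
            return false; // interval not yet consumed: nothing is recorded for this object
        }
        bytesUntilNextSample = intervalBytes; // start a new interval
        return true;
    }

    public static void main(String[] args) {
        ByteIntervalSampler sampler = new ByteIntervalSampler(1024);
        int sampled = 0;
        for (int i = 0; i < 1000; i++) {
            if (sampler.onAllocation(64)) {
                sampled++;
            }
        }
        System.out.println("sampled " + sampled + " of 1000 allocations");
    }
}
```

Note that a single large object consumes the whole interval at once and is therefore always sampled, while most small objects pass through the cheap countdown path.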
Count allocated objects
This is the most lightweight allocation recording mode.
Use this mode to quickly gain insight into how many objects are created and of which classes. In particular, this identifies excessive garbage allocation problems (lots of temporary objects).
Object counting is specially designed to have minimal, almost zero, overhead:
- Object counting provides allocated object counts by class, then by immediate allocator method with the exact line number, if available. Unlike the other recording modes, it does not provide stack traces and does not track particular instances, i.e. no allocation information for particular live objects is available in a memory snapshot.
- Objects allocated in different threads are summed and cannot be distinguished.
Use object counting to initially detect possible problems: thanks to its low overhead you may do this even in production.
Further investigation may involve using the classic or heap sampling allocation recording mode to get comprehensive profiling results with stack traces (call tree).
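The shape of the data that object counting produces (counts by class, then by immediate allocator method and line, summed across threads) can be sketched with a nested map. The class `AllocationCounter` and the site strings used in `main` are hypothetical and only illustrate the aggregation; they say nothing about how the agent collects the counts.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; AllocationCounter is a hypothetical class.
final class AllocationCounter {
    // class name -> allocation site ("method:line") -> number of objects created
    private final Map<String, Map<String, Long>> counts = new TreeMap<>();

    /** Counts one allocation; calls from all threads feed the same totals. */
    synchronized void count(String className, String allocatorSite) {
        counts.computeIfAbsent(className, k -> new TreeMap<>())
              .merge(allocatorSite, 1L, Long::sum);
    }

    /** Number of objects of the class created at one allocation site. */
    synchronized long at(String className, String allocatorSite) {
        Map<String, Long> sites = counts.get(className);
        return sites == null ? 0L : sites.getOrDefault(allocatorSite, 0L);
    }

    /** Total number of objects of the class, over all allocation sites. */
    synchronized long total(String className) {
        Map<String, Long> sites = counts.get(className);
        if (sites == null) {
            return 0L;
        }
        long sum = 0;
        for (long c : sites.values()) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        AllocationCounter counter = new AllocationCounter();
        // Hypothetical allocation sites, for illustration only.
        counter.count("java.lang.String", "Parser.readToken:42");
        counter.count("java.lang.String", "Parser.readToken:42");
        counter.count("java.lang.String", "Logger.format:17");
        counter.count("byte[]", "Buffer.grow:88");
        System.out.println("String total: " + counter.total("java.lang.String"));
        System.out.println("String at Parser.readToken:42: "
                + counter.at("java.lang.String", "Parser.readToken:42"));
    }
}
```

Because only a class name and one allocation site are kept per count, there is no call stack and no link to individual instances, which matches the limitations listed above.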
Start and stop recording
You can start and stop recording of allocations during execution of your application as many times as you wish. When allocations are not recorded, memory profiling adds no performance overhead to the application being profiled.
You can control recording of allocations from the profiler UI as described above, or via the Profiler API.
Object allocation recording results in the user interface
The profiler offers two implementations of object allocation recording: bytecode instrumentation-based and heap sampling-based. They provide the same results but may differ in profiling overhead and footprint.
Bytecode instrumentation is available in all supported Java versions. The profiler applies it by default when heap sampling (see below) is not available, i.e. on Java 10 or older. To use bytecode instrumentation on Java 11 or newer, specify the corresponding agent startup option.
Bytecode instrumentation imposes almost no overhead while allocation recording is not running.
If you apply startup options such as disableall to totally eliminate the overhead, allocation recording will not be possible.
Heap sampling is a new Java profiling capability introduced in Java 11 via JEP 331. It offers a new JVMTI event to record allocated objects without using bytecode instrumentation.
Not instrumenting bytecode for object allocation recording reduces class load time and size of the resulting bytecode.
This approach is also extremely useful in the attach mode, because it eliminates the pause on the first attempt to start object allocation recording, which is caused by instrumenting classes loaded before the agent attached.
By default, the heap sampling event is used instead of bytecode instrumentation whenever available, that is, on Java 11 and newer. To use bytecode instrumentation instead, specify the corresponding agent startup option.
Sampled object allocation recording option (advanced topic)
The greatest contribution to object allocation recording overhead in the classic mode comes from obtaining the exact stack trace for each recorded object.
The idea is to estimate the stack trace instead, in order to reduce the overhead. This is similar to how CPU sampling works: the sampling thread periodically obtains the stacks of running threads. When a thread creates a new object, the stack most recently remembered for that thread is used as the estimate of the object's allocation stack.
As with CPU sampling, the sampled object allocation recording results are relevant only for methods that run longer than the sampling period.
By default, the synchronous sampler is used to estimate stack traces if no CPU sampling is in progress. Changing the CPU sampling mode also affects sampled object allocation recording.
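The estimation scheme described above can be sketched with standard JDK calls: one method that a sampling thread would call periodically to remember every live thread's stack, and one lookup used at allocation time. The class `SampledStackEstimator` is hypothetical and is not how the agent is implemented; in particular it does not model the synchronous sampler or the CPU sampling modes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only; SampledStackEstimator is a hypothetical class.
final class SampledStackEstimator {
    private static final StackTraceElement[] NO_STACK = new StackTraceElement[0];

    // Most recently sampled stack for each thread.
    private final Map<Thread, StackTraceElement[]> lastStack = new ConcurrentHashMap<>();

    /** Called periodically by a sampling thread: remember the stacks of all live threads. */
    void sampleRunningThreads() {
        lastStack.putAll(Thread.getAllStackTraces());
    }

    /**
     * Called when a thread creates an object: returns the stack last remembered
     * for that thread, or an empty array if the thread has never been sampled.
     * The result is an estimate; the thread may have moved on since the sample.
     */
    StackTraceElement[] estimateAllocationStack(Thread allocatingThread) {
        return lastStack.getOrDefault(allocatingThread, NO_STACK);
    }

    public static void main(String[] args) {
        SampledStackEstimator estimator = new SampledStackEstimator();
        estimator.sampleRunningThreads();
        StackTraceElement[] estimate =
                estimator.estimateAllocationStack(Thread.currentThread());
        System.out.println("estimated stack depth: " + estimate.length);
    }
}
```

The lookup at allocation time is a single map read, which is why estimation is cheaper than walking the stack for every object; the cost is that a method shorter than the sampling period may never appear in a remembered stack.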
When to use
Use this mode to get allocation hot spots, to find top methods responsible for most allocations. Don't use it if you need precise results for all recorded objects.
How to enable
Exact stack traces are gathered by default when you start object allocation recording. To use sampled stacks instead:
- In the profiler UI: choose "Estimated (sampled) stacks instead of exact stacks" in the "Start Object Allocation Recording" toolbar popup window.
- If object allocation recording is started with startup options such as allocsizelimit, specify the corresponding startup option.
- Use the API method that takes the parameter boolean sampledAllocationRecording. The old version of the method without that parameter records exact stacks.