YJP Agent segfault in native libssl call

duaneg
Posts: 3
Joined: Thu Nov 25, 2021 10:46 pm

YJP Agent segfault in native libssl call

Post by duaneg »

Hi Folks,

I get the following crash when I try to use the latest 2021.11-b221 version (the same crash happened with the previous 2021.11 build I tried; 2021.3 works OK):

# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f71a2ff7e7c, pid=14787, tid=14805
#
# JRE version: OpenJDK Runtime Environment 18.9 (11.0.13+8) (build 11.0.13+8-LTS)
# Java VM: OpenJDK 64-Bit Server VM 18.9 (11.0.13+8-LTS, mixed mode, tiered, compressed oops, parallel gc, linux-amd64)
# Problematic frame:
# C [libssl.so.10+0x46e7c] ssl_cert_dup+0x7c

The JVM with the agent is running inside a Docker container with a CentOS 7.9-based image. The installed SSL package is "openssl-1.0.2k-22.el7_9". I'm running the profiler UI on an Arch Linux host. Let me know if you need further details and/or the full dump file.
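
When comparing several crash logs, it can help to pull the library, offset, and symbol out of the "Problematic frame" line mechanically. The following is a minimal sketch, assuming the frame line has the shape shown in the log above; the class name `ProblematicFrame` and its fields are hypothetical and not part of any YourKit or JDK tooling.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses one "Problematic frame" line from an hs_err log.
final class ProblematicFrame {
    // Matches lines such as "# C  [libssl.so.10+0x46e7c]  ssl_cert_dup+0x7c".
    // The symbol part is optional because stripped libraries may not report one.
    private static final Pattern FRAME = Pattern.compile(
        "^#\\s+(\\w)\\s+\\[(.+)\\+(0x[0-9a-fA-F]+)\\]"
        + "(?:\\s+(\\S+)\\+(0x[0-9a-fA-F]+))?\\s*$");

    final String type;          // frame type letter, e.g. "C" for native code
    final String library;       // e.g. "libssl.so.10"
    final String libraryOffset; // offset inside the library
    final String symbol;        // nearest symbol, or null if not reported
    final String symbolOffset;  // offset from that symbol, or null

    private ProblematicFrame(String type, String library, String libraryOffset,
                             String symbol, String symbolOffset) {
        this.type = type;
        this.library = library;
        this.libraryOffset = libraryOffset;
        this.symbol = symbol;
        this.symbolOffset = symbolOffset;
    }

    // Returns the parsed frame, or empty if the line is not a frame line.
    static Optional<ProblematicFrame> parse(String line) {
        Matcher m = FRAME.matcher(line.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new ProblematicFrame(
            m.group(1), m.group(2), m.group(3), m.group(4), m.group(5)));
    }

    public static void main(String[] args) {
        parse("# C  [libssl.so.10+0x46e7c]  ssl_cert_dup+0x7c")
            .ifPresent(f -> System.out.println(f.library + " " + f.symbol));
    }
}
```

If the same library and symbol show up in every crash, as reported here, that points at one code path rather than random memory corruption, although it does not say which caller is at fault.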
Anton Katilin
Posts: 6172
Joined: Wed Aug 11, 2004 8:37 am

Re: YJP Agent segfault in native libssl call

Post by Anton Katilin »

Hello,

It's a new issue.

In version 2021.11 the profiler agent communicates over a secure connection, hence the new dependency on ssl. This functionality did not exist in previous versions such as 2021.3.

duaneg wrote: "Let me know if you need further details and/or the full dump file."
Do you have the same problem with different (newer) OS and/or openssl version combinations?

Can you provide a reproducible example, e.g. a docker file?

Could you please send the JVM crash log file hs_err<pid>.log and the profiler agent log file ~/.yjp/log/<session name>-<pid>.log to [email protected]

Best regards,
Anton
Anton Katilin
Posts: 6172
Joined: Wed Aug 11, 2004 8:37 am

Re: YJP Agent segfault in native libssl call

Post by Anton Katilin »

Update:

We didn't manage to reproduce the problem.

We used this Dockerfile:

Code:

FROM centos:centos7

RUN yum -y update
RUN yum -y install java-11-openjdk java-11-openjdk-devel wget unzip openssl

RUN wget https://www.yourkit.com/download/docker/YourKit-JavaProfiler-2021.11-docker.zip -P /tmp/ && \
  unzip /tmp/YourKit-JavaProfiler-2021.11-docker.zip -d /usr/local && \
  rm /tmp/YourKit-JavaProfiler-2021.11-docker.zip

RUN echo "public class A { public static void main(String[] args) throws Exception {Thread.sleep(Integer.MAX_VALUE);}}" > /tmp/A.java
RUN javac /tmp/A.java -d /tmp

EXPOSE 10001

CMD java -agentpath:/usr/local/YourKit-JavaProfiler-2021.11/bin/linux-x86-64/libyjpagent.so=port=10001,listen=all -cp /tmp A

We ran the container and successfully connected to the agent port 10001 from the profiler UI without any issues.

Version output from within the container:
[root@4bf038b20eb0 /]# openssl version
OpenSSL 1.0.2k-fips 26 Jan 2017
[root@4bf038b20eb0 /]# cat /etc/centos-release
CentOS Linux release 7.9.2009 (Core)
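
To rule out a plain networking problem before involving the profiler UI, a quick TCP reachability check against the agent port can be useful. This is only a sketch: the class name `AgentPortCheck`, the default host `localhost`, and the 2-second timeout are assumptions, and the check opens a bare TCP connection without performing the agent's secure handshake, so it does not exercise the same code path as the UI.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether something accepts TCP connections on a port.
final class AgentPortCheck {
    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions; 10001 is the port exposed in the Dockerfile above.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 10001;
        System.out.println(host + ":" + port + " reachable: "
            + isReachable(host, port, 2000));
    }
}
```

The container port has to be published to the host (or the check run from inside the container) for this to report anything meaningful.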
duaneg
Posts: 3
Joined: Thu Nov 25, 2021 10:46 pm

Re: YJP Agent segfault in native libssl call

Post by duaneg »

I'll send a crash log and profiler agent log as requested.

I'm not able to provide a Dockerfile to reproduce it, sorry. Our application is proprietary and requires licensing, which is non-trivial to arrange on our end and set up on yours.

Note that I was not able to reproduce it with your minimal test code in my own container, whereas in that same container it happens every time when running our software. The bug triggers on startup and always crashes in the same native call, but it occurs at varying points in the startup sequence. I suspect triggering it requires the JVM to be doing something "interesting" and may be timing dependent. I also can't reproduce it running directly on my host machine.

It is also possible that our code is doing something to cause it: we have native code in the mix, including third-party licensing stuff which will be using crypto libraries. However, I have tried hacking that all out and the bug still occurs.
Anton Katilin
Posts: 6172
Joined: Wed Aug 11, 2004 8:37 am

Re: YJP Agent segfault in native libssl call

Post by Anton Katilin »

Thank you for the explanation.

We have received the logs, thank you.

duaneg wrote: "we have native code in the mix, including third-party licensing stuff which will be using crypto libraries."
I'm not familiar with OpenSSL library internals, but a quick search reveals it is not thread safe by default: https://wiki.openssl.org/index.php/Libc ... ead_Safety
So it is possible that the agent and your native code access the library in parallel, which breaks it.

Anyway, could you please try your setup with a different base image, e.g. with the latest CentOS version? Perhaps the problem is version specific.
Anton Katilin
Posts: 6172
Joined: Wed Aug 11, 2004 8:37 am

Re: YJP Agent segfault in native libssl call

Post by Anton Katilin »

Could you please provide the output of the command "ldd -r libyjpagent.so" run inside your container?
duaneg
Posts: 3
Joined: Thu Nov 25, 2021 10:46 pm

Re: YJP Agent segfault in native libssl call

Post by duaneg »

Here is the ldd output for the agent:

(process-extension) ~/gproms-core $ ldd -r libyjpagent.so
ldd: warning: you do not have execution permission for `libyjpagent.so'
linux-vdso.so.1 => (0x00007ffd243ec000)
librt.so.1 => /lib64/librt.so.1 (0x00007fecc5ea0000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fecc5c9c000)
libm.so.6 => /lib64/libm.so.6 (0x00007fecc599a000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fecc577e000)
libc.so.6 => /lib64/libc.so.6 (0x00007fecc53b0000)
/lib64/ld-linux-x86-64.so.2 (0x00007fecc7106000)

As requested by email, here is the version of SSL dynamically linked into the running JVM:

(process-extension) ~/gproms-core $ lsof -p 531 | grep ssl
java 531 builder mem REG 254,1 15210552 /usr/lib64/libssl.so.1.0.2k (path dev=0,44)

I'll send you the full output showing all the dynamically linked libraries, separately via email.

As I said, I was able to reproduce the crash with all the licensing code excluded. However, even without that we do a bunch of stuff that uses crypto and SSL, so there is definitely the possibility of a conflict there.
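
The same information as the lsof check can also be read from /proc on Linux, which is handy when lsof is not installed in the container. The following is a rough sketch, assuming a Linux /proc filesystem; the class name `LoadedSslLibraries` is hypothetical, and matching on "libssl" and "libcrypto" in the file name is a simple heuristic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: lists SSL/crypto shared libraries mapped into a process.
final class LoadedSslLibraries {
    // Takes lines in /proc/<pid>/maps format and returns the distinct mapped
    // file paths whose file name mentions libssl or libcrypto.
    static Set<String> filter(List<String> mapsLines) {
        Set<String> result = new TreeSet<>();
        for (String line : mapsLines) {
            int slash = line.indexOf('/');
            if (slash < 0) {
                continue; // anonymous mapping, [heap], [stack], [vdso], ...
            }
            String path = line.substring(slash).trim();
            String name = path.substring(path.lastIndexOf('/') + 1);
            if (name.contains("libssl") || name.contains("libcrypto")) {
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Pass a pid as the first argument, or omit it to inspect this JVM itself.
        String pid = args.length > 0 ? args[0] : "self";
        filter(Files.readAllLines(Paths.get("/proc", pid, "maps")))
            .forEach(System.out::println);
    }
}
```

If more than one libssl or libcrypto path is printed for the JVM process, two different copies of the library are loaded side by side, which would be worth mentioning in a report like this one.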