JDI: three ways to attach to a Java process

If you've looked at my recent posts, you know I'm working on a plugin for VisualVM, a very useful tool supplied with the JDK. In one example, I showed how to attach to a waiting Java application using a socket-based AttachingConnector. At that time I said that there were two primary ways of attaching to a process with JDI -- via shared memory, and with a socket.

It turns out there is a "third way". Following is an example of why this way is useful, and why it was provided.

When I last wrote JDI programs (in Java 5), I would notice that my target application would start up and print (to stdout) the port on which it was listening, as in the following:

Listening for transport dt_socket at address: 55779

In Java 5, if you detached your debugger from this process, you would get another line to stdout in the target's console, like this:

Listening for transport dt_socket at address: 55779

and this would go on for as long as you chose to attach and detach, etc.

At some point (and I don't know when this started happening), the port on which the target listens started changing each time an external debugger detached. If, in Java 6 (I'm using u20), you repeatedly attach to and detach from the target process, you'll see output like the following in the target's console:

Listening for transport dt_socket at address: 55837
ERROR: transport error 202: recv error: Connection reset by peer
Listening for transport dt_socket at address: 55844
ERROR: transport error 202: recv error: Connection reset by peer
Listening for transport dt_socket at address: 55846
ERROR: transport error 202: recv error: Connection reset by peer
Listening for transport dt_socket at address: 55911

If you're writing an application that attaches using the debug port, each time you attach you need to find out what port the target is using. This information is not available from the process itself; in other words, you have to play the usual unpleasant game of capturing console output to know what the port is. Even if you specify a port at target start, you still need to get your hands on the value.
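
To make the "capture the console output" step concrete, here is a minimal sketch that pulls the port out of lines in the format shown above. The class and method names (JdwpPortParser, parsePort, latestPort) are hypothetical and not part of the original test program, and the sketch assumes Java 8 or later and that the captured lines use exactly the "Listening for transport dt_socket at address: N" format.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the debug port from captured target console output.
class JdwpPortParser {
    // Matches the line format shown in the console output above (assumed fixed).
    private static final Pattern LISTENING =
            Pattern.compile("Listening for transport dt_socket at address: (\\d{1,5})\\b");

    /** Returns the port announced on this line, or empty if the line is not an announcement. */
    static OptionalInt parsePort(String consoleLine) {
        Matcher m = LISTENING.matcher(consoleLine);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    /** Returns the most recently announced port, since the port changes after each detach. */
    static OptionalInt latestPort(Iterable<String> consoleLines) {
        OptionalInt latest = OptionalInt.empty();
        for (String line : consoleLines) {
            OptionalInt port = parsePort(line);
            if (port.isPresent()) {
                latest = port;
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        List<String> captured = Arrays.asList(
                "Listening for transport dt_socket at address: 55837",
                "ERROR: transport error 202: recv error: Connection reset by peer",
                "Listening for transport dt_socket at address: 55844");
        OptionalInt port = latestPort(captured);
        System.out.println(port.isPresent()
                ? "Current debug port: " + port.getAsInt()
                : "No listening line found");
    }
}
```

Only the last announcement matters, because every detach makes the target listen on a new port. This is exactly the fragile bookkeeping that attaching by PID avoids.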

You can still find the original request for a feature to attach to a process by its process ID if you search around the old Java bug reports. The long and short of it: a new AttachingConnector was created, one which attaches by PID. As you know, sometimes it isn't much fun finding a process's PID either. In my case, however, I am writing a plugin for VisualVM, and one thing you get for free when you do that is VisualVM's API, which as you might expect includes calls to get the PID. My goal, then, is to use this new connector in my VisualVM plugin, and I thought it might be appreciated if I shared the details.
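
As a rough illustration of what attaching by PID looks like in JDI, the sketch below looks up the connector by name, fills in its "pid" argument, and attaches. The class PidAttachSketch and its methods are hypothetical names used only for illustration, and this is not the original test program. It assumes the JDI classes are available (tools.jar on the classpath for Java 6, as in the commands in this post), that the PID is passed as the first command-line argument, and that the target was started with the debug agent listening.

```java
import com.sun.jdi.Bootstrap;
import com.sun.jdi.VirtualMachine;
import com.sun.jdi.connect.AttachingConnector;
import com.sun.jdi.connect.Connector;
import com.sun.jdi.connect.IllegalConnectorArgumentsException;

import java.io.IOException;
import java.util.Map;

// Hypothetical sketch of a PID-based attach; names are illustrative only.
class PidAttachSketch {

    /** Returns the installed attaching connector with the given name, or null if there is none. */
    static AttachingConnector findConnector(String name) {
        for (AttachingConnector ac : Bootstrap.virtualMachineManager().attachingConnectors()) {
            if (ac.name().equals(name)) {
                return ac;
            }
        }
        return null;
    }

    /** Attaches to a JVM on this host by process ID, using the "local" transport connector. */
    static VirtualMachine attachByPid(String pid)
            throws IOException, IllegalConnectorArgumentsException {
        AttachingConnector ac = findConnector("com.sun.jdi.ProcessAttach");
        if (ac == null) {
            throw new IllegalStateException("PID-based connector is not installed in this JDK");
        }
        // Start from the defaults and set only the pid argument.
        Map<String, Connector.Argument> arguments = ac.defaultArguments();
        arguments.get("pid").setValue(pid);
        return ac.attach(arguments);
    }

    public static void main(String[] args) throws Exception {
        VirtualMachine vm = attachByPid(args[0]);
        System.out.println("Attached to process '" + vm.name() + "'");
        vm.dispose(); // detach and let the target keep running
    }
}
```

Looking the connector up by name, rather than by position in the list, avoids depending on the order in which a given JDK reports its connectors.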

I've adapted my test program from an earlier post so that it now outputs the details of each AttachingConnector; the changed code fragment is shown here:

// Needs com.sun.jdi.connect.AttachingConnector, com.sun.jdi.connect.Connector, java.util.List, java.util.Map
List<AttachingConnector> attachingConnectors = vmMgr.attachingConnectors();
for (AttachingConnector ac : attachingConnectors)
{
  // defaultArguments() maps each argument name to its Connector.Argument
  Map<String, Connector.Argument> paramsMap = ac.defaultArguments();
  System.out.println("AttachingConnector:  '" + ac.getClass().getName() + "'");
  System.out.println("  name: '" + ac.name() + "'");
  System.out.println("  description: '" + ac.description() + "'");
  System.out.println("  transport name: '" + ac.transport().name() + "'");
  System.out.println("  default arguments:");
  // Typed iteration over the keys; a raw Iterator would return Object, not String
  for (String nextKey : paramsMap.keySet())
  {
    System.out.println("    key: '" + nextKey + "'; value: '" + paramsMap.get(nextKey) + "'");
  }
}

The output from this code is shown below:

AttachingConnector:  'com.sun.tools.jdi.SocketAttachingConnector'
 name: 'com.sun.jdi.SocketAttach'
 description: 'Attaches by socket to other VMs'
 transport name: 'dt_socket'
 default arguments:
   key: 'timeout'; value: 'timeout='
   key: 'hostname'; value: 'hostname=AdamsResearch'
   key: 'port'; value: 'port='
AttachingConnector:  'com.sun.tools.jdi.SharedMemoryAttachingConnector'
 name: 'com.sun.jdi.SharedMemoryAttach'
 description: 'Attaches by shared memory to other VMs'
 transport name: 'dt_shmem'
 default arguments:
   key: 'timeout'; value: 'timeout='
   key: 'name'; value: 'name='
AttachingConnector:  'com.sun.tools.jdi.ProcessAttachingConnector'
 name: 'com.sun.jdi.ProcessAttach'
 description: 'Attaches to debuggee by process-id (pid)'
 transport name: 'local'
 default arguments:
   key: 'pid'; value: 'pid='
   key: 'timeout'; value: 'timeout='

A couple of things I hadn't noticed before are that the socket-based connector comes with the hostname argument pre-set to my machine's hostname, and that all three connectors have a timeout default argument. The first observation brings up an interesting point: if you use the local, PID-based connector, remember that you'll only be attaching to processes on your debugger's host.

I changed my test program to use the local connector and it works as before! Well, no, actually, it does not. Here's what I now get:

java.lang.UnsatisfiedLinkError: no attach in java.library.path
Exception in thread "main" java.io.IOException: no providers installed
at com.sun.tools.jdi.ProcessAttachingConnector.attach(ProcessAttachingConnector.java:86)
at com.adamsresearch.jdiDemo.JDIDemo.main(JDIDemo.java:70)

Does this mean the local connector isn't exactly ready for use? No, but I have been burned by the same issue that has plagued a number of others (scroll down in that page -- the issue was found by a reader of that post and was solved, partially, by another reader of that post). I'm working on a Windows platform, and when you do that you have to be a little careful ;-> . In this case, the problem is caused by 1) using the java interpreter as found on the system path, and 2) not making sure that path points directly to your JDK or JRE directory. The executable looks for the libraries it needs in a path relative to itself, so when Windows copies the java executable to C:\Windows\system32 (or similar) -- and you use that copy -- the relative path is broken. I believe this is the true issue, rather than the one described in the comments on the above post, which draw a distinction between using the JRE java and the JDK java. For example, below are the results of my attach test in 3 different scenarios:

Using java from my path, the first hit of which comes from C:\Windows\system32:


java -cp c:\jdk1.6.0_20\lib\tools.jar;. com.adamsresearch.jdiDemo.JDIDemo 10816 863 fileName
...
java.lang.UnsatisfiedLinkError: no attach in java.library.path
Exception in thread "main" java.io.IOException: no providers installed
 at com.sun.tools.jdi.ProcessAttachingConnector.attach(ProcessAttachingConnector.java:86)
 at com.adamsresearch.jdiDemo.JDIDemo.main(JDIDemo.java:70)

Using the full path to the JRE bin java:


c:\jdk1.6.0_20\jre\bin\java -cp c:\jdk1.6.0_20\lib\tools.jar;. com.adamsresearch.jdiDemo.JDIDemo 10816 863 fileName
...
Attached to process 'Java HotSpot(TM) 64-Bit Server VM'

Using the full path to the JDK bin java:

c:\jdk1.6.0_20\bin\java -cp c:\jdk1.6.0_20\lib\tools.jar;. com.adamsresearch.jdiDemo.JDIDemo 10816 863 fileName
...
Attached to process 'Java HotSpot(TM) 64-Bit Server VM'

As you can see, the above seems to support my theory that it's not the JRE vs the JDK, but rather the context-poor placement of the java executable in the "usual" Windows binaries directory, that caused the problem. That posting is several years old, so it is possible that at that time, the needed JDI libraries actually were not included in the JRE, but it is clear that today, you will see the same exception if you use the java executable found in Windows' default binaries directory.
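
If you want to check which installation a given java executable resolves to before trying to attach, a small diagnostic along these lines may help. It is only a sketch under stated assumptions: the class and method names (AttachLibraryCheck, findLibrary) are hypothetical, it needs Java 7 or later for java.nio.file, and it only mirrors the "no attach in java.library.path" message by searching the directories on java.library.path. The JVM may also consult other locations when loading its own native libraries, so a miss here is a hint rather than proof.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical diagnostic: reports java.home and looks for the "attach" native library.
class AttachLibraryCheck {

    /** Searches each directory in libraryPath for the platform-specific file name of libName. */
    static Optional<Path> findLibrary(String libName, String libraryPath) {
        String fileName = System.mapLibraryName(libName); // e.g. attach.dll on Windows
        for (String dir : libraryPath.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            try {
                Path candidate = Paths.get(dir, fileName);
                if (Files.isRegularFile(candidate)) {
                    return Optional.of(candidate);
                }
            } catch (InvalidPathException e) {
                // Skip entries that are not valid paths on this platform.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println("java.home         = " + System.getProperty("java.home"));
        String libraryPath = System.getProperty("java.library.path", "");
        System.out.println("java.library.path = " + libraryPath);
        Optional<Path> attach = findLibrary("attach", libraryPath);
        System.out.println(attach.isPresent()
                ? "attach library found at " + attach.get()
                : "attach library not found on java.library.path");
    }
}
```

Running this once with plain java and once with the full path to the JDK's java should show whether the two commands resolve to different installations.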

Now, if I run my JDI application against my JarView utility, searching for AttachingConnector in the JDK installation directory, I get the following output:

Breakpoint at line 863:
fileName = 'AttachingConnector.class'
Breakpoint at line 863:
fileName = 'GenericAttachingConnector$1.class'
Breakpoint at line 863:
fileName = 'GenericAttachingConnector.class'
Breakpoint at line 863:
fileName = 'ProcessAttachingConnector$1.class'
Breakpoint at line 863:
fileName = 'ProcessAttachingConnector$2.class'
Breakpoint at line 863:
fileName = 'ProcessAttachingConnector.class'
Breakpoint at line 863:
fileName = 'SharedMemoryAttachingConnector$1.class'
Breakpoint at line 863:
fileName = 'SharedMemoryAttachingConnector.class'
Breakpoint at line 863:
fileName = 'SocketAttachingConnector$1.class'
Breakpoint at line 863:
fileName = 'SocketAttachingConnector.class'

and so I have done what I set out to do, which is 1) debug-attach by process ID, and 2) thrash through the inevitable hiccups and share the solutions. Hopefully this will be useful to you, too.

Note: actually, there are even more ways to attach to a Java process. JPDA Connection and Invocation is the definitive guide, from Oracle. If you're going to be writing debuggers, you can't go wrong reading this page first.

For nearly two years, I've been trying to branch out and add another programming language to my brain. I read and blogged about Seven Languages in Seven Weeks, by Bruce Tate, an excellent book that I blasted through in seven days to save a little time. If you read my blog, you'll know that I finally settled on Haskell, started posting about my experience as an object-oriented programmer writing in a functional language, and then things kind of fizzled out.

I really like Haskell. However, I think I'm one of those people who tend to learn better when under pressure. Since I didn't have a job requirement to learn Haskell or an otherwise motivating situation, I never really quite got into it. I still plan to, some day.

But, I have finally picked the "new" language I want to learn, and that is R (I say "new" because of course R is not a new language). I had a number of reasons to do so:

Big Data is all the buzzword-rage right now, and R figures prominently in many big-data scenarios.
I'm taking MOOCs at Coursera, and the ones I'm taking use R as the programming platform, ensuring that I must have more than a superficial understanding of the language. I had actually looked at R once before and never stuck with it for the same reasons I did not stick with Haskell -- no looming deadlines!
As I learn more about R, I become more impressed by how handily it performs tasks that require a lot of boilerplate code in any other language I've used, so that experience provides me more motivation to keep learning.
I am currently working at a bank, and I'm already starting to use R not only to greatly speed up some tasks that I need to perform, but also to perform analyses that would have required so much Java code that they would have gone on the "back burner."

I'm also happy to report there has been some convergence, for me, among big data, R, Haskell and my recent exposure to functional programming. R is an interesting language. I don't have an especially formal computer-science background (instead, I'm from physics, math, and electrical engineering), so I probably would not be the best person to articulate how R checks (and does not check) boxes for functional and object-oriented languages. But all that Haskell investigation helped a lot when I started learning MapReduce, and seeing functional features in R that also fit well into the MapReduce paradigm makes me feel - as all curious types should - that all that investigation was worthwhile.

I'll still blog about Java occasionally, but my posts for the near future will be focused on my self-training to fill in gaps in my skill set related to big data. I have started a new blog on this topic, called Data Scientist in Training. If you read me on DZone, you don't have to do much to find me, as my posts from both blogs will continue to find their way to DZone (the big-data posts go to a microzone called Big Data/BI Zone). If you read me directly on Blogger, then please bookmark the link above if you're interested in what I'm doing. At the least, please check out my Welcome! post, where I explain my path and reference some resources that you may want to check out if you, too, want to learn more about big data.

My posts about R on Data Scientist in Training will not explicitly say anything in the title like "Java developer struggles with R data frames", but it will still be obvious that my approach to R is that of a developer who has used Java for about 90% of his coding for the last 15 years. If you're a Java developer and are learning R, I hope there will be some content there of special use to you. As I've searched online while learning R, I've noticed helpful responders trying to explain how to move from the "use a for-loop to iterate and then build your model in rows" approach to "use a mapping function to create your new column of data, then add it to your data frame". (In fact, this points to another feature I like about R -- R data frames remind me of tables in the column-oriented databases used extensively in big data.) I'm going to blog in near-real-time so I don't forget the dead ends I encountered as I was trying to map Java onto R, and that perspective is the one I think will be most helpful to fellow Java/OO developers.

There are a few posts on Data Scientist in Training already. The next one will be specifically about R -- I hope you check it out when it arrives!

Wayne Adams' Blog