Monday, February 16, 2015

Byteman - a swiss army knife for byte code manipulation

12:00 Monday, February 16, 2015 Posted by Markus Eisele
I am working with a bunch of communities in JBoss and there is so much interesting stuff to talk about that I can't wrap my head around every little bit myself. This is the main reason why I am very thankful to have the opportunity to welcome guest bloggers here from time to time. Today it is Jochen Mader, who is part of the nerd herd at codecentric. He currently spends his professional time coding Vert.x-based middleware solutions, writing for different publications and talking at conferences. His free time belongs to his family, mtb and tabletop gaming. You can follow him on Twitter @codepitbull.

There are tools you normally don't want to use but are happy to know about when the need arises. For me, Byteman falls into this category. It's my personal swiss army knife for dealing with a Big Ball of Mud or one of those dreaded Heisenbugs. So grab a current Byteman distribution, unzip it somewhere on your machine, and we are off to do some dirty work.

What is it
Byteman is a byte code manipulation and injection tool kit. It allows us to intercept and replace arbitrary parts of Java code to make it behave differently or break it (on purpose):
  •  get all threads stuck in a certain place and let them continue at the same time (hello race condition)
  •  throw Exceptions at unexpected locations
  •  trace through your code during execution
  •  change return values
and a lot more.

An example
Let's get right into some code to illustrate what Byteman can do for you.
Here we have a wonderful Singleton and a (sadly) good example of code you might find in many places.

public class BrokenSingleton {

    private static volatile BrokenSingleton instance;

    private BrokenSingleton() {
    }

    public static BrokenSingleton get() {
        if (instance == null) {
            instance = new BrokenSingleton();
        }
        return instance;
    }
}

Let's pretend we are the poor souls tasked with debugging some legacy code showing weird behaviour in production. After a while we discover this gem and our gut indicates something is wrong here.
At first we might try something like this:

public class BrokenSingletonMain {

    public static void main(String[] args) throws Exception {
        Thread thread1 = new Thread(new SingletonAccessRunnable());
        Thread thread2 = new Thread(new SingletonAccessRunnable());
        thread1.start();
        thread2.start();
        thread1.join();
        thread2.join();
    }

    public static class SingletonAccessRunnable implements Runnable {
        @Override
        public void run() {
            System.out.println(BrokenSingleton.get());
        }
    }
}

When we run this, there is a very small chance of seeing the actual problem. Most likely we won't notice anything unusual: the Singleton is initialized once and the application performs as expected. People often resort to brute force at this point, increasing the number of threads and hoping the problem will show itself. I prefer a more structured approach.
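For reference, the brute-force approach might look roughly like the following sketch. `BruteForceProbe` and `distinctInstances` are hypothetical names introduced here for illustration; they are not part of the original example. The helper releases many threads at once and counts how many different objects they got back. A count above 1 proves the race, while a count of 1 proves nothing, which is exactly the weakness of this approach.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.function.Supplier;

// Hypothetical helper, not part of the original post.
class BruteForceProbe {

    // Starts `threads` threads, releases them together and returns how many
    // distinct objects (compared by identity) the supplier handed out.
    static int distinctInstances(Supplier<?> supplier, int threads) {
        Set<Object> seen = Collections.synchronizedSet(
                Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>()));
        CountDownLatch startGate = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    startGate.await();      // wait for the common start signal
                    seen.add(supplier.get());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        startGate.countDown();              // let all threads run at once
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        Object shared = new Object();
        // A supplier that always returns the same object yields one instance.
        System.out.println(distinctInstances(() -> shared, 50)); // prints 1
    }
}
```

In the scenario from this post one would pass `BrokenSingleton::get` as the supplier and rerun the probe until it reports more than one instance, if it ever does.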

Enter Byteman.

The DSL
Byteman provides a convenient DSL to modify and trace application behaviour. We'll start with tracing calls in my little example. Take a look at this piece of code.

RULE trace entering
CLASS de.codepitbull.byteman.BrokenSingleton
METHOD get
AT ENTRY
IF true
DO traceln("entered get-Method")
ENDRULE

RULE trace read stacks
CLASS de.codepitbull.byteman.BrokenSingleton
METHOD get
AFTER READ BrokenSingleton.instance
IF true
DO traceln("READ:\n" + formatStack())
ENDRULE

The core building block of Byteman-scripts is the RULE.

It consists of several components (example shamelessly ripped from the Byteman docs):

 # rule skeleton
 RULE <rule name>
 CLASS <class name>
 METHOD <method name>
 BIND <bindings>
 IF <condition>
 DO <actions>
 ENDRULE


Each RULE needs a unique __rule name__. The combination of CLASS and METHOD defines where we want our modifications to apply. An optional location such as AT ENTRY or AFTER READ, as used in the rules above, narrows down the exact point inside the method. BIND allows us to bind variables to names we can use inside IF and DO. With IF we add conditions under which the rule fires. DO is where the actual magic happens.

ENDRULE, it ends the rule.

Knowing this, my first rule is easily translated to:

When somebody calls _de.codepitbull.byteman.BrokenSingleton.get()_ I want to print the String "entered get-Method" right before the method body is called (that's what __AT ENTRY__ translates to).

My second rule can be translated to:

After reading (__AFTER READ__) the instance-Member of BrokenSingleton I want to see the current call-Stack.
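To make the two translations concrete, here is a rough hand-written Java equivalent of what the rules inject. This is only a sketch: `TracedSingleton` and `currentStack` are hypothetical names, and `currentStack` merely approximates Byteman's `formatStack()` (the exact output format differs).

```java
// Hypothetical hand-written equivalent of the two tracing rules.
class TracedSingleton {

    private static volatile TracedSingleton instance;

    private TracedSingleton() {
    }

    static TracedSingleton get() {
        System.out.println("entered get-Method");        // rule "trace entering"
        TracedSingleton seen = instance;                  // first read of the field
        System.out.println("READ:\n" + currentStack());  // rule "trace read stacks"
        if (seen == null) {
            instance = new TracedSingleton();
        }
        return instance;
    }

    // Rough stand-in for formatStack(): one line per stack frame.
    static String currentStack() {
        StringBuilder sb = new StringBuilder();
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            sb.append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        get();
    }
}
```

This class exists purely for comparison. With Byteman none of these statements has to be typed into the real class, and the script to use is still the pair of rules shown earlier.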

Grab the two rules from above and put them into a file called _check.btm_. Byteman provides a nice tool to verify your scripts. Use __<bytemanhome>/bin/bmcheck.sh -cp folder/containing/compiled/classes/to/test check.btm__ to see if your script compiles. Do this EVERY time you change it: it's very easy to get a detail wrong and then spend a long time figuring out what happened.

Now that the script is saved and tested it's time to use it with our application.

The Agent
Scripts are applied to running code through an agent. Open the run-Configuration for the __BrokenSingletonMain-class__ and add

__-javaagent:<BYTEMAN_HOME>/lib/byteman.jar=script:check.btm__

to your JVM-parameters. This will register the agent and tell it to run _check.btm_.

And while we are at it here are a few more options:
If you ever need to manipulate some core java stuff use

__-javaagent:<BYTEMAN_HOME>/lib/byteman.jar=script:appmain.btm,boot:<BYTEMAN_HOME>/lib/byteman.jar__

This will add Byteman to the boot classpath and allow us to manipulate classes like _Thread_, _String_ ... I mean, if you ever wanted to do such nasty things ...

It's also possible to attach the agent to a running process. Use __jps__ to find the process id you want to attach to and run

__<bytemanhome>/bin/bminstall.sh <pid>__

to install the agent. Afterwards, submit the script with

__<bytemanhome>/bin/bmsubmit.sh check.btm__

Back to our problem at hand.

Running our application with the modified run-Configuration should result in output like this

entered get-Method
entered get-Method
READ:
Stack trace for thread Thread-0
de.codepitbull.byteman.BrokenSingleton.get(BrokenSingleton.java:14)
de.codepitbull.byteman.BrokenSingletonMain$SingletonAccessRunnable.run(BrokenSingletonMain.java:20)
java.lang.Thread.run(Thread.java:745)

READ:
Stack trace for thread Thread-1
de.codepitbull.byteman.BrokenSingleton.get(BrokenSingleton.java:14)
de.codepitbull.byteman.BrokenSingletonMain$SingletonAccessRunnable.run(BrokenSingletonMain.java:20)
java.lang.Thread.run(Thread.java:745)


Congratulations, you just manipulated byte code. The output isn't very helpful yet, but that's something we are going to change.

Messing with threads
With our infrastructure now set up we can start digging deeper. We are quite sure about our problem being related to some multithreading issue. To test our hypothesis we have to get multiple threads into our critical section at the same time. This is close to impossible using pure Java, at least without applying extensive modifications to the code we want to debug.

Using Byteman this is easily achieved.

RULE define rendezvous
CLASS de.codepitbull.byteman.BrokenSingleton
METHOD get
AT ENTRY
IF NOT isRendezvous("rendezvous", 2)
DO createRendezvous("rendezvous", 2, true);
traceln("rendezvous created");
ENDRULE

This rule defines a so-called rendezvous. It allows us to specify a place where multiple threads have to arrive before they are allowed to proceed (also known as a barrier).

And here is the translation for the rule:

When calling _BrokenSingleton.get()_ create a new rendezvous that will allow progress when 2 threads arrive. Make the rendezvous reusable and create it only if it doesn't exist (the IF NOT part is critical as otherwise we would create a barrier on each call to _BrokenSingleton.get()_).

After defining this barrier we still need to explicitly use it.

RULE catch threads
CLASS de.codepitbull.byteman.BrokenSingleton
METHOD get
AFTER READ BrokenSingleton.instance
IF isRendezvous("rendezvous", 2)
DO rendezvous("rendezvous");
ENDRULE

Translation: After reading the _instance_-member inside _BrokenSingleton.get()_ wait at the rendezvous until a second thread arrives and continue together.

We now stop both threads from _BrokenSingletonMain_ in the same place, right after _instance_ has been read for the null check. That's how to make a race condition reproducible. Both threads will continue thinking _instance_ is null, causing the constructor to fire twice.
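For comparison, here is roughly what the same experiment costs in plain Java, assuming we are allowed to edit a copy of the class. This is a sketch, not what Byteman generates: `InstrumentedSingleton` and `RendezvousDemo` are hypothetical names, a `CyclicBarrier` plays the role of the rendezvous, and a counter in the constructor makes the double initialization visible.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, hand-instrumented copy of BrokenSingleton.
class InstrumentedSingleton {

    static final AtomicInteger CONSTRUCTOR_CALLS = new AtomicInteger();

    // Two parties and reusable, similar to createRendezvous("rendezvous", 2, true).
    private static final CyclicBarrier BARRIER = new CyclicBarrier(2);

    private static volatile InstrumentedSingleton instance;

    private InstrumentedSingleton() {
        CONSTRUCTOR_CALLS.incrementAndGet();
    }

    static InstrumentedSingleton get() {
        InstrumentedSingleton seen = instance;   // read the field ...
        try {
            BARRIER.await();                     // ... then wait for the second thread
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
        if (seen == null) {                      // both threads saw null
            instance = new InstrumentedSingleton();
        }
        return instance;
    }
}

class RendezvousDemo {

    // Runs two threads through get() and reports how often the constructor ran.
    static int constructorCallsAfterRace() {
        Thread first = new Thread(InstrumentedSingleton::get);
        Thread second = new Thread(InstrumentedSingleton::get);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return InstrumentedSingleton.CONSTRUCTOR_CALLS.get();
    }

    public static void main(String[] args) {
        System.out.println(constructorCallsAfterRace()); // prints 2
    }
}
```

Note the price: the class under test had to be rewritten, and every call to `get()` now blocks until a partner thread shows up. The Byteman rules achieve the same effect without touching the source.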

I leave the solution to this problem to you ...
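If you want to compare notes afterwards, one common fix (among several, and not necessarily the one the original code base should use) is the initialization-on-demand holder idiom. The sketch below uses the hypothetical name `FixedSingleton`. It relies on the JVM guarantee that a class is initialized exactly once, so no explicit locking is needed.

```java
// Hypothetical fixed variant; the name FixedSingleton is not from the original code.
final class FixedSingleton {

    private FixedSingleton() {
    }

    // The JVM initializes Holder once, on first access, in a thread-safe way.
    private static final class Holder {
        static final FixedSingleton INSTANCE = new FixedSingleton();
    }

    static FixedSingleton get() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) throws InterruptedException {
        FixedSingleton[] results = new FixedSingleton[2];
        Thread first = new Thread(() -> results[0] = get());
        Thread second = new Thread(() -> results[1] = get());
        first.start();
        second.start();
        first.join();
        second.join();
        System.out.println(results[0] == results[1]); // prints true
    }
}
```

Since there is no longer a read of _instance_ followed by a conditional write, the rendezvous rules have nothing to exploit in this variant.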

Unit tests
Something I discovered while writing this blog post is the possibility to run Byteman scripts as part of my unit tests. The JUnit and TestNG integration is easy to set up.

Add the following dependency to your _pom.xml_

<dependency>
    <groupId>org.jboss.byteman</groupId>
    <artifactId>byteman-bmunit</artifactId>
    <scope>test</scope>
    <version>${byteman.version}</version>
</dependency>

Now Byteman-scripts can be executed inside your Unit-Tests like this:

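import org.jboss.byteman.contrib.bmunit.BMScript;
import org.jboss.byteman.contrib.bmunit.BMUnitRunner;
import org.junit.Test;
import org.junit.runner.RunWith;

// BMUnitRunner loads the Byteman agent and applies the script for the annotated test
@RunWith(BMUnitRunner.class)
public class BrokenSingletonTest
{
  @Test
  @BMScript("check.btm")
  public void testForRaceCondition() {
    ...
  }
}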

Adding such tests to your suites increases the usefulness of Byteman quite a bit. There's no better way to prevent others from repeating your mistakes than making these scripts part of the build process.

Closing words
There is only so much room in a blog post and I also don't want to start rewriting their documentation. It was a funny thing writing this post as I hadn't used Byteman for quite a while. I don't know how I managed to overlook the unit test integration. That will make me use it a lot more in the future.
And now I suggest you browse their documentation and start injecting; there's a lot to play around with.