Monitoring Tomcat with Coda Hale Metrics and Graphite

by Florian Semrau
in Tutorials

This article describes how to monitor Tomcat with Coda Hale Metrics and Graphite. In our example we were interested in the number of requests per unit of time on certain URLs.

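To use Coda Hale Metrics we added these dependencies to our Maven build, with the property ${metrics.version} set to 3.0.1. The metrics-graphite artifact provides the Graphite and GraphiteReporter classes used below:

<dependencies>
    <dependency>
        <groupId>com.codahale.metrics</groupId>
        <artifactId>metrics-core</artifactId>
        <version>${metrics.version}</version>
    </dependency>
    <dependency>
        <groupId>com.codahale.metrics</groupId>
        <artifactId>metrics-graphite</artifactId>
        <version>${metrics.version}</version>
    </dependency>
</dependencies>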

To intercept the requests that reach Tomcat we use a servlet filter, declared in web.xml. Its init parameters also carry the Graphite server configuration.

<filter>
    <filter-name>request-metrics</filter-name>
    <filter-class>com.ebayk.urimetrics.RequestMetricsFilter</filter-class>
    <init-param>
        <param-name>graphite.server</param-name>
        <param-value>your.graphite.host</param-value>
    </init-param>
    <init-param>
        <param-name>graphite.port</param-name>
        <param-value>1234</param-value>
    </init-param>
</filter>
...
<filter-mapping>
    <filter-name>request-metrics</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

First we need a class that initializes the communication with Graphite. We chose the singleton pattern for this job. Have a look at the report method, which sets up the Graphite reporter. We configured the reporter to send our metrics to Graphite once every minute and adjusted the reporting time units to our needs.

package ...;
import com.codahale.metrics.*;
import com.codahale.metrics.graphite.Graphite;
import com.codahale.metrics.graphite.GraphiteReporter;
import java.net.InetSocketAddress;
import java.util.concurrent.TimeUnit;
enum Metrics {
    INSTANCE;
    private final MetricRegistry metricsRegistry;
    private GraphiteReporter reporter;
    private static final String GRAPHITE_PREFIX = "graphite.metrics.prefix";
    private Metrics() {
        metricsRegistry = new MetricRegistry();
    }
    public static Timer newTimer(String timerName) {
        return INSTANCE.metricsRegistry.timer(timerName);
    }
    static void report(String server, int port) {
        Graphite graphite = new Graphite(new InetSocketAddress(server, port));
        INSTANCE.reporter = GraphiteReporter.forRegistry(
                 INSTANCE.metricsRegistry)
                .prefixedWith(GRAPHITE_PREFIX)
                .convertRatesTo(TimeUnit.SECONDS)
                .convertDurationsTo(TimeUnit.MILLISECONDS)
                .filter(MetricFilter.ALL)
                .build(graphite);
        INSTANCE.reporter.start(1, TimeUnit.MINUTES);
    }
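    static void stop() {
        // report() may never have run, e.g. if filter initialization failed
        if (INSTANCE.reporter != null) {
            INSTANCE.reporter.stop();
        }
    }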
}

The servlet filter class takes care of counting and timing requests; this happens in its doFilter method. The recorded values are then sent to Graphite by the reporter. The class looks like this:

package ...;
import com.codahale.metrics.Timer;
import com.google.common.base.Optional;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import java.io.IOException;
public class RequestMetricsFilter implements Filter {
    public static final String GRAPHITE_HOST = "your.graphite.host";
    public static final String GRAPHITE_PORT = "1234";
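    private final UriCounter uriCounter;
    // The servlet container creates filters through a public no-arg constructor.
    public RequestMetricsFilter() {
        this(new UriCounter());
    }
    // Package-private constructor for injecting a UriCounter, e.g. in tests.
    RequestMetricsFilter(UriCounter uriCounter) {
        this.uriCounter = uriCounter;
    }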
    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        String server = Optional.fromNullable(
           filterConfig.getInitParameter("graphite.server")).or(GRAPHITE_HOST);
        String port = Optional.fromNullable(
           filterConfig.getInitParameter("graphite.port")).or(GRAPHITE_PORT);
        Metrics.report(server, Integer.parseInt(port));
    }
    @Override
    public void destroy() {
        Metrics.stop();
    }
    @Override
    public void doFilter(ServletRequest request, 
        ServletResponse response, FilterChain chain) 
        throws IOException, ServletException {
        HttpServletRequest httpRequest = (HttpServletRequest) request;
        Timer.Context ctx = 
            uriCounter.count(httpRequest.getRequestURI()).time();
        try {
            chain.doFilter(request, response);
        } finally {
            ctx.stop();
        }
    }
}
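
The fallback to default values in init can also be written without Guava. The following is only a sketch: the class GraphiteParams and its methods are hypothetical names that are not part of our filter, and the port range check is an addition we assume to be useful, not something the filter above does.

```java
import java.util.Optional;

/** Hypothetical helper, not part of the original filter. */
public final class GraphiteParams {
    // Placeholder defaults, same values as the constants in the filter above.
    public static final String DEFAULT_HOST = "your.graphite.host";
    public static final String DEFAULT_PORT = "1234";

    private GraphiteParams() {
    }

    /** Returns the configured value, or the fallback if the init-param is absent. */
    public static String valueOrDefault(String configured, String fallback) {
        return Optional.ofNullable(configured).orElse(fallback);
    }

    /** Parses a port number and rejects values outside 1..65535. */
    public static int parsePort(String raw) {
        // Integer.parseInt throws NumberFormatException for non-numeric input.
        int port = Integer.parseInt(raw.trim());
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("graphite.port out of range: " + port);
        }
        return port;
    }

    public static void main(String[] args) {
        // A missing init-param (null) falls back to the default.
        System.out.println(valueOrDefault(null, DEFAULT_HOST));
        System.out.println(parsePort(valueOrDefault("2003", DEFAULT_PORT)));
    }
}
```

Failing early on a bad port makes a typo in web.xml show up when the web app starts rather than as silently missing metrics.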

The UriCounter class keeps track of all timers for the URIs that we want to monitor. The class has a Map called CONFIG that maps each rule, stored as a pre-compiled regular expression pattern (for improved performance), to a Coda Hale timer. This is where you can add your own patterns that you would like to monitor.

We also added a timer called catchall to count all requests that do not match any patterns defined in the map. 

package ...;

import com.codahale.metrics.Timer;
import com.google.common.collect.ImmutableMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import static java.lang.String.format;
import static java.util.regex.Pattern.compile;

class UriCounter {
    private static final String GRAPHITE_TIMER_PATH = "tomcat.timer.%s";
    private final Timer catchAllTimer;
    private final Map<Pattern, Timer> pathCounterMap;

    private static final Map<Pattern, Timer> CONFIG =
            ImmutableMap.<Pattern, Timer>builder()
                    // add more counter rules here
                    .put(compile("/anzeigen/m-.*"), timer("my-"))
                    .build();

    UriCounter() {
        this(CONFIG, timer("catchall"));
    }

    UriCounter(Map<Pattern, Timer> pathCounterMap, Timer catchAllTimer) {
        this.pathCounterMap = pathCounterMap;
        this.catchAllTimer = catchAllTimer;
    }

    public Timer count(String uri) {
        for (Map.Entry<Pattern, Timer> entry : pathCounterMap.entrySet()) {
            Pattern pattern = entry.getKey();
            Matcher matcher = pattern.matcher(uri);
            if (matcher.matches()) {
                return entry.getValue();
            }
        }

        return catchAllTimer;
    }

    private static Timer timer(String name) {
        return Metrics.newTimer(format(GRAPHITE_TIMER_PATH, name));
    }
}
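
To try the matching rules without a running Tomcat, the lookup can be reproduced with plain JDK classes. This is a minimal sketch: UriMatchSketch and timerNameFor are hypothetical names, and the sketch returns timer names as strings instead of Coda Hale timers. It shows that Matcher.matches() only succeeds if the whole URI matches the pattern, and that everything else ends up in catchall.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical stand-in for UriCounter that maps URIs to timer names. */
public final class UriMatchSketch {
    // Insertion-ordered map, so rules are checked in the order they were added.
    private static final Map<Pattern, String> RULES = new LinkedHashMap<>();

    static {
        RULES.put(Pattern.compile("/anzeigen/m-.*"), "my-");
    }

    private UriMatchSketch() {
    }

    /** Returns the name of the first rule whose pattern matches the whole URI. */
    public static String timerNameFor(String uri) {
        for (Map.Entry<Pattern, String> rule : RULES.entrySet()) {
            if (rule.getKey().matcher(uri).matches()) {
                return rule.getValue();
            }
        }
        return "catchall";
    }

    public static void main(String[] args) {
        System.out.println(timerNameFor("/anzeigen/m-123"));      // my-
        System.out.println(timerNameFor("/shop/anzeigen/m-123")); // catchall
        System.out.println(timerNameFor("/"));                    // catchall
    }
}
```

Because getRequestURI() includes the context path, a web app that is not deployed at the root context would need that path in its patterns, as the second call suggests.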

That's it! Start your web app and click around to generate some metrics. Then open the Graphite web UI and navigate to the tomcat.timer node, which comes from the constant GRAPHITE_TIMER_PATH and sits below the prefix configured in GRAPHITE_PREFIX.
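
To make the nodes easier to find, the sketch below lists the node names we would expect for one timer. It is an illustration under assumptions: GraphiteNodeNames is a hypothetical class, and the list of suffixes reflects what the GraphiteReporter of Metrics 3.0.x writes for a timer as far as we know, so check it against your own Graphite tree.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the expected Graphite nodes of one timer. */
public final class GraphiteNodeNames {
    // Assumed set of values reported per timer (Metrics 3.0.x GraphiteReporter).
    private static final String[] TIMER_SUFFIXES = {
        "count", "max", "mean", "min", "stddev",
        "p50", "p75", "p95", "p98", "p99", "p999",
        "m1_rate", "m5_rate", "m15_rate", "mean_rate"
    };

    private GraphiteNodeNames() {
    }

    /** Joins prefix, timer name and each suffix with dots, Graphite's node separator. */
    public static List<String> nodesFor(String prefix, String timerName) {
        List<String> nodes = new ArrayList<>();
        for (String suffix : TIMER_SUFFIXES) {
            nodes.add(prefix + "." + timerName + "." + suffix);
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Prefix and timer name as used in the classes above.
        for (String node : nodesFor("graphite.metrics.prefix", "tomcat.timer.catchall")) {
            System.out.println(node);
        }
    }
}
```

For requests per unit of time, the m1_rate node is usually the most interesting one; with convertRatesTo(TimeUnit.SECONDS) it is expressed in requests per second.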

Note: As written, this code only works when a single instance reports to Graphite, that is, in an environment that does not rely on load balancers. If you're working with a load balancer, make sure to add some logic to the private static timer method of UriCounter, which initializes the Coda Hale timer. We usually insert the host name into the Graphite node string.

If the code above is run without adjusting the method that initializes the timer, Graphite will show strange behaviour. This is because all your instances send different data to one Graphite node, and Graphite cannot tell them apart. In this case, Graphite simply calculates an average from these values and persists it, which falsifies your metrics. A typical symptom is a spike in your graph that disappears later on. Make sure to have a look at your Graphite node configuration in that case.
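
One way to put the host name into the node string, as suggested in the note above, is sketched here. The class HostAwareTimerPath, the path layout tomcat.timer.<host>.<name> and the fallback name unknown-host are assumptions made for this example; adapt them to your own naming scheme.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper that builds one Graphite timer path per host. */
public final class HostAwareTimerPath {
    // Assumed layout: tomcat.timer.<host>.<timer name>
    private static final String GRAPHITE_TIMER_PATH = "tomcat.timer.%s.%s";

    private HostAwareTimerPath() {
    }

    /** Graphite uses '.' as node separator, so dots in the host name are replaced. */
    public static String sanitize(String hostName) {
        return hostName.replace('.', '_');
    }

    /** Builds the timer path for the given host and timer name. */
    public static String timerPath(String hostName, String timerName) {
        return String.format(GRAPHITE_TIMER_PATH, sanitize(hostName), timerName);
    }

    /** Resolves the local host name, with a fixed fallback if resolution fails. */
    public static String localHostName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown-host"; // hypothetical fallback value
        }
    }

    public static void main(String[] args) {
        System.out.println(timerPath(localHostName(), "catchall"));
        System.out.println(timerPath("web01.example.com", "catchall"));
    }
}
```

In UriCounter, the timer method could then pass timerPath(localHostName(), name) to Metrics.newTimer, so that every instance behind the load balancer writes to its own node.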

siteops, java, tomcat, codahale, graphite
