The Java Virtual Machine (JVM) is itself a very large program that approaches an operating system in its complexity. While it is usually quite stable, like any program it occasionally crashes or freezes. A frozen Java application can be difficult to detect and recover from: because the Java process still exists in memory, many external monitoring applications will think it is alive even though it has stopped responding.

When a JVM freeze occurs, it is usually noticed because users or other systems fail to interact with the application. Once notified of the problem, often via a complaint from a user, a system administrator must connect to the server, kill the frozen JVM, and restart the application. All of this can take a couple of hours if the person who needs to do the work is out at dinner.

Solution

The solution to this problem is to have a monitoring application, with an intimate knowledge of how Java works, which can be trusted to monitor your Java application 24 hours a day, 365 days a year, and automatically take care of problems when they arise.

The Java Service Wrapper contains several monitoring features, one of which is the ability to detect when a JVM has frozen. The Wrapper constantly monitors the JVM and contains advanced logic to decide whether or not it is frozen. When a frozen JVM is detected, the Wrapper automatically restarts it to get the system back up and running with a minimum of delay. The Wrapper can then send an email notification to the system administrators in case they wish to double-check that everything is back up and working normally.

The Java Service Wrapper periodically pings the JVM process (every 5 seconds by default) and waits for a response. If no response arrives within a configured period of time (30 seconds by default), the Wrapper concludes that the JVM is frozen. The Wrapper also takes into account factors such as overall system load to keep false positives to a minimum.
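
The ping-and-timeout idea described above can be illustrated with a small toy model. This is only a sketch of the general technique, not the Wrapper's actual implementation: the class name `FreezeDetector` and its methods are hypothetical, time is passed in explicitly so the logic is easy to follow, and the system-load adjustments the Wrapper makes are not modeled.

```java
import java.time.Duration;

/** Toy model of ping-based freeze detection (hypothetical, not Wrapper code). */
class FreezeDetector {
    private final long timeoutMillis;
    private long lastResponseMillis;

    FreezeDetector(Duration timeout, long startMillis) {
        this.timeoutMillis = timeout.toMillis();
        this.lastResponseMillis = startMillis;
    }

    /** Record that the monitored process answered a ping at the given time. */
    void onPingResponse(long nowMillis) {
        lastResponseMillis = nowMillis;
    }

    /** True once no response has been seen for at least the timeout. */
    boolean isFrozen(long nowMillis) {
        return nowMillis - lastResponseMillis >= timeoutMillis;
    }

    public static void main(String[] args) {
        // 30 second timeout, checked every 5 seconds (the defaults named in the text).
        FreezeDetector detector = new FreezeDetector(Duration.ofSeconds(30), 0L);
        for (long t = 5_000; t <= 60_000; t += 5_000) {
            if (t <= 10_000) {
                detector.onPingResponse(t); // the process answers the first two pings only
            }
            System.out.printf("t=%ds frozen=%b%n", t / 1000, detector.isFrozen(t));
        }
    }
}
```

In this model the last response arrives at 10 seconds, so the detector first reports a freeze at 40 seconds. A longer timeout delays detection by the same amount.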

When a freeze is detected, the Wrapper will record something like the following in the log file:

Log Example: Freeze and Restart
jvm 1    | ...
wrapper  | JVM appears hung: Timed out waiting for signal from JVM.
wrapper  | JVM did not exit on request, terminated
wrapper  | Launching a JVM...
jvm 2    | WrapperManager: Initializing...
jvm 2    | ...

The Wrapper helps you detect critical problems including:

Technical Overview

Freeze detection is enabled by default. The Wrapper pings the JVM every 5 seconds and times out after 30 seconds without a response. Both of these intervals are configurable (see the wrapper.ping.interval and wrapper.ping.timeout properties). If you set the wrapper.debug=TRUE property, you will be able to see the pings in the log file. In the following example, the log format has also been set to include timestamps.

Log Example of Pings:
INFO   | jvm 1    | 2012/05/09 18:28:11 | ...
DEBUG  | wrapperp | 2012/05/09 18:28:13 | send a packet PING : ping
INFO   | jvm 1    | 2012/05/09 18:28:13 | WrapperManager Debug: Received a packet PING : ping
INFO   | jvm 1    | 2012/05/09 18:28:13 | WrapperManager Debug: Send a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 18:28:13 | read a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 18:28:18 | send a packet PING : ping
INFO   | jvm 1    | 2012/05/09 18:28:18 | WrapperManager Debug: Received a packet PING : ping
INFO   | jvm 1    | 2012/05/09 18:28:18 | WrapperManager Debug: Send a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 18:28:18 | read a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 18:28:21 |...

Freeze Detection

Compatibility: 1.0.0
Editions: Professional Edition, Standard Edition, Community Edition
Platforms: Windows, Mac OSX, Linux, IBM AIX, FreeBSD, HP-UX, Solaris, IBM z/OS, IBM z/Linux

The pinging of the JVM is a very lightweight feature, and it is recommended to monitor the JVM with the default 5 second interval. Depending on your application, or in cases when the system is under very high external loads, it may be necessary to lengthen the ping timeout. Be aware that doing so will increase the amount of time that it takes the Wrapper to decide that the JVM is frozen.

Notice how this example shows that when a JVM freezes up, it will stop responding to all pings and the Wrapper will initiate a restart:

Log Example: Freeze and Restart.
DEBUG  | wrapperp | 2012/05/09 19:20:22 | send a packet PING : ping
INFO   | jvm 1    | 2012/05/09 19:20:22 | WrapperManager Debug: Received a packet PING : ping
INFO   | jvm 1    | 2012/05/09 19:20:22 | WrapperManager Debug: Send a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 19:20:22 | read a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 19:20:27 | send a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 19:20:31 | send a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 19:20:36 | send a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 19:20:40 | send a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 19:20:44 | send a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 19:20:49 | send a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 19:20:53 | send a packet PING : ping
DEBUG  | wrapperp | 2012/05/09 19:20:58 | send a packet PING : ping
ERROR  | wrapper  | 2012/05/09 19:20:59 | JVM appears hung: Timed out waiting for signal from JVM.
ERROR  | wrapper  | 2012/05/09 19:21:00 | JVM did not exit on request, terminated
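
Log lines in the format shown above are pipe-delimited (level, source, timestamp, message), which makes them easy to scan for freeze events after the fact. The following is a minimal sketch under that assumption; the class `FreezeLogScanner` is hypothetical and not part of the Wrapper, and it would need adjusting if your log format is configured differently.

```java
import java.util.List;

/** Counts "JVM appears hung" entries in four-column Wrapper log lines (hypothetical helper). */
class FreezeLogScanner {
    static final String HUNG_MARKER = "JVM appears hung";

    static long countFreezes(List<String> lines) {
        long count = 0;
        for (String line : lines) {
            // Split into at most 4 fields so any '|' inside the message is preserved.
            String[] fields = line.split("\\|", 4);
            if (fields.length < 4) {
                continue; // not in the level | source | timestamp | message layout
            }
            String source = fields[1].trim();
            String message = fields[3].trim();
            if (source.equals("wrapper") && message.startsWith(HUNG_MARKER)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "DEBUG  | wrapperp | 2012/05/09 19:20:58 | send a packet PING : ping",
            "ERROR  | wrapper  | 2012/05/09 19:20:59 | JVM appears hung: Timed out waiting for signal from JVM.",
            "ERROR  | wrapper  | 2012/05/09 19:21:00 | JVM did not exit on request, terminated");
        System.out.println("Freeze events found: " + countFreezes(sample));
    }
}
```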

Trying it out

Hopefully you will have a hard time reproducing a JVM freeze in your own Java application. To make it easy to test this out, the Wrapper ships with a TestWrapper Example Application which can be used to try out this and many other features.

To launch the TestWrapper Application, open a console in the directory where you downloaded and expanded the Java Service Wrapper distribution. On Windows, run bin\TestWrapper.bat; on UNIX platforms, run bin/testwrapper console.

When the TestWrapper Example Application starts up, you should see a simple GUI screen with several buttons down the left side. Clicking on the Simulate JVM Hang button will cause the JVM to appear frozen to the Wrapper.

Log Example: Freeze and Restart.
jvm 1    | TestWrapper: Showing dialog...
jvm 1    | WrapperManager: WARNING: Making JVM appear to be hung...
wrapper  | JVM appears hung: Timed out waiting for signal from JVM.
wrapper  | JVM did not exit on request, terminated
wrapper  | Launching a JVM...
jvm 2    | WrapperManager: Initializing...
jvm 2    | TestWrapper: Initializing...

If you wish to test this out in your own Java application, you can call WrapperManager.appearHung() to enter this test mode.
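
As a rough illustration, the sketch below triggers the test mode only when explicitly asked to. It looks up `org.tanukisoftware.wrapper.WrapperManager` by reflection so that it also compiles and runs without wrapper.jar on the classpath; in an application that already depends on wrapper.jar, a direct call to `WrapperManager.appearHung()` is simpler. The class name `HangSimulator` and the `hang` command-line argument are hypothetical.

```java
import java.lang.reflect.Method;

/** Calls WrapperManager.appearHung() when available (hypothetical test helper). */
class HangSimulator {
    static final String WRAPPER_MANAGER = "org.tanukisoftware.wrapper.WrapperManager";

    /** Returns true if appearHung() was invoked, false if the Wrapper classes are unavailable. */
    static boolean simulateHang() {
        try {
            Class<?> manager = Class.forName(WRAPPER_MANAGER);
            Method appearHung = manager.getMethod("appearHung");
            appearHung.invoke(null); // static method, so no receiver instance is needed
            return true;
        } catch (ReflectiveOperationException | LinkageError e) {
            // wrapper.jar is not on the classpath, or the class could not be initialized.
            return false;
        }
    }

    public static void main(String[] args) {
        // Only simulate a hang when asked to, so normal runs are unaffected.
        if (args.length > 0 && args[0].equals("hang")) {
            boolean invoked = simulateHang();
            System.out.println(invoked
                ? "Requested simulated hang."
                : "WrapperManager not available; nothing to do.");
        } else {
            System.out.println("Pass 'hang' as the first argument to simulate a JVM hang.");
        }
    }
}
```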

Responding to Events

Compatibility: 3.3.0
Editions: Professional Edition, Standard Edition (Not Supported), Community Edition (Not Supported)
Platforms: Windows, Mac OSX, Linux, IBM AIX, FreeBSD, HP-UX, Solaris, IBM z/OS, IBM z/Linux

Sometimes when an application freezes or crashes, it is not always enough simply to restart it. If temporary files were not properly cleaned up, or the database was left in an unstable state, the new application instance may quickly fail with errors. Ideally the application should be designed to be robust enough to handle these failures, but this is not always possible. In such cases we need a way to do some cleanup before launching a new JVM.

The Wrapper supports this through the use of Event Commands. It is possible to tell the Wrapper to run an external command or script at various times.

If you look at the debug log output, you will see occasional Enqueue Event 'NNN' entries. These show the timing of when the Wrapper's events are fired. In the case of restarts after a frozen JVM was detected, we want to launch our cleanup script before the new JVM instance is launched. The jvm_restart event will be used.

Please create a simple batch file as follows and place it into the same directory as the Wrapper binary. This example simply simulates a 10 second cleanup task. A real script would be more interesting.

Sample script: "cleanupTest.bat"
@echo off
echo Cleaning up for 10 seconds...
rem Use ping to cause a delay of about 10 seconds: 11 echo requests to the
rem loopback address are sent roughly one second apart.
ping 127.0.0.1 -n 11 >NUL
echo All done cleaning up.

Next, add the following properties to your wrapper.conf file:

Configuration file "wrapper.conf"
wrapper.event.jvm_restart.command.argv.1=cleanupTest.bat
wrapper.event.jvm_restart.command.block=TRUE
wrapper.event.jvm_restart.command.loglevel=INFO

Now when a freeze is detected, the Wrapper will run the configured batch file and block until it has completed:

Log Output Example
wrapper  | JVM appears hung: Timed out waiting for signal from JVM.
wrapper  | JVM did not exit on request, terminated
wrapper  | Event Command 'jvm_restart': Command line: cleanupTest.bat
wrapper  | Event Command 'jvm_restart': Command launched (pid: 1188), blocking for up to 15 seconds...
Cleaning up for 10 seconds...
All done cleaning up.
wrapper  | Event Command 'jvm_restart': Command completed with exit code: 0
wrapper  | Event Command 'jvm_restart': Command completed with exit code: 0  Continuing.
wrapper  | Launching a JVM...
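
The sample batch file only simulates a cleanup task. A real cleanup step might, for example, remove stale temporary files left behind by the frozen JVM. The following Java program is a minimal sketch of that idea, which a script like the one above could launch; the class name `StaleFileCleanup`, the `.tmp` suffix, and the working directory argument are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Removes leftover *.tmp files from a working directory (hypothetical cleanup task). */
class StaleFileCleanup {
    /** Deletes regular files ending in ".tmp" directly inside dir and returns how many were removed. */
    static int deleteTempFiles(Path dir) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.tmp")) {
            for (Path file : stream) {
                if (Files.isRegularFile(file) && Files.deleteIfExists(file)) {
                    deleted++;
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: StaleFileCleanup <work-dir>");
            System.exit(2);
        }
        int removed = deleteTempFiles(Paths.get(args[0]));
        System.out.println("Removed " + removed + " stale temporary file(s).");
    }
}
```

Because the event command in the example above is configured to block, the cleanup should finish quickly so that the new JVM is not delayed longer than necessary.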

Email Notification

Compatibility: 3.3.0
Editions: Professional Edition, Standard Edition (Not Supported), Community Edition (Not Supported)
Platforms: Windows, Mac OSX, Linux, IBM AIX, FreeBSD, HP-UX, Solaris, IBM z/OS, IBM z/Linux

In addition to the ability to launch external commands in response to a freeze event, it is also possible to configure the Wrapper to send out a notification email. The email can take the form of a simple notice, or optionally include the most recent log output as an attachment.

An email notification of the jvm_killed event can be configured as follows:

Configuration file "wrapper.conf"
wrapper.event.default.email.smtp.host=mail.example.com
wrapper.event.default.email.sender=wrapper-app@example.com
wrapper.event.default.email.recipient=sysadmin@example.com
wrapper.event.jvm_killed.email=TRUE
wrapper.event.jvm_killed.email.subject=Oh no! The application was killed.
wrapper.event.jvm_killed.email.maillog=ATTACHMENT

This will cause an email like the following to be sent whenever the JVM is killed.

jvm_killed email
Subject: Oh no! The application was killed.
To: sysadmin@example.com
From: wrapper-app@example.com

Java Service Wrapper Event Notification

Host: myserver
App Name: testwrapper
         TestWrapper Example Application

Event: jvm_killed
--

STATUS | wrapper  | 2012/05/15 18:23:42 | ...
INFO   | jvm 1    | 2012/05/15 18:23:47 | WrapperManager: WARNING: Making JVM appear to be hung...
ERROR  | wrapper  | 2012/05/15 18:24:24 | JVM appears hung: Timed out waiting for signal from JVM.
ERROR  | wrapper  | 2012/05/15 18:24:25 | JVM did not exit on request, terminated

It is also nice to receive good news. The following configuration tells the Wrapper to send a second email whenever the JVM has been launched and your application is back up and running. This pair of emails can be very useful in keeping you informed of the state of your server, so you can decide whether or not it's OK to relax and enjoy the rest of dinner.

Configuration file "wrapper.conf"
wrapper.event.jvm_started.email=TRUE
wrapper.event.jvm_started.email.subject=Good news!  The application is back up.
wrapper.event.jvm_started.email.maillog=ATTACHMENT

The email and event systems of the Wrapper are very powerful and configurable. Please see the Event Overview page for more details.

Reference: Freezes

The Java Service Wrapper provides a full set of configuration properties that allows you to make the Wrapper meet your exact needs. Please take a look at the documentation for the individual properties to see all of the possibilities beyond the examples shown above.