The Java Service Wrapper does a very good job of monitoring and recovering from errors in the Java process.
While the Wrapper itself is very stable, it is good practice to avoid any single point of failure.
The Wrapper makes it possible to use the Windows Service Control Manager's ability to automatically restart a service
which has crashed or terminated in an error state.
The Windows Service Control Manager maintains a failure count for each service installed.
Every time the service terminates unexpectedly, the counter will be incremented
and the registered failure action for that count will be triggered.
This property configures the length of time, in seconds, that must pass without a failure
before the Service Control Manager resets this counter back to "0" (zero).
The default value for this property is 3600 (= 1 hour).
A value of "-1" means that the Service Control Manager will never reset the counter.
Example, setting the reset time to 86400 seconds (=1 day):
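A minimal sketch of that setting, assuming the property is named wrapper.ntservice.recovery.reset (the name is not spelled out in this section, so treat it as an assumption and check it against your Wrapper version):

```properties
# Reset the failure count after 86400 seconds (1 day) without a failure.
# Property name is assumed, following the wrapper.ntservice.recovery.* pattern.
wrapper.ntservice.recovery.reset=86400
```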
To avoid confusion, please note that the Recovery Tab on the Properties dialog of the Services Control Panel
displays the reset time in days rather than seconds.
This means that the displayed value is rounded down to whole days.
For instance, a value of 3600 seconds (1 hour) will be displayed as "0" (zero) days.
The Service Control Manager can also execute an external command in response to a failure.
This can be used, for example, to perform cleanup or to send an external notification.
While it is possible to specify a different failure action
for each recovery event, the Service Control Manager API only makes it possible to specify a single command,
which can then be referenced as the failure action for one or more recovery events.
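As a rough sketch of how a single shared command might be configured, assuming the property is named wrapper.ntservice.recovery.command (an assumption, since the name is not given in this section); the batch file path is the one used in the combined example later in this document:

```properties
# One external command, shared by every failure action set to COMMAND.
# Property name is assumed, following the wrapper.ntservice.recovery.* pattern.
wrapper.ntservice.recovery.command=C:\cleanup.bat
```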
If REBOOT is specified for a wrapper.ntservice.recovery.<n>.failure property,
the Service Control Manager will log a System Event describing why the system was rebooted.
This property is used to define the text of that message.
wrapper.ntservice.recovery.reboot_msg=System will reboot now.
This property sets the failure action which should happen if the Wrapper service crashes or exits with an error.
The "<n>" component of the property name is the failure count to which the action applies.
The Service Control Manager counts the number of times each service has failed since the system booted.
The count is reset back to "0" (zero) if the service has not failed for at least the configured reset time (3600 seconds by default).
When the service fails, the failure count (N) is used to decide which failure action to take.
These actions are defined with the wrapper.ntservice.recovery.<n>.failure property,
where <n> is the failure count N.
If N is greater than the highest configured <n>, then the failure action with the highest <n> will be used.
Possible actions are:
NONE: The Service Manager will do nothing.
This is the same as if no recovery properties were specified.
The service will stop and stay stopped.
Any failure actions defined with higher N values will never be reached.
RESTART: The Service Manager will restart the service.
This is the most commonly used action with the Wrapper,
as the Wrapper itself is capable of cleaning up after the JVM following a failure.
COMMAND: The Service Manager will run the configured external recovery command.
If you wish the service to be restarted after running the command, then this must be done within the command.
REBOOT: The Service Manager will reboot the machine.
While the API makes it possible to specify further failure actions
after NONE or REBOOT, they will never be used.
The Recovery Tab on the Properties dialog of the Services Control Panel ([Control Panel] - [Administrative Tools] - [Services]) will only display the first 3 configured failure actions.
The API, however, makes it possible to define more.
Be aware that using more than 3 failure actions could be confusing to a user looking at the dialog.
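To illustrate how the failure count selects an action, here is a small sketch using the wrapper.ntservice.recovery.<n>.failure property described above. The choice of actions is purely illustrative and not a recommendation:

```properties
# 1st and 2nd failure: restart the service.
wrapper.ntservice.recovery.1.failure=RESTART
wrapper.ntservice.recovery.2.failure=RESTART
# 3rd failure: do nothing, so the service stays stopped.
# Any later failure count also falls back to this highest configured entry.
wrapper.ntservice.recovery.3.failure=NONE
```

Keeping to three entries also matches what the Recovery Tab is able to display.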
The value of this property, the recovery delay in seconds, cannot be smaller than "0" (zero).
The default value is 3 times the value of the wrapper.ping.timeout property.
If the Wrapper process itself were to crash while the JVM was still up and running,
the JVM is configured to shut itself down after 3 times the ping timeout.
This recovery delay uses the same value to avoid restarting the Wrapper with a second JVM until the first has had a chance to shut down.
Starting a second JVM would most likely cause resource conflicts.
While in theory it would be possible to set a delay for the failure action "NONE",
the Service Control Manager would ignore the delay in this case.
Example, setting the delay to 90 seconds (1.5min):
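A minimal sketch of that setting for the first failure, assuming the property is named wrapper.ntservice.recovery.<n>.delay (the name is not spelled out in this section, so treat it as an assumption):

```properties
# Wait 90 seconds (1.5 min) before performing failure action 1.
# Property name is assumed, mirroring wrapper.ntservice.recovery.<n>.failure.
wrapper.ntservice.recovery.1.delay=90
```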
To avoid confusion when checking the service on the Recovery Tab of the Services dialog,
please note that the delay time is displayed in minutes.
This means that the displayed value is rounded down to whole minutes.
For instance, a value of 90 seconds (1.5 min) will be displayed as 1 minute.
Putting it all together
The following example will:
set the reset time to 86400 seconds (1 day)
1st Failure: tell the Service Control Manager to wait 90 seconds (1.5 min) before performing failure action 1
1st Failure: set failure action 1 to RESTART, which will restart the service
2nd Failure: tell the Service Control Manager to wait 180 seconds (3 min) before performing failure action 2
2nd Failure: set failure action 2 to COMMAND, which will run an external command
2nd Failure: set the external command to "C:\cleanup.bat"
3rd Failure: tell the Service Control Manager to wait 600 seconds (10 min) before performing failure action 3
3rd Failure: set failure action 3 to REBOOT, which will reboot the machine
3rd Failure: set the message that is logged with the reboot in the System Events to "System will reboot now."
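The steps listed above could be written as a configuration fragment along the following lines. This is a sketch, not a verified configuration: the wrapper.ntservice.recovery.<n>.failure and wrapper.ntservice.recovery.reboot_msg names appear in this document, while the reset, delay, and command property names are assumptions that follow the same naming pattern.

```properties
# Reset the failure count after 1 day without a failure (property name assumed).
wrapper.ntservice.recovery.reset=86400

# 1st failure: wait 90 seconds, then restart the service (delay name assumed).
wrapper.ntservice.recovery.1.delay=90
wrapper.ntservice.recovery.1.failure=RESTART

# 2nd failure: wait 180 seconds, then run the external command
# (delay and command property names assumed).
wrapper.ntservice.recovery.2.delay=180
wrapper.ntservice.recovery.2.failure=COMMAND
wrapper.ntservice.recovery.command=C:\cleanup.bat

# 3rd failure: wait 600 seconds, then reboot the machine and log the message.
wrapper.ntservice.recovery.3.delay=600
wrapper.ntservice.recovery.3.failure=REBOOT
wrapper.ntservice.recovery.reboot_msg=System will reboot now.
```

Because the cleanup command does not restart the service by itself, the service would stay stopped after the 2nd failure unless the batch file restarts it.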