Thursday, May 8, 2008

Domain Health tool Vs WLDF for monitoring

[Originally posted on my old BEA Dev2Dev blog on May 8, 2008]

Note added 11-Dec-2009: The content of this blog entry is generally superseded by a newer blog entry which addresses the differences since the introduction of the WLDF harvesting capability to DomainHealth.

DomainHealth is an open source "zero-config" monitoring tool for BEA's WebLogic Application Server. It collects important server runtime statistics over time for all servers in a domain, archives this data into CSV files and provides a simple Web Browser based interface for viewing graphs of statistics, historically.

Version 0.7 has just been released and is available to download and use on WebLogic 9.x/10.x from the project's home page at: http://sourceforge.net/projects/domainhealth


Some people have asked why I have created the DomainHealth tool when WebLogic already provides the WebLogic Diagnostic Framework (WLDF). This is a good question which I will attempt to answer below.

First of all it is worth considering that monitoring can be divided into many sub-categories, three of which are:
  1. Alerting of a potential issue which is about to occur or has just occurred (eg. not much memory left) enabling the administrator to take pro-active or remedial action
  2. Real-time statistics capture and viewing, so an administrator can instantly gauge the health of his/her applications and systems
  3. Continuous harvesting of application and system data over time to enable administrators to retrospectively analyse trends (eg. to identify potential tunings), and to diagnose the cause of fatal problems after the problem has occurred.
...plus many others like instrumenting, profiling, heartbeating.

WLDF in combination with the "WLDF Console Extension" provides the base technology to enable administrators to configure a set of WebLogic servers and environment which caters for all 3 monitoring capabilities listed above.

The philosophy of DomainHealth is "zero-config" and as such is intended for a certain type of user base for certain types of monitoring needs (ie. mainly point (3) and to a lesser extent point (2) above). For users who require more complex customisable monitoring capabilities, the WebLogic Diagnostic Framework (WLDF) is more likely to provide the capabilities desired. In some environments, a combination of using WLDF for (1) and DomainHealth for (2) and (3) may also be an option to consider.

Here's a list of benefits which I see for using Domain Health and for using WLDF and the WLDF Console Extension....

Domain Health (DH) benefits:
  • DH is easier to install (hot deploy of WAR to admin server) than WLDF which requires copying a JAR file to a specific path and then re-starting the admin server. The WLDF deployment process may be problematic when needing to retrospectively add monitoring capabilities to a production system which has already gone live.
  • DH in most case requires no configuration. WLDF console has some built-in views which helps shorten the number of configuration steps likely to be required, However it is unlikely that WLDF's built-in views will be sufficient on their own. By default, the WLDF Console shows data cached on the live servers. For the WLDF Console to show historic data, the administrator must first manually configure (or more likely script the configuration) of a WLDF harvester, separately, to enable historical data to then be retrieved and shown in the console.
  • For current data, WLDF Console periodically polls the cache for live data and for historic data has to contact each server's WLDF Data Accessor to retrieve data from each server's local WLDF archive files. Overall, the performance impact of this process is likely to negate some of the performance gain, from not needing to poll servers remotely to harvest their properties in the first place.
  • The CSV output of DH is more friendly to generic third-party tools than WLDF is. For offline retrieval of WLDF archived data for use in such tools, scripting will be required to integrate Jython/WLST with the third-party tools.
  • DH's web-based visual display is more lightweight than the WLDF console extension for administrators, requiring just a simple browser rather than also needing a Java browser plugin to be installed.
WLDF (inc WLDF Console) benefits:
  • WLDF is infinitely configurable allowing administrators to track the exact server statistics that they are interested in. For DH, you are stuck with what has been prescribed as a suitable set of statistics.
  • A WLDF thread runs within each managed server for property retrieval, rather than requiring remote polling from an admin server. This may lessen any potential performance impact of performing monitoring in a live environment (however this gain is at least partly lost when using the WLDF console running on the Admin Server to then view data - this requires each server to be contacted to retrieve its locally archived statistics).
  • WLDF and the WLDF console are far more powerful and extensive in terms of the overall monitoring features that they provide (eg. provides visualisation for instrumentation of running code).
It is also worth noting that DomainHealth is not a replacement for a 'fully-fledged' 'off-the-shelf' Server monitoring and management tool. Such tools are usually more capable and able to capture other statistics (eg. hardware, OS and network stats) in addition, to enable a more holistic picture of the health of a server environment to be captured. However, in cases where budget does not stretch to purchase an off-the-shelf solution for production environments or where a quick and easy tool for monitoring performance in performance test environments, is required, DomainHealth may help to fill the gap.

In summary what I would like to say is that DomainHealth is not intended as a alternative to WLDF and has been developed specifically for certain use cases for a certain subset of clients which I have dealt with over the years (and is indeed being used by some of these clients today).

Soundtrack for today: Nearly Lost You by Screaming Trees

3 comments:

Alex said...

Very nice indeed. I just downloaded it, built and installed it on my test domain. I love that it is so simple to use and open source.

Thanks very much for your efforts. If we make any improvements I'll funnel them back to you for consideration.

Unknown said...

Hi, is it possible to run the app using another non-admin user ? Thanks

kriatv said...

Very Nice tool, but i would like to know how much load it puts on the server to run.
Isnt this resource intensive?
Thanks