Paul Done's Technical Blog: 2011

Wednesday, October 5, 2011

New release of DomainHealth - 1.0

I've just released a new version of DomainHealth which, by virtue of being the next increment after 0.9, is the grand 1.0 release! No great fanfare or massive new features, but this should [hopefully] be a nice stable release to rely on and live up to its 1.0 billing! :D

You can download DomainHealth 1.0 from here: http://sourceforge.net/projects/domainhealth

One new feature in 1.0 that is worth highlighting, though, is the optional capability to collect and show Processor, Memory and Network statistics from the underlying host Operating System and Machine that WebLogic is running on. DomainHealth only enables this feature if you've also deployed another small open source JEE application that I've created, called WLHostMachineStats. Below is a screenshot of DomainHealth 1.0 in action, displaying graphs of some of these host machine statistics (in this case it's running on an Exalogic system).

(click image for larger view)

WLHostMachineStats is a small agent (a JMX MBean deployed as a WAR file) that runs in every WebLogic Server in a WebLogic domain. It is used to retrieve OS data from the underlying machine hosting each WebLogic Server instance. For more information, including deployment instructions, and to download it, go to: http://sourceforge.net/projects/wlhostmchnstats

Here's another screenshot, just for fun:

(click image for larger view)

Some things to bear in mind....

...the WLHostMachineStats project is still in its infancy and currently places restrictions on what specific environments are supported. Right now, WLHostMachineStats can only be used for WebLogic domains running on Linux Intel (x86) 64-bit based machines (including Exalogic) and only for versions 10.3.0 or greater of WebLogic. This is partly because WLHostMachineStats relies on the SIGAR open source utility, which uses native C libraries and JNI. I hope to widen the list of supported platforms for WLHostMachineStats in the future.

Song for today: Dynamite Steps by The Twilight Singers

Friday, September 2, 2011

New release of DomainHealth (v0.9.1)

I've just released a new version of DomainHealth (version 0.9.1). This is primarily a maintenance/bug-fix release.

DomainHealth is an open source "zero-config" monitoring tool for WebLogic. It collects important server metrics over time, archives these into CSV files and provides a simple web interface for viewing graphs of current and historical statistics. It also works nicely on Exalogic.

To download (and see release notes) go to the project home (select 'files' menu option) at: http://sourceforge.net/projects/domainhealth/

For DomainHealth help docs see: http://sourceforge.net/apps/mediawiki/domainhealth/index.php

Song for today: Ascension Day by Talk Talk

Thursday, March 3, 2011

Exalogic DCLI - run commands on all compute nodes at once

Exalogic includes a tool called DCLI (Distributed Command Line Interface) that can be used to run the same commands on all or a subset of compute nodes in parallel. This saves a lot of time and helps avoid the sorts of silly errors that often occur when running a command over and over again. DCLI is a tool that originally came with Exadata (as documented in the Oracle Exadata Storage Server Software User's Guide - E13861-05 chapter 9), and is now incorporated into the new Exalogic product too. It is worth noting that if you are ever involved in performing the initial configuration of a new Exalogic rack, using OneCommand to configure the Exalogic's networking, then under the covers OneCommand will be using DCLI to perform a lot of its work.

Introduction to Exalogic's DCLI

The Oracle Enterprise Linux 5.5 based factory image running on each Exalogic compute node has the exalogic.tools RPM package installed. This contains the DCLI tool in addition to other useful Exalogic command line utilities. Running 'rpm -qi exalogic.tools' on a compute node shows the following package information:

Name : exalogic.tools
Version : 1.0.0.0
Release : 1.0

When you run 'rpm -ql exalogic.tools' you will see that the set of command line utilities are all placed in a directory at '/opt/exalogic.tools'. Specifically, the DCLI tool is located at '/opt/exalogic.tools/tools/dcli'.

Running DCLI from the command line with the '-h' argument, will present you with a short help summary of DCLI and the parameters it can be given:

# /opt/exalogic.tools/tools/dcli -h

If you look at the contents of the '/opt/exalogic.tools/tools/dcli' file you will see that it is actually a Python script that, essentially, determines the list of compute nodes that a supplied command should be applied to and then runs the supplied command on each compute node using SSH under the covers. Conveniently, the Python script also captures the output from each compute node and prints it out in the shell that DCLI was run from. The output from each individual compute node is prefixed by that particular compute node's name so that it is easy for the administrator to see if something untoward occurred on one of the compute nodes only.
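The behaviour described above (fan a command out over SSH, collect each node's output, prefix it with the node name) can be pictured with a short Python sketch. This is not the DCLI source, just a minimal illustration of the idea; the function names `run_via_ssh` and `run_on_nodes` and the injectable `runner` parameter are hypothetical and exist only so the sketch can be tried without a rack to hand.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_via_ssh(host, command, user="root"):
    """Run one command on one host over SSH and return its combined output."""
    result = subprocess.run(
        ["ssh", f"{user}@{host}", command],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout


def run_on_nodes(hosts, command, runner=run_via_ssh):
    """Run the command on every host in parallel; return host-prefixed lines."""
    # One worker per host, so a slow node does not hold up the others.
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        outputs = list(pool.map(lambda host: runner(host, command), hosts))
    lines = []
    # pool.map keeps results in the same order as the hosts list.
    for host, output in zip(hosts, outputs):
        for line in output.splitlines():
            lines.append(f"{host}: {line}")
    return lines
```

Calling `run_on_nodes(["el01cn01", "el01cn02"], "/bin/date")` would return one prefixed line per node, in node order, which is roughly the shape of output DCLI prints.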

A good way of testing DCLI is to SSH to your nominated 'master' compute node in the Exalogic rack (eg. the 1st one), as root user, and create a file (eg. called 'nodeslist') which contains the hostnames of all the compute nodes in the rack (separated by newlines). For example, my nodeslist file has the following entries in the first 3 lines:

el01cn01
el01cn02
el01cn03
....

Note: You can comment out one or more hostnames with a hash ('#') if you want DCLI to ignore particular hostnames.
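To make the file format concrete, here is a small Python sketch of how such a node list could be read. It assumes that only whole-line '#' comments and blank lines need skipping; the real DCLI parsing may differ in detail, and `parse_node_list` is a hypothetical name.

```python
def parse_node_list(text):
    """Return the hostnames in a node list, skipping blanks and '#' lines."""
    hosts = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Ignore empty lines and hostnames commented out with a hash.
        if not line or line.startswith("#"):
            continue
        hosts.append(line)
    return hosts
```

With this, a file containing 'el01cn01', '#el01cn02' and 'el01cn03' yields just the first and third hostnames.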

As a reminder on Exalogic compute node naming conventions, 'el01' is the Exalogic rack's default name and 'cn01' contains the number of the specific compute node in that rack.
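If you would rather generate the file than type it, the naming convention lends itself to a one-liner. The sketch below assumes the two-digit, zero-padded node number seen in the hostnames above and a rack name of 'el01'; `compute_node_names` is a hypothetical helper, not part of the Exalogic tools.

```python
def compute_node_names(rack="el01", count=3):
    """Build hostnames such as el01cn01 for compute nodes 1..count."""
    return [f"{rack}cn{number:02d}" for number in range(1, count + 1)]


def node_list_text(rack="el01", count=3):
    """Return newline-separated hostnames, ready to write to a node list file."""
    return "\n".join(compute_node_names(rack, count)) + "\n"
```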

Once you've created the list of target compute nodes for DCLI to distribute commands to, a nice test is to run a DCLI command that just prints the date-time of each compute node to the shell output of your master compute node (using the /bin/date Linux command). For example:

# /opt/exalogic.tools/tools/dcli -t -g nodeslist /bin/date

Example output:

Target nodes: ['el01cn01', 'el01cn02', 'el01cn03',....]
el01cn01: Mon Feb 21 21:11:42 UTC 2011
el01cn02: Mon Feb 21 21:11:42 UTC 2011
el01cn03: Mon Feb 21 21:11:42 UTC 2011
....

When this runs, you will be prompted for the password for each compute node that DCLI contacts using SSH. The '-t' option tells DCLI to first print out the names of all the nodes it will run the operation on, which is useful for double-checking that you are hitting the compute nodes you intended. The '-g' option provides the name of the file that contains the list of nodes to operate on (in this case, 'nodeslist' in the current directory).

SSH Trust and User Equivalence

To use DCLI without being prompted for a password for each compute node that is contacted, it is preferable to first set up SSH Trust between the master compute node and all the other compute nodes. DCLI calls this "user equivalence"; a named user on one compute node will then be assumed to have the same identity as the same named user on all other compute nodes. On your nominated 'master' compute node (eg. 'el01cn01'), as root user, first generate an SSH public-private key pair for the root user. For example:

# ssh-keygen -N '' -f ~/.ssh/id_dsa -t dsa

This places the generated public and private key files in the '.ssh' sub-directory of the root user's home directory (note, '' in the command is two single quotes).

Now run the DCLI command with the '-k' option, as shown below, which pushes the current user's SSH public key to each other compute node's '.ssh/authorized_keys' file to establish SSH Trust. You will again be prompted to enter the password for each compute node, but this will be the last time you will need to. With the '-k' option, each compute node is contacted sequentially rather than in parallel, to give you a chance to enter the password for each node in turn.

# /opt/exalogic.tools/tools/dcli -t -g nodeslist -k -s "\-o StrictHostKeyChecking=no"

In my example above, I also pass the SSH option 'StrictHostKeyChecking=no' so you avoid being prompted with the standard SSH question "Are you sure you want to continue connecting (yes/no)", for each compute node that is contacted. Each contacted compute node's host key will then be added to the list of SSH known hosts on the master compute node, so that this yes/no question will never occur again.
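What pushing a public key amounts to on each target node is essentially appending one line to '.ssh/authorized_keys'. The Python sketch below shows that step in isolation, under the assumption that a duplicate key should not be added twice; whether DCLI itself guards against duplicates is not something this sketch claims, and `authorize_key` is a hypothetical name.

```python
import os


def authorize_key(public_key, authorized_keys_path):
    """Append public_key to the file unless the same line is already present.

    Returns True if the key was added, False if it was already there.
    """
    key = public_key.strip()
    content = ""
    if os.path.exists(authorized_keys_path):
        with open(authorized_keys_path) as handle:
            content = handle.read()
    if key in (line.strip() for line in content.splitlines()):
        return False
    parent = os.path.dirname(authorized_keys_path)
    if parent:
        # sshd expects the .ssh directory to be private to its owner.
        os.makedirs(parent, mode=0o700, exist_ok=True)
    with open(authorized_keys_path, "a") as handle:
        # Avoid gluing the new key onto an unterminated last line.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(key + "\n")
    os.chmod(authorized_keys_path, 0o600)
    return True
```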

Once the DCLI command completes you have established SSH Trust and User Equivalence. Any subsequent DCLI commands that you issue, from now on, will occur without you being prompted for passwords.

You can then run the original date-time test again, to satisfy yourself that SSH Trust and User Equivalence is indeed established between the master compute node and each other compute node and that no passwords are prompted for.

# /opt/exalogic.tools/tools/dcli -t -g nodeslist /bin/date
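If you want a check that fails outright rather than falling back to a password prompt, OpenSSH's 'BatchMode=yes' option does exactly that. The Python sketch below builds such a command for one host; the helper names and the 5 second connect timeout are illustrative assumptions, and the `run` parameter is only there so the logic can be exercised without a real node.

```python
import subprocess


def batch_mode_command(host, user="root"):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never ask for a password interactively
        "-o", "ConnectTimeout=5",   # give up quickly on unreachable nodes
        f"{user}@{host}",
        "true",                      # harmless remote command
    ]


def has_ssh_trust(host, user="root", run=subprocess.call):
    """Return True when the host accepts key-based login without a prompt."""
    # subprocess.call returns the exit status; ssh exits non-zero on failure.
    return run(batch_mode_command(host, user)) == 0
```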

Useful Examples

Now let's have a look at some examples of common DCLI commands you might need to issue for your new Exalogic system.

Example 1 - Add a new OS group to each compute node called 'oracle' with group id 500:

# /opt/exalogic.tools/tools/dcli -t -g nodeslist groupadd -g 500 oracle

Example 2 - Add a new OS user to each compute node called 'oracle' with user id 500 as a member of the new 'oracle' group:

# /opt/exalogic.tools/tools/dcli -t -g nodeslist useradd -g oracle -u 500 oracle

Example 3 - Set the password to 'welcome1' for the OS 'root' user and the new 'oracle' user on each compute node (this uses another feature of DCLI where, if multiple commands need to be run in one go, they can be added to a file, which I tend to suffix with '.scl' in my examples - 'scl' is the convention for 'source command line', and the '-x' parameter is provided to tell DCLI to run commands from the named file):

# vi setpasswds.scl
echo welcome1 | passwd root --stdin
echo welcome1 | passwd oracle --stdin
# chmod u+x setpasswds.scl
# /opt/exalogic.tools/tools/dcli -t -g nodeslist -x setpasswds.scl

Example 4 - Create a new mount point directory and definition on each compute node for mounting the common/general NFS share which exists on Exalogic's ZFS Shared Storage appliance (the hostname of the HA shared storage on Exalogic's internal InfiniBand network in my example is 'el01sn-priv') and then from each compute node, permanently mount the NFS Share:

# /opt/exalogic.tools/tools/dcli -t -g nodeslist mkdir -p /u01/common/general
# /opt/exalogic.tools/tools/dcli -t -g nodeslist chown -R oracle:oracle /u01/common/general
# vi addmount.scl
cat >> /etc/fstab << EOF
el01sn-priv:/export/common/general /u01/common/general nfs rw,bg,hard,nointr,rsize=131072,wsize=131072,tcp,vers=3 0 0
EOF
# chmod u+x addmount.scl
# /opt/exalogic.tools/tools/dcli -t -g nodeslist -x addmount.scl
# /opt/exalogic.tools/tools/dcli -t -g nodeslist mount /u01/common/general
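One thing to watch with Example 4 is that the script appends to /etc/fstab with '>>', so running it a second time would leave a duplicate line on every node. The Python sketch below illustrates an append that is safe to repeat, working purely on text so it can be tried without root access; the function names are hypothetical and it treats the second whitespace-separated field as the mount point, as in standard fstab layout.

```python
def fstab_entry(server, export, mount_point, options):
    """Format an NFS line for /etc/fstab (dump and fsck pass both 0)."""
    return f"{server}:{export} {mount_point} nfs {options} 0 0"


def add_fstab_entry(fstab_text, entry):
    """Return fstab_text with entry appended, unless its mount point exists."""
    mount_point = entry.split()[1]
    for line in fstab_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # commented-out definitions do not count
        fields = line.split()
        if len(fields) >= 2 and fields[1] == mount_point:
            return fstab_text  # already defined, leave the file alone
    if fstab_text and not fstab_text.endswith("\n"):
        fstab_text += "\n"
    return fstab_text + entry + "\n"
```

Applying `add_fstab_entry` twice with the same entry gives the same text as applying it once.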

Running DCLI As Non-Root User

In the default Exalogic set-up, DCLI executes as root user when issuing all of its commands regardless of what OS user's shell you use to enter the DCLI command from. Although root access is often necessary for creating things like OS users, groups and mount points, it is not desirable if you just want to use DCLI to execute non-privileged commands under a specific OS user on all compute nodes. For example, as a new 'coherence' OS user, you may want the ability to run a script that starts a Coherence Cache Server instance on every one of the compute nodes in the Exalogic rack, in one go, to automatically join the same Coherence cluster.

To enable DCLI to be used under any OS user and to run all its distributed commands on all compute nodes, as that OS user, we just need to make a few simple one-off changes on our master compute node where DCLI is being run from...

1. As root user, allow all OS users to access the Exalogic tools directory that contains the DCLI tool:

# chmod a+x /opt/exalogic.tools/tools

2. As root user, change the permissions of the DCLI tool to be executable by all users:

# chmod a+x /opt/exalogic.tools/tools/dcli

3. As root user, modify the DCLI Python script (/opt/exalogic.tools/tools/dcli) using 'vi' and replace the line....

USER_ID="root"

...with the line...

USER_ID=pwd.getpwuid(os.getuid())[0]

This script line uses some Python functions to set the DCLI user id to the name of the current OS user running the DCLI command, rather than the hard-coded 'root' username.

4. Whilst still editing the file using vi, add the following Python library import command near the top of the DCLI Python script to enable the 'pwd' Python library to be referenced by the code in step 3.

import pwd
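To see what the replacement line in step 3 evaluates to, you can try it on its own in a Python shell. The sketch below wraps it in a hypothetical `current_username` function; the only assumption is a Unix-like system where the current user id has an entry in the password database.

```python
import os
import pwd


def current_username():
    """Return the login name of the OS user running this process."""
    # os.getuid() gives the numeric user id; pwd.getpwuid() looks up its
    # password database entry, whose first field (index 0) is the login name.
    return pwd.getpwuid(os.getuid())[0]


if __name__ == "__main__":
    print(current_username())
```

Run as root this prints 'root', and run as the 'coherence' user it prints 'coherence', which is why the edited script picks up whichever user invokes it.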

Now log on to your master compute node as your new non-root OS user (eg. 'coherence' user) and once you've done the one-off setup of your nodeslist file and SSH-Trust/User-Equivalence (as described earlier), you will happily be able to run DCLI commands across all compute nodes as your new OS user.

For example, for a test Coherence project I've been playing with recently, I have a Cache Server 'start in-background' script in a Coherence project located on my Exalogic's ZFS Shared Storage. When I run this script using the DCLI command below, from my 'coherence' OS user shell on my master compute node, 30 Coherence cache server instances are started immediately, almost instantly forming a cluster across the compute nodes in the rack.

# /opt/exalogic.tools/tools/dcli -t -g nodeslist /u01/common/general/my-coh-proj/start-cache-server.sh

Just for fun I can run this again to allow 30 more Coherence servers to start-up and join the same Coherence cluster, now containing 60 members.

Summary

As you can see DCLI is pretty powerful yet very simple in both concept and execution!

Song for today: Death Rays by Mogwai

Sunday, January 23, 2011

Exalogic Software Optimisations

[Update 19-March-2011 - this blog entry is actually a short summary of a much more detailed Oracle internal document I wrote in December 2010. A public whitepaper using the content from my internal document has now been published on Oracle's Exalogic home page (see "White Papers" tab on right-hand side of the home page); for the public version, a revised introduction, summary and set of diagrams have been contributed by Oracle's Exalogic Product Managers.]

For version 1.0 of Exalogic there are a number of Exalogic-specific enhancements and optimisations that have been made to the Oracle Application Grid middleware products, specifically:

the WebLogic application server product;
the JRockit Java Virtual Machine (JVM) product;
the Coherence in-memory clustered data-grid product.

In many cases, these product enhancements address performance limitations that are not present on general purpose hardware that uses Ethernet based networking. Typically, these limitations are only manifested when running on Exalogic's high-density computing nodes with InfiniBand's fast-networking infrastructure. Most of these enhancements are designed to enable the benefits of the high-end hardware components, that are unique to Exalogic, to be utilised to the full. This results in a well balanced hardware/software system.

I find it useful to categorise the optimisations in the following way:

Increased server scalability, throughput and responsiveness. Improvements to the networking, request handling, memory and thread management mechanisms, within WebLogic and JRockit, enable the products to scale better on the high-multi-core compute nodes that are connected to the fast InfiniBand fabric. WebLogic will use Java NIO based non-blocking server socket handlers (muxers) for more efficient request processing, multi-core aware thread pools and shared byte buffers to reduce data copies between sub-system layers. Coherence also includes changes to ensure more optimal network bandwidth usage when using InfiniBand networking.
Superior server session replication performance. WebLogic's In-Memory HTTP Session Replication mechanism is improved to utilise the large InfiniBand bandwidth available between clustered servers. A WebLogic server replicates more of the session data in parallel, over the network to a second server, using parallel socket connections (parallel "RJVMs") instead of just a single connection. WebLogic also avoids a lot of the unnecessary processing that usually takes place on the server receiving session replicas, by using "lazy de-serialisation". With the help of the underlying JRockit JVM, WebLogic skips the host node's TCP/IP stack, and uses InfiniBand's faster “native” networking protocol, called SDP, to enable the session payloads to be sent over the network with lower latency. As a result, for stateful web applications requiring high availability, end-user requests are responded to far quicker.
Tighter Oracle RAC integration for faster and more reliable database interaction. For Exalogic, WebLogic includes a new component called “Active GridLink for RAC” that provides application server connectivity to Oracle RAC clustered databases. This supersedes the existing WebLogic capability for Oracle RAC connectivity, commonly referred to as “Multi-Data-Sources”. Active GridLink provides intelligent Runtime Connection Load-Balancing (RCLB) across RAC nodes based on the current workload of each RAC node, by subscribing to the database's Fast Application Notification (FAN) events using Oracle Notification Services (ONS). Active GridLink uses Fast Connection Failover (FCF) to enable rapid RAC node failure detection for greater application resilience (using ONS events as an input). Active GridLink also allows more transparent RAC node location management with support for SCAN and uses RAC node affinity for handling global (XA) transactions more optimally. Consequently, enterprise Java applications involving intensive database work achieve a higher level of availability with better throughput and more consistent response times.
Reduced Exalogic to Exadata response times. When an Exalogic system is connected directly to an Exadata system (using the built-in InfiniBand switches and cabling), WebLogic is able to use InfiniBand's faster “native” networking protocol, SDP, for JDBC interaction with the Oracle RAC database on Exadata. This incorporates enhancements to JRockit and the Oracle Thin JDBC driver in addition to WebLogic. With this optimisation, an enterprise Java application that interacts with Exadata is able to respond to client requests quicker, especially where large JDBC result sets need to be passed back from Exadata to Exalogic.
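Of the techniques in the list above, "lazy de-serialisation" is perhaps the easiest to picture in code. The sketch below shows the general idea only, in Python with the standard pickle module standing in for Java serialisation; it is not how WebLogic implements it, and the `LazyReplica` class is purely hypothetical. The receiving server keeps the replica as raw bytes and only pays the de-serialisation cost if the session is actually needed, for example after a failover.

```python
import pickle


class LazyReplica:
    """Hold a replicated session as bytes until someone asks for it."""

    def __init__(self, payload):
        self._payload = payload  # serialised bytes, exactly as received
        self._value = None
        self._loaded = False

    @property
    def deserialised(self):
        """True once the payload has been turned back into an object."""
        return self._loaded

    def get(self):
        """Return the session object, de-serialising on first access only."""
        if not self._loaded:
            self._value = pickle.loads(self._payload)
            self._payload = None  # the raw bytes are no longer needed
            self._loaded = True
        return self._value
```

A replica that is never read, which is the common case while the primary server stays healthy, never incurs the de-serialisation cost at all.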

To summarise, Exalogic provides a high performance, highly redundant hardware platform for any type of middleware application. If the middleware application happens to be running on Oracle's Application Grid software, further significant performance gains will be achieved.

Song for today: Come to Me by 65daysofstatic