Paul Done's Technical Blog

Friday, October 10, 2008

Web Service Messaging Nirvana

Web Services, and specifically SOAP, have got a bad rap in recent years: SOAP is believed to be yet another over-hyped remote procedure call (RPC) protocol, and not cool enough in light of more current hip approaches like REST. Some of this flak is justified and the SOAP community does itself no favours when it releases a newer version of a spec (SOAP 1.2) which goes out of its way to state that SOAP is no longer an acronym. What next? TAFKAS? (The Artist Formerly Known As SOAP)

Anyway, I digress. I blogged in the past about two main areas where SOAP can be used. As you might guess, the main area that interests me is Messaging.

Over the last few years, I've been on a quest to explore how a JavaEE architect and developer can better implement messaging based web services and service consumers in a more loosely coupled and less error prone way. My motivation has been driven by a reaction to the traditional RPC based Web Services programming model that many have used and still use today and all the problems that such a model brings.

Over time I have constructed a list of 8 key values or principles that I believe are necessary for achieving "Web Services Messaging Nirvana". In most cases, these principles are based more on architectural and development approach than on the specific technology or SOAP toolkit used. However, as always, the two can't be completely divorced from each other.

Here are my 8 key values:

Promote Interface/Contract-First Development

“Implementation-First” leads to tight coupling between the code in the service provider and the code in the service consumer
“Implementation-First” leads to a strong dependency on the initial SOAP toolkit used for implementation, which is then difficult to migrate off and results in vendor lock-in

Enable a Message-Passing Programming Model

RPC programming leads to lots of fine-grained 'chatty' services, with brittle pairings between provider and consumer that are resistant to change
RPC clients tend to assume endpoints are implementations when in reality, they may just be messaging intermediaries (eg. a service bus)

De-couple “Service Interface Definition” from “Message Document Definition”

In-lining XML type definitions inside a WSDL prevents re-use of the message structure definitions across multiple web services and other components in the system as a whole (import external Schemas instead)
The message document structures may already be pre-defined in the organisation (eg. industry-wide standard schemas), so the ability to share these definitions across many web services is required

Promote Postel’s Law: “Be conservative in what you do; be liberal in what you accept from others”

In other words, allow developers to create services with strongly defined interfaces, yet forgiving implementations
Service intermediaries and end-points may only care about using part of the message content - they should not concern themselves about trying to validate other parts of a message
Enable minor interface version additions without necessarily requiring an upgrade or change to the service provider or service consumer

Avoid the need for manually generated static stubs and skeletons

These are invariably an unnecessary time-consuming addition to the development and build process
Interface modification always requires re-generation of stubs, skeletons and Java types, forcing service providers and consumers to be corrected, re-compiled and re-deployed every time.
A requirement for Skeletons and Stubs is unlikely when generated Java types are not being used in the implementation

Separate data from the code that operates on it

Object-oriented programming is great for service implementations but not for de-coupling disparate elements of the higher level distributed systems that are based on the principle of passing messages or events around
Enable message data to be passed between systems without one system caring about or pre-empting how the other system may use and operate on the message’s content

Provide the developer with a choice of XML Tools to use

SOAP Toolkits which force messages to be marshalled into a strongly typed Java graph of objects leave the programmer with one and only one way to operate on the data
If only part of a message is required or performance is key, a developer should have the option to use a SAX parser, a Streaming parser or just an XPath based Java API to query and operate on the data
If XQuery or XSLT offers some developers a better abstraction and quicker development process then don't prevent a developer from being able to plug in this option
Retain the choice of being able to generate strongly typed Java objects for situations where the developer needs or prefers it, but enable the developer to choose the technology (eg. JAXB, XmlBeans)
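
As a rough illustration of the XPath option above, the sketch below pulls a single value out of a message using only the JDK's built-in javax.xml.xpath API, without binding the whole document to generated Java types. The Customer message, the LoyaltyTier element and the PartialMessageReader class are all made up for illustration and are not taken from any particular SOAP toolkit.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class PartialMessageReader {
    // Returns the text of the first LastName element found anywhere in the
    // message, ignoring namespaces and every other part of the document.
    static String lastName(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate("//*[local-name()='LastName']",
                    new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not query message", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical message; LoyaltyTier stands in for content that this
        // consumer knows nothing about and never needs to validate.
        String message =
              "<Customer xmlns='http://www.example.org/Customer'>"
            + "<FirstName>Paul</FirstName><LastName>Done</LastName>"
            + "<LoyaltyTier>Gold</LoyaltyTier>"
            + "</Customer>";
        System.out.println(lastName(message));
    }
}
```

Because the query only touches the element it cares about, an unexpected extra element elsewhere in the message does not break this consumer, which is also the forgiving behaviour that Postel's Law asks for.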

Integrate with the host container's Security Framework for Web Services Security

For an application server based production environment, the server's Security Framework is already likely to be plugged into the organisation's larger security infrastructure (inc. LDAP, Kerberos, PKI and other elements) to provide full support for authentication, access control, single sign-on, identity propagation, encryption and signing. This infrastructure needs to be leveraged by the SOAP Toolkit, not ignored
The SOAP Toolkit needs to delegate to, or at least share its security context with, the host container, especially for its support of WS-Security and SAML

In the past, with J2EE 1.2, 1.3 and 1.4 compatible application servers, these values have been very hard to achieve with the application server's built-in SOAP support (ie. JAX-RPC - oriented around the concept of RPC, skeletons, stubs, and generated Java types) or with a Third-Party SOAP Toolkit (eg Apache Axis - little integration with the container's underlying Security Framework).

As a result, on these platforms, I've struggled to be able to get anywhere near achieving my goal of hitting the 8 key values of my Web Service quest.

Java EE 5 introduced the JAX-WS API, which is the newer Web Services toolkit for JavaEE developers. The focus of this standard, in my opinion, is still largely based on RPC and generated Java types. However, it does offer a glimmer of hope in the form of the option of using a 'dynamic' server-side API (Provider interface) and client-side API (Dispatch interface).

To my eyes the Provider/Dispatch facility could finally give me a chance of achieving the 8 key values in a real Web Services based solution.

Version 10.0 of WebLogic was the first version to introduce JAX-WS, but at the time it lacked integration with WebLogic's Security Framework and had no WS-Security or SAML support, thus lessening its relevance to many solutions where securing web services was a key requirement. Many users continued to use WebLogic's JAX-RPC stack due to the need to leverage WebLogic security.

Things have now changed and the JAX-WS implementation in version 10.3 of WebLogic should address this. I hope to blog soon about my experiments with JAX-WS Provider/Dispatch in WebLogic 10.3 and my quest to realise my 8 key values of Web Services Messaging Nirvana.

Song for today: Burn Baby by Mother Tongue

Friday, September 5, 2008

WSDL-First and Schemas: Use Global Types or Global Elements?

Often I'm faced with needing to expose a Web Service where the request and response bodies are messages, based on a well established shape that is not just the concern of the specific Web Service. The message received by the Web Service is likely to flow through other parts of the internal system 'behind' the service endpoint. The Schemas for these messages are not specific to just the Web Service WSDL but are also used by other internal components which have to deal with XML. Often these Schemas are pre-defined across an organisation or by third-parties (eg. the ARTS IXRetail schema for the Retail industry).

With these sorts of scenarios, a “WSDL first” approach is usually required for creating or generating Web Service end-points, where the WSDL identifies the shape of the service request and response bodies, but does not directly define this shape. Instead, the WSDL imports the externally held Schemas and references entities from the Schema to declare the format of the top-level nodes for a SOAP message body.

With this in mind, when designing Schemas that will be used by WSDLs and other XML components, I've been trying to work out whether top-level schema entities should be defined as Global Types or as Global Elements.

So I tried a little experiment, using a simple home-made schema to represent a Customer, including Name, Date of Birth, Addresses and Telephone Numbers. I created a SOAP Web Service which enables a Customer's personal information to be retrieved by providing the customer's name as the search criteria in the SOAP request.

First I tried using a Global Type in my Customer schema (Customer.xsd) to represent the shape of the top level request XML (CustomerSearchCriteria) and response XML (Customer), as follows:

<schema ....>
<complexType name="CustomerSearchCriteria">
    <sequence>
        <element name="FirstName" type="tns:FirstName"/>
        <element name="LastName" type="tns:LastName"/>
    </sequence>
</complexType>

<complexType name="Customer">
    <sequence>
        <element name="FirstName" type="tns:FirstName"/>
        <element name="LastName" type="tns:LastName"/>
        <element name="DateOfBirth" type="tns:DateOfBirth"/>
        <element name="Address" type="tns:Address" minOccurs="0" maxOccurs="unbounded"/>
        <element name="Telephone" type="tns:TelephoneNumber" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
</complexType>
....

In the WSDL, I defined the request and response message parts to use the two Global Types from the imported schemas, as follows:

<wsdl:definitions name="CustomerInfoService" ... xmlns:cust="http://www.example.org/Customer">
<wsdl:types>
    <xsd:schema>
        <xsd:import namespace="http://www.example.org/Customer" schemaLocation="./Customer.xsd"/>
    </xsd:schema>
</wsdl:types>
 
<wsdl:message name="GetCustomerInfoRequest">
    <wsdl:part name="MyCustomerSearchCriteria" type="cust:CustomerSearchCriteria"/>
</wsdl:message>

<wsdl:message name="GetCustomerInfoResponse">
    <wsdl:part name="MyCustomer" type="cust:Customer"/>
</wsdl:message>
....

Then when I fired a SOAP request into the service using my test harness, the SOAP request contained the following (notice no namespace for top level element in the body):

<soapenv:Envelope ....>
<soap:Header ..../>
<soapenv:Body>
    <MyCustomerSearchCriteria xmlns:cus="http://www.example.org/Customer">
        <cus:FirstName>Paul</cus:FirstName>
        <cus:LastName>Done</cus:LastName>
    </MyCustomerSearchCriteria>
</soapenv:Body>
</soapenv:Envelope>

The SOAP response contained the following (notice no namespace for top level element in the body):

<soapenv:Envelope ....>
<soap:Header  ..../>
<soapenv:Body>
    <MyCustomer xmlns:cus="http://www.example.org/Customer">
        <cus:FirstName>Paul</cus:FirstName>
        <cus:LastName>Done</cus:LastName>
        <cus:DateOfBirth>1973-12-12</cus:DateOfBirth>
        <cus:Address>
            <cus:HouseNumber>1</cus:HouseNumber>
            <cus:Street>High Street</cus:Street>
            <cus:Town>Big Town</cus:Town>
            <cus:Postcode>BT1 1AZ</cus:Postcode>
        </cus:Address>
        <cus:Telephone>
            <cus:AreaCode>0111</cus:AreaCode>
            <cus:LocalNumber>1234561</cus:LocalNumber>
        </cus:Telephone>
    </MyCustomer>
</soapenv:Body>
</soapenv:Envelope>

As you can see, the resulting top level XML element inside the SOAP body for both the request and the response is NOT part of the same namespace as the rest of the message XML. This is because the top-level element has actually been defined as an element by the WSDL, rather than being defined externally in the schema itself. To me this is less than ideal because I believe most people would want the whole message to have the same namespace.

In addition I believe most people would want the top level nodes to have an element name mandated by the Schema, not by the WSDL (eg. the top level element should be forced to be 'Customer' and not 'MyCustomer').

So I then tried using a Global Element instead, in my Customer schema (Customer.xsd) to represent the shape of the top level request XML (CustomerSearchCriteria) and response XML (Customer), as follows:

<schema ...>
<element name="CustomerSearchCriteria" type="tns:CustomerSearchCriteria"/>

<element name="Customer" type="tns:Customer"/>
....

In the WSDL, I defined the request and response message parts to use the two Global Elements from the schemas, as follows:

<wsdl:definitions name="CustomerInfoService" .... xmlns:cust="http://www.example.org/Customer">
<wsdl:types>
    <xsd:schema>
        <xsd:import namespace="http://www.example.org/Customer" schemaLocation="./Customer.xsd"/>
    </xsd:schema>
</wsdl:types>

<wsdl:message name="GetCustomerInfoRequest">
    <wsdl:part element="cust:CustomerSearchCriteria" name="parameters"/>
</wsdl:message>

<wsdl:message name="GetCustomerInfoResponse">
    <wsdl:part element="cust:Customer" name="parameters"/>
</wsdl:message>
....

Then when I fired a SOAP request into the service using my test harness, the SOAP request contained the following:

<soapenv:Envelope ....>
<soap:Header ..../>
<soapenv:Body>
    <cus:CustomerSearchCriteria xmlns:cus="http://www.example.org/Customer">
        <cus:FirstName>Paul</cus:FirstName>
        <cus:LastName>Done</cus:LastName>
    </cus:CustomerSearchCriteria>
</soapenv:Body>
</soapenv:Envelope>

The SOAP response contained the following:

<soapenv:Envelope ....>
<soap:Header ..../>
<soapenv:Body>
    <cus:Customer xmlns:cus="http://www.example.org/Customer">
        <cus:FirstName>Paul</cus:FirstName>
        <cus:LastName>Done</cus:LastName>
        <cus:DateOfBirth>1973-12-12</cus:DateOfBirth>
        <cus:Address>
            <cus:HouseNumber>1</cus:HouseNumber>
            <cus:Street>High Street</cus:Street>
            <cus:Town>Big Town</cus:Town>
            <cus:Postcode>BT1 1AZ</cus:Postcode>
        </cus:Address>
        <cus:Telephone>
            <cus:AreaCode>0111</cus:AreaCode>
            <cus:LocalNumber>1234561</cus:LocalNumber>
        </cus:Telephone>
    </cus:Customer>
</soapenv:Body>
</soapenv:Envelope>

As you can now see, all of the elements in the XML message, contained within the request and response SOAP bodies, now have the same namespace and the top level element name is defined by the Schema rather than by the WSDL. This is much more desirable in my view.
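
For anyone wanting to check this difference programmatically rather than by eye, here is a minimal sketch using the JDK's namespace-aware DOM parser. It reports the namespace of the top level element for cut-down versions of the two response bodies shown above. The TopLevelNamespace class and its method name are my own invention for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class TopLevelNamespace {
    // Parses an XML fragment and returns the namespace URI of its root
    // element, or null when the root element is not in any namespace.
    static String rootNamespace(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default, so switch it on
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return root.getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse fragment", e);
        }
    }

    public static void main(String[] args) {
        String ns = "http://www.example.org/Customer";
        // Body content produced when the WSDL part references a Global Type
        String typeBased = "<MyCustomer xmlns:cus='" + ns + "'>"
                + "<cus:FirstName>Paul</cus:FirstName></MyCustomer>";
        // Body content produced when the WSDL part references a Global Element
        String elementBased = "<cus:Customer xmlns:cus='" + ns + "'>"
                + "<cus:FirstName>Paul</cus:FirstName></cus:Customer>";
        System.out.println("Global Type:    " + rootNamespace(typeBased));
        System.out.println("Global Element: " + rootNamespace(elementBased));
    }
}
```

The first fragment reports no namespace for its top level element, while the second reports the Customer schema namespace.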

In summary, when dealing with Web Services in a “WSDL-first” scenario and using externally defined schemas, I would recommend defining Global Elements in the Schema and then referencing these elements from a WSDL, rather than using Global Types. This enables the complete message to share the same namespace and enables all elements (and their names), in the message, to be completely defined and enforced by the Schema.

Note: I am not necessarily making a recommendation to use Global Elements over Global Types in all circumstances, rather, just in situations where a Schema could possibly be used directly by a SOAP Web Service description file (WSDL) in addition to other parts of a system.

Soundtrack for today: Gouge Away by Pixies

Thursday, August 21, 2008

Is a JVM's Maximum Heap Size really 1.7 GB?

Many claim that the maximum heap size for a JVM is 1.7 GB, but is this really true?

Well yes and no....

Probably yes, because in most cases the person asking the question is a developer using a Windows based PC. Invariably, the PC uses a Windows 32-bit OS on 32-bit hardware and the JVM in use is Sun's 32-bit Hotspot JVM. However, start to change any of these host environment characteristics and the maximum heap size limit can vary significantly.

In many production scenarios, the characteristics of the host environment are very different from development, such as having a 64-bit OS on 64-bit hardware (eg. Solaris/SPARC). However, it's often only the developers and not the system administrators that understand the subtleties of JVM memory management to any significant degree, so the whole discussion gets lost and everyone ends up assuming 1.7 GB is a hard heap size limit across the board.

So here's my attempt to map out the actual maximum heap sizes a little more clearly. Hopefully the figures in the table below are pretty accurate. I've tested many of these variables, but I've not had the luxury of having an environment to test them all. So please correct me if you have evidence of discrepancies.

The table assumes that the host machine has sufficient physical RAM installed and that other OS processes aren't consuming a significant portion of the available memory. It is assumed that the 64-bit OS is Windows, Linux or Solaris, the 32-bit hardware is x86 and the 64-bit hardware is x86-64, IA-64 or SPARC. Other operating system or hardware architectures may have different subtleties.

Depending on the JVM you use, your mileage may vary. For example, JRockit is pretty good at being able to hit the maximum heap size limits listed, whereas inherent limitations in other JVMs may prevent this.

It is worth discussing the Windows /3GB boot option (with PAE) a little further. Most JVMs, like Sun Hotspot 1.6, require a contiguous memory space. Using the /3GB option on Windows does not guarantee a single contiguous memory space dedicated to a single process. Therefore, this option cannot be used by JVMs that cannot cope with non-contiguous memory and these JVMs thus have a maximum heap size of 1.7 GB. JRockit can handle non-contiguous memory and therefore can use the /3GB option on Windows, enabling a maximum heap size of approximately 2.7 GB.

Why the difference between maximum OS process size and maximum heap size?

Well, the memory used by a JVM process is not just composed of heap. It is also made up of JVM internal structures, generated code, loaded native libraries and thread stacks, which are all a lot less predictable in size. As a result, the stated maximum JVM process memory size can only ever be approximate, allowing some headroom for non-heap memory usage. The more native C libraries and threads your Java application uses, the more memory outside of the heap will be consumed. Also, some JVMs, like Sun Hotspot, store class definitions outside of the heap (eg. in a “Perm Space”), further adding to the size of a JVM process over and above the maximum heap size.
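
To get a feel for this split on your own JVM, a small sketch like the one below prints the heap limit alongside the non-heap usage that the JVM itself reports, using the standard java.lang.management API. The HeapReport class name is just a placeholder. Bear in mind that the non-heap figure only covers the memory pools the JVM manages; native library allocations and thread stacks are not included, so the real process size will be larger again.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class HeapReport {
    // Converts a byte count to whole megabytes (rounding down).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Upper limit the JVM will attempt to use for the heap
        long maxHeap = Runtime.getRuntime().maxMemory();

        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        // Each live thread also needs its own stack outside of the heap
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();

        System.out.println("Max heap (MB):      " + toMegabytes(maxHeap));
        System.out.println("Heap used (MB):     " + toMegabytes(heap.getUsed()));
        System.out.println("Non-heap used (MB): " + toMegabytes(nonHeap.getUsed()));
        System.out.println("Live threads:       " + threads);
    }
}
```

Running it with different -Xmx values (eg. java -Xmx1700m HeapReport) is a quick way to find the point at which a particular JVM and OS combination refuses to start.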

So even though the figures in the table are achievable in certain circumstances, the actual choice of JVM and the nature of the hosted Java application and its usage of non-heap based memory may mean, in practice, that you are not able to get close to achieving the indicated maximum heap size for your application.

Supporting References:

[1] http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx
http://blogs.sun.com/moazam/entry/why_can_t_i_allocate
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4358809

[2] http://blogs.oracle.com/jrockit/2008/09/how_to_get_almost_3_gb_heap_on_windows.html (updated link following the move of the linked content from the old Dev2Dev site - 30-Jan-09)

[3] http://java.sun.com/docs/hotspot/HotSpotFAQ.html#gc_heap_32bit

Soundtrack for today: Radio Protector by 65daysofstatic

Thursday, May 8, 2008

Domain Health tool Vs WLDF for monitoring

[Originally posted on my old BEA Dev2Dev blog on May 8, 2008]

Note added 11-Dec-2009: The content of this blog entry is generally superseded by a newer blog entry which addresses the differences since the introduction of the WLDF harvesting capability to DomainHealth.

DomainHealth is an open source "zero-config" monitoring tool for BEA's WebLogic Application Server. It collects important server runtime statistics over time for all servers in a domain, archives this data into CSV files and provides a simple Web Browser based interface for viewing graphs of statistics, historically.

Version 0.7 has just been released and is available to download and use on WebLogic 9.x/10.x from the project's home page at: http://sourceforge.net/projects/domainhealth

Some people have asked why I have created the DomainHealth tool when WebLogic already provides the WebLogic Diagnostic Framework (WLDF). This is a good question which I will attempt to answer below.

First of all it is worth considering that monitoring can be divided into many sub-categories, three of which are:

Alerting of a potential issue which is about to occur or has just occurred (eg. not much memory left) enabling the administrator to take pro-active or remedial action
Real-time statistics capture and viewing, so an administrator can instantly gauge the health of his/her applications and systems
Continuous harvesting of application and system data over time to enable administrators to retrospectively analyse trends (eg. to identify potential tunings), and to diagnose the cause of fatal problems after the problem has occurred.

...plus many others like instrumenting, profiling, heartbeating.

WLDF in combination with the "WLDF Console Extension" provides the base technology to enable administrators to configure a set of WebLogic servers and environment which caters for all 3 monitoring capabilities listed above.

The philosophy of DomainHealth is "zero-config" and as such is intended for a certain type of user base for certain types of monitoring needs (ie. mainly point (3) and to a lesser extent point (2) above). For users who require more complex customisable monitoring capabilities, the WebLogic Diagnostic Framework (WLDF) is more likely to provide the capabilities desired. In some environments, a combination of using WLDF for (1) and DomainHealth for (2) and (3) may also be an option to consider.

Here's a list of benefits which I see for using Domain Health and for using WLDF and the WLDF Console Extension....

Domain Health (DH) benefits:

DH is easier to install (hot deploy of WAR to admin server) than WLDF which requires copying a JAR file to a specific path and then re-starting the admin server. The WLDF deployment process may be problematic when needing to retrospectively add monitoring capabilities to a production system which has already gone live.
DH in most cases requires no configuration. WLDF console has some built-in views which help reduce the number of configuration steps likely to be required. However, it is unlikely that WLDF's built-in views will be sufficient on their own. By default, the WLDF Console shows data cached on the live servers. For the WLDF Console to show historic data, the administrator must first manually configure (or more likely script the configuration of) a WLDF harvester, separately, to enable historical data to then be retrieved and shown in the console.
For current data, WLDF Console periodically polls the cache for live data and for historic data has to contact each server's WLDF Data Accessor to retrieve data from each server's local WLDF archive files. Overall, the performance impact of this process is likely to negate some of the performance gain, from not needing to poll servers remotely to harvest their properties in the first place.
The CSV output of DH is more friendly to generic third-party tools than WLDF is. For offline retrieval of WLDF archived data for use in such tools, scripting will be required to integrate Jython/WLST with the third-party tools.
DH's web-based visual display is more lightweight than the WLDF console extension for administrators, requiring just a simple browser rather than also needing a Java browser plugin to be installed.

WLDF (inc WLDF Console) benefits:

WLDF is infinitely configurable allowing administrators to track the exact server statistics that they are interested in. For DH, you are stuck with what has been prescribed as a suitable set of statistics.
A WLDF thread runs within each managed server for property retrieval, rather than requiring remote polling from an admin server. This may lessen any potential performance impact of performing monitoring in a live environment (however this gain is at least partly lost when using the WLDF console running on the Admin Server to then view data - this requires each server to be contacted to retrieve its locally archived statistics).
WLDF and the WLDF console are far more powerful and extensive in terms of the overall monitoring features that they provide (eg. provides visualisation for instrumentation of running code).

It is also worth noting that DomainHealth is not a replacement for a 'fully-fledged', 'off-the-shelf' server monitoring and management tool. Such tools are usually more capable and can also capture other statistics (eg. hardware, OS and network stats), giving a more holistic picture of the health of a server environment. However, in cases where the budget does not stretch to purchasing an off-the-shelf solution for production environments, or where a quick and easy tool is required for monitoring performance in performance test environments, DomainHealth may help to fill the gap.

In summary, what I would like to say is that DomainHealth is not intended as an alternative to WLDF and has been developed specifically for certain use cases for a certain subset of clients which I have dealt with over the years (and is indeed being used by some of these clients today).

Soundtrack for today: Nearly Lost You by Screaming Trees

Thursday, March 27, 2008

New Open Source WebLogic Monitoring tool

[Originally posted on my old BEA Dev2Dev blog on March 27, 2008]

DomainHealth is an open source server monitoring tool for BEA's WebLogic Application Server. It collects important server runtime statistics over time for all servers in a domain, archives this data into CSV files and provides a simple Web Browser based interface for viewing graphs of statistics.

DomainHealth is designed to have a low overhead on managed servers in terms of performance and is intended to be a 'zero config' tool *.

*(Currently in certain circumstances, some minimal configuration may be required - see the FAQ document on the project's host web site)

The statistics that DomainHealth collects include Core Server properties (eg. Open Sockets), JDBC Data Source properties (eg. Average Connection Delay) and JMS Destination properties (eg. Messages Current). The HTML based web interface that DomainHealth provides enables an administrator to view line-graphs for specific harvested server properties. The administrator can choose the window of time to view data for, which may go back hours, days, months and even years. Alternatively, an administrator can use his/her own tool of choice (eg. MS Excel, Open Office Spreadsheet) to analyse the contents of the generated CSV files (eg. to generate a graph view of some data from the CSV file loaded into a spreadsheet).

DomainHealth is deployed to the Admin Server of a WebLogic domain in the form of a J2EE Web-Application (WAR), and immediately collects properties from all servers in the domain, via a periodic polling mechanism, once a minute. All CSV files are generated and stored on the file-system of the Admin Server for subsequent viewing via the DomainHealth graphical web interface or for retrieval for offline analysis.
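
The poll-and-archive cycle described above can be sketched roughly as follows. To be clear, this is not DomainHealth's actual code: it samples the local JVM's platform MXBeans rather than the runtime MBeans of remote WebLogic servers, and the StatsPoller class, the CSV column layout and the file used are all assumptions made purely for illustration.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.Instant;
import java.util.Collections;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class StatsPoller {
    // Formats one sample as a CSV row: timestamp, heap bytes used, live threads.
    static String csvRow(Instant when, long heapUsedBytes, int threadCount) {
        return when + "," + heapUsedBytes + "," + threadCount;
    }

    // Takes one sample from the local JVM and appends it to the CSV file.
    static void sampleTo(Path csvFile) throws IOException {
        long heapUsed = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        Files.write(csvFile,
                Collections.singletonList(csvRow(Instant.now(), heapUsed, threads)),
                StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Starts periodic polling; the caller must shut the returned scheduler down.
    static ScheduledExecutorService start(Path csvFile, long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                sampleTo(csvFile);
            } catch (IOException e) {
                e.printStackTrace(); // keep polling even if one write fails
            }
        }, 0, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }

    public static void main(String[] args) throws IOException {
        Path csvFile = Files.createTempFile("stats", ".csv"); // hypothetical archive file
        sampleTo(csvFile); // a single sample, so that this demo terminates
        System.out.println(Files.readAllLines(csvFile, StandardCharsets.UTF_8));
    }
}
```

Calling StatsPoller.start(csvFile, 60) would give the once-a-minute cadence mentioned above, and the resulting file can be opened directly in a spreadsheet for offline analysis.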

Domain Health is hosted on SourceForge at: http://sourceforge.net/projects/domainhealth

The latest version can be downloaded from the project home page. The home page also contains links to various Help Documents plus an Online Forum for users to get help and provide feedback. To install and run the DomainHealth monitor application, follow these simple steps:

Deploy domainhealth-nn.war to WebLogic, targeted to the Admin Server of the domain only
Using a Web Browser, navigate to: http://adminhost:port/domainhealth

Supports WebLogic Server 9.x and 10.x.

Some people may ask why there is a need for this tool given the availability of WLDF in WebLogic. Personally I think these two tools are complementary and I intend to blog on my reasons for this opinion in the near future.

Soundtrack for today: Auto Rock by Mogwai

Monday, February 25, 2008

Web Services: RPC, REST and Messaging

[Originally posted on my old BEA Dev2Dev blog on February 25, 2008]

Choosing a model for interoperable communication in the enterprise

For the implementation of Web Services in the enterprise environment, I've seen many different technologies used. Recently, in my spare moments, I've reflected on this and have come to the conclusion that all these technologies tend to fit one of three models (or hybrids of these models).

I would summarise these three models as:

Remote Procedure Calls (RPC). A client-server based remotable pattern where a subset of an existing system's local functions is exposed pretty much 'as-is' over the wire to client programs.
Resource-oriented Create-Read-Update-Delete (CRUD). A client-server based resource-oriented pattern where the server-side provides a representation of a set of resources (often hierarchical) and exposes Create, Read, Update and Delete capabilities for these resources to client programs.
Messaging (eg. as commonly seen with Message Oriented Middleware and B2B). Messages or documents are passed asynchronously between peer systems in either, but not always both, directions.

Sometimes it's hard to distinguish between these models and where the boundaries lie. In fact, I don't think there are boundaries, only grey areas, and all three models lie in the same spectrum. I've attempted to visualise this spectrum in the diagram below.

Depending on where your implementation lies in the spectrum, the different highlighted characteristics should manifest themselves.

In the Web Services world, we may typically implement these three models using one of the following three approaches:

Remote Procedure Calls: SOAP using a synchronous RPC programming approach and, typically, generated 'skeletons/stubs' and some sort of Object-to-XML marshalling technology
Resource-oriented Create-Read-Update-Delete: REST or 'RESTful Web Services' or ROA, re-using World-Wide-Web based approaches and standards like HTTP and URIs
Messaging: SOAP using an asynchronous Message/Document passing approach where invariably the documents are defined by schemas and, often, the use of message-level (rather than transport-level) security elements is required

The confusing thing is the fact that SOAP can happily and naturally satisfy two of these three models (ie. both RPC and Messaging), albeit that these two models are polar opposites which causes much confusion and is probably part of the reason why SOAP gets a bad name in some quarters.

The problem is further compounded with SOAP because the SOAP specification and accompanying collateral talks about two possible 'Styles' (ie. 'RPC' style and 'Document' style). However as I explained here, just because your SOAP Web Services are marked as 'Document' style in your WSDLs, it doesn't mean you are not doing remote procedure call based programming.

When faced with the REST zealot or the WS-* zealot, we probably need to bear this spectrum in mind. For the Web Services paradigm, there is not a 'one-size fits all' and specific requirements for a given situation should dictate which position in this spectrum best lends itself to satisfying the requirements. Also, the overlap between the models may be greater than shown in the diagram. For example, some would argue that REST can happily and more appropriately be used to fulfil what would otherwise be RPC oriented problems, in addition to solving Resource-oriented CRUD style problems.

Soundtrack for today: Purity by The God Machine

Friday, August 10, 2007

Tips for Web Services Interoperability

[Originally posted on my old BEA Dev2Dev blog on August 10, 2007]

I try to follow some simple rules to maximise interoperability when developing Web Services using WebLogic and/or AquaLogic Service Bus (ALSB). Most of the rules are pretty obvious, but perhaps one or two are not. In case some of these rules are useful to others, I thought I'd share them, so here they are:

Use SOAP over HTTP. I've blogged here about why SOAP over JMS shouldn't be used if interoperability is a concern.

Conform to the WS-I Basic Profile 1.1 by using the free WS-I Test Tool. Test the WSDL and over-the-wire SOAP requests/responses for the created Web Services, for conformity using the tool available here (look for "Interoperability Testing Tools 1.1").

Expose Web Services using the "Document-Literal-Wrapped" style with the 'dotNetStyle' flag to help WS-I conformity and to be especially Microsoft product friendly. I partly covered this in the blog here.

Use the WS-* standards judiciously. WebLogic-implemented standards such as WS-Addressing, WS-Security, SAML and WS-ReliableMessaging are not necessarily implemented by other Web Services products/stacks, or the specification versions those products support may be different.

Don't necessarily dismiss the use of WebLogic 'add-value' / 'non-standard' Web Services features at face-value:
- 'Buffered' Web Services are interoperable with other client Web Services stacks at the basic SOAP-HTTP level because the service consumer is not aware that the service implementation uses a JMS queue for buffering internally.
- 'Callbacks' may be interoperable with non-WebLogic service consumers as long as the non-WebLogic consumers include the WS-Addressing 'ReplyTo' header in the request and provide a web service endpoint to be asynchronously called back on for the specified 'ReplyTo' URL.
- 'Asynchronous Requests/Responses' may be interoperable with non-WebLogic service providers as long as the non-WebLogic providers honour the received WS-Addressing 'ReplyTo' header of the request, by sending the Web Service response asynchronously to the specified 'ReplyTo' URL.
- However, 'Conversational' Web Services are highly unlikely to be interoperable with non-WebLogic based service providers or consumers. The specification 'WS-Conversation', which the 'Conversational' feature would probably most clearly map to, doesn't really exist as a public specification and there is no indication that it ever will (an incomplete internal draft version has been dormant for a few years now).

For SOAP/HTTP Proxies created in ALSB, activate the "WS-I compliance enforcement" option (for the development phase of the project at least). When ALSB is used to act as an intermediary between Web Services consumers and providers, this ALSB option will help any Web Service non-conformities to be detected, so that they can be quickly rectified.

Note: ALSB also transparently converts between SOAP version 1.1 and SOAP version 1.2 inbound and outbound messages and ALSB is specifically tested by BEA for interoperability against third-party vendor toolkits such as Microsoft .NET and Apache Axis.
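To make the 'Callbacks' and 'Asynchronous Requests/Responses' rules above a little more concrete, here is a minimal Python sketch of what honouring the WS-Addressing ReplyTo header involves on the receiving side. The endpoint URL, the message ID and the function name are hypothetical, and the WS-Addressing 1.0 namespace is an assumption (older stacks may use an earlier namespace version).

```python
import xml.etree.ElementTree as ET

NS = {
    "soapenv": "http://schemas.xmlsoap.org/soap/envelope/",  # SOAP 1.1 envelope
    "wsa": "http://www.w3.org/2005/08/addressing",           # assumed WS-Addressing version
}

# Hypothetical request from a consumer that wants an asynchronous reply.
REQUEST = """\
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:wsa="http://www.w3.org/2005/08/addressing">
  <soapenv:Header>
    <wsa:MessageID>urn:uuid:hypothetical-message-1</wsa:MessageID>
    <wsa:ReplyTo>
      <wsa:Address>http://consumer.example.com/callback</wsa:Address>
    </wsa:ReplyTo>
  </soapenv:Header>
  <soapenv:Body/>
</soapenv:Envelope>"""


def reply_to_address(envelope_xml):
    """Return the ReplyTo address, or None if the consumer did not send one."""
    root = ET.fromstring(envelope_xml)
    address = root.find("soapenv:Header/wsa:ReplyTo/wsa:Address", NS)
    if address is None or not address.text:
        return None
    return address.text.strip()


print(reply_to_address(REQUEST))
```

A provider that finds no address here has nowhere to send the asynchronous response, which is why the consumer must both send the header and host an endpoint at that URL.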

Soundtrack for today: Forensic Scene by Fugazi

Tuesday, July 31, 2007

RPC-Encoded. Document-Literal. Does it really matter?

[Originally posted on my old BEA Dev2Dev blog on July 31, 2007]

In SOAP, there are two possible styles:

RPC. Implies a SOAP body structure which indicates the operation name, and multiple parameters and return values
Document. Implies a SOAP body which is a complex message document

In SOAP, there are two possible uses:

Encoded. Adheres to a set of rules for serialising a graph of typed objects using basic XML schema data-types, but as a whole, does not conform to a schema
Literal. Body content conforms to a specific XML schema

In most SOAP toolkits, the most common combinations of Style and Use are RPC-Encoded and Document-Literal. Additionally, RPC-Literal is becoming more prevalent although it is currently a lot less common. Document-Encoded doesn't really make sense and as a result I doubt you'll find it implemented in your favourite SOAP toolkit.
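As a rough illustration of the two common combinations, the Python sketch below holds one hand-written request of each kind and tells the 'Use' apart by looking for the SOAP encodingStyle attribute. The namespace http://example.com/orders, the element names and the function name are all hypothetical, and the check is only a heuristic (encodingStyle may legally appear on other elements too).

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

# RPC-Encoded: the body names the operation and each parameter carries its type.
RPC_ENCODED = """\
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soapenv:Body>
    <ord:updateQuantity xmlns:ord="http://example.com/orders"
        soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <orderId xsi:type="xsd:int">42</orderId>
      <quantity xsi:type="xsd:int">3</quantity>
    </ord:updateQuantity>
  </soapenv:Body>
</soapenv:Envelope>"""

# Document-Literal: the body is a single document that a schema would define.
DOCUMENT_LITERAL = """\
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <ord:OrderAmendment xmlns:ord="http://example.com/orders">
      <ord:orderId>42</ord:orderId>
      <ord:quantity>3</ord:quantity>
    </ord:OrderAmendment>
  </soapenv:Body>
</soapenv:Envelope>"""


def classify_use(envelope_xml):
    """Heuristic: 'encoded' if the first body child declares an encodingStyle."""
    body = ET.fromstring(envelope_xml).find(f"{{{SOAP_ENV}}}Body")
    payload = body[0]
    return "encoded" if payload.get(f"{{{SOAP_ENV}}}encodingStyle") else "literal"


print(classify_use(RPC_ENCODED), classify_use(DOCUMENT_LITERAL))
```

Note that the sketch only classifies the 'Use'. The 'Style' cannot be reliably read off a single message, because a Document style body can look just like an RPC style one.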

RPC-Encoded was the initial message format for SOAP, when SOAP was originally aimed at just the Remote Procedure Call programming model. Document-Literal was incorporated into the SOAP standard in time for SOAP 1.0. It was intended to enable XML documents (messages) to be passed as the full content of the SOAP body, usually with one input message part and one return message part.

Like most J2EE Application Servers, the core of WebLogic's Web Services support is based on the JAX-RPC 1.1 specification. JAX-RPC defines a Remote Procedure Call based programming model and API for developers who want to expose a set of Java methods remotely (JAX-RPC does not offer much in the way of support for adopting the alternative distributed computing model of "Messaging").

Given that JAX-RPC is based on the RPC programming model, then in terms of best practices, it's obvious that we should be using RPC-Encoded (or RPC-Literal) as the preferred SOAP Style/Use for creating and exposing newly developed Web Services, isn't it?

Well, not necessarily....

The terminology of RPC versus Document SOAP Styles is very unfortunate when we start to consider Remote Procedure Call versus Document/Messaging distributed programming models. These terms imply that the RPC Style should be used for RPC programming models and that the Document Style should be used for Document (Messaging) programming models. That is not the case at all. In practice, the SOAP Style has nothing to do with a programming model; it merely dictates how to translate a WSDL binding to a SOAP message. For example, WebLogic's JAX-RPC toolkit equally supports exposing the same Java methods remotely via either style. You can use either style with either programming model.

A SOAP Style/Use of Document-Literal provides two distinct advantages over RPC-Encoded:

WS-I Basic Profile precludes the use of "Encoded" as the SOAP Use. So, if promoting interoperability and openness is your concern then you wouldn't choose RPC-Encoded over Document-Literal* (and why else would you be using SOAP other than for interoperability?).

RPC-Encoded provides no real separation between the format of the SOAP body (eg. that could be defined by a Schema) and the transport protocol and invocation format of a SOAP operation (eg. defined by a WSDL). For Document-Literal, the SOAP body content conforms to one or more Schemas which can optionally be externalised from the WSDL (and then included into the WSDL via an 'import' statement). Why is this important? Well, Document-Literal can promote the re-use of the same XML Schemas across the many different Web Services you may need to expose and throughout the rest of your distributed application logic which may need to deal with the same XML data formats. As we know, re-use reduces development effort, helps avoid errors and promotes consistency.

In summary, the decision of using RPC-Encoded or Document-Literal really doesn't have a direct relation to whether one is adopting a Remote Procedure Call programming model rather than a Document/Message-passing programming model. In practice, regardless of programming model, Document-Literal offers practical advantages.

Footnote: There is a new alternative Web Services toolkit which can be used, based on JAX-WS 2.0. JAX-WS offers developers an alternative programming model based on Messaging, in addition to the Remote Procedure Call programming model, which is also supported. JAX-WS is newly supported in WebLogic 10, with some restrictions.

* In fact, use WebLogic's 'Wrapped' option with "Document-Literal" to help further promote interoperability with both Microsoft based Web Services toolkits (which traditionally prefer the Document-Literal-Wrapped style) and Remote Procedure Call oriented client Web Services toolkits (which often expect to be able to include the 'remote operation' name within the SOAP requests they send, rather than using another mechanism such as WS-Addressing to identify the operation to invoke).
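To show why the 'Wrapped' variant suits RPC oriented toolkits, here is a small Python sketch of a receiver picking the operation out of the request itself: the single child of the SOAP body is a wrapper element named after the operation. The amendOrder operation, its namespace and the function name are hypothetical, not taken from any WebLogic API.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

# Hypothetical Document-Literal-Wrapped request: one wrapper element in the
# body, named after the operation being invoked.
WRAPPED_REQUEST = """\
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <ord:amendOrder xmlns:ord="http://example.com/orders">
      <ord:orderId>42</ord:orderId>
      <ord:quantity>3</ord:quantity>
    </ord:amendOrder>
  </soapenv:Body>
</soapenv:Envelope>"""


def operation_name(envelope_xml):
    """Return the local name of the single wrapper element in the SOAP body."""
    body = ET.fromstring(envelope_xml).find(f"{{{SOAP_ENV}}}Body")
    children = list(body)
    if len(children) != 1:
        raise ValueError("wrapped style expects exactly one child of the SOAP body")
    # ElementTree tags look like '{namespace}local'; keep the local part only.
    return children[0].tag.rsplit("}", 1)[-1]


print(operation_name(WRAPPED_REQUEST))
```

With an unwrapped Document-Literal body there is no such guarantee, so the receiver has to rely on something outside the body (such as a WS-Addressing header) to decide which operation to invoke.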

Soundtrack for today: Off To One Side by Come

Tuesday, July 24, 2007

WLST and SQLPlus on Linux - get UP, DOWN, LEFT, RIGHT and other keys working

[Originally posted on my old BEA Dev2Dev blog on July 24, 2007]

This has been bugging me for ages so today I spent some time trying to find a solution. I've managed to find one, so I thought I'd share it with you just in case it is useful to others....

When running command line tools in a Linux shell that read directly from standard input (rather than using the 'readline' library under the covers), the UP and DOWN arrow keys can't be used to select previous history commands. Also, the LEFT and RIGHT arrow keys and the HOME and END keys can't be used to move to different parts of the current text line to easily correct mistakes in the line before submitting.

As a JavaEE/WebLogic user I find this really frustrating when I need to use tools like WLST or SQLPlus in interactive mode to mess around with WebLogic domains or change Oracle database schemas, respectively. Well the solution is a nifty little tool called rlfe which I found here.

Once installed, before running 'wlst.sh' or 'sqlplus' from a shell, first enter...

> rlfe

...and press return. Then when you run WLST or SQLPlus from the same shell, the interactive mode will work the way you always wanted it to. You'll be able to press the UP key to get to a previous command to execute it and you'll be able to press the LEFT key to move further back in the line of text to correct a typo.

The good news if you're an Ubuntu user is that this is available in the standard universe repository. To install it, just run....

> sudo apt-get install rlfe

Also, to avoid forgetting to run 'rlfe' before running your program, you can use an alias for the program to ensure that your program is always run in this 'enhanced' text entry mode. For example, add the following to ~/.bashrc :

alias sqlplus='rlfe sqlplus'

...then whenever you run sqlplus from the command line it'll work the way you want.

[In a comment on the original Dev2Dev blog, 'eduardo_biagi' suggested an alternative tool called 'rlwrap' which works on Fedora. Since installing version 8.04 of Ubuntu, 'rlfe' does not work properly on Ubuntu, so I now use 'rlwrap'. For example, I have the following alias set up in my .bashrc: alias wlst='rlwrap /opt/oracle/wls1001/wlserver_10.0/common/bin/wlst.sh']

Soundtrack for today: Faded by The Afghan Whigs

Wednesday, March 28, 2007

The problem with using SOAP over JMS in SOA

[Originally posted on my old BEA Dev2Dev blog on March 28, 2007]

Sometimes I talk to people who seem to view the use of SOAP over JMS as the perfect combination to enable loosely coupled asynchronous shared services. However, when I dig deeper these people have invariably assumed that JMS is an 'over-the-wire' protocol, like HTTP. It is not.

Question: Why is this a problem?

Answer: Interoperability, plain and simple.

HTTP is a standard 'over-the-wire' protocol. HTTP belongs in the Application Layer of both the OSI model (layer 7) and the Internet Protocol Suite (layer 4 or 5). SOAP is a 'standard' (or W3C recommendation at least) transport agnostic protocol which uses an XML payload.

Due to the standard and technology agnostic nature of both SOAP and HTTP, many platforms and toolkits out there, written in different languages and on different operating systems, can interoperate using SOAP over HTTP by simply adhering to both of these standards (or at least the WS-I version of these standards).
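To illustrate how little a platform needs in order to take part, the Python sketch below assembles the bytes of a SOAP 1.1 request over HTTP using nothing but string handling. The host, path, SOAPAction value and function name are hypothetical, and nothing is actually sent over the network.

```python
def soap_http_request(host, path, soap_action, envelope):
    """Build the raw bytes of an HTTP POST that carries a SOAP 1.1 envelope."""
    body = envelope.encode("utf-8")
    head = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: text/xml; charset=utf-8",  # SOAP 1.1 media type
        f'SOAPAction: "{soap_action}"',           # quoted URI, as SOAP 1.1 expects
        f"Content-Length: {len(body)}",
        "Connection: close",
    ]
    # A blank line separates the HTTP headers from the XML payload.
    return "\r\n".join(head).encode("ascii") + b"\r\n\r\n" + body


# Hypothetical envelope with an empty body, just to have something to send.
ENVELOPE = (
    '<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soapenv:Body/></soapenv:Envelope>"
)

print(soap_http_request("provider.example.com", "/orders", "urn:amendOrder", ENVELOPE).decode("utf-8"))
```

Any language that can open a TCP socket and write text could produce the same request, with no vendor library involved.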

However, JMS is not an 'over-the-wire' protocol. It is a Java API which requires that a client application uses a JMS provider library (JAR) provided by the vendor of the JMS Server hosting the services. This is analogous to requiring a JDBC driver for a particular vendor's database before a Java application can talk to that database. The actual 'over-the-wire' protocol used under the covers within the JMS provider library is not defined (it could be IIOP for example, or it could be some high speed non-standard vendor specific protocol).

As a result, in most cases, the only types of applications which can talk to a specific vendor's JMS Server are other Java based applications. It gets worse. If, for example, the JMS server vendor is IBM WebSphere and the service consumer is running within Oracle's Application Server, there may be problems even getting IBM's JMS client provider library working from within Oracle's Application Server in the first place, due to JMS implementation clashes. Some JMS Server vendors provide one or two non-Java based JMS libraries too (for example for C++ or .NET), but these are often limited in functionality and scope and often only support specific versions of specific platforms and operating systems.

In other words, the onus of interoperability, when using SOAP over JMS, is on the support of the vendor of the JMS server for all possible service consumer environments rather than the onus being on the service client's host environment support for standards. Vendors cannot scale to provide JMS support for all of the wide mix of programming languages, application servers and operating systems (including different versions) out there, so interoperability will take a big hit. Even for consumer applications that can use the JMS provider, one has to give the service consumer the provider library first before it can invoke services - not very loosely coupled I think.

As a result, an enterprise's design choice to use SOAP over JMS, as the default mechanism for interoperability for an enterprise's mix of heterogeneous systems, is likely to be fundamentally flawed in my opinion.

It is important to state that I am not saying that Message Oriented Middleware (MOM) does not have a place in a SOA framework. In fact, quite the opposite is true. To achieve capabilities such as asynchronous messaging, guaranteed delivery, once-only delivery, and publish/subscribe mechanisms, MOMs are an essential part of the SOA fabric. That's why many vendors' ESB platforms are built on the underlying technology of Message Oriented Middleware. However, what I am saying is that JMS should not be the preferred API for exposing shared services to remote service clients. Using middleware such as an ESB, a service with an asynchronous interface can be exposed via a SOAP over HTTP interface, where the ESB performs the switching between the consumer-facing synchronous invocation protocol and the underlying internal asynchronous message passing mechanism, which may or may not use JMS internally.
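The switching role described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: an in-process queue stands in for the internal messaging layer, a background thread stands in for the asynchronous back-end, and the class and method names are hypothetical rather than any ESB's API.

```python
import queue
import threading
from concurrent.futures import Future


class SyncToAsyncBridge:
    """Accept blocking calls from consumers and hand them to an internal queue."""

    def __init__(self, handler):
        self._handler = handler         # hypothetical back-end service logic
        self._requests = queue.Queue()  # stands in for a JMS queue or similar
        threading.Thread(target=self._drain, daemon=True).start()

    def _drain(self):
        """Back-end side: consume messages asynchronously and publish each reply."""
        while True:
            payload, reply = self._requests.get()
            try:
                reply.set_result(self._handler(payload))
            except Exception as exc:    # surface failures to the waiting caller
                reply.set_exception(exc)

    def invoke(self, payload, timeout=5.0):
        """Consumer side: behaves like a synchronous request/response call."""
        reply = Future()
        self._requests.put((payload, reply))
        return reply.result(timeout=timeout)


bridge = SyncToAsyncBridge(lambda payload: payload.upper())
print(bridge.invoke("ping"))
```

The consumer only ever sees the blocking invoke call, so the queue behind it can be swapped for another messaging technology without the consumer needing a vendor library.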

With the right organisation and governance in place, I believe it can be valid to decide to expose a shared service via SOAP/JMS in addition to SOAP/HTTP or another more 'open' protocol, where there are valid exceptional circumstances (eg. high performance requirements). However, it is probably best to treat these decisions on an exception by exception basis because supporting two access methods for a service carries additional overhead due to increased configuration, maintenance, and testing costs.

Is HTTP the perfect transport for SOAP, especially for asynchronous services? Not at all. However, if consumers can't invoke these services in the first place - that's worse.

Have I got something against JMS? Not at all. It's one of my favourite JavaEE APIs. I'm not talking about JavaEE here. I am talking about SOA.

Soundtrack for today: Happy Man by Sparklehorse