Thursday, July 8, 2010

Extending schema with redefine and Java


Consider a system that exposes an external interface (say, a web service offered to external clients). For this example we look only at the XSD, and assume that it is imported into a WSDL and used there. Internally, the system needs to maintain an enhanced information model that adds several attributes to the complex types of the external XSD. The internal model is not a new model, just an enhanced one, so any addition or change to the external XSD requires a matching change in the internal XSD.

Possible solutions

  1. Use separate XSDs for the external and internal interfaces, copying all attributes. The disadvantage of this approach is the maintenance of the XSDs: any change has to be made in both places. The advantage is that the internal and external XSDs are decoupled and can be maintained separately. We do not need that advantage in our case, and the duplication would only be a maintenance overhead.

  2. Import the external XSD and reuse its elements wherever possible, either by extending them or by using them as is. This is not a very clean approach and results in a lot of repeated definitions, though fewer than option 1.

  3. Redefine the elements that need enhancing and use the redefined types for further processing. This is expected to be a much cleaner approach.

The difference between extending and redefining elements in XML Schema is explained in detail below.

The external XSD and a sample XML are shown below:

Fig. external xsd

Fig. external sample xml

The external interface contains an element Request with two child elements, name and child. The child element is a complex type with an element t1. The internal interface needs to enhance the child complex type by adding two more elements, val1 and val2.

Using the extension construct, the schema and the sample XML look like this:

Fig. internal xsd using extends construct

Fig. internal sample xml

As seen above, when the child type is extended, the parent element that refers to it has to be defined again as well. This leads to a very complicated XSD and limits the reusability of the defined types. The problem can be overcome by using the redefine construct, as shown below.

Fig. internal xsd using redefine and extends construct

Fig. sample xml for the above xsd
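
Since the figures are not reproduced here, the following is only a minimal sketch of the redefine approach, checked with the standard javax.xml.validation API. The element names Request, name, child, t1, val1 and val2 come from the description above; the type name ChildType, the file names external.xsd and internal.xsd, the absence of a target namespace and both schema texts are assumptions made for illustration, not the original schemas.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class RedefineSketch {

    // Hypothetical external schema: Request contains name and child (ChildType with t1).
    static final String EXTERNAL_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:complexType name='ChildType'><xs:sequence>"
      + "<xs:element name='t1' type='xs:string'/>"
      + "</xs:sequence></xs:complexType>"
      + "<xs:element name='Request'><xs:complexType><xs:sequence>"
      + "<xs:element name='name' type='xs:string'/>"
      + "<xs:element name='child' type='ChildType'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Hypothetical internal schema: redefines ChildType by extending itself with val1 and val2.
    static final String INTERNAL_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:redefine schemaLocation='external.xsd'>"
      + "<xs:complexType name='ChildType'><xs:complexContent>"
      + "<xs:extension base='ChildType'><xs:sequence>"
      + "<xs:element name='val1' type='xs:string'/>"
      + "<xs:element name='val2' type='xs:string'/>"
      + "</xs:sequence></xs:extension>"
      + "</xs:complexContent></xs:complexType>"
      + "</xs:redefine>"
      + "</xs:schema>";

    /** Returns true if the XML is valid against the internal (redefined) schema. */
    public static boolean isValidInternal(String xml) throws Exception {
        // Write both schemas to a temporary directory so that the relative
        // schemaLocation in xs:redefine can be resolved.
        Path dir = Files.createTempDirectory("redefine-sketch");
        Files.write(dir.resolve("external.xsd"), EXTERNAL_XSD.getBytes(StandardCharsets.UTF_8));
        Path internal = dir.resolve("internal.xsd");
        Files.write(internal, INTERNAL_XSD.getBytes(StandardCharsets.UTF_8));

        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(internal.toFile());
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // validation error
        }
    }

    public static void main(String[] args) throws Exception {
        String enhanced = "<Request><name>n</name><child><t1>a</t1>"
                        + "<val1>b</val1><val2>c</val2></child></Request>";
        System.out.println("enhanced message valid: " + isValidInternal(enhanced));
    }
}
```

Note that the Request element is never declared again in the internal schema: it picks up the redefined ChildType automatically, which is the reuse benefit described above.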

Tool and API support:

JAX-RPC and redefine

According to the JAX-RPC 1.1 specification:

The following XML Schema features are not required to be supported and WSDL to Java mapping tools are allowed to reject documents that use them: xsd:redefine, xsd:notation, substitution groups.

JDeveloper does not generate the proxy for a web service whose schema contains the redefine element.

JAXB 1 and Redefine

JAXB 1 does not support the XML Schema redefine construct. If a schema includes such an unsupported construct, xjc reports an error when you try to generate Java classes from it.

JAXB 2 and Redefine

XJC for JAXB 2 successfully generates the proxy for a web service that uses the redefine element.

SOAP-UI and redefine

SOAP UI does not support the use of redefine elements. I have raised the following bug

XML SPY and redefine

I am using XML Spy 2008. This successfully generates a sample SOAP message from a WSDL containing the redefine element.

WS-Interoperability Basic Profile 1.1

The redefine element is compliant with the profile and does not cause any errors.


The redefine construct provides a flexible way of extending schema definitions. Tool support for the construct is limited but improving. The redefine construct cannot be used with JAX-RPC, but it works with JAXB 2, so any web services programming model that uses JAXB 2, such as JAX-WS or Spring Web Services, can be used.

Wednesday, July 7, 2010

Anatomy of a signed SOAP message

I will explain the anatomy of a WSS-signed web service SOAP message, signed using an X.509 certificate.

The sample signed message is:

The following illustrates the anatomy of the message
1. SignedInfo

The SignedInfo element describes the signed content of the message.

1.1. CanonicalizationMethod
The CanonicalizationMethod element describes the canonicalization algorithm applied to the SignedInfo element before it is digested and signed.
1.2. SignatureMethod
The SignatureMethod element describes the algorithm used to generate the SignatureValue from the output of the canonicalization algorithm.
1.3. Reference
The optional URI attribute of the Reference element identifies the data object that was signed.

In the above case the body is being signed, so the URI attribute refers to the SOAP body.
The Algorithm attribute of Transform indicates the transformation applied to the referenced data before it is digested. It repeats the canonicalization algorithm because the two are applied to different data: CanonicalizationMethod canonicalizes the SignedInfo element, while the Transform canonicalizes the referenced data (here the SOAP body) before its digest is computed.
The Algorithm attribute of DigestMethod indicates the algorithm used to generate the digest value, and DigestValue contains the computed digest value.
SignatureValue contains the signature value, that is, the signature computed over the canonicalized SignedInfo element (which includes the digest value). It is the output of the algorithm indicated by SignatureMethod.
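
To make the relationship between DigestValue and SignatureValue concrete, here is a rough sketch using only the JDK's java.security classes. It assumes SHA-1 as the digest algorithm and RSA-SHA1 as the signature method (the sample message is not reproduced here, so the actual algorithms may differ), it uses a freshly generated key pair instead of the .pfx certificate, and the SignedInfo text is a simplified, hypothetical stand-in that is treated as already canonicalized.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

public class SignedInfoSketch {

    /** DigestValue: Base64 of the SHA-1 hash of the already canonicalized referenced data. */
    public static String digestValue(byte[] canonicalData) throws Exception {
        byte[] hash = MessageDigest.getInstance("SHA-1").digest(canonicalData);
        return Base64.getEncoder().encodeToString(hash);
    }

    /** SignatureValue: Base64 of the RSA-SHA1 signature over the canonicalized SignedInfo. */
    public static String signatureValue(byte[] canonicalSignedInfo, PrivateKey key) throws Exception {
        Signature signer = Signature.getInstance("SHA1withRSA");
        signer.initSign(key);
        signer.update(canonicalSignedInfo);
        return Base64.getEncoder().encodeToString(signer.sign());
    }

    /** Verification repeats the computation with the public key from the certificate. */
    public static boolean verify(byte[] canonicalSignedInfo, String signatureValue, PublicKey key)
            throws Exception {
        Signature verifier = Signature.getInstance("SHA1withRSA");
        verifier.initVerify(key);
        verifier.update(canonicalSignedInfo);
        return verifier.verify(Base64.getDecoder().decode(signatureValue));
    }

    public static void main(String[] args) throws Exception {
        // Throwaway key pair standing in for the certificate's keys (assumption).
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair pair = generator.generateKeyPair();

        // Hypothetical, already canonicalized body and simplified SignedInfo.
        byte[] body = "<Body Id=\"body\">hello</Body>".getBytes(StandardCharsets.UTF_8);
        String signedInfo = "<SignedInfo><Reference URI=\"#body\"><DigestValue>"
                + digestValue(body) + "</DigestValue></Reference></SignedInfo>";
        byte[] signedInfoBytes = signedInfo.getBytes(StandardCharsets.UTF_8);

        String signature = signatureValue(signedInfoBytes, pair.getPrivate());
        System.out.println("verifies: " + verify(signedInfoBytes, signature, pair.getPublic()));
    }
}
```

The body is protected indirectly: changing it changes the DigestValue, which changes the SignedInfo bytes, which invalidates the SignatureValue.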

The signed data contains a core bare name reference (as defined by the XPointer specification [XPointer]) to the element that contains the referenced security token, or a core reference to the external data source containing the security token.
In this example the BinarySecurityToken contains the Base64-encoded X.509 certificate, whose public key can be used for verification.

The signed content was created using a Microsoft .pfx file containing X.509 certificates. The public key can be recovered from the BinarySecurityToken element.
Sample code to generate a .cer file from the BinarySecurityToken:

import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;

public class Base64ToCer {

    // from tag BinarySecurityToken
    // (placeholder: paste the Base64 text of the token here)
    static String b64Str = "";

    // Maps a single Base64 character to its 6-bit value.
    public static int decode(char c) {
        if (c >= 'A' && c <= 'Z')
            return c - 65;
        else if (c >= 'a' && c <= 'z')
            return c - 97 + 26;
        else if (c >= '0' && c <= '9')
            return c - 48 + 26 + 26;
        switch (c) {
        case '+':
            return 62;
        case '/':
            return 63;
        case '=':
            return 0;
        default:
            throw new RuntimeException(
                    new StringBuffer("unexpected code: ").append(c).toString());
        }
    }

    // Decodes a Base64 string, skipping whitespace between 4-character groups.
    public static byte[] decode(String s) {

        int i = 0;
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        int len = s.length();

        while (true) {
            while (i < len && s.charAt(i) <= ' ')
                i++;

            if (i == len)
                break;

            int tri = (decode(s.charAt(i)) << 18)
                    + (decode(s.charAt(i + 1)) << 12)
                    + (decode(s.charAt(i + 2)) << 6)
                    + (decode(s.charAt(i + 3)));

            // Padding ('=') marks the end of the data.
            bos.write((tri >> 16) & 255);
            if (s.charAt(i + 2) == '=')
                break;
            bos.write((tri >> 8) & 255);
            if (s.charAt(i + 3) == '=')
                break;
            bos.write(tri & 255);

            i += 4;
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] back = decode(b64Str);
        OutputStream out = new FileOutputStream("aa.cer");
        out.write(back);
        out.close();
        //perform your exception handling
    }
}
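
On Java 8 or later the hand-written decoder is not strictly needed. The sketch below is an alternative, not the code used for this post: it decodes the token text with java.util.Base64 and parses the bytes with CertificateFactory. It assumes the token holds a single DER-encoded X.509 certificate; the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import java.util.Base64;

public class TokenToCertificate {

    /** Decodes the Base64 text of the token; the MIME decoder ignores line breaks. */
    public static byte[] decodeToken(String base64Text) {
        return Base64.getMimeDecoder().decode(base64Text);
    }

    /** Parses DER bytes into a certificate (assumes an X.509 certificate token). */
    public static X509Certificate parseCertificate(byte[] der) throws Exception {
        CertificateFactory factory = CertificateFactory.getInstance("X.509");
        return (X509Certificate) factory.generateCertificate(new ByteArrayInputStream(der));
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: TokenToCertificate <base64 text of BinarySecurityToken>");
            return;
        }
        X509Certificate cert = parseCertificate(decodeToken(args[0]));
        System.out.println("subject:   " + cert.getSubjectX500Principal());
        System.out.println("algorithm: " + cert.getPublicKey().getAlgorithm());
    }
}
```

The bytes returned by decodeToken are the same bytes the sample above writes to the .cer file.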

4. KeyInfo

In order to ensure a consistent processing model across all the token types supported by WSS: SOAP Message Security, the <wsse:SecurityTokenReference> element is used to specify all references to X.509 token types in signature or encryption elements that comply with this profile.
The <wsse:SecurityTokenReference> element contains a <wsse:KeyIdentifier> element that specifies the token data by means of an X.509 SubjectKeyIdentifier reference.

Normalization and Canonicalization of XML

Normalized XML is XML whose white space has been normalized.
White space normalization can be requested by using the following schema types:
  • xsd:normalizedString
  • xsd:token
These types do not restrict the use of white space; rather, they instruct the processor to normalize it (according to their respective rules).
E.g. xsd:token collapses runs of white space into a single space and strips leading and trailing white space, so for an element defined in the XSD as
<xs:element name="tkn" type="xs:token"/>

the value can be provided as:-
<tkn>toks        en     </tkn>

This does not result in a schema validation error, but the parser should treat it as a string with the following value:
<tkn>toks en</tkn>
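
The two rules can be sketched in a few lines of Java. This is only an illustration of the "replace" rule (xsd:normalizedString) and the "collapse" rule (xsd:token) as defined by XML Schema, not code taken from any parser; the class and method names are made up.

```java
public class XsdWhitespace {

    /** whiteSpace="replace" (xsd:normalizedString): tab, LF and CR each become a space. */
    public static String replace(String value) {
        return value.replaceAll("[\\t\\n\\r]", " ");
    }

    /** whiteSpace="collapse" (xsd:token): replace, merge runs of spaces, strip both ends. */
    public static String collapse(String value) {
        return replace(value).replaceAll(" +", " ").replaceAll("^ | $", "");
    }

    public static void main(String[] args) {
        // The tkn value from the example above.
        System.out.println("[" + collapse("toks        en     ") + "]");
    }
}
```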

Canonical form of XML
The canonical form of an XML document is the physical representation of the document produced by the following method:
  • The document is encoded in UTF-8
  • Line breaks normalized to #xA on input, before parsing
  • Attribute values are normalized, as if by a validating processor
  • Character and parsed entity references are replaced
  • CDATA sections are replaced with their character content
  • The XML declaration and document type declaration (DTD) are removed
  • Empty elements are converted to start-end tag pairs
  • Whitespace outside of the document element and within start and end tags is normalized
  • All whitespace in character content is retained (excluding characters removed during line feed normalization)
  • Attribute value delimiters are set to quotation marks (double quotes)
  • Special characters in attribute values and character content are replaced by character references
  • Superfluous namespace declarations are removed from each element
  • Default attributes are added to each element
  • Lexicographic order is imposed on the namespace declarations and attributes of each element
The rules for the canonical form of XML are very detailed, but they do not cover the normalization of element content. The two forms complement each other.

The canonical form is very useful when generating a hash of the XML, and it is used when computing the digest and signature values of a WS-Security signed message.
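
A small sketch of why this matters for hashing: two serializations that carry the same information produce different hashes until both are brought to the same canonical bytes. The code does not implement canonicalization. The canonical string in it was written by hand from the rules listed above (attributes sorted, empty element expanded to a start-end tag pair), and SHA-1 is an assumed hash algorithm.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class CanonicalHashSketch {

    /** Hex-encoded SHA-1 hash of the UTF-8 bytes of the given text. */
    public static String sha1Hex(String text) throws Exception {
        byte[] hash = MessageDigest.getInstance("SHA-1")
                .digest(text.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Same information, different physical representation.
        String first = "<a c='2' b=\"1\"/>";
        String second = "<a b=\"1\"   c=\"2\"></a>";
        // Hand-written canonical form shared by both documents.
        String canonical = "<a b=\"1\" c=\"2\"></a>";

        System.out.println("raw hashes equal: " + sha1Hex(first).equals(sha1Hex(second)));
        System.out.println("canonical hash:   " + sha1Hex(canonical));
    }
}
```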