Deserializing Partial XML in Java

So you are hitting a legacy XML service in your brand-new Java app, and need to extract the useful information from the response. Because the service is legacy, and XML, it of course is not in a format you really want to replicate. Let’s take a look.

XML

Here is the XML document we will be looking at for this exercise.

<example:Response xmlns:example="ns:example:cwm:1.0" xmlns="http://www.hr-xml.org/3" xmlns:oa="http://www.openapplications.org/oagis/9"
		type="requisitions" count="2" id="b68c2be1-cdf1-4682-b234-13f3992cf14f"
		
		truncated="false">
    <StaffingOrder languageCode="en-US">
        <StaffingOrderTypeCode>Order</StaffingOrderTypeCode>
        <StaffingOrderStatusCode>Active</StaffingOrderStatusCode>
        <CustomerParty>
            <PartyID schemeID="Buyer Org ID">142868</PartyID>
            <PartyName>Child ABC Test (Buyer)</PartyName>
            <PartyReportingIDs>
                <ID schemeID="Requisition ID">2365695</ID>
            </PartyReportingIDs>
        </CustomerParty>
        <SupplierParty>
            <PartyID>142789</PartyID>
            <PartyName>XYZ Test (Supplier)</PartyName>
        </SupplierParty>
        <CustomerReportingRequirements>
			
			
		</CustomerReportingRequirements>
        <MultiVendorDistributionIndicator>false</MultiVendorDistributionIndicator>
        <StaffingRequisition>
            <PositionTitle>Daily Job</PositionTitle>
            <PositionLocation>
                <LocationID>CO </LocationID>
                <Address>
                    <oa:CountrySubDivisionCode>CO</oa:CountrySubDivisionCode>
                    <CountryCode>USA</CountryCode>
                    <oa:PostalCode>N/A</oa:PostalCode>
                </Address>
            </PositionLocation>
            <PositionOpenQuantity>270</PositionOpenQuantity>
            <JobCategoryCode>Consulting</JobCategoryCode>
            <CareerLevelCode>Level I</CareerLevelCode>
            <StaffingAvailability>
                <StartDate>
                    <FormattedDateTime>2020-03-30</FormattedDateTime>
                </StartDate>
                <EndDate>
                    <FormattedDateTime>2021-04-07</FormattedDateTime>
                </EndDate>
            </StaffingAvailability>
            <Shift>
                <ShiftID>Full-Time</ShiftID>
                <ShiftDescription>8:00am to 5:00pm</ShiftDescription>
            </Shift>
            <StaffingRate>
                <RateTypeCode>pay</RateTypeCode>
                <RateClassCode>Regular</RateClassCode>
                <oa:Amount currencyID="USD">22</oa:Amount>
                <PayRateIntervalCode>Hour</PayRateIntervalCode>
                <CustomerRateClass>Junior</CustomerRateClass>
            </StaffingRate>
            <StaffingRate>
                <RateTypeCode>pay</RateTypeCode>
                <RateClassCode>Overtime</RateClassCode>
                <oa:Amount currencyID="USD">44</oa:Amount>
                <PayRateIntervalCode>Hour</PayRateIntervalCode>
                <CustomerRateClass>Junior</CustomerRateClass>
            </StaffingRate>
            <StaffingRate>
                <RateTypeCode>pay</RateTypeCode>
                <RateClassCode>Doubletime</RateClassCode>
                <oa:Amount currencyID="USD">66</oa:Amount>
                <PayRateIntervalCode>Hour</PayRateIntervalCode>
                <CustomerRateClass>Junior</CustomerRateClass>
            </StaffingRate>
            <StaffingRate>
                <RateTypeCode>bill</RateTypeCode>
                <RateClassCode>Regular</RateClassCode>
                <oa:Amount currencyID="USD">22.04</oa:Amount>
                <PayRateIntervalCode>Hour</PayRateIntervalCode>
                <CustomerRateClass>Junior</CustomerRateClass>
            </StaffingRate>
            <StaffingRate>
                <RateTypeCode>bill</RateTypeCode>
                <RateClassCode>Overtime</RateClassCode>
                <oa:Amount currencyID="USD">44.18</oa:Amount>
                <PayRateIntervalCode>Hour</PayRateIntervalCode>
                <CustomerRateClass>Junior</CustomerRateClass>
            </StaffingRate>
            <StaffingRate>
                <RateTypeCode>bill</RateTypeCode>
                <RateClassCode>Doubletime</RateClassCode>
                <oa:Amount currencyID="USD">66.4</oa:Amount>
                <PayRateIntervalCode>Hour</PayRateIntervalCode>
                <CustomerRateClass>Junior</CustomerRateClass>
            </StaffingRate>
            <UserArea>
                <example:JobSubmissionDate>2020-04-07T18:49:05Z</example:JobSubmissionDate>
            </UserArea>
        </StaffingRequisition>
    </StaffingOrder>
    <StaffingOrder languageCode="en-US">
        <StaffingOrderTypeCode>Order</StaffingOrderTypeCode>
        <StaffingOrderStatusCode>Active</StaffingOrderStatusCode>
        <CustomerParty>
            <PartyID schemeID="Buyer Org ID">142868</PartyID>
            <PartyName>Child ABC Test (Buyer)</PartyName>
            <PartyReportingIDs>
                <ID schemeID="Requisition ID">2366194</ID>
            </PartyReportingIDs>
        </CustomerParty>
        <SupplierParty>
            <PartyID>142789</PartyID>
            <PartyName>XYZ Test (Supplier)</PartyName>
        </SupplierParty>
        <CustomerReportingRequirements>
			
			
		</CustomerReportingRequirements>
        <MultiVendorDistributionIndicator>false</MultiVendorDistributionIndicator>
        <StaffingRequisition>
            <PositionTitle>Daily Job</PositionTitle>
            <PositionLocation>
                <LocationID>CO </LocationID>
                <Address>
                    <oa:CountrySubDivisionCode>CO</oa:CountrySubDivisionCode>
                    <CountryCode>USA</CountryCode>
                    <oa:PostalCode>N/A</oa:PostalCode>
                </Address>
            </PositionLocation>
            <PositionOpenQuantity>10</PositionOpenQuantity>
            <JobCategoryCode>Consulting</JobCategoryCode>
            <CareerLevelCode>Level I</CareerLevelCode>
            <StaffingAvailability>
                <StartDate>
                    <FormattedDateTime>2020-04-01</FormattedDateTime>
                </StartDate>
                <EndDate>
                    <FormattedDateTime>2020-04-08</FormattedDateTime>
                </EndDate>
            </StaffingAvailability>
            <Shift>
                <ShiftID>Full-Time</ShiftID>
                <ShiftDescription>8:00am to 5:00pm</ShiftDescription>
            </Shift>
            <StaffingRate>
                <RateTypeCode>pay</RateTypeCode>
                <RateClassCode>Regular</RateClassCode>
                <oa:Amount currencyID="USD">22</oa:Amount>
                <PayRateIntervalCode>Hour</PayRateIntervalCode>
                <CustomerRateClass>Junior</CustomerRateClass>
            </StaffingRate>
            <StaffingRate>
                <RateTypeCode>pay</RateTypeCode>
                <RateClassCode>Overtime</RateClassCode>
                <oa:Amount currencyID="USD">44</oa:Amount>
                <PayRateIntervalCode>Hour</PayRateIntervalCode>
                <CustomerRateClass>Junior</CustomerRateClass>
            </StaffingRate>
            <StaffingRate>
                <RateTypeCode>pay</RateTypeCode>
                <RateClassCode>Doubletime</RateClassCode>
                <oa:Amount currencyID="USD">66</oa:Amount>
                <PayRateIntervalCode>Hour</PayRateIntervalCode>
                <CustomerRateClass>Junior</CustomerRateClass>
            </StaffingRate>
            <StaffingRate>
                <RateTypeCode>bill</RateTypeCode>
                <RateClassCode>Regular</RateClassCode>
                <oa:Amount currencyID="USD">22.04</oa:Amount>
                <PayRateIntervalCode>Hour</PayRateIntervalCode>
                <CustomerRateClass>Junior</CustomerRateClass>
            </StaffingRate>
            <StaffingRate>
                <RateTypeCode>bill</RateTypeCode>
                <RateClassCode>Overtime</RateClassCode>
                <oa:Amount currencyID="USD">44.18</oa:Amount>
                <PayRateIntervalCode>Hour</PayRateIntervalCode>
                <CustomerRateClass>Junior</CustomerRateClass>
            </StaffingRate>
            <StaffingRate>
                <RateTypeCode>bill</RateTypeCode>
                <RateClassCode>Doubletime</RateClassCode>
                <oa:Amount currencyID="USD">66.4</oa:Amount>
                <PayRateIntervalCode>Hour</PayRateIntervalCode>
                <CustomerRateClass>Junior</CustomerRateClass>
            </StaffingRate>
            <UserArea>
                <example:JobSubmissionDate>2020-04-08T16:03:19Z</example:JobSubmissionDate>
            </UserArea>
        </StaffingRequisition>
    </StaffingOrder>
</example:Response>

This is HR-XML, representing a couple of Requisitions being submitted for matching to candidates. We will be grabbing out only a small portion of the data in this document.

Jackson

Since version 2, Jackson can serialize and deserialize XML as well as JSON (XML support comes from its jackson-dataformat-xml module). This is a bit confusing, as a lot of the annotations for JSON are also used for XML. But that is also great, as it promises you can easily swap between the formats; for example, you can read XML and write JSON. This package would be great to use if we needed the entire document. But for this use case, it falls short.

  @JsonIgnoreProperties(ignoreUnknown = true)
  @JacksonXmlRootElement(localName = "Response")
  public class Response {
    @JacksonXmlElementWrapper(useWrapping = false)
    @JsonProperty("StaffingOrder")
    List<Job> jobs;

    public List<Job> getJobs() {
      return jobs;
    }

    public void setJobs(List<Job> jobs) {
      this.jobs = jobs;
    }

    @Override public String toString() {
      return new StringJoiner(", ", Response.class.getSimpleName() + "[",
          "]").add("jobs=" + jobs).toString();
    }
  }

This is simple. If you’ve used Jackson before, most of the annotations will be familiar. @JacksonXmlRootElement sets Response as the root element, and @JacksonXmlElementWrapper(useWrapping = false) tells Jackson that the list is not wrapped. That just means the StaffingOrder elements are repeated directly under the root, without being wrapped in some other kind of list element.

  @JsonIgnoreProperties(ignoreUnknown = true)
  public class Job {

    @JsonProperty("CustomerParty")
    public PartyWithRequisition buyerInfo;
    @JsonProperty("SupplierParty")
    public Party supplierInfo;

    @Override public String toString() {
      return new StringJoiner(", ", Job.class.getSimpleName() + "[",
          "]").add("buyerInfo=" + buyerInfo).add("supplierInfo=" + supplierInfo)
          .toString();
    }
  }

This class looks very familiar. We can already see duplication with the ignore-unknown-properties annotation, though; we should be able to set that up globally on the ObjectMapper instead. There is a slight difference between the two Parties, in that one has an extra element we care about.

  @JsonIgnoreProperties(ignoreUnknown = true)
  public class Party
  {
    @JsonProperty("PartyID")
    public String id;

    @Override public String toString() {
      return new StringJoiner(", ", Party.class.getSimpleName() + "[", "]")
          .add("id='" + id + "'").toString();
    }
  }

  @JsonIgnoreProperties(ignoreUnknown = true)
  public class PartyWithRequisition extends Party
  {
    @JsonProperty("PartyReportingIDs")
    public Requisition requisition;

    @Override public String toString() {
      return new StringJoiner(", ", Party.class.getSimpleName() + "[", "]")
          .add("id='" + id + "'").add("requisition=" + requisition).toString();
    }
  }

Simple again. Our base Party just has an id field (note that it is a String rather than a long or int). PartyWithRequisition extends Party and just includes an extra Requisition object. That one again is trivial; so trivial I won’t bore you with code that looks like what we’ve already seen.

The Problems

There are problems with this code. When we run a simple program that just prints out what we read, we end up with this:

Response[jobs=[
  Job[buyerInfo=Party[id='142868', requisition=Requisition[id='2366194']], supplierInfo=null], 
  Job[buyerInfo=null, supplierInfo=null], 
  Job[buyerInfo=null, supplierInfo=null], 
  Job[buyerInfo=null, supplierInfo=null]
]]

This is also after commenting out a section of the XML, because Jackson was complaining about unmarshalling “false” to a String. For some reason, it sees 4 Jobs rather than the 2 expected, and only 1 of them has any data pulled out, and even that one is missing the supplier data. That obviously isn’t going to work, yet this is as close as I could get using Jackson.

Again, if you need the entire XML document in your Java POJO, I think Jackson would be able to work just fine. I believe the problems I have here are related mostly to how I am trying to pick out the few elements I really want. There is a more standard way to just pick out elements from an XML file, so we will investigate that approach next.

XPath

The next obvious approach would be to use XPath. Our problem fits with XPath very nicely. The only reason this was not my first go-to was that it will require a lot of manual object building. Oh well. Let’s see how it works.

If you are not familiar with the XPath language, check out the W3Schools tutorial on it.

public static class Job {
  private long buyerId;
  private long jobId;
  private long supplierId;

  //Getters and setters omitted

  @Override public String toString() {
    return new StringJoiner(", ", Job.class.getSimpleName() + "[", "]")
        .add("buyerId=" + buyerId).add("jobId=" + jobId)
        .add("supplierId=" + supplierId).toString();
  }
}

The Job POJO is very simple, containing just the fields that we care about and their getters/setters.
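The XPath snippet that follows relies on two objects the post does not show being created, xmlDocument and xPath. Here is a minimal sketch of that setup using only the JDK, assuming the response body is already available as a String. The class name XPathSetup, its method names, and the trimmed-down sample are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathSetup {
  // Trimmed-down stand-in for the service response (assumption: input arrives as a String).
  static final String SAMPLE =
      "<example:Response xmlns:example=\"ns:example:cwm:1.0\" xmlns=\"http://www.hr-xml.org/3\">"
      + "<StaffingOrder><SupplierParty><PartyID>142789</PartyID></SupplierParty></StaffingOrder>"
      + "<StaffingOrder><SupplierParty><PartyID>142789</PartyID></SupplierParty></StaffingOrder>"
      + "</example:Response>";

  // Parses the XML text into a DOM Document using the factory's default settings.
  static Document parse(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse XML", e);
    }
  }

  // Creates the XPath evaluator used for the queries.
  static XPath newXPath() {
    return XPathFactory.newInstance().newXPath();
  }

  // Counts the StaffingOrder elements, mirroring the first query in the post.
  static int countStaffingOrders(String xml) {
    try {
      Document xmlDocument = parse(xml);
      XPath xPath = newXPath();
      NodeList list = (NodeList) xPath.compile("//StaffingOrder")
          .evaluate(xmlDocument, XPathConstants.NODESET);
      return list.getLength();
    } catch (Exception e) {
      throw new IllegalStateException("Could not evaluate XPath", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(countStaffingOrders(SAMPLE));
  }
}
```

Checked exceptions are wrapped in IllegalStateException only to keep the sketch short; real code would probably handle parser and XPath failures separately.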

NodeList list = (NodeList) xPath.compile("//StaffingOrder")
    .evaluate(xmlDocument, XPathConstants.NODESET);

List<Job> jobs = new ArrayList<>(list.getLength());

// XPath positions are 1-based, so count from 1 up to and including the length.
for (int i = 1; i <= list.getLength(); i++) {
  String staffingOrder = "//StaffingOrder[" + i + "]";
  Job job = new Job();
  // NUMBER results come back as Double; the IDs are whole numbers, so narrow them to long.
  job.setBuyerId(((Double) xPath.compile(staffingOrder + "/CustomerParty/PartyID")
      .evaluate(xmlDocument, XPathConstants.NUMBER)).longValue());
  job.setJobId(((Double) xPath.compile(staffingOrder + "/CustomerParty/PartyReportingIDs/ID")
      .evaluate(xmlDocument, XPathConstants.NUMBER)).longValue());
  job.setSupplierId(((Double) xPath.compile(staffingOrder + "/SupplierParty/PartyID")
      .evaluate(xmlDocument, XPathConstants.NUMBER)).longValue());
  jobs.add(job);
}

Here, we pull the data points we need directly from the XML document. We loop over our StaffingOrders, and for each one (//StaffingOrder[i], counting from 1 because XPath positions are 1-based) build up our object. The result is similar to what we were trying to accomplish with Jackson, but this time it actually works!

[
  Job[buyerId=142868, jobId=2365695, supplierId=142789], 
  Job[buyerId=142868, jobId=2366194, supplierId=142789]
]
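One caveat: the sample document declares a default namespace (http://www.hr-xml.org/3). An unprefixed path such as //StaffingOrder matches only when the document is parsed without namespace awareness, which is the DocumentBuilderFactory default and presumably what was used here. If the parser is made namespace-aware, the path needs a prefix bound through a NamespaceContext. The sketch below shows that variant; the class names, the "hr" prefix, and the trimmed sample are assumptions chosen for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NamespaceAwareCount {
  static final String HR_XML_NS = "http://www.hr-xml.org/3";

  // Trimmed-down stand-in for the service response.
  static final String SAMPLE =
      "<example:Response xmlns:example=\"ns:example:cwm:1.0\" xmlns=\"" + HR_XML_NS + "\">"
      + "<StaffingOrder/><StaffingOrder/>"
      + "</example:Response>";

  // Counts the nodes an expression matches in a namespace-aware parse of the XML.
  static int count(String xml, String expression) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

      XPath xPath = XPathFactory.newInstance().newXPath();
      xPath.setNamespaceContext(new HrXmlContext());
      NodeList nodes = (NodeList) xPath.evaluate(expression, doc, XPathConstants.NODESET);
      return nodes.getLength();
    } catch (Exception e) {
      throw new IllegalStateException("Could not evaluate " + expression, e);
    }
  }

  // Binds the made-up prefix "hr" to the document's default namespace.
  static final class HrXmlContext implements NamespaceContext {
    @Override public String getNamespaceURI(String prefix) {
      return "hr".equals(prefix) ? HR_XML_NS : XMLConstants.NULL_NS_URI;
    }

    @Override public String getPrefix(String namespaceURI) {
      return HR_XML_NS.equals(namespaceURI) ? "hr" : null;
    }

    @Override public Iterator<String> getPrefixes(String namespaceURI) {
      String prefix = getPrefix(namespaceURI);
      return prefix == null
          ? Collections.<String>emptyIterator()
          : Collections.singletonList(prefix).iterator();
    }
  }

  public static void main(String[] args) {
    System.out.println(count(SAMPLE, "//hr:StaffingOrder")); // prefixed path
    System.out.println(count(SAMPLE, "//StaffingOrder"));    // unprefixed path
  }
}
```

With a namespace-aware parser, the unprefixed path no longer finds the elements, so whichever mode you choose, the parser setting and the XPath expressions have to agree.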

The XPath calls inside that loop are not nice to read, but we can extract methods to help our future selves, such as getBuyerId, or perhaps something more general:

for (int i = 1; i < list.getLength() + 1; i++) {
  String staffingOrder = "//StaffingOrder[" + i + "]";
  Job job = new Job();
  setBuyerSupplierIds(job, staffingOrder);
...
}

...

private void setBuyerSupplierIds(Job job, String currentOrder) throws XPathExpressionException {
  job.setBuyerId(((Double) xPath.compile(currentOrder + "/CustomerParty/PartyID")
      .evaluate(xmlDocument, XPathConstants.NUMBER)).longValue());
  job.setSupplierId(((Double) xPath.compile(currentOrder + "/SupplierParty/PartyID")
      .evaluate(xmlDocument, XPathConstants.NUMBER)).longValue());
}
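Taking the "more general" idea a step further, one option is a single helper that evaluates a path relative to each StaffingOrder node, instead of rebuilding an absolute //StaffingOrder[i] string each time. The sketch below also reads the value as text and parses it with Long.parseLong. With XPathConstants.NUMBER, a missing element evaluates to NaN, and Double.longValue() turns NaN into 0, so a missing ID would silently become 0. The class and method names, the long[] result shape, and the trimmed sample are assumptions for illustration, not part of the original code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class RelativeXPathReader {
  // Trimmed-down stand-in for the service response, keeping only the IDs we read.
  static final String SAMPLE =
      "<example:Response xmlns:example=\"ns:example:cwm:1.0\" xmlns=\"http://www.hr-xml.org/3\">"
      + order("2365695") + order("2366194")
      + "</example:Response>";

  private static String order(String requisitionId) {
    return "<StaffingOrder><CustomerParty><PartyID>142868</PartyID>"
        + "<PartyReportingIDs><ID>" + requisitionId + "</ID></PartyReportingIDs></CustomerParty>"
        + "<SupplierParty><PartyID>142789</PartyID></SupplierParty></StaffingOrder>";
  }

  // Reads a whole-number ID relative to the given node; fails if it is missing or malformed.
  static long readLong(XPath xPath, Node context, String relativePath) {
    try {
      String text = xPath.evaluate(relativePath, context).trim();
      if (text.isEmpty()) {
        throw new IllegalStateException("No value at " + relativePath);
      }
      return Long.parseLong(text);
    } catch (XPathExpressionException e) {
      throw new IllegalStateException("Bad XPath: " + relativePath, e);
    }
  }

  // Returns {buyerId, jobId, supplierId} for each StaffingOrder, in document order.
  static List<long[]> readJobs(String xml) {
    try {
      // Default (non-namespace-aware) parse, so unprefixed element names match.
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      XPath xPath = XPathFactory.newInstance().newXPath();
      NodeList orders = (NodeList) xPath.evaluate("//StaffingOrder", doc, XPathConstants.NODESET);

      List<long[]> jobs = new ArrayList<>(orders.getLength());
      for (int i = 0; i < orders.getLength(); i++) {
        Node order = orders.item(i); // NodeList indexes are 0-based, unlike XPath positions
        jobs.add(new long[] {
            readLong(xPath, order, "CustomerParty/PartyID"),
            readLong(xPath, order, "CustomerParty/PartyReportingIDs/ID"),
            readLong(xPath, order, "SupplierParty/PartyID")});
      }
      return jobs;
    } catch (RuntimeException e) {
      throw e;
    } catch (Exception e) {
      throw new IllegalStateException("Could not read jobs", e);
    }
  }

  public static void main(String[] args) {
    for (long[] job : readJobs(SAMPLE)) {
      System.out.println("buyerId=" + job[0] + ", jobId=" + job[1] + ", supplierId=" + job[2]);
    }
  }
}
```

Whether a missing ID should throw or fall back to a default depends on how much you trust the service; the point of the sketch is only that the choice becomes explicit.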
