Creating an XML Parsing Subprocedure

The RPG-XML Suite subprocedure RXS_Parse() reads through an XML document and triggers processing based on the XML events it detects. These events represent the elements, attributes, and data in the document, and a customized parsing handler subprocedure can use them to tailor processing to your business needs. The parsing handler subprocedure is defined within your program and is specified in the ParseDS.GlobalHandler field of the RXS_Parse() configuration data structure parameter.

The following events within an XML document will trigger a call out to the parsing handler subprocedure:

Event             Format   Example
Element Begin     >        /Request>
Element Content   /        /Request/Item/
Element End       />       /Request/>
Attribute         @        /Request/Item@Attribute
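
Because each event type has a distinct suffix, a handler can also classify an event from the XPath alone. The following is a minimal sketch and not part of RPG-XML Suite: the EventKind subprocedure and the values it returns are hypothetical names chosen for illustration, and the code assumes that XPaths follow exactly the formats listed in the table above.

```rpgle
       // Hypothetical helper: classifies an event by the suffix of its XPath.
       Dcl-Proc EventKind;
         Dcl-Pi *N VarChar(10);
           pXPath VarChar(1024) Const;
         End-Pi;

         Dcl-S PathLen Int(10);

         PathLen = %Len( pXPath );

         select;
           // Nothing to inspect
           when PathLen = 0;
             return 'UNKNOWN';
           // Element end: XPath ends with />
           when PathLen >= 2 and %Subst( pXPath : PathLen - 1 : 2 ) = '/>';
             return 'END';
           // Element begin: XPath ends with > (but not with />)
           when %Subst( pXPath : PathLen : 1 ) = '>';
             return 'BEGIN';
           // Element content: XPath ends with /
           when %Subst( pXPath : PathLen : 1 ) = '/';
             return 'CONTENT';
           // Attribute: XPath contains @ followed by the attribute name
           when %Scan( '@' : pXPath ) > 0;
             return 'ATTRIBUTE';
           other;
             return 'UNKNOWN';
         endsl;
       End-Proc;
```

The element end check has to come before the element begin check, because both formats end with the > character.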

Depending on the size and complexity of your XML document, there may be many individual events that you need to capture to properly retrieve and handle your data. To help simplify this process, the command BLDPRS can be used to generate a basic parsing handler for your XML document, into which you can add your custom programming and data handling.

Sample XML Document

To generate a parsing handler, you will first need a sample XML document that is as complete as possible. That is, it should include every element and attribute, including optional ones, even if some of them would not logically appear together in normal usage.

To start, we first need to create a file in the IFS. This file will be populated with our sample XML data, which will be used to generate the parser. The command below uses QSHELL to create an IFS stream file using CCSID 819. This CCSID is strongly recommended when creating IFS files for use with RXS tools. Note that this command is case-sensitive.

QSH CMD('touch -C 819 /home/user/postadr.xml')

The file now exists but is empty. To add content, we will use the EDTF command and enter our sample XML.

EDTF '/home/user/postadr.xml'

For demonstration purposes, we’ll use the sample XML below. This sample XML may or may not contain data in any or all fields - this will not impact the parser output.

<PostAdr residential="true">
  <name title="Mx.">
    <first>Jamie</first>
    <last>Hale</last>    
  </name>
  <street>2886 Veltri Dr</street>
  <city>Hickory Hills</city>
  <state>VA</state>
  <zip>94124</zip>
  <phone>949-555-4671</phone>
  <phone></phone>
</PostAdr>

While in the Edit File editor, press F2 to save the document changes. Now it is time to invoke the BLDPRS command to generate the parsing code.

Generating the Parsing Handler (BLDPRS)

BLDPRS can generate parsing handlers for both the XML parser and the RXS 3 JSON parser (RXS_ParseJson()). The command can generate either free- or mixed-format (fixed D-specs) RPG code, and this code can be output into either a source member or an IFS file. Prompting the BLDPRS command will present the following screen:

Build RPG Parse Subprocedure (BLDPRS)

The first parameter will be the fully-qualified file path for our sample XML document in the IFS. For demonstration purposes, we’re going to output the generated parsing handler code to a source member, which means we’ll be specifying the Output Source File, Library, and Member fields. When writing output to a source member, we do not specify a value in the Output Stream File field.

Build RPG Parse Subprocedure (BLDPRS)

The Append Output parameter controls whether any existing content in the specified output member or stream file will be preserved, or whether it will be overwritten. By default, this value is set to *YES to preserve existing content. In our demonstration, we’ll set it to *NO to overwrite any content.

The final parameter - Parsing Handler Type - has three possible values and controls what type of parsing handler is generated. We’ll input *RXS3XML to indicate that an RXS 3-formatted XML parsing handler, for use with RXS_Parse(), should be generated.

Build RPG Parse Subprocedure (BLDPRS)

When we press enter, four additional fields will be brought up.

Build RPG Parse Subprocedure (BLDPRS)

These fields can be used to specify which XML events should have parsing code generated for them. These events are all enabled - the values set to *YES - by default. In most situations, you will not need to change these four values. Press enter again to submit the command.

The Parsing Handler Subprocedure

Once the command finishes executing, you’ll see a status message similar to the following:

Generated parsing handler in USERLIB/QRPGLESRC, POSTADRPRS

If we open the generated source member in RDi or SEU, we’ll see this RPGLE code:


       Dcl-Pr XMLHandler;
         pType Char(10) Value;
         pXPath VarChar(1024) Value;
         pData Pointer Value;
         pDataLen Int(10) Value;
       End-Pr;

      //=======================================================================
      //  Remember to update the handler to account for your program logic.
      //=======================================================================

       Dcl-Proc XMLHandler;
         Dcl-Pi *N;
           pType Char(10) Value;
           pXPath VarChar(1024) Value;
           pData Pointer Value;
           pDataLen Int(10) Value;
         End-Pi;

        Dcl-S ParsedData Like(RXS_Var1Kv_t);

        select;

          when pXPath = '/PostAdr>';
            RXS_JobLog( '***Element Begin: ' + pXPath );

          when pXPath = '/PostAdr@residential';
            ParsedData = RXS_STR( pData : pDataLen );
            RXS_JobLog( pXPath + ': ' + ParsedData );

          when pXPath = '/PostAdr/name>';
            RXS_JobLog( '***Element Begin: ' + pXPath );

          when pXPath = '/PostAdr/name@title';
            ParsedData = RXS_STR( pData : pDataLen );
            RXS_JobLog( pXPath + ': ' + ParsedData );

          when pXPath = '/PostAdr/name/first>';
            RXS_JobLog( '***Element Begin: ' + pXPath );

          when pXPath = '/PostAdr/name/first/';
            ParsedData = RXS_STR( pData : pDataLen );
            RXS_JobLog( pXPath + ': ' + ParsedData );

          when pXPath = '/PostAdr/name/first/>';
            RXS_JobLog( '***Element End: ' + pXPath );

          when pXPath = '/PostAdr/name/last>';
            RXS_JobLog( '***Element Begin: ' + pXPath );

          when pXPath = '/PostAdr/name/last/';
            ParsedData = RXS_STR( pData : pDataLen );
            RXS_JobLog( pXPath + ': ' + ParsedData );

          when pXPath = '/PostAdr/name/last/>';
            RXS_JobLog( '***Element End: ' + pXPath );

          when pXPath = '/PostAdr/name/>';
            RXS_JobLog( '***Element End: ' + pXPath );

          when pXPath = '/PostAdr/street>';
            RXS_JobLog( '***Element Begin: ' + pXPath );

          when pXPath = '/PostAdr/street/';
            ParsedData = RXS_STR( pData : pDataLen );
            RXS_JobLog( pXPath + ': ' + ParsedData );

          when pXPath = '/PostAdr/street/>';
            RXS_JobLog( '***Element End: ' + pXPath );

          when pXPath = '/PostAdr/city>';
            RXS_JobLog( '***Element Begin: ' + pXPath );

          when pXPath = '/PostAdr/city/';
            ParsedData = RXS_STR( pData : pDataLen );
            RXS_JobLog( pXPath + ': ' + ParsedData );

          when pXPath = '/PostAdr/city/>';
            RXS_JobLog( '***Element End: ' + pXPath );

          when pXPath = '/PostAdr/state>';
            RXS_JobLog( '***Element Begin: ' + pXPath );

          when pXPath = '/PostAdr/state/';
            ParsedData = RXS_STR( pData : pDataLen );
            RXS_JobLog( pXPath + ': ' + ParsedData );

          when pXPath = '/PostAdr/state/>';
            RXS_JobLog( '***Element End: ' + pXPath );

          when pXPath = '/PostAdr/zip>';
            RXS_JobLog( '***Element Begin: ' + pXPath );

          when pXPath = '/PostAdr/zip/';
            ParsedData = RXS_STR( pData : pDataLen );
            RXS_JobLog( pXPath + ': ' + ParsedData );

          when pXPath = '/PostAdr/zip/>';
            RXS_JobLog( '***Element End: ' + pXPath );

          when pXPath = '/PostAdr/phone>';
            RXS_JobLog( '***Element Begin: ' + pXPath );

          when pXPath = '/PostAdr/phone/';
            ParsedData = RXS_STR( pData : pDataLen );
            RXS_JobLog( pXPath + ': ' + ParsedData );

          when pXPath = '/PostAdr/phone/>';
            RXS_JobLog( '***Element End: ' + pXPath );

          when pXPath = '/PostAdr/>';
            RXS_JobLog( '***Element End: ' + pXPath );

        endsl;

       End-Proc; 

Note that, even with the Code Format parameter set to *FREE, the generated free-format RPG code is still limited to columns 8-80. This maintains compatibility for customers who use SEU for development, and for customers on IBM i 7.1 who do not have the PTFs required for fully free-format RPG code. Although our example XML document does not contain any, long XPaths will wrap automatically at or before the 80th column.

The generated code member contains the parsing handler subprocedure prototype, which is intended to be copied into your program code in the main D-specs, and the parsing subprocedure code itself. You should not modify the prototype or the subprocedure declarations, except to change the subprocedure name from XMLHandler if desired.

The body of the parsing handler subprocedure contains a select block made up of when statements, one for each event in the XML document. Each when statement checks the value of the pXPath parameter to determine which XML event was detected, and processes that event accordingly. For example, this when statement checks whether pXPath indicates the event for the title attribute on the name element and, if so, retrieves the value of the attribute using RXS_STR():


when pXPath = '/PostAdr/name@title';
  ParsedData = RXS_STR( pData : pDataLen );
  RXS_JobLog( pXPath + ': ' + ParsedData );

For each of these statements, the generated code also includes a call to RXS_JobLog(). These calls are for demonstration purposes only, so that you can copy the parsing handler into your program and immediately compile and run it to see sample output in the job log. Replace them with your actual program logic to handle the data in the XML document.

Customizing the Parsing Handler Subprocedure

Once the generated parsing handler subprocedure code has been copied into your program, you will need to modify the programming logic in the handler subprocedure to meet the needs of your specific program. This may involve retrieving data from elements and attributes, performing numeric or data-type conversions, and writing records to one or more physical files. You may also need to store the data within the program for further processing once parsing is complete. The parsing handler subprocedures are very flexible and allow for a high level of customization.
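
As one illustration of a data-type conversion, the sketch below turns parsed character data into a numeric value. ToZipNumber is a hypothetical helper and not part of RPG-XML Suite. It assumes a five-digit ZIP code, and any value that cannot be converted is returned as zero.

```rpgle
       // Hypothetical helper: converts parsed character data to a 5-digit
       //  number, returning 0 when the text is empty or is not a valid number.
       Dcl-Proc ToZipNumber;
         Dcl-Pi *N Packed(5 : 0);
           pText VarChar(1024) Const;
         End-Pi;

         Dcl-S Result Packed(5 : 0) Inz(0);

         monitor;
           // %Dec signals an error for empty, non-numeric or oversized input
           Result = %Dec( %Trim( pText ) : 5 : 0 );
         on-error;
           Result = 0;
         endmon;

         return Result;
       End-Proc;
```

Within the handler, it could be called as ZipNumber = ToZipNumber( RXS_STR( pData : pDataLen ) ); where ZipNumber is a hypothetical Packed(5 : 0) field declared in your program.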

The ORDERSVC example program demonstrates using a parsing handler subprocedure to retrieve data from an XML document and write records to several physical files. To do so, the program uses the element begin events to trigger clearing of fields and data structures, and element end events to trigger writing records to the files.
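
The same begin/end pattern can be sketched against the sample document from this tutorial. This is not the ORDERSVC code: NameWork, SavedNames, SavedCount, and NameHandler are hypothetical names, and the sketch collects completed names in an array instead of writing records to a physical file. It assumes the RXSCB copybook has been included so that RXS_STR() is available.

```rpgle
       // Hypothetical work area and storage for completed names
       Dcl-Ds NameWork Qualified Inz;
         First VarChar(25);
         Last VarChar(25);
       End-Ds;
       Dcl-Ds SavedNames LikeDs(NameWork) Dim(10) Inz;
       Dcl-S SavedCount Int(10) Inz(0);

       Dcl-Proc NameHandler;
         Dcl-Pi *N;
           pType Char(10) Value;
           pXPath VarChar(1024) Value;
           pData Pointer Value;
           pDataLen Int(10) Value;
         End-Pi;

         select;

           // Element begin: clear the work area before new data arrives
           when pXPath = '/PostAdr/name>';
             clear NameWork;

           // Element content: collect the data into the work area
           when pXPath = '/PostAdr/name/first/';
             NameWork.First = RXS_STR( pData : pDataLen );

           when pXPath = '/PostAdr/name/last/';
             NameWork.Last = RXS_STR( pData : pDataLen );

           // Element end: the name is complete, so store it. A program
           //  writing to a physical file would write its record here.
           when pXPath = '/PostAdr/name/>';
             if SavedCount < %Elem( SavedNames );
               SavedCount += 1;
               Eval-Corr SavedNames( SavedCount ) = NameWork;
             endif;

         endsl;

       End-Proc;
```

Deferring the store until the element end event means every child element and attribute has already been seen by the time the data is saved.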

The generated parsing handler subprocedure will contain a when statement for each XML event detected in the document. In practice, you will likely not need many of these events for processing. You can safely comment out or remove any unneeded events.

Sample Program

Here is a full example that demonstrates parsing our sample XML using a fully-customized parsing handler subprocedure, based on the same code we generated in this tutorial:


       Ctl-Opt ActGrp(*New) BndDir('RXSBND') Text('RXS XML Handler Example');

        /COPY QRPGLECPY,RXSCB

       Dcl-Pr XMLHandler;
         pType Char(10) Value;
         pXPath VarChar(1024) Value;
         pData Pointer Value;
         pDataLen Int(10) Value;
       End-Pr;

       // Global fields for data output
       Dcl-Ds FullName Qualified;
         Title VarChar(5);
         First VarChar(25);
         Last VarChar(25);
       End-Ds;
       Dcl-Ds Address Qualified;
         Residential Ind;
         Street VarChar(50);
         City VarChar(50);
         State Char(2);
         Zip VarChar(20);
       End-Ds;
       Dcl-S Phone VarChar(20) Dim(3);
       Dcl-S FormattedName VarChar(57);
       Dcl-S FormattedAddress VarChar(125);

       // Processing fields
       Dcl-S XML Like(RXS_Var8Kv_t);

       // RXS templated data structures
       Dcl-Ds ParseDS LikeDS(RXS_ParseDS_t);


       // Load the XML to be parsed - for demonstration purposes, we're manually
       //  setting the XML in this field. In your program, you may be reading
       //  the XML from a stream file in the IFS, or receiving it as part of a
       //  web service request (from RXS_GetStdIn) or response (from
       //  RXS_Transmit) instead.
       reset XML;
       XML = '<PostAdr residential="true"><name title="Mx."><first>Jamie'
           + '</first><last>Hale</last></name><street>2886 Veltri Dr</street>'
           + '<city>Hickory Hills</city><state>VA</state><zip>94124</zip>'
           + '<phone>949-555-4671</phone><phone></phone></PostAdr>';

       // Many RPG-XML Suite APIs accept a configuration data structure
       //  parameter. These templated data structures must be initialized
       //  using RXS_ResetDS - they cannot be initialized with the reset
       //  operation.
       // RXS_ResetDS helps ensure that RXS templated data structures are
       //  initialized in a backwards-compatible fashion
       RXS_ResetDS( ParseDS : RXS_DS_TYPE_PARSE );
       
       // Specify the procedure pointer for the parsing handler subprocedure,
       //  using the %Paddr built-in function
       ParseDS.GlobalHandler = %Paddr( XMLHandler );

       monitor;
         // Parse the XML document
         RXS_Parse( XML : ParseDS );

         // Format the parsed data
         reset FormattedName;
         if FullName.Title <> *Blanks;
           FormattedName += FullName.Title + ' ';
         endif;
         FormattedName += FullName.First + ' ' + FullName.Last;

         reset FormattedAddress;
         FormattedAddress = Address.Street + ' '
                          + Address.City + ' '
                          + Address.State + ' '
                          + Address.Zip;

         // Outputting the parsed and formatted data
         RXS_JobLog( 'Name: ' + FormattedName );
         RXS_JobLog( 'Address: ' + FormattedAddress );
         if Phone(1) <> *Blanks;
           RXS_JobLog( 'Phone 1: ' + Phone(1) );
         endif;
         if Phone(2) <> *Blanks;
           RXS_JobLog( 'Phone 2: ' + Phone(2) );
         endif;
         if Phone(3) <> *Blanks;
           RXS_JobLog( 'Phone 3: ' + Phone(3) );
         endif;
       on-error;
         // If an error occurs during parsing, error messages and information
         //  can be found in the ParseDS parameter data structure
         RXS_JobLog( 'Error: ' + ParseDS.ReturnedErrorInfo.MessageText );
       endmon;

       *INLR = *On;
       return;


       // This is a customized parsing handler subprocedure that was initially
       //  generated with BLDPRS. Additional code has been added to support our
       //  program logic, and we've removed the when blocks for XPaths that
       //  we do not need to process.
       // We have also removed the calls to RXS_JobLog - we do not recommend
       //  leaving logging operations in place in a production environment, due
       //  to the additional overhead
       Dcl-Proc XMLHandler;
         Dcl-Pi *N;
           pType Char(10) Value;
           pXPath VarChar(1024) Value;
           pData Pointer Value;
           pDataLen Int(10) Value;
         End-Pi;

        Dcl-S ParsedData Like(RXS_Var1Kv_t);
        // This is a static field used to track the index of the phone array
        Dcl-S PhoneIdx Uns(3) Inz Static;

        select;

          // This is the element begin event for the parent <PostAdr> element
          //  in our XML document. This is a good place to perform any
          //  initialization steps needed, e.g. resetting iterators and fields
          when pXPath = '/PostAdr>';
            reset PhoneIdx;
            reset FullName;
            reset Address;

          // This XPath is searching for the attribute content event associated
          //  with the residential attribute on the PostAdr element
          when pXPath = '/PostAdr@residential';
            // Retrieve the data from the attribute
            ParsedData = RXS_STR( pData : pDataLen );
            // Process the parsed data
            if ParsedData = 'true';
              // The RXSCB copybook includes constants that can be easily used
              //  to set indicator values
              Address.Residential = RXS_YES;
            else;
              Address.Residential = RXS_NO;
            endif;

          when pXPath = '/PostAdr/name@title';
            ParsedData = RXS_STR( pData : pDataLen );
            FullName.Title = ParsedData;

          when pXPath = '/PostAdr/name/first/';
            ParsedData = RXS_STR( pData : pDataLen );
            FullName.First = ParsedData;

          when pXPath = '/PostAdr/name/last/';
            ParsedData = RXS_STR( pData : pDataLen );
            FullName.Last = ParsedData;

          when pXPath = '/PostAdr/street/';
            ParsedData = RXS_STR( pData : pDataLen );
            Address.Street = ParsedData;

          when pXPath = '/PostAdr/city/';
            ParsedData = RXS_STR( pData : pDataLen );
            Address.City = ParsedData;

          when pXPath = '/PostAdr/state/';
            ParsedData = RXS_STR( pData : pDataLen );
            Address.State = ParsedData;

          when pXPath = '/PostAdr/zip/';
            ParsedData = RXS_STR( pData : pDataLen );
            Address.Zip = ParsedData;

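          when pXPath = '/PostAdr/phone/';
            // Phone number content event detected - increasing the phone
            //  number index. Phone numbers beyond the size of the Phone
            //  array are ignored to avoid an array index error.
            if PhoneIdx < %Elem( Phone );
              PhoneIdx += 1;
              ParsedData = RXS_STR( pData : pDataLen );
              Phone(PhoneIdx) = ParsedData;
            endif;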

          when pXPath = '/PostAdr/>';
            // This is the element end event for the parent <PostAdr> element,
            //  and will be the last event triggered in our document. We don't
            //  have any special processing in our program for this event.

        endsl;

       End-Proc;