How to Parse an XML File Using PHP: Part 1


This post follows the multiple choice question theme. I’ve spent quite a bit of time working on how to go about parsing an XML file using PHP. I had to pay particular attention to the structure of the XML file with reference to which elements appeared where and which were repeated but in different orders throughout the document.

I’m the sort who likes to learn by doing and having a play with things. I read a basic PHP XML parsing tutorial on SitePoint written by Kevin Yank (PHP and XML: Parsing RSS 1.0). It explains the process pretty well, so if you want to know the specifics of creating the parser object, I recommend you read that brief tutorial or consult the PHP manual.

When using PHP’s built-in XML parser there are three functions that are required:

  • An element opening function which is called whenever the parser encounters an opening element.
  • A data function which is called whenever some data between elements is encountered.
  • An element closing function which is called whenever the parser encounters a closing element.
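As a rough illustration of how those three functions get wired up, here is a minimal sketch using PHP's xml_parser_create, xml_set_element_handler, xml_set_character_data_handler and xml_parse. The handler names (startTag, charData, endTag), the $events array and the sample XML string are my own placeholders for the sake of the example, not part of the multiple choice question parser.

```php
<?php
// Minimal sketch: register the three handlers and parse a small string.
// Handler names, $events and the sample XML are illustrative assumptions.

$events = [];

// Called whenever the parser meets an opening tag.
function startTag($parser, $tagName, $attrs) {
    global $events;
    $events[] = "open:$tagName";
}

// Called whenever the parser meets text between tags.
function charData($parser, $data) {
    global $events;
    // Whitespace between tags is reported too, so skip empty chunks.
    if (trim($data) !== '') {
        $events[] = "data:" . trim($data);
    }
}

// Called whenever the parser meets a closing tag.
function endTag($parser, $tagName) {
    global $events;
    $events[] = "close:$tagName";
}

$xml = '<question><text>What is 2 + 2?</text></question>';

$parser = xml_parser_create();
xml_set_element_handler($parser, 'startTag', 'endTag');
xml_set_character_data_handler($parser, 'charData');

// The third argument tells the parser this is the final chunk of input.
if (!xml_parse($parser, $xml, true)) {
    throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
}

print_r($events);
```

The handlers fire in document order, so the first recorded event is the opening of the outermost element and the last is its closing.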

The openElement function

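function openElement($parser, $tagName, $attrs) {
  // Share parser state with the data and closing element functions.
  global $SQ, $CA, $MS, $Q, $A, $answercount, $ANo, $tag;

  // Remember the element that most recently triggered this function.
  $tag = $tagName;

  // Set the flag for whichever monitored element has just been opened.
  if (!$SQ || !$CA || !$Q || !$A) {
    switch ($tagName) {
      case "SINGLEQUESTIONS":
        $SQ = true;
        break;
      case "COMMONANSWERS":
        $CA = true;
        break;
      case "MULTIPLESTATEMENTS":
        $MS = true;
        break;
      case "QUESTION":
        $Q = true;
        break;
      case "ANSWERS":
        $A = true;
        break;
    }
  }

  // Inside an ANSWERS element, store each ANSWER's number attribute.
  if ($A && $tagName == "ANSWER") {
    $ANo[$answercount] = $attrs['NUMBER'];
  }
}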

The function takes three arguments ($parser, $tagName, $attrs):

  • $parser is a reference to the parser being used to parse the document.
  • $tagName is the name of the element that triggered the function. It is passed in ALL UPPERCASE because the parser's case folding option is on by default.
  • $attrs is an array of the attributes that the element had, e.g. if the element was <answer number="3"> then the value of $attrs['NUMBER'] would be "3" (notice that attribute names are uppercased too).
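To see what the opening element function actually receives, the sketch below parses a single answer element twice: once with case folding left on (the default) and once with it switched off via xml_parser_set_option and XML_OPTION_CASE_FOLDING. The function names recordOpen, ignoreClose and parseWith, and the sample element, are made up for this demonstration.

```php
<?php
// Sketch: capture the arguments passed to the opening element function.
// recordOpen, ignoreClose and parseWith are hypothetical helper names.

$seen = [];

function recordOpen($parser, $tagName, $attrs) {
    global $seen;
    $seen[] = [$tagName, $attrs];
}

function ignoreClose($parser, $tagName) {
    // Nothing to do on closing tags for this demonstration.
}

function parseWith($caseFolding) {
    global $seen;
    $seen = [];
    $parser = xml_parser_create();
    // 1 keeps the default upper-casing, 0 leaves names as written.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding);
    xml_set_element_handler($parser, 'recordOpen', 'ignoreClose');
    xml_parse($parser, '<answer number="3">Paris</answer>', true);
    return $seen;
}

$folded = parseWith(1);
$asWritten = parseWith(0);

echo $folded[0][0], ' ', $folded[0][1]['NUMBER'], "\n";       // ANSWER 3
echo $asWritten[0][0], ' ', $asWritten[0][1]['number'], "\n"; // answer 3
```

With folding on, both the element name and the attribute key arrive in uppercase, which is why the switch statement in openElement compares against uppercase strings.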

The global statement at the top of the function is important to make sure that any variables the function manipulates will be available to the other functions.

The $SQ, $CA, $MS, $Q and $A variables are used as boolean flags in my implementation to represent whether or not the parser is within specific elements. The switch statement checks to see if the element responsible for triggering the function ($tagName) is one of the elements being monitored and then sets the corresponding flag variable to true.

The flag variables are required because the same elements do not always appear in the same order in the document type definition, so it is important to know which elements the parser is in when dealing with the information it will later glean. The $tag variable represents the last element that triggered the function.

You can see that the $A variable is used to track whether or not the parser is within an answers element. The answers element contains answer elements (See MCQ XML). The last bit of the function stores the ‘number’ attribute of each answer element in an array I’ve called $ANo. This is needed because the CORRECTANSWER element is a relative reference to the number attribute of the answer elements.

Once the answers are put into the database the relative values in the array will be replaced with unique identifiers from the database.
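The replacement step could look something like the sketch below. It is only an illustration under assumptions: $ANo mirrors the array built in openElement, while $insertedIds and $correctAnswer are hypothetical stand-ins for the IDs a database would hand back and for the value read from a CORRECTANSWER element. No real database is involved here.

```php
<?php
// Sketch only: all values below are made up for illustration.

// Answer 'number' attributes, keyed by $answercount as in openElement.
$ANo = [1 => '1', 2 => '2', 3 => '3'];

// Hypothetical unique identifiers returned after inserting each answer,
// stored under the same keys as $ANo.
$insertedIds = [1 => 101, 2 => 102, 3 => 103];

// Hypothetical value read from a CORRECTANSWER element.
$correctAnswer = '2';

// Build a lookup from the relative number to the database identifier.
$idByNumber = [];
foreach ($ANo as $index => $number) {
    $idByNumber[$number] = $insertedIds[$index];
}

// Resolve the relative reference to a unique identifier.
$correctId = $idByNumber[$correctAnswer];
echo $correctId, "\n"; // 102
```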

I’ll comment on the other functions specific to parsing the multiple choice question XML in other posts.
