Member Avatar for cjohnweb

I'm trying to understand XPath, but I've come across an issue I cannot seem to find an answer for. In this case it seems that XPath is not returning what it should.

I've got a sample html file, test.html:

<html>
  <div>
    <p>1</p>
    <p>2</p>
    <p>3</p>
    <p>4</p>
    <p>5</p>
  </div>
</html>

And my PHP file, test.php:

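<?php
echo "<pre>";

$url = "test.html";

// Collect libxml parse warnings internally instead of printing them.
$oldSetting = libxml_use_internal_errors(true);
libxml_clear_errors();

// Parse the file with the HTML parser and run the XPath query against it.
$html = new DOMDocument();
$html->loadHTMLFile($url);

$xpath = new DOMXPath($html);
$titles = $xpath->query("//p");

foreach ($titles as $title) {
    echo $title->nodeValue . "<br />";
}

// Restore the previous libxml error handling.
libxml_clear_errors();
libxml_use_internal_errors($oldSetting);

echo "</pre>";
?>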

I can set the XPath query to //p and get the content of all the p tags on screen. That's good.
Set to /html//p I get the same. That's good.
Set to //p[1] I get the first p tag. That's good.
Set to //p[5] I get the 5th p tag. That's good.

That's all groovy.

But if I do /html/div/p I get nothing. I've messed with a ton of similar queries with no luck.

I'm trying to read the URL of an image from a website, and using Firefox's Firebug plugin I can copy the XPath and I get something like

/html/body/div[2]/div/div[2]/div/div/div/div[2]/div/div/div[2]/p/img

But in PHP I get no result unless I remove all the "[2]", take out some of the divs and place a // before img.
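One way to check which path PHP's parser actually builds for a node, as a rough sketch: find the element with a loose query first, then ask each match for its own path with DOMNode::getNodePath(). The markup and the image file name below are made up for illustration; they are not taken from the site in question.

```php
<?php
// Hypothetical markup (an assumption for illustration); note there is no <body> tag in the source.
$markup = '<html><div><p><img src="photo.png"></p></div></html>';

// Keep libxml parse warnings out of the output.
$oldSetting = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);

// Locate the images loosely, then print the absolute path as this parser sees it.
$paths = [];
foreach ($xpath->query('//img') as $img) {
    $paths[] = $img->getNodePath();
    echo $img->getNodePath() . ' => ' . $img->getAttribute('src') . "\n";
}

// Restore the previous libxml error handling.
libxml_clear_errors();
libxml_use_internal_errors($oldSetting);
```

If the printed path differs from the one Firebug copied, that suggests the tree PHP built is not the same as the browser's DOM, so an absolute path copied from the browser may not match step for step.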

So what's going on here? Every example I've read says this is correct, but in the very simple example above, a plain /html/div/p or /html/div//p does not work.

Thanks for your help!

Member Avatar for pritaeas

Possibly it has something to do with loadHTMLFile. If I use load instead (since your HTML is well-formed), $titles = $xpath->query("//html/div/p"); works as expected.
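A minimal sketch of that comparison, assuming the same markup as test.html but inlined as a string (so it uses loadHTML and loadXML rather than loadHTMLFile and load). It also tries /html/body/div/p, on the assumption that the HTML parser adds a body element the source does not contain; printing the parsed tree with saveHTML() is one way to check that.

```php
<?php
// Same markup as test.html, inlined so the sketch is self-contained.
$markup = '<html> <div> <p>1</p> <p>2</p> <p>3</p> <p>4</p> <p>5</p> </div> </html>';

// Keep libxml parse warnings out of the output.
$oldSetting = libxml_use_internal_errors(true);

// Parse once with the HTML parser ...
$htmlDoc = new DOMDocument();
$htmlDoc->loadHTML($markup);
$htmlXpath = new DOMXPath($htmlDoc);

// ... and once with the XML parser, which keeps the tree exactly as written.
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($markup);
$xmlXpath = new DOMXPath($xmlDoc);

// Count the matches for each parser/query combination.
$counts = [
    'HTML parser, /html/div/p'      => $htmlXpath->query('/html/div/p')->length,
    'HTML parser, /html/body/div/p' => $htmlXpath->query('/html/body/div/p')->length,
    'XML parser,  /html/div/p'      => $xmlXpath->query('/html/div/p')->length,
];

foreach ($counts as $label => $count) {
    echo $label . ': ' . $count . " match(es)\n";
}

// Show the tree the HTML parser built, to see any elements it added.
echo $htmlDoc->saveHTML();

// Restore the previous libxml error handling.
libxml_clear_errors();
libxml_use_internal_errors($oldSetting);
```

If the HTML parser's output shows a body element wrapped around the div, that would explain why /html/div/p finds nothing after loadHTMLFile while //p and /html//p still match.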
