SimpleXML can't get CDATA with ns prefixes

Adam

New Member
#1
My problem is retrieving thread data through XenForo's RSS feeds. Below is a sample of the RSS data I'm trying to read. Everything works fine except for retrieving <content:encoded>.

Sample file:
Code:
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>News &amp; Announcements</title>
    <description>All of our important news and announcements will be here.</description>
    <pubDate>Fri, 26 Jun 2015 14:54:20 +0000</pubDate>
    <lastBuildDate>Fri, 26 Jun 2015 14:54:20 +0000</lastBuildDate>
    <generator>********* ****</generator>
    <link>https://***.****.****/forum/news/</link>
    <atom:link rel="self" type="application/rss+xml" href="https://***.****.****/forum/news/index.rss"/>
    <item>
      <title>Site under development.</title>
      <pubDate>Thu, 25 Jun 2015 05:49:43 +0000</pubDate>
      <link>https://***.****.****/threads/site-under-development.3/</link>
      <guid>https://***.****.****/threads/site-under-development.3/</guid>
      <author>invalid@example.com (*****)</author>
      <dc:creator>ShortCut Central</dc:creator>
      <content:encoded><![CDATA[Content to retrieve. <br /> Some more content a part of the same section]]></content:encoded>
    </item>
  </channel>
</rss>
My current code looks like this:
Code:
<?php
class SCC_Main_miscFuncs {
    public static function printMostRecentPost() {
        // Re-enable the below once we're ready to release
        //$rssUrl = func_get_arg(1);
        $rssUrl = 'https://www.shortcutcentral.org/indev.rss';
        $rawData = self::returnContents($rssUrl); // Properly contains the CDATA
        $xml = simplexml_load_string($rawData);   // Parse the body already fetched instead of requesting it twice
        echo '<pre>';
        //echo (string) $xml->channel->item->encoded;
        //echo (string) $xml->channel->item->content;
        //var_dump($xml);
        echo '</pre>';
        //echo (string) $xml->channel->item;
        //echo $array[@attributes]['item']['link'];
        //echo $xml->message;
    }

    public static function returnContents($url){
        $curl_handle=curl_init();
        curl_setopt($curl_handle, CURLOPT_URL,$url);
        curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 2);
        curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($curl_handle, CURLOPT_USERAGENT, 'ShortCut Central');
        $query = curl_exec($curl_handle);
        curl_close($curl_handle);
        return $query;
    }
}
Nothing seems to show the CDATA except for the unparsed $rawData. I suspect I'm not accessing it properly (I'm completely new to XML, namespaces, and namespace prefixes), but the fact that it doesn't show up in var_dump is giving me... hell. I saw some earlier posts about using XML children, but I don't entirely understand that concept, so if my solution requires children, an explanation would be greatly appreciated.

Thank you!

It might also be worth mentioning that my PHP code is organized the way it is (classes with public static functions) so that I can use it as an add-on for XenForo.
 

Admin

Administrator
Staff member
#2
You are correct that one way to reach the namespaced node in SimpleXML is SimpleXMLElement::children(), but you must pass the namespace as its first argument. You can pass the full namespace URI "http://purl.org/rss/1.0/modules/content/", but it is easier to pass its prefix "content" and supply TRUE as the second argument, which tells children() that you are passing a prefix rather than the full URI.

So use an expression like this on your $xml object:
Code:
echo (string)$xml->channel->item->children('content', TRUE)->encoded;
// Prints:
// Content to retrieve. <br /> Some more content a part of the same section
To retrieve all the relevant nodes, loop over the items using whatever approach makes the most sense in the context of your code.
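
As a rough sketch of such a loop, something like the following should work. The two-item feed and the $posts array are made up for illustration and are not part of your real feed; only the children('content', TRUE) call is the technique described above.

```php
<?php
// Trimmed-down, made-up feed with two items, for illustration only.
$rss = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>First</title>
      <content:encoded><![CDATA[First body <br /> more]]></content:encoded>
    </item>
    <item>
      <title>Second</title>
      <content:encoded><![CDATA[Second body]]></content:encoded>
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($rss);

$posts = [];
foreach ($xml->channel->item as $item) {
    // Un-namespaced children (title, link, ...) are plain properties;
    // children in the "content" namespace have to be requested explicitly.
    $content = $item->children('content', true);

    $posts[] = [
        'title' => (string) $item->title,
        'body'  => (string) $content->encoded,
    ];
}

print_r($posts);
```

Casting to string is what pulls the text out of the CDATA section; without the cast you are still holding a SimpleXMLElement object.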

Retrieving attributes from namespaced nodes isn't much different. For example, to get the href attribute of <atom:link>:
Code:
echo (string)$xml->channel->children('atom', true)->link->attributes()['href'];
// Prints
// https://***.****.****/forum/news/index.rss
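
If you would rather not depend on the prefix a feed happens to use, you can pass the full namespace URI and leave out the second argument. The sketch below assumes a made-up feed that binds the Atom namespace to the prefix "a" and uses a placeholder example.com URL; the lookup still works because it matches on the URI, not the prefix.

```php
<?php
// Made-up feed that binds the Atom namespace to the prefix "a" instead of "atom".
$rss = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:a="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example</title>
    <a:link rel="self" type="application/rss+xml" href="https://example.com/index.rss"/>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($rss);

// Second argument omitted (it defaults to false), so the first
// argument is treated as a namespace URI rather than a prefix.
$atom = $xml->channel->children('http://www.w3.org/2005/Atom');

// The href attribute itself is not namespaced, so a plain attributes() call is enough.
$href = (string) $atom->link->attributes()->href;

echo $href, PHP_EOL;
```

The prefix form is shorter to type, while the URI form keeps working if the feed generator ever renames its prefixes.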
 
OP

Adam

New Member
#3
Thank you! I will try this later today and mark your answer as the working one once I'm done.
 