- Object-Oriented PHP Class
- Aggregate multiple RSS feeds specified by an array
- Use caching to prevent unnecessary bandwidth consumption
- Separate caches per feed
- Specify the maximum size of the combined feed
- Be able to either directly output the data, or retrieve it as an XML string
As a result, I decided to construct the class myself.
If you just wish to download the code, you can skip to the code download section.
Creating the MergedRSS Class
The first thing I did was create a class, which I called “MergedRSS”.
|
<?php
class MergedRSS {
    private $myFeeds = null;
    private $myTitle = null;
    private $myLink = null;
    private $myDescription = null;
    private $myPubDate = null;
    private $myCacheTime = null;

    // create our Merged RSS Feed
    public function __construct($feeds = null, $channel_title = null,
            $channel_link = null, $channel_description = null,
            $channel_pubdate = null, $cache_time_in_seconds = 86400) {
        // set variables
        $this->myTitle = $channel_title;
        $this->myLink = $channel_link;
        $this->myDescription = $channel_description;
        $this->myPubDate = $channel_pubdate;
        $this->myCacheTime = $cache_time_in_seconds;
        // initialize feed variable
        $this->myFeeds = array();
        if (isset($feeds)) {
            if (is_array($feeds)) {
                // if it's an array, merge it into our existing array.
                $this->myFeeds = array_merge($this->myFeeds, $feeds);
            } else {
                // if it's a single feed url, just push it into the array
                $this->myFeeds[] = $feeds;
            }
        }
    }
}
?> |
Next, I added a couple of functions to the class to retrieve the RSS feeds, both from the cache and from the web.
|
// retrieves contents from a cache file ; returns null on error
private function __fetch_rss_from_cache($cache_file) {
    if (file_exists($cache_file)) {
        // if file exists, then attempt to read as xml. if it's malformed, PHP emits a Warning
        // and simplexml_load_file() returns false, which we map to null.
        // This is adequate error detection for now.
        $sxe = simplexml_load_file($cache_file);
        return ($sxe === false) ? null : $sxe;
    }
    return null;
}
// retrieves contents of an external RSS feed ; returns null on error
private function __fetch_rss_from_url($url) {
    // Create new SimpleXMLElement instance based on the url. If there's an error, return null
    try {
        // the third argument (true) tells SimpleXMLElement that $url is a location, not XML data
        $sxe = new SimpleXMLElement($url, 0, true);
        return $sxe;
    } catch (Exception $e) {
        return null;
    }
} |
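The Warning mentioned in the cache function can leak into the feed output when display_errors is on. One possible way around that, sketched here rather than taken from the class itself, is to let libxml collect parse errors internally. The helper name `parse_rss_or_null` is hypothetical, and it parses a string instead of a file or URL so that the example runs without a network connection.

```php
<?php
// Hypothetical helper (not part of MergedRSS): parse an XML string and
// return a SimpleXMLElement, or null if the XML is malformed.
function parse_rss_or_null(string $xml): ?SimpleXMLElement {
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $sxe = simplexml_load_string($xml);
    libxml_clear_errors();
    // Restore whatever error-handling mode was active before.
    libxml_use_internal_errors($previous);
    return ($sxe === false) ? null : $sxe;
}

$good = parse_rss_or_null('<rss><channel><item><title>Hi</title></item></channel></rss>');
$bad  = parse_rss_or_null('<rss><channel>'); // unclosed tags, so this is malformed

echo (string) $good->channel->item->title, "\n";
var_dump($bad === null);
```

The same pattern should carry over to simplexml_load_file(), since both functions report parse problems through libxml.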
Next, I needed a storage location and naming convention for my cache files. I ended up storing each feed as a separate cache file inside a directory called “cache”. This required adding write permissions (chmod 0777) to the cache directory. The reason for separate caches per feed was to let me serve a cached copy of a feed in the event of a connectivity issue between my server and one or more feeds.
As for naming the cache file, I decided the easiest way was to replace any character that is not a letter, number, or period with an underscore. This not only keeps the file identifiable when viewing the cache folder via SSH or FTP, but also prevents hackers from overwriting files outside of the cache directory, because the resulting name can no longer contain a slash. To quickly convert a feed URL into a file name, I created the following function:
|
// creates a key for a specific feed url (used for creating friendly file names)
private function __create_feed_key($url) {
    return preg_replace('/[^a-zA-Z0-9.]/', '_', $url) . 'cache';
} |
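To make the naming convention concrete, here is what the key looks like for one of the feed URLs used later in this post, plus a path-traversal attempt. The standalone function name `feed_cache_key` is only for illustration; it mirrors `__create_feed_key()` outside the class so it can be run on its own.

```php
<?php
// Standalone copy of the key logic, for illustration only.
function feed_cache_key(string $url): string {
    // Anything that is not a letter, digit, or period becomes an underscore.
    return preg_replace('/[^a-zA-Z0-9.]/', '_', $url) . 'cache';
}

echo feed_cache_key('http://www.widgetsandburritos.com/feed/'), "\n";
// http___www.widgetsandburritos.com_feed_cache

echo feed_cache_key('../../etc/passwd'), "\n";
// .._.._etc_passwdcache
```

Because every slash is replaced, the second key stays a plain file name inside the cache directory.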
I also knew I would need to sort the items by their pubDate at some point, so I created a comparison function to assist with that. It returns a positive number when $b is newer than $a, which puts the newest items first.
|
// compares two items based on "pubDate"
private function __compare_items($a,$b) {
    return strtotime($b->pubDate) - strtotime($a->pubDate);
} |
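As a quick illustration of how this comparison orders things, the sketch below sorts three made-up items. Plain arrays stand in for the SimpleXMLElement items that the class actually sorts, and the titles and dates are invented for the example.

```php
<?php
// Illustrative items; plain arrays stand in for SimpleXMLElement objects.
$items = array(
    array('title' => 'Older',  'pubDate' => 'Wed, 08 Sep 2010 10:00:00 +0000'),
    array('title' => 'Newest', 'pubDate' => 'Fri, 10 Sep 2010 10:00:00 +0000'),
    array('title' => 'Middle', 'pubDate' => 'Thu, 09 Sep 2010 10:00:00 +0000'),
);

// Newest first: a positive result means $b is more recent than $a.
usort($items, function ($a, $b) {
    return strtotime($b['pubDate']) - strtotime($a['pubDate']);
});

$titles = array_column($items, 'title');
echo implode(', ', $titles), "\n"; // Newest, Middle, Older
```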
At this point, we’re ready to create the main function that makes everything work. To avoid splitting the code up too much, I am going to keep the entire function intact, but I have added comments throughout to make it easier to understand what’s going on.
|
// exports the data as a returned value and/or outputted to the screen
public function export($return_as_string = true, $output = false, $limit = null) {
    // initialize a combined item array for later
    $items = array();
    // loop through each feed
    foreach ($this->myFeeds as $feed_url) {
        // determine my cache file name. for now i assume they're all kept in a directory called "cache"
        $cache_file = "cache/" . $this->__create_feed_key($feed_url);
        // determine whether or not I should use the cached version of the xml
        $use_cache = false;
        if (file_exists($cache_file)) {
            if (time() - filemtime($cache_file) < $this->myCacheTime) {
                $use_cache = true;
            }
        }
        // reset per feed so a failed fetch never reuses the previous feed's items
        $results = null;
        if ($use_cache) {
            // retrieve cached version
            $sxe = $this->__fetch_rss_from_cache($cache_file);
            if ($sxe instanceof SimpleXMLElement) { $results = $sxe->channel->item; }
        } else {
            // retrieve updated rss feed
            $sxe = $this->__fetch_rss_from_url($feed_url);
            if ($sxe instanceof SimpleXMLElement) { $results = $sxe->channel->item; }
            if (!isset($results)) {
                // couldn't fetch from the url. grab a cached version if we can
                if (file_exists($cache_file)) {
                    $sxe = $this->__fetch_rss_from_cache($cache_file);
                    if ($sxe instanceof SimpleXMLElement) { $results = $sxe->channel->item; }
                }
            } else {
                // we need to update the cache file
                $sxe->asXML($cache_file);
            }
        }
        if (isset($results)) {
            // add each item to the master item list
            foreach ($results as $item) {
                $items[] = $item;
            }
        }
    }
    // set all the initial, necessary xml data
    $xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
    $xml .= "<rss version=\"2.0\" xmlns:content=\"http://purl.org/rss/1.0/modules/content/\" xmlns:wfw=\"http://wellformedweb.org/CommentAPI/\" xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:atom=\"http://www.w3.org/2005/Atom\" xmlns:sy=\"http://purl.org/rss/1.0/modules/syndication/\" xmlns:slash=\"http://purl.org/rss/1.0/modules/slash/\">\n";
    // begin adding channel information
    $xml .= "<channel>\n";
    if (isset($this->myTitle)) { $xml .= "\t<title>".$this->myTitle."</title>\n"; }
    // required for validation (self-referencing atom:link; the href built here is an assumption)
    $xml .= "\t<atom:link href=\"http://".$_SERVER['HTTP_HOST'].$_SERVER['PHP_SELF']."\" rel=\"self\" type=\"application/rss+xml\" />\n";
    // more channel information
    if (isset($this->myLink)) { $xml .= "\t<link>".$this->myLink."</link>\n"; }
    if (isset($this->myDescription)) { $xml .= "\t<description>".$this->myDescription."</description>\n"; }
    if (isset($this->myPubDate)) { $xml .= "\t<pubDate>".$this->myPubDate."</pubDate>\n"; }
    // if there are any items to add to the feed, let's do it
    if (sizeof($items) > 0) {
        // sort items
        usort($items, array($this,"__compare_items"));
        // if desired, splice items into an array of the specified size
        if (isset($limit)) { array_splice($items, intval($limit)); }
        // now let's convert all of our items to XML
        for ($i = 0; $i < sizeof($items); $i++) {
            $xml .= $items[$i]->asXML() ."\n";
        }
    }
    $xml .= "</channel>\n</rss>";
    // if output is desired print to screen
    if ($output) { echo $xml; }
    // if user wants results returned as a string, do so
    if ($return_as_string) { return $xml; }
} |
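The cache decision at the top of the loop comes down to one comparison between the file's age and the cache time. The sketch below isolates that check so it can be tried on its own. The helper name `cache_is_fresh` is hypothetical, and a temporary file stands in for a real cached feed.

```php
<?php
// Hypothetical helper: true when $cache_file exists and is younger than $max_age seconds.
function cache_is_fresh(string $cache_file, int $max_age): bool {
    return file_exists($cache_file)
        && (time() - filemtime($cache_file)) < $max_age;
}

// A temporary file stands in for a cached feed.
$tmp = tempnam(sys_get_temp_dir(), 'rss');
touch($tmp, time() - 3600); // pretend it was written an hour ago
clearstatcache();           // make sure filemtime() sees the new timestamp

$fresh_for_a_day   = cache_is_fresh($tmp, 86400); // within the 24-hour default
$fresh_for_ten_min = cache_is_fresh($tmp, 600);   // older than 10 minutes
unlink($tmp);

var_dump($fresh_for_a_day, $fresh_for_ten_min);
```

With the default of 86400 seconds, each feed is fetched from the web at most once a day; anything fetched more recently is served from its cache file.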
So now we have everything we need to retrieve, merge, sort and limit our RSS feeds. We can save this class as mergedrss.php.
Using the MergedRSS Class
In another file called feed.php, we place the following code:
|
<?php
// If there are errors, don't show them. This will break RSS syntax.
ini_set('display_errors', 'off');
include_once("mergedrss.php");
// place our feeds in an array
$feeds = array(
'http://www.widgetsandburritos.com/feed/',
'http://www.dealofthedaysa.com/feed/',
'http://www.iambex.com/feed/',
);
// set the header type
header("Content-type: text/xml");
// set an arbitrary feed date
$feed_date = date("r", mktime(10,0,0,9,8,2010));
// Create new MergedRSS object with desired parameters
$MergedRSS = new MergedRSS($feeds, "My Merged Feed", "http://www.widgetsandburritos.com/",
"This is just a sample merged RSS feed", $feed_date);
//Export the first 10 items to screen
$MergedRSS->export(false, true, 10);
// Retrieve the first 5 items as xml code
$xml = $MergedRSS->export(true, false, 5);
?> |
The Final Results
You can see the above sample feed at: http://www.widgetsandburritos.com/test/feed.php

The XML page cannot be displayed
Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later.
——————————————————————————–
Only one top level element is allowed in an XML document. Error processing resource ‘http://www.widgetsandburritos.com/test…
Warning: SimpleXMLElement::__construct() [<a href=’simplexmlelement.–construct’>simplexmlelement….
There’s apparently something wrong with my sample feed.php file. Not sure what happened. I’ll take a look at it when I get the chance, and make sure it gets corrected. Thanks.
Ok I figured out what happened. One of the feeds that I was referencing went down. I had assumed that when a feed went down, the __fetch_rss_from_url() function would return null. But apparently it was throwing errors. The errors obviously did not use proper XML syntax which caused things to break. So I corrected it this way:
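I changed the __fetch_rss_from_url() function to wrap the SimpleXMLElement call in a try/catch block that returns null on failure (the version now shown in the post above).
I then added ini_set('display_errors', 'off'); to the beginning of my code. I just figure if there are any errors with any of the feeds for whatever reason, instead of breaking the entire script, just ignore the broken feeds. We can always try again later.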
Thanks for this great tutorial, it really helped me with a project where I needed to merge twitter feeds. Still working on the caching part.
If you have any questions let me know.
Hi, I downloaded your nice script, but I’m experiencing some problems with the fetch function.
PHP returns an error:
“This page contains the following errors:
error on line 2 at column 1: Extra content at the end of the document
Below is a rendering of the page up to the first error.”
and the page doesn’t display.
Also, I was wondering about a way to have the script save the XML files so that I can run it from cron on my server.
I wrote this function instead of your echo $xml output
Could you help me solve this?
Thx a lot.
Mike
Mike,
XML pages should only have a single “root element”. Check to make sure all tags are encapsulated inside the <rss></rss> tags. Also, if you’re just writing to a file instead of outputting to the screen, try getting rid of this line:
header("Content-type: text/xml");
This will stop PHP from serving the page as an XML document, so it is treated as the normal HTML content type.
Let me know if this helps or not.
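For the cron case, one possible approach is to take the string returned by export(true, false) and write it to disk. The sketch below is an assumption about how that could look, not the function from the question: the helper name `save_feed_xml` and the target path are hypothetical, and a hard-coded string stands in for the exported feed so the example runs on its own.

```php
<?php
// Hypothetical helper: write feed XML to $path via a temp file and rename,
// so readers never see a half-written file.
function save_feed_xml(string $xml, string $path): bool {
    $tmp = $path . '.tmp';
    if (file_put_contents($tmp, $xml) === false) {
        return false;
    }
    return rename($tmp, $path);
}

// In a real cron script, $xml would come from $MergedRSS->export(true, false).
$xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
     . "<rss version=\"2.0\"><channel></channel></rss>";
$target = sys_get_temp_dir() . '/merged-feed-example.xml';
$saved = save_feed_xml($xml, $target);

var_dump($saved);
```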
Now it works fine! Thx for your help dude
No prob. Glad I could help.
Hello David,
really useful php code, many thanks! I was looking at many other solutions, but none was running that smooth!
Is it somehow possible to identify merged RSS in the result? I mean, I would like to have result look like this:
Website A: Title of RSS item
Website B: Title of RSS item
Sorry for the delayed response. I was away at SXSW from last Thursday till this Tuesday and have been playing catchup since I’ve gotten back. It’s technically possible. Would require a little bit of data manipulation to do so. If I get a chance, I’ll try to set up an example of this.
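A minimal sketch of that kind of data manipulation, assuming the prefix should be the source feed's own channel title: before an item is added to the master list, its title is rewritten. The sample XML is made up, and in the class this would happen inside the loop that fills $items.

```php
<?php
// Illustrative stand-in for one fetched feed (normally loaded from a URL or cache).
$sxe = simplexml_load_string(
    '<rss version="2.0"><channel><title>Website A</title>'
    . '<item><title>First post</title></item>'
    . '<item><title>Second post</title></item>'
    . '</channel></rss>'
);

$source = (string) $sxe->channel->title;
$items = array();
foreach ($sxe->channel->item as $item) {
    // Prefix the item title with the name of the feed it came from.
    $item->title = $source . ': ' . (string) $item->title;
    $items[] = $item;
}

echo $items[0]->title, "\n"; // Website A: First post
```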
Hi,
I was interested in that code, but I got a 404 error when I tried to get it. Did you delete it?
Thank you, alex
If a feed is merged already with feedburner, this script will throw an error because of namespacing, I did the following edit to fix it;
foreach ($results as $item) {
// FIX FEEDBURNER ISSUE
$feedburner = $item->children('feedburner', true)->origLink;
if($feedburner) {
$item->link = $feedburner;
$origLink = " xmlns:feedburner=\"http://rssnamespace.org/feedburner/ext/1.0\"";
}
// FIX FEEDBURNER ISSUE
$items[] = $item;
}
..and lower down I added;
$xml .= "<rss version=\"2.0\" xmlns:content=\"http://purl.org/rss/1.0/modules/content/\" xmlns:wfw=\"http://wellformedweb.org/CommentAPI/\" xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:atom=\"http://www.w3.org/2005/Atom\" xmlns:sy=\"http://purl.org/rss/1.0/modules/syndication/\" xmlns:slash=\"http://purl.org/rss/1.0/modules/slash/\"";
// FIX FEEDBURNER ISSUE
if (!empty($origLink)) { $xml .= $origLink; } // !empty() avoids an undefined-variable notice when no item had an origLink
// FIX FEEDBURNER ISSUE
$xml .= ">\n<channel>\n";