Merging Multiple RSS Feeds Using PHP and Caching

I’ve recently been doing some work with RSS feeds and needed a way to combine multiple feeds into a single feed. A quick Google search turned up a few code samples that do this, but none of them had exactly what I wanted:
  • Object-Oriented PHP Class
  • Aggregate multiple RSS feeds specified by an array
  • Use caching to prevent unnecessary bandwidth consumption
  • Separate caches per feed
  • Specify the maximum size of the combined feed
  • Be able to either directly output the data, or retrieve it as an XML string

As a result, I decided to construct the class myself.

If you just wish to download the code, you can skip to the code download section.

Creating the MergedRSS Class

The first thing I did was create a class, which I called “MergedRSS”.

Next, I added a couple of functions to the class to retrieve the RSS feeds, either from the cache or from the web.

Next, I needed a storage location and a naming convention for my cache files. I ended up storing each feed as a separate cache file inside a directory called “cache”. This required making the cache directory writable by the web server (I used chmod 0777). The reason for separate caches per feed was so that I could still display a cached copy of a feed in the event of a connectivity issue between my server and one or more feeds.
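The original fetch functions are not reproduced here. As a minimal sketch of the per-feed cache idea, assuming PHP 7 or later: the function name load_feed_cached(), its parameters and the $fetch callable are all hypothetical and are not the class's actual API.

```php
<?php
// Hypothetical helper: return the body of one feed, using its own cache file.
// $fetch is any callable that takes a URL and returns the body, or false on failure.
function load_feed_cached(string $url, string $cacheFile, int $ttl, callable $fetch)
{
    // A cache entry younger than $ttl seconds is served without touching the network.
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        return file_get_contents($cacheFile);
    }

    $body = $fetch($url);
    if ($body !== false && $body !== '') {
        // Refresh this feed's cache file only; other feeds are unaffected.
        file_put_contents($cacheFile, $body, LOCK_EX);
        return $body;
    }

    // Download failed: fall back to the stale copy if there is one.
    return is_file($cacheFile) ? file_get_contents($cacheFile) : false;
}
```

A caller could pass something like function ($u) { return @file_get_contents($u); } as $fetch. Keeping the downloader separate from the cache logic also makes the fallback path easy to exercise without a network connection.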

For naming the cache files, I decided the easiest approach was to replace any character that is not a letter, number or period with an underscore. This not only keeps the files identifiable when viewing the cache folder via SSH or FTP, but also prevents a malicious feed URL from overwriting files outside of the cache directory, because slashes never make it into the file name. To quickly convert a feed URL into a file name, I created the following function:
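The original listing is not included here. A minimal sketch of such a conversion, following the rule described above, might look like this; the function name feed_url_to_cache_name() is an assumption.

```php
<?php
// Hypothetical name; the article's own function is not shown.
function feed_url_to_cache_name(string $url): string
{
    // Replace every character that is not a letter, digit or period with "_".
    return preg_replace('/[^a-zA-Z0-9.]/', '_', $url);
}

echo feed_url_to_cache_name('http://example.com/feed?x=1'), "\n";
// prints: http___example.com_feed_x_1
```

One trade-off of this scheme is that two different URLs can map to the same file name (for example, URLs that differ only in "?" versus "&"), which is unlikely to matter for a short, hand-picked list of feeds.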

I also knew I would need to sort the items by their pubDate key at some point, so I created a comparison function to assist with that.
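As an illustration, a comparison callback for usort() could be sketched as follows. The function name and the newest-first order are assumptions, and the spaceship operator requires PHP 7 or later.

```php
<?php
// Hypothetical comparison callback for usort(): newest pubDate first.
function compare_by_pub_date($a, $b): int
{
    // strtotime() returns false for unparseable dates; the (int) cast turns
    // that into 0, so such items end up at the end of the list.
    $timeA = (int) strtotime((string) $a->pubDate);
    $timeB = (int) strtotime((string) $b->pubDate);
    return $timeB <=> $timeA;
}

$items = [
    (object) ['title' => 'older', 'pubDate' => 'Mon, 01 Mar 2010 10:00:00 GMT'],
    (object) ['title' => 'newer', 'pubDate' => 'Tue, 02 Mar 2010 10:00:00 GMT'],
];
usort($items, 'compare_by_pub_date');
echo $items[0]->title, "\n"; // prints: newer
```

The (string) casts mean the same callback works for SimpleXMLElement items as well as for the plain objects used in this example.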

At this point, we’re ready to create the main function that makes everything work. To avoid splitting the code up too much, I am going to keep the entire function intact, but I have added comments throughout to make it easier to understand what’s going on.
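The full function is not reproduced here. As a rough sketch of the merge, sort and limit steps only, the function below works on feed bodies that have already been downloaded. The name merge_rss_strings() and its parameters are hypothetical, and it assumes plain RSS 2.0 input and PHP 7 or later.

```php
<?php
// Hypothetical sketch: merge already-downloaded RSS 2.0 documents into one feed.
function merge_rss_strings(array $feedBodies, string $title, string $link, int $limit = 10): string
{
    libxml_use_internal_errors(true); // collect parse errors instead of printing warnings

    $items = [];
    foreach ($feedBodies as $body) {
        $feed = simplexml_load_string($body);
        if ($feed === false || !isset($feed->channel)) {
            continue; // skip anything that does not parse as RSS
        }
        foreach ($feed->channel->item as $item) {
            $items[] = $item;
        }
    }

    // Newest first, then keep only the first $limit items.
    usort($items, function ($a, $b) {
        return (int) strtotime((string) $b->pubDate) <=> (int) strtotime((string) $a->pubDate);
    });
    $items = array_slice($items, 0, $limit);

    $xml  = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
    $xml .= "<rss version=\"2.0\">\n<channel>\n";
    $xml .= '<title>' . htmlspecialchars($title, ENT_XML1 | ENT_QUOTES) . "</title>\n";
    $xml .= '<link>' . htmlspecialchars($link, ENT_XML1 | ENT_QUOTES) . "</link>\n";
    $xml .= '<description>' . htmlspecialchars($title, ENT_XML1 | ENT_QUOTES) . "</description>\n";
    foreach ($items as $item) {
        $xml .= $item->asXML() . "\n"; // copy each <item> through unchanged
    }
    $xml .= "</channel>\n</rss>\n";
    return $xml;
}
```

Because each item is copied through as-is, items that use namespaced elements need the matching xmlns declarations on the output rss tag, which this sketch does not add.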

So now we have everything we need to retrieve, merge, sort and limit our RSS feeds. We can save this class as mergedrss.php.

Using the MergedRSS Class

In another file called feed.php, we place the following code:
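The original feed.php listing is not reproduced here. To illustrate the two output modes mentioned earlier (print the feed directly, or get it back as an XML string), the sketch below uses a small stand-in class. MergedRSSStandIn, its constructor and its export() method are assumptions, so the real class in mergedrss.php may well look different.

```php
<?php
// Stand-in for the class saved in mergedrss.php; the real constructor and
// method names may differ. It only wraps pre-built <item> strings.
class MergedRSSStandIn
{
    private $itemXml;
    private $title;

    public function __construct(array $itemXml, string $title)
    {
        $this->itemXml = $itemXml;
        $this->title = $title;
    }

    // Return the feed as a string, or send it to the client when $return is false.
    public function export(bool $return = false, int $limit = 10)
    {
        $xml = "<rss version=\"2.0\">\n<channel>\n<title>"
            . htmlspecialchars($this->title, ENT_XML1 | ENT_QUOTES) . "</title>\n"
            . implode("\n", array_slice($this->itemXml, 0, $limit))
            . "\n</channel>\n</rss>\n";
        if ($return) {
            return $xml;
        }
        if (!headers_sent()) {
            header('Content-Type: text/xml');
        }
        echo $xml;
        return null;
    }
}

// feed.php would then boil down to something like this:
$feed = new MergedRSSStandIn(['<item><title>Hello</title></item>'], 'My merged feed');
$feed->export(false, 10);
```

Calling export(true) instead returns the same XML as a string, which is the mode to use when the feed is post-processed or written to a file rather than served directly.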

The Final Results

You can see the above sample feed at: http://www.widgetsandburritos.com/test/feed.php

Download code as *.zip

David Stinemetze is the Lead Developer and Director of Social Media for San Antonio Web Design, SEO and Hosting firm, Internet Direct.


  1. Brandon Lee says:

    The XML page cannot be displayed
    Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later.

    ——————————————————————————–

    Only one top level element is allowed in an XML document. Error processing resource ‘http://www.widgetsandburritos.com/test…

    Warning: SimpleXMLElement::__construct() [<a href=’simplexmlelement.–construct’>simplexmlelement….

    • There’s apparently something wrong with my sample feed.php file. Not sure what happened. I’ll take a look at it when I get the chance, and make sure it gets corrected. Thanks.

    • Ok I figured out what happened. One of the feeds that I was referencing went down. I had assumed that when a feed went down, the __fetch_rss_from_url() function would return null. But apparently it was throwing errors. The errors obviously did not use proper XML syntax which caused things to break. So I corrected it this way:

      Changed the __fetch_rss_from_url() function to the following:

      Then added

      to the beginning of my code. I just figure if there are any errors with any of the feeds for whatever reason, instead of breaking the entire script, just ignore the broken feeds. We can always try again later.
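The corrected code from this reply is not shown above. A minimal sketch of a fetch helper that returns null instead of emitting warnings, in the spirit of that fix, might look like the following. The name fetch_rss_or_null() is hypothetical, and using libxml_use_internal_errors() is an assumption rather than the author's actual change.

```php
<?php
// Hypothetical fetch helper: returns a SimpleXMLElement, or null for a broken feed.
function fetch_rss_or_null(string $url)
{
    // "@" hides the warning that file_get_contents() raises for an unreachable source.
    $body = @file_get_contents($url);
    if ($body === false || $body === '') {
        return null;
    }

    // Keep libxml parse errors out of the output while parsing.
    $previous = libxml_use_internal_errors(true);
    try {
        // LIBXML_NOCDATA merges CDATA sections into ordinary text nodes.
        $feed = new SimpleXMLElement($body, LIBXML_NOCDATA);
    } catch (Throwable $e) {
        $feed = null; // malformed XML, e.g. an HTML error page
    } finally {
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }
    return $feed;
}
```

The merge loop can then skip any feed for which the helper returns null and carry on with the rest.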

  2. Ian says:

    Thanks for this great tutorial, it really helped me with a project where I needed to merge twitter feeds. Still working on the caching part.

  3. Mike says:

    Hi, I downloaded your nice script but I’m experiencing some problems with the fetch function.
    PHP returns an error:
    “This page contains the following errors:
    error on line 2 at column 1: Extra content at the end of the document
    Below is a rendering of the page up to the first error.”
    and the page doesn’t display.

    Also, I was wondering about a way to have the script save the XML files, so that I can run it from cron on my server.
    I wrote this function instead of your echo $xml output

    Could you help me solve this?
    Thx a lot.
    Mike

    • Mike,

      XML pages should only have a single “root element”. Check to make sure all tags are encapsulated inside the <rss></rss> tags. Also, if you’re just writing to a file instead of outputting to the screen, try getting rid of this line:

      This stops PHP from serving the response as XML, so it falls back to the default HTML content type.

      Let me know if this helps or not.
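For the cron use case discussed here, a minimal sketch of writing the merged XML to disk could look like this. The function name save_feed_xml() and the ".tmp" suffix are assumptions, and Mike's own function is not shown above.

```php
<?php
// Hypothetical helper for a cron job: write the merged XML to disk instead of echoing it.
function save_feed_xml(string $xml, string $path): bool
{
    // Write to a temporary file first, then rename it into place, so that
    // anything reading $path never sees a half-written feed.
    $tmp = $path . '.tmp';
    if (file_put_contents($tmp, $xml, LOCK_EX) === false) {
        return false;
    }
    return rename($tmp, $path);
}
```

No header() call is needed in this mode, because nothing is sent to a browser.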

  4. Mike says:

    Now it works fine! Thx for your help dude :)

  5. Peter says:

    Hello David,

    Really useful PHP code, many thanks! I looked at many other solutions, but none ran as smoothly.

  6. Jan says:

    Is it somehow possible to identify the source feed of each item in the merged result? I mean, I would like the result to look like this:
    Website A: Title of RSS item
    Website B: Title of RSS item

    • Sorry for the delayed response. I was away at SXSW from last Thursday until this Tuesday and have been playing catch-up since I got back. It’s technically possible, but it would require a little bit of data manipulation. If I get a chance, I’ll try to set up an example of this.
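One way the data manipulation mentioned here could be sketched is to prefix each item title with its channel title before the items are merged. The function name label_items_with_source() is hypothetical and this is not part of the published class.

```php
<?php
// Hypothetical helper: prefix every item title with the title of the feed it came from.
function label_items_with_source(SimpleXMLElement $feed): array
{
    $source = (string) $feed->channel->title;
    $items = [];
    foreach ($feed->channel->item as $item) {
        $item->title = $source . ': ' . (string) $item->title;
        $items[] = $item;
    }
    return $items;
}

$feed = simplexml_load_string(
    '<rss version="2.0"><channel><title>Website A</title>'
    . '<item><title>Title of RSS item</title></item></channel></rss>'
);
$items = label_items_with_source($feed);
echo $items[0]->title, "\n"; // prints: Website A: Title of RSS item
```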

  7. Pingback: investigating RSS post and comments feeds with tags and categories « Love's Camel

  8. Alex says:

    Hi,

    I was interested in that code, but I got a 404 error when I tried to get it. Did you delete it?

    Thank you, alex

  9. TurboPiPP says:

    If a feed has already been run through FeedBurner, this script will throw an error because of namespacing. I made the following edit to fix it:
    foreach ($results as $item) {
        // FIX FEEDBURNER ISSUE
        $feedburner = $item->children('feedburner', true)->origLink;
        if ($feedburner) {
            // Point the item at the original link and remember to declare the namespace
            $item->link = $feedburner;
            $origLink = " xmlns:feedburner=\"http://rssnamespace.org/feedburner/ext/1.0\"";
        }
        // FIX FEEDBURNER ISSUE
        $items[] = $item;
    }

    ..and lower down I added;

    $xml .= "<rss version=\"2.0\" xmlns:content=\"http://purl.org/rss/1.0/modules/content/\" xmlns:wfw=\"http://wellformedweb.org/CommentAPI/\" xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:atom=\"http://www.w3.org/2005/Atom\" xmlns:sy=\"http://purl.org/rss/1.0/modules/syndication/\" xmlns:slash=\"http://purl.org/rss/1.0/modules/slash/\"";

    // FIX FEEDBURNER ISSUE
    if (!empty($origLink)) { $xml .= $origLink; } // !empty() also covers the case where no FeedBurner item set $origLink
    // FIX FEEDBURNER ISSUE

    $xml .= ">\n<channel>\n";
