How to Build a PHP-based RSS Feed Parser

Recently, I was building a website for a telecommunications startup and because they didn’t have a lot of content for their website, I suggested they use the content from a Yahoo Finance RSS feed. In today’s article, I will demonstrate how to write a simple function in PHP that allows you to display content from an RSS feed.

Wikipedia defines a parser as a part of a compiler, or an interpreter, which checks for proper syntax and builds a data structure based on that syntax. In this tutorial, we will take a look at how PHP parses the feed into a data structure in which each element contains the data found in one of the RSS feed items. I will go into more detail when we start writing the PHP code.

I won’t go much into the XHTML/CSS code in this tutorial. Instead, I’ll focus on the PHP side. We’ll be creating a simple site with just two pages: home and news. The idea is to write two PHP functions that will spit out different content: one displays the titles from the feed, and the other displays the content in list form, like a blog archive.

The Design

This is how the finished website will look.

Final

The design is simple and sleek, nothing too fancy. All it takes is fifteen minutes in Photoshop, using gradients and a bokeh effect for the header design. If you have any questions regarding the creation of this layout, feel free to ask in the comment box at the end of this article.

Before going into the HTML for the site, this is how you should organize the file structure for the finished website:

Folders

We have one folder containing all the images that appear in the site; three PHP files, one for each of the two pages and one containing the functions; and a CSS file. The site in today's tutorial is extremely simple, so a single folder for the images is enough to keep the files well organized. However, if you are building a more complex site, it is advisable to create another folder titled 'includes' or 'scripts' so that you can move the functions.php file there. Also, if you have multiple style sheets, adding a folder titled 'CSS' would help keep your files organized as well.

If you need a local web server to run the PHP files, PC users can try XAMPP, which is the one I use, or WAMP. If you are a Mac user, try MAMP.

The HTML

We are creating a rather simple two-column website layout, so the markup should be easy to understand. However, if you’re new to web design, I recommend reading this wonderful tutorial on NETTUTS, where you will find explanations for most of the HTML/CSS techniques used here.

Whenever you’re starting to code a website, I suggest that you finish the HTML side and only then move onto styling the page with CSS. That way you won’t have some newly added elements break the layout that you’ve spent hours fitting together.

We start our index.php file with the doctype declaration and the head tag. After opening the body tag, we add the top navigation as an unordered list and wrap it in a container DIV:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
	<title>Fictive Company</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
	<link rel="stylesheet" type="text/css" href="style.css" media="screen" />
</head>

<body>
<div class="container">
	<ul id="nav">
		<li><a href="index.php">Home</a></li>
		<li><a href="news.php">News</a></li>
	</ul>
</div>

Later, we will style the container DIV so that it centers the content and keeps it 960 pixels wide to avoid horizontal scrolling on smaller resolutions.

Next, we add the logo and the header DIV that will have the bokeh effect background:

<div id="header">
	<div class="container">
		<a id="logo" href="index.php">Fictive Company</a>
	</div>
</div>

In the main body of the page, we have the central column and sidebar column wrapped in a container DIV:

<div class="container">
	<div id="main">
		<div class="central">
			<div class="box">
				<h1>Welcome to the Fictive Company Website</h1>
				<p>Lorem Ipsum...</p>
			</div>
		</div>

In the sidebar, we have two box DIVs: one for the news titles that are pulled from the RSS feed, and another for a list of links, like the blogroll used in WordPress. After closing the sidebar DIV, we finish with two more closing DIV tags, one for the main DIV and the other for the container.

		<div id="sidebar">
			<div class="box">
				<h2>News Feed</h2>
				<!--This is where the news titles will go.-->
			</div>

			<div class="box">
				<h2>Blogroll</h2>
				<ul class="blogroll">
					<li><a href="http://www.onextrapixel.com/">Onextrapixel</a></li>
					<li><a href="http://10steps.sg/">10Steps.SG</a></li>
					<li><a href="http://www.csswebsites.nl/">CSS Websites</a></li>
				</ul>
			</div>
		</div>
	</div>
</div>

Finally, the footer is added and we close the body and HTML tags:

<div class="container">
	<div id="footer">
		<p id="copyright">Copyright © 2009. by <strong>Fictive Company</strong>. All Rights Reserved.</p>
	</div>
</div>

</body>
</html>

Here is what the page looks like without any styles applied:

Without Style

The CSS

I start the CSS files for all my projects with a slightly modified version of Eric Meyer’s CSS Reset. The CSS reset will save you from a lot of headaches, so do consider using one. Also, it is useful to use little titles as comments so that you can find your way around your CSS files more easily if you have to edit something later.

/* RESET */
html, body, div, span, applet, object, iframe, h1, h2, h3, h4, h5, h6, p, blockquote, pre, a, abbr, acronym, address, big, cite, code, del, dfn, em, font, img, ins, kbd, q, s, samp, small, strike, strong, sub, sup, tt, var, dl, dt, dd, ol, ul, li, fieldset, form, label, legend, table, caption, tbody, tfoot, thead, tr, th, td {
	margin: 0;
	padding: 0;
	border: 0;
	outline: 0;
	font-weight: inherit;
	font-style: inherit;
	font-size: 100%;
	font-family: inherit;
	text-decoration: none;
}
:focus { outline: 0 }
strong { font-weight: bold }
em { font-style: italic }
ol, ul { list-style: none }
a img { border:none } /* Gets rid of IE's blue borders */

I have removed some of the rules that aren’t needed for this website (like tables and blockquotes). You can also remove some of the selectors from the first rule (again: tables, blockquotes, form elements etc.) but I usually leave them there in case there is something I need to add later.

The following section is titled typography because it mainly sets various text properties.

/* TYPOGRAPHY */
body {
	color: #333;
	background: #f2f2f2 url(images/bgBody.png) repeat-x;
	font: 14px/20px Candara, Arial, Helvetica, sans-serif;
	text-align: justify;
}

p { margin-bottom: 20px }
p.desc { margin-bottom: 10px }
p.date { color: #555; font-style: italic }

a { color: #3674b2 }
a:hover { border-bottom: 1px solid #3674b2 }

h1 {
	font: 24px/30px Corbel, Arial, Helvetica, sans-serif;
	font-weight: bold;
	letter-spacing: 1px;
	text-shadow: 0 1px 0 #fff;
	margin-bottom: 15px;
	border-bottom: 1px solid #614578;
	color: #614578;
}

h2 {
	font: 20px/26px Corbel, Arial, Helvetica, sans-serif;
	font-weight: bold;
	text-shadow: 0 1px 0 #fff;
	margin-bottom: 10px;
	border-bottom: 1px solid #614578;
	color: #614578;
}

h1 a, h2 a, h1 a:hover, h2 a:hover { border: none }

ul.blogroll li {
	font-size: 18px;
	line-height: 23px;
	text-align: left;
	margin-bottom: 5px;
	border-bottom: 1px dotted #999;
}

ul.blogroll li:hover { border-bottom: 1px solid #999 }
ul.blogroll li a:hover { border: none }

Next, we will set the rules for the main layout elements:

/* LAYOUT */
.container {
	width: 960px;
	margin: 0 auto;
}

/* HEADER */
#header {
	background: url(images/bgHeader.png) no-repeat top center;
	height: 150px;
	margin: 0 0 30px;
}

#header .container { padding-top: 35px }

#logo {
	background: url(images/logo.png) no-repeat;
	width: 453px;
	height: 80px;
	text-indent: -9999px;
	display: block;
}

#logo:hover { border: none }

/* Navigation */
#nav { text-align: right }
#nav li { display: inline; margin-left: 30px }
#nav li a {
	line-height: 40px;
	color: #7c2c47;
	font-size: 20px;
	text-shadow: 0 1px 0 #fff;
}
#nav li a:hover { border: none; text-shadow: 0 0 3px #7c2c47; }

And now for the main content:

/* MAIN */
#main {
	clear: both;
	float: left;
}

.central {
	width: 630px;
	margin-right: 30px;
	float: left;
}

#sidebar {
	width: 300px;
	float: left;
}

.box {
	background: #e6e6e6 url(images/bgBox.png) repeat-x;
	border: 2px solid #d9d9d9;
	padding: 8px 15px 18px;
	margin-bottom: 30px;
	border-radius: 12px;
	-moz-border-radius: 12px;
	-webkit-border-radius: 12px;
}

I’ve added three border radius properties to the box class. The first one, border-radius, is the official CSS3 property, which at the time of writing isn’t supported by the stable release of any of the top five browsers. The new pre-alpha release of Opera 10.50 adds support for it, but that version is not yet widely used. The other two properties, with the -moz- and -webkit- prefixes, are for Gecko- and WebKit-based browsers, respectively.

With the above code, you’ll see nice, smoothly rounded corners if you view this page in Safari, which is used for the screenshots in this tutorial. Google Chrome and Mozilla Firefox will give you the same view shown here. Unfortunately, current versions of Opera and Internet Explorer (IE) do not support CSS rounded borders. You can add rounded corners in Opera and IE with JavaScript, but that is beyond the scope of this tutorial.

Finally, we style the footer:

/* FOOTER */
#footer {
	text-align: center;
	clear: both;
	margin-bottom: 18px;
	padding-top: 18px;
}

#copyright {
	font-size: 17px;
	color: #555;
}

This is how the page will turn out after the styles have been applied:

With Style

You’ll notice that the News Feed box is empty. That’s because we still haven’t written the PHP functions that will parse the RSS feed and display its content.

The PHP

Before we start writing our parser, we will discuss the different RSS specifications. Onextrapixel’s RSS feed, which we’ll use for this tutorial, is based on the RSS 2.0 specification. The syntax of this specification is rather simple XML:

<rss version="2.0">
<channel>
	<title>Onextrapixel - Showcasing Web Treats Without Hitch</title>
	<link>http://www.onextrapixel.com</link>
	<description>A digital playground for web designers and developers by sharing freebies, great tutorials, useful resources and online tips.</description>
	<item>
		<title>Item Title</title>
		<link>http://www.onextrapixel.com/</link>
		<description>Post or article description (post excerpt)</description>
		<pubDate>Thu, 17 Dec 2009 07:46:56 +0000</pubDate>
	</item>
	<item>
		<title>Item Title</title>
		<link>http://www.onextrapixel.com/</link>
		<description>Post or article description (post excerpt)</description>
		<pubDate>Thu, 17 Dec 2009 07:46:56 +0000</pubDate>
	</item>
</channel>
</rss>

Each feed entry is wrapped in the item tag and contains several tags that are of interest to us. We will be using the title, link, description and pubDate tags.
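To make the structure concrete, here is a minimal sketch that reads those four tags from a single item. It assumes the feed is supplied as an inline string (a trimmed copy of the sample above) and uses simplexml_load_string so that it runs without a network connection; the variable names are only for illustration.

```php
<?php
// Trimmed copy of the sample feed; in real use the XML comes from the feed URL.
$xml = <<<XML
<rss version="2.0">
<channel>
	<title>Onextrapixel - Showcasing Web Treats Without Hitch</title>
	<link>http://www.onextrapixel.com</link>
	<item>
		<title>Item Title</title>
		<link>http://www.onextrapixel.com/</link>
		<description>Post or article description (post excerpt)</description>
		<pubDate>Thu, 17 Dec 2009 07:46:56 +0000</pubDate>
	</item>
</channel>
</rss>
XML;

$rss = simplexml_load_string($xml);

// Every <item> is a child of <channel>; index 0 is the first item.
$firstItem = $rss->channel->item[0];

// Casting to string extracts the text inside each tag.
$title       = (string) $firstItem->title;
$link        = (string) $firstItem->link;
$description = (string) $firstItem->description;
$pubDate     = (string) $firstItem->pubDate;

echo $title . "\n" . $link . "\n" . $description . "\n" . $pubDate . "\n";
```

Each value arrives as plain text, which is all the two functions in this tutorial need.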

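Some sites use other RSS specifications, like 0.91 and 1.0. Feeds in the 0.91 format use the same channel and item structure, so their titles, links and descriptions are read by the parser that we are about to write without any changes, although their items may not include a pubDate tag. RSS 1.0 is a different matter: it is based on RDF, its item tags sit next to the channel tag rather than inside it, and it has no pubDate tag (dates are usually given in a dc:date tag), so the loop in our functions would have to be adjusted before it could read such a feed.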

Finally, we create the functions.php file and start writing the functions that will read the content of the XML file containing the RSS feed.

The simpler code, listing the titles, looks like this:

function parserSide($feedURL) {
	$rss = simplexml_load_file($feedURL);
	echo "<ul class='newsSide'>";
	$i = 0;
	foreach ($rss->channel->item as $feedItem) {
		$i++;
		echo "<li><a href='$feedItem->link' title='$feedItem->title'>" . $feedItem->title . "</a></li>";
		if($i >= 5) break;
	}
	echo "</ul>";
}

Here, I will explain the purpose of each line of code so that beginners can follow along. More experienced readers, please bear with me for a moment.

function parserSide($feedURL) {
	$rss = simplexml_load_file($feedURL);

We have declared a function, parserSide, that receives the feed URL from the page through its $feedURL parameter. Then we’ve assigned the contents of the XML file at that URL to the rss variable. The simplexml_load_file function reads the contents of an XML file and converts it to an object.
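One thing the function above does not handle is a feed that cannot be read. SimpleXML's loading functions return false on failure, so a small guard can stop the page from breaking. The sketch below is only an illustration: loadFeedXml is a hypothetical helper, and it uses simplexml_load_string on inline strings so it can run offline.

```php
<?php
// Hypothetical helper: returns the parsed feed, or null when the XML is invalid.
function loadFeedXml(string $xml): ?SimpleXMLElement {
	// Collect libxml errors internally instead of emitting PHP warnings.
	$previous = libxml_use_internal_errors(true);
	$rss = simplexml_load_string($xml);
	libxml_clear_errors();
	libxml_use_internal_errors($previous);

	// Compare with === false; an empty element can also evaluate as falsy.
	return $rss === false ? null : $rss;
}

$good = loadFeedXml('<rss version="2.0"><channel><title>Demo</title></channel></rss>');
$bad  = loadFeedXml('<rss><channel>'); // unclosed tags, so parsing fails

echo ($good === null ? 'parse failed' : 'parsed: ' . $good->channel->title) . "\n";
echo ($bad === null ? 'parse failed' : 'parsed') . "\n";
```

The same check can be applied to the result of simplexml_load_file before entering the loop.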

echo "<ul class='newsSide'>";
	$i = 0;

Next, we’ve echoed out the opening unordered list tag and given it a newsSide class so that we can easily style it later. We have also set a counter variable, i, to a value of zero.

foreach ($rss->channel->item as $feedItem) {
		$i++;
		echo "<li><a href='$feedItem->link' title='$feedItem->title'>" . $feedItem->title . "</a></li>";
		if($i >= 5) break;
	}
	echo "</ul>";
}

We have come to the foreach loop that will spit out the list items, i.e. news titles wrapped in anchor tags. As the sample feed above shows, every item tag sits inside the channel tag, so we loop over $rss->channel->item, and on each pass the feedItem variable holds the next item from the feed.

The first thing that we do inside the loop is increase the value of the i counter by one. Next, we echo out the list item tags, inside which we’ll place the linked title of the post:

echo "<li><a href='$feedItem->link' title='$feedItem->title'>" . $feedItem->title . "</a></li>";

You’ll notice that not everything we echo out is between the double quotation marks. Instead, we have used the dot operator to concatenate parts of the string that will be echoed out. PHP does expand simple variables such as $feedItem->link inside a double-quoted string, which is why the href and title attributes work as written, but concatenation keeps the variable visibly separate from the surrounding HTML and also works for expressions that cannot be written inside a string, such as function calls. At the end of this tutorial, I have added a link to a post that explains the concatenation of strings in more detail.
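As a quick illustration of the two ways of building that string, the following sketch compares them side by side. It assumes a plain stdClass object as a stand-in for a feed item, since property access works the same way for this purpose.

```php
<?php
// Stand-in for one feed item (an assumption made for this example).
$feedItem = new stdClass();
$feedItem->title = 'Item Title';
$feedItem->link  = 'http://www.onextrapixel.com/';

// Simple property access is expanded inside double quotes...
$interpolated = "<li><a href='$feedItem->link'>$feedItem->title</a></li>";

// ...and the dot operator joins the same pieces explicitly.
$concatenated = "<li><a href='" . $feedItem->link . "'>" . $feedItem->title . "</a></li>";

echo $interpolated . "\n";
echo $concatenated . "\n";
```

Both lines print the same markup, so the choice between them is mostly a matter of readability.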

Finally, we check whether the counter is greater than or equal to the number of items we want to show. If you wanted to list ten items, you would simply write if($i >= 10) and the loop would end after listing ten items. We close the loop with the curly bracket, echo the closing unordered list tag and close the curly bracket for the function.
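Because the titles and links come from a third-party feed, it can be worth escaping them before they are placed into the page. The following is a hedged variant rather than the function used in this tutorial: renderTitles is a hypothetical name, it takes an already loaded feed plus an item limit, and it returns the markup instead of echoing it.

```php
<?php
// Hypothetical variant of parserSide(): escapes every value and returns the HTML.
function renderTitles(SimpleXMLElement $rss, int $limit = 5): string {
	$html = "<ul class='newsSide'>";
	$i = 0;
	foreach ($rss->channel->item as $feedItem) {
		if ($i >= $limit) break;
		$i++;
		// ENT_QUOTES also escapes single quotes, which delimit our attributes.
		$link  = htmlspecialchars((string) $feedItem->link, ENT_QUOTES, 'UTF-8');
		$title = htmlspecialchars((string) $feedItem->title, ENT_QUOTES, 'UTF-8');
		$html .= "<li><a href='$link' title='$title'>$title</a></li>";
	}
	return $html . "</ul>";
}

// Inline sample feed with characters that would otherwise break the markup.
$rss = simplexml_load_string(
	"<rss version='2.0'><channel>"
	. "<item><title>Tom &amp; Jerry's &lt;b&gt;</title><link>http://example.com/?a=1&amp;b=2</link></item>"
	. "<item><title>Second</title><link>http://example.com/2</link></item>"
	. "</channel></rss>"
);

echo renderTitles($rss, 1) . "\n";
```

Passing the limit as a parameter also means the same function can list five titles in the sidebar and ten elsewhere without editing the loop.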

Now, we have to add two little bits of PHP to the index.php file which has so far contained only HTML. We add the following block of code just after the body tag:

<?php require_once "functions.php"; ?>

This will include the functions file so that we can make use of the functions inside of it. The other bit is calling the parserSide function and telling it which RSS feed to parse. We add this line of code just after the H2 tags containing the News Feed title:

<?php parserSide("http://feeds.feedburner.com/onextrapixel"); ?>

Let’s take a look at our progress so far.

News Feed

If you click on the image to view it full size, you will see that the content is pushed down by about twenty pixels. This appeared after we added the first PHP snippet after the body tag. The PHP code itself never reaches the browser, so the most likely cause is that including functions.php sends some stray output, such as a byte-order mark or other characters outside its PHP tags, which the browser treats as content and which shifts everything below it. As a workaround, we will wrap the PHP snippet in a DIV that is absolutely positioned so that it doesn’t interfere with our content. We’ll also make it zero sized, just to be on the safe side.

<div id='zero'><?php require_once "functions.php"; ?></div>

And the CSS for our zero DIV:

#zero {
	position: absolute;
	top: 0;
	left: 0;
	width: 0;
	height: 0;
}

And, voilà:

Absolute Positioned
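If you would rather track down the source of the extra space than hide it, one possible cause to rule out is text that sits outside the PHP tags of the included file, for example a byte-order mark saved by the text editor. The sketch below is an assumption-laden diagnostic, not part of the original site: strayOutput is a hypothetical helper that relies on PHP's tokenizer extension to collect whatever a file would send to the browser as plain output.

```php
<?php
// Hypothetical diagnostic: returns the text that lies outside the PHP tags
// of a source file and would therefore be sent to the browser.
function strayOutput(string $source): string {
	$stray = '';
	foreach (token_get_all($source) as $token) {
		// T_INLINE_HTML tokens hold everything outside the PHP tags.
		if (is_array($token) && $token[0] === T_INLINE_HTML) {
			$stray .= $token[1];
		}
	}
	return $stray;
}

// Two made-up versions of a functions file: one saved with a UTF-8
// byte-order mark in front of the opening tag, one without.
$withBom = "\xEF\xBB\xBF<?php function demo() {}\n";
$clean   = "<?php function demo() {}\n";

var_dump(strayOutput($withBom) !== '');
var_dump(strayOutput($clean) !== '');
```

In a real project you would pass in file_get_contents('functions.php') and, if anything is reported, re-save the file without the extra characters.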

I feel that the look of the list items needs more work, so we’ll be adding some styles to the newsSide class:

ul.newsSide li {
	text-align: left;
	margin-bottom: 5px;
	border-bottom: 1px dotted #999;
}

And we reuse the hover styles from the blogroll class by adding the newsSide selectors to those rules:

ul.blogroll li:hover, ul.newsSide li:hover { border-bottom: 1px solid #999 }
ul.blogroll li a:hover, ul.newsSide li a:hover { border: none }

Now it looks much better:

2nd PHP Function

Now we need to create the second PHP function but, before that, open the index.php in your favourite text editor and save it as news.php. Inside of news.php we make a few small changes. In the central DIV we change the H1 title to Company News and we replace the paragraph tags with the following:

<?php parser("http://feeds.feedburner.com/onextrapixel"); ?>

Your central DIV should look like this:

<div class="central">
	<div class="box">
		<h1>Company News</h1>
		<?php parser("http://feeds.feedburner.com/onextrapixel"); ?>
	</div>
</div>

From the sidebar, we remove the box DIV containing our feed item titles, so the sidebar DIV looks like this:

<div id="sidebar">
	<div class="box">
		<h2>Blogroll</h2>
		<ul class="blogroll">
			<li><a href="http://www.onextrapixel.com/">Onextrapixel</a></li>
			<li><a href="http://10steps.sg/">10Steps.SG</a></li>
			<li><a href="http://www.csswebsites.nl/">CSS Websites</a></li>
		</ul>
	</div>
</div>

Now we’ll go back to the functions.php file to write the second function. The second function is a little more complex because we will be echoing out post titles, content excerpts, publication dates and a ‘Continue Reading’ line. However, the logic is the same, so it shouldn’t be too hard to understand.

function parser($feedURL) {
	$rss = simplexml_load_file($feedURL);
	echo "<ul class='news'>";
	$i = 0;
	foreach ($rss->channel->item as $feedItem) {
		$i++;
		echo "<li>
			<h2 class='news'><a href='$feedItem->link' title='$feedItem->title'>" . $feedItem->title . "</a></h2>
			<p class='desc'>" . $feedItem->description . "</p>
			<p class='date'>Posted on: " . $feedItem->pubDate . "<a class='cont' href='$feedItem->link' title='$feedItem->title'>Continue Reading</a></p>
		</li>";
		if($i >= 5) break;
	}
	echo "</ul>";
}

We start with an unordered list again, but this time we give it the class news because the styling will be different. Inside the foreach loop, we first echo out the opening list item tag and then a linked title wrapped in H2 tags.

Next, we wrap the content excerpt inside a paragraph tag and give it a class of desc, because we will style it to have a smaller bottom margin. Another paragraph tag, with a date class, follows; inside it we show the date on which the item was posted and a ‘Continue Reading’ link to the full post, to which we assign a cont class.

Here is a preview of the News page:

News Page

The look of the news items isn’t quite to my liking so I will make use of the classes I’ve assigned to various elements and create some more CSS rules:

h2.news {
	padding: 3px 8px;
	margin-bottom: 5px;
	background-color: #eee;
	border: none;
	border-radius: 8px;
	-moz-border-radius: 8px;
	-webkit-border-radius: 8px;
}
h2.news:hover { background-color: #fff }

p.desc { margin-bottom: 10px }
p.date { color: #555; font-style: italic }

a.cont { display: block; float: right; font-style: normal }

The H2 headings containing the news titles will no longer be underlined; instead, they will have a nice background with rounded corners. Next, the paragraph containing the content will have a smaller bottom margin so that everything comes together more nicely. We make the date appear in an italic font and a little lighter and make sure that the ‘Continue Reading’ link stays non-italic and we also float it to the right.

Continue Reading

Now it looks much better. The date format, however, needs more work. Fortunately, that is rather easy to change. All we need to do is make a few modifications to the functions.php file. The foreach loop in our parser function will now look like this:

foreach ($rss->channel->item as $feedItem) {
	$i++;
	// Cast to a string so that explode() receives plain text.
	$myDate = (string) $feedItem->pubDate;
	// Split e.g. "Thu, 17 Dec 2009 07:46:56 +0000" at every space.
	$dateForm = explode(" ", $myDate);
	echo "<li>
		<h2 class='news'><a href='$feedItem->link' title='$feedItem->title'>" . $feedItem->title . "</a></h2>
		<p class='desc'>" . $feedItem->description . "</p>
		<p class='date'>Posted on: " . $dateForm[1] . ". " . $dateForm[2] . ". " . $dateForm[3] . "." . "<a class='cont' href='$feedItem->link' title='$feedItem->title'>Continue Reading</a></p>
	</li>";
	if($i >= 5) break;
}

We take the date from the $feedItem->pubDate and assign it to the myDate variable. Next, we make use of PHP’s explode function. The explode function takes a string and splits its content into an array at specific delimiter strings. In our case, this means that the original date string will be split at any space character and the substrings will then be put into an array.

The function takes two arguments: the first one, in double quotes, is the delimiter and the second is the source string. Now that we have split our original date string, we can display it in any way we choose:

<p class='date'>Posted on: " . $dateForm[1] . ". " . $dateForm[2] . ". " . $dateForm[3] . "." . "<a class='cont' href='$feedItem->link' title='$feedItem->title'>Continue Reading</a></p>

I prefer my display date to be in the European format: day, month and year. So I’ve used the second, third and fourth elements of the dateForm array ($dateForm[1] to $dateForm[3]) in their original order; the first element, $dateForm[0], holds the day of the week. If you prefer a different date format, you can choose to rearrange them in any way you like.
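To see exactly what explode returns, here is a small standalone sketch that uses the pubDate value from the sample feed as a fixed string. The $european and $american variable names are just for this example.

```php
<?php
// Date string in the pubDate format used by the sample feed.
$myDate = 'Thu, 17 Dec 2009 07:46:56 +0000';
$dateForm = explode(' ', $myDate);

// $dateForm now holds six pieces:
// [0] 'Thu,'  [1] '17'  [2] 'Dec'  [3] '2009'  [4] '07:46:56'  [5] '+0000'

// Day, month, year (the order used in this tutorial).
$european = $dateForm[1] . '. ' . $dateForm[2] . '. ' . $dateForm[3] . '.';

// Month, day, year: the same pieces rearranged.
$american = $dateForm[2] . ' ' . $dateForm[1] . ', ' . $dateForm[3];

echo $european . "\n";
echo $american . "\n";
```

Note that this approach depends on the feed always using the same date layout; a feed that formats its dates differently would put the pieces at other positions.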

Here is the final look of the News page:

Date Format

In case you wanted to include the time, you would explode the fifth element in the dateForm array ($dateForm[4], because 0 is the first item), set the colon as the delimiter and you’d get an array with three elements: hours, minutes, seconds.
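The time split described above can be sketched as follows, again with the sample pubDate as a fixed string. The second half shows an alternative that is not used in this tutorial: letting PHP's built-in DateTime class parse the whole string and format it in one step.

```php
<?php
$dateForm = explode(' ', 'Thu, 17 Dec 2009 07:46:56 +0000');

// Split the time portion at each colon into hours, minutes and seconds.
list($hours, $minutes, $seconds) = explode(':', $dateForm[4]);
echo $hours . 'h ' . $minutes . 'm ' . $seconds . "s\n";

// Alternative (an assumption, not the tutorial's method): parse the date
// with DateTime, which keeps the +0000 offset given in the string.
$date = new DateTime('Thu, 17 Dec 2009 07:46:56 +0000');
echo $date->format('d. M. Y. H:i') . "\n";
```

The DateTime route avoids counting array positions and makes it easier to switch formats later, at the cost of throwing an exception if the feed supplies a date it cannot parse.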

In Closing

That sums up this tutorial. I hope that it was easy to understand and that you enjoyed it. Below are some links that you might find useful. If you have any questions, suggestions on improving this tutorial, or a tutorial request, please don’t hesitate to mention it in the comments below.

Useful Links