Thousands of sites (particularly news sites and weblogs) publish their latest headlines and/or stories in a machine-readable format so that other sites can easily link to them. This content is usually in the form of an RSS feed (which is an XML-based syndication standard).

You can read aggregated content from many sites using RSS feed readers, such as Amphetadesk.

Drupal provides the means to aggregate feeds from many sites and display these aggregated feeds to your site\'s visitors. To do this, enable the aggregator module in site administration and then go to the aggregator configuration page, where you can subscribe to feeds and set up other options.

How do I find RSS feeds to aggregate?

Many web sites (especially weblogs) display small XML icons or other obvious links on their home page. You can follow these to obtain the web address for the RSS feed. Common extensions for RSS feeds are .rss, .xml and .rdf. For example: Slashdot RSS.

If you can\'t find a feed for a site, or you want to find several feeds on a given topic, try an RSS syndication directory such as Syndic8.

To learn more about RSS, read Mark Pilgrim\'s What is RSS and WebReference.com\'s The Evolution of RSS articles.

NOTE: Enable your site\'s XML syndication button by turning on the Syndicate block in block management.

How do I add a news feed?

To subscribe to an RSS feed on another site, use the aggregation page.

Once there, click the new feed tab. Drupal will then ask for the following:

Once you have submitted the new feed, check to make sure it is working properly by selecting update items on the aggregation page. If you do not see any items listed for that feed, edit the feed and make sure that the URL was entered correctly.

Adding categories

News items can be filed into categories. To create a category, start at the aggregation page.

Once there, select new category from the menu. Drupal will then ask for the following:

Using the news aggregator

The news aggregator has a number of ways that it displays your subscribed content:

Pages that display items (for sources, categories, etc.) display the following for each item:

Additionally, users with the administer news feeds permission will see a link to categorize the news items. Clicking this link lets them select the categories each news item belongs to.

Technical details

Drupal automatically generates an OPML feed file that is available by selecting the XML icon on the News Sources page.

When fetching feeds, Drupal supports conditional GETs, which reduce bandwidth usage for feeds that have not been updated since the last check.

If a feed is permanently moved to a new location, Drupal will automatically update the feed URL to the new address.

', array('%block' => url('admin/block'), '%admin-news' => url('admin/aggregator'), '%new-feed' => url('admin/aggregator/add/feed'), '%new-category' => url('admin/aggregator/add/category'), '%update-items' => url('admin/aggregator'), '%news-aggregator' => url('aggregator'), '%sources' => url('aggregator/sources'), '%categories' => url('aggregator/categories')));
    case 'admin/modules#description':
      return t('Aggregates syndicated content (RSS and RDF feeds).');
    case 'admin/aggregator':
      return t('

Thousands of sites (particularly news sites and weblogs) publish their latest headlines and/or stories in a machine-readable format so that other sites can easily link to them. This content is usually in the form of an RSS feed (which is an XML-based syndication standard). To display the feed or category in a block you must decide how many items to show by editing the feed or block and turning on the feed\'s block.

', array('%block' => url('admin/block')));
    case 'admin/aggregator/add/feed':
      return t('

Add a site that has an RSS/RDF feed. The URL is the full path to the RSS feed file. For the feed to update automatically you must run "cron.php" on a regular basis. If you already have a feed with the URL you are planning to use, the system will not accept another feed with the same URL.

');
    case 'admin/aggregator/add/category':
      return t('

Categories provide a way to group items from different news feeds together. Each news category has its own feed page and block. For example, you could tag various sport-related feeds as belonging to a category called Sports. News items can be added to a category automatically by setting a feed to place its items into that category, or manually by using the categorize items link in any listing of news items.

');
  }
}

function aggregator_settings() {
  $items = array(0 => t('none')) + drupal_map_assoc(array(3, 5, 10, 15, 20, 25), '_aggregator_items');
  $period = drupal_map_assoc(array(3600, 10800, 21600, 32400, 43200, 86400, 172800, 259200, 604800, 1209600, 2419200, 4838400, 9676800), 'format_interval');

  $output = '';
  $output .= form_select(t('Items shown in sources and categories pages'), 'aggregator_summary_items', variable_get('aggregator_summary_items', 3), $items, t('The number of items which will be shown with each feed or category in the feed and category summary pages.'));
  $output .= form_select(t('Discard news items older than'), 'aggregator_clear', variable_get('aggregator_clear', 9676800), $period, t('Older news items will be automatically discarded. Requires crontab.'));
  $output .= form_radios(t('Category selection type'), 'aggregator_category_selector', variable_get('aggregator_category_selector', 'check'), array('check' => t('checkboxes'), 'select' => t('multiple selector')), t('The type of category selection widget which is shown on categorization pages. Checkboxes are easier to use; a multiple selector is good for working with large numbers of categories.'));
  return $output;
}

/**
 * Helper function for drupal_map_assoc.
 */
function _aggregator_items($count) {
  return format_plural($count, '1 item', '%count items');
}

/**
 * Implementation of hook_perm().
 */
function aggregator_perm() {
  return array('administer news feeds', 'access news feeds');
}

/**
 * Implementation of hook_menu().
 */
function aggregator_menu($may_cache) {
  $items = array();

  if ($may_cache) {
    $edit = user_access('administer news feeds');
    $view = user_access('access news feeds');

    $items[] = array('path' => 'admin/aggregator', 'title' => t('aggregator'),
      'callback' => 'aggregator_admin_overview', 'access' => $edit);
    $items[] = array('path' => 'admin/aggregator/edit/feed', 'title' => t('edit feed'),
      'callback' => 'aggregator_admin_edit_feed', 'access' => $edit,
      'type' => MENU_CALLBACK);
    $items[] = array('path' => 'admin/aggregator/edit/category', 'title' => t('edit category'),
      'callback' => 'aggregator_admin_edit_category', 'access' => $edit,
      'type' => MENU_CALLBACK);
    $items[] = array('path' => 'admin/aggregator/remove', 'title' => t('remove items'),
      'callback' => 'aggregator_admin_remove_feed', 'access' => $edit,
      'type' => MENU_CALLBACK);
    $items[] = array('path' => 'admin/aggregator/update', 'title' => t('update items'),
      'callback' => 'aggregator_admin_refresh_feed', 'access' => $edit,
      'type' => MENU_CALLBACK);
    $items[] = array('path' => 'admin/aggregator/list', 'title' => t('list'),
      'type' => MENU_DEFAULT_LOCAL_TASK, 'weight' => -10);
    $items[] = array('path' => 'admin/aggregator/add/feed', 'title' => t('add feed'),
      'callback' => 'aggregator_admin_edit_feed', 'access' => $edit,
      'type' => MENU_LOCAL_TASK);
    $items[] = array('path' => 'admin/aggregator/add/category', 'title' => t('add category'),
      'callback' => 'aggregator_admin_edit_category', 'access' => $edit,
      'type' => MENU_LOCAL_TASK);

    $items[] = array('path' => 'aggregator', 'title' => t('news aggregator'),
      'callback' => 'aggregator_page_last', 'access' => $view,
      'weight' => 5);
    $items[] = array('path' => 'aggregator/sources', 'title' => t('sources'),
      'callback' => 'aggregator_page_sources', 'access' => $view);
    $items[] = array('path' => 'aggregator/categories', 'title' => t('categories'),
      'callback' => 'aggregator_page_categories', 'access' => $view,
      'type' => MENU_ITEM_GROUPING);

    // Sources:
    $result = db_query('SELECT title, 
fid FROM {aggregator_feed} ORDER BY title');
    while ($feed = db_fetch_object($result)) {
      $items[] = array('path' => 'aggregator/sources/'. $feed->fid, 'title' => $feed->title,
        'callback' => 'aggregator_page_source', 'access' => $view);
      $items[] = array('path' => 'aggregator/sources/'. $feed->fid .'/view', 'title' => t('view'),
        'type' => MENU_DEFAULT_LOCAL_TASK, 'weight' => -10);
      $items[] = array('path' => 'aggregator/sources/'. $feed->fid .'/categorize', 'title' => t('categorize'),
        'callback' => 'aggregator_page_source', 'access' => $edit,
        'type' => MENU_LOCAL_TASK);
      $items[] = array('path' => 'aggregator/sources/'. $feed->fid .'/configure', 'title' => t('configure'),
        'callback' => 'aggregator_edit', 'access' => $edit,
        'type' => MENU_LOCAL_TASK, 'weight' => 1);
    }

    // Categories:
    $result = db_query('SELECT title, cid FROM {aggregator_category} ORDER BY title');
    while ($category = db_fetch_object($result)) {
      $items[] = array('path' => 'aggregator/categories/'. $category->cid, 'title' => $category->title,
        'callback' => 'aggregator_page_category', 'access' => $view);
      $items[] = array('path' => 'aggregator/categories/'. $category->cid .'/view', 'title' => t('view'),
        'type' => MENU_DEFAULT_LOCAL_TASK, 'weight' => -10);
      $items[] = array('path' => 'aggregator/categories/'. $category->cid .'/categorize', 'title' => t('categorize'),
        'callback' => 'aggregator_page_category', 'access' => $edit,
        'type' => MENU_LOCAL_TASK);
      $items[] = array('path' => 'aggregator/categories/'. $category->cid .'/configure', 'title' => t('configure'),
        'callback' => 'aggregator_edit', 'access' => $edit,
        'type' => MENU_LOCAL_TASK, 'weight' => 1);
    }

    $items[] = array('path' => 'aggregator/opml', 'title' => t('opml'),
      'callback' => 'aggregator_page_opml', 'access' => $view,
      'type' => MENU_CALLBACK);
  }

  return $items;
}

/**
 * Implementation of hook_cron().
 *
 * Checks news feeds for updates once their refresh interval has elapsed.
 */
function aggregator_cron() {
  $result = db_query('SELECT * FROM {aggregator_feed} WHERE checked + refresh < %d', time());
  while ($feed = db_fetch_array($result)) {
    aggregator_refresh($feed);
  }
}

/**
 * Implementation of hook_block().
 *
 * Generates blocks for the latest news items in each category and feed.
 */
function aggregator_block($op, $delta = 0, $edit = array()) {
  if (user_access('access news feeds')) {
    if ($op == 'list') {
      $result = db_query('SELECT cid, title FROM {aggregator_category} ORDER BY title');
      while ($category = db_fetch_object($result)) {
        $block['category-'. $category->cid]['info'] = t('%title category latest items', array('%title' => theme('placeholder', $category->title)));
      }
      $result = db_query('SELECT fid, title FROM {aggregator_feed} ORDER BY fid');
      while ($feed = db_fetch_object($result)) {
        $block['feed-'. $feed->fid]['info'] = t('%title feed latest items', array('%title' => theme('placeholder', $feed->title)));
      }
    }
    else if ($op == 'configure') {
      list($type, $id) = explode('-', $delta);
      if ($type == 'category') {
        $value = db_result(db_query('SELECT block FROM {aggregator_category} WHERE cid = %d', $id));
      }
      else {
        $value = db_result(db_query('SELECT block FROM {aggregator_feed} WHERE fid = %d', $id));
      }
      $output = form_select(t('Number of news items in block'), 'block', $value, drupal_map_assoc(array(2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)));
      return $output;
    }
    else if ($op == 'save') {
      list($type, $id) = explode('-', $delta);
      if ($type == 'category') {
        $value = db_query('UPDATE {aggregator_category} SET block = %d WHERE cid = %d', $edit['block'], $id);
      }
      else {
        $value = db_query('UPDATE {aggregator_feed} SET block = %d WHERE fid = %d', $edit['block'], $id);
      }
    }
    else if ($op == 'view') {
      list($type, $id) = explode('-', $delta);
      switch ($type) {
        case 'feed':
          if ($feed = db_fetch_object(db_query('SELECT fid, title, block FROM {aggregator_feed} WHERE fid = %d', $id))) {
            $block['subject'] = check_plain($feed->title);
            $result = db_query_range('SELECT * FROM {aggregator_item} WHERE fid = %d ORDER BY timestamp DESC, iid DESC', $feed->fid, 0, $feed->block);
            $block['content'] = '';
          }
          break;

        case 'category':
          if ($category = db_fetch_object(db_query('SELECT cid, title, block FROM {aggregator_category} WHERE cid = %d', $id))) {
            $block['subject'] = check_plain($category->title);
            $result = db_query_range('SELECT i.* FROM {aggregator_category_item} ci LEFT JOIN {aggregator_item} i ON ci.iid = i.iid WHERE ci.cid = %d ORDER BY i.timestamp DESC, i.iid DESC', $category->cid, 0, $category->block);
            $block['content'] = '';
          }
          break;
      }
      $items = array();
      while ($item = db_fetch_object($result)) {
        $items[] = theme('aggregator_block_item', $item);
      }
      $block['content'] = theme('item_list', $items) . $block['content'];
    }
    return $block;
  }
}

function aggregator_remove($feed) {
  $result = db_query('SELECT iid FROM {aggregator_item} WHERE fid = %d', $feed['fid']);
  while ($item = db_fetch_object($result)) {
    $items[] = "iid = $item->iid";
  }
  if ($items) {
    db_query('DELETE FROM {aggregator_category_item} WHERE '. implode(' OR ', $items));
  }
  db_query('DELETE FROM {aggregator_item} WHERE fid = %d', $feed['fid']);
  db_query("UPDATE {aggregator_feed} SET checked = 0, etag = '', modified = 0 WHERE fid = %d", $feed['fid']);
  drupal_set_message(t('Removed news items from %site.', array('%site' => theme('placeholder', $feed['title']))));
}

/**
 * Call-back function used by the XML parser.
 */
function aggregator_element_start($parser, $name, $attributes) {
  global $item, $element, $tag;

  switch ($name) {
    case 'IMAGE':
    case 'TEXTINPUT':
      $element = $name;
      break;
    case 'ITEM':
      $element = $name;
      $item += 1;
  }

  $tag = $name;
}

/**
 * Call-back function used by the XML parser.
 */
function aggregator_element_end($parser, $name) {
  global $element;

  switch ($name) {
    case 'IMAGE':
    case 'TEXTINPUT':
    case 'ITEM':
      $element = '';
  }
}

/**
 * Call-back function used by the XML parser.
 */
function aggregator_element_data($parser, $data) {
  global $channel, $element, $items, $item, $image, $tag;

  switch ($element) {
    case 'ITEM':
      $items[$item][$tag] .= $data;
      break;
    case 'IMAGE':
      $image[$tag] .= $data;
      break;
    case 'TEXTINPUT':
      // The sub-element is not supported. However, we must recognize
      // it or its contents will end up in the item array.
      break;
    default:
      $channel[$tag] .= $data;
  }
}

/**
 * Checks a news feed for new items.
 */
function aggregator_refresh($feed) {
  global $channel, $image;

  // Generate conditional GET headers.
  $headers = array();
  if ($feed['etag']) {
    $headers['If-None-Match'] = $feed['etag'];
  }
  if ($feed['modified']) {
    $headers['If-Modified-Since'] = gmdate('D, d M Y H:i:s', $feed['modified']) .' GMT';
  }

  // Request feed.
  $result = drupal_http_request($feed['url'], $headers);

  // Process HTTP response code.
  switch ($result->code) {
    case 304:
      db_query('UPDATE {aggregator_feed} SET checked = %d WHERE fid = %d', time(), $feed['fid']);
      drupal_set_message(t('No new syndicated content from %site.', array('%site' => theme('placeholder', $feed['title']))));
      break;
    case 301:
      $feed['url'] = $result->redirect_url;
      watchdog('aggregator', t('Updated URL for feed %title to %url.', array('%title' => theme('placeholder', $feed['title']), '%url' => theme('placeholder', $feed['url']))));
      break;
    case 200:
    case 302:
    case 307:
      // Filter the input data:
      if (aggregator_parse_feed($result->data, $feed)) {

        if ($result->headers['Last-Modified']) {
          $modified = strtotime($result->headers['Last-Modified']);
        }

        /*
        ** Prepare the channel data:
        */
        foreach ($channel as $key => $value) {
          $channel[$key] = trim(strip_tags($value));
        }

        /*
        ** Prepare the image data (if any):
        */
        foreach ($image as $key => $value) {
          $image[$key] = trim($value);
        }

        if ($image['LINK'] && $image['URL'] && $image['TITLE']) {
          $image = ''.
          $image['TITLE'] .'';
        }
        else {
          $image = NULL;
        }

        /*
        ** Update the feed data:
        */
        db_query("UPDATE {aggregator_feed} SET url = '%s', checked = %d, link = '%s', description = '%s', image = '%s', etag = '%s', modified = %d WHERE fid = %d", $feed['url'], time(), $channel['LINK'], $channel['DESCRIPTION'], $image, $result->headers['ETag'], $modified, $feed['fid']);

        /*
        ** Clear the cache:
        */
        cache_clear_all();

        $message = t('Syndicated content from %site.', array('%site' => theme('placeholder', $feed['title'])));
        watchdog('aggregator', $message);
        drupal_set_message($message);
      }
      break;
    default:
      $message = t('Failed to parse RSS feed %site: %error.', array('%site' => theme('placeholder', $feed['title']), '%error' => theme('placeholder', $result->code .' '. $result->error)));
      watchdog('aggregator', $message, WATCHDOG_WARNING);
      drupal_set_message($message);
  }
}

/**
 * Parse the W3C date/time format, a subset of ISO 8601. PHP date parsing
 * functions do not handle this format.
 * See http://www.w3.org/TR/NOTE-datetime for more information.
 * Originally from MagpieRSS (http://magpierss.sourceforge.net/).
 *
 * @param $date_str A string with a potentially W3C DTF date.
 * @return A timestamp if parsed successfully or -1 if not.
 */
function aggregator_parse_w3cdtf($date_str) {
  if (preg_match('/(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})(:(\d{2}))?(?:([-+])(\d{2}):?(\d{2})|(Z))?/', $date_str, $match)) {
    // Group 6 captures ":SS" including the colon; the bare seconds are group 7.
    list($year, $month, $day, $hours, $minutes, $seconds) = array($match[1], $match[2], $match[3], $match[4], $match[5], $match[7]);
    // calc epoch for current date assuming GMT
    $epoch = gmmktime($hours, $minutes, $seconds, $month, $day, $year);
    // The literal Z is capture group 11; group 10 holds the offset minutes.
    if (!isset($match[11]) || $match[11] != 'Z') { // Z is zulu time, aka GMT
      list($tz_mod, $tz_hour, $tz_min) = array($match[8], $match[9], $match[10]);
      // zero out the variables
      if (!$tz_hour) {
        $tz_hour = 0;
      }
      if (!$tz_min) {
        $tz_min = 0;
      }
      $offset_secs = (($tz_hour * 60) + $tz_min) * 60;
      // is timezone ahead of GMT? 
      // then subtract offset
      if ($tz_mod == '+') {
        $offset_secs *= -1;
      }
      $epoch += $offset_secs;
    }
    return $epoch;
  }
  else {
    return -1;
  }
}

function aggregator_parse_feed(&$data, $feed) {
  global $items, $image, $channel;

  // Unset the global variables before we use them:
  unset($GLOBALS['element'], $GLOBALS['item'], $GLOBALS['tag']);
  $items = array();
  $image = array();
  $channel = array();

  // parse the data:
  $xml_parser = drupal_xml_parser_create($data);
  xml_set_element_handler($xml_parser, 'aggregator_element_start', 'aggregator_element_end');
  xml_set_character_data_handler($xml_parser, 'aggregator_element_data');

  if (!xml_parse($xml_parser, $data, 1)) {
    $message = t('Failed to parse RSS feed %site: %error at line %line.', array('%site' => theme('placeholder', $feed['title']), '%error' => xml_error_string(xml_get_error_code($xml_parser)), '%line' => xml_get_current_line_number($xml_parser)));
    watchdog('aggregator', $message, WATCHDOG_WARNING);
    drupal_set_message($message, 'error');
    return 0;
  }
  xml_parser_free($xml_parser);

  /*
  ** We reverse the array such that we store the first item last,
  ** and the last item first. In the database, the newest item
  ** should be at the top.
  */
  $items = array_reverse($items);

  foreach ($items as $item) {
    unset($title, $link, $author, $description);

    // Prepare the item:
    foreach ($item as $key => $value) {
      // TODO: Make handling of aggregated HTML more flexible/configurable.
      $value = decode_entities(trim($value));
      $value = strip_tags($value, '