Grouper: RSS manager, XML converter, website scraper

Grouper - Documentation



Blog Plugin

The Blog plugin was originally designed to "scrape" the contents of weblogs and convert them to RSS feeds, but it can be configured to work with other web pages that contain individual items of information in a regular format. The default configuration works with blogs whose templates contain the comment tags noted below, which mark the different parts of the page. To make it work with a particular weblog, you can either add the comment tags to the weblog template or change the plugin's settings to recognize the structure of the existing template. Altering the template is probably the easier method, when possible.
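
To make the idea of marker-based scraping concrete, here is a minimal, self-contained sketch of pulling items out of a page by looking for a pair of comment tags. It is not the Blog plugin's code: the marker strings and the function name extractItems are hypothetical, chosen only for illustration, and the plugin's real comment tags may differ.

```php
<?php
// Hypothetical marker strings: placeholders for illustration only,
// NOT the comment tags the Blog plugin actually looks for.
const ITEM_START = '<!-- item-start -->';
const ITEM_END   = '<!-- item-end -->';

/**
 * Return the text found between each start/end marker pair, in page order.
 */
function extractItems(string $html, string $start = ITEM_START, string $end = ITEM_END): array
{
    $items  = [];
    $offset = 0;
    // Walk through the page, jumping from one start marker to the next.
    while (($from = strpos($html, $start, $offset)) !== false) {
        $from += strlen($start);              // skip past the start marker
        $to = strpos($html, $end, $from);     // find the matching end marker
        if ($to === false) {
            break;                            // unmatched start marker: stop rather than guess
        }
        $items[] = trim(substr($html, $from, $to - $from));
        $offset  = $to + strlen($end);        // continue after the end marker
    }
    return $items;
}

// A tiny stand-in for a weblog page whose template contains the markers.
$page = '<html><body>'
      . '<!-- item-start --><h2>First post</h2><!-- item-end -->'
      . '<div>sidebar</div>'
      . '<!-- item-start --><h2>Second post</h2><!-- item-end -->'
      . '</body></html>';

$items = extractItems($page);
print_r($items);
```

Here $items ends up holding the two heading fragments and the sidebar is skipped. This is why adding markers to the template is convenient: the scraper does not need to understand the rest of the page's HTML.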

Installation:
To use the Blog plugin, blog.php must be located in the "plugins" folder inside the folder containing grouper.php. This is the default location when Grouper Evolution is installed.

Use:
The following code will generate an RSS feed from a weblog that has the necessary comment tags in its template:

<?php
require_once '/YOUR/PATH/TO/grouper/grouper.php';
GrouperLoadPlugin('blog.php');
GrouperSourceURL('http://example.com/blog/');
GrouperShow('','CACHE-FILE-NAME');
?>

Note: if you are using Grouper version 1.4.2 or earlier, you must replace the call to GrouperSourceURL with something like the following:

GrouperSourceConf('searchdomain','example.com');
GrouperSourceConf('querystart','/blog/');
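
Putting the note together with the earlier example, a complete script for version 1.4.2 or earlier might look like the sketch below. It assumes that the other calls are unchanged in those versions and that the two GrouperSourceConf calls belong where GrouperSourceURL was. Judging from the example values, 'searchdomain' appears to take the host and 'querystart' the path of the weblog URL.

```php
<?php
// Sketch for Grouper 1.4.2 or earlier: the same script as above, with
// GrouperSourceURL replaced by two GrouperSourceConf calls.
require_once '/YOUR/PATH/TO/grouper/grouper.php';
GrouperLoadPlugin('blog.php');
GrouperSourceConf('searchdomain','example.com'); // assumed: host of http://example.com/blog/
GrouperSourceConf('querystart','/blog/');        // assumed: path of http://example.com/blog/
GrouperShow('','CACHE-FILE-NAME');
?>
```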

Configuration:
You may configure the behavior of the Blog plugin using the function GrouperSourceConf, as follows:

GrouperSourceConf('OptionName','new value');

The Blog plugin has the following options:

The rest of the options tell Blog what to look for on the page to locate the different parts of the newsfeed. By default, these are comment tags which must be added to the blog template. If you wish to scrape a blog that does not include these tags, you will need to study the HTML source for the blog and find other tags or text that can be used to locate each item. For example, take a look at the source of CaRP Tips. The default value for each is shown in italics: