Grouper - Documentation
Getting Started: Free Download | Purchase | Install
Reference: Functions | Plugins | Themes
Etc.: Configure | Affiliates
Blog Plugin
The Blog plugin was originally designed to "scrape" the contents of weblogs and convert them to RSS feeds, but it can be configured to work with other web pages that contain individual items of information in a regular format. The default configuration works with blogs whose templates use the comment tags listed under Configuration below to mark their different parts. To make it work with a particular weblog, you can either add the comment tags to the weblog template or change the plugin's settings to recognize the structure of the existing template. When it is possible, altering the template is probably the easier method.
Installation:
To use the Blog plugin, blog.php must be located in the "plugins" folder inside the folder containing grouper.php. This is the default location when Grouper Evolution is installed.
Use:
The following code will generate an RSS feed from a weblog that has the necessary comment tags in its template:
<?php
require_once '/YOUR/PATH/TO/grouper/grouper.php';
GrouperLoadPlugin('blog.php');
GrouperSourceURL('http://example.com/blog/');
GrouperShow('','CACHE-FILE-NAME');
?>
Note: if you are using Grouper version 1.4.2 or earlier, you must replace the call to GrouperSourceURL with something like the following:
GrouperSourceConf('searchdomain','example.com');
GrouperSourceConf('querystart','/blog/');
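Put together, a complete script for those older versions might look like the following. This is a sketch that assumes the two GrouperSourceConf calls simply take the place of GrouperSourceURL and that every other line stays the same as in the example above.

```php
<?php
require_once '/YOUR/PATH/TO/grouper/grouper.php';
GrouperLoadPlugin('blog.php');
// Grouper 1.4.2 or earlier: set the domain and the path separately
// instead of calling GrouperSourceURL('http://example.com/blog/').
GrouperSourceConf('searchdomain','example.com');
GrouperSourceConf('querystart','/blog/');
GrouperShow('','CACHE-FILE-NAME');
?>
```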
Configuration:
You may configure the behavior of the Blog plugin using the function GrouperSourceConf, as follows:
GrouperSourceConf('OptionName','new value');
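As an illustration, the sketch below sets a few of the options described in the list that follows. The values are examples only, not defaults or recommendations. The HTML markers assume a hypothetical template that wraps each post in a div, and placing the calls between GrouperLoadPlugin and GrouperShow is an assumption based on the usage example above.

```php
<?php
require_once '/YOUR/PATH/TO/grouper/grouper.php';
GrouperLoadPlugin('blog.php');
GrouperSourceURL('http://example.com/blog/');

// Example values only: cut descriptions at 300 characters and
// append '...' to any description that was cut.
GrouperSourceConf('maxidesc','300');
GrouperSourceConf('atruncidesc','...');

// Look for these item fields only (comma-separated, as documented).
GrouperSourceConf('ifields','title,description,datetime,link');

// Hypothetical template: each post sits in <div class="post"> and is
// closed by a commented </div>. Adjust both strings to your own page.
GrouperSourceConf('itemstart','<div class="post">');
GrouperSourceConf('itemend','</div><!-- end post -->');

GrouperShow('','CACHE-FILE-NAME');
?>
```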
The Blog plugin has the following options:
- maxidesc: The maximum number of characters to include in the item description. Any additional characters will be discarded.
- atruncidesc: The text to add after an item description that has been truncated by the maxidesc setting.
- encoding: [Grouper <= 1.6.1] The character encoding of the page (and thus of the newsfeed). You can usually leave this as it is.
- channeltitle: [Grouper < 1.6] The default title for your RSS channel. If a title is successfully extracted from the page, it will override this value.
- channeldescription: [Grouper < 1.6] The default description for your RSS channel. If a description is successfully extracted from the page, it will override this value.
- cfields: The channel fields (fields that apply to the overall page, as opposed to the individual items) to look for in the page and include, if found, in the RSS feed. Separate multiple values with a comma. Supported values are title and description.
- ifields: The item fields (fields for each individual item) to look for in the page and include, if found, in the RSS feed. Separate multiple values with a comma. Supported values are: title, description, datetime, date, time, author and link. NOTE: Use datetime if the page contains timestamps including both the date and time. If the date and time are listed in separate locations, use date and time instead. Blog will combine them into a single pubDate field in the RSS feed.
- searchdomain: [Grouper < 1.6] The domain name of the blog or other page you wish to scrape (for example, 'www.geckotribe.com'). Use the function GrouperSourceURL to set this option and querystart at the same time.
- querystart: [Grouper < 1.6] The path to the page you wish to scrape. This value MUST begin with '/'. If the path is to a directory and the document contains relative links, it must end with '/' for the links to be processed correctly. Note that this applies only to link fields, not to links in the description text (which are not altered by this plugin). Use the function GrouperSourceURL to set this option and searchdomain at the same time.
- tossbefore (default: <!-- GrouperStart -->): If this value is not blank, anything appearing before it on the page will be discarded.
- tossafter (default: <!-- GrouperEnd -->): If this value is not blank, anything appearing after it on the page will be discarded.
- channelstart & channelend (defaults: <!-- Heading --> & <!-- /Heading -->): Everything appearing between these values will be searched for the channel information.
- itemsstart & itemsend (defaults: <!-- Blog Posts --> & <!-- /Blog Posts -->): Everything appearing between these values will be searched for ALL of the individual items.
- itemstart & itemend (defaults: <!-- Item --> & <!-- /Item -->): Everything appearing between these values will be searched for the fields in EACH individual item.
- ctitlestart & ctitleend (defaults: <!-- Title --> & <!-- /Title -->): Everything appearing between these values (within the channelstart/channelend section) will be used as the channel title.
- cdescriptionstart & cdescriptionend (defaults: <!-- Description --> & <!-- /Description -->): Everything appearing between these values (within the channelstart/channelend section) will be used as the channel description.
- ititlestart & ititleend (defaults: <!-- Title --> & <!-- /Title -->): Everything appearing between these values (within an individual itemstart/itemend section) will be used as that item's title.
- idescriptionstart & idescriptionend (defaults: <!-- Description --> & <!-- /Description -->): Everything appearing between these values (within an individual itemstart/itemend section) will be used as that item's description.
- iauthorstart & iauthorend (defaults: <!-- Author --> & <!-- /Author -->): Everything appearing between these values (within an individual itemstart/itemend section) will be used as that item's author.
- ilinkstart & ilinkend (defaults: <!-- Link --> & <!-- /Link -->): Everything appearing between these values (within an individual itemstart/itemend section) will be used as that item's link.
- idatetimestart & idatetimeend (defaults: <!-- DateTime --> & <!-- /DateTime -->): Everything appearing between these values (within an individual itemstart/itemend section) will be used as that item's pubDate. If DateTime, Date and Time are all found, DateTime will override Date and Time.
- idatestart & idateend (defaults: <!-- Date --> & <!-- /Date -->): Everything appearing between these values (within an individual itemstart/itemend section) will be used as that item's date, which will be combined with the time to generate the item's pubDate. If DateTime is found, it will override Date and Time.
- itimestart & itimeend (defaults: <!-- Time --> & <!-- /Time -->): Everything appearing between these values (within an individual itemstart/itemend section) will be used as that item's time, which will be combined with the date to generate the item's pubDate. If DateTime is found, it will override Date and Time.
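The way the default markers nest can be pictured with a small standalone sketch. It does not use Grouper and is not the plugin's actual implementation. The helper between() and the sample page in $page are invented here purely for illustration. It shows why the same Title markers can serve both the channel and the items: each lookup is confined to its enclosing section.

```php
<?php
// Illustrative only: NOT the Blog plugin's code. between() and $page
// are hypothetical names made up for this sketch.

// Return the trimmed text between the first $start marker and the next
// $end marker in $text, or '' when either marker is missing.
function between(string $text, string $start, string $end): string
{
    $from = strpos($text, $start);
    if ($from === false) {
        return '';
    }
    $from += strlen($start);
    $to = strpos($text, $end, $from);
    if ($to === false) {
        return '';
    }
    return trim(substr($text, $from, $to - $from));
}

// A made-up weblog page carrying the default comment tags.
$page = <<<'HTML'
<html><body>
<!-- GrouperStart -->
<!-- Heading -->
<h1><!-- Title -->My Weblog<!-- /Title --></h1>
<p><!-- Description -->Notes on feeds<!-- /Description --></p>
<!-- /Heading -->
<!-- Blog Posts -->
<!-- Item -->
<h2><!-- Title -->First post<!-- /Title --></h2>
<!-- Link -->/blog/first.html<!-- /Link -->
<!-- Description -->Hello, world.<!-- /Description -->
<!-- /Item -->
<!-- Item -->
<h2><!-- Title -->Second post<!-- /Title --></h2>
<!-- Link -->/blog/second.html<!-- /Link -->
<!-- Description -->More text.<!-- /Description -->
<!-- /Item -->
<!-- /Blog Posts -->
<!-- GrouperEnd -->
</body></html>
HTML;

// Channel fields are searched for only inside the Heading section.
$heading = between($page, '<!-- Heading -->', '<!-- /Heading -->');
$channelTitle = between($heading, '<!-- Title -->', '<!-- /Title -->');

// Item fields are searched for inside each Item section of Blog Posts.
$posts = between($page, '<!-- Blog Posts -->', '<!-- /Blog Posts -->');
$itemTitles = [];
$offset = 0;
while (($s = strpos($posts, '<!-- Item -->', $offset)) !== false) {
    $e = strpos($posts, '<!-- /Item -->', $s);
    if ($e === false) {
        break; // unterminated item: stop scanning
    }
    $item = substr($posts, $s, $e - $s);
    $itemTitles[] = between($item, '<!-- Title -->', '<!-- /Title -->');
    $offset = $e + strlen('<!-- /Item -->');
}

echo $channelTitle, "\n";
echo implode(', ', $itemTitles), "\n";
```

A template marked up like the sample page should need no marker changes. A page with different markup would instead need the corresponding start and end options set through GrouperSourceConf.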