README for RDFGrabber

   RDFGrabber is a Zope product for retrieving and searching content from
   other web sites, provided they make it available in
   "RDF":http://www.w3.org/RDF/ format.

   The benefit of doing it this way is that the data you get is not
   encumbered with HTML, giving you more flexibility when applying your
   own look and feel.

How to use it

   First install RDFGrabber.tgz in the Products folder and restart
   Zope. You will then be able to create objects of the type "RDF
   Grabber". The form asks for four things: the id, the title, the URL
   or URLs of the RDF file or files, and an optional proxy.
   Enter a URL that returns well-formed RDF.
   If your RDF file is password protected, you can specify authentication
   parameters like this:
   http://user:password@host.domain.com/file.rdf
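
   To illustrate what such a URL carries, the sketch below splits the
   credentials from the rest of the address using the Python standard
   library. It is not RDFGrabber's own code; the function name
   split_credentials is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_credentials, username, password)."""
    parts = urlsplit(url)
    # hostname excludes the "user:password@" prefix; re-attach the port if any.
    host = parts.hostname or ""
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)
    clean = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment))
    return clean, parts.username, parts.password


clean_url, user, password = split_credentials(
    "http://user:password@host.domain.com/file.rdf")
```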

   I have created some test files on Zope.org:
   Prefix all names with http://www.zope.org/Members/EIONET/RDFGrabber/

   rdfexample1.rdf -- Simple example
   
   rdfexample2.rdf -- The second example.

   rdfexample2.rdf -- The third example.

   When you have created the object, you must update (synchronize) it
   with the content on the remote web server. Click Update to do so.
   The most common mistake is a badly encoded file, in which case you
   get a syntax error.

   There is also an optional property for a proxy-server. You enter the
   URL of the proxy as in http://proxy.mycompany.com:8080.
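
   For comparison, this is roughly how a plain Python program would send
   its requests through the same kind of proxy. It is a sketch with the
   standard library only, and it does not show how RDFGrabber itself
   applies the proxy property.

```python
import urllib.request

# Example proxy URL from the text above.
PROXY_URL = "http://proxy.mycompany.com:8080"

# Route plain HTTP requests through the proxy.
proxy_handler = urllib.request.ProxyHandler({"http": PROXY_URL})
opener = urllib.request.build_opener(proxy_handler)

# opener.open(url) would now fetch url via the proxy; no request is
# made until open() is called.
```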

   Let's say you have created an RDF Grabber object called articledb. Then
   insert this in your DTML document to query the RDF:
<pre>
&lt;dtml-with articledb&gt;
&lt;dtml-in "query(predicate='http://purl.org/metadata/dublin_core#Title')"&gt;
	&lt;dtml-with sequence-item&gt;
		&lt;dtml-var subject&gt;
		&lt;dtml-var predicate&gt;
		&lt;dtml-var object&gt;
	&lt;/dtml-with&gt;
&lt;/dtml-in&gt;
&lt;/dtml-with&gt;
</pre>

   If you want your RDF object to import data on a regular basis, you
   can write a program that updates it by doing a GET on its update
   method, for example: lynx -source http://www.mysite.com/slashdot/update &gt;/dev/null
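
   The same GET can be issued from a small Python script, for instance
   one started by cron. This is a minimal sketch: the function name
   trigger_update and its opener parameter are hypothetical, and the URL
   is the example address from above.

```python
import urllib.request


def trigger_update(update_url, opener=urllib.request.urlopen, timeout=30):
    """GET the update method and return the HTTP status code."""
    with opener(update_url, timeout=timeout) as response:
        response.read()  # discard the body, like ">/dev/null" with lynx
        return response.status


# Usage (performs a real request, so it is left as a comment here):
# trigger_update("http://www.mysite.com/slashdot/update")
```

   The opener parameter exists only so the function can be exercised
   without network access.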

How it works

   An RDF file consists of triples of the form (Subject, Predicate,
   Object). They are implemented as Python tuples.
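
   As a rough illustration of triples held as tuples, and of filtering
   them by predicate the way the DTML example does, consider the sketch
   below. The sample data and the standalone query function are
   assumptions made for illustration; only the query(predicate=...) call
   form comes from the example above.

```python
DC_TITLE = "http://purl.org/metadata/dublin_core#Title"
DC_CREATOR = "http://purl.org/metadata/dublin_core#Creator"

# Each triple is a plain tuple: (subject, predicate, object).
triples = [
    ("http://www.example.org/article1", DC_TITLE, "First article"),
    ("http://www.example.org/article1", DC_CREATOR, "Jane Doe"),
    ("http://www.example.org/article2", DC_TITLE, "Second article"),
]


def query(triples, predicate):
    """Return the triples whose predicate matches exactly."""
    return [t for t in triples if t[1] == predicate]


# Collect the object part of every Title triple.
titles = [obj for (_subject, _predicate, obj) in query(triples, DC_TITLE)]
```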

   A Python dictionary is also known as an associative array. It is kind
   of like a sack, where you can put all your goodies tagged with a
   keyword you can use to get them back.

   RDFGrabber parses the RDF-file and stores each tag inside the four
   main parts under a keyword. Since there are only a few
   mandatory tags, you must typically first check if the dictionary
   contains the item before you can use it.
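
   In Python terms, checking before use looks like the sketch below. The
   keys "title", "link" and "description" are hypothetical and stand in
   for whatever keywords a given RDF file produces.

```python
# A parsed item in which the optional "description" tag is absent.
item = {
    "title": "First article",
    "link": "http://www.example.org/article1",
}

# Explicit membership test before using an optional key.
if "description" in item:
    description = item["description"]
else:
    description = ""

# Equivalent shorthand: get() returns the default when the key is missing.
creator = item.get("creator", "unknown")
```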

   RDFGrabber supports core RDF and modules such as syndication and
   Dublin Core. The support is very simple: it
   maps the namespaces to easily usable keywords. The Dublin Core
   has one tag for dates, but RDFGrabber does not try to understand the
   date. It just treats it as a string.
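
   The idea of mapping namespaces to keywords can be sketched as follows.
   The prefixes "dc" and "sy", the underscore-joined keyword form and the
   function name to_keyword are assumptions for illustration; the actual
   keywords RDFGrabber produces may differ.

```python
# Hypothetical namespace-to-prefix table.
NAMESPACE_PREFIXES = {
    "http://purl.org/metadata/dublin_core#": "dc",
    "http://purl.org/rss/1.0/modules/syndication/": "sy",
}


def to_keyword(uri):
    """Replace a known namespace with its short prefix."""
    for namespace, prefix in NAMESPACE_PREFIXES.items():
        if uri.startswith(namespace):
            return prefix + "_" + uri[len(namespace):]
    return uri  # unknown namespaces are left untouched


# The date value is kept as an uninterpreted string.
record = {to_keyword("http://purl.org/metadata/dublin_core#Date"): "2001-05-14"}
```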

Querying

  The RDF-file can be queried on its predicates. The query tab shows an
  example of this as a combo box presenting the parsed predicates.
  The user can also fill in an arbitrary expression value.
  the API 

Persistence
  
  The RDF source is not stored in the ZODB, because the object is likely to
  be updated often and the ZODB would grow considerably as a result.
  Instead, it is held as volatile data and dumped to the filesystem
  using the Python "pickle" module.
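
  Dumping parsed data to a file with pickle and reading it back can be
  sketched like this. The file name, the helper names and the sample
  triple are hypothetical; where RDFGrabber actually writes its file is
  not shown here.

```python
import os
import pickle
import tempfile


def dump_triples(triples, path):
    """Write the parsed triples to a file on the filesystem."""
    with open(path, "wb") as f:
        pickle.dump(triples, f)


def load_triples(path):
    """Read the triples back, for example after a restart."""
    with open(path, "rb") as f:
        return pickle.load(f)


triples = [
    ("http://www.example.org/article1",
     "http://purl.org/metadata/dublin_core#Title",
     "First article"),
]
cache_path = os.path.join(tempfile.mkdtemp(), "articledb.pickle")
dump_triples(triples, cache_path)
restored = load_triples(cache_path)
```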

Restrictions & peculiarities

   Encoding -- The encoding from the xml processing instruction is saved and
   added to the rdf dictionary.

   HTML -- HTML (or XHTML) is not allowed inside an RDF file. This may
   come as a surprise to some, but allowing it would defeat what RDF is
   trying to achieve.

   Entities -- All known and unknown entities are supported.

Acknowledgements

   The parser is inspired by the
   <a href="http://sourceforge.net/projects/redfoot">redfoot project</a>.
