The Sponger is middleware built into the Virtuoso SPARQL Processor for extracting RDF "on the fly" from non-RDF Web data sources.
Most of the world's data currently resides in non-RDF form. The Sponger accelerates the bootstrapping of the Semantic Data Web by unobtrusively generating RDF from these non-RDF data sources.
When an RDF-aware client requests data from a network-accessible resource via the Sponger, the following events occur:
The imported data forms a local cache with invalidation rules conforming to those of traditional HTTP clients (Web Browsers).
That is to say, expiration is determined on subsequent fetches of the same resource by comparing the current time with the expiration time stored in the local cache (note: the first data load records the 'expires' header).
If the source data server does not return an HTTP 'expires' header, the Sponger derives its own invalidation time frame by evaluating the 'date' and 'last-modified' HTTP headers.
Irrespective of path taken, local cache invalidation is driven by an assessment of current time relative to recorded expiration time.
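The invalidation logic described above can be sketched as follows. This is only an illustration, not the Sponger's actual implementation: the function names and the lowercase header dictionary are hypothetical, and the fallback rule (10% of the interval between 'last-modified' and 'date') is an assumption borrowed from common HTTP cache heuristics, since the exact derivation is not specified here.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def expiration_time(headers, fetched_at):
    """Return the time at which a cached copy should be treated as stale.

    `headers` is assumed to be a dict with lowercase HTTP header names.
    """
    # Preferred path: the server supplied an explicit 'expires' header.
    if "expires" in headers:
        return parsedate_to_datetime(headers["expires"])

    # Fallback path: derive a lifetime from 'date' and 'last-modified'.
    # ASSUMPTION: lifetime = 10% of the resource's age at serving time.
    if "date" in headers and "last-modified" in headers:
        served = parsedate_to_datetime(headers["date"])
        modified = parsedate_to_datetime(headers["last-modified"])
        age = max(served - modified, timedelta(0))
        return served + age / 10

    # No usable headers: treat the copy as expired as soon as it is fetched.
    return fetched_at


def is_stale(expires_at, now):
    """Cache invalidation: compare current time with recorded expiration."""
    return now >= expires_at
```

Whichever path produces the expiration time, the staleness check itself is the same comparison of current time against the recorded value.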
Architecturally, the Sponger is composed of Cartridges, which are in turn composed of RDF Extractors (RDFizers) and Ontology (Schema) Mappers.
The Schema Mappers are typically written with XSLT (e.g., GRDDL and other OpenLink Mapping Schemes) or Virtuoso PL. The Metadata Extractors may be developed in Virtuoso PL, C/C++, Java, or any other language that can be integrated into Virtuoso via its server extension APIs.
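As a purely conceptual sketch of this composition, a cartridge can be modeled as an extractor paired with a schema mapper. Nothing here is Virtuoso API: the `Cartridge` class, the toy `title_extractor`, and the `dc_mapper` are hypothetical names chosen for illustration, and real cartridges would be written in Virtuoso PL, XSLT, or another supported language.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Cartridge:
    """Hypothetical model: an extractor paired with a schema mapper."""
    name: str
    extractor: Callable[[str, str], Dict[str, str]]       # (url, content) -> metadata
    mapper: Callable[[str, Dict[str, str]], List[Triple]]  # (url, metadata) -> triples

    def run(self, url: str, content: str) -> List[Triple]:
        # Extraction first, then mapping onto a target vocabulary.
        return self.mapper(url, self.extractor(url, content))


def title_extractor(url: str, content: str) -> Dict[str, str]:
    """Toy extractor: pull the <title> out of an HTML document."""
    match = re.search(r"<title>(.*?)</title>", content, re.IGNORECASE | re.DOTALL)
    return {"title": match.group(1).strip()} if match else {}


def dc_mapper(url: str, metadata: Dict[str, str]) -> List[Triple]:
    """Toy mapper: express the extracted title as a Dublin Core triple."""
    triples: List[Triple] = []
    if "title" in metadata:
        triples.append((url, "http://purl.org/dc/elements/1.1/title", metadata["title"]))
    return triples


cartridge = Cartridge("html-title", title_extractor, dc_mapper)
triples = cartridge.run("http://example.org/", "<html><title> Hello </title></html>")
```

Keeping the two stages separate mirrors the split described above: the extractor knows the source format, while the mapper knows the target ontology.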
The Sponger can also be used via Virtuoso's built-in REST-style Web Service, through the Proxy endpoint of any Virtuoso installation.
Note: The cartridges_filesystem.vad package must be installed for the actual extraction and mapping to occur.
The Sponger is very much like an implementation of cURL, exposed as a built-in Virtuoso Web Service (so you can interact with it as you do with Triples).
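A request to the Proxy endpoint might be constructed along these lines. This is a minimal sketch under stated assumptions: the query parameter names `url` and `force`, and the `http://localhost:8890/proxy` endpoint of a local installation, are assumptions for illustration and should be checked against the documentation of the Virtuoso version in use. The helper only builds the request URL; it does not contact a server.

```python
from urllib.parse import urlencode


def sponger_proxy_url(endpoint, target_url, force="rdf"):
    """Build a Sponger proxy request URL for a non-RDF resource.

    ASSUMPTION: the endpoint accepts 'url' and 'force' query parameters.
    """
    query = urlencode({"url": target_url, "force": force})
    return f"{endpoint}?{query}"


# Hypothetical local endpoint; the target is one of the example URIs in this document.
request_url = sponger_proxy_url("http://localhost:8890/proxy", "http://fgiasson.com")
print(request_url)
# prints http://localhost:8890/proxy?url=http%3A%2F%2Ffgiasson.com&force=rdf
```

The resulting URL can then be handed to any HTTP client or RDF-aware application, much as one would pass a URL to cURL.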
The RDF Cartridges (mappers and extractors) are packaged as Virtuoso VAD packages, easily installed via Virtuoso's ISQL interface or the browser-based Virtuoso Conductor.
Enter the following URIs into a third-party RDF client application or service:
http://fgiasson.com as a URI in the OpenLink Browser (which has built-in support for /proxy)
http://www.google.com/base/feeds/snippets?bq=%20%5Bemployer:%20Hewlett-Packard%5D%20%20%5Bjob%20type:full-time%5D
http://dpedia.openlinksw.com:8890/proxy
http://dbpedia.openlinksw.com:8890/DAV/JS/rdfbrowser/index.html -- click on the Images tab