The Virtuoso Sponger is the Linked Data middleware component of Virtuoso. It generates Linked Data from a variety of data sources, and supports a wide variety of data representation and serialization formats. The Sponger is transparently integrated into Virtuoso's SPARQL Query Processor where it delivers URI de-referencing within SPARQL query patterns, across disparate data spaces. It also delivers configurable smart HTTP caching services. Optionally, it can be used by the Virtuoso Content Crawler to periodically populate and replenish data within the native RDF Quad Store.
The Sponger is also a full-fledged HTTP proxy service, directly accessible via SOAP or REST interfaces.
As depicted below, OpenLink's broad portfolio of Linked-Data-aware products supports a number of routes for creating or consuming Linked Data. The Sponger provides a key platform for developers to generate quality data meshes from unstructured or semi-structured data sources.
Most of the world's data currently resides in non-Linked-Data form. The Sponger delivers middleware that accelerates the bootstrap of the Semantic Data Web by unobtrusively generating Linked Data (typically in RDF form, today) from non-Linked-Data data sources. This "Swiss army knife" for on-the-fly Linked Data generation provides a bridge between the traditional Document Web and the Linked Data Web ("Data Web").
Sponging non-Linked-Data Web sources and converting their data content to Linked Data exposes that data in a canonical form for querying and inference, and enables fast and easy construction of Linked-Data-driven "mesh-ups" (as opposed to code-driven Web 2.0 mash-ups).
Linked Data extraction and instance data generation products that offer functionality similar to that demonstrated by the Sponger are also commonly referred to as "RDFizers."
Designed with a pluggable architecture, the Sponger's core functionality is provided by Cartridges. Each cartridge includes Data Extractors which extract data from one or more data sources, and Ontology Mappers which map the extracted data to one or more ontologies/schemas, en route to producing RDF Linked Data.
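The split between Data Extractors and Ontology Mappers can be illustrated with a small Python sketch. This is not the Sponger's actual cartridge API: the `Cartridge` class, the toy key/value extractor, and the mapping function are all hypothetical names chosen for illustration. Only the Dublin Core Terms namespace is real.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

DC = "http://purl.org/dc/terms/"  # Dublin Core Terms namespace


@dataclass
class Cartridge:
    """Hypothetical cartridge: one extractor paired with one ontology mapper."""
    name: str
    extract: Callable[[str], Dict[str, str]]                # raw content -> fields
    map_to_ontology: Callable[[str, Dict[str, str]], List[Triple]]  # fields -> triples

    def sponge(self, uri: str, content: str) -> List[Triple]:
        # Extraction first, then mapping onto an ontology.
        return self.map_to_ontology(uri, self.extract(content))


def extract_key_values(content: str) -> Dict[str, str]:
    """Toy extractor: read 'Key: value' lines from a text resource."""
    fields: Dict[str, str] = {}
    for line in content.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().lower()] = value.strip()
    return fields


def map_to_dublin_core(uri: str, fields: Dict[str, str]) -> List[Triple]:
    """Toy mapper: keep only fields that have a Dublin Core equivalent."""
    known = {"title": DC + "title", "creator": DC + "creator"}
    return [(uri, known[k], v) for k, v in fields.items() if k in known]


cartridge = Cartridge("toy-text", extract_key_values, map_to_dublin_core)
triples = cartridge.sponge(
    "http://example.org/doc",
    "Title: Hello\nCreator: Alice\nIgnored: x",
)
for triple in triples:
    print(triple)
```

Keeping the two stages as separate callables mirrors the idea that one extractor could feed several mappers, or that a mapper could be swapped to target a different ontology without touching the extraction logic.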
Cartridges are highly customizable, and can be developed using any language supported by the Virtuoso Server Extensions API. This enables generation of structured linked data from virtually any resource type, rather than limiting users to resource types supported by the default Sponger Cartridge collection bundled as part of the Virtuoso Sponger VAD package (cartridges_dav.vad).
The Sponger also includes a pluggable name resolution mechanism that enables Custom Resolvers for naming schemes (e.g., URNs) associated with protocols beyond HTTP. Examples of custom resolvers include:
|URN handler||Sample URI||Resource Description||Linked Data View||Linked Data Graph||Needs|
|DOI|| ||HTML Representation||Linked Data View||Data Explorer View|| |
|LSID|| ||HTML Representation||Linked Data View||Data Explorer View||None|
|OAI|| ||HTML Representation||Linked Data View||Data Explorer View||None|
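As a rough illustration of pluggable name resolution, the sketch below keeps a registry that maps a naming-scheme prefix to a function returning a dereferenceable HTTP URL. The registry, the `register` decorator, and the `resolve` function are hypothetical and do not reflect how the Sponger implements its resolvers. The DOI example assumes the public doi.org proxy; LSID and OAI handlers would be registered the same way against whatever resolution service is in use.

```python
from typing import Callable, Dict

# Hypothetical registry: lowercase scheme prefix -> resolver function.
RESOLVERS: Dict[str, Callable[[str], str]] = {}


def register(prefix: str):
    """Decorator that plugs a resolver into the registry under a prefix."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        RESOLVERS[prefix] = func
        return func
    return decorator


@register("doi:")
def resolve_doi(identifier: str) -> str:
    # Assumption: resolve through the public doi.org proxy.
    return "https://doi.org/" + identifier[len("doi:"):]


def resolve(identifier: str) -> str:
    """Return an HTTP(S) URL for the identifier, using a plugged-in resolver."""
    lowered = identifier.lower()
    if lowered.startswith(("http://", "https://")):
        return identifier  # already dereferenceable over HTTP
    for prefix, func in RESOLVERS.items():
        if lowered.startswith(prefix):
            return func(identifier)
    raise ValueError(f"no resolver registered for {identifier!r}")


print(resolve("doi:10.1000/182"))
```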
Cache expiration is managed through the MinExpiration parameter.
The cartridges VAD package includes all publicly-available Sponger cartridges and associated components. Check to ensure it is installed using the System Admin -> Packages tab of the Virtuoso Conductor. If the cartridges VAD is not listed, it can be downloaded. Install the cartridges_dav.vad package using the Conductor UI from the System Admin -> Packages tab or by using iSQL:
SQL> DB.DBA.VAD_INSTALL('tmp/cartridges_dav.vad',0);
SQL_STATE  SQL_MESSAGE
VARCHAR    VARCHAR
_______________________________________________________________________________

00000      No errors detected
00000      Installation of "Linked Data Cartridges" is complete.
00000      Now making a final checkpoint.
00000      Final checkpoint is made.
00000      SUCCESS

6 Rows. -- 1078 msec.
Grant SPARQL_SPONGE privileges to user SPARQL. (Note: more sophisticated security is provided via WebID-based ACL protection of your SPARQL endpoint.)
An Extractor Cartridge processes a Resource of a given format, extracting RDF according to rules appropriate to that format. External data does not come into play; only the content of the Resource fed to the Sponger.
These Cartridges handle open formats -- typically community-developed, openly-documented, and freely-licensed data structures.
These Cartridges handle closed formats -- typically proprietary; sometimes undocumented; possibly licensed to no one except the format originator. Data may sometimes not be parsed as desired or expected, because many of these Cartridges required reverse-engineering of the data format in question.
A Meta Cartridge submits a Resource to a third-party Web Service for processing. Returned RDF supplements the RDF generated by Extractor and other Meta Cartridges. Locally generated RDF may also be submitted to the third-party services, instead of or in addition to the original Resource itself.
Default Sponger behavior is for all installed Meta Cartridges to be brought to bear on all submitted Resources.
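The default behavior described above, where every installed Meta Cartridge runs against every Resource and its output supplements what has already been generated, can be sketched as follows. This is a conceptual model only: `apply_meta_cartridges` and the two stand-in services are hypothetical, and the real third-party Web Service calls are replaced by local functions so the example runs offline.

```python
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]
# A meta cartridge receives the resource URI plus the RDF generated so far.
MetaCartridge = Callable[[str, Set[Triple]], Set[Triple]]

RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def apply_meta_cartridges(
    uri: str, extracted: Set[Triple], meta_cartridges: List[MetaCartridge]
) -> Set[Triple]:
    """Run every meta cartridge and merge its triples into the graph."""
    graph = set(extracted)
    for meta in meta_cartridges:
        graph |= meta(uri, graph)  # returned RDF supplements existing RDF
    return graph


def label_service(uri: str, graph: Set[Triple]) -> Set[Triple]:
    # Stand-in for a service that is sent only the resource itself.
    return {(uri, RDFS + "label", "Example resource")}


def related_links_service(uri: str, graph: Set[Triple]) -> Set[Triple]:
    # Stand-in for a service that is sent the locally generated RDF.
    subjects = {s for s, _, _ in graph}
    return {(s, RDFS + "seeAlso", "http://example.org/related") for s in subjects}


extracted = {("http://example.org/doc", "http://purl.org/dc/terms/title", "Hello")}
graph = apply_meta_cartridges(
    "http://example.org/doc", extracted, [label_service, related_links_service]
)
for triple in sorted(graph):
    print(triple)
```

Using set union means a triple returned by more than one service is stored once, which matches the notion of supplementing rather than duplicating the extractor's output.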
Virtuoso's Sponger is a sophisticated piece of middleware that provides full Linked Data fidelity for pre-existing data objects or resources. This Linked Data is then accessible via HTTP-based Web Services, and SPARQL is enhanced with Sponger pragmas and some optional additions to the FROM clause. See full list of supported pragmas and usage examples.
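As one possible illustration of a Sponger pragma combined with the FROM clause, the sketch below builds (but does not send) a SPARQL protocol request whose query begins with the `define get:soft` pragma. The endpoint address and the helper name `build_sponge_query_url` are assumptions made for this example; consult the pragma list referenced above for the options your installation supports.

```python
from urllib.parse import urlencode


def build_sponge_query_url(endpoint: str, resource_uri: str, mode: str = "soft") -> str:
    """Return a GET URL for a SPARQL query that asks the Sponger to fetch resource_uri."""
    query = (
        f'define get:soft "{mode}"\n'  # Sponger pragma placed before the query
        f"SELECT * FROM <{resource_uri}> WHERE {{ ?s ?p ?o }} LIMIT 10"
    )
    return endpoint + "?" + urlencode({"query": query})


# Assumed local endpoint; adjust host and port to your own server.
url = build_sponge_query_url("http://localhost:8890/sparql", "http://example.org/page")
print(url)
```

The URL can then be fetched with any HTTP client. The account executing the query needs the SPARQL_SPONGE privilege for the fetch to take place.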