Virtuoso Sponger
- What Is The Sponger?
- Why is it Important?
- How Does It Work?
- Installation Steps
- Sponger Cartridges included in a Standard Virtuoso Installation
- Sponger Cartridge-based, Dynamic Linked Data Cloud
- Sponger pragmas
- Sponger Cartridge Configuration
- Sponger Usage Examples
- Other Related Pages
What Is The Sponger?
The Virtuoso Sponger is the Linked Data middleware component of Virtuoso. It generates Linked Data from a variety of data sources, and supports a wide variety of data representation and serialization formats. The Sponger is transparently integrated into Virtuoso's SPARQL Query Processor where it delivers URI de-referencing within SPARQL query patterns, across disparate data spaces. It also delivers configurable smart HTTP caching services. Optionally, it can be used by the Virtuoso Content Crawler to periodically populate and replenish data within the native RDF Quad Store.
The Sponger is also a full-fledged HTTP proxy service, directly accessible via SOAP or REST interfaces.
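As a rough illustration of the REST route, the sketch below builds a proxy URL for a target resource. The `/about/html/` and `/about/rdf/` path patterns, the `localhost:8890` host, and the helper name `sponger_proxy_url` are assumptions for illustration and are not stated on this page; check them against your own instance before relying on them.

```python
from urllib.parse import quote


def sponger_proxy_url(host: str, target: str, output: str = "html") -> str:
    """Build a URL asking a Sponger proxy endpoint to describe `target`.

    ASSUMPTION: the proxy is exposed under /about/html/ and /about/rdf/.
    """
    if output not in ("html", "rdf"):
        raise ValueError("output must be 'html' or 'rdf'")
    # Keep ':' and '/' readable; percent-encode any other unsafe character.
    return f"http://{host}/about/{output}/{quote(target, safe=':/')}"


# Hypothetical host and target, used only to show the resulting shape.
url = sponger_proxy_url("localhost:8890", "http://example.org/page")
print(url)
```

Requesting the resulting URL with any HTTP client would then ask the proxy for a description of the target, assuming the endpoint is enabled.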
As depicted below, OpenLink's broad portfolio of Linked-Data-aware products supports a number of routes for creating or consuming Linked Data. The Sponger provides a key platform for developers to generate quality data meshes from unstructured or semi-structured data sources.
Why is it Important?
Most of the world's data currently resides in non-Linked-Data form. The Sponger delivers middleware that accelerates the bootstrap of the Semantic Data Web by unobtrusively generating Linked Data (typically in RDF form, today) from non-Linked-Data data sources. This "Swiss army knife" for on-the-fly Linked Data generation provides a bridge between the traditional Document Web and the Linked Data Web ("Data Web").
Sponging non-Linked-Data Web sources and converting their data content to Linked Data exposes that data in a canonical form for querying and inference, and enables fast and easy construction of Linked-Data-driven "mesh-ups" (as opposed to code-driven Web 2.0 mash-ups).
Linked Data extraction and instance data generation products that offer functionality similar to that demonstrated by the Sponger are also commonly referred to as "RDFizers."
How Does It Work?
Designed with a pluggable architecture, the Sponger's core functionality is provided by Cartridges. Each cartridge includes Data Extractors which extract data from one or more data sources, and Ontology Mappers which map the extracted data to one or more ontologies/schemas, en route to producing RDF Linked Data.
Cartridges are highly customizable, and can be developed using any language supported by the Virtuoso Server Extensions API. This enables generation of structured linked data from virtually any resource type, rather than limiting users to resource types supported by the default Sponger Cartridge collection bundled as part of the Virtuoso Sponger VAD package (cartridges_dav.vad).
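The split between Data Extractors and Ontology Mappers can be pictured with a small conceptual sketch. This is not the Virtuoso cartridge API: the `Cartridge` class, the `extract_title` and `map_title` helpers, and the sample page are all hypothetical, and real cartridges are written against the Virtuoso Server Extensions API.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]
DC_TITLE = "http://purl.org/dc/elements/1.1/title"


@dataclass
class Cartridge:
    """Hypothetical cartridge: an extractor paired with an ontology mapper."""
    name: str
    extract: Callable[[str], Dict[str, str]]
    map_to_rdf: Callable[[str, Dict[str, str]], List[Triple]]

    def run(self, subject: str, content: str) -> List[Triple]:
        # Step 1: pull raw fields out of the source; step 2: map them to RDF.
        return self.map_to_rdf(subject, self.extract(content))


def extract_title(html: str) -> Dict[str, str]:
    """Data Extractor: find the <title> element of an HTML document."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return {"title": match.group(1).strip()} if match else {}


def map_title(subject: str, fields: Dict[str, str]) -> List[Triple]:
    """Ontology Mapper: express the extracted title as a Dublin Core triple."""
    return [(subject, DC_TITLE, value) for key, value in fields.items() if key == "title"]


html_title = Cartridge("html-title", extract_title, map_title)
triples = html_title.run("http://example.org/page", "<html><title> Demo </title></html>")
print(triples)
```

Keeping extraction and mapping as separate callables mirrors the pluggable design described above: either half can be swapped without touching the other.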
The Sponger also includes a pluggable name resolution mechanism that enables Custom Resolvers for naming schemes (e.g., URNs) associated with protocols beyond HTTP. Examples of custom resolvers include:
URN handler | Sample URI | Resource Description | Linked Data View | Linked Data Graph | Needs |
---|---|---|---|---|---|
DOI | doi:10.1038/35057062 | HTML Representation | Linked Data View | Data Explorer View | hslookup plugin; and enabling of relevant mappers for html, pdf, xml, etc. |
LSID | urn:lsid:ubio.org:namebank:12292 | HTML Representation | Linked Data View | Data Explorer View | None |
OAI | oai:dcmi.ischool.washington.edu:article/8 | HTML Representation | Linked Data View | Data Explorer View | None |
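To make the idea of scheme-based resolution concrete, here is a minimal sketch that picks a resolver name from a URI's naming scheme, using the three sample URIs from the table. The `RESOLVERS` registry and both function names are hypothetical; the Sponger's actual resolver plug-in mechanism is not described on this page.

```python
from typing import Dict, Optional

# Hypothetical registry: naming scheme -> resolver label from the table above.
RESOLVERS: Dict[str, str] = {
    "doi": "DOI",
    "urn:lsid": "LSID",
    "oai": "OAI",
}


def scheme_of(uri: str) -> str:
    """Return the naming scheme; for URNs, include the namespace identifier."""
    lowered = uri.lower()
    if lowered.startswith("urn:"):
        # "urn:lsid:ubio.org:..." -> "urn:lsid"
        return ":".join(lowered.split(":", 2)[:2])
    return lowered.split(":", 1)[0]


def pick_resolver(uri: str) -> Optional[str]:
    """Look up a custom resolver; None means no custom resolver is registered."""
    return RESOLVERS.get(scheme_of(uri))


for sample in (
    "doi:10.1038/35057062",
    "urn:lsid:ubio.org:namebank:12292",
    "oai:dcmi.ischool.washington.edu:article/8",
):
    print(sample, "->", pick_resolver(sample))
```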
Cache expiration is managed through the MinExpiration parameter in the virtuoso.ini file.
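The sketch below shows one way to check the configured value from a script. It assumes that MinExpiration lives in the `[SPARQL]` section of virtuoso.ini and is expressed in seconds; neither detail is stated on this page, and the value 86400 is only a sample, so verify both against your own configuration file.

```python
import configparser

# Sample fragment only. ASSUMPTIONS: section name [SPARQL], unit of seconds.
SAMPLE_INI = """
[SPARQL]
; Minimum lifetime of a sponged (cached) resource
MinExpiration = 86400
"""


def min_expiration(ini_text: str) -> int:
    """Parse INI text and return the MinExpiration setting as an integer."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Option names are matched case-insensitively by configparser.
    return parser.getint("SPARQL", "MinExpiration")


seconds = min_expiration(SAMPLE_INI)
print(seconds)
```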
Installation Steps
- A default Virtuoso installation includes the cartridges VAD package, which includes all publicly-available Sponger cartridges and associated components. Check to ensure it is installed using the System Admin -> Packages tab of the Virtuoso Conductor.
  - If listed as uninstalled, click the install button to the right of the package.
  - If the cartridges VAD is not listed, it can be downloaded. Install the cartridges_dav.vad package using the Conductor UI from the System Admin -> Packages tab or by using iSQL:

        SQL> DB.DBA.VAD_INSTALL('tmp/cartridges_dav.vad',0);
        SQL_STATE  SQL_MESSAGE
        VARCHAR    VARCHAR
        _______________________________________________________________________________
        00000      No errors detected
        00000      Installation of "Linked Data Cartridges" is complete.
        00000      Now making a final checkpoint.
        00000      Final checkpoint is made.
        00000      SUCCESS
        6 Rows. -- 1078 msec.

- To enable data insertion into the Quad Store via SPARQL queries, you need to assign SPARQL_SPONGE privileges to user SPARQL. (Note: more sophisticated security is provided via WebID based ACL protection of your SPARQL endpoint).
- Configuring Sponger Cartridges
Sponger Cartridges included in a Standard Virtuoso Installation
There are a few kinds of Cartridge, and a standard Virtuoso installation includes many of each kind. A breakdown of OpenLink-supported Data Sources is available separately.
Extractor Cartridges
An Extractor Cartridge processes a Resource of a given format, extracting RDF according to rules appropriate to that format. External data does not come into play; only the content of the Resource fed to the Sponger.
Supported Standard Non-RDF Data Formats
These Cartridges handle open formats -- typically community-developed, openly-documented, and freely-licensed data structures.
Supported Vendor-specific Non-RDF Data Formats
These Cartridges handle closed formats -- typically proprietary; sometimes undocumented; possibly licensed to no one except the format originator. Because many of these Cartridges required reverse-engineering of the data format in question, data may sometimes not be parsed as desired or expected.
Meta Cartridges
A Meta Cartridge submits a Resource to a third-party Web Service for processing. The returned RDF supplements the RDF generated by Extractor Cartridges and by other Meta Cartridges. Locally generated RDF may also be submitted to the third-party services, instead of or in addition to the original Resource itself.
Default Sponger behavior is for all installed Meta Cartridges to be brought to bear on all submitted Resources.
- Complete list of supported Meta Cartridges
- Meta Cartridge Usage via REST Request
- Parametrized Examples of Meta Cartridge Usage via REST Request
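The way Meta Cartridges supplement extractor output can be sketched as a simple merge loop. Everything here is hypothetical: the `sponge` function, the two stub "services", and the `example.org` vocabulary term stand in for real third-party Web Services and are not part of Virtuoso.

```python
from typing import Callable, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]
# A meta cartridge sees the resource URL plus the RDF gathered so far.
MetaCartridge = Callable[[str, Set[Triple]], Iterable[Triple]]


def sponge(resource_url: str, extracted: Iterable[Triple],
           metas: List[MetaCartridge]) -> Set[Triple]:
    """Apply every meta cartridge in turn, adding its triples to the graph."""
    graph = set(extracted)
    for meta in metas:
        graph |= set(meta(resource_url, graph))
    return graph


def language_tagger(url: str, graph: Set[Triple]) -> List[Triple]:
    # Stub for a service that works from the original resource.
    return [(url, "http://purl.org/dc/elements/1.1/language", "en")]


def triple_counter(url: str, graph: Set[Triple]) -> List[Triple]:
    # Stub for a service that is fed the locally generated RDF.
    return [(url, "http://example.org/vocab#tripleCount", str(len(graph)))]


page = "http://example.org/page"
extracted = [(page, "http://purl.org/dc/elements/1.1/title", "Demo")]
result = sponge(page, extracted, [language_tagger, triple_counter])
print(sorted(result))
```

Running all registered cartridges by default, as the loop does, matches the default behavior described above; a filter on `metas` would model restricting which ones apply.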
Sponger Cartridge-based, Dynamic Linked Data Cloud
Sponger pragmas
Virtuoso's Sponger is a sophisticated piece of middleware that provides full Linked Data fidelity for pre-existing data objects or resources. This Linked Data is then accessible via HTTP-based Web Services, and SPARQL is enhanced with Sponger pragmas and some optional additions to the FROM clause. See full list of supported pragmas and usage examples.
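As an illustration of a pragma combined with a FROM clause, the sketch below assembles a SPARQL query string and the matching endpoint URL. The `get:soft` pragma with the value `"soft"` is used here on the assumption that it appears in the supported pragma list; the `/sparql` endpoint on `localhost:8890` and both helper names are likewise assumptions.

```python
from urllib.parse import urlencode


def sponging_query(source_url: str, limit: int = 10) -> str:
    """Build a query that asks the Sponger to fetch `source_url` if needed."""
    return (
        'define get:soft "soft"\n'   # ASSUMPTION: pragma name and value
        "SELECT ?s ?p ?o\n"
        f"FROM <{source_url}>\n"
        "WHERE { ?s ?p ?o }\n"
        f"LIMIT {limit}"
    )


def endpoint_url(endpoint: str, query: str) -> str:
    """URL-encode the query as the `query` parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


query = sponging_query("http://example.org/doc")
print(query)
print(endpoint_url("http://localhost:8890/sparql", query))
```

Executing such a query requires the SPARQL_SPONGE privilege described under Installation Steps.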
Sponger Cartridge Configuration
Sponger Usage Examples
- SPARQL Processor Usage Example
- RDF Proxy Service Example
- Browsing & Exploring RDF View Example Using ODE
- Browsing & Exploring RDF View Example Using iSPARQL
- Basic Sponger Cartridge Example
- HTTP Example for Extracting Metadata using CURL
- RESTful Interaction Examples
- Flickr Cartridge Example
- MusicBrainz Metadatabase Example
- SPARQL Tutorial -- Magic of SPARUL and Sponger
Other Related Pages
- Technical White Paper
- Supported Virtuoso Sponger Cartridges
- SPARQL Sponger
- Interacting with Sponger Middleware via RESTful Patterns
- Interacting with Sponger Meta Cartridge via RESTful Patterns
- Sponger Cartridge RDF Extractor
- Extending SPARQL IRI Dereferencing with RDF Mappers
- Programmer Guide for Virtuoso Linked Data Middleware ("Sponger")
- Create RDF Custom Cartridge Tutorial
- OpenLink-supplied Virtuoso Sponger Cartridges
- Virtuoso Authentication Server
- Virtuoso SPARQL OAuth Tutorial
- Virtuoso Sponger Access Control List (ACL) Setup
- WebID Protocol & SPARQL Endpoint ACLs Tutorial
- Virtuoso Documentation