Enhancements the Virtuoso Sponger brings to SPARQL

Why?
What?
How?

Basics
Details

INPUT Pragmas
GET Pragmas
SQL Pragmas
OUTPUT Pragmas

Sponger Usage Examples

Why?

In the world of Linked Data, the Web is treated as a global data space where every data object has an identifier (URI) that serves as a key to its description, which is expressed as entity-attribute-value 3-tuples (triples). To make these "keys" work, data object URIs have to be dereferenceable; that is, they must resolve to actual object content, typically through URI subtypes that both locate and retrieve data objects, such as URLs.

What?

Virtuoso's Sponger is a sophisticated piece of middleware that provides full Linked Data fidelity for pre-existing data objects or resources. This Linked Data is then accessible via HTTP-based Web Services, and SPARQL is enhanced with Sponger pragmas (or directives) and some optional additions to the FROM clause.

How?

Basics

Sponger pragmas control four aspects of functionality:

Identifier Dereference: handled by INPUT pragmas.
Actual Data Retrieval: handled by GET pragmas.
SQL Code Generation: handled by SQL pragmas.
Output Format Adjustments: handled by OUTPUT pragmas.

Pragmas are qualified at usage time using the following pattern:

<pragma-type>:<actual-method> ["<method-modifier>"]
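
The pattern above can be sketched as a small helper that renders one pragma line and prepends it to a query. This is a minimal sketch, not part of Virtuoso: the function names `render_pragma` and `with_pragmas` are hypothetical, the leading `define` keyword is taken from the inline examples elsewhere on this page (such as `define input:storage ""`), and writing integer modifiers unquoted follows the `define sql:log-enable N` form. Check both against your Virtuoso version.

```python
def render_pragma(pragma_type, method, modifier=None):
    """Render one pragma line: define <pragma-type>:<method> ["<modifier>"]."""
    line = f"define {pragma_type}:{method}"
    if modifier is None:
        return line                      # e.g. input:freeze takes no modifier
    if isinstance(modifier, int):
        return f"{line} {modifier}"      # assumption: numeric modifiers may be bare
    if '"' in modifier:
        raise ValueError("modifier must not contain a double quote")
    return f'{line} "{modifier}"'


def with_pragmas(pragmas, query):
    """Prepend the rendered pragma lines to the SPARQL query text."""
    lines = [render_pragma(*p) for p in pragmas]
    return "\n".join(lines + [query])


example = with_pragmas(
    [("input", "same-as", "yes"), ("sql", "log-enable", 2)],
    "SELECT * WHERE { ?s ?p ?o } LIMIT 10",
)
print(example)
```

Each tuple maps directly onto the three parts of the pattern, so the pragma type, method, and modifier stay visible at the call site.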

Details

INPUT Pragmas

INPUT pragmas enable you to control the dereference behavior applied to a SPARQL query. The net effect is fine-grained control over how variables and explicit data object identifiers are dereferenced en route to creating the base data from which SPARQL query solutions are derived.
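
As an illustration of that control, the sketch below assembles a query whose `?friend` bindings are dereferenced as the query runs, following `rdfs:seeAlso` links with a bounded depth and resource count. It uses only pragmas listed in the table that follows. The seed URL, the FOAF query, and the helper name `crawl_query` are hypothetical, and writing IRI modifiers in angle brackets and numeric modifiers unquoted is an assumption to verify against your Virtuoso version.

```python
SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def crawl_query(seed_iri, depth=2, limit=50):
    """Build a query that sponges the seed and whatever ?friend binds to."""
    for name, value in (("depth", depth), ("limit", limit)):
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    preamble = [
        f"define input:grab-iri <{seed_iri}>",       # fetch the seed first
        'define input:grab-var "?friend"',           # dereference ?friend values
        f"define input:grab-seealso <{SEE_ALSO}>",   # predicate to follow
        f"define input:grab-depth {depth}",          # per the table, 0 = unlimited
        f"define input:grab-limit {limit}",          # per the table, 0 = unlimited
    ]
    body = (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "SELECT DISTINCT ?friend\n"
        f"WHERE {{ <{seed_iri}> foaf:knows ?friend }}"
    )
    return "\n".join(preamble + [body])


print(crawl_query("http://example.org/people/alice#me"))
```

Keeping explicit, non-zero values for the depth and limit avoids the unbounded crawl that `0` would request.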

Methods and method-modifiers associated with this pragma type include:

Method	Modifier(s)	Description	Usage Example
`input:default-graph-exclude`	`"<IRI>"`	Works like "`NOT FROM`" clause	Example
`input:default-graph-uri`	`"<IRI>"`	Works like "`FROM`" clause	Example
`input:freeze`		Blocks further changes in the list of source graphs. The web service endpoint (or similar non-web application) can edit an incoming query by placing a list of pragmas ending with `input:freeze` in front of the query text. If an intruder tries to place some graph names, they will get a compilation error, not access to the data. `input:freeze` disables all `input:grab-...` pragmas as well.	Example
`input:grab-all`	`"yes"`	Instructs the SPARQL processor to dereference everything related to the query. All variables and literal IRIs in the query become values for `input:grab-var` and `input:grab-iri`. The resulting performance may be very bad.	Example
`input:grab-base`	`"<IRI>"`	Specifies the base IRI to use when converting relative IRIs to absolute. (Default: empty string.)	Example
`input:grab-depth`	`"0"`	Sets the maximum 'degrees of separation' or links (predicates) between nodes in the target graph. Acceptable range is non-negative integers. `0` means unlimited.	Example
`input:grab-destination`	`"<IRI>"`	Overrides the default IRI dereferencing and Local Graph IRI designation. Retrieved content (triples) is stored in a graph IRI designated by the modifier value.	Example
`input:grab-follow-predicate`	`"<IRI>"`	Specifies a predicate IRI to be used when traversing a graph. (This pragma may be included multiple times). Synonym of `input:grab-seealso`.	Example
`input:grab-iri`	`"<IRI>"`	Specifies an IRI that should be retrieved before executing the rest of the query, if it is not in the quad store already. (This pragma can be included multiple times).	Example
`input:grab-limit`	`"<number>"`	Sets the maximum number of resources (triple subject or object IRIs) to be de-referenced. Acceptable range is non-negative integers. `0` means unlimited.	Example
`input:grab-loader`	`"<procedure-name>"`	Identifies the procedure used to retrieve, parse, and store content. (Default: `DB.DBA.RDF_SPONGE_UP`)	Example
`input:grab-resolver`	`"<procedure-name>"`	Identifies the procedure that handles IRI dereference and actual content retrieval via a specific data access protocol (e.g., HTTP). (Default: `DB.DBA.RDF_GRAB_RESOLVER_DEFAULT`.)	Example
`input:grab-seealso`	`"<IRI>"`	Synonym of `input:grab-follow-predicate`.	Example
`input:grab-var`	`"?<var-name>"`	Specifies the name of the SPARQL variable whose values should be used as IRIs of resources that should be downloaded.	Example
`input:grab-group-destination`	`"<IRI>"`	Resembles `input:grab-destination` but sponges will create individual graphs for Network Resource Fetch results, and in addition to this common routine, a copy of each Network Resource Fetch result will be added to the resource specified by the value of `input:grab-group-destination`. `input:grab-destination` redirects loadings; `input:grab-group-destination` duplicates them.	Example
`input:grab-intermediate`	`"<IRI>"`	Extends the set of IRIs to sponge; useful in combination with `input:grab-seealso`. If present, then, for a given subject, Network Resource Fetch will retrieve not only the values of see-also predicates for that subject, but also the subject itself. The value of this define is not used in the current implementation.	Example
`input:ifp`	`"<keyword>"`	Adds `IFP` keyword in `OPTION (QUIETCAST, ...)` clause in the generated SQL. The value of this define is not used yet; an empty string is safe for future extensions.	Example
`input:inference`	`"<IRI>"`	Specifies the name of an inference rule to provide context for backward-chained reasoner.	Example
`input:named-graph-exclude`	`"<IRI>"`	Works like "`NOT FROM NAMED`" clause	Example
`input:named-graph-uri`	`"<IRI>"`	Works like "`FROM NAMED`" clause	Example
`input:param`	`"<variable-name>"`	Declares a variable name to be used as a custom SPARQL protocol parameter. A SPARQL query leverages this custom parameter using the special "`?::{variable}`" syntax (excluding quotation marks). If query text is generated by a query builder that does not understand Virtuoso's SPARQL-BI extensions, then the generated query text may contain a conventional query variable as long as it uses the `define input:param "X"` pragma in its preamble. Note: This will not work for positional parameters; i.e., you cannot replace a SPARQL-BI reference like `?::3` with `?3` combined with a `define input:param "3"` pragma.	Example
`input:params`	`"<variable-name>"`	Synonym of `input:param`	Example
`input:same-as`	`"yes"`	Sets inference context for `owl:sameAs` (entity equivalence by name) reasoning and union expansion.	Example
`input:storage`	`"<IRI>"`	Sets dataset (quads) storage scope. The value is a storage identifier (IRI); the default value is `virtrdf:DefaultQuadStorage`. If the value is an empty string, then only quads associated with Linked Data Views are used. This is a good choice for low-level admin procedures, for two reasons: they will not interfere with any changes in `virtrdf:DefaultQuadStorage`; and they will continue to work even if all of the compiler's metadata is corrupted, including the description of `virtrdf:DefaultQuadStorage`. (`define input:storage ""` switches the SPARQL compiler to a small set of metadata that is built in 'C' code and thus is very hard for end-users to corrupt.)	Example
`input:target-fallback-graph-uri`	`"<IRI>"`	This pragma tells the compiler to use `<XXX>` as target for SPARQL 1.1 `INSERT` and `DELETE` operations if no other graph is specified in the query.	Example
`input:with-fallback-graph-uri`	`"<IRI>"`	This pragma tells the compiler to use `<XXX>` as target both for SPARQL 1.1 operations if no other graph is specified and for default graph IRI if no other source graphs are named in the query.	Example
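
The `input:freeze` row describes an endpoint placing its own pragmas in front of an incoming query. A minimal sketch of that wrapping step follows. The function name `lock_dataset` and the example graph IRI are hypothetical, the angle-bracket form for the IRI modifier is an assumption, and the compilation error mentioned in the table is raised by the server when it compiles the text, not by this code.

```python
def lock_dataset(incoming_query, allowed_graph):
    """Prefix a client's query so that its list of source graphs is fixed."""
    if "<" in allowed_graph or ">" in allowed_graph:
        raise ValueError("graph IRI must not contain angle brackets")
    preamble = [
        f"define input:default-graph-uri <{allowed_graph}>",
        "define input:freeze",   # must come last among the service's pragmas
    ]
    # The client text is appended unchanged; any graph names it tries to add
    # after the freeze are expected to be rejected at compile time.
    return "\n".join(preamble + [incoming_query])


client_query = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"
print(lock_dataset(client_query, "http://example.org/graphs/public"))
```

Because the freeze pragma is the last line the service writes, everything the client supplies is compiled after the list of source graphs has been closed.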

GET Pragmas

GET pragmas enable you to control the actual data-object content-retrieval behavior applied to a SPARQL query. The net effect is fine-grained control over data-access-oriented matters such as:

Data object content format, via content negotiation
Cache invalidation
Proxy handling

This pragma type is also usable as a comma-separated list of SPARQL ... FROM <options>. Its methods and method-modifiers include:

Method	Modifier(s)	Description	Usage Example
`get:accept`	`"application/xml"` `"application/rdf+xml"` `"application/rdf+turtle"` `"application/x-turtle"` `"application/turtle"` `"text/rdf+n3"` `"text/turtle"`	`get:accept` is most commonly used to access a web service that returns HTML by default but can also return RDF if forced to do so. The default value is `"application/rdf+xml; q=1.0, text/rdf+n3; q=0.9, application/rdf+turtle; q=0.5, application/x-turtle; q=0.6, application/turtle; q=0.5, text/turtle; q=1.0, application/xml; q=0.2, */*; q=0.1"`	Example
`get:cartridge`	`"extractor"` `"meta"`	Designates the use of Sponger "meta" or "extractor" cartridges in the query being executed.	Example
`get:method`	`"GET"` `"MGET"`	`"GET"` loads the resource itself. `"MGET"` loads metadata about the resource.	Example
`get:private`	`""` `<graph_group_IRI>`	When used for sponging graph `X`, it adjusts graph-level security of graph `X` (and of `graph_group_IRI`, if specified) so that `X` becomes a privately accessible graph of the user who sponges `X`. If `graph_group_IRI` is specified, `X` becomes accessible to users that can access `graph_group_IRI`, with the same permissions they have on `graph_group_IRI`. The exact rules are as follows. If the graph is `virtrdf:`, an error is signaled. If the graph name is an IRI of a handshaked web service endpoint or the "public IRI" of a handshaked web service endpoint, an error is signaled. If access is public by default, even for private graphs, an error is signaled and sponging is not tried. If the default is "no access" but someone (other than the current user) has specifically granted read access to the graph in question AND the current user is not `dba` AND the current user has no bit 32 permission on this graph, an error is signaled. If read access is public by default for world and disabled for private graphs, then the graph to be sponged is added to the group of private graphs. If the current user is not `dba`, the current user is granted `read+write+sponge+admin` access to the graph to be sponged. In addition, the current user gets special permission bit 32, indicating that the graph is made by private sponge of this specific user. If the value of `get:private` is an IRI, then: the IRI is supposed to be the IRI of a "plain" graph group, and an error is signaled in case of a non-existing graph group, the group of private graphs, or a group of graphs to be replicated; the graph is added to that group; and each non-`dba` user that can get list of files of the group will get permissions for the loaded graph equal to the permissions they have on the graph group minus the "list" permission.	Example for entirely confidential database Example using private graphs
`get:proxy`	`"<host[:port]>"`	Similar to setting up a Web browser to work with a proxy-style HTTP server, this identifies the CNAME (URL `host:port` or `authority` component) to target if direct retrieval from the URL in the `FROM` clause or handling of a data object's dereferenceable identifier is not possible.	Example
`get:refresh`	`"<seconds>"`	Limits the lifetime of a local cached copy of the source. The value is in seconds.	Example
`get:query`			Example
`get:soft`	`"soft"` `"replace"` `"add"`	"`soft`" applies cache-invalidation to the sponged resource en route to replacing content or doing nothing. "`replace`" replaces triples stored in named graphs. "`add`" simply adds triples to existing named graphs.	Example
`get:uri`	`"<IRI>"`	Identifies a specific URI to be dereferenced, distinct from the document URL in the `FROM` clause of a SPARQL query. Typically, this would be used to dereference a specific subject or object of a relation in the data retrieved by the document URL in the `FROM` clause.	Example
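
The cache-related rows above can be combined in one preamble. The sketch below builds a query that asks for the `FROM` document to be fetched or refreshed before the pattern is evaluated. The function name `sponge_query` and the example URL are hypothetical, and only the `define` form is shown because the `FROM` options form is elided on this page.

```python
ALLOWED_SOFT_MODES = ("soft", "replace", "add")


def sponge_query(source_url, mode="soft", refresh_seconds=3600):
    """Build a query that asks for source_url to be fetched or refreshed first."""
    if mode not in ALLOWED_SOFT_MODES:
        raise ValueError(f"get:soft must be one of {ALLOWED_SOFT_MODES}")
    if not isinstance(refresh_seconds, int) or refresh_seconds < 0:
        raise ValueError("refresh_seconds must be a non-negative integer")
    preamble = [
        f'define get:soft "{mode}"',                # cache handling, per the table
        f'define get:refresh "{refresh_seconds}"',  # cached copy lifetime, seconds
        'define get:method "GET"',                  # load the resource itself
    ]
    body = f"SELECT * FROM <{source_url}> WHERE {{ ?s ?p ?o }} LIMIT 10"
    return "\n".join(preamble + [body])


print(sponge_query("http://example.org/data/catalog.ttl", mode="replace"))
```

Validating the `get:soft` value locally catches a misspelled mode before the query text ever reaches the server.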

SQL Pragmas

Pragmas to control code generation:

Method	Modifier(s)	Description	Usage Example
`sql:assert-user`	`"<username>"`	Defines the user who is supposed to be the single "proper" user for the query. If the compiler is launched by any other user, an error is signaled. The typical use is `define sql:assert-user "dba"`. This is too weak to be a security measure, but may help in debugging security issues.	Example
`sql:big-data-const`			Example
`sql:describe-mode`	`""` `"SPO"` `"CBD"` `"OBJCBD"` `"<custom>"`	See detailed description here.	Example
`sql:globals-mode`	`"XSLT"` `"SQL"`	Tells how to print names of global variables. Supported values are `"XSLT"` (print a colon before the name of each global variable) and `"SQL"` (print as usual).	Example
`sql:gs-app-callback`		Application-specific callback, returns permission bits of a given graph.	Example
`sql:gs-app-uid`		Application-specific user-id to use in callback.	Example
`sql:log-enable`		Value that will be passed to SPARUL procedures, where it will be passed to log_enable() BIF. `define sql:log-enable N` will result in `log_enable(N, 1)` at the beginning of the operation; another log_enable() call will restore previous mode of transaction log at exit from the procedure including any error signaled from it. For example, set to `2` to disable logging to avoid a huge transaction after-image when sponging is deep and wide.	Example
`sql:param`	`"<variable-name>"`	Synonym of `input:param`	Example
`sql:params`	`"<variable-name>"`	Synonym of `input:param`	Example
`sql:select-option`		Value will be added as a global `OPTION()` clause of the generated SQL `SELECT`. This clause is always printed; it is always at least `OPTION (QUIETCAST, ...)`. The most popular use case is `define sql:select-option "ORDER"` to tell the SQL compiler to execute `JOINs` in the order of their use in the query; this can make query compilation much faster, but the compilation result can be terrible if you do not know precisely what you're doing and do not inspect the execution plan of the generated SQL query.	Example
`sql:signal-void-variables`		When set to `1`, this forces the SPARQL compiler to signal an error if it can prove that some variable can never be bound, for instance because of a misspelled name or an attempted join across disjoint domains. This usually indicates an error in the query, such as a typo in an IRI or a totally wrong triple pattern. These diagnostics are especially important when the query is long, and this is the most useful debugging pragma when Linked Data Views are in use.	Example
`sql:table-option`		Value will be added as an option to each triple in the query, and later it will be printed in `TABLE OPTION (...)` clause of source table clause. This works only for SQL code for plain triples from `RDF_QUAD`; fragments of queries related to RDF Views will remain unchanged.	Example
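
As a sketch of how two of these rows might be combined, the helper below prefixes an update with `sql:assert-user` and `sql:log-enable`, using the value `2` that the table suggests for avoiding a huge transaction after-image. The function name `logged_update`, the example graph, and the example triple are hypothetical; the update body is ordinary SPARQL 1.1 `INSERT DATA`.

```python
def logged_update(update_text, expected_user="dba", log_mode=2):
    """Prefix an update with the expected user and a transaction log mode."""
    if '"' in expected_user:
        raise ValueError("user name must not contain a double quote")
    if not isinstance(log_mode, int) or log_mode < 0:
        raise ValueError("log_mode must be a non-negative integer")
    preamble = [
        f'define sql:assert-user "{expected_user}"',  # debugging aid, not security
        f"define sql:log-enable {log_mode}",          # handed on to log_enable()
    ]
    return "\n".join(preamble + [update_text])


update = (
    "INSERT DATA { GRAPH <http://example.org/graphs/demo> "
    "{ <http://example.org/s> <http://example.org/p> 42 } }"
)
print(logged_update(update))
```

The numeric modifier is written bare, matching the `define sql:log-enable N` form used in the table.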

OUTPUT Pragmas

Pragmas to control the type of the result.

Method	Modifier(s)	Description	Usage Example
`output:dict-format`	`"<format-specifier>"`	Tells the compiler that the query should produce a string output with the serialization of the result, not a result set. Only `CONSTRUCT` and `DESCRIBE` queries are affected by the value of `output:dict-format`. Use `output:scalar-format` and/or `output:format` for `ASK` queries.	Example
`output:format`	`"<format-specifier>"`	Tells the compiler that the query should produce a string output with the serialization of the result, not a result set. The value of `output:format` is primarily used for `SELECT` and data manipulation queries. It will also be used for `CONSTRUCT`, `DESCRIBE`, and `ASK` queries, if `output:dict-format` or `output:scalar-format` are not used.	Example
`output:scalar-format`	`"<format-specifier>"`	Tells the compiler that the query should produce a string output with the serialization of the result, not a result set. Only `ASK` queries are affected by the value of `output:scalar-format`. Use `output:dict-format` and/or `output:format` for `CONSTRUCT` or `DESCRIBE` queries.	Example
`output:valmode`	`"SQLVAL"` `"LONG"` `"AUTO"`	Tells the compiler which SQL datatypes should be used for output values. `"SQLVAL"`, the default, is appropriate for ODBC clients and the like, which know nothing about RDF and expect plain SQL values. `"LONG"` tells the compiler to preserve RDF boxes as is and to return IRI IDs instead of IRI string values. This is good when a Virtuoso/PL procedure is RDF-aware and keeps results to be passed on to other SPARQL queries or some low-level RDF routines. `"AUTO"` is for dirty hackers who want no conversion of any sort applied at the output: they read the SQL output of the SPARQL front-end, find the format of each column, and add the needed conversions later.	Example
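
The precedence among the three format pragmas, as described in the rows above, can be sketched as a small lookup. This only restates the table's rules and is not Virtuoso code. The function name `effective_format` is hypothetical, and the specifiers `FMT-A`, `FMT-B`, and `FMT-C` are placeholders because this page does not enumerate the valid format specifiers.

```python
def effective_format(query_form, pragmas):
    """Return the format specifier that applies, or None for a plain result set."""
    form = query_form.upper()
    if form in ("CONSTRUCT", "DESCRIBE"):
        specific = pragmas.get("output:dict-format")
    elif form == "ASK":
        specific = pragmas.get("output:scalar-format")
    else:
        specific = None   # SELECT and data manipulation use output:format only
    # The form-specific pragma wins; output:format is the fallback for all forms.
    return specific if specific is not None else pragmas.get("output:format")


settings = {
    "output:format": "FMT-A",
    "output:dict-format": "FMT-B",
    "output:scalar-format": "FMT-C",
}
for form in ("SELECT", "CONSTRUCT", "DESCRIBE", "ASK"):
    print(form, effective_format(form, settings))
```

With only `output:format` set, every query form falls back to it, which matches the description in the `output:format` row.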
