This HTML5 document contains 55 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

PrefixNamespace IRI
dctermshttp://purl.org/dc/terms/
atomhttp://atomowl.org/ontologies/atomrdf#
foafhttp://xmlns.com/foaf/0.1/
n15http://vos.openlinksw.com/dataspace/services/wiki/
oplhttp://www.openlinksw.com/schema/attribution#
n2http://vos.openlinksw.com/dataspace/owiki/wiki/VOS/
n20http://pingthesemanticweb.com/
dchttp://purl.org/dc/elements/1.1/
n10http://vos.openlinksw.com/dataspace/dav#
rdfshttp://www.w3.org/2000/01/rdf-schema#
n16http://rdfs.org/sioc/services#
n12http://vos.openlinksw.com/dataspace/person/dav#
siocthttp://rdfs.org/sioc/types#
n18http://vos.openlinksw.com/dataspace/owiki/wiki/VOS/VirtProgrammerGuideRDFCartridge/sioc.
n5http://vos.openlinksw.com/dataspace/owiki/wiki/
n19http://demo.openlinksw.com/sparql?default-graph-uri=http%3A%2F%2Fdemo.openlinksw.com%2Ftutorial%2Fhosting%2Fho_s_30%2FWebCalendar%2Ftools%2Fsummary.txt&should-sponge=&query=select+*+%0D%0Awhere+%7B+%3Fs+%3Fp+%3Fo+%7D&format=text%2Fhtml&debug=
rdfhttp://www.w3.org/1999/02/22-rdf-syntax-ns#
n11http://vos.openlinksw.com/dataspace/owiki#
xsdhhttp://www.w3.org/2001/XMLSchema#
n14http://vos.openlinksw.com/dataspace/%28NULL%29/wiki/VOS/
n9http://vos.openlinksw.com/dataspace/person/owiki#
siochttp://rdfs.org/sioc/ns#
Subject Item
n12:this
foaf:made
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n10:this
sioc:creator_of
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n15:item
n16:services_of
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n11:this
sioc:creator_of
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n5:VOS
sioc:container_of
n2:VirtProgrammerGuideRDFCartridge
atom:entry
n2:VirtProgrammerGuideRDFCartridge
atom:contains
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx40
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx25
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx46
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSponger
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx11
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx3
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx7
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtProgrammerGuideRDFCartridge
rdf:type
sioct:Comment atom:Entry
dcterms:created
2017-06-13T05:46:52.125445
dcterms:modified
2017-06-29T07:39:33.144465
rdfs:label
VirtProgrammerGuideRDFCartridge
foaf:maker
n9:this n12:this
dc:title
VirtProgrammerGuideRDFCartridge
opl:isDescribedUsing
n18:rdf
sioc:has_creator
n11:this n10:this
sioc:content
%VOSWARNING% %VOSNAV% ---+ Sponger Cartridges (RDF Mappers or RDFizers) for Virtuoso This article explains how to develop and test a custom Sponger Cartridge. ---++ Concept Sponger Cartridges provide a modular approach to RDF-oriented entity extraction and ontology mapping as part of a Linked Data production pipeline. Typical sources include (X)HTML pages, images, Office documents, and PDFs, among others. Cartridges expose their functionality to service consumers via the Virtuoso Sponger ("Sponger"), a middleware layer that extracts and delivers RDF to other Virtuoso components such as the Web Crawler and the SPARQL Query Processor. In addition, it is directly exposed as a REST-style Web Service for exploitation by external applications and services. A Cartridge is comprised of an initialization PL procedure (hook) and an entity extractor and mapper. The entity extractor and mapping component can be developed using PL, <code>C</code>, or any external language supported by Virtuoso via the Virtuoso Server Extensions APIs. Once the cartridge has been developed, it is plugged into the Virtuoso Sponger by adding a record to the <code>DB.DBA.<nowiki>SYS_RDF_MAPPERS</nowiki></code> table. ---++ How Cartridges Work ---+++ SPARQL Query Processing When a SPARQL query is dispatched to Virtuoso, it invokes the Sponger during the act of URI dereferencing; i.e., it actually crawls the Web, locates a resource via it's URI/IRI (using a variety of RDF discovery heuristics), and then processes the data that the URI exposes. If RDF is discovered, the cartridges play no role. On the other hand, if RDF isn't discovered the Sponger will look in the <code>DB.DBA.<nowiki>SYS_RDF_MAPPERS</nowiki></code> table (in <code>RM_ID</code> order), and for every matching URI or MIME type pattern (depending on <code>RM_TYPE</code> column value), it will invoke the associated cartridge via the hook procedure. If the hook returns zero, the next cartridge will be tried. If the result is negative, the process stops, informing the SPARQL engine that nothing was retrieved. If the result is positive, the process stops, this time informing the SPARQL engine that RDF data was successfully retrieved. ---+++ Sponger Cartridge RDF Extractor PL hook Requirements Every PL function used to plug a cartridge into the SPARQL engine must have the following parameter signature: | <b><code>IN graph_iri VARCHAR</code></b> | the IRI of the local storage graph | | <b><code>IN <nowiki>new_origin_uri</nowiki> VARCHAR</code></b> | the URI of the target information resource | | <b><code>IN destination VARCHAR</code></b> | the IRI of the target graph | | <b><code>INOUT content ANY</code></b> | the content of the information resource retrieved for dispatch to the Sponger | | <b><code>INOUT async_queue ANY</code></b> | an asynchronous queue; can be used for background processing (if required) | | <b><code>INOUT ping_service ANY</code></b> | the value in the <code>[SPARQL]</code> section of a Virtuoso instance (i.e., <code><nowiki>PingService</nowiki></code> INI parameter) which enables RDF triple propagation and notification to pinger services such as <code>[[http://pingthesemanticweb.com/][http://pingthesemanticweb.com/]]<code> | | <b><code>INOUT api_key ANY</code></b> | a plain text ID, either a single key value or serialized vector of keys (basically the value of <code>RM_KEY</code> column of the <code>DB.DBA.<nowiki>SYS_RDF_MAPPERS</nowiki></code> table), which is used for handling of API keys for 3rd party Web Services | Note: the names of the parameters are not important, but their order and presence are vital. ---++ Implementation In the example script we implement a basic cartridge which maps a <code>text/plain</code> MIME type to an imaginary ontology, which extends the class <code>Document</code> from <code>FOAF</code> with properties '<code>txt:UniqueWords</code>' and '<code>txt:Chars</code>', where we specity the prefix '<code>txt:</code>' as '<code>urn:txt:v0.0:</code>'. <verbatim> USE DB; CREATE PROCEDURE DB.DBA.RDF_LOAD_TXT_META ( IN graph_iri VARCHAR, IN new_origin_uri VARCHAR, IN dest VARCHAR, INOUT ret_body ANY, INOUT aq ANY, INOUT ps ANY, INOUT ser_key ANY ) { DECLARE words, chars INT; DECLARE vtb, arr, subj, ses, str ANY; DECLARE ses ANY; -- if any error, we just say nothing can be done DECLARE EXIT HANDLER FOR SQLSTATE '*' { return 0; }; subj := COALESCE (dest, new_origin_uri); vtb := vt_batch (); chars := LENGTH (ret_body); -- using the text index procedures, we get a list of words vt_batch_feed (vtb, ret_body, 1); arr := vt_batch_strings_array (vtb); -- the list has 'word' and positions array, so we must divide by 2 words := LENGTH (arr) / 2; ses := string_output (); -- we compose a N3 literal http (sprintf ('<%s> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Document> .\n', subj), ses); http (sprintf ('<%s> <urn:txt:v0.0:UniqueWords> "%d" .\n', subj, words), ses); http (sprintf ('<%s> <urn:txt:v0.0:Chars> "%d" .\n', subj, chars), ses); str := string_output_string (ses); -- we push the N3 text into the local store DB.DBA.TTLP (str, new_origin_uri, subj); RETURN 1; } ; DELETE FROM DB.DBA.SYS_RDF_MAPPERS WHERE RM_HOOK = 'DB.DBA.RDF_LOAD_TXT_META'; INSERT SOFT DB.DBA.SYS_RDF_MAPPERS ( RM_PATTERN, RM_TYPE, RM_HOOK, RM_KEY, RM_DESCRIPTION ) VALUES ( '(text/plain)', 'MIME', 'DB.DBA.RDF_LOAD_TXT_META', NULL, 'Text Files (demo)' ) ; -- here we set order to some large number so don't break existing mappers UPDATE DB.DBA.SYS_RDF_MAPPERS SET RM_ID = 2000 WHERE RM_HOOK = 'DB.DBA.RDF_LOAD_TXT_META'; </verbatim> To test the cartridge, we just use the <code>/sparql</code> endpoint with option 'Retrieve remote RDF data for all missing source graphs' to execute: <verbatim> SELECT * FROM <URL-of-a-txt-file> WHERE { ?s ?p ?o } </verbatim> Note that the <code>SPARQL_SPONGE</code> role must be granted to any user account (by default, the "<code>SPARQL</code>" user) which will be issuing SPARQL queries that invoke Sponger cartridges, to enable persistence to local storage. If the above is set correctly, then you can just hit [[http://demo.openlinksw.com/sparql?default-graph-uri=http%3A%2F%2Fdemo.openlinksw.com%2Ftutorial%2Fhosting%2Fho_s_30%2FWebCalendar%2Ftools%2Fsummary.txt&should-sponge=&query=select+*+%0D%0Awhere+%7B+%3Fs+%3Fp+%3Fo+%7D&format=text%2Fhtml&debug=on][this link]]. More complex examples can be found in the cartridges package implementation. ---++ Authentication in the Sponger To enable user-defined authentication, there are additional parameters for the <code><strong>/proxy</strong></code> and <code><strong>/sparql</strong></code> endpoints, which should be sent by RDF browser or iSPARQL users -- * for <code><strong>/proxy</strong></code> endpoint: <verbatim> 'login=&lt;account name&gt;' </verbatim> * for <code><strong>/sparql</strong></code> endpoint: <verbatim> get-login=&lt;account name&gt; </verbatim> ---++ References * [[VirtSpongerCartridgeRDFExtractor][RDF Cartridge Extractor Component]] * [[RDFMappers][SPARQL Processor and RDFizer Cartridges]] CategoryRDF CategoryVirtuoso %VOSCOPY%
sioc:id
2e2ed1d3156e9df3f64cfd6394535c8f
sioc:link
n2:VirtProgrammerGuideRDFCartridge
sioc:has_container
n5:VOS
n16:has_services
n15:item
atom:title
VirtProgrammerGuideRDFCartridge
sioc:links_to
n2:VirtSpongerCartridgeRDFExtractor n2:CategoryRDF n2:RDFMappers n14:UniqueWords n2:CategoryVirtuoso n19:on n20:
atom:source
n5:VOS
atom:author
n12:this
atom:published
2017-06-13T05:46:52Z
atom:updated
2017-06-29T07:39:33Z
sioc:topic
n5:VOS
Subject Item
n2:VirtSpongerRESTPatterns
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx22
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx4
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx24
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx2
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx39
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx8
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx1
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx41
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx6
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx27
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge
Subject Item
n2:VirtSpongerLinkedDataHooksIntoSPARQLEx45
sioc:links_to
n2:VirtProgrammerGuideRDFCartridge