<docbook><section><title>VirtSpongerCartridgeRDFExtractor</title><para> </para>
<bridgehead class="http://www.w3.org/1999/xhtml:h2">Sponger Cartridge RDF Extractor</bridgehead>
<para>Used to extract RDF from a Web Data Source.
 It consumes services from Virtuoso PL, C/C++, Java-based, and other RDF Extractors.</para>
<para>RDF mappers provide a way to extract metadata from non-RDF documents such as HTML pages, images, Office documents, etc., and pass this to the SPARQL sponger (crawler which retrieves missing source graphs).
 For brevity further in this article, we will refer to the &quot;RDF mapper&quot; simply as the &quot;mapper&quot;.</para>
<bridgehead class="http://www.w3.org/1999/xhtml:h2">RDF Mappers Concept</bridgehead>
<para>RDF mappers consist of PL procedure (hook) and extractor, where extractor itself can be built using PL, C or any external language supported by Virtuoso server.
 See the <ulink url="VirtProgrammerGuideRDFCartridge">Sponger Cartridge RDF Extractor PL Requirements</ulink> for more information.</para>
<para>Once the mapper is developed, it must be plugged into the SPARQL engine by adding a record to the table DB.DBA.SYS_RDF_MAPPERS.</para>
<para> If a SPARQL query instructs the SPARQL processor to retrieve a target graph into local storage, then the SPARQL sponger will be invoked.
 If the target graph IRI represents a dereferenceable URL, then content will be retrieved using content negotiation.
The next step is to detect the content type:</para>
<itemizedlist mark="bullet" spacing="compact"><listitem>If RDF and no further transformation (such as GRDDL) is needed, then the process would stop.
</listitem>
<listitem>If &#39;text/plain&#39; and not known to have metadata, then the SPARQL sponger will look in the DB.DBA.SYS_RDF_MAPPERS table by order of RM_ID and for every matching URL or MIME type pattern (depends on column  RM_TYPE) will call the mapper hook.
<itemizedlist mark="bullet" spacing="compact"><listitem>If hook returns zero, the next mapper will be tried; </listitem>
<listitem>If result is negative, the process would stop instructing the SPARQL nothing was retrieved; </listitem>
<listitem>If result is positive, the process would stop instructing the SPARQL that metadata was retrieved.</listitem>
</itemizedlist></listitem>
</itemizedlist><bridgehead class="http://www.w3.org/1999/xhtml:h2">References</bridgehead>
<itemizedlist mark="bullet" spacing="compact"><listitem><ulink url="RDFMappers">RDF Mappers</ulink> </listitem>
<listitem><ulink url="SPARQLSponger">Virtuoso SPARQLSponger</ulink> </listitem>
<listitem><ulink url="VirtProgrammerGuideRDFCartridge">RDF Cartridge Programmer Guide</ulink> </listitem>
<listitem><ulink url="VirtSpongerCartridgeSupportedDataSources">OpenLink-supplied Virtuoso Sponger Cartridges</ulink></listitem>
</itemizedlist><para> <ulink url="CategoryVirtuoso">CategoryVirtuoso</ulink> <ulink url="CategorySpec">CategorySpec</ulink> <ulink url="CategoryDocumentation">CategoryDocumentation</ulink> <ulink url="CategoryODS">CategoryODS</ulink> <ulink url="CategoryRDF">CategoryRDF</ulink></para>
<para> </para>
</section></docbook>