This HTML5 document contains 55 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Prefix    Namespace IRI
n14       http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp3.
n4        http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp13.
n33       http://rdfs.org/sioc/services#
n10       http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp12.
dc        http://purl.org/dc/elements/1.1/
n11       http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp4.
n15       http://vos.openlinksw.com/dataspace/owiki#
n19       http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp6.
n2        http://vos.openlinksw.com/dataspace/owiki/wiki/VOS/
dcterms   http://purl.org/dc/terms/
n34       http://vos.openlinksw.com/dataspace/services/wiki/
rdfs      http://www.w3.org/2000/01/rdf-schema#
rdf       http://www.w3.org/1999/02/22-rdf-syntax-ns#
n12       http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp5.
atom      http://atomowl.org/ontologies/atomrdf#
n18       http://cname/
n24       http://vos.openlinksw.com/dataspace/dav#
xsdh      http://www.w3.org/2001/XMLSchema#
n20       http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp7.
n8        http://vos.openlinksw.com:80/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp10.
sioc      http://rdfs.org/sioc/ns#
n9        http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp11.
n37       http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/cr3.
n23       http://vos.openlinksw.com/dataspace/person/owiki#
n16       http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp8.
opl       http://www.openlinksw.com/schema/attribution#
n25       http://vos.openlinksw.com/dataspace/person/dav#
n7        http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp14.
n27       http://vos.openlinksw.com:80/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/.scp10.png.
n17       http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp9.
foaf      http://xmlns.com/foaf/0.1/
n28       http://docs.openlinksw.com/virtuoso/rdfinsertmethods.html#
n13       http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp2.
sioct     http://rdfs.org/sioc/types#
n30       http://vos.openlinksw.com/dataspace/owiki/wiki/VOS/VirtCrawlerSPARQLEndpoints/sioc.
n5        http://vos.openlinksw.com/dataspace/owiki/wiki/
n36       http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp1.
n38       http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/cra1.
n26       http://localhost:8890/
Subject Item
n25:this
foaf:made
n2:VirtCrawlerSPARQLEndpoints
Subject Item
n24:this
sioc:creator_of
n2:VirtCrawlerSPARQLEndpoints
Subject Item
n34:item
n33:services_of
n2:VirtCrawlerSPARQLEndpoints
Subject Item
n15:this
sioc:creator_of
n2:VirtCrawlerSPARQLEndpoints
Subject Item
n5:VOS
sioc:container_of
n2:VirtCrawlerSPARQLEndpoints
atom:entry
n2:VirtCrawlerSPARQLEndpoints
atom:contains
n2:VirtCrawlerSPARQLEndpoints
Subject Item
n2:VirtSetCrawlerJobsGuideSemanticSitemapsFuncExample
sioc:links_to
n2:VirtCrawlerSPARQLEndpoints
Subject Item
n2:VirtSetCrawlerJobsGuideSemanticSitemaps
sioc:links_to
n2:VirtCrawlerSPARQLEndpoints
Subject Item
n2:VirtCrawlerSPARQLEndpoints
rdf:type
atom:Entry sioct:Comment
dcterms:created
2017-06-13T05:49:29.939406
dcterms:modified
2017-06-29T07:36:32.084181
rdfs:label
VirtCrawlerSPARQLEndpoints
foaf:maker
n23:this n25:this
dc:title
VirtCrawlerSPARQLEndpoints
opl:isDescribedUsing
n30:rdf
sioc:has_creator
n15:this n24:this
sioc:attachment
n4:png n7:png n8:png n9:png n10:png n11:png n12:png n13:png n14:png n16:png n17:png n19:png n20:png n27:yk9ThZ n36:png n37:png n38:png
sioc:content
%META:TOPICPARENT{name="VirtSetCrawlerJobsGuide"}%
---+Setting up a Content Crawler Job to Retrieve Content from SPARQL endpoint

The following step-by-step guide walks you through the process of:
   * Populating a Virtuoso Quad Store with data from a 3rd party SPARQL endpoint
   * Generating RDF dumps that are accessible to basic HTTP or WebDAV user agents.

   1 Sample SPARQL query producing a list of SPARQL endpoints:
<verbatim>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX scovo: <http://purl.org/NET/scovo#>
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX akt: <http://www.aktors.org/ontology/portal#>

SELECT DISTINCT ?endpoint
WHERE
  {
    ?ds a void:Dataset .
    ?ds void:sparqlEndpoint ?endpoint
  }
</verbatim>
   1 Here is a sample SPARQL protocol URL constructed from one of the SPARQL endpoints in the result of the query above:
<verbatim>
http://void.rkbexplorer.com/sparql/?query=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E+%0D%0APREFIX+void%3A+++++%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E++%0D%0ASELECT+distinct+%3Furl++WHERE+%7B+%3Fds+a+void%3ADataset+%3B+foaf%3Ahomepage+%3Furl+%7D%0D%0A&format=sparql
</verbatim>
   1 Here is the cURL output showing a Virtuoso SPARQL URL that executes against a 3rd party SPARQL Endpoint URL:
<verbatim>
$ curl "http://void.rkbexplorer.com/sparql/?query=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E+%0D%0APREFIX+void%3A+++++%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E++%0D%0ASELECT+distinct+%3Furl++WHERE+%7B+%3Fds+a+void%3ADataset+%3B+foaf%3Ahomepage+%3Furl+%7D%0D%0A&format=sparql"
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="url"/>
  </head>
  <results ordered="false" distinct="true">
    <result>
      <binding name="url"><uri>http://kisti.rkbexplorer.com/</uri></binding>
    </result>
    <result>
      <binding name="url"><uri>http://epsrc.rkbexplorer.com/</uri></binding>
    </result>
    <result>
      <binding name="url"><uri>http://test2.rkbexplorer.com/</uri></binding>
    </result>
    <result>
      <binding name="url"><uri>http://test.rkbexplorer.com/</uri></binding>
    </result>
    ... ... ...
  </results>
</sparql>
</verbatim>
   1 Go to the Conductor UI, for example http://localhost:8890/conductor : %BR%%BR%<a href="%ATTACHURLPATH%/scp1.png" target="_blank"><img src="%ATTACHURLPATH%/scp1.png" width="600px" /></a>%BR%%BR%
   1 Enter the dba credentials.
   1 Go to "Web Application Server" -> "Content Management" -> "Content Imports": %BR%%BR%<a href="%ATTACHURLPATH%/scp2.png" target="_blank"><img src="%ATTACHURLPATH%/scp2.png" width="600px" /></a>%BR%%BR%
   1 Click "New Target": %BR%%BR%<a href="%ATTACHURLPATH%/scp3.png" target="_blank"><img src="%ATTACHURLPATH%/scp3.png" width="600px" /></a>%BR%%BR%
   1 In the presented form, enter for example:
      * "Crawl Job Name": voiD store
      * "Data Source Address (URL)": the URL from above, i.e.:
<verbatim>
http://void.rkbexplorer.com/sparql/?query=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E+%0D%0APREFIX+void%3A+++++%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E++%0D%0ASELECT+distinct+%3Furl++WHERE+%7B+%3Fds+a+void%3ADataset+%3B+foaf%3Ahomepage+%3Furl+%7D%0D%0A&format=sparql
</verbatim>
      * "Local WebDAV Identifier":
<verbatim>
/DAV/void.rkbexplorer.com/content
</verbatim>
      * "Follow links matching (delimited with ;)":
<verbatim>
%
</verbatim>
      * Uncheck "Use robots.txt";
      * "XPath expression for links extraction":
<verbatim>
//binding[@name="url"]/uri/text()
</verbatim>
      * Check "Semantic Web Crawling";
      * "If Graph IRI is unassigned use this Data Source URL:": enter for example:
<verbatim>
http://void.collection
</verbatim>
      * Check "Follow URLs outside of the target host";
      * Check "Run Sponger" and "Accept RDF" %BR%%BR%<a href="%ATTACHURLPATH%/scp4.png" target="_blank"><img src="%ATTACHURLPATH%/scp4.png" width="600px" /></a> %BR%<a href="%ATTACHURLPATH%/scp5.png" target="_blank"><img src="%ATTACHURLPATH%/scp5.png" width="600px" /></a>%BR%%BR%
   1 Click "Create".
   1 The target should be created and presented in the list of available targets: %BR%%BR%<a href="%ATTACHURLPATH%/scp7.png" target="_blank"><img src="%ATTACHURLPATH%/scp7.png" width="600px" /></a>%BR%%BR%
   1 Click "Import Queues": %BR%%BR%<a href="%ATTACHURLPATH%/scp8.png" target="_blank"><img src="%ATTACHURLPATH%/scp8.png" width="600px" /></a>%BR%%BR%
   1 Click "Run" for the imported target: %BR%%BR%<a href="%ATTACHURLPATH%/scp9.png" target="_blank"><img src="%ATTACHURLPATH%/scp9.png" width="600px" /></a>%BR%%BR%
   1 To check the retrieved content, go to "Web Application Server" -> "Content Management" -> "Content Imports" -> "Retrieved Sites": %BR%%BR%<a href="%ATTACHURLPATH%/scp11.png" target="_blank"><img src="%ATTACHURLPATH%/scp11.png" width="600px" /></a>%BR%%BR%
   1 Click voiD store -> "Edit": %BR%%BR%<a href="%ATTACHURLPATH%/scp12.png" target="_blank"><img src="%ATTACHURLPATH%/scp12.png" width="600px" /></a>%BR%%BR%
   1 To check the imported URLs, go to "Web Application Server" -> "Content Management" -> "Repository", path <b>DAV/void.rkbexplorer.com/content</b>: %BR%%BR%<a href="%ATTACHURLPATH%/scp10.png" target="_blank"><img src="%ATTACHURLPATH%/scp10.png" width="600px" /></a>%BR%%BR%
   1 To check the data inserted into the RDF Quad Store, go to http://cname/sparql and execute the following query:
<verbatim>
SELECT *
FROM <http://void.collection>
WHERE { ?s ?p ?o }
</verbatim>
   %BR%%BR%<a href="%ATTACHURLPATH%/scp13.png" target="_blank"><img src="%ATTACHURLPATH%/scp13.png" width="600px" /></a>%BR%%BR% %BR%%BR%<a href="%ATTACHURLPATH%/scp14.png" target="_blank"><img src="%ATTACHURLPATH%/scp14.png" width="600px" /></a>%BR%%BR%

---++Related

   * [[http://docs.openlinksw.com/virtuoso/rdfinsertmethods.html#rdfinsertmethodvirtuosocrawler][Setting up a Content Crawler Job to Add RDF Data to the Quad Store]]
   * [[VirtSetCrawlerJobsGuideSitemaps][Setting up a Content Crawler Job to Retrieve Sitemaps]] (when the source includes RDFa)
   * [[VirtSetCrawlerJobsGuideSemanticSitemaps][Setting up a Content Crawler Job to Retrieve Semantic Sitemaps]] (a variation of the standard sitemap)
   * [[VirtSetCrawlerJobsGuideDirectories][Setting up a Content Crawler Job to Retrieve Content from Specific Directories]]
   * [[VirtCrawlerGuideAtom][Setting up a Content Crawler Job to Retrieve Content from ATOM feed]]
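
---++Sketch: building the protocol URL and extracting the links

The guide relies on two mechanics: a SPARQL query is percent-encoded into a SPARQL protocol URL, and the crawler then picks the link targets out of the SPARQL results XML with the XPath expression =//binding[@name="url"]/uri/text()=. The following is a minimal Python sketch of both steps, not Virtuoso's own implementation. The helper names =build_protocol_url= and =extract_urls= are hypothetical, the sample XML is a shortened copy of the cURL output shown in the guide, and the query uses plain newlines, so the generated URL is equivalent to, but not byte-identical with, the one in the guide. Python's ElementTree needs the SPARQL results namespace spelled out, which the XPath expression entered in the Conductor form does not.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
import xml.etree.ElementTree as ET

# Endpoint and query taken from the guide; the line breaks are an assumption.
ENDPOINT = "http://void.rkbexplorer.com/sparql/"
QUERY = (
    "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
    "PREFIX void: <http://rdfs.org/ns/void#>\n"
    "SELECT DISTINCT ?url WHERE { ?ds a void:Dataset ; foaf:homepage ?url }"
)

# Namespace of the SPARQL Query Results XML Format.
SR_NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Shortened copy of the cURL output from the guide (no network access needed).
SAMPLE_RESULTS = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="url"/></head>
  <results ordered="false" distinct="true">
    <result>
      <binding name="url"><uri>http://kisti.rkbexplorer.com/</uri></binding>
    </result>
    <result>
      <binding name="url"><uri>http://epsrc.rkbexplorer.com/</uri></binding>
    </result>
  </results>
</sparql>
"""


def build_protocol_url(endpoint, query, result_format="sparql"):
    """Percent-encode the query into a GET URL (hypothetical helper)."""
    return endpoint + "?" + urlencode({"query": query, "format": result_format})


def extract_urls(results_xml):
    """Namespace-aware counterpart of //binding[@name="url"]/uri/text()."""
    root = ET.fromstring(results_xml)
    return [uri.text for uri in root.findall(".//sr:binding[@name='url']/sr:uri", SR_NS)]


url = build_protocol_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again returns the original query text.
print(parse_qs(urlsplit(url).query)["query"][0] == QUERY)

# These are the links a crawl job configured as above would follow.
for link in extract_urls(SAMPLE_RESULTS):
    print(link)
```

The URL printed first is the kind of value that goes into "Data Source Address (URL)", and the extracted links correspond to what the "XPath expression for links extraction" field selects from the endpoint's response.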
sioc:id
375585bb5d3b7b0fa04f28ce2a196565
sioc:link
n2:VirtCrawlerSPARQLEndpoints
sioc:has_container
n5:VOS
n33:has_services
n34:item
atom:title
VirtCrawlerSPARQLEndpoints
sioc:links_to
n2:VirtSetCrawlerJobsGuideDirectories n18:sparql n2:VirtCrawlerGuideAtom n26:conductor n28:rdfinsertmethodvirtuosocrawler n2:WebDAV n2:VirtSetCrawlerJobsGuideSitemaps
atom:source
n5:VOS
atom:author
n25:this
atom:published
2017-06-13T05:49:29Z
atom:updated
2017-06-29T07:36:32Z
sioc:topic
n5:VOS