This HTML5 document contains 53 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Namespace Prefixes

Prefix   IRI
dcterms  http://purl.org/dc/terms/
atom     http://atomowl.org/ontologies/atomrdf#
foaf     http://xmlns.com/foaf/0.1/
n4       http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideDirectories/
n18      http://vos.openlinksw.com/dataspace/services/wiki/
opl      http://www.openlinksw.com/schema/attribution#
n2       http://vos.openlinksw.com/dataspace/owiki/wiki/VOS/
n20      http://localhost:8890/
dc       http://purl.org/dc/elements/1.1/
n15      http://vos.openlinksw.com/dataspace/dav#
rdfs     http://www.w3.org/2000/01/rdf-schema#
n19      http://rdfs.org/sioc/services#
sioct    http://rdfs.org/sioc/types#
n7       http://vos.openlinksw.com/dataspace/person/dav#
n17      http://vos.openlinksw.com/dataspace/owiki/wiki/VOS/VirtSetCrawlerJobsGuideDirectories/
n9       http://vos.openlinksw.com/dataspace/owiki/wiki/
rdf      http://www.w3.org/1999/02/22-rdf-syntax-ns#
n10      http://vos.openlinksw.com/dataspace/owiki#
n21      http://docs.openlinksw.com/virtuoso/rdfinsertmethods.html#
xsdh     http://www.w3.org/2001/XMLSchema#
n8       http://vos.openlinksw.com/dataspace/%28NULL%29/wiki/VOS/
n14      http://vos.openlinksw.com/dataspace/person/owiki#
sioc     http://rdfs.org/sioc/ns#
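
The prefixed names used in the statements below are shorthand for full IRIs: the prefix is replaced by its IRI from the table above and the local name is appended. A minimal sketch of that expansion follows, assuming a small hand-copied subset of the table; the `PREFIXES` dictionary and the `expand` helper are illustrative names, not part of any Virtuoso or Microdata API.

```python
# Subset of the prefix table above, copied by hand for illustration.
# PREFIXES and expand() are hypothetical names introduced for this sketch.
PREFIXES = {
    "sioc": "http://rdfs.org/sioc/ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "n2": "http://vos.openlinksw.com/dataspace/owiki/wiki/VOS/",
    "n9": "http://vos.openlinksw.com/dataspace/owiki/wiki/",
}


def expand(curie: str) -> str:
    """Expand a 'prefix:local' name into a full IRI using PREFIXES."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return PREFIXES[prefix] + local


if __name__ == "__main__":
    # Subject and predicate taken from the statements in this document.
    print(expand("n2:VirtSetCrawlerJobsGuideDirectories"))
    print(expand("sioc:links_to"))
```

With this reading, a line such as `n9:VOS sioc:container_of n2:VirtSetCrawlerJobsGuideDirectories` stands for a triple of three full IRIs.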

Statements

Subject Item
n7:this
foaf:made
n2:VirtSetCrawlerJobsGuideDirectories
Subject Item
n15:this
sioc:creator_of
n2:VirtSetCrawlerJobsGuideDirectories
Subject Item
n18:item
n19:services_of
n2:VirtSetCrawlerJobsGuideDirectories
Subject Item
n10:this
sioc:creator_of
n2:VirtSetCrawlerJobsGuideDirectories
Subject Item
n9:VOS
sioc:container_of
n2:VirtSetCrawlerJobsGuideDirectories
atom:entry
n2:VirtSetCrawlerJobsGuideDirectories
atom:contains
n2:VirtSetCrawlerJobsGuideDirectories
Subject Item
n2:VirtSetCrawlerJobsGuideSemanticSitemapsFuncExample
sioc:links_to
n2:VirtSetCrawlerJobsGuideDirectories
Subject Item
n2:VirtSetCrawlerJobsGuideDirectories
rdf:type
atom:Entry sioct:Comment
dcterms:created
2017-06-13T05:37:45.484655
dcterms:modified
2017-06-13T05:37:45.484655
rdfs:label
VirtSetCrawlerJobsGuideDirectories
foaf:maker
n14:this n7:this
dc:title
VirtSetCrawlerJobsGuideDirectories
opl:isDescribedUsing
n17:sioc.rdf
sioc:has_creator
n10:this n15:this
sioc:attachment
n4:d7.png n4:d5.png n4:d6.png n4:d3.png n4:d4.png n4:d1a.png n4:d2.png n4:cr3.png n4:d1.png n4:cr1.png n4:cr2.png
sioc:content
%META:TOPICPARENT{name="VirtSetCrawlerJobsGuide"}%
---+Setting up a Content Crawler Job to Retrieve Content from Specific Directories

The following guide describes how to set up a crawler job that retrieves directories using Conductor.

   1 Go to the Conductor UI, for example at http://localhost:8890/conductor .
   1 Enter the dba credentials.
   1 Go to "Web Application Server". %BR%%BR%<a href="%ATTACHURLPATH%/cr1.png" target="_blank"><img src="%ATTACHURLPATH%/cr1.png" width="600px" /></a>%BR%%BR%
   1 Go to "Content Imports". %BR%%BR%<a href="%ATTACHURLPATH%/cr2.png" target="_blank"><img src="%ATTACHURLPATH%/cr2.png" width="600px" /></a>%BR%%BR%
   1 Click "New Target". %BR%%BR%<a href="%ATTACHURLPATH%/cr3.png" target="_blank"><img src="%ATTACHURLPATH%/cr3.png" width="600px" /></a>%BR%%BR%
   1 In the form that is shown, set the following:
      * "Crawl Job Name": <verbatim> Gov.UK data </verbatim>
      * "Data Source Address (URL)": <verbatim> http://source.data.gov.uk/data/ </verbatim>
      * "Local WebDAV Identifier" for an available user, for example demo: <verbatim> /DAV/home/demo/gov.uk/ </verbatim>
      * From the "Local resources owner" list, choose a user, for example demo. %BR%%BR%<a href="%ATTACHURLPATH%/d1.png" target="_blank"><img src="%ATTACHURLPATH%/d1.png" width="600px" /></a>%BR%%BR%
      * Click the "Create" button.
   1 As a result, the Robot target will be created: %BR%%BR%<a href="%ATTACHURLPATH%/d2.png" target="_blank"><img src="%ATTACHURLPATH%/d2.png" width="600px" /></a>%BR%%BR%
   1 Click "Import Queues". %BR%%BR%<a href="%ATTACHURLPATH%/d3.png" target="_blank"><img src="%ATTACHURLPATH%/d3.png" width="600px" /></a>%BR%%BR%
   1 For the "Robot target" labeled "Gov.UK data", click "Run".
   1 As a result, the status of the pages will be shown: retrieved, pending, or waiting. %BR%%BR%<a href="%ATTACHURLPATH%/d4.png" target="_blank"><img src="%ATTACHURLPATH%/d4.png" width="600px" /></a>%BR%%BR%
   1 Click "Retrieved Sites".
   1 As a result, the total number of pages retrieved should be shown. %BR%%BR%<a href="%ATTACHURLPATH%/d5.png" target="_blank"><img src="%ATTACHURLPATH%/d5.png" width="600px" /></a>%BR%%BR%
   1 Go to "Web Application Server" -> "Content Management".
   1 Enter the path: <verbatim> DAV/home/demo/gov.uk </verbatim> %BR%%BR%<a href="%ATTACHURLPATH%/d6.png" target="_blank"><img src="%ATTACHURLPATH%/d6.png" width="600px" /></a>%BR%%BR%
   1 Go to the path: <verbatim> DAV/home/demo/gov.uk/data </verbatim>
   1 As a result, the retrieved content will be shown. %BR%%BR%<a href="%ATTACHURLPATH%/d7.png" target="_blank"><img src="%ATTACHURLPATH%/d7.png" width="600px" /></a>%BR%%BR%

---++Related
   * [[VirtSetCrawlerJobsGuide][Setting up Crawler Jobs Guide using Conductor]]
   * [[http://docs.openlinksw.com/virtuoso/rdfinsertmethods.html#rdfinsertmethodvirtuosocrawler][Setting up a Content Crawler Job to Add RDF Data to the Quad Store]]
   * [[VirtSetCrawlerJobsGuideSitemaps][Setting up a Content Crawler Job to Retrieve Sitemaps (where the source includes RDFa)]]
   * [[VirtSetCrawlerJobsGuideSemanticSitemaps][Setting up a Content Crawler Job to Retrieve Semantic Sitemaps (a variation of the standard sitemap)]]
   * [[VirtCrawlerSPARQLEndpoints][Setting up a Content Crawler Job to Retrieve Content from SPARQL endpoint]]
sioc:id
01b349799c30efc349e0f22448cc4a70
sioc:link
n2:VirtSetCrawlerJobsGuideDirectories
sioc:has_container
n9:VOS
n19:has_services
n18:item
atom:title
VirtSetCrawlerJobsGuideDirectories
sioc:links_to
n8:VirtSetCrawlerJobsGuideSitemaps n8:VirtSetCrawlerJobsGuideSemanticSitemaps n8:WebDAV n8:VirtSetCrawlerJobsGuide n8:VirtCrawlerSPARQLEndpoints n20:conductor n21:rdfinsertmethodvirtuosocrawler
atom:source
n9:VOS
atom:author
n7:this
atom:published
2017-06-13T05:37:45Z
atom:updated
2017-06-13T05:37:45Z
sioc:topic
n9:VOS
Subject Item
n2:VirtSetCrawlerJobsGuideSemanticSitemaps
sioc:links_to
n2:VirtSetCrawlerJobsGuideDirectories
Subject Item
n2:VirtSetCrawlerJobsGuide
sioc:links_to
n2:VirtSetCrawlerJobsGuideDirectories
Subject Item
n2:VirtSetCrawlerJobsGuideSitemaps
sioc:links_to
n2:VirtSetCrawlerJobsGuideDirectories
Subject Item
n2:VirtCrawlerSPARQLEndpoints
sioc:links_to
n2:VirtSetCrawlerJobsGuideDirectories
Subject Item
n2:VirtCrawlerGuideAtom
sioc:links_to
n2:VirtSetCrawlerJobsGuideDirectories
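
The crawl job settings given in the sioc:content value above (job name, source URL, local WebDAV identifier, owner) can be collected in one record, which also makes it easier to see where retrieved content is expected to land. The sketch below is only an illustration: the `CrawlJob` class and `local_path_for` method are hypothetical names, and the path mapping (source URL path appended under the local WebDAV identifier) is an assumption inferred from the guide browsing to DAV/home/demo/gov.uk/data, not documented crawler behavior.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class CrawlJob:
    """Hypothetical record of the form fields used in the guide."""
    name: str
    source_url: str
    local_dav: str
    owner: str

    def local_path_for(self, url: str) -> str:
        """Guess the WebDAV path for a crawled URL.

        Assumption: the URL path is appended under the local WebDAV
        identifier, as the guide's final browse path suggests.
        """
        src = urlsplit(self.source_url)
        target = urlsplit(url)
        if target.netloc != src.netloc or not target.path.startswith(src.path):
            raise ValueError(f"{url!r} is outside the crawl source")
        return self.local_dav.rstrip("/") + target.path


# Values taken from the guide's form fields.
job = CrawlJob(
    name="Gov.UK data",
    source_url="http://source.data.gov.uk/data/",
    local_dav="/DAV/home/demo/gov.uk/",
    owner="demo",
)

if __name__ == "__main__":
    print(job.local_path_for("http://source.data.gov.uk/data/"))
```

For the source URL itself this yields a path under /DAV/home/demo/gov.uk/data, matching the folder the guide opens in "Content Management"; whether deeper URLs map the same way should be checked against an actual crawl.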