OpenLink Software provides a backup of the current DBpedia database, as hosted on the live service at http://dbpedia.org/, which users can restore into a Virtuoso EC2 AMI instance in the cloud, giving them an instance of DBpedia for their own use.
A Virtuoso EC2 AMI (ami-59628630 or ami-c46084ad) must be used with this backup.
m1.large EC2 instance type, and we recommend a 64-bit large image AMI instance with at least 8 GB of memory for best performance.
Important: The following section should be added to the Virtuoso configuration file (/opt/virtuoso/database/virtuoso.ini) to control and safeguard your SPARQL endpoint against overzealous usage:
[SPARQL]
MaxCacheExpiration = 1 ; Cache Expiration time in seconds that overrides Sponger's default cache invalidation scheme
ExternalQuerySource = 1
ExternalXsltSource = 1
ResultSetMaxRows = 100000
;DefaultGraph = http://demo.openlinksw.com/dataspace/person/demo
MaxQueryCostEstimationTime = 10000 ; in seconds
MaxQueryExecutionTime = 30 ; in seconds
;ImmutableGraphs = http://unknown:8890/dataspace
;PingService = http://rpc.pingthesemanticweb.com/
DefaultQuery = select distinct ?URI ?ObjectType where {?URI a ?ObjectType} limit 50
DeferInferenceRulesInit = 0 ; Defer Loading of inference rules at start up
Details about these settings can be found in the Virtuoso Online Documentation in the SPARQL Configuration File section.
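As a rough way to confirm that the limits above actually made it into the configuration file, the following sketch parses INI text with Python's standard configparser module and reports any [SPARQL] setting that differs from the values shown. The function names and the EXPECTED table are illustrative assumptions, not part of Virtuoso, and a real virtuoso.ini may contain constructs this simple parser does not handle.

```python
import configparser

# Limits copied from the [SPARQL] section shown above; adjust to your own policy.
EXPECTED = {
    "ResultSetMaxRows": "100000",
    "MaxQueryCostEstimationTime": "10000",
    "MaxQueryExecutionTime": "30",
}


def read_sparql_section(ini_text):
    """Parse virtuoso.ini text and return the [SPARQL] section as a dict."""
    parser = configparser.ConfigParser(
        inline_comment_prefixes=(";",),  # drop trailing "; ..." comments
        interpolation=None,              # treat "%" characters literally
        strict=False,                    # tolerate repeated keys
    )
    parser.optionxform = str             # keep key names in their original case
    parser.read_string(ini_text)
    if not parser.has_section("SPARQL"):
        return {}
    return dict(parser["SPARQL"])


def find_mismatches(settings, expected=EXPECTED):
    """Return {key: (actual, wanted)} for every setting that differs."""
    return {
        key: (settings.get(key), wanted)
        for key, wanted in expected.items()
        if settings.get(key) != wanted
    }
```

To use it, read /opt/virtuoso/database/virtuoso.ini into a string, pass it to read_sparql_section, and hand the result to find_mismatches; an empty dict means all three limits match.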
The "DeferInferenceRulesInit = 1" setting (shown as 0 in the section above) is important when hosting large RDF data sets like DBpedia, as it defers the loading of the inference rules, which can otherwise take quite some time (up to an hour) during server startup.
OAuth support can be used to secure the SPARQL endpoint by installing the conductor_dav.vad VAD package. This allows the /sparql endpoint to be disabled or mapped to the Virtuoso OAuth SPARQL service, so that an API key is required to use the endpoint, as detailed in the Virtuoso OAuth documentation.
Virtuoso Web Services ACLs can also be used to limit access to the SPARQL endpoint, as detailed in the Virtuoso documentation.
If you are running a Virtuoso EC2 AMI instance created before December 2, 2008, you will need to update your DBpedia VAD Application package to obtain the latest enhancements, by taking the following steps:
dbpedia_dav.vad package.
Browse to the dbpedia_dav.vad file and click the "open" button to select it.
http://your-ec2-instance-cname/conductor.
Go to http://your-ec2-instance-cname/ec2exts to load the Virtuoso Extensions for Amazon EC2 Images login page and log in as the "dba" user.
Protocol: WebDAV/HTTP
Host: s3.amazonaws.com
Path or Bucket: dbpedia-version-32-bundle
Backup File Prefix: dbpedia-version-32
Note that the server restart may take a while, as some initialization procedures need time to complete. Once they finish, the restored DBpedia database is ready for use.
You can then access pages such as these on your DBpedia server:
http://your-ec2-instance-cname/About
http://your-ec2-instance-cname/sparql
http://your-ec2-instance-cname/resource/OpenLink_Software
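As a quick check that the restored instance answers queries, the /sparql page listed above can also be called programmatically. The sketch below is a minimal example using only the Python standard library. It assumes the endpoint accepts a standard SPARQL protocol GET request with a "query" parameter and returns JSON results when asked via the Accept header. The host name is the placeholder used in this document, and the function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

# Placeholder from the text; replace with your instance's public DNS name.
HOST = "your-ec2-instance-cname"

# The DefaultQuery from the [SPARQL] configuration section.
QUERY = "select distinct ?URI ?ObjectType where {?URI a ?ObjectType} limit 50"


def build_sparql_url(host, query):
    """Return a SPARQL protocol GET URL for the /sparql endpoint on host."""
    return "http://%s/sparql?%s" % (host, urllib.parse.urlencode({"query": query}))


def run_query(host, query, timeout=60):
    """Send the query and return the decoded JSON results (needs network access)."""
    request = urllib.request.Request(
        build_sparql_url(host, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def binding_values(results):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in results["results"]["bindings"]
    ]
```

Calling binding_values(run_query(HOST, QUERY)) should return up to 50 rows with "URI" and "ObjectType" keys. Keep in mind that queries exceeding MaxQueryExecutionTime or ResultSetMaxRows are cut off by the limits configured earlier.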