Virtuoso DBpedia 2016-04 Live Edition Pay As You Go (PAGO) EBS-backed EC2 AMI

Introduction

In addition to the Instance-backed EC2 AMI that has been in existence since 2008, Virtuoso is now also available as an EBS-backed EC2 AMI based on either a BYOL (Bring Your Own License) or a PAGO (Pay As You Go) License Model. With either License Model, you get a preconfigured Virtuoso instance. The fundamental benefits provided by this type of AMI include:

  • Virtuoso DBMS Server is preinstalled with basic tuning for the host operating system.
  • DBpedia 2016-04 Database is preloaded and pre-configured.
  • You can start and stop the DBpedia instance without having to terminate its host AMI.
  • With the hourly model, you pay only for the time the AMI is used.

Prerequisites

  • An Amazon Web Services (AWS) account.
  • Recently created AWS accounts will have been automatically signed up for the Amazon S3 and EC2 Web Service. If you created your AWS account a long time ago, you may now need to manually sign up for these services.
  • Ensure that the instance uses an AWS security group that allows access to ports 22 (SSH) and 80 (HTTP).
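
The security group prerequisite can also be prepared from the command line. The following is a minimal sketch, assuming the AWS CLI is installed and configured and that the account has a default VPC. The helper names, the group name, and the CIDR range are hypothetical placeholders, not values shipped with this AMI.

```shell
# Sketch: create a security group that allows SSH (22) and HTTP (80).
# Assumptions: AWS CLI is installed and configured; a default VPC exists.
# The group name and CIDR passed in are placeholders chosen by the caller.

# Print each command; execute it only when DRY_RUN is unset or empty.
run() {
  echo "+ $*"
  if [ -z "${DRY_RUN:-}" ]; then
    "$@"
  fi
}

create_dbpedia_sg() {
  sg_name=$1   # e.g. virtuoso-dbpedia (hypothetical name)
  sg_cidr=$2   # source range allowed to connect, e.g. 203.0.113.0/24

  run aws ec2 create-security-group \
    --group-name "$sg_name" \
    --description "SSH and HTTP access for Virtuoso DBpedia" || return 1

  # Open one TCP port per iteration for the given source range.
  for sg_port in 22 80; do
    run aws ec2 authorize-security-group-ingress \
      --group-name "$sg_name" \
      --protocol tcp --port "$sg_port" --cidr "$sg_cidr" || return 1
  done
}
```

Setting DRY_RUN=1 before calling create_dbpedia_sg prints the three commands for review without contacting AWS. Restricting the CIDR to your own address range is generally safer than opening port 22 to everyone.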

Instantiating Virtuoso DBpedia 2016-04 Live Edition PAGO Enterprise Edition via Web Interface

  1. Locate the Virtuoso DBpedia 2016-04 Live Edition PAGO image in AWS Marketplace and click the Continue button.

    AWS Marketplace DBpedia 2016-04 Live Edition

  2. Choose a suitably sized EC2 Instance Type and a Key Pair, then click the Launch with 1-click button.

    AWS Marketplace DBpedia 2016-04 Live Edition Launch on EC2

  3. A confirmation dialog will be presented indicating the image has been deployed.

    AWS Marketplace DBpedia 2016-04 Live Edition now Deployed

  4. Check in the AWS Console EC2 Web Interface that the image has been successfully instantiated.

    AWS EC2 Launched Image
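
The check in step 4 can also be scripted. This is a minimal sketch, assuming the AWS CLI is installed and configured for the region in which the instance was launched. The function names are illustrative, and the instance ID is whatever the AWS Console reports for your instance.

```shell
# Sketch: report the state of a launched instance via the AWS CLI.
# Assumption: the AWS CLI is installed and configured for the same region.

# Print the instance state name (for example "pending" or "running").
instance_state() {
  aws ec2 describe-instances \
    --instance-ids "$1" \
    --query 'Reservations[].Instances[].State.Name' \
    --output text
}

# Succeed only when the instance reports the "running" state.
is_running() {
  [ "$(instance_state "$1")" = "running" ]
}

# Example (replace the placeholder with your own instance ID):
#   is_running {instance-id} && echo "instance is running"
```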

First-time Setup & Usage Notes

These steps are only necessary when you start the DBpedia DB for the first time, immediately after instantiating the AMI.

This section may be ignored thereafter, as it is not necessary after AMI reboots.

  1. ssh into your instantiated AMI using:

    ssh -i {secure-pem-file} ec2-user@{ec2-dns-name-or-ip-address}

  2. Start the Virtuoso DBMS Server against the DBpedia Database by issuing the command:

    sudo /etc/rc.d/init.d/virtuoso restart



    Note: It takes the Virtuoso DBMS Server approximately 20 minutes to bring the DBpedia database online, due to its size.

  3. Once online, your DBpedia instance will be available at the following URLs:

    • Basic Linked Data Exploration Page

      http://{amazon-ec2-ami-dns-name-or-ip-address}/resource/DBpedia

    • Advanced Faceted Browsing Page

      http://{amazon-ec2-ami-dns-name-or-ip-address}/describe/?uri=http://dbpedia.org/resource/DBpedia

    • Faceted Browsing Endpoint

      http://{amazon-ec2-ami-dns-name-or-ip-address}/fct

    • SPARQL Query Service Endpoint

      http://{amazon-ec2-ami-dns-name-or-ip-address}/sparql

    • Virtuoso Instance Admin Page (Conductor)

      http://{amazon-ec2-ami-dns-name-or-ip-address}/conductor



  4. We strongly recommend you now use the Conductor to change the password for the 'dba' user, which is initially set to the AMI instance-id.

    1. Retrieve the AMI instance-id from the AMI properties presented by the Amazon AWS console UI, or by executing the following in the Linux shell:

      curl http://169.254.169.254/latest/meta-data/instance-id

    2. Load the Conductor interface

      http://{amazon-ec2-ami-dns-name-or-ip-address}/conductor

    3. At the authentication challenge, log in as the dba user, with the AMI instance-id as the password.
    4. Drill down to System Admin -> User Accounts.
    5. Locate the dba user, and click the associated Edit link.
    6. The form allows many things to be changed. For now, just input your desired password into both Password and Confirm Password boxes, and click the Save button.
    7. You can now perform other administrative tasks through the Conductor interface, or return to basic DBpedia use.
Note: If unable to connect to the Virtuoso server using the instance-id as password, please contact technical.support@openlinksw.com for assistance.

DBpedia 2016-04 Live Edition Database Interaction via Web Interface

  • An obvious starting point for DBpedia database access is

    http://{amazon-ec2-ami-dns-name-or-ip-address}/resource/DBpedia

  • To administer the Virtuoso DBMS Server, go to

    http://{amazon-ec2-ami-dns-name-or-ip-address}/conductor
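
Beyond the browser pages, the SPARQL Query Service Endpoint listed earlier can also be queried over HTTP from a script. The following is a minimal sketch, assuming curl is installed on the client machine. The function name sparql_query is illustrative, and the JSON results format is simply one choice of response type.

```shell
# Sketch: send a SPARQL query to the instance's /sparql endpoint with curl.
# Assumption: curl is installed on the client machine.
sparql_query() {
  sq_host=$1    # EC2 DNS name or IP address of the instance
  sq_query=$2   # SPARQL query text

  # -G sends the URL-encoded query as a GET parameter.
  curl -sS -G "http://${sq_host}/sparql" \
    -H 'Accept: application/sparql-results+json' \
    --data-urlencode "query=${sq_query}"
}

# Example (replace the placeholder with your instance's address):
#   sparql_query "{amazon-ec2-ami-dns-name-or-ip-address}" \
#     'SELECT ?p ?o WHERE { <http://dbpedia.org/resource/DBpedia> ?p ?o } LIMIT 10'
```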

Administering Virtuoso Instance via SSH

All scripts for starting and stopping the Virtuoso instance are found in the following locations --

  • /etc/rc.d/init.d -- scripts that enable automatic database server instantiation at operating system (AMI) boot or reboot time

  • /opt/virtuoso -- scripts for starting and stopping the database server within a running operating system (AMI)

OpenLink License Manager

  • Start the License Manager:

    /etc/rc.d/init.d/oplmgr start

  • Stop the License Manager:

    /etc/rc.d/init.d/oplmgr stop

  • Restart the License Manager:

    /etc/rc.d/init.d/oplmgr restart

Virtuoso Server

  • Start the Virtuoso Server:

    /etc/rc.d/init.d/virtuoso start

  • Stop the Virtuoso Server:

    /etc/rc.d/init.d/virtuoso stop

  • Restart the Virtuoso Server:

    /etc/rc.d/init.d/virtuoso restart

Virtuoso Database Instance Interaction

  1. Set the Virtuoso environment variables by running the command

    . /opt/virtuoso/virtuoso-enterprise.sh

  2. Run the Virtuoso "isql" command line tool to connect to the database. Note: your EC2 AMI's instance-id will be the dba user's password, until you change it (as recommended above).

    $ isql 1111 -U dba -P {Password}
    Connected to OpenLink Virtuoso
    Driver: 07.10.3214 OpenLink Virtuoso ODBC Driver
    OpenLink Interactive SQL (Virtuoso), version 0.9849b.
    Type HELP; for help and EXIT; to exit.
    SQL>

  3. Run the "tables" command to obtain a list of tables in the default schema

    SQL> tables;
    Showing SQLTables of tables like 'NULL.NULL.NULL', tabletype/colname like 'NULL'
    TABLE_QUALIFIER  TABLE_OWNER  TABLE_NAME                 TABLE_TYPE    REMARKS
    VARCHAR          VARCHAR      VARCHAR                    VARCHAR       VARCHAR
    _______________________________________________________________________________

    DB               DBA          ADMIN_SESSION              SYSTEM TABLE  NULL
    DB               DBA          ADM_OPT_ARRAY_TO_RS_PVIEW  SYSTEM TABLE  NULL
    DB               DBA          ADM_XML_VIEWS              SYSTEM TABLE  NULL
    . . .
    DB               DBA          SYS_SQL_INVERSE            SYSTEM TABLE  NULL
    DB               DBA          SYS_TRIGGERS               SYSTEM TABLE  NULL
    DB               DBA          SYS_VIEWS                  SYSTEM TABLE  NULL

    209 Rows. -- 1890 msec.
    SQL>

  4. You can stop the Virtuoso Database Server by running:

    virtuoso-stop.sh dbpedia

  5. You can restart the Virtuoso Database Server by running:

    virtuoso-start.sh dbpedia
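
For scripted use, isql can also run a single statement and exit. The following is a minimal sketch, assuming the environment has been set with ". /opt/virtuoso/virtuoso-enterprise.sh", the server listens on port 1111 as in the session above, and the dba password is still the instance-id. The helper names are illustrative. The sketch uses isql's positional "port user password" form together with its exec= option.

```shell
# Sketch: run one statement through isql without an interactive session.
# Assumptions: isql is on the PATH (after sourcing virtuoso-enterprise.sh),
# the server listens on port 1111, and the dba password is the instance-id.

# Fetch the instance-id from the EC2 metadata service (run on the instance).
dba_default_password() {
  curl -s http://169.254.169.254/latest/meta-data/instance-id
}

# Usage: run_sql PASSWORD 'STATEMENT;'
run_sql() {
  isql 1111 dba "$1" exec="$2"
}

# Example: fetch a few labels of the DBpedia resource via SPARQL-in-SQL.
#   run_sql "$(dba_default_password)" \
#     'SPARQL SELECT ?label WHERE { <http://dbpedia.org/resource/DBpedia> <http://www.w3.org/2000/01/rdf-schema#label> ?label } LIMIT 5;'
```

Once the dba password has been changed, pass the new password instead of the instance-id.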

Enable DBpedia Live Updates

The DBpedia Integrator utility program is provided with this AMI. It downloads changesets from the DBpedia Live website and applies them to the local Virtuoso instance, keeping the DBpedia datasets in step with live Wikipedia updates.

To enable the DBpedia Live updates:

  1. Go to the /dbpedia/dbpintegrator directory.

    -bash-4.2$ cd /dbpedia/dbpintegrator

  2. Run the command sudo sh update_ontology.sh once to check the setup and attempt to update the database with the latest ontology fixes.

    -bash-4.2$ sudo sh update_ontology.sh
    -bash-4.2$

  3. Run the command sudo sh update_changesets.sh to enable the DBpedia Live updates, i.e., to start loading the available change sets.

    -bash-4.2$ sudo sh update_changesets.sh
    nohup: appending output to 'nohup.out'
    -bash-4.2$

  4. A web page showing the live updates to the AMI instance as they occur is available at http://{amazon-ec2-ami-dns-name-or-ip-address}/live

    AWS EC2 Launched Image

Note: Depending on server resources and bandwidth, it may take a number of hours or days for all the change sets to be loaded and for the DBpedia instance to be up to date and receiving real-time updates from Wikipedia. Monitor the Latest changes and Top 20 Most Recently Updated Entities sections of the live update web page to see the current state of the live update process.

Note: When steps 2 and 3 are run for the first time, they set up the password for connecting to Virtuoso, which is derived from the instance-id. Thus, if the dba password has been changed, the password used by these scripts must be updated to match.

If anything goes wrong, it will be logged in the associated log file dbpedia_dbms_errors.log; otherwise, update progress is written to dbp.log.

Setting up CRON job

The Linux CRON utility can be used to automatically (re)start the scripts by adding a few lines to the cron setup for the root user:


# crontab -e

This starts the CRON editor. Add the following lines to the bottom of the file:


@hourly   /dbpedia/dbpintegrator/update_changesets.sh
@daily      /dbpedia/dbpintegrator/update_ontology.sh

and save the resulting file.
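
Where an interactive editor is inconvenient, the same two entries can be appended from a script. The following is a minimal sketch, assuming it is run as root so that it edits the root user's crontab. The helper names add_cron_entry and install_dbpedia_cron are illustrative. Each entry is added only if an identical line is not already present, so running it twice does not create duplicates.

```shell
# Sketch: add the two entries to the current user's crontab without opening
# an editor, skipping any entry that is already present.
# Assumption: run as root, matching the cron setup described above.
add_cron_entry() {
  ce_entry=$1
  ce_current=$(crontab -l 2>/dev/null || true)

  # -F: fixed string, -x: whole-line match, -q: quiet
  if printf '%s\n' "$ce_current" | grep -Fqx -- "$ce_entry"; then
    return 0
  fi

  # Re-install the existing lines (if any) followed by the new entry.
  {
    if [ -n "$ce_current" ]; then printf '%s\n' "$ce_current"; fi
    printf '%s\n' "$ce_entry"
  } | crontab -
}

install_dbpedia_cron() {
  add_cron_entry '@hourly /dbpedia/dbpintegrator/update_changesets.sh' &&
    add_cron_entry '@daily /dbpedia/dbpintegrator/update_ontology.sh'
}
```

Running crontab -l afterwards shows the resulting schedule.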

Performance Notes

With regard to performance, please be aware of the following:

  • We currently bundle a 10 Database Sessions and 4 CPU Affinity license with this AMI -- Database & CPU Affinity upgrade licenses are available as upgrade options.
  • There is a range of AMI choices, offering various combinations of system memory and CPU cores.

Collectively, the factors above affect the performance of your DBpedia instance. For best performance, use EC2 Instance Types with more memory and CPU cores.

Note: By default, this AMI is configured to run on an m3.large EC2 Instance Type. If a larger EC2 Instance Type is chosen, then the NumberOfBuffers and MaxDirtyBuffers parameters in the /opt/virtuoso/dbpedia/dbpedia.ini configuration file should be increased to correspond to the chosen Instance Type's available memory, as detailed in the Virtuoso Performance Tuning Guide:

EC2 Instance Type   System RAM   NumberOfBuffers   MaxDirtyBuffers
m3.large            7 GB         680000            500000
r3.large            15 GB        1360000           1000000
r3.xlarge           30.5 GB      2720000           2000000
r3.2xlarge          61 GB        5450000           4000000

After changing these settings, restart the Virtuoso server as described above.
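
The edit itself can be scripted. The following is a minimal sketch that looks up the two values from the table above and rewrites the matching lines of an INI file. The helper names are illustrative, and the sketch assumes each parameter appears on an uncommented "Name = value" line. Taking a backup copy of dbpedia.ini before changing it is a sensible precaution.

```shell
# Sketch: look up the buffer settings from the table above and write them
# into a Virtuoso INI file. The helper names are illustrative only.

# Print "NumberOfBuffers MaxDirtyBuffers" for a listed EC2 Instance Type.
buffers_for() {
  case $1 in
    m3.large)   echo "680000 500000" ;;
    r3.large)   echo "1360000 1000000" ;;
    r3.xlarge)  echo "2720000 2000000" ;;
    r3.2xlarge) echo "5450000 4000000" ;;
    *) echo "no table entry for instance type: $1" >&2; return 1 ;;
  esac
}

# Replace the value of an uncommented "key = value" line in an INI file.
set_ini_param() {
  ini_file=$1; ini_key=$2; ini_value=$3
  ini_tmp=$(mktemp) || return 1
  # Overwrite the original via cat so that its owner and mode are kept.
  sed -E "s/^(${ini_key}[[:space:]]*=).*/\1 ${ini_value}/" "$ini_file" > "$ini_tmp" &&
    cat "$ini_tmp" > "$ini_file"
  ini_status=$?
  rm -f "$ini_tmp"
  return "$ini_status"
}

# Apply the table values for one instance type to the given INI file.
tune_buffers() {
  tb_type=$1; tb_file=$2
  tb_values=$(buffers_for "$tb_type") || return 1
  set -- $tb_values   # split into $1 = NumberOfBuffers, $2 = MaxDirtyBuffers
  set_ini_param "$tb_file" NumberOfBuffers "$1" &&
    set_ini_param "$tb_file" MaxDirtyBuffers "$2"
}

# Example (as root):
#   tune_buffers r3.large /opt/virtuoso/dbpedia/dbpedia.ini
```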

Troubleshooting

If you encounter any problems resolving the sample DBpedia URIs listed in the steps above, please:

  1. Determine whether Virtuoso is running, with this command

    ps -ef | grep "virt*" | grep -v grep

  2. Check the log of Virtuoso's most recent activity, with this command

    tail /dbpedia/*.log

Output of those commands will show whether Virtuoso DBpedia DB setup (which can take a while due to DB size) is still in progress, or setup has completed but Virtuoso awaits one of the following:

  • Startup command

    /etc/init.d/virtuoso start

  • Restart command

    /etc/init.d/virtuoso restart
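
While setup is still in progress, a polling loop can report when the HTTP endpoint starts answering, instead of re-running the checks by hand. The following is a minimal sketch, assuming curl is installed and that a plain GET of /sparql returns HTTP 200 once the server is online. The function names and the default of 60 attempts at 30-second intervals are illustrative choices, sized to cover the startup time of approximately 20 minutes noted earlier.

```shell
# Sketch: poll the SPARQL endpoint until it answers with HTTP 200.
# Assumptions: curl is installed, and a plain GET of /sparql returns 200
# once the server is online.

# Print the HTTP status code for the endpoint ("000" if unreachable).
probe() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "http://$1/sparql"
}

# Usage: wait_for_virtuoso HOST [ATTEMPTS] [DELAY_SECONDS]
wait_for_virtuoso() {
  wv_host=$1
  wv_attempts=${2:-60}
  wv_delay=${3:-30}
  wv_i=0
  while [ "$wv_i" -lt "$wv_attempts" ]; do
    if [ "$(probe "$wv_host")" = "200" ]; then
      echo "Virtuoso is answering on http://${wv_host}/sparql"
      return 0
    fi
    wv_i=$((wv_i + 1))
    sleep "$wv_delay"
  done
  echo "No HTTP 200 from ${wv_host} after ${wv_attempts} attempts" >&2
  return 1
}

# Example (replace the placeholder with your instance's address):
#   wait_for_virtuoso "{amazon-ec2-ami-dns-name-or-ip-address}"
```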
