DBpedia Live (Virtuoso PAGO) EBS-backed EC2 AMI

Introduction

In addition to the instance-backed EC2 AMI that has been available since 2008, a standard unpopulated Virtuoso instance is available as an EBS-backed EC2 AMI on either a BYOL (Bring Your Own License) or a PAGO (Pay As You Go) basis. In each case, the AMI delivers a preconfigured Virtuoso instance.

We also now offer two PAGO variants, each pre-loaded with a DBpedia dataset.

This type of AMI provides several fundamental benefits including —

Prerequisites

Instantiating DBpedia Live (Virtuoso PAGO) via Web Interface

  1. Locate the DBpedia Live (Virtuoso PAGO) image in AWS Marketplace and click the Continue to Subscribe button.

    AWS Marketplace DBpedia Live (Virtuoso PAGO)

  2. Choose a suitably sized EC2 Instance Type and License Quantity, then click the Continue to Configuration button. An EC2 Instance Type with a minimum of 16GB RAM is recommended, for example m5.xlarge.

    AWS Marketplace DBpedia Live (Virtuoso PAGO) Launch on EC2

  3. Click on the Continue to Launch button.

    AWS Marketplace DBpedia Live (Virtuoso PAGO) now Deployed

  4. Review the configuration settings and once satisfied click the Launch button.

    AWS Marketplace DBpedia Live (Virtuoso PAGO) now Deployed

  5. Check in the AWS Console EC2 web interface that the image has been successfully instantiated.

    AWS EC2 Launched Image

First-time Setup & Usage Notes

The steps in this section are only necessary when you start the DBpedia DB for the first time, immediately after instantiating the AMI.

This section may be ignored thereafter, as it is not necessary after AMI reboots.

  1. ssh into your instantiated AMI using:

    ssh -i {secure-pem-file} ec2-user@{ec2-dns-name-or-ip-address}

  2. Start the Virtuoso DBMS Server against the DBpedia Database by issuing the following command. Note: At initial launch, it takes the Virtuoso DBMS Server a few minutes to bring the DBpedia database online, due to its size.

    sudo service virtuoso restart

  3. We strongly recommend you now use the Conductor to change the password for the 'dba' user from its default, the AMI instance-id.

    1. Retrieve the AMI instance-id by either --
      • checking the AMI properties presented by the Amazon AWS console UI --

        AWS Console

      • executing the following command in the Linux shell --

        curl http://169.254.169.254/latest/meta-data/instance-id

    2. Load the Conductor interface

      http://{amazon-ec2-ami-dns-name-or-ip-address}/conductor

    3. At the authentication challenge, log in as the dba user, with the AMI instance-id as the password. Note: If unable to connect to the Virtuoso server using the instance-id as password, please create a Support Case for fastest assistance.
    4. Drill down to System Admin → User Accounts.
    5. Locate the dba user, and click the associated Edit link.
    6. The form allows many things to be changed. For now, just input your desired password into both Password and Confirm Password boxes, and click the Save button.
    7. You can now perform other administrative tasks through the Conductor interface, or return to basic DBpedia use.
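
Because the database can take a few minutes to come online after step 2, a small polling loop can tell you when the Conductor is reachable before you try to log in. The following is a minimal sketch and not part of the AMI: `wait_until` and `conductor_ready` are hypothetical helper names, and it assumes `curl` is installed and that the Conductor answers on the default HTTP port of the local host.

```shell
# wait_until ATTEMPTS DELAY CMD [ARGS...]
# Re-runs CMD until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on success, 1 if every attempt failed.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Skip the pause after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Hypothetical probe: succeeds as soon as the HTTP server answers at all,
# whatever the status code (assumes the Conductor is served at this URL).
conductor_ready() {
    curl --silent --output /dev/null --max-time 5 "http://localhost/conductor"
}

# Example: poll every 15 seconds for up to 10 minutes.
# wait_until 40 15 conductor_ready && echo "Conductor is reachable"
```

The probe is passed in as a command, so the same loop can wrap any other readiness check you prefer.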

DBpedia Live (Virtuoso PAGO) Database Interaction via Web Interface

Once online, your DBpedia Live instance will be ready for use from —

Administering the Virtuoso Instance via SSH

All scripts for starting and stopping the Virtuoso instance are found in the following locations —

License Manager

The OpenLink License Manager must be launched before you launch the Virtuoso instance, and must remain running at all times for Virtuoso to run.

Virtuoso Server

Command-line Interaction with the Virtuoso Database Instance

  1. Set the Virtuoso environment variables by running the command below. Note: This does and must start with dot-space-slash.

    . /opt/virtuoso/virtuoso-enterprise.sh

  2. Run the Virtuoso "isql" command line tool to connect to the database. Note: your EC2 AMI's instance-id will be the dba user's password, until you change it (as recommended above).

    $ isql 1111 -U dba -P {Password}
    Connected to OpenLink Virtuoso
    Driver: 07.10.3214 OpenLink Virtuoso ODBC Driver
    OpenLink Interactive SQL (Virtuoso), version 0.9849b.
    Type HELP; for help and EXIT; to exit.
    SQL>

  3. Run the "tables" command to obtain a list of tables in the default schema

    SQL> tables;
    Showing SQLTables of tables like 'NULL.NULL.NULL', tabletype/colname like 'NULL'
    TABLE_QUALIFIER  TABLE_OWNER  TABLE_NAME                 TABLE_TYPE    REMARKS
    VARCHAR          VARCHAR      VARCHAR                    VARCHAR       VARCHAR
    _______________________________________________________________________________

    DB               DBA          ADMIN_SESSION              SYSTEM TABLE  NULL
    DB               DBA          ADM_OPT_ARRAY_TO_RS_PVIEW  SYSTEM TABLE  NULL
    DB               DBA          ADM_XML_VIEWS              SYSTEM TABLE  NULL
    . . .
    DB               DBA          SYS_SQL_INVERSE            SYSTEM TABLE  NULL
    DB               DBA          SYS_TRIGGERS               SYSTEM TABLE  NULL
    DB               DBA          SYS_VIEWS                  SYSTEM TABLE  NULL

    209 Rows. -- 1890 msec.
    SQL>

  4. You can stop the Virtuoso Database Server by running —

    virtuoso-stop.sh

  5. You can restart the Virtuoso Database Server by running —

    virtuoso-start.sh
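
For scripted checks, the interactive session above can be replaced by a one-shot call. This is a sketch under assumptions: `run_sql` is a hypothetical wrapper, it assumes the environment script from step 1 has already been sourced so that `isql` is on the PATH, and it relies on the positional `port user password` arguments and the `exec=` option of Virtuoso's isql. Check the tool's built-in help on your instance before relying on it.

```shell
# run_sql PASSWORD STATEMENT
# Runs one statement as the dba user against the local server on port 1111.
# The ISQL variable is an assumption of this sketch: it lets you point at a
# different binary and defaults to "isql".
run_sql() {
    "${ISQL:-isql}" 1111 dba "$1" exec="$2"
}

# Example: print the server status report.
# run_sql "{Password}" "status();"
```

Note that a password given on the command line is visible to other users of the machine in the process list, so this form is best kept to single-user instances.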

Enabling DBpedia Live Updates

The provided DBpedia Integrator utility program (dbpintegrator) downloads change-sets from the DBpedia live website, and processes them into the local Virtuoso instance on this AMI to keep your DBpedia datasets updated following changes to the Wikipedia.

To enable the DBpedia Live updates —

  1. Go to the /opt/virtuoso/dbpintegrator directory.

    -bash-4.2$ cd /opt/virtuoso/dbpintegrator

  2. Edit the file dbpedia_updates_downloader.ini and set the Store.pw parameter to the dba user's password. By default this is the AMI instance-id, so if you have changed that password as recommended, you will need to update this file with the new password.

  3. Run the command sudo sh update_ontology.sh once to check the setup and attempt to update the database with the latest ontology fixes.

    -bash-4.2$ sudo sh update_ontology.sh
    -bash-4.2$

  4. Run the command sudo sh update_changesets.sh to start loading the available change-sets.

    -bash-4.2$ sudo sh update_changesets.sh
    nohup: appending output to 'nohup.out'
    -bash-4.2$

    Note 1: The update_changesets.sh script is written to use the default dba password that is derived from the AMI instance-id. Thus, if you have changed that password as recommended, you will need to update the script with the same password.

    Note 2: The first time these change-sets are applied to your instance, it may take several hours or even days, depending on server resources and bandwidth, for all the change-sets to be loaded, and so for the DBpedia instance to be brought up to date and subsequently to obtain realtime updates from Wikipedia. You can monitor the Latest changes and Top 20 Most Recently Updated Entities sections of the live update web page (http://{amazon-ec2-ami-dns-name-or-ip-address}/live) to see the current state of the live update process.
  5. A web page for viewing the live updates to the AMI instance is available at http://{amazon-ec2-ami-dns-name-or-ip-address}/live where the updates can be viewed as they occur.

    AWS EC2 Launched Image
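
The password edit in step 2 can also be scripted rather than done in an editor. The function below is a minimal sketch: `set_store_pw` is a hypothetical name, and it assumes the setting appears in the file as a line beginning with `Store.pw` followed by `=`, so inspect your copy of dbpedia_updates_downloader.ini first.

```shell
# set_store_pw INI_FILE NEW_PASSWORD
# Rewrites the Store.pw line in place, leaving every other line untouched.
# Assumption: the setting appears as "Store.pw = value" at the start of a line.
set_store_pw() {
    ini=$1
    tmp=$(mktemp) || return 1
    # Pass the password through the environment so awk does not
    # interpret backslashes or other special characters in it.
    NEW_PW=$2 awk '
        /^Store\.pw[ \t]*=/ { print "Store.pw = " ENVIRON["NEW_PW"]; next }
        { print }
    ' "$ini" > "$tmp" && cat "$tmp" > "$ini"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example (needs write access to the file):
# set_store_pw /opt/virtuoso/dbpintegrator/dbpedia_updates_downloader.ini "{Password}"
```

Writing the result back with `cat` rather than moving the temporary file keeps the original file's ownership and permissions.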

If anything goes wrong, it will be logged in the associated log file dbpedia_dbms_errors.log; otherwise, update progress is written to dbp.log.
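
A quick way to check both log files at once is sketched below. `update_status` is a hypothetical helper, and the directory holding the logs is an assumption you pass in yourself, since the location is not stated here; /opt/virtuoso/dbpintegrator is a plausible first place to look.

```shell
# update_status LOG_DIR
# Prints "errors" and returns 1 if dbpedia_dbms_errors.log exists and is
# non-empty; otherwise prints "ok" and returns 0. In both cases the last
# few lines of the relevant log follow.
update_status() {
    dir=$1
    if [ -s "$dir/dbpedia_dbms_errors.log" ]; then
        echo "errors"
        tail -n 5 "$dir/dbpedia_dbms_errors.log"
        return 1
    fi
    echo "ok"
    if [ -f "$dir/dbp.log" ]; then
        tail -n 5 "$dir/dbp.log"
    fi
    return 0
}

# Example:
# update_status /opt/virtuoso/dbpintegrator
```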

Setting up a cron job

The Linux cron utility can be used to automatically (re)start the scripts by adding a few lines to the cron setup for the root user.

  1. Start the cron editor (based on vi) with —

    # crontab -e

  2. Navigate to the bottom of the file with the single keystroke, capital-G.
  3. Use the single keystroke, lowercase-O, to start a new line at the bottom, and add the following two lines (you can just copy-and-paste):

    @hourly /dbpedia/dbpintegrator/update_changesets.sh
    @daily /dbpedia/dbpintegrator/update_ontology.sh

  4. Save the edited file with the single keystroke, ESC, followed by the four-character string below, and ENTER:

    :wq!
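
If you would rather not edit the crontab interactively, the same two entries can be added from a script. This is a sketch only: `add_cron_lines` is a hypothetical helper that filters an existing crontab and appends any entry not already present, so running it twice does not duplicate lines. It assumes your cron implementation accepts a new crontab on standard input via `crontab -`.

```shell
# add_cron_lines ENTRY [ENTRY...]
# Reads an existing crontab on stdin and writes it to stdout, appending each
# argument as a new line unless an identical line is already in the input.
add_cron_lines() {
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    for entry in "$@"; do
        # -F: fixed string, -x: whole-line match, -q: no output.
        if ! printf '%s\n' "$existing" | grep -Fxq -- "$entry"; then
            printf '%s\n' "$entry"
        fi
    done
}

# Example (run as root); the "|| true" covers the case of no crontab yet.
# { crontab -l 2>/dev/null || true; } | add_cron_lines \
#     "@hourly /dbpedia/dbpintegrator/update_changesets.sh" \
#     "@daily /dbpedia/dbpintegrator/update_ontology.sh" | crontab -
```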

Performance Notes

Please be aware of the following, which impact the performance and utility of your AMI:

Troubleshooting

If you encounter any problems resolving the sample DBpedia URIs listed in the steps above, please:

  1. Determine whether Virtuoso is running, with this command

    ps -ef | grep "virt*" | grep -v grep

  2. Check the log of Virtuoso's most recent activity, with this command

    tail /dbpedia/*.log

The output of those commands will show you whether the initial Virtuoso DBpedia DB setup (which can take a while due to DB size) is still in progress, the setup encountered some error, or the setup has completed but Virtuoso awaits one of the following commands:

Related Items
