    VOS.VirtDBpediaSnapshotPagoAmi (1.6) -- DAVWikiAdmin (WebDAV System Administrator), 2018-05-17 12:37:01

    DBpedia Snapshot (Virtuoso PAGO) EBS-backed EC2 AMI

    Introduction

    In addition to the Instance-backed EC2 AMI that has been available since 2008, a standard unpopulated Virtuoso instance is available as an EBS-backed EC2 AMI on either a BYOL (Bring Your Own License) or a PAGO (Pay As You Go) basis. In each case, the AMI delivers a preconfigured Virtuoso instance.

    We also now offer two PAGO variants, both pre-loaded with DBpedia data.

    • The DBpedia Snapshot (Virtuoso PAGO) (documented on this page) starts as a static instance, preloaded with the DBpedia 2016-10 dataset, mirroring the public DBpedia instance found at http://dbpedia.org/sparql. You can make changes to this data, but it will not track changes made to Wikipedia or DBpedia-Live.

    • The DBpedia Live (Virtuoso PAGO) (documented on another page) starts as a static instance, preloaded with the DBpedia 2016-04 dataset, and includes an optional switch that enables data updates based on the Wikipedia firehose, effectively giving you a mirror of the public DBpedia-Live instance found at http://live.dbpedia.org/sparql.

    This type of AMI provides several fundamental benefits including —

    • Virtuoso DBMS Server is preinstalled with basic tuning for the host operating system. (That said, since we support many AMI machine types/sizes, you should still tune the configuration to suit the available RAM in your instance.)
    • DBpedia Dataset is preloaded and preconfigured (and may be configurable to auto-update).
    • You can start and stop the DBpedia instance without having to terminate its host AMI.
    • With the hourly model, you pay only for the time the AMI is used.

    Prerequisites

    • An Amazon Web Services (AWS) account.
    • Recently created AWS accounts will have been automatically signed up for the Amazon S3 and EC2 Web Service. If you created your AWS account a long time ago, you may now need to manually sign up for these services.
    • Use an AWS security group that allows access to ports 22 (standard SSH), 80 (standard HTTP), and 8890 (Virtuoso HTTP-based Admin). (This is the setup of the AMI offerings.)
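
If a suitable security group does not exist yet, the three ports listed above could be opened with the AWS CLI. The following is only a sketch under stated assumptions: it presumes the AWS CLI is installed and configured, and the group ID and CIDR range passed in are hypothetical placeholders. The function prints the commands rather than running them, so they can be reviewed first.

```shell
# Print (not run) the AWS CLI commands that open the ports this AMI needs.
# The group ID and CIDR arguments are hypothetical placeholders; use your own values.
print_sg_rules() {
  local group_id="$1" cidr="$2" port
  for port in 22 80 8890; do   # SSH, HTTP, Virtuoso HTTP-based Admin
    echo "aws ec2 authorize-security-group-ingress --group-id ${group_id} --protocol tcp --port ${port} --cidr ${cidr}"
  done
}
```

For example, `print_sg_rules sg-0123456789abcdef0 203.0.113.0/24` lists the three commands; piping that output to `sh` would apply them.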

    Instantiating DBpedia Snapshot (Virtuoso PAGO) via Web Interface

    1. Locate the DBpedia Snapshot (Virtuoso PAGO) image in AWS Marketplace and click the Continue button.

      AWS Marketplace DBpedia Snapshot (Virtuoso PAGO)

    2. Choose a suitably sized EC2 Instance Type and a Key Pair, then click the Launch with 1-click button.

      AWS Marketplace DBpedia Snapshot (Virtuoso PAGO) Launch on EC2

    3. A confirmation dialog will be presented indicating the image has been deployed.

      AWS Marketplace DBpedia Snapshot (Virtuoso PAGO) now Deployed

    4. Check in the AWS Console EC2 images Web Interface that the image has been successfully instantiated.

      AWS EC2 Launched Image

    5. Load the Virtuoso Admin Console (a/k/a Conductor) in your browser to confirm all is running properly:

      http://{amazon-ec2-ami-dns-name-or-ip-address}:8890/conductor
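
The same check can be made from a terminal instead of a browser. This is a minimal sketch, assuming `curl` is available on your workstation and that any HTTP status below 400 means the Conductor is reachable; the function names are illustrative and not part of the AMI.

```shell
# Build the Conductor URL for a given host (port 8890, as stated above).
conductor_url() {
  echo "http://$1:8890/conductor"
}

# Succeed if the Conductor answers with an HTTP status from 200 to 399.
# -L follows redirects; --max-time keeps the check from hanging.
conductor_ok() {
  local code
  code="$(curl -s -L -o /dev/null --max-time 10 -w '%{http_code}' "$(conductor_url "$1")")"
  [ "$code" -ge 200 ] 2>/dev/null && [ "$code" -lt 400 ]
}
```

A call such as `conductor_ok {amazon-ec2-ami-dns-name-or-ip-address} && echo reachable` reports success only when the page responds.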

    First-time Setup & Usage Notes

    The steps in this section are only necessary the first time you start the Virtuoso instances on the AMI. They are not needed after subsequent AMI reboots.

    There are two Virtuoso instances in this AMI: one that comes up quickly, with no significant content, so you can confirm that the AMI is basically functional; and one with the full DBpedia dataset, which takes significant time to start due to some Amazon requirements for such AWS instances.

    Basic Instance

    1. ssh into your instantiated AMI using a command of the form —

      ssh -i {secure-pem-file} ec2-user@{amazon-ec2-dns-name-or-ip-address}

    2. The Virtuoso DBMS Server for the Basic Instance will have started with the AMI. You can verify this with --

      ps -ef | grep "virt*" | grep -v grep

      • If you do not see a running instance, execute the following commands, and then repeat the command above.

        cd /opt/virtuoso
        . ./virtuoso-environment.sh   # this must start dot-space-dot-slash
        virtuoso-start.sh database

    3. We strongly recommend you now use the Conductor to change the password for the 'dba' user from the AMI instance-id.

      1. Retrieve the AMI instance-id by either --
        • checking the AMI properties presented by the Amazon AWS console UI --

          AWS Console

        • executing the following command in the Linux shell --

          curl http://169.254.169.254/latest/meta-data/instance-id

      2. Execute this command sequence --

        cd /opt/virtuoso
        . ./virtuoso-environment.sh   # this must start dot-space-dot-slash
        cd bin
        ./isql localhost:1112 dba

      3. When prompted for password, provide the instance-id
      4. At the SQL> prompt, execute these two commands --

        vad_install('/opt/virtuoso/vad/conductor_dav.vad',0);
        quit;

      5. Load the Conductor interface.

        http://{amazon-ec2-ami-dns-name-or-ip-address}:8890/conductor

        • If you get any error at this point, execute the following commands, and then re-try loading the Conductor in your web browser.

          cd /opt/virtuoso
          . ./virtuoso-environment.sh   # this must start dot-space-dot-slash
          virtuoso-stop.sh database
          virtuoso-start.sh database

      6. At the authentication challenge, log in as the dba user, with the AMI instance-id as the password. Note: If unable to connect to the Virtuoso server using the instance-id as password, please register with our Support Site, and create a Support Case for fastest assistance.
      7. Drill down to System Admin → User Accounts.
      8. Locate the dba user, and click the associated Edit link.
      9. The form allows many things to be changed. For now, just input your desired password into both Password and Confirm Password boxes, and click the Save button.
      10. You can now perform other administrative tasks through the Conductor interface, or return to basic use.
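
Because the initial dba password is the instance-id, it can help to fetch and sanity-check that value in one step. The sketch below wraps the metadata request shown above; the function names are illustrative, and it assumes the instance still answers plain metadata requests (instances that require IMDSv2 session tokens would need an extra token request, which is not shown).

```shell
# Check that a string looks like an EC2 instance-id:
# "i-" followed by 8 (older) or 17 (current) lowercase hexadecimal characters.
is_instance_id() {
  printf '%s\n' "$1" | grep -Eq '^i-([0-9a-f]{8}|[0-9a-f]{17})$'
}

# Fetch the instance-id from the metadata service used in the steps above.
# Must be run on the EC2 instance itself; prints nothing and fails otherwise.
get_instance_id() {
  local id
  id="$(curl -s --max-time 5 http://169.254.169.254/latest/meta-data/instance-id)" || return 1
  is_instance_id "$id" && printf '%s\n' "$id"
}
```

Running `get_instance_id` in the SSH session prints the value to use as the initial password.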

    DBpedia Instance

    1. ssh into your instantiated AMI using —

      ssh -i {secure-pem-file} ec2-user@{ec2-dns-name-or-ip-address}

    2. Stop the Basic Instance (to conserve system and license resources) by running —

      virtuoso-stop.sh database

    3. Start the Virtuoso DBMS Server against the DBpedia Database by issuing the commands below. Note: At initial launch, it takes the Virtuoso DBMS Server approximately 20 minutes to bring the DBpedia database online, due to its size.

      cd /opt/virtuoso
      . ./virtuoso-environment.sh   # this must start dot-space-dot-slash
      virtuoso-start.sh dbpedia

    4. We strongly recommend you now use the Conductor to change the password for the 'dba' user from the AMI instance-id.

      1. Retrieve the AMI instance-id by either --
        • checking the AMI properties presented by the Amazon AWS console UI --

          AWS Console

        • executing the following command in the Linux shell --

          curl http://169.254.169.254/latest/meta-data/instance-id

      2. Load the Conductor interface

        http://{amazon-ec2-ami-dns-name-or-ip-address}/conductor

        • If you get any error at this point, execute the following commands, and then re-try loading the Conductor in your web browser.

          cd /opt/virtuoso
          . ./virtuoso-environment.sh   # this must start dot-space-dot-slash
          virtuoso-stop.sh dbpedia
          virtuoso-start.sh dbpedia

      3. At the authentication challenge, log in as the dba user, with the AMI instance-id as the password. Note: If unable to connect to the Virtuoso server using the instance-id as password, please register with our Support Site, and create a Support Case for fastest assistance.
      4. Drill down to System Admin → User Accounts.
      5. Locate the dba user, and click the associated Edit link.
      6. The form allows many things to be changed. For now, just input your desired password into both Password and Confirm Password boxes, and click the Save button.
      7. You can now perform other administrative tasks through the Conductor interface, or return to basic DBpedia use.
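
Since the DBpedia database takes roughly 20 minutes to come online at initial launch, a polling loop can save repeated manual checks. This is a sketch only: `wait_until` and `sparql_up` are hypothetical helper names, and it assumes that an error-free HTTP response from the /sparql path means the server is ready.

```shell
# Retry a command until it succeeds or the attempt budget is used up.
# Usage: wait_until <attempts> <sleep-seconds> <command...>
wait_until() {
  local attempts="$1" pause="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Pause between attempts, but not after the final one.
    [ "$i" -lt "$attempts" ] && sleep "$pause"
    i=$((i + 1))
  done
  return 1
}

# Probe: succeeds once the SPARQL endpoint answers without an HTTP error.
sparql_up() {
  curl -s -f -o /dev/null --max-time 10 "http://$1/sparql"
}
```

For instance, `wait_until 30 60 sparql_up {amazon-ec2-ami-dns-name-or-ip-address}` would probe once a minute for up to about half an hour.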

    DBpedia Snapshot (Virtuoso PAGO) Database Interaction via Web Interface

    Once online, your DBpedia Snapshot instance will be ready for use from —

    • Basic Linked Data Exploration Page — an obvious starting point

      http://{amazon-ec2-ami-dns-name-or-ip-address}/resource/DBpedia

    • Faceted Browsing Endpoint

      http://{amazon-ec2-ami-dns-name-or-ip-address}/fct

    • Advanced Faceted Browsing Page

      http://{amazon-ec2-ami-dns-name-or-ip-address}/describe/?uri=http://dbpedia.org/resource/DBpedia

    • SPARQL Query Service Endpoint

      http://{amazon-ec2-ami-dns-name-or-ip-address}/sparql

    • Virtuoso Instance Administration Page (Virtuoso Conductor)

      http://{amazon-ec2-ami-dns-name-or-ip-address}/conductor
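
The SPARQL endpoint can also be queried from the command line. The following is a minimal sketch, assuming `curl` is installed; `run_sparql` and `list_endpoints` are illustrative helper names rather than anything shipped with the AMI.

```shell
# Run a SPARQL query against the instance's endpoint and print JSON results.
# -G sends the url-encoded data as a GET query string.
run_sparql() {
  local host="$1" query="$2"
  curl -s -G "http://${host}/sparql" \
    -H 'Accept: application/sparql-results+json' \
    --data-urlencode "query=${query}"
}

# Print the entry points listed above for a given host.
list_endpoints() {
  local host="$1" path
  for path in /resource/DBpedia /fct \
      '/describe/?uri=http://dbpedia.org/resource/DBpedia' /sparql /conductor; do
    echo "http://${host}${path}"
  done
}
```

A small query keeps the first test quick, for example: `run_sparql {amazon-ec2-ami-dns-name-or-ip-address} 'SELECT ?p ?o WHERE { <http://dbpedia.org/resource/DBpedia> ?p ?o } LIMIT 5'`.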

    Administering the Virtuoso Instance via SSH

    All scripts for starting and stopping the Virtuoso instance are found in the following locations —

    • /etc/rc.d/init.d — scripts that enable automatic database server instantiation at operating system (AMI) boot or reboot time

    • /opt/virtuoso — scripts for starting and stopping the database server within a running operating system (AMI)

    License Manager

    The OpenLink License Manager must be launched before you launch the Virtuoso instance, and must remain running at all times for Virtuoso to run.

    • Start the License Manager

      /etc/rc.d/init.d/oplmgr start

    • Stop the License Manager

      /etc/rc.d/init.d/oplmgr stop

    • Restart the License Manager

      /etc/rc.d/init.d/oplmgr restart

    Virtuoso Server

    • Start the Basic Virtuoso Server

      /etc/rc.d/init.d/virtuoso start

    • Stop the Basic Virtuoso Server

      /etc/rc.d/init.d/virtuoso stop

    • Restart the Basic Virtuoso Server

      /etc/rc.d/init.d/virtuoso restart

    Virtuoso DBpedia Server

    • Start the DBpedia Virtuoso Server

      cd /opt/virtuoso
      . ./virtuoso-environment.sh   # this must start dot-space-dot-slash
      virtuoso-start.sh dbpedia

    • Stop the DBpedia Virtuoso Server

      cd /opt/virtuoso
      . ./virtuoso-environment.sh   # this must start dot-space-dot-slash
      virtuoso-stop.sh dbpedia

    • Restart the DBpedia Virtuoso Server

      cd /opt/virtuoso
      . ./virtuoso-environment.sh   # this must start dot-space-dot-slash
      virtuoso-stop.sh dbpedia
      virtuoso-start.sh dbpedia
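
The reason the environment script must be invoked with a leading dot and space is that it has to run in the current shell for its variables to persist. The self-contained demonstration below illustrates the difference; `env-demo.sh` and `DEMO_HOME` are hypothetical stand-ins and do not reflect the contents of the real virtuoso-environment.sh.

```shell
# Demonstrate why an environment script must be sourced rather than executed.
# env-demo.sh and DEMO_HOME are hypothetical stand-ins for the real script.
demo_dir="$(mktemp -d)"
printf 'DEMO_HOME=/opt/virtuoso\nexport DEMO_HOME\n' > "$demo_dir/env-demo.sh"

unset DEMO_HOME
sh "$demo_dir/env-demo.sh"          # runs in a child process; the variable is lost
after_exec="${DEMO_HOME:-unset}"

. "$demo_dir/env-demo.sh"           # runs in the current shell; the variable persists
after_source="${DEMO_HOME:-unset}"

echo "executed: $after_exec, sourced: $after_source"
rm -rf "$demo_dir"
```

Only the sourced form leaves the variable set, which is why the start and stop scripts are found after the dot-space invocation.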

    Command-line Interaction with the Virtuoso Database Instance

    1. Set the Virtuoso environment variables by running the command below. Note: The command must start with dot-space-slash, so that the script is sourced into the current shell.

      . /opt/virtuoso/virtuoso-environment.sh

    2. Run the Virtuoso "isql" command line tool to connect to the database. Note: your EC2 AMI's instance-id will be the dba user's password, until you change it (as recommended above).

      $ isql 1111 -U dba -P {Password}
      Connected to OpenLink Virtuoso
      Driver: 07.10.3214 OpenLink Virtuoso ODBC Driver
      OpenLink Interactive SQL (Virtuoso), version 0.9849b.
      Type HELP; for help and EXIT; to exit.
      SQL>

    3. Run the "tables" command to obtain a list of tables in the default schema

      SQL> tables;
      Showing SQLTables of tables like 'NULL.NULL.NULL', tabletype/colname like 'NULL'
      TABLE_QUALIFIER  TABLE_OWNER  TABLE_NAME                 TABLE_TYPE    REMARKS
      VARCHAR          VARCHAR      VARCHAR                    VARCHAR       VARCHAR
      _______________________________________________________________________________
      DB               DBA          ADMIN_SESSION              SYSTEM TABLE  NULL
      DB               DBA          ADM_OPT_ARRAY_TO_RS_PVIEW  SYSTEM TABLE  NULL
      DB               DBA          ADM_XML_VIEWS              SYSTEM TABLE  NULL
      . . .
      DB               DBA          SYS_SQL_INVERSE            SYSTEM TABLE  NULL
      DB               DBA          SYS_TRIGGERS               SYSTEM TABLE  NULL
      DB               DBA          SYS_VIEWS                  SYSTEM TABLE  NULL

      209 Rows. -- 1890 msec.
      SQL>

    4. You can stop the Virtuoso Database Server by running —

      virtuoso-stop.sh dbpedia

    5. You can restart the Virtuoso Database Server by running —

      virtuoso-start.sh dbpedia
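
For scripted use, a single statement can be passed to isql without opening an interactive session. The sketch below assumes the positional `port user password` argument form together with isql's `exec=` argument, and that the environment script has already been sourced so `isql` is on the PATH; `isql_exec` is a hypothetical wrapper name.

```shell
# Run one statement through isql without an interactive session.
# ISQL can be overridden (for example ISQL=echo) to preview the arguments.
isql_exec() {
  local password="$1" statement="$2"
  "${ISQL:-isql}" 1111 dba "$password" exec="$statement"
}
```

A call might look like `isql_exec {Password} 'SPARQL SELECT ?p ?o WHERE { <http://dbpedia.org/resource/DBpedia> ?p ?o } LIMIT 5;'`. Note that a password given on the command line may be visible to other users in the process list.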

    Performance Notes

    Please be aware of the following, which impact the performance and utility of your AMI:

    • This AMI includes a bundled Virtuoso license which enables 10 Database Sessions and the use of 4 logical processors. Licenses that upgrade these attributes are available as paid upgrade options.

    • Virtuoso always takes full advantage of the memory it's configured to use, which may be much less than is available in its host environment/AMI. This AMI is pre-configured for an m3.large EC2 Instance Type, so it will use 7 GB of RAM. If you choose a larger EC2 Instance Type (such as the recommended m3.2xlarge), then the NumberOfBuffers and MaxDirtyBuffers parameters in the /opt/virtuoso/dbpedia/dbpedia.ini configuration file should be increased to correspond to the chosen Instance Type's available memory, as detailed in the Virtuoso Performance Tuning Guide. A few examples are shown below. After changing these or any other settings in the INI file, restart the Virtuoso server as described above.

      EC2 Instance Type   System RAM   Number Of Buffers   Max Dirty Buffers
      m3.large            7 GB         680000              500000
      r3.large            15 GB        1360000             1000000
      r3.xlarge           30.5 GB      2720000             2000000
      r3.2xlarge          61 GB        5450000             4000000


    • There is a wide range of AMI choices, offering various combinations of system memory and logical processors. To improve performance, use an EC2 Instance Type with more memory and more logical processors. To make use of additional processors, you will also need to acquire an upgraded Virtuoso license.
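
The buffer settings from the table above can be applied to the INI file with a small script. This is a sketch under stated assumptions: the helper names are hypothetical, the lookup covers only the four Instance Types listed in the table, and the INI file is assumed to contain uncommented `Key = value` lines for both parameters. Back up the INI file before editing it.

```shell
# Look up the buffer settings from the table above for an EC2 Instance Type.
# Prints "NumberOfBuffers MaxDirtyBuffers"; fails for types the table does not list.
buffers_for() {
  case "$1" in
    m3.large)   echo "680000 500000" ;;
    r3.large)   echo "1360000 1000000" ;;
    r3.xlarge)  echo "2720000 2000000" ;;
    r3.2xlarge) echo "5450000 4000000" ;;
    *) return 1 ;;
  esac
}

# Replace the value of an existing, uncommented "Key = value" line in an INI file.
# Writes to a temporary file first so a failed edit leaves the original intact.
set_ini_value() {
  local file="$1" key="$2" value="$3" tmp status
  tmp="$(mktemp)" || return 1
  sed "s/^${key}[[:space:]]*=.*/${key} = ${value}/" "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

On an r3.large, for example, `set -- $(buffers_for r3.large)` followed by `set_ini_value /opt/virtuoso/dbpedia/dbpedia.ini NumberOfBuffers "$1"` and `set_ini_value /opt/virtuoso/dbpedia/dbpedia.ini MaxDirtyBuffers "$2"` would update both parameters, after which the server needs a restart.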

    Troubleshooting

    If you encounter any problems resolving the sample DBpedia URIs listed in the steps above, please:

    1. Determine whether Virtuoso is running, with this command

      ps -ef | grep "virt*" | grep -v grep

    2. Check the log of Virtuoso's most recent activity with one of these commands
      • for the DBpedia instance

        tail /dbpedia/*.log

      • for the basic instance

        tail /opt/virtuoso/dbpedia/*.log

        The output of those commands will show you whether the initial Virtuoso DBpedia DB setup (which can take a while due to DB size) is still in progress, the setup encountered some error, or the setup has completed but Virtuoso awaits one of the following commands:
    • Startup Commands
      • Startup of the Basic Instance

        cd /opt/virtuoso
        . ./virtuoso-environment.sh   # this must start dot-space-dot-slash
        virtuoso-start.sh database

      • Startup of the DBpedia Instance

        cd /opt/virtuoso
        . ./virtuoso-environment.sh   # this must start dot-space-dot-slash
        virtuoso-start.sh dbpedia

    • Restart commands
      • Restart the Basic Instance

        cd /opt/virtuoso
        . ./virtuoso-environment.sh   # this must start dot-space-dot-slash
        virtuoso-stop.sh database
        virtuoso-start.sh database

      • Restart the DBpedia Instance

        cd /opt/virtuoso
        . ./virtuoso-environment.sh   # this must start dot-space-dot-slash
        virtuoso-stop.sh dbpedia
        virtuoso-start.sh dbpedia
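
The first two troubleshooting steps can be combined into one helper. This is a sketch only: the function names are hypothetical, and it assumes the server's process name contains the string "virtuoso", which is narrower than the `virt*` pattern used above.

```shell
# Filter `ps -ef` output (read from stdin) down to Virtuoso-looking processes.
# The bracket in the pattern keeps grep from matching its own command line.
filter_virtuoso() {
  grep '[v]irtuoso'
}

# Report whether a Virtuoso process is running, then show the tail of any
# log files passed as arguments.
diagnose() {
  if ps -ef | filter_virtuoso >/dev/null; then
    echo "Virtuoso process found"
  else
    echo "No Virtuoso process found"
  fi
  if [ "$#" -gt 0 ]; then
    tail -n 20 "$@"
  fi
}
```

For example, `diagnose /dbpedia/*.log` reports the process state and prints the last lines of the DBpedia instance's logs.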
