dataTXT Metacartridge Configuration Guidelines

Migration from Spaziodati

Since the end of March 2014, the old Spaziodati Metacartridge has been replaced with the dataTXT Metacartridge to reflect the upstream change of service name. If you are new to dataTXT, you do not need to migrate from Spaziodati; just follow the regular documentation below. Otherwise,

References:

Getting Started with dataTXT

Overview

  1. Register and log in to dandelion/dataTXT.
  2. From the dashboard, copy your AppID and AppKey.
  3. In the Virtuoso Conductor, edit the dataTXT Metacartridge options and add the AppID and AppKey.
  4. Review the other options while you are there.

Screenshots

Registering with dandelion: go to the dandelion/dataTXT registration page and fill in your details.

Log in

Go to the dashboard

The AppID and AppKey will be presented.

In another tab, open the Virtuoso Conductor (http://localhost:8890/conductor/).

Navigate the menus to Linked Data / Sponger / Metacartridges.

Ensure the dataTXT metacartridge is enabled, and click the Apply button at the bottom of the list.

Edit the dataTXT metacartridge and set the options to taste, adding the AppID and AppKey from the dataTXT-NEX dashboard.

Metacartridge Options


app_id=
app_key=

These identify you and the Sponger application with the dataTXT-NEX service.
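Since the options are plain key=value lines, a quick check that both credentials have been filled in can be sketched as follows. This is only an illustration: `parse_options` and `missing_credentials` are hypothetical helper names, not part of Virtuoso or dataTXT, and the Conductor does not require you to run anything like this.

```python
def parse_options(text):
    """Parse key=value lines into a dict, skipping blank lines."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        options[key.strip()] = value.strip()
    return options


def missing_credentials(options):
    """Return the credential keys that are absent or empty."""
    return [key for key in ("app_id", "app_key") if not options.get(key)]


# The options exactly as shown above, with nothing filled in yet.
opts = parse_options("app_id=\napp_key=\n")
print(missing_credentials(opts))  # ['app_id', 'app_key']
```

An empty list from `missing_credentials` would mean both values are present; it says nothing about whether the service will accept them.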


include_types=true
include_categories=false
include_lod=false

These control whether links to DBpedia and LOD categories are included where possible.


parse_hashtag=true

This enables parsing of hashtags, e.g. in tweet source texts.


abstract=true

This allows the inclusion of an abstract of the text in the returned annotations.


min_confidence=0.6

A lower bound: entities with a confidence less than this will not be returned.


min_length=2

Entities whose spot is a string shorter than this number of characters will not be returned.
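The combined effect of min_confidence and min_length can be pictured with a small sketch. It assumes an entity is reduced to just its spot and a confidence score; the `Entity` class and `keep_entity` function are hypothetical names for illustration, and the real filtering is done by the dataTXT-NEX service, not by code like this.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    spot: str          # the text span the entity was matched on
    confidence: float  # score assigned to the annotation


def keep_entity(entity, min_confidence=0.6, min_length=2):
    """Return True if the entity passes both thresholds."""
    return (
        entity.confidence >= min_confidence
        and len(entity.spot) >= min_length
    )


entities = [Entity("Rome", 0.82), Entity("a", 0.95), Entity("city", 0.41)]
kept = [e.spot for e in entities if keep_entity(e)]
print(kept)  # ['Rome']
```

Here "a" is dropped because its spot is shorter than min_length, and "city" is dropped because its confidence is below min_confidence.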


epsilon=0.3

Epsilon controls the balance between choosing contexts biased toward the local document (low values) and more globally common contexts (higher values).


max-entities=50

An overall limit on the number of entities (by decreasing confidence) to be included per document being sponged.
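As a rough illustration of this limit, the sketch below ranks entities by decreasing confidence and keeps only the first few. It assumes entities are represented as (spot, confidence) pairs; `limit_entities` is a hypothetical name, not the cartridge's actual code.

```python
def limit_entities(entities, max_entities=50):
    """Keep the max_entities highest-confidence (spot, confidence) pairs."""
    ranked = sorted(entities, key=lambda pair: pair[1], reverse=True)
    return ranked[:max_entities]


sample = [("Rome", 0.82), ("Italy", 0.91), ("Tiber", 0.67)]
top_two = limit_entities(sample, max_entities=2)
print(top_two)  # [('Italy', 0.91), ('Rome', 0.82)]
```

With the default of 50, a document yielding fewer entities than the limit is returned in full, just reordered by confidence.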