dataTXT Metacartridge Configuration Guidelines
Migration from Spaziodati
At the end of March 2014, the old Spaziodati Metacartridge was replaced by the dataTXT Metacartridge, reflecting the upstream change of service name. If you are new to dataTXT, you don't need to worry about migrating from Spaziodati; just proceed with the regular documentation below. Otherwise, note the following:
- The Spaziodati Metacartridge will be disabled and its description updated with a warning
- Your old appid and appkey parameters are no longer valid; you need to generate a new account with dataTXT
- Some old API parameters have undergone a slight change in meaning or been renamed
- Some new API parameters have been added
- The new dataTXT Metacartridge has reasonable defaults reflecting the upstream API; just generate and add your appid and appkey
- If you had specifically customized the options with Spaziodati, you will need to review the dataTXT settings.
References:
Getting Started with dataTXT
Overview
- Register and log in to dandelion/dataTXT
- From the dashboard, copy your appid and appkey
- In the Virtuoso Conductor, edit the dataTXT Metacartridge options and add the appid and appkey
- Review the other options while you're there.
Screenshots
Registering with dandelion: go to the dandelion/dataTXT registration page and fill in your details.
Log in
Go to the dashboard
The AppID and AppKey will be presented.
In another tab, open the Virtuoso Conductor (http://localhost:8890/conductor/).
Navigate the menus to Linked Data / Sponger / Metacartridges.
Ensure the dataTXT metacartridge is enabled, and click the apply button at the bottom of the list.
Edit the dataTXT metacartridge and set the options to taste, adding the AppID and AppKey from the dataTXT-NEX dashboard.
Metacartridge Options
app_id= app_key=
These identify you and the Sponger application with the dataTXT-NEX service.
include_types=true include_categories=false include_lod=false
These control links to DBpedia and LOD categories, where available.
parse_hashtag=true
This enables parsing of hashtags, e.g. in tweet source texts.
abstract=true
This allows the inclusion of an abstract of the text in the returned annotations.
min_confidence=0.6
A lower bound: entities with a confidence less than this will not be returned.
min_length=2
Entities whose spot is a string shorter than this number of characters will not be returned.
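The combined effect of the two thresholds above can be pictured with a small Python sketch. It is only an illustration of the filtering rule as described here, not the Sponger's or the dataTXT service's actual code; the `Annotation` class and `filter_annotations` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical stand-in for one entity returned by the service."""
    spot: str          # the matched text in the source document
    confidence: float  # score reported for the match


def filter_annotations(annotations, min_confidence=0.6, min_length=2):
    """Keep only annotations that meet both thresholds (illustrative)."""
    return [
        a for a in annotations
        if a.confidence >= min_confidence and len(a.spot) >= min_length
    ]


candidates = [
    Annotation("Virtuoso", 0.82),  # passes both thresholds
    Annotation("V", 0.90),         # spot shorter than min_length
    Annotation("data", 0.41),      # confidence below min_confidence
]
kept = filter_annotations(candidates)
print([a.spot for a in kept])
```

With the default values shown, only the first candidate survives; raising min_confidence or min_length narrows the result further.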
epsilon=0.3
Epsilon controls the balance between choosing contexts biased toward the local document (lower values) and more globally common contexts (higher values).
max-entities=50
An overall limit on the number of entities included per sponged document; entities are taken in order of decreasing confidence.
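Before pasting an options string into the Conductor, it can be handy to sanity-check it. The following Python sketch assumes the options are written as whitespace-separated key=value tokens, as in the listing above; `parse_options` and `check_options` are hypothetical helpers, and the 0 to 1 range assumed for min_confidence is an assumption rather than something stated on this page.

```python
def _coerce(raw):
    """Turn 'true'/'false' into bool, numerals into int/float, else keep text."""
    if raw in ("true", "false"):
        return raw == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def parse_options(text):
    """Parse whitespace-separated key=value tokens into a dict."""
    options = {}
    for token in text.split():
        key, sep, raw = token.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {token!r}")
        options[key] = _coerce(raw)
    return options


def check_options(options):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    for key in ("app_id", "app_key"):
        if not options.get(key):
            problems.append(f"{key} is empty")
    conf = options.get("min_confidence", 0.6)
    # Assumed range: confidence scores lie between 0 and 1.
    if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        problems.append("min_confidence should be a number between 0 and 1")
    return problems


opts = parse_options(
    "app_id= app_key= min_confidence=0.6 min_length=2 parse_hashtag=true"
)
print(check_options(opts))
```

Run against the defaults with the credentials still blank, the check reports the two empty fields, which is the state of a freshly installed metacartridge before the AppID and AppKey are added.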