Today, we've evolved from early literal tagging (which didn't scale an iota) to wide use of @handles and #hashtags.
Basically, @handles are social-network-specific HTTP URIs that denote Agents (People, Organizations, and Bots), while #hashtags are HTTP URIs that denote Topics. #hashtags are, in effect, the digital equivalents of nouns.
As a consequence, we are essentially trying to replicate the power of natural language sentences without critical components such as verbs (connectors) and adjectives (classifiers).

This is where RDF comes into play. It is an open-standards-based language for constructing digital sentences that pack the same power as (or even more than) their natural language equivalents. Through the power of RDF it is possible to create micro-annotations (aka Nanotations) that are embeddable in any kind of text-based document.

Naturally, the aforementioned claim doesn't apply to every RDF notation, which is why RDF-Turtle is the vehicle we've chosen to unleash the full power of RDF, and of the Semantic Web it enables, when digital sentences take the form of Linked Open Data.
## Turtle Start ## {Turtle-based-RDF-statements} ## Turtle End ##
As with natural language sentences, we have the following parts: subject, predicate, and object.
Each sentence part (subject, predicate, or object) is denoted (named or referred to) using an identifier (a word, phrase, or term). If you want to generate Linked Data that flows across data spaces, your best bet is to denote (refer to) the sentence subject, predicate, and (optionally) object using identifiers that function like terms -- that is, HTTP URIs.
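To make that concrete, here is such a digital sentence written out with full HTTP URIs (the rdf:type and FOAF terms are real vocabulary identifiers; the statements themselves are just an illustration of the form used throughout this post):
<> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Document> .
<> <http://xmlns.com/foaf/0.1/topic> <#Nanotation> .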
Use prefixes to shorten RDF-Turtle statements:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
This enables statements such as:
<> a foaf:Document .
<> a <#Document> . <> <#topic> <#Nanotation>.
<> a foaf:Document . <> foaf:topic <#Nanotation> .
<> a foaf:Document; foaf:topic <#Nanotation>.
<> a foaf:Document; foaf:topic <#Nanotation>, <#SemanticWeb>, <#LinkedData>.
<> a foaf:Document . <> foaf:maker [a foaf:Person; foaf:name "Kingsley Idehen" ] .
## Turtle Start ## <> a foaf:Document; foaf:topic <#Nanotation>, <#SemanticWeb>, <#LinkedData>. ## Turtle End ##
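Depending on whether a given nanotation processor pre-declares common prefixes such as foaf:, you may need to include the @prefix directive inside the block itself. A fully self-contained variant of the block above (same statements, prefix declared in place) would be:
## Turtle Start ##
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<> a foaf:Document ;
   foaf:topic <#Nanotation> , <#SemanticWeb> , <#LinkedData> .
## Turtle End ##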
A nanotation processor is an application or service that's capable of consuming text content enhanced with RDF-Turtle-based nanotations. Virtuoso's in-built Linked Data Transformation middleware (aka the "Sponger") is an example of an application that supports nanotations. Likewise, our URIBurner service, a free public service driven by an instance of Virtuoso with the Sponger module enabled:
http://{cname-of-virtuoso-host-machine}/sponger
http://linkeddata.uriburner.com/
http://{cname-of-virtuoso-host-machine}/about/html/{document-http-uri}
In all cases, you will end up with an HTML document that includes RDF statements that describe the processed document in a manner that also reveals all the embedded nanotations.
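As a sketch of what those statements might include, suppose a post at http://example.com/posts/1 (a hypothetical URL) carries the nanotation block shown above, and assume relative URIs are resolved against the post's own URL; the resulting description would then contain statements along these lines:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example.com/posts/1> a foaf:Document ;
    foaf:topic <http://example.com/posts/1#Nanotation> ,
               <http://example.com/posts/1#SemanticWeb> ,
               <http://example.com/posts/1#LinkedData> .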
The Sponger treats resources transferred over HTTP as a duality of both a container document and a primary entity. When a resource is deemed to be an HTML document, the document is treated as the primary entity. Otherwise, where the domain is well known, a custom extractor cartridge populates the primary entity with data arising from API calls, and the HTML content is regarded as secondary, relegated to the container document. For example, G+ posts are recognized, and the Sponger concentrates on presenting the timestamp, body, tags, links, and other features of a post.
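For illustration only (the URIs and properties below are hypothetical choices, not necessarily the Sponger's actual output vocabulary), that duality might be pictured like this:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix sioc: <http://rdfs.org/sioc/ns#> .
# The container document, as retrieved over HTTP
<https://plus.google.com/+SomeUser/posts/abc123> a foaf:Document ;
    foaf:primaryTopic <https://plus.google.com/+SomeUser/posts/abc123#this> .
# The primary entity, populated by the extractor cartridge from API calls
<https://plus.google.com/+SomeUser/posts/abc123#this> a sioc:Post ;
    sioc:content "Body of the post goes here." .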
When sponging an HTTP resource, multiple extractor cartridges might be brought to bear. Consequently, there may be multiple triples containing the entity's content.
The Turtle Sniffer is implemented as a Metacartridge, i.e., it runs after all the extractor cartridges have run, augmenting data in the graph. It uses SPARQL inference to collate predicates that constitute "content" for this purpose, along with the HTTP request content (if any), flattening each to plain text.
Currently, the list of potential content predicates is:
bibo:content (e.g., arising from the HTML+Variants extractor cartridge)
bibo:abstract
oplgplus:annotation (used by Google+ for text when sharing items)
For each of these content items, the Sniffer checks whether it matches one of the following patterns:
## Nanotation Start ## .... ## Nanotation End ##
## Turtle Start ## .... ## Turtle End ##
{.... } (note: only applies to tweets on Twitter)
If a content item contains one or more nanotation blocks, each block is parsed in turn as Turtle; if not, the Sniffer attempts to parse the content item as Turtle in its entirety.
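For example (a sketch: the post URI is hypothetical, and it is assumed here that relative URIs in the block resolve against the sponged resource's URI), a content item such as:
@prefix bibo: <http://purl.org/ontology/bibo/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example.com/posts/2> bibo:content "Great read! ## Turtle Start ## <> foaf:topic <#LinkedData> . ## Turtle End ##" .
would, once the embedded block is parsed, contribute a triple such as:
<http://example.com/posts/2> foaf:topic <http://example.com/posts/2#LinkedData> .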
The HTTP document content is only inspected if no triples have been extracted by prior means.
Optionally (enabled by default), each triple may be reified, i.e., an rdf:Statement entity is created to describe its subject, predicate, and object, so you can identify triples arising from nanotations as entities labelled 'Embedded Turtle Statement' plus a number in the graph.
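Such a reified statement uses the standard RDF reification vocabulary (rdf:Statement with rdf:subject, rdf:predicate, and rdf:object); as a sketch, with a hypothetical statement-node URI and rdfs:label assumed as the labelling property, the triple above might be described as:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<#stmt1> a rdf:Statement ;
    rdf:subject <http://example.com/posts/2> ;
    rdf:predicate foaf:topic ;
    rdf:object <http://example.com/posts/2#LinkedData> ;
    rdfs:label "Embedded Turtle Statement 1" .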
The Turtle Sniffer expands the patterns #word and @word when they appear within a URI (<>) or double quotes (""), in the context of the domain of the URI being sponged.
For example, a Tweet containing the nanotation:
## Nanotation Start ## <@kidehen> foaf:name "Kingsley Idehen" ; foaf:knows <@openlink> ; scot:has_tag <#Data> . ## Nanotation End ##
will be expanded to a Turtle string:
<https://twitter.com/kidehen> foaf:name "Kingsley Idehen" ; foaf:knows <https://twitter.com/openlink> ; scot:has_tag <https://twitter.com/hashtag/Data#this> .
We recognize custom URI formats for users and tags in the contexts of Facebook, Twitter, G+, LinkedIn, and Delicious.
Note that the word must appear within <> or double quotes; this is to avoid confusion with Turtle's @prefix directive (which is not a user!) and the problems that would be caused by performing similar expansions on words embedded in a longer quoted sentence.