Faceted Browsing Service

What

The Virtuoso Faceted Browser service is a general-purpose RDF data query facility for faceted browsing over entity relationship types (i.e., relations). It takes an XML or JSON description of the desired view and returns the requested data as an XML or JSON document. When XML documents are returned, the user agent or a local web page can use XSLT to render them for end users. Likewise, JavaScript-based applications can use JSON-oriented tools to transform JSON documents as part of the user experience.

Why

The selection of facets and values is represented as an XML tree. Such a representation is easier to process in an application than the SPARQL source text or a parse tree of SPARQL, and more compactly captures the specific subset of SPARQL needed for faceted browsing. The Web service also returns the SPARQL source text, which can serve as a basis for hand-crafted queries.

How

The top element of the tree is <query>. It must be in namespace "http://openlinksw.com/services/facets/1.0/".

The <query> element has the following attributes:

Attribute
    Description

graph="graph_iri"
    The installed default is to search in all graphs, but system defaults may override this.

timeout="no_of_msec"
    The installed default is no timeout, but system defaults may override this.

inference="name"
    name is the name of an inference context declared with rdfs_rule_set.

same-as="boolean"
    If "boolean" is "yes", then owl:sameAs links will be considered in the query evaluation.
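As a minimal sketch of how a client might assemble the top element with these attributes, the following uses Python's standard xml.etree.ElementTree. The helper name make_query, its parameter names, and the example graph IRI are illustrative assumptions, not part of the service.

```python
import xml.etree.ElementTree as ET

FCT_NS = "http://openlinksw.com/services/facets/1.0/"

# Serialize facet elements with a default namespace instead of an ns0: prefix.
ET.register_namespace("", FCT_NS)


def make_query(graph=None, timeout_msec=None, inference=None, same_as=False):
    """Build an empty <query> element carrying only the attributes that were given.

    make_query and its parameter names are hypothetical; only the XML
    attribute names come from the attribute list above.
    """
    query = ET.Element(f"{{{FCT_NS}}}query")
    if graph is not None:
        query.set("graph", graph)
    if timeout_msec is not None:
        query.set("timeout", str(timeout_msec))
    if inference is not None:
        query.set("inference", inference)
    if same_as:
        query.set("same-as", "yes")
    return query


# The graph IRI below is a made-up example value.
query = make_query(graph="http://example.org/g", timeout_msec=5000, same_as=True)
xml_text = ET.tostring(query, encoding="unicode")
print(xml_text)
```

Attributes that are left out are simply not emitted, so the installed or system defaults apply.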

The result is a tree of the form:


<facets xmlns="http://openlinksw.com/services/facets/1.0/">
  <result>
    <row>
      <column datatype="..." shortform="..." xml:lang="...">...</column>
    </row>
  </result> 
  <time>msecs</time>
  <complete>yes or no</complete>
  <db-activity>resource use string</db-activity>
  <sparql>sparql statement text</sparql>
</facets>

By convention, the first column is the subject selected by the view element, typically a URI; the second is a label of the URI; and the third, if present, is either a count or a search summary.

The first column's text child is the text form of the value. The column element has the following attributes qualifying this further:

Attribute
    Description

datatype
    The xsd type of the value. If this is a URI, the datatype is "uri".

shortform
    If the value is a URI, this is an abbreviated form where known namespaces are replaced with their prefixes, and very long URIs are middle-truncated, preserving start and end.

xml:lang
    If the value is a language-tagged string, this is the language.
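To illustrate how a client might read this result tree, here is a small sketch using Python's standard library. The SAMPLE document, its values, and the function name parse_facets are made up for illustration; only the element and attribute names follow the structure described above.

```python
import xml.etree.ElementTree as ET

FCT_NS = "http://openlinksw.com/services/facets/1.0/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample reply; real replies come from the service.
SAMPLE = """<facets xmlns="http://openlinksw.com/services/facets/1.0/">
  <result>
    <row>
      <column datatype="uri" shortform="ex:alice">http://example.org/alice</column>
      <column xml:lang="en">Alice</column>
      <column datatype="http://www.w3.org/2001/XMLSchema#integer">3</column>
    </row>
  </result>
  <time>12</time>
  <complete>yes</complete>
</facets>"""


def parse_facets(xml_text):
    """Return (rows, complete); each row is a list of per-column dicts."""
    root = ET.fromstring(xml_text)
    ns = {"f": FCT_NS}
    rows = []
    for row in root.findall("f:result/f:row", ns):
        rows.append([
            {
                "value": col.text or "",
                "datatype": col.get("datatype"),
                "shortform": col.get("shortform"),
                "lang": col.get(XML_LANG),
            }
            for col in row.findall("f:column", ns)
        ])
    complete = root.findtext("f:complete", default="", namespaces=ns).strip() == "yes"
    return rows, complete


rows, complete = parse_facets(SAMPLE)
for subject, label, count in rows:
    print(subject["shortform"], label["value"], count["value"])
```

Following the column convention, the first dict in each row is the subject, the second its label, and the third (when present) a count or search summary.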

The top-level element of a query is <query>. Its child elements represent conditions pertaining to a single subject. A join is expressed with the property or property-of element, whose children in turn state conditions on a property of the first subject. property and property-of elements can be nested to arbitrary depth, and many can occur inside one containing element. In this way, tree-shaped structures of joins can be expressed.
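The way nested property and property-of elements form a tree of joins can be sketched as follows. This is only an illustration of the idea, assuming that property links the subject in scope to a new object and property-of links a new subject to the subject in scope; it is not the service's actual SPARQL generator, and the function name join_patterns and the variable names ?s1, ?s2, ... are invented.

```python
import itertools
import xml.etree.ElementTree as ET

# Made-up query using FOAF IRIs purely as example data.
SAMPLE_QUERY = """<query xmlns="http://openlinksw.com/services/facets/1.0/">
  <property iri="http://xmlns.com/foaf/0.1/knows">
    <property-of iri="http://xmlns.com/foaf/0.1/maker"/>
  </property>
  <property iri="http://xmlns.com/foaf/0.1/name"/>
</query>"""


def join_patterns(query_xml):
    """Return illustrative triple patterns for the property / property-of tree."""
    root = ET.fromstring(query_xml)
    counter = itertools.count(2)  # ?s1 is reserved for the top-level subject
    patterns = []

    def walk(node, subject):
        for child in node:
            local = child.tag.split("}", 1)[-1]  # strip the namespace part
            if local not in ("property", "property-of"):
                continue
            other = f"?s{next(counter)}"
            iri = child.get("iri")
            if local == "property":
                # The subject in scope has this property; its value is the new variable.
                patterns.append(f"{subject} <{iri}> {other} .")
            else:
                # The new variable has this property; its object is the subject in scope.
                patterns.append(f"{other} <{iri}> {subject} .")
            walk(child, other)  # nested conditions apply to the new variable

    walk(root, "?s1")
    return patterns


patterns = join_patterns(SAMPLE_QUERY)
print("\n".join(patterns))
```

Each nesting level introduces one new variable, so sibling elements share a subject while nested elements chain from the variable introduced by their parent.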

Expressing more complex relationships, such as intermediate grouping, subqueries, or arithmetic, requires writing the query in SPARQL. The XML format is a shorthand for easy automatic composition of the queries needed for showing facets, not a replacement for SPARQL.

A facet query contains a single view element. This specifies which subject of the joined subjects is shown. Its attributes specify the manner of viewing, e.g., a list of distinct values, a list of distinct values with occurrence counts, a list of properties or classes of the selected subjects, etc.

The top query element or any property or property-of element can have the following types of children:

Type of children
    Description

<text property="iri">text pattern</text>
    The subject has an O that matches the text pattern. If property is given, the text pattern must occur in a value of this property. If not specified, any property will do. The value "none" for property is the same as not specifying a property. This is restricted to occurring directly under the top level query element.

<class iri="iri" inference="ctx_name" />
    The S must be an instance of this class. If inference is specified, then option (input:inference "ctx_name") is added and applies to this pattern alone.

<property iri="iri" same_as="yes" inference="ctx_name">
    The child elements of this are conditions that apply to the value of this property of the S that is in scope in the enclosing <query> or <property> element. If same_as is present, then option (input:same-as "yes") is added to the triple pattern which specifies this property. If inference is present, then option (input:inference "ctx_name") is added to the triple pattern for the property.

<property-of iri="iri" same_as="yes" inference="ctx_name">
    The child elements of this are conditions that apply to an S which has property "iri" whose object is the S in scope in the enclosing <query> or <property> element. The options are otherwise the same as with property.

<value datatype="type" xml:lang="lng" op="= | < | > | >= | <=">value</value>
    When this occurs inside <property> or <property-of>, the property in scope has the specified relation to the value. type and language can be used for XML typed or language-tagged literals. The "uri" type means that the value is a qualified name of a URI. If this occurs directly under the <query> element, the query starts with a fixed subject. In that case, there must be property or property-of elements, or the view element must specify properties or classes; list is not allowed as a view type. This is because the query must have at least one triple pattern.

<view type="view" limit="n" offset="n">
    This may occur only once inside a <query> element, either at top level or inside property or property-of elements. It specifies which subject is presented in the result set.
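The child elements above can be combined into a complete query tree. The sketch below builds one with Python's standard library: the selected subjects are instances of a class that have a property whose value in turn has a name equal to a literal, and the view is placed on the inner subject. The FOAF IRIs, the literal "Alice", and the helper name add are example assumptions only.

```python
import xml.etree.ElementTree as ET

FCT_NS = "http://openlinksw.com/services/facets/1.0/"
FOAF = "http://xmlns.com/foaf/0.1/"  # example vocabulary, not required by the service

# Serialize facet elements with a default namespace instead of an ns0: prefix.
ET.register_namespace("", FCT_NS)


def add(parent, name, attrs=None, text=None):
    """Append a child element in the facets namespace and return it (hypothetical helper)."""
    child = ET.SubElement(parent, f"{{{FCT_NS}}}{name}", attrs or {})
    child.text = text
    return child


query = ET.Element(f"{{{FCT_NS}}}query")

# Condition on the top-level subject: it must be a foaf:Person.
add(query, "class", {"iri": FOAF + "Person"})

# Join: the top-level subject foaf:knows some other subject ...
knows = add(query, "property", {"iri": FOAF + "knows"})

# ... whose foaf:name equals the literal "Alice".
name = add(knows, "property", {"iri": FOAF + "name"})
add(name, "value", {"op": "="}, text="Alice")

# The single view element sits inside <property>, so the known person is shown.
add(knows, "view", {"type": "list", "limit": "20", "offset": "0"})

print(ET.tostring(query, encoding="unicode"))
```

Moving the view element to the top level of the query would present the first subject instead of the joined one.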

The type can be:

Label
    Syntax Usage

properties
    SELECT ?p COUNT (*)
    WHERE { ?this_s ?p ?any_o ... }
    GROUP BY ?p
    ORDER BY DESC 2
    LIMIT 1
    OFFSET 0

properties-in
    SELECT ?p COUNT (*)
    WHERE { ?any_s ?p ?this_s ... }
    GROUP BY ?p
    ORDER BY DESC 2
    LIMIT 1
    OFFSET 0

classes
    SELECT ?c COUNT (*)
    WHERE { ?xx a ?c ... }
    GROUP BY ?c
    ORDER BY DESC 2
    LIMIT 1
    OFFSET 0

text
    SELECT DISTINCT ?s (bif:search_excerpt (sql:search_terms ("pattern"), ?o))
    WHERE { ... }
    ORDER BY ?s
    LIMIT 1
    OFFSET 0

list
    SELECT DISTINCT ?s long::sql:fct_label (?s)
    WHERE { ... }
    ORDER BY ?s
    LIMIT 1
    OFFSET 0

list-count
    SELECT ?s COUNT (*)
    WHERE { ... }
    GROUP BY ?s
    ORDER BY DESC 2

alphabet
    SELECT (sql:subseq (?s, 0, 1)) COUNT (*)
    WHERE { ... }
    GROUP BY (sql:subseq (?s, 0, 1))
    ORDER BY 1

geo
    SELECT DISTINCT ?lat ?long ?s
    WHERE { ?s geo:lat ?lat .
    ?s geo:long ?long .
    ... }

years
    SELECT sql::year (?s) COUNT (*)
    WHERE { ... }
    GROUP BY (bif:year (?s))
    ORDER BY 1
    LIMIT 1
    OFFSET 0

months
    SELECT sql::round_month (?s) COUNT (*)
    WHERE { ... }
    GROUP BY (sql:round_month (?s))
    ORDER BY 1
    LIMIT 1
    OFFSET 0

weeks
    SELECT sql::round_week (?s) COUNT (*)
    WHERE { ... }
    GROUP BY (sql:round_week (?s))
    ORDER BY 1
    LIMIT 1
    OFFSET 0

describe
    DESCRIBE ?s
    WHERE { ... }
    LIMIT 1
    OFFSET 0
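A client may want to check the view element before sending a query. The following is a rough sketch of such a check against the view types listed above. The function name check_view is invented, and treating limit and offset as non-negative integers is an assumption; the service itself may accept or reject other values.

```python
import xml.etree.ElementTree as ET

FCT_NS = "http://openlinksw.com/services/facets/1.0/"

# View type labels taken from the list above.
VIEW_TYPES = {
    "properties", "properties-in", "classes", "text", "list", "list-count",
    "alphabet", "geo", "years", "months", "weeks", "describe",
}


def check_view(query_xml):
    """Return a list of problems found with the <view> element (empty if none)."""
    root = ET.fromstring(query_xml)
    # The view may sit at top level or inside nested property elements.
    views = list(root.iter(f"{{{FCT_NS}}}view"))
    if len(views) != 1:
        return [f"expected exactly one <view>, found {len(views)}"]
    view = views[0]
    problems = []
    if view.get("type") not in VIEW_TYPES:
        problems.append(f"unknown view type: {view.get('type')!r}")
    for name in ("limit", "offset"):
        raw = view.get(name)
        # Assumption: limit and offset, when present, are plain decimal integers.
        if raw is not None and not (raw.isascii() and raw.isdigit()):
            problems.append(f"{name} must be a non-negative integer, got {raw!r}")
    return problems


GOOD = (
    '<query xmlns="http://openlinksw.com/services/facets/1.0/">'
    '<text>virtuoso</text><view type="text" limit="20" offset="0"/></query>'
)
BAD = (
    '<query xmlns="http://openlinksw.com/services/facets/1.0/">'
    '<text>virtuoso</text><view type="table" limit="-5"/></query>'
)

print(check_view(GOOD))
print(check_view(BAD))
```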

Customizing

The following types of customization will be generally useful:

The source code is divided into two SQL files and a number of XSLT sheets. The file facet.sql has the code for the web service. The facet_view.sql file contains the procedures for the sample HTML interface.

Examples for Customizing

See a detailed collection of examples here.

Choice of Labels

The Virtuoso Facets web service can use label inferencing to produce better labels. This enables labels to be used automatically as anchor text in faceted browser pages, both for property names and for their values when a value is a URI reference.

The inference rule set used for labels is called "facets". It is defined as follows:


-- The SPARUL statement for loading it into the Quad Store:
SPARQL INSERT INTO GRAPH <facets> 
  {
    rdfs:label     rdfs:subPropertyOf  virtrdf:label  .
    dc:title       rdfs:subPropertyOf  virtrdf:label  .
    foaf:name      rdfs:subPropertyOf  virtrdf:label  .
    foaf:nick      rdfs:subPropertyOf  virtrdf:label  .
    geonames:name  rdfs:subPropertyOf  virtrdf:label  .
  }

--Making the rule from the graph:
rdfs_rule_set ('facets', 'facets');

Note: If a label-oriented inference rule already exists, e.g., for virtrdf-label, then all you need to do is run the procedure:


rdfs_rule_set('facets','virtrdf-label');

Examples for Choice of Labels

Enable Labels

To enable the labels, user "dba" should run:
registry_set ('fct_desc_value_labels', '1');

Otherwise the labels will be off by default.

WebService Interfaces

REST Interface

The Virtuoso Facets web service provides the following REST interface:

Component
    Description

Service description
    * Endpoint: http://cname/fct/service or http://lod.openlinksw.com/fct/service
    * HTTP method: POST
    * Content-Type: MUST be 'text/xml'
    * The entity body must be an XML document with top element 'query' as described above.
    * The request response namespace MUST be "http://openlinksw.com/services/facets/1.0"

Error conditions
    All error conditions are reported via '<error>Error explanation</error>'.

Files
    The facet_svc.sql file contains the web service code and virtual directory mapping, and it uses fct_req.xsl and fct_resp.xsl as request and response filters.

Examples
    Using CURL program
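A request matching the REST interface described above might be prepared from Python as in the following sketch, which uses only the standard library. The function names build_rest_request and extract_error are invented, "cname" is the placeholder host from the endpoint description, and the sample query is made up. The sketch assumes the error element may appear anywhere in the reply, since its exact position is not specified here.

```python
import urllib.request
import xml.etree.ElementTree as ET

ENDPOINT = "http://cname/fct/service"  # "cname" is a placeholder host name


def build_rest_request(query_xml, endpoint=ENDPOINT):
    """Prepare (but do not send) the POST request for a facet query."""
    return urllib.request.Request(
        endpoint,
        data=query_xml.encode("utf-8"),
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


def extract_error(response_xml):
    """Return the explanation from the first <error> element, or None if there is none."""
    root = ET.fromstring(response_xml)
    for element in root.iter():
        local_name = element.tag.split("}", 1)[-1]  # ignore any namespace
        if local_name == "error":
            return (element.text or "").strip()
    return None


# Made-up example query: a text search shown with the "text" view.
query_xml = (
    '<query xmlns="http://openlinksw.com/services/facets/1.0/">'
    "<text>virtuoso</text>"
    '<view type="text" limit="20" offset="0"/>'
    "</query>"
)
request = build_rest_request(query_xml)

# Sending requires a reachable Virtuoso instance with the service installed:
# with urllib.request.urlopen(request, timeout=30) as response:
#     body = response.read().decode("utf-8")
#     print(extract_error(body) or body)
```

Building the request separately from sending it keeps the sketch usable without network access and makes the headers easy to inspect.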

SOAP Interface

The facet web service is also available via the SOAP protocol.

The request message contains a single 'query' element with the syntax explained earlier. The SOAPAction HTTP header should be '#query'. After successful evaluation of the query, the service returns a SOAP envelope whose Body element contains a single 'facets' element, as described above.
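As a hedged illustration of the SOAP request shape, the sketch below wraps a facet query in an envelope and prepares the HTTP headers. It assumes SOAP 1.1 (the version that uses the SOAPAction header) and a text/xml content type; neither is stated explicitly here, the SOAP endpoint is not given and is therefore left out, and the function name wrap_in_soap is invented.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace, an assumption
FCT_NS = "http://openlinksw.com/services/facets/1.0/"

ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("", FCT_NS)


def wrap_in_soap(query_xml):
    """Return (headers, envelope_text) for a facet query sent over SOAP."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(ET.fromstring(query_xml))  # the single 'query' element
    headers = {
        "Content-Type": "text/xml",  # assumed, as is usual for SOAP 1.1
        "SOAPAction": "#query",
    }
    return headers, ET.tostring(envelope, encoding="unicode")


# Made-up example query.
query_xml = (
    '<query xmlns="http://openlinksw.com/services/facets/1.0/">'
    "<text>virtuoso</text>"
    '<view type="text" limit="20" offset="0"/>'
    "</query>"
)
headers, envelope_text = wrap_in_soap(query_xml)
print(envelope_text)
```

Some SOAP stacks expect the SOAPAction value to be wrapped in double quotes; whether that is needed here would have to be checked against the running service.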

Example for SOAP Interface

See a detailed example here.

Virtuoso Facets API for REST services

A full list of Facets API calls for REST services can be viewed here.
