Running an aggregate like COUNT over multiple columns in SPARQL 1.1

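SPARQL 1.1 added aggregates, but each aggregate takes either a single expression (typically one column) or, in the case of COUNT, * for all columns in the result set. There is no direct way to aggregate over a chosen subset of columns.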

For instance, the query below can return a large number of rows, and you might want to know how many without retrieving them all:


SELECT 
   DISTINCT  ?s ?p ?o 
WHERE
  { 
    VALUES ?s { <http://dbpedia.org/resource/Treaty_of_Bern> } 
    ?s ?p ?o
    OPTIONAL { ?s a ?type } .
    FILTER ( BOUND ( ?type ) ) .
  }

A simple COUNT(DISTINCT *) counts distinct solutions over all variables bound in the WHERE clause, including ?type. Here that means a Cartesian result set: the count of DISTINCT ?s ?p ?o rows is multiplied by the count of DISTINCT ?type values.
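The multiplication can be checked with a small model in plain Python. This is only a sketch of the arithmetic, not a SPARQL engine: the three triples, the two types, and all `ex:` names are made-up illustration data, and tuples stand in for query solutions.

```python
from itertools import product

# Hypothetical data: one subject with 3 triples and 2 rdf:type values.
triples = [
    ("ex:s", "ex:p1", "ex:o1"),
    ("ex:s", "ex:p2", "ex:o2"),
    ("ex:s", "ex:p3", "ex:o3"),
]
types = ["ex:TypeA", "ex:TypeB"]

# Joining "?s ?p ?o" with "?s a ?type" pairs every triple with every type.
solutions = [(s, p, o, t) for (s, p, o), t in product(triples, types)]

# Counting distinct full solutions includes ?type in each row.
count_all_vars = len(set(solutions))

# Projecting to ?s ?p ?o first drops ?type before the distinct count.
count_projected = len({(s, p, o) for (s, p, o, _t) in solutions})

print(count_all_vars, count_projected)
```

With this data the first count is the product of the number of triples and the number of types, while the second is just the number of triples.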

A workaround is to wrap the aggregate around a sub-query. Because the sub-query projects only ?s ?p ?o, ?type is not visible to the outer COUNT, so the query below returns an accurate count of DISTINCT ?s ?p ?o rows.


SELECT 
   ( COUNT( DISTINCT * ) AS ?HowManyTriples )
WHERE
  { 
    { 
      SELECT 
         DISTINCT ?s ?p ?o 
      WHERE
       { 
         VALUES ?s { <http://dbpedia.org/resource/Treaty_of_Bern> } 
         ?s ?p ?o
         OPTIONAL { ?s a ?type } .
         FILTER ( BOUND ( ?type ) ) .
       }
    }
  }