Running an aggregate like COUNT over multiple columns in SPARQL 1.1
SPARQL 1.1 added aggregates, but COUNT accepts only a single expression (one column) or *, which stands for every variable bound in the WHERE clause. It cannot take a list of columns.
For instance, the query below returns a large number of rows, and you might want to know how many without retrieving them all:
SELECT DISTINCT ?s ?p ?o
WHERE
  {
    VALUES ?s { <http://dbpedia.org/resource/Treaty_of_Bern> }
    ?s ?p ?o .
    OPTIONAL { ?s a ?type } .
    FILTER ( BOUND ( ?type ) ) .
  }
A simple COUNT(DISTINCT *) includes all variables bound in the WHERE clause, ?type among them. Because ?type depends only on ?s and not on ?p or ?o, the solutions form a Cartesian product, so the result is the count of DISTINCT ?s ?p ?o multiplied by the count of DISTINCT ?type.
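The multiplication described above can be checked with a small stand-alone model. The sketch below is plain Python, not SPARQL, and does not query DBpedia; the `ex:` identifiers and the four triples are hypothetical data invented for illustration. It assumes only that each ?type binding pairs with every ?s ?p ?o row for the same subject.

```python
from itertools import product

# Hypothetical triples for one subject (illustrative data, not from DBpedia).
triples = {
    ("ex:subject", "rdf:type", "ex:ClassA"),
    ("ex:subject", "rdf:type", "ex:ClassB"),
    ("ex:subject", "ex:propertyX", "ex:valueX"),
    ("ex:subject", "ex:propertyY", "ex:valueY"),
}

# ?type bindings: the objects of the subject's rdf:type triples.
types = {o for (_s, p, o) in triples if p == "rdf:type"}

# The WHERE clause pairs every (?s ?p ?o) row with every ?type binding,
# because ?type depends only on ?s.
solutions = {(s, p, o, t) for (s, p, o), t in product(triples, types)}

# COUNT(DISTINCT *) over the WHERE clause counts all four variables.
count_all_variables = len(solutions)

# Projecting to ?s ?p ?o before counting removes the ?type multiplier.
count_spo = len({(s, p, o) for (s, p, o, _t) in solutions})

print(count_all_variables, count_spo)  # prints "8 4"
```

With four triples and two types, counting over all variables gives 8, while counting only the distinct ?s ?p ?o rows gives 4.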
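A workaround is to wrap the aggregate around a sub-query that projects only the columns of interest, so that * in the outer query sees only ?s ?p ?o and never ?type.
The query below returns an accurate count of DISTINCT ?s ?p ?o rows.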
SELECT ( COUNT( DISTINCT * ) AS ?HowManyTriples )
WHERE
  {
    {
      SELECT DISTINCT ?s ?p ?o
      WHERE
        {
          VALUES ?s { <http://dbpedia.org/resource/Treaty_of_Bern> }
          ?s ?p ?o .
          OPTIONAL { ?s a ?type } .
          FILTER ( BOUND ( ?type ) ) .
        }
    }
  }