    VOS.VirtAggregateOverMultipleColumnsInSparql (last edited by owiki, 2018-04-13 12:05:42)

    Running an aggregate like COUNT over multiple columns in SPARQL 1.1

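    SPARQL 1.1 added aggregates, but an aggregate takes either a single expression (typically one column) or, in the case of COUNT, * for all columns in the result set. There is no direct syntax for aggregating over a chosen subset of columns.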

    For instance, the following query returns a large number of rows, and you might want to know how many without retrieving them all:


    SELECT 
       DISTINCT  ?s ?p ?o 
    WHERE
      { 
        VALUES ?s { <http://dbpedia.org/resource/Treaty_of_Bern> } 
        ?s ?p ?o
        OPTIONAL { ?s a ?type } .
        FILTER ( BOUND ( ?type ) ) .
      }
    

    A simple COUNT(DISTINCT *) counts over every variable bound in the WHERE clause, including ?type. Here that means a Cartesian result set: the count of DISTINCT ?s ?p ?o rows is multiplied by the count of DISTINCT ?type values.
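
    The multiplication can be illustrated outside SPARQL with a small Python sketch. The triples below are hypothetical stand-ins, not real DBpedia data, and the code only mimics how solutions are formed for a single subject; it is not how any SPARQL engine is implemented.

```python
from itertools import product

# Hypothetical triples for one subject (illustrative only, not DBpedia data).
triples = [
    ("ex:Treaty", "rdf:type", "ex:Agreement"),
    ("ex:Treaty", "rdf:type", "ex:Event"),
    ("ex:Treaty", "rdfs:label", "Treaty of Bern"),
]

# Values ?type can take: the objects of the subject's rdf:type triples.
types = [o for (_, p, o) in triples if p == "rdf:type"]

# Solutions of the WHERE clause: each (?s ?p ?o) row paired with each ?type.
solutions = [(s, p, o, t) for (s, p, o), t in product(triples, types)]

# Counting distinct rows over all variables, as COUNT(DISTINCT *) does.
count_all_vars = len(set(solutions))

# Counting distinct rows over ?s ?p ?o only, as a projecting sub-query does.
count_projected = len({(s, p, o) for (s, p, o, _) in solutions})

print(count_all_vars, count_projected)
```

    With three triples and two types, the count over all variables is 6, while the count over ?s ?p ?o alone is 3.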

    A workaround is to wrap the aggregate around a sub-query that projects only the columns of interest. Because the outer query then sees only ?s ?p ?o, the query below returns an accurate count of DISTINCT ?s ?p ?o rows.


    SELECT 
       ( COUNT( DISTINCT * ) AS ?HowManyTriples )
    WHERE
      { 
        { 
          SELECT 
             DISTINCT ?s ?p ?o 
          WHERE
           { 
             VALUES ?s { <http://dbpedia.org/resource/Treaty_of_Bern> } 
             ?s ?p ?o
             OPTIONAL { ?s a ?type } .
             FILTER ( BOUND ( ?type ) ) .
           }
        }
      }