What is the difference between the functions SAMPLE, GROUP_CONCAT and GROUP_DIGEST?

This example demonstrates the differences between the functions SAMPLE, GROUP_CONCAT and GROUP_DIGEST.

Consider the following query, which is meant to return all ?asset values as a delimited list:


SPARQL SELECT ?view ?path (sql:GROUP_CONCAT (?asset, ' ')) as ?asset_list
FROM <http://mygraph.com> 
WHERE
 { 
   ?view <viewPath> ?path ; 
     <viewContent> ?asset ; 
     <viewType>  'phyview'. 
 }
;

This method is not universal. Conversion to strings erases the difference between strings and IRIs, and the method requires a delimiter that never appears in the values of ?asset. In addition, the query may fail with a "row too long" error if the values of ?asset are long enough or numerous enough. The results may also contain duplicates if more than one list is requested. For example:


SPARQL 
SELECT ?view (sql:GROUP_CONCAT (?path, ' ')) as ?path_list
  (sql:GROUP_CONCAT (?asset, ' ')) as ?asset_list
FROM <http://mygraph.com>
WHERE 
  { 
    ?view <viewPath> ?path ; 
      <viewContent> ?asset ; 
      <viewType> 'phyview' . 
  }
;

The lists returned by this query are free of duplicates only if either ?path or ?asset is unique for every ?view found; but if a variable is unique in that way, the corresponding sql:GROUP_CONCAT() is not needed.
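The duplicate and delimiter caveats can be reproduced outside Virtuoso with a small Python sketch. It is only an illustration under assumptions: the rows, the values and the `group_concat` helper are hypothetical, and the helper is a naive string join, not Virtuoso's implementation of sql:GROUP_CONCAT().

```python
from collections import defaultdict

# Hypothetical solution rows (view, path, asset): the WHERE pattern yields
# one row per combination of ?path and ?asset for each ?view.
paths = ["/a", "/b"]
assets = ["img1", "img2", "img3"]
rows = [("view1", p, a) for p in paths for a in assets]


def group_concat(rows, column, delimiter=" "):
    """Join the values of one column per view, in row order (naive string aggregate)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row[column])
    return {view: delimiter.join(values) for view, values in groups.items()}


path_list = group_concat(rows, 1)["view1"]
asset_list = group_concat(rows, 2)["view1"]

print(len(rows))   # 6 rows: 2 paths x 3 assets
print(path_list)   # /a /a /a /b /b /b
print(asset_list)  # img1 img2 img3 img1 img2 img3

# Delimiter caveat: a value that contains the delimiter cannot be told apart
# from two separate values once the list is split again.
ambiguous = group_concat([("view2", "/c", "my asset"), ("view2", "/c", "other")], 2)
print(ambiguous["view2"].split(" "))  # ['my', 'asset', 'other']
```

Each path appears once per asset and each asset once per path, because both lists are built from the same six combined rows.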

If a property has many values but it is enough to return any single value and ignore the rest, use the sql:SAMPLE() function instead of sql:GROUP_CONCAT().
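The idea of returning one value per group can be sketched in Python as follows. The rows and the `sample` helper are hypothetical, and picking the first value seen is an assumption of this sketch; which value sql:SAMPLE() actually returns is not stated here.

```python
def sample(rows, column):
    """Keep one arbitrary value of a column per view (here: the first one seen)."""
    picked = {}
    for row in rows:
        # setdefault stores the value only if the view has no value yet.
        picked.setdefault(row[0], row[column])
    return picked


# Hypothetical solution rows (view, path, asset).
rows = [
    ("view1", "/a", "img1"),
    ("view1", "/a", "img2"),
    ("view1", "/b", "img1"),
    ("view2", "/c", "img9"),
]

path_sample = sample(rows, 1)
asset_sample = sample(rows, 2)

print(path_sample)   # {'view1': '/a', 'view2': '/c'}
print(asset_sample)  # {'view1': 'img1', 'view2': 'img9'}
```

The result holds exactly one value per view, so its size does not grow with the number of values per property.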

If a property has many values and it is preferable to show more than one of them, but the "row too long" error occurs, the sql:GROUP_DIGEST function can be used:


SPARQL 
SELECT ?view (sql:GROUP_DIGEST (?path, ' ', 1000, 1)) as ?path_list 
  (sql:GROUP_DIGEST (?asset, ' ', 1000, 1)) as ?asset_list
FROM <http://mygraph.com> 
WHERE
  { 
    ?view <viewPath> ?path ; 
      <viewContent> ?asset ; 
      <viewType> 'phyview' . 
  }
;
