Working with SPARQL endpoint constraints via LIMIT & OFFSET
The DBpedia SPARQL endpoint is configured with the following Virtuoso INI setting:
MaxSortedTopRows = 40000
This setting sets a threshold on sorted rows. SPARQL queries that include OFFSET and LIMIT will feel the effect of the hard limit set in the INI. For instance, this query --
DEFINE sql:big-data-const 0 SELECT DISTINCT ?p ?s FROM <http://dbpedia.org> WHERE { ?s ?p <http://dbpedia.org/resource/Germany> } ORDER BY ASC(?p) OFFSET 40000 LIMIT 1000
-- will return the following error on execution --
HttpException: 500 SPARQL Request Failed Virtuoso 22023 Error SR353: Sorted TOP clause specifies more then 41000 rows to sort. Only 40000 are allowed. Either decrease the offset and/or row count or use a scrollable cursor
To prevent this error, you can leverage the use of subqueries which make better use of temporary storage. For example --
SELECT ?p ?s WHERE { { SELECT DISTINCT ?p ?s FROM <http://dbpedia.org> WHERE { ?s ?p <http://dbpedia.org/resource/Germany> } ORDER BY ASC(?p) } } OFFSET 50000 LIMIT 1000