Reification Options for Linked Data Publication

  1. We recommend keeping RDF_QUAD unchanged and using RDF Views to keep n-ary data in separate tables. Access to RDF_QUAD is heavily optimized, and any change to it may carry a severe scalability penalty.
  2. Triggers on RDF_QUAD should be possible as well, but it is relatively cheap to "redirect" data manipulations to other tables instead.
  3. Both the file loader and the SPARUL internals are flexible enough that it may be more convenient to modify different tables depending on parameters:
    • The loader can call arbitrary callback functions for each parsed triple, and SPARUL manipulations are configurable via the "define output:route" pragma at the beginning of the query.
  4. Usually there is no need to write special SQL to "triplify" data from those "wide" tables, because RDF Views do that automatically. Moreover, RDF Views can automatically create triggers that materialize changes made to the "wide" tables in RDF_QUAD (say, if you need inference).
  5. So instead of editing RDF_QUAD and letting triggers on RDF_QUAD reproduce the changes in the wide tables, you can edit the wide tables and let triggers reproduce the changes in RDF_QUAD.
  6. The second approach is much more flexible and promises better performance, because the triggers do much less work. In a cluster the second variant is the only feasible option, because fast manipulation of RDF_QUAD is really complicated there.
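
The second approach (edit the wide table, let triggers reproduce the change as triples) can be illustrated with a small, self-contained sketch. It is not Virtuoso code: it uses SQLite from the Python standard library as a stand-in, and the tables "quad" and "employment", the "urn:..." identifiers, and the function build_demo are all hypothetical names chosen for the example. In Virtuoso the equivalent triggers would be generated from the RDF View definition rather than written by hand.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with one wide table and one quad table.

    Assumption: 'quad' only imitates the shape of RDF_QUAD (graph, subject,
    predicate, object); 'employment' is a hypothetical wide table holding an
    n-ary fact (person, employer, start year) in a single row.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE quad (g TEXT, s TEXT, p TEXT, o TEXT);

        CREATE TABLE employment (
            id INTEGER PRIMARY KEY,
            person TEXT,
            employer TEXT,
            since TEXT
        );

        -- An insert into the wide table is materialized as three triples.
        CREATE TRIGGER employment_ins AFTER INSERT ON employment
        BEGIN
            INSERT INTO quad VALUES
                ('urn:g', 'urn:employment:' || NEW.id, 'urn:p:person', NEW.person);
            INSERT INTO quad VALUES
                ('urn:g', 'urn:employment:' || NEW.id, 'urn:p:employer', NEW.employer);
            INSERT INTO quad VALUES
                ('urn:g', 'urn:employment:' || NEW.id, 'urn:p:since', NEW.since);
        END;

        -- A delete from the wide table removes the corresponding triples.
        CREATE TRIGGER employment_del AFTER DELETE ON employment
        BEGIN
            DELETE FROM quad WHERE s = 'urn:employment:' || OLD.id;
        END;
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_demo()
    # The application edits only the wide table; the quad table follows.
    conn.execute(
        "INSERT INTO employment (person, employer, since) "
        "VALUES ('urn:alice', 'urn:acme', '2009')"
    )
    for row in conn.execute("SELECT s, p, o FROM quad ORDER BY p"):
        print(row)
```

Each change to the wide table fires one small trigger that touches only the rows derived from that record, which is the "much less work in triggers" that point 6 refers to. The reverse direction would need a trigger on the quad table that inspects every incoming triple to decide which wide table, if any, it belongs to.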