Language processing in Linked Data authorities

E. Lynette Rayle edited this page Mar 7, 2019 · 13 revisions

Overview

Some linked data authorities tag literals with a language (e.g. 'milk@en', 'Milch@de', 'lait@fr'). When literals are tagged, it is desirable to be able to request literals for a specific language for two reasons: 1) to provide users with terms in their desired language, and 2) to avoid long results that include the term in every language. This document describes language processing for the linked data module in QA.

Where can language be specified?

Language can be specified in multiple places. They are listed here in priority order, highest priority first. If no language is specified at a higher-priority location, the next-highest-priority specification that exists is used.

  1. Passed as part of the request URL
  2. Defined in the authority configuration
  3. Sitewide preferred language
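The priority order above can be sketched as a simple fallback chain. This is a minimal illustration in Python, not QA's actual implementation; the function name `resolve_language` and its three parameters are hypothetical stand-ins for the request URL value, the authority configuration value, and the sitewide preference.

```python
def resolve_language(request_lang=None, authority_lang=None, site_lang=None):
    """Return the highest-priority language specification that is set.

    Hypothetical helper: each argument may be a single language code
    (e.g. "fr"), a list of codes (e.g. ["fr", "de"]), or None when unset.
    """
    # Candidates are checked in the documented priority order.
    for candidate in (request_lang, authority_lang, site_lang):
        if candidate:  # skips None, empty strings, and empty lists
            # Normalize a single code to a one-element list.
            return [candidate] if isinstance(candidate, str) else list(candidate)
    return None  # nothing specified anywhere


# The request URL wins over the authority configuration and the sitewide setting.
print(resolve_language("fr", ["en"], "de"))       # ['fr']
# With no request language, the authority configuration is used.
print(resolve_language(None, ["en", "fr"], "de"))  # ['en', 'fr']
```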

Configuring for language

Specifying language is a way to filter out results based on language. When passing the language as a parameter to the external authority, the filtering out of literals happens at the authority. In that case, the authority will pass back only literals in the specified language. Not all authorities support passing a language to their API.

In all other methods described, the filtering out of literals happens in the QA normalization process. When a language is specified in a way that QA does the filtering, a literal is returned if it is not tagged (e.g. 'milk') or if its language tag matches a requested language.

Examples:

['milk@en', 'Milch@de', 'lait@fr'] requesting 'de' will return only ['Milch@de']
['milk', 'Milch@de', 'lait@fr'] requesting 'de' will return only ['Milch@de', 'milk']
['milk@en', 'Milch@de', 'lait@fr'] requesting ['fr','de'] will return only ['Milch@de', 'lait@fr']
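The filtering rule illustrated by these examples can be sketched in a few lines of Python. This is only an approximation under stated assumptions: literals are represented as plain strings with an `@lang` suffix, as in the examples, whereas an RDF library would carry the language tag separately from the value. The names `split_literal` and `filter_by_language` are hypothetical, and the sketch keeps results in input order, which says nothing about the order QA itself returns.

```python
def split_literal(literal):
    """Split 'Milch@de' into ('Milch', 'de'); untagged literals get tag None.

    Assumption: any '@' marks a language tag, so an untagged value that
    itself contains '@' would be misread by this simplified parser.
    """
    value, sep, tag = literal.rpartition("@")
    if not sep:  # no '@' found, so the literal is untagged
        return literal, None
    return value, tag


def filter_by_language(literals, languages):
    """Keep literals that are untagged or tagged with a requested language."""
    if isinstance(languages, str):
        languages = [languages]
    wanted = set(languages)
    kept = []
    for literal in literals:
        _, tag = split_literal(literal)
        if tag is None or tag in wanted:
            kept.append(literal)
    return kept


print(filter_by_language(["milk@en", "Milch@de", "lait@fr"], "de"))
print(filter_by_language(["milk", "Milch@de", "lait@fr"], "de"))
print(filter_by_language(["milk@en", "Milch@de", "lait@fr"], ["fr", "de"]))
```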

Allowing language to be passed as part of the request URL as a language parameter

TODO... Add config of lang param

Example: The following example searches the cached Agrovoc vocabulary in French and receives back results in French.

https://lookup.ld4l.org/authorities/search/linked_data/agrovoc_ld4l_cache?q=lait&maxRecords=4&lang=fr

[{"uri":"http://aims.fao.org/aos/agrovoc/c_1a3a6e9a","id":"http://aims.fao.org/aos/agrovoc/c_1a3a6e9a","label":"lait d'avoine"},
 {"uri":"http://aims.fao.org/aos/agrovoc/c_54e9f6e0","id":"http://aims.fao.org/aos/agrovoc/c_54e9f6e0","label":"lait d'amande"}, 
 {"uri":"http://aims.fao.org/aos/agrovoc/c_16076","id":"http://aims.fao.org/aos/agrovoc/c_16076","label":"Lait de bufflesse"},
 {"uri":"http://aims.fao.org/aos/agrovoc/c_4826","id":"http://aims.fao.org/aos/agrovoc/c_4826","label":"Lait"}]

NOTES:

  • The search term (e.g. q=lait) is in the specified language (e.g. lang=fr). Support for this is determined by the authority.
  • The authority must be configured to process the lang parameter so that it can pass it on to the external authority.
  • The external authority must support a language parameter (e.g. lang) as part of its access API.
  • This parameter can be applied to search and term requests, depending on configuration.
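For client code, the request URL from the example can be assembled with Python's standard library. This is a sketch only: the base URL and the `q`, `maxRecords`, and `lang` parameters are taken from the example above, while the helper name `build_search_url` is hypothetical. Whether `lang` is honored still depends on the authority configuration and the external authority, as noted.

```python
from urllib.parse import urlencode

# Base search URL taken from the Agrovoc example above.
BASE_URL = "https://lookup.ld4l.org/authorities/search/linked_data/agrovoc_ld4l_cache"


def build_search_url(query, lang=None, max_records=None):
    """Build a search URL, adding maxRecords and lang only when they are given."""
    params = {"q": query}
    if max_records is not None:
        params["maxRecords"] = max_records
    if lang:
        params["lang"] = lang
    # urlencode percent-encodes values, so non-ASCII search terms are safe.
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("lait", lang="fr", max_records=4))
```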

Defining a default language or set of languages for an authority

TODO

Setting the sitewide preferred language

TODO

Disabling language for a specific authority

Some authorities do not tag literals with a language, or tag them in a way that is not useful enough for filtering results. In those cases, you can turn off language processing.

TODO

Sorting when multiple literals match the requested language(s)

TODO
