This paper proposes (1) a general scheme for specifying interoperability; (2) tentative instantiations of this scheme for two data types (text and audio), as illustrations of the general scheme; (3) a policy to be carried out by CLARIN ERIC to ensure continued efforts on interoperability, including specific incentives for individual researchers and CLARIN national consortia.


Towards Interoperability in CLARIN
Jan Odijk


Building Web APIs on top of SPARQL endpoints is becoming common practice to enable universal access to the integration-favorable dataspace of Linked Data. The Linked Data community cannot expect users to learn SPARQL to query this dataspace, and Web APIs are the most common way of enabling programmatic access to data on the Web. However, implementing Web APIs around Linked Data is often a tedious and repetitive process. Recent work speeds up this construction by wrapping the API around SPARQL queries, which carry out the API functionality under the hood. Inspired by this, in this paper we present grlc, a lightweight server that translates SPARQL queries curated in GitHub repositories into Linked Data APIs on the fly.


grlc Makes GitHub Taste Like Linked Data APIs
Salad2016: ESWC2016 workshop, May 29th, 2016
Albert Meroño-Peñuela and Rinke Hoekstra
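The core idea behind grlc can be sketched as follows. This is an illustrative Python sketch, not grlc's actual implementation: it assumes a convention (used here for illustration) where `?_name` placeholders in a curated SPARQL query become parameters of the generated API route. The query, function names, and URL are invented for the example.

```python
# Illustrative sketch: derive an API route from a SPARQL query file
# curated in a repository. The "?_param" placeholder convention and all
# names below are assumptions for this example, not grlc's real API.
import re

QUERY = """
#+ summary: List members of a band
SELECT ?member WHERE {
  ?_band <http://dbpedia.org/ontology/bandMember> ?member .
}
"""

def query_to_route(name: str, query: str) -> dict:
    """Derive an API route spec: '?_x' placeholders become query parameters."""
    params = sorted(set(re.findall(r"\?_(\w+)", query)))
    return {"path": f"/api/{name}", "method": "GET", "parameters": params}

def instantiate(query: str, **values: str) -> str:
    """Fill placeholders with caller-supplied IRIs before querying the endpoint."""
    for param, value in values.items():
        query = query.replace(f"?_{param}", f"<{value}>")
    return query

route = query_to_route("members", QUERY)
print(route["parameters"])  # ['band']
```

In this sketch, the server would serve `GET /api/members?band=...` by instantiating the stored query and forwarding it to the SPARQL endpoint, so API maintenance reduces to editing query files in the repository.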


More and more knowledge bases are publicly available as linked data. Since these knowledge bases contain structured descriptions of real-world entities, they can be exploited by entity linking systems that anchor entity mentions from text to the most relevant resources describing those entities. In this paper, we investigate adaptation of the entity linking task using contextual knowledge. The key intuition is that entity linking can be customized depending on the textual content, as well as on the application that would make use of the extracted information. We present an adaptive approach that relies on contextual knowledge from text to enhance the performance of ADEL, a hybrid linguistic and graph-based entity linking system. We evaluate our approach on a domain-specific corpus consisting of annotated WikiNews articles.


Context-enhanced Adaptive Entity Linking.
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016). Portorož, Slovenia, 23-28 May 2016.
Filip Ilievski, Giuseppe Rizzo, Marieke van Erp, Julien Plu and Raphaël Troncy
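The intuition that context can disambiguate an entity mention can be illustrated with a minimal sketch. This is not ADEL's actual algorithm, only a toy baseline: rank candidate knowledge-base entities by word overlap between the mention's textual context and each candidate's description. All entities and texts below are invented for the example.

```python
# Toy context-based entity disambiguation (illustration only, not ADEL):
# score each candidate by lexical overlap between the mention context and
# the candidate's knowledge-base description. Data are made up.
def tokenize(text: str) -> set:
    return {w.strip(".,").lower() for w in text.split()}

def rank_candidates(context: str, candidates: dict) -> list:
    """Return candidate IDs sorted by descending context/description overlap."""
    ctx = tokenize(context)
    scores = {cid: len(ctx & tokenize(desc)) for cid, desc in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)

candidates = {
    "dbr:Paris": "capital city of France on the Seine",
    "dbr:Paris_(mythology)": "prince of Troy in Greek mythology",
}
context = "The summit was held in Paris, the capital of France."
print(rank_candidates(context, candidates)[0])  # dbr:Paris
```

A real linker combines many more signals (linguistic analysis, graph structure, popularity priors), but the sketch shows why the surrounding text alone already separates the two readings of "Paris".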


The literary studies field has a longstanding tradition of detailed analysis of literary works. This results in fine-grained, but usually small-scoped studies. The advent of computational methods makes it possible to scale up the subject of analysis and start, for instance, comparing entire oeuvres of authors or even genres. Before we do so, though, it is important to evaluate the precision and impact of such computational methods, for which we have carried out a small study in which we automatically analysed Harry Mulisch’s The Discovery of Heaven (1992) using DBpedia Spotlight (Daiber et al., 2013). We chose to investigate this novel as it is considered by many to be Mulisch’s masterpiece (Brems, 2006), it is a fair body of work (nearly 1,000 pages, containing ≈270,000 words) and it contains many references to disciplines such as the natural sciences, theology, humanities and politics. The novel embodies and uses encyclopaedic knowledge and could therefore be seen as an encyclopaedic novel. One could say it strives to capture the ideas and opinions of its time in its narrative, and shows a variety of means to interpret the world (Mendelson, 1976). These aspects make this kind of novel ideal for analysis with computational methods, given that the overwhelming amount of information it contains is hard for the novel’s reader to grasp (Van Ewijk, 2011, p. 214).


Discovering an Encyclopaedic Novel: a case study in automatically analysing Harry Mulisch’s The Discovery of Heaven (1992). In:
DHBenelux 2016. Belval, Luxembourg, 9-10 June 2016.
Leon van Wissen, Marieke van Erp, Ben Peperkamp


Diachronous conceptual lexicons and thesauri describe historical language use through time and provide historians with invaluable insights into the past. As more and more research is done in the area of the automatic processing of historical texts, the need for machine-readable historical lexicons and ontologies is growing. Until now these lexical resources were not available in digital format, but for this project volunteers have made transcriptions of the printed books. However, the result is in text format, and the information and ontologies vary for each resource, which hampers simultaneous access to the information contained in them.
In CLARIAH WP3, we are working on integrating and enriching several historical conceptual lexicons using linked open data principles. The data are modelled according to the LMF and Lemon standards, which are specifically tailored to represent lexical and ontological data. The model provides possibilities to deal with the lack of standardized orthography, and to model notions of time, duration and place as properties of word usages, thus enabling the heterogeneous resources to interoperate.


Integrating Diachronous Conceptual Lexicons through Linked Open Data.
DHBenelux 2016. Belval, Luxembourg, 9-10 June 2016.
Isa Maks, Marieke van Erp, Piek Vossen, Rinke Hoekstra, Nicoline van der Sijs
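The modelling idea described in the abstract above can be sketched in a simplified form. This is plain Python rather than actual LMF/Lemon RDF, and the entry, spellings, and dates are invented for illustration: a lexical entry carries several historical written forms (non-standardized orthography), and each sense's usage is scoped by a time interval and a place, so variant spellings from different resources can be queried together.

```python
# Simplified, hypothetical illustration of a diachronic lexical entry
# (not real LMF/Lemon data): multiple orthographic variants per lemma,
# with time- and place-scoped word usages.
entry = {
    "lemma": "paard",  # modern Dutch 'horse'
    "forms": ["paard", "paerd", "peert"],  # historical spelling variants
    "senses": [
        {"concept": "horse", "usage": {"from": 1550, "to": 1900, "place": "Holland"}},
    ],
}

def senses_for(entries: list, spelling: str, year: int) -> list:
    """Find concepts attested for a spelling variant at a given point in time."""
    hits = []
    for e in entries:
        if spelling.lower() in e["forms"]:
            for s in e["senses"]:
                u = s["usage"]
                if u["from"] <= year <= u["to"]:
                    hits.append(s["concept"])
    return hits

print(senses_for([entry], "paerd", 1600))  # ['horse']
```

Anchoring all variants to one entry, with usages as first-class objects carrying temporal and spatial properties, is what lets heterogeneous resources with divergent orthographies interoperate in a single query.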