WP6: Text

Work package 6 creates data and tools aimed at all scholars working with textual data, such as researchers in literary studies, history, philosophy and religious studies.

Work package 6 is building an online environment where researchers can process text in all possible phases of research. It aims to provide data, tools, and manuals for a wide array of tasks, including but not limited to:

  • handwritten text recognition
  • optical character recognition
  • part-of-speech tagging for historical and present-day Dutch
  • named entity recognition
  • preparing scholarly digital editions of texts
  • annotating texts
  • analyzing texts with computational stylistic tools

The infrastructural-technical challenge for work package 6 is to increase compatibility and interoperability between existing resources (such as those available via Nederlab) and a variety of new tools and datasets for text analysis. To realize this goal, we are working towards shared conventions for practices and formats in digital text analysis.

Furthermore, dissemination through documentation and instruction is important in work package 6, as it can increase accessibility and make it easier to combine tools and datasets and to collaborate on them. Digital data and tools require sufficient documentation and training material to provide opportunities for researchers at every level of digital literacy and skill, including those without programming experience.

Another key aim of work package 6 is to explore, comprehend, and explain the possibilities and limitations of textual data and its analysis with computer tools. This data and tool criticism helps ensure that tools and data can be used and improved responsibly, insightfully, and effectively.

Finally, like all work packages in CLARIAH, work package 6 is committed to making its products, both tools and data, available in a sustainable and responsible manner that ensures usefulness and accessibility in the long term.

Examples of research from WP6 Text