CLARIN 101 - Introduction to CLARIN

CLARIN is a distributed European Research Infrastructure offering access to language resources and technologies for researchers across the humanities and social sciences. This workshop is designed for conference participants new to CLARIN and provides an overview of its central services and their applications in research practice. By the end of the workshop, participants will understand how CLARIN aligns with the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) and how it approaches enabling responsible data science. More specifically, the session will demonstrate how to:

  • Discover Language Resources using the Virtual Language Observatory (VLO), a powerful search portal for metadata on over 1.6 million language resources
  • Use the Federated Content Search (FCS) for direct data searches across digital collections
  • Process and analyse digital text collections using matching tools from the Language Resource Switchboard
  • Use LLMs for language research
  • Select an appropriate repository to deposit and share new language resources

 

The workshop targets individuals new to CLARIN, including PhD students, researchers, committee members, ambassadors, trainers, and non-academics interested in learning more about the infrastructure. 

Programme 

Duration: 1 hour 30 minutes

14:00 - 14:10 Introduction to CLARIN by Cristina Grisot (10 minutes)

14:10 - 14:35 Discovering Language Resources (25 minutes)

  • How to search for resources in the Virtual Language Observatory (VLO) (10 min)
  • Federated Content Search (FCS) by Erik Körner and Thomas Eckart (15 min)

14:35 - 15:00 Using Language Resources (25 minutes)

  • Processing language resources from the VLO with matching tools from Switchboard
  • CLARIN Resource Families, with a focus on Language Models

15:00 - 15:25 Sharing and Archiving Language Resources by Mietta Lennes (25 minutes)

15:25 - 15:30 Q&A and Wrap-up by Iulianna van der Lek (5 minutes)