Name Type Created  

For linguistic analysis


Full text for historical newspapers from Estonia via the Europeana newspapers collections


A collection of novels in plain text format to share with the students.


This collection contains existing tutorials and guides for the tools and services listed in the Switchboard Tool Inventory. The collection is useful for researchers, teachers and students who are interested in using the Switchboard tools in research, teaching and learning.


A collection of Jane Austen's works - plain text files.


Collection of resources that teachers and trainers can use to develop learning content and training activities on data management-related topics.


Test collection of short stories in English, generated by the VLO


Lists the main Switchboard publications


Micro X-ray fluorescence (µXRF) is an elemental analysis technique which allows for the examination of very small sample areas. Like conventional XRF instrumentation, micro X-ray fluorescence uses direct X-ray excitation to induce characteristic X-ray fluorescence emission from the sample for elemental analysis. Unlike conventional XRF, which has a typical spatial resolution ranging in diameter from several hundred micrometers up to several millimeters, µXRF uses X-ray optics to restrict the excitation beam size or focus the excitation beam to a small spot on the sample surface so that small features on the sample can be analyzed. Large surfaces (over 10 × 10 cm²) can also be mapped.



Hyperspectral imaging is a chemical imaging technique based on reflectance spectroscopy (the light reflected by materials). This device collects reflectance spectra at each point of the field of view in the near-infrared range (it complements another device covering the visible range). The resulting hyperspectral image cube can be considered both as a stack of wavelength-resolved images and as a series of spectra.
The near-infrared spectra consist of vibrational overtones and combination absorption features, whose spectral signatures make it possible to identify and map different materials.
This technique is well adapted to characterize organic compounds (binding media, plastic materials…) and some minerals.
Hyperspectral imaging provides spatially resolved information on the nature of chemical species, which can help locate damage (moisture, chemical transformations…), restorations, pentimenti or underdrawings on or in an artwork.
Hyperspectral imaging is well adapted to flat or slightly embossed artworks (manuscripts, drawings, paintings…).
Near-infrared hyperspectral imaging is a non-invasive, in situ technique that collects a data cube in a few minutes without any preparation of the artwork.
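
The two readings of the image cube mentioned above (a stack of wavelength-resolved images, or a series of spectra) can be illustrated with a minimal sketch. The cube here is a tiny synthetic one with placeholder integer values; its dimensions and the helper names (make_cube, band_image, spectrum) are hypothetical and do not describe the instrument's actual data format or software.

```python
# Hypothetical toy dimensions; a real cube is far larger.
ROWS, COLS, BANDS = 2, 3, 4


def make_cube(rows, cols, bands):
    """Build a synthetic cube where cube[y][x][b] encodes its own indices.

    The values are placeholders (100*y + 10*x + b), not reflectances.
    """
    return [[[100 * y + 10 * x + b for b in range(bands)]
             for x in range(cols)]
            for y in range(rows)]


def band_image(cube, b):
    """View 1: the 2-D image recorded at a single wavelength band b."""
    return [[pixel[b] for pixel in row] for row in cube]


def spectrum(cube, y, x):
    """View 2: the full spectrum recorded at a single pixel (y, x)."""
    return list(cube[y][x])


cube = make_cube(ROWS, COLS, BANDS)
image_b2 = band_image(cube, 2)   # one image out of the stack
spec_1_2 = spectrum(cube, 1, 2)  # one spectrum out of the series
```

Both views read the same underlying values: the pixel (1, 2) of the band-2 image equals the band-2 entry of the spectrum taken at (1, 2).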

Technical details

The system consists of an ImSpector N25E imaging spectrograph (Specimcorp, Finland) and a cooled, temperature-stabilized MCT detector (a 9.6 mm detector with 320 (spatial) × 256 (spectral) pixels). The camera operates from 970 to 2500 nm with a spectral resolution of 10 nm. It works as a line scan camera providing full, contiguous spectral data for each pixel.
The cooling system (dual Peltier solution, forced convection coolers) is designed to minimize dark current and ensure high stability in the detector operations in a wide ambient temperature range.
Two objective lenses are available:
A telecentric lens with a focal length of 15 mm (corresponding to a minimal field of view of 20 cm and a maximal resolution of 600 µm)
A macroscopic lens with a 1:1 magnification (corresponding to a field of view of 9.6 mm and a spatial resolution of 30 µm)
The camera moves along a motorized bar of around 1.6 m, resting on a portal frame structure. The height of the structure is around 2 m. The artwork is illuminated with 6 halogen lamps (three on each side) placed at 45 degrees from the vertical. The current structure is designed to scan artefacts laid flat on a table, but the configuration can possibly be changed to scan artefacts in a vertical position.
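
The figures quoted in the technical details can be cross-checked with a short calculation. This is a sketch only: it assumes that the field of view is imaged across all 320 spatial pixels and that the 970 to 2500 nm range is spread evenly over the 256 spectral pixels, neither of which is stated in the description.

```python
# Figures quoted in the technical details.
DETECTOR_WIDTH_MM = 9.6
SPATIAL_PIXELS = 320
SPECTRAL_PIXELS = 256
RANGE_NM = (970, 2500)


def pixel_footprint_um(field_of_view_mm, n_pixels=SPATIAL_PIXELS):
    """Size of one pixel projected onto the object, in micrometres.

    Assumes the whole field of view is spread evenly over n_pixels.
    """
    return field_of_view_mm * 1000 / n_pixels


# Macroscopic lens: 1:1 magnification, so the field of view equals the detector width.
macro_um = pixel_footprint_um(DETECTOR_WIDTH_MM)

# Telecentric lens: minimal field of view of 20 cm.
tele_um = pixel_footprint_um(200.0)

# Average spectral sampling interval per detector pixel.
sampling_nm = (RANGE_NM[1] - RANGE_NM[0]) / SPECTRAL_PIXELS
```

Under these assumptions the macroscopic lens gives 30 µm per pixel, matching the stated spatial resolution, and the telecentric lens gives 625 µm, in line with the stated figure of about 600 µm. The sampling interval comes out at roughly 6 nm per pixel, finer than the quoted 10 nm spectral resolution.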


This collection gathers DH-specific tools that are listed in the SSH Open Marketplace.


This virtual collection demonstrates the use of signposts in a virtual collection. It contains all 15 recordings of subject 2608 from the study Bonner Gerontologische Längsschnittstudie (BOLSA). Please note that access to the sound files is restricted.


Links to data sets and services that are connected via the Language Resource Switchboard.


A high-quality orthographic transcript is the basis for all types of analyses of spoken language data. However, transcribing speech is a time-consuming and tedious task. Automatic speech recognition, NLP and text annotation tools can make this task much quicker and save you a lot of time and frustration.

In this first of a series of SSHOC webinars, organised by the consortium partner CLARIN ERIC, we will discuss the theoretical basis and the technology available for transcribing spoken language. In particular, we will focus on the role of automatic speech recognition: what are the opportunities, what are the pitfalls, and where can it be applied successfully?

This virtual collection bundles the webinar's slides, video recording and event page.


This collection contains the data generated by the STARS4ALL photometers (TESS-W) deployed by the Extremadura Buenas Noches project. These photometers are used to measure the light pollution in Extremadura, Spain. You can find more information here:


List of all PubMed-listed publications for the query "face mask" AND "influenza" as referred to by Prof. dr. Van Dissel in the interview with NRC Handelsblad of 8 May 2020.

The interview (in Dutch) can be accessed from

The query used is


Collection of measurements of the project Pirineos La Nuit


This collection has been created to demonstrate the integration with the Language Resource Switchboard (LRS).


DuFLOR is a parallel corpus that collects links to the scripts of 15 films dubbed from English and distributed in Italy between 1964 and 2003, alongside their original English versions. The films are: Mary Poppins (1964), Dr Strangelove (1964), A Space Odyssey (1968), The Andromeda Strain (1971), Young Frankenstein (1974), Monty Python and the Holy Grail (1975), Life of Brian (1979), Shining (1980), Back to the Future (1985), Pulp Fiction (1994), Apollo 13 (1995), Titanic (1997), The Big Kahuna (1999), Spider-man (2001), and The Matrix Reloaded (2003).
These films were chosen for their high box office sales and therefore for their potential impact on their viewers' language. DuFLOR was built as a research tool for studying both film dialogues and film translations, with Italian dubbed translations from English as a case study. The corpus has, for example, been used to study the use of discourse markers in Italian films dubbed from English and to contrast it with their use in real-use Italian (cf. Viola 2016, 2017, in press). However, several other dimensions can be explored, e.g., diachronic, linguistic, sociolinguistic, pragmatic and translational, to name but a few.
Moreover, DuFLOR was created with a historical dimension in mind: as a whole, it covers the period 1964-2003. At the same time, the corpus is divided into five time slots (1964-1968, 1971-1979, 1980-1985, 1994-1997 and 2001-2003), thus allowing for diachronic comparisons. The corpus contains 223,343 words (115,166 in Italian, 108,177 in English); the metadata document gives more detailed information on the different sections of the corpus as well as on the whole corpus.


Viola, Lorella. In press. "On the diachrony of giusto? (right?) in Italian: A new discoursivization". Journal of Historical Pragmatics.
Viola, Lorella. 2017. "A corpus-based investigation of language change in Italian: The case of grazie di and grazie per". Journal of Historical Linguistics 7(3): 371-388.
Viola, Lorella. 2016. "Stai scherzando? Are you kidding? Investigating the influence of dubbing on the Italian progressive". Italian Journal of Linguistics 28(2): 181-202.


This virtual collection is intended to be used for demonstrating the creation and publishing of virtual collections. It contains German data from the Bavarian Archive of Speech Data.


We select a number of treebanks as an example of heterogeneous data.


Online corpus supplement to van Sluijs, van den Berg & Muysken (to appear). Exploring genealogical blends: the Surinamese Creole Cluster and the Virgin Island Dutch Creole Cluster


A collection of works by and secondary sources about Henrik Ibsen. Inspired by the Hathi Trust collection (;c=1024421342) and extended with resources available via CLARIN.


Digital references for De Vos, C. (2014). Absolute spatial deixis and proto-toponyms in Kata Kolok. NUSA: Linguistic studies of languages in and around Indonesia, 56, 3-26.


Digital references for the book "The Trobriand Islanders' Ways of Speaking" by Gunter Senft (De Gruyter Mouton, 2010)