This section of the K-OAr Center’s site ports some content from the Speech Data & Tech website, formerly at https://speechandtech.eu/ (its latest Internet Archive copy is available here).
Welcome to Speech Data & Tech
The initiators of the website are a group of experts interested in speech data, with very different backgrounds: oral history, computational linguistics, anthropology, sociolinguistics, phonetics and phonology. We all have an interest in exploring how technology can be integrated into research that involves spoken narratives.
These may range from very basic technologies, such as the conversion of recorded speech from analogue to digital, to more elaborate ones, such as ASR applied to automatically generate transcripts of the spoken content.
We first offer basic information about the epistemological focus of research domains that deal with speech data. This is followed by an explanation of relevant technologies and related tools that researchers can use. We then share our knowledge in the Workshops, Showcases and Publications sections.
Last but not least, you can access the first concrete result of our efforts: the Transcription Portal, an open-source service for automatic transcription in English, German, Dutch and Italian.
What drives us?
We could use text to describe what drives us, but as scholars interested in the multilayered nature of speech and of non-verbal communication, we prefer to present three short video talks by key members of our group in which they explain why they are members of this network. Listen to what speech technologist Henk van den Heuvel, sociolinguist Silvia Calamai, and data curator and former member of our group Louise Corti have to say, each in their own mother tongue.
Their talks gave us the opportunity to apply some of the tools whose uptake among scholars we would like to increase. An explanation is given below the video clips.
We applied Automatic Speech Recognition (ASR) to transcribe the spoken content of the videos, and Google Translate to translate it automatically into two other languages. To display the transcripts, click the closed-caption button.
In the video of speech technologist Henk van den Heuvel, Dutch ASR software developed at the University of Twente and Radboud University Nijmegen was applied to high-quality audio (quiet environment, good-quality microphone), followed by automatic translation from Dutch into Italian and English with Google Translate. The subtitles were added with the tool SubtitleEdit.
In the video of sociolinguist Silvia Calamai, Italian ASR software developed by Piero Cosi at the University of Padova was applied to high-quality audio (quiet environment, good-quality microphone), followed by automatic translation from Italian into Dutch and English with Google Translate.
In the video of data specialist Louise Corti, English ASR software developed at the University of Sheffield was applied to high-quality audio (quiet environment, good-quality microphone), followed by automatic translation from English into Dutch and Italian with Google Translate.
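The workflow described above ends with timed, translated transcript segments being packaged as subtitles. As a minimal sketch of that last step, the snippet below assembles hypothetical ASR segments into the SubRip (SRT) format that subtitle tools such as SubtitleEdit can import; the segment data and helper functions are invented for illustration, not the actual tools used for these videos.

```python
# Sketch: turn timed transcript segments into an SRT subtitle file.
# The segments and helpers are illustrative; real ASR output formats differ per tool.

def fmt_time(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments):
    """segments: list of (start_sec, end_sec, text) tuples, in order."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(f"{i}\n{fmt_time(start)} --> {fmt_time(end)}\n{text}\n")
    return "\n".join(blocks)

# Two hypothetical transcribed (and translated) segments.
segments = [
    (0.0, 2.5, "Welcome to Speech Data & Tech."),
    (2.5, 6.0, "We apply ASR to oral history interviews."),
]
print(to_srt(segments))
```

The resulting text file can be loaded alongside the video in most players and subtitle editors.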
Who we are

The group of people working on this website since September 2016 consists of:
Arjan van Hessen received a master’s degree in Geophysics and a PhD in Phonetics, and has been working in the field of Human Language Technology since 1991. His main interest is in applying the various HLT techniques in both the academic/research world and the real world. Through his work as director of user involvement at CLARIAH and as Head of Imagination at Telecats, he is in the middle of the world of public-private collaboration.
Dr. Stef Scagliola is a historian specialised in digital audiovisual archives, with an emphasis on oral history collections. She has worked as a postdoc at the Centre for Contemporary and Digital History at the University of Luxembourg, and at the Erasmus School of History, Culture and Communication at Erasmus University Rotterdam. She has published on this topic, and was involved in the creation of various digital oral history collections: Interview Collectie Nederlandse Veteranen, Croatian Memories, Bosnian Memories, Warlovechild.
Christoph Draxler, Bavarian Archive for Speech Signals (BAS), Ludwig-Maximilians-Universität Munich, is the head of the corpus and tools group. Christoph studied computer science at TU Munich and Romance literature and linguistics at LMU Munich. In his PhD dissertation he developed database predicates in Prolog to access relational databases. At the BAS, he was responsible for the collection of several large-scale speech databases, e.g. SpeechDat II and SpeechDat-Car (German), Ph@ttSessionz, VOYS, and he has developed a number of speech tools, e.g. SpeechRecorder, WebTranscribe, and the online perception experiment tool percy. His research interests are crowdsourcing for speech processing and regional variations of spoken language.
Henk van den Heuvel is an expert on the production and curation of language and speech databases, transcription of speech (orthographic and phonetic), data science for speech technology, and automatic speech recognition. He is director of the Centre for Language and Speech Technology (CLST) and Head of the Humanities Lab at the Faculty of Arts. He is also Research Data Manager (data steward) for the Faculty of Arts.
Silvia Calamai is full professor in Linguistics and Sociolinguistics at the University of Siena. She is a member of the CLARIN Legal Issues Committee and the CLARIN-IT group. At present, she is coordinating two projects on oral archives (“Archivio Vivo”, Regione Toscana 2019-2021; LISTEN Landscape in Sounds through Eco-Museums network, Ecomuseo del Casentino, 2020-22) and the scientific committee of the Historical Archive of the Arezzo psychiatric hospital (2017-). She is on the board of the Italian Association of Speech Sciences and of Sonorités, Bulletin de l’AFAS (Association française des détenteurs de documents audiovisuels et sonores). Her main research interests are sociophonetics, oral archives and dialectology.
About this website
The main goal of this website is to give an overview of technology that can be used in the processing of spoken content data in general: from an analogue tape and perhaps a handwritten summary to a digital recording including digital transcripts, speaker diarization/recognition (who is speaking when), emotion markers, speaking rate and much more.
One may think of technologies working on the primary data (the spoken content), such as ADC (Analogue-to-Digital Conversion); ASR (Automatic Speech Recognition), which can be used to automatically generate transcripts of the spoken content; OCR (Optical Character Recognition), which can be used to digitize (handwritten or typed) transcripts on paper; audio-video converters, which can be used to convert recordings from a less suitable digital format into a more common one; or Speaker Diarization & Speaker Recognition (determining who is talking when).
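To make concrete what a digitised recording consists of, the sketch below synthesises one second of a test tone and writes it as a WAV file with Python's standard `wave` module, then reads back the sampling parameters that analogue-to-digital conversion has to fix (channel count, bit depth, sample rate). The tone and the 16 kHz rate are illustrative choices, not a recommendation tied to any specific tool mentioned here.

```python
# Sketch: the basic parameters of a digitised recording (channels,
# bit depth, sample rate), using only Python's standard library.
import io
import math
import struct
import wave

RATE = 16_000      # samples per second (a rate commonly used for speech)
DURATION = 1.0     # seconds

# Synthesise one second of a 440 Hz tone as 16-bit mono PCM samples.
frames = b"".join(
    struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * 440 * n / RATE)))
    for n in range(int(RATE * DURATION))
)

buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)    # mono
    w.setsampwidth(2)    # 2 bytes per sample = 16-bit depth
    w.setframerate(RATE)
    w.writeframes(frames)

# Read the parameters back, as a digitisation tool would report them.
buf.seek(0)
with wave.open(buf, "rb") as r:
    channels, bits, rate, nframes = (
        r.getnchannels(), r.getsampwidth() * 8, r.getframerate(), r.getnframes()
    )
print(channels, bits, rate, nframes)  # 1 16 16000 16000
```

An analogue tape would first need to pass through an ADC to yield exactly this kind of sampled representation.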

But software tools can also be used to work on those digital transcripts and related metadata. One may think of LIWC (Linguistic Inquiry & Word Count, a computerized text analysis program that is considered a gold standard) or Emotion Detection, where both the spoken words and the tone of voice are used to “calculate” the emotion of the speaker(s).
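The core idea behind dictionary-based word counting of the LIWC kind can be sketched in a few lines: count how many tokens of a transcript fall into predefined word categories. The categories and word lists below are toy examples invented for illustration; LIWC's actual dictionaries are far larger and carefully validated.

```python
# Toy sketch of dictionary-based word counting, the idea behind
# tools like LIWC. Categories and word lists are invented examples.
import re
from collections import Counter

CATEGORIES = {
    "positive":   {"glad", "happy", "proud", "hope"},
    "negative":   {"afraid", "sad", "angry", "loss"},
    "past_focus": {"was", "were", "remembered", "had"},
}

def category_counts(text):
    """Return per-category token counts and the total token count."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for tok in tokens:
        for cat, words in CATEGORIES.items():
            if tok in words:
                counts[cat] += 1
    return counts, len(tokens)

# A hypothetical fragment of an oral history transcript.
transcript = "I was afraid at first, but I was proud of what we had done."
counts, total = category_counts(transcript)
print(dict(counts), total)
```

Real tools report such counts as percentages of the total word count, which makes transcripts of different lengths comparable.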
All these technologies can be used across the different disciplines that work with spoken content. However, on this website we will sometimes focus on oral history recordings: recordings, mostly with one interviewer and one interviewee, about a particular event in the past or a period of someone’s life that has since ended.

