There is a growing body of studies investigating whether interlocutors become more similar to each other during a dialogue. The present study contributes to this line of research by investigating pitch profiles, and convergence and synchrony in mean F0 values, in Polish dyadic conversations on provocative art between two students and between a student and a teacher. We found different pitch profiles for the different scenarios and identities of the interlocutors. In the student-student and student-teacher conversations where both interlocutors agree (whether or not they accept the provocative art), the students show higher long-term distributional (LTD) F0 values for level, span and standard deviation when interacting with a teacher than with a fellow student. In the student-teacher conversations the students reach significantly higher LTD F0 values when the two interlocutors do not share the same opinion about the provocative art. Regarding convergence we applied a variety of measure...
Shadowing of natural speech has been discussed in many studies and papers. However, very little is known about human phonetic convergence to synthesised speech. To find out more about this issue, an experiment was conducted in Polish. Two types of stimuli were used: natural speech and synthesised speech. Five sets of sentences with various phonetic phenomena in Polish were prepared. Twenty participants were recorded, which gave a total of 100 samples for each phenomenon. The summary of results shows convergence to both natural and synthesised speech in sets 1, 2 and 4, while in sets 3 and 5 convergence was not observed. The baseline production showed that the great majority of participants preferred the ɛn/ɛm variant of the phonetic feature, which occurred in 83 out of 100 sentences. When shadowing natural speech, participants changed ɛn/ɛm to ɛw/ɛ̃ in 26 cases and ɛw/ɛ̃ to ɛn/ɛm in 4 cases. When shadowing synthesised speech shift from ɛn/ɛm ...
The paper presents a system for automatic synthesis of the Polish speech signal from text, developed over the last three years at the Department of Acoustic Phonetics in Poznan. The element generating the acoustic signal is a special-purpose IC controlled from a PC AT by means of original software comprising modules for text editing, phonemic transcription and synthesis of digital parameters. The speech signal, produced in real time, is highly intelligible, which opens up the prospect of concrete applications of the system in man-man and machine-man communication.
The present paper describes the creation of a spoken dialogue corpus and its annotation specification for the analysis and objective evaluation of phonetic convergence in human-human communication. The analysis of the corpus will serve to create convergence models which could be implemented in spoken dialogue systems based on spontaneous, expressive speech. The corpus consists of 13 hours of dialogues between 16 pairs of Polish native speakers and controlled dialogues with a teacher. The speakers knew each other and were of similar age, but could not see each other during the recording. In each recording session the pair of speakers conducted 4 dialogues in neutral scenarios and 6 dialogues in expressive scenarios, 3 dialogues with the teacher, 2 repetition tasks and 1 reading, which provided about 1 hour of speech for each pair. The corpus is being annotated on several layers: orthographic transcription of text, prosody, noise, flow of speaking turns, dialogue acts, agreement and disagree...
For modern man-machine communication, traditional methods based on a keyboard and a mouse are definitely insufficient. Especially when thinking about poor and less qualified citizens of the Information Society, we must try to find the easiest and most comfortable method of man-machine communication. The optimal method of communication from a man to a machine is (or should be) speech communication. In the
The paper presents technical, linguistic and didactic specifications for the Euronounce project, which aims at creating an intelligent tutoring system with multimodal feedback functions for acquiring the pronunciation and prosody of foreign languages. In response to the European Union’s call for promoting less widely spoken languages, the project focuses on German as a target language for speakers whose mother tongue is Polish, Slovak, Czech or Russian, and vice versa: it is to enable German native speakers to acquire the pronunciation and prosody of Polish, Slovak, Czech or Russian. Besides specifications concerning corpora design, speech databases, recordings, the structure of exercises and the feedback system, the article outlines the theoretical underpinning of the project as well as its baseline, AzAR, created in two preceding projects.
1 Dept. of Linguistics, Adam Mickiewicz University, Poznan, Poland; 2 Dept. of English Linguistics, University of Stuttgart, Germany; 3 Institute of Natural Language Processing, University of Stuttgart, Germany ...
This paper reports on the improvement of Polish speech synthesis obtained by applying new techniques to BOSS (The Bonn Open Synthesis System) for Polish. In order to enhance the system's performance, a variety of set-ups for the cost function, types of units used for concatenation (uniform vs. non-uniform unit selection) and the corpus alignment were tested. Three configurations for segment duration weights were chosen and tested with a mean opinion score perception test to investigate the impact of the applied segmental duration model on the perceived speech quality.