Unified Transcription and Translation for Extended Reality (UTTER)


Main Objective:
The research in UTTER is motivated by the following two use-cases:
- Virtual Assistant for Online Multilingual Meetings: The assistant should be able to translate from the language of the speaker into the language of the listener, produce a summary of the meeting, and note action points (minuting).
- Multilingual Customer Service Dialogue Tool: The tool should enable a customer service agent to provide support to global users (by text or voice) in cases where the customer and agent speak different languages. It should use the context of the conversation to guide the agent towards helpful, personalized answers that take into account the formality level and the cultural context of the customer and the brand, and it should monitor the satisfaction and engagement of the customer.

We tackle these use-cases by extending the state-of-the-art in translation and summarisation in the following ways:
- Translation should be multimodal, i.e. equally strong for speech input as it is for text input.
- All language technologies should be multilingual – in this project we will cover 6 languages: English, French, German, Portuguese, Dutch, and Korean.
- Dialogue generation and translation should take into account the context of the conversation and its history, as well as other forms of context, such as the meeting notes, desired politeness level, and the speakers’ emotional status (e.g. the sentiment of the customer).
- Translation of speech should be able to take into account paralinguistic aspects such as intonation, as this can often change the meaning of the utterance, and should track the identity of the speaker.
- A summary of a meeting should include the action points generated by the meeting (i.e. the minutes).
- Summarisation and minuting should be explainable; in other words, everything included in the meeting summary should be relatable to the content.
- Translation and summarisation should be efficient; in particular, speech translation should be real-time.
- Systems should be robust and confidence-aware: they should be resilient to typos, acoustic noise, and recognition errors, and they should be able to report their uncertainty.

We will achieve these through the use of pre-trained XR models, but in a fully open pipeline, where questions on bias, fairness, risk etc. can be examined.

Associated Publications

  • 9 Papers in Conferences
  • A. F. Farinhas, D. Ulmer, C. Zerva, A. Martins, Non-Exchangeable Conformal Risk Control, International Conference on Learning Representations ICLR, Vienna, Austria, May, 2024
  • N. Guerreiro, D. A. Alves, J. Waldendorf, B. Haddow, A. Birch, P. Colombo, A. Martins, Hallucinations in Large Multilingual Translation Models, Empirical Methods in Natural Language Processing - EMNLP, Singapore, Singapore, December, 2023
  • D. A. Alves, N. Guerreiro, J. Alves, J. Pombal, R. Rei, J. G. de Souza, P. Colombo, A. Martins, Steering Large Language Models for Machine Translation with Finetuning and In-Context Learning, Empirical Methods in Natural Language Processing - EMNLP, Singapore, Singapore, December, 2023
  • P. Fernandes, A. Madaan, E. L. Liu, A. F. Farinhas, P. H. Martins, A. Bertsch, J. G. de Souza, S. Z. Zhou, S. W. Wu, G. Neubig, A. Martins, Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation, Empirical Methods in Natural Language Processing - EMNLP, Singapore, Singapore, December, 2023
  • F. Blain, C. Zerva, R. Rei, N. Guerreiro, D. Kanojia, J. G. Sousa, B. Silva, T. Vaz, Y. Jinxuan, F. Azadi, C. Orasan, A. Martins, Findings of the WMT 2023 Shared Task on Quality Estimation, Conference on Machine Translation WMT, Singapore, Singapore, December, 2023
  • M. Freitag, N. Mathur, C. Lo, E. Avramidis, R. Rei, B. Thompson, T. Kocmi, F. Blain, D. Deutsch, C. Stuart, C. Zerva, S. Castilho, A. Lavie, G. Foster, Results of WMT23 Metrics Shared Task: Metrics might be Guilty but References are not Innocent, Conference on Machine Translation WMT, Singapore, Singapore, December, 2023
  • S. Honda, P. Fernandes, C. Zerva, Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues, Machine Translation Summit MT Summit, Macau, China, September, 2023
  • R. Rei, J. G. Sousa, D. M. A. Alves, C. Zerva, A. C. Farinha, T. Glushkova, A. Lavie, L. C. Coheur, A. Martins, COMET-22: Unbabel-IST 2022 submission for the metrics shared task, Conference on Machine Translation WMT, Abu Dhabi, United Arab Emirates, December, 2022
  • C. Zerva, F. Blain, R. Rei, P. Lertvittayakumjorn, J. G. Sousa, S. Eger, D. Kanojia, D. A. Alves, C. Orasan, M. Fomicheva, A. Martins, L. Specia, Findings of the WMT 2022 Shared Task on Quality Estimation, Conference on Machine Translation WMT, Abu Dhabi, United Arab Emirates, December, 2022