Abstract
This talk presents the core component of a framework under development for audio analysis in education. The focus is on a deep learning model trained to estimate how clear and intelligible speech sounds in recorded lectures.
The model forms the central part of a system called RAfAEL (Real Audio for Analysis in Educational Lectures). It uses a multi-head architecture to predict perceptual features such as clarity, definition, and reverberation directly from raw audio; that is, it does not require a clean reference signal.
Other parts of the full system, including a classifier that filters non-speech segments and a noise evaluation module, are still in development and are not covered in this session.
The aim is to show how machine learning can help identify when recorded speech may be hard to understand and to support better AV and acoustic design in learning spaces.
The talk will be followed by a short Q&A session.
Bio
Rodrigo Sanchez-Pizani is the Audio-Visual Solutions Architect Lead at King’s College London, where he oversees the design and implementation of AV systems that support teaching, learning, and operational objectives across the university. His role blends strategic planning with hands-on technical work, ensuring that the AV infrastructure evolves in step with the needs of higher education.
Born in Cambridge, UK, and raised in Caracas, Venezuela, Rodrigo has maintained a lifelong passion for sound and audiovisual technology. His formal training in acoustics began in 2014 with enrolment in the MSc Acoustics programme at London South Bank University. After completing his MSc and taking a brief break, he progressed into a PhD, where his research explores lecture capture environments and the classification of educational recordings using machine learning.
In addition to his academic and professional pursuits, Rodrigo has actively contributed to the development of industry standards through his volunteer work with AVIXA. His involvement includes serving on the Steering Committee and participating in the System Verification Standards Working Group.
His primary areas of interest include speech and intelligibility, acoustics, sound engineering, applied AI and machine learning, and the design of AV systems tailored for higher education.
Details
Details on the talk can be found below:
https://www.cpdtag.com/app.php?event=1A28CCA56AD209870A4AE6C321DCEC1
Please note that due to the Tube strikes, this talk will be online only, with no in-person attendance.