Volume: 45 Part: 2
Proceedings of the Institute of Acoustics

herisSon – an innovative tool for spatial room impulse response (SRIR) measurements

Serafino Di Rosario, LINK Acoustique, Senior Consultant, Lyon, France
Johan Sick, Matlab Programmer, Lyon, France
Clement Royon, LINK Acoustique, Consultant and Matlab Programmer, Lyon, France
Romain Darracq, LINK Acoustique, Consultant, Lyon, France
Sylvain Guitton, LINK Acoustique, Founder, Lyon, France

Summary
Measured spatial room impulse responses (SRIRs) help to understand the behaviour of acoustic spaces because they can be analysed in both the temporal and the spatial domain. Several parametric methods exist for this analysis; the tool presented here uses the Spatial Decomposition Method (SDM) [1]. herisSon is a set of MATLAB routines that combine SDM with the exponential sine sweep (ESS) test signal in the version presented by the author in 2010 [2]. The tool extends the previous research by providing a 3D visualisation of the IRs in the form of a hedgehog graph (hérisson is French for hedgehog) and a series of interactive time-domain graphics (EDC curve, IR graph) that can be used to identify the time of arrival of reflections and their direction of arrival (DoA) with a simple click. Compared to the original SDM implementation, the tool also uses a different windowing algorithm that is better suited to the 3D visualisation of the DoA of acoustic reflections, and it allows the level of the impulse response to be calibrated precisely, so that the results can be used to analyse the Dynamic Spatial Responsiveness and other parameters with similar requirements.
1 INTRODUCTION
This paper describes the implementation of a series of MATLAB routines for the measurement and analysis of Spatial Room Impulse Responses (SRIRs) that employ the Spatial Decomposition Method (SDM) [1] and the exponential sine sweep (ESS) test signal in the version presented by the author in 2010 [2]. Although both methods are available as open-source software, the authors found that a comprehensive tool for SRIR analysis was missing, in particular because the MATLAB routines provided for SDM show only 2D graphs of the estimated direction of arrival (DoA), which limits the analysis of the results.

herisSon (the name of our tool) extends the previous research by providing a 3D visualisation of the IRs in the form of a hedgehog graph and a series of interactive time-domain graphics (EDC curve, IR graph) that can be used to identify the time of arrival of reflections and their DoA with a simple click. It also implements a level calibration for the measured IRs, taking inspiration from [6], which helps in the analysis of Dynamic Spatial Responsiveness and of any other acoustic parameter that depends on relative sound pressure level (e.g. Strength, G).

The paper also presents several measurements in different conditions (from a large field to concert halls) that investigate the quality and the shortcomings of the SDM method as presented in [3] and [4].

2 HERISSON
2.1 Overview
herisSon is composed of two main sections:
• Measurements: the section used for the actual SRIR measurements, implementing the ESS method as described in [2]. This section is hard coded to use a Zylia ZM-1 microphone as input and a single or multichannel source as output.
• Analysis: the section where the SRIRs are analysed using the SDM method described in [1], for microphone arrays and for B-format signals. This is also where the main visualisation tool helps to investigate the SRIR in detail.
2.2 Measurements section
When opening this section, the user first defines the common parameters of the ESS signal used for the measurements, as shown in Figure 1:
• Chirp amplitude: amplitude of the test signal relative to full digital scale (0 to 1);
• Sweep duration: duration of the test signal in seconds;
• Silence duration: duration of the silence after the complete emission of the test signal (this should be at least equal to the longest expected reverberation time at any frequency);
• First central frequency: the starting frequency of the signal, according to the research in [2]; the final frequency of the chirp is fixed to the Nyquist frequency;
• Channel number: the number of recording channels, based on the microphone in use.

Figure 1: Input window for the test signal variables and number of input channels

Once all the input variables have been defined, the user is presented with the main GUI, which provides the following controls:
• Buffer size;
• Audio interface selection;
• Play: plays the test signal so that the output and input levels can be checked on a graphical interface (Figure 2);
• Record: starts the measurement and provides the same graphical interface for checking sound levels;
• Convolver: convolves the recorded chirp with the inverse filter to generate the IRs needed for the analysis. It also provides information about the convolution gain (described later in this section) and the ambisonics components. After the convolution, the derived IRs are shown on the GUI;
• Export: exports and saves the results of the convolution as a multichannel wave file. It also creates a text file with all the test signal parameters and the value of the convolution gain;
• Analysis: starts the analysis module (it opens a new MATLAB window).

Figure 2: GUI for the input and output level checking
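The test-signal parameters listed in this section can be illustrated with a minimal sketch of an exponential sine sweep followed by silence. It is written in Python for illustration and is not the MATLAB code of the tool. It assumes the classic exponential-sweep formulation, which may differ in detail from the variant described in [2]; the function names and the default sampling rate are hypothetical.

```python
import math


def exponential_sweep(amplitude, sweep_s, silence_s, f_start, fs=48000):
    """Exponential sine sweep from f_start up to Nyquist, followed by silence."""
    f_end = fs / 2.0  # final frequency fixed to Nyquist, as stated in the text
    if not 0.0 < amplitude <= 1.0:
        raise ValueError("amplitude is relative to full scale and must be in (0, 1]")
    if not 0.0 < f_start < f_end:
        raise ValueError("start frequency must lie between 0 and Nyquist")
    n_sweep = int(round(sweep_s * fs))
    n_silence = int(round(silence_s * fs))
    log_ratio = math.log(f_end / f_start)
    # Phase of the classic exponential sweep; its time derivative is the
    # instantaneous angular frequency, which grows exponentially with time.
    phase_scale = 2.0 * math.pi * f_start * sweep_s / log_ratio
    sweep = [
        amplitude * math.sin(phase_scale * (math.exp(log_ratio * n / (fs * sweep_s)) - 1.0))
        for n in range(n_sweep)
    ]
    # The trailing silence leaves room for the reverberant tail of the room.
    return sweep + [0.0] * n_silence


def instantaneous_frequency(t, sweep_s, f_start, fs=48000):
    """Instantaneous frequency in Hz of the sweep at time t (seconds)."""
    return f_start * math.exp(t / sweep_s * math.log((fs / 2.0) / f_start))
```

For example, `exponential_sweep(0.5, 10.0, 3.0, 20.0)` would give a 10 s sweep at half of full scale starting at 20 Hz, followed by 3 s of silence; these values are illustrative only and are not taken from the measurements in this paper.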
Figure 3: GUI for the impulse response measurements

It is important to discuss some elements that work under the hood in the measurement tool. We investigated several audio drivers in order to minimise latency and to keep it stable between the reproduction and the recording of the signal; PsychPortAudio [5] proved to be the most suitable candidate for this task. Latency reduction is an important aspect of IR measurements, particularly when the "time of flight" of the measurement must be preserved, that is, when the IR must carry the correct information about the distance between source and receiver. Audio drivers have an inherent latency; with our measurement system we measured a constant latency of 1044 samples with a buffer size of 4096, which allows us to correct the IR files directly during the convolution process.

The convolution engine applies two other main features to the calculation of the IRs. First, it calculates the gain needed for the peak value of the loudest IR to reach 1 on the full digital scale and applies the same gain to all the IRs (19 from the Zylia). This value is stored in a text file, saved when exporting the IRs, which also includes all the parameters of the test signal.

Figure 4: Text file showing information of the convolved IR

Second, during convolution the 19-channel Zylia IRs are converted into a 3rd order ambisonics file using matrix convolution and the filter provided by Prof. Farina [7]. This conversion serves two purposes:
• To provide the IR of a virtual omnidirectional microphone at the centre of the microphone array (Zylia) when performing the analysis with the SDM method for open arrays;
• To provide the first order ambisonics file when performing the analysis with the SDM method for ambisonics.

Providing a reference pressure microphone at the centre of the array, as described in [1], is tricky when using the ambisonics conversion, because this process creates a file with a long silence at the start, which does not preserve the correct "time of flight" for a microphone placed at the centre of the array. This is corrected by having the software trim the derived W channel by a value that we measured manually on the digital audio file.

2.3 Analysis section
Once the IRs have been calculated and saved through the export function, the user can access the analysis section to calculate and visualise the DoAs of individual reflections as well as the directional energy distribution of the sound field. The current version of the software is hardcoded for our measurement set-up (Zylia microphone as input). Figure 5 below shows the main analysis results window, which includes the 3D hedgehog visualisation, the ETC (Energy Time Curve) and the reference pressure IR with the time windows.

Figure 5: Analysis window – 3D Hedgehog visualisation (Right), Energy Time Curve (ETC, Top Left) and the reference pressure IR in time domain with windowing (Bottom Left)

The analysis window is the core of the software and the reason why we decided to produce our own version of this analysis tool.
The analysis window is interactive. The user can:
• Interact with the 3D visualisation by rotating it, change the dB dynamic range of the visualisation (which reduces the number of reflections shown on the basis of their relative sound pressure level) and select each peak to obtain its time of arrival and relative sound pressure level;
• Interact with the ETC curve by selecting peaks on the curve, which are then also shown as thick purple lines on the 3D hedgehog; this helps to identify reflections based on time of arrival and amplitude on a graph commonly used by acousticians;
• Check the time windows on the reference pressure IR and select peaks, which are likewise shown on the 3D hedgehog;
• Use the checkboxes at the centre of the window to visualise only the time windows of interest.

The 3D hedgehog visualisation allows the user to clearly identify the estimated direct sound (thick red line) and the reflections of a specific time window by their colour. This visualisation idea is taken from the original SDM code, but it has been expanded and modified to improve on the 2D graphs shown in the original code [1].
Our visualisation tool is based on a series of modifications that we applied to the original SDM code:
• The original SDM code does not clearly identify the direct sound; instead, it uses an onset value (the first IR peak that passes the level of 0.2 on the digital scale) to define the starting point of the time windows. herisSon identifies the direct sound as the loudest peak in the first part of the IR (it searches for the loudest peak starting from the time when the IR level passes 0.1 on the digital scale, up to 100 ms) and creates a small window of 1 sample on each side of this peak;
• The original SDM code uses overlapping time windows, where each window starts at the onset value and ends at the time value set by the user in the code. herisSon does not use overlapping windows: each window starts at the end of the previous one and its duration is given by the time value set by the user in the code;
• Both codes use an angle resolution for the analysis and sum up all reflections arriving within this angle resolution. herisSon defines it as a circular sector centred on the analysis direction (e.g. for an angle resolution of 1 degree, the circular sector has its boundaries at +0.5 deg and –0.5 deg around the 0,0 point, i.e. the front of the microphone; see Figure 6);
• herisSon implements in a single code both the analysis with the SDM method for open arrays and the SDM method for first order ambisonics signals; a dialogue box helps the user decide which method to use (Figure 7).

Figure 6: angle resolution alpha and the circular sector in 2D
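The direct sound detection and the two windowing schemes described in the list above can be sketched as follows. This is an illustrative Python sketch, not the MATLAB code of herisSon. It assumes that the 100 ms search range is counted from the sample where the IR first exceeds the threshold and that the window edges are given in milliseconds after the direct sound peak; both are assumptions, as are all function and parameter names.

```python
def find_direct_sound(ir, fs, threshold=0.1, search_ms=100.0):
    """Index of the loudest sample within search_ms after the IR first exceeds threshold."""
    onset = next((i for i, v in enumerate(ir) if abs(v) > threshold), None)
    if onset is None:
        raise ValueError("the IR never exceeds the threshold")
    # Assumption: the 100 ms search range starts at the threshold crossing.
    stop = min(len(ir), onset + int(round(search_ms * 1e-3 * fs)) + 1)
    return max(range(onset, stop), key=lambda i: abs(ir[i]))


def overlapping_windows(onset, fs, edges_ms, ir_len):
    """Original-style windows: every window starts at the onset sample."""
    return [(onset, min(ir_len, onset + int(round(e * 1e-3 * fs)))) for e in edges_ms]


def contiguous_windows(peak, fs, edges_ms, ir_len):
    """Non-overlapping windows: each one starts where the previous one ends.

    Windows are (start, end) sample indices with end exclusive. The first
    window is the direct sound: 1 sample on each side of the peak.
    """
    direct = (max(0, peak - 1), min(ir_len, peak + 2))
    windows = [direct]
    start = direct[1]
    for edge in sorted(edges_ms):
        end = min(ir_len, peak + int(round(edge * 1e-3 * fs)))
        if end > start:  # skip edges that fall inside the previous window
            windows.append((start, end))
            start = end
    return windows
```

With contiguous windows each sample of the IR contributes to a single time window, so a reflection is drawn in only one colour of the hedgehog; with overlapping windows the early part of the IR is counted again in every later window.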
Figure 7: dialog box for the choice of the SDM method for open arrays or first order ambisonics

3 MEASUREMENTS
This section presents a series of measurements in different conditions. Unfortunately, we were not able to perform any measurements in controlled conditions (anechoic or semi-anechoic chamber) because these rooms were not available at the universities in Lyon. Such measurements are planned for the future.

Measurement locations:
• Grand Parc de Miribel-Jonage (Lyon): we had access to this scenic park to the north-east of Lyon, where we could easily find a large field with no obstacles (trees or man-made constructions) within a radius of more than 100 m around the measurement point;
• LINK Acoustique's office meeting room (Lyon): we used our meeting room to perform tests with source and receiver about 1 cm apart; these were used to validate our choice of direct sound and windowing algorithms;
• Salle Moliere (Lyon): early 20th century concert hall and theatre used mainly for chamber music concerts;
• Collégiale de Saint-Symphorien (Saint-Symphorien-sur-Coise): early 15th century gothic church that was originally used for liturgical and Gregorian chants; it has a very long reverberation time with a trademark running reverberation.

All results are presented for both the analysis with the SDM method for open arrays and the SDM method for first order ambisonics.

3.1 Grand Parc de Miribel-Jonage
The site provided measurement conditions that can be considered similar to a semi-anechoic room with a reflecting floor. Source and receiver were placed in the middle of a large field, with no reflecting surface within a radius of 100 m, at a distance of 2 m from each other and 2 m above the ground.
Two different measurement configurations were used: one as described above (Figure 8) and another in the same configuration but with a reflective surface to the right of the microphone, at about 2.5 m from source and receiver (Figure 9).

Figure 8: Grand parc de Miribel-Jonage – Measurements without reflector
Figure 9: Grand parc de Miribel-Jonage – Measurements with reflector

The following figures show the results of the measurements using the 3D hedgehog visualisation, with the SDM method for open arrays on the left and the SDM method for first order ambisonics on the right. The first pair of figures shows the case without reflector with the entire set of time windows; the second pair shows the same case with only the direct sound and the first time window (direct sound to 2 ms); the third and fourth pairs follow the same order for the case with reflector.

Figure 10: Grand parc de Miribel-Jonage – Results without reflector, all time windows
Figure 11: Grand parc de Miribel-Jonage – Results without reflector, direct sound + direct sound-2 ms time window
Figure 12: Grand parc de Miribel-Jonage – Results with reflector, all time windows
Figure 13: Grand parc de Miribel-Jonage – Results with reflector, direct sound + direct sound-2 ms time window

The results show that, with the SDM method for open arrays, some unexpected reflections appear to the back, to the right and below the microphone in the first 2 ms after the direct sound. This is a known issue of the SDM analysis of open arrays when several sound events arrive in short succession, as described in [3]. In the case without reflector, it does not occur with the first order ambisonics analysis, as described in [4].
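As a rough plausibility check on where the first reflections should fall in time for the geometry of this section, an image-source calculation can be sketched as below. It is an illustrative Python sketch under stated assumptions: a speed of sound of 343 m/s, a flat and perfectly reflecting ground, and a reflector modelled as a vertical plane parallel to the source-receiver axis at a perpendicular distance of 2.5 m. The text only gives "about 2.5 m", so this reflector geometry and the coordinate system are hypothetical.

```python
import math


def reflection_delay_ms(src, rcv, plane_axis, plane_pos, c=343.0):
    """Delay in ms of a first-order reflection relative to the direct sound.

    The reflecting plane is perpendicular to coordinate axis plane_axis
    (0 = x, 1 = y, 2 = z) and located at coordinate plane_pos (metres).
    """
    # Mirror the source across the plane to obtain the image source.
    image = list(src)
    image[plane_axis] = 2.0 * plane_pos - src[plane_axis]
    direct_path = math.dist(src, rcv)
    reflected_path = math.dist(image, rcv)
    return (reflected_path - direct_path) / c * 1e3


# Assumed coordinates: x along the source-receiver axis, y lateral, z height.
source = (0.0, 0.0, 2.0)
receiver = (2.0, 0.0, 2.0)

floor_delay = reflection_delay_ms(source, receiver, plane_axis=2, plane_pos=0.0)
reflector_delay = reflection_delay_ms(source, receiver, plane_axis=1, plane_pos=-2.5)
```

Under these assumptions the ground reflection arrives about 7.2 ms and the reflector reflection about 9.9 ms after the direct sound, so no specular reflection from these surfaces is expected within the first 2 ms.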
In the case with the reflector, the parasite reflections in the first 20 ms are present in the first order ambisonics results but not in the open array results. However, if we look at the 5-10 ms window (the time of arrival of the reflections from the reflector), the SDM open array analysis is incorrect, with several parasite reflections behind the microphone and on the side opposite to the reflector, while the first order ambisonics result is correct. The measurements also show that the detection of the direct sound is correct.

3.2 LINK Acoustique's meeting room
This room has been our main testing room, but here we present only the test made with source and receiver at a distance of less than 1 cm, to check whether parasite reflections appear even in this case with the SDM analysis for open arrays. Measurement conditions are shown in Figure 14 below; in this case we only present the results for the direct sound and the first time window (direct sound to 2 ms).

Figure 14: LINK Acoustique's meeting room
Figure 15: LINK Acoustique's meeting room – Results direct sound + direct sound-2 ms time window

Even in this case it is easy to spot parasite reflections behind and to the sides of the microphone, although we think that part of them come from the microphone support (a large piece of luggage). The measurements also show that the detection of the direct sound is correct.

3.3 Salle Moliere (Lyon)
The Salle Moliere is renowned for its acoustics among chamber music audiences in Lyon; it has a balcony on 3 sides and a large stage. Every surface in the room is made of painted wood veneer. The average mid-frequency reverberation time is 1.4 s and increases to 2.5 s in the low frequency range (63 Hz to 250 Hz).
Figure 16: Salle Moliere (Lyon): Measurements positions

Several measurements were performed in different locations with the source fixed at the centre of the stage but, for brevity, we present only one measurement, made in row H seat 1 (8th row, centre of the audience).

Figure 17: Salle Moliere – all time windows
Figure 18: Salle Moliere – direct sound + direct sound-2 ms time window

Even in this case it is easy to spot parasite reflections behind and to the sides of the microphone, although we think that part of them are reflections from the row of seats behind. On the other hand, the measurements show that the detection of the direct sound is correct and that the representation of the anisotropic late field as a directional energy distribution matches the perceptual listening experience.

3.4 Collégiale de Saint-Symphorien-sur-Coise
The site of this gothic church was an important landmark for the region even before the construction of the church in the early 15th century, as it was occupied by a castle during the Middle Ages. Its position on top of a steep hill made it possible to watch over the surroundings and to impose taxes on the commercial trade of the valley. In the past the church was used as a school for clerics, who learned worship and chants in its particularly lively acoustics.

Figure 19: Collégiale de Saint-Symphorien-sur-Coise
Figure 20: Collégiale de Saint-Symphorien-sur-Coise – Measurements locations

Figure 20 shows all the measurement locations, which were measured both with an omnidirectional source placed on the altar and with the PA system installed in the church in the 1980s. For brevity, we present only one measurement, using the omnidirectional source and the microphone in position P5 (towards the bottom right side of the church).
Figure 21: Collégiale de Saint-Symphorien-sur-Coise – all time windows
Figure 22: Collégiale de Saint-Symphorien-sur-Coise – direct sound + direct sound-2 ms time window

In this measurement we cannot spot the issue with parasite reflections, either with the SDM analysis for open arrays or with the ambisonics analysis. The only differences compared to the other measurements are that the distance between source and receiver was about 15 m and that several reflections arrive quite late at the microphone because of the geometry of the church. Also, the microphone is not in direct sight of the source, but the detection of the direct sound is still correct. The representation of the isotropic late field as a directional energy distribution matches the perceptual listening experience of being immersed in a late running reverberation field, which is very useful for liturgical and Gregorian chants.

4 CONCLUSIONS
The paper presents the development of herisSon, a measurement and analysis tool for SRIRs (Spatial Room Impulse Responses), and proposes a novel GUI for the 3D visualisation and analysis of the results. The tool demonstrates that this technique is very useful for the detailed analysis of room acoustics and that its results match the perceptual listening experience as well as the expected DoA of reflections.

We have also shown that some shortcomings of the chosen SDM method still exist and that a better-performing solution needs to be investigated. In particular, we investigated the analysis of the direct sound and the DoA of reflections in the first 2 ms, showing that in many cases the SDM method for open microphone arrays produces a series of parasite reflections, particularly towards the back and sides of the microphone, when only energy from the front is expected. Additional analysis is needed, including measurements in more controlled conditions, in order to evaluate a better-performing algorithm.
5 FURTHER WORK
The next steps in the development should include an improved and more robust algorithm for the detection of the direction of arrival of reflections, reducing the artefacts that create phantom parasite reflections in some cases. Our idea is to follow the higher order ambisonics path described in [4].

Additionally, as the tool offers the possibility of calibrating the relative level of the IR measurements, we would like to investigate in more detail its use for the analysis of the Dynamic Spatial Responsiveness and of any other acoustic parameter that depends on relative sound pressure level (e.g. Strength, G).

6 ACKNOWLEDGEMENTS
We would like to thank:
• The Direction des affaires culturelles de la mairie de Lyon, for allowing us to access the Salle Moliere.
• Prof. Angelo Farina of the University of Parma in Italy, for helping us find the best way to apply matrix convolution in MATLAB and for his incredible availability to discuss any topic in acoustics.
• Our colleagues at LINK Acoustique, for their patience during long days of exponential sine sweep testing at levels that were sometimes unbearable for a normal office.

7 REFERENCES
[1] J. Pätynen, S. Tervo, T. Lokki, "Analysis of concert hall acoustics via visualizations of time frequency and spatiotemporal responses", The Journal of the Acoustical Society of America 133 (2), 842-857, 2013.
[2] S. Di Rosario, K. Vetter, "Expochirp Toolbox: A Pure Data Implementation of Exponential Sweep Sine (Ess) Impulse Response Measurement", Proceedings of the Institute of Acoustics, Vol. 33 Pt. 6, 2011.
[3] N. Meyer-Kahlen, S. V. Amengual Garí, T. Lokki, "What the spatial decomposition method can and cannot do", Proceedings of the 24th International Congress on Acoustics, October 24 to 28, 2022, ABS-0825.
[4] S. Tervo and A. Politis, "Direction of Arrival Estimation of Reflections from Room Impulse Responses Using a Spherical Microphone Array", IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 10, pp. 1539-1551, Oct. 2015, doi: 10.1109/TASLP.2015.2439573.
[5] https://psychtoolbox.org/docs/PsychPortAudio
[6] https://pcfarina.eng.unipr.it/Aurora/
[7] https://pcfarina.eng.unipr.it/Public/Xvolver/Filter-Matrices/Aformat-2-Bformat/Zylia-Gen-2023/