Welcome to the new IOA website! Please reset your password to access your account.

Proceedings of the Institute of Acoustics

 

Radial filter design for open spherical microphone arrays

 

Nara Hahn, Institute of Sound and Vibration Research, University of Southampton, UK
Filippo M. Fazi, Institute of Sound and Vibration Research, University of Southampton, UK
Vlad S. Paul, Institute of Sound and Vibration Research, University of Southampton, UK

 

1 INTRODUCTION

 

A sound field captured by a spherical microphone array can be expressed in terms of spherical harmonics, constituting a scene-based representation1,2. Decomposing the sound field into spherical harmonic components requires the so-called radial filtering, where the frequency response of each harmonic mode at the microphone radius is equalised3,4. For open spherical microphone arrays, the radial frequency responses are characterised by the spherical Bessel functions which exhibit spectral notches. This introduces additional challenges to the filter design as compensating the notches with peaking filters is prone to instability and noise boost issues5,6. In this study, we design a radial filter by combining IIR filters in parallel, where each IIR section is used to equalise one of the above-mentioned notches. The peak frequencies of the IIR filters are matched with the notch frequencies of the spherical Bessel functions. A regularisation scheme is applied in order to limit the peak magnitude response7,8. It is shown that this shifts the filter’s poles away from the unit circle, thereby reducing the maximum boost level of the filter. Finally, the gains of the individual IIR sections are obtained by a least-squares fit to the desired transfer function at control frequencies. The spectral properties of the proposed radial filters are evaluated by examining the magnitude and phase responses. The radial filters are further tested in a simulated setup, where a modal beamforming strategy is applied to the spherical microphone arrays signals. The spatio-temporal and spatio-spectral properties of the beamformer are investigated.

 

2 SPHERICAL HARMONIC ANALYSIS

 

In this paper, we use the spherical harmonic expansion to represent sound fields. A plane wave propagating in a free-field, for instance, can be expressed as9 [Eq. (6.175)]



 

where x = (r, θ, φ) is the evaluation point and npw = (1, θpw, φpw) the unit vector pointing towards the direction-of-arrival of the wave. The spherical coordinate representation (r, θ, φ) is employed where r ≥ 0 denotes the radius, θ ∈ [0, π] the colatitude, and φ ∈ [0, 2π) the azimuth. The inner product of the two vectors are denoted by h·, ·i. The spherical harmonics Ynm(θ, φ) describe the directional dependency of each order n = 0, 1, . . . and degree m = −n, . . . , n. The radial dependency of the sound field is described by the spherical Bessel functions jn(·). A short-hand notation  is used to represent the double summation The complex conjugate is denoted by asterisk in superscript (·), the imaginary unit by i and the speed of sound in m/s by c. The angular frequency ω is related to the temporal frequency f by ω = 2πf. The time harmonic term eiωt is omitted for brevity.

 

We assume that the sound field of interest S(x, ω) can be expressed as a superposition of plane waves arriving from various directions. The spherical harmonic representation then reads1 [Sec. 2.4]

 

 

where are the spherical harmonic coefficients of the plane wave amplitude distribution ,

 

 

Both   and  fully describe the sound field at any point within the homogeneous region. In the context of spherical array processing, it is of interest to extract the coefficients  from an array measurement or recording. Note from (2) that, irrespective of the plane wave distribution, the radial dependency of the individual modes is characterised by . If an array of omnidirectional microphones are placed at a fixed radius r = r0, the spherical harmonic coefficients of the captured sound field are given as

 

 

Extracting  from the measurement thus requires the design of an inverse filter of  , which is referred to as the radial filtering in the literature.

 

In this study, we consider an open spherical array, where the pressure sensors (omni-directional microphones) are placed at r = r0. It is assumed that the microphone capsules, cables and the array structure have negligible effect on the sound field, which allows us to use the free-field boundary condition. We further assume that the spherical volume surrounded by the microphone array is free of sound source and scatterer. The sound field captured by the microphones therefore only consists of sound waves originating from the exterior region.

 

3 RADIAL FILTER DESIGN

 

In this section, we present the proposed radial filter design. We attempt to equalise the modal transfer functions of an interior sound field which are characterised by . The magnitude and phase are depicted in Fig. 1 for n = 0, . . . , 3. Irrespective of the order n, the envelope of the magnitude decays with the rate of 6 dB/octave at high frequencies (). For n ≥ 1, the low-frequency responses are dominated by the n repeated zeros at ω = 0, resulting in a spectral slope of +6 · n dB/octave. The spherical Bessel functions also have infinite number of simple zeros spanning the entire frequency axis10 [Sec. 10.58]. Except for n = 0, the zeros do not have an analytical expression and thus have to be computed numerically. Since jn(·) are real-valued, the phase responses are primarily determined by the term n, which corresponds to n · 90. The zero-crossings of the magnitude responses introduce an additional phase of 180at the null frequencies.

 

The proposed radial filter has a parallel structure and can be expressed by a summation,



 

where are the individual IIR filters and the corresponding weights. The rationale behind this design is that each IIR section  is used to equalise the spectral notch that occurs at frequency . The IIR filters will therefore have the shape of a peaking equaliser, whose centre frequencies are matched to the notches of the spherical Bessel functions. The size of the parallel structure M is determined by the number of   that lie within the frequency band of interest. The first filter is used to equalise the nth-order nulls at f = 0 whereas the other filters are used to equalise the zeros at . Since the filter design process is the same for different harmonic orders, the superscript (·)(n) will be omitted in the remainder of this paper.

 

An important consideration in the filter design process is limiting the maximum magnitude boost (or dynamic range) of the frequency response function. Otherwise, the radial filter would not only equalise the nulls but also excessively increase unwanted components, such as the self-noise of the microphone capsules, errors due to position inaccuracies, or sound waves that might be caused by scattering inside the open sphere. Moreover, the computation of the null frequencies  relies on an estimate of the speed of sound c. A deviation from the actual value might introduce additional spectral distortions. In order to achieve robustness against such noise, errors and inaccuracies, the Tikhonov regularisation scheme is commonly employed in inverse filter design. The maximum gain of the filter is explicitly controlled by the regularisation parameter as will be described below.

 

In the proposed method, the regularisation of each IIR filter is performed independently. In the Laplace domain, the regularised inverse of the nth-order zero at s = 0 can be expressed as

 

 

where  is the regularisation parameter. The second equality shows that the inverse filter consist of n zeros at s = 0 and 2n poles at . It can be shown that are uniformly distributed on a circle with radius 

 

 

which coincide with the poles of a Butterworth filter11 [Sec. 2.2]. As β0 0, H0(s) approaches the direct inverse . The same regularisation can be applied for the complex conjugate poles at

 

 

which consist of 2 zeros on the imaginary axis and 4 poles. The closed-form expression of reads

 

 

where the index l corresponds to each quadrant of the complex plane. In (7) and (9), increasing the regularisation parameter  pushes the poles away from the imaginary axis, thereby reducing the peak of the magnitude response. The peak magnitude of each IIR filter denoted by is related to the regularisation parameter by or equivalently by . The prescribed maximum boost is commonly given in dB which will be denoted by .

 

It is worth emphasising that half of the poles in (6) and (8) are on the right half-plane in the Laplace domain. For a stable realization, the corresponding poles have to be implemented as non-causal systems. Since the causal and noncausal poles are symmetric with respect to the imaginary axis, the forward-backward technique can be used which allows a zero-phase filtering12 [Sec. 13.12]. This constitutes an offline processing, where we (i) time-reverse the input signal, (ii) filter with the noncausal poles, (iii) time-reverse the signal again, and (iv) filter with the causal poles. A quasi-online implementation is possible by using a frame-by-frame processing and truncating the output. This is, however, beyond the scope of this paper and will not be considered.

 

The Laplace-domain filters (6) end (8) are discretised by using the matched -transform13 [Sec. 8.3.4]. The zeros and poles in the Laplace domain are mapped into the -domain by

 

 

where   and    are the -domain zeros and poles, respectively. The sampling period is denoted by   for a given sampling frequency . The matched -transform preserves the characteristic frequencies of the individual zeros and poles as long as they are within the Nyquist limit. Applying this mapping to the continuous-time filters (6) and (8) yields discrete-time IIR filters,

 

 

and

 

respectively.

 

A common practice in radial filter design is to impose a limit on the maximum magnitude boost. This can be achieved by optimising the regularisation parameters  and the weights  in the proposed filter structure (5). Instead of performing an extensive parameter search, a simple two-step optimisation is used in this paper. In the initial step, M prototype filters are constructed by using (12) and (13). The variables associated with the first step will be indicated by the hat symbol . The regularisation parameters are set to where gmax denotes the maximum allowable boost (Gmax in dB). The initial weights  are computed in such a way that the equalised response matches a target response t(ω),

 

 

where . The target response t(ω) can be a unit magnitude or a low-pass function depending on the frequency range of interest. Minimising the mean squared error between the equalised response and the target response at control frequencies ωk, k = 0, 1, . . . , K − 1 (K > M) yields the least squares solution, which can be expressed in matrix form as

 

 

where are the weights,  the spherical Bessel function filtered by each IIR filter, and  the target response. The individual peaks of the combined response can be approximated by  · gmax, on the assumption that each IIR filter  is dominant around ω = ωµ. The deviation from the prescribed boost gmax is compensated in the second step. A new set of filters () are constructed with regularisation parameters which correspond to maximum boost levels . Similar to (15), the optimal weights  in the minimum mean squared error sense are given as14,15,

 

 

where and . The final design is defined by  and () [cf. (5)]. The iterative process can be continued for a further refinement. However, the design framework described above was found to be sufficient for the purpose of radial filter design.

 

4 EVALUATION

 

In order to evaluate the proposed radial filter design, a numerical experiment is carried out for an open sphere with a radius of r0 = 0.25 m. The number of IIR sections is M = 15 for the 0th-order harmonic mode and M = 16 for higher modes (n ≥ 1). We consider different levels of regularisation that correspond to maximum boosts of Gmax = 10, 30, 40, 50 dB. The sampling frequency is set to = 48 kHz and the speed of sound is assumed to be c = 343 m/s.

 

 

Figure 1: Magnitude and phase responses of the proposed radial filters (modal orders n = 0, . . . , 3, radius r0 = 0.25 m, number of IIR sections M = 15 for n = 0 and M = 16 for n ≥ 1, sampling frequency = 48 kHz, uniform distribution of K = 1024 control frequencies in [1 Hz, 12 kHz]). The thin gray lines indicate the response of the individual IIR filters () that are scaled with the respective weights  [cf. Sec. 3]. The modal frequency responses  and equalised responses are indicated by blue and black curves, respectively.

 

4.1 Modal equalization

 

The frequency-domain properties of the radial filters are first examined. In Fig. 1, the transfer functions of the radial filters (n = 0, 1, 2, 3) are depicted in orange. Overall, the magnitude responses stay below the prescribed maximum boost Gmax = 40 dB. A slight overshoot (< 1.8dB) is observed at high frequencies. Since the regularisation becomes stronger at higher frequencies, each IIR filter (depicted with thin gray lines) is no longer dominating the combined response around the poles. This causes an underestimation of the spectral peak when determining the regularisation parameter. However, the deviation is marginal. The solid black lines in Fig. 1 indicate the equalised responses which is the frequency-domain product of the spherical Bessel functions (blue) and the radial filters (orange). A flat magnitude response can be observed over a wide frequency range. Spectral notches can be seen at discrete frequencies which coincide with the nulls of the spherical Bessel functions. The IIR sections at higher frequencies require stronger regularisation to compensate the constant decay of the spherical Bessel functions. This leads to an increased deviation from the target response. The compensation of the phase is also successful. The equalised responses exhibit zero-phase except for minor deviations around the pole frequencies. Slight fluctuations are due to the phase errors introduced by the matched -transform that was used to map the Laplace-domain poles into the -domain [cf. Sec. 3].

 

4.2 Modal beamforming

 

The radial filters are now applied to a simulated measurement, where Q = 120 omnidirectional microphones are distributed on the sphere by using a t-design of degree 15. The highest spherical harmonic order that can be computed by numerical integration is N = 7. This sets the upper bound of the operation frequency range, referred to as the spatial aliasing frequency falias = c N/(2πr0) 1528.5 Hz1[Sec. 4.1]. A sound field of a single plane wave is simulated which arrives from the positive x-axis, i.e. (θpw, φpw) = (90, 0). The plane wave carries a band-limited pulse with cut-off frequency of 2.5 kHz, whose spectrum is a zero-phase sixth-order Butterworth low-pass. The microphone signals are simulated by using fractional delay filters based on the Lagrange interpolation16.

 

The time-domain signals are transformed into the spherical harmonic domain by using the pseudo inverse of the spherical harmonic matrix. This can be expressed as a numerical integration1 [Eq. (3.32)],

 

 

where (r0, θq, φq) are the coordinates of the qth microphone. The quadrature weights denoted by ensure the orthogonality of the spherical harmonics up to the Nth order under the numerical integration. The spherical harmonic analysis (17) yields (N + 1)2 = 64 signals. The radial filtering operation, denoted by (·), is then applied to ,

 

 

yielding the coefficients of the plane wave amplitude functions . As discussed in Sec. 3, this requires an offline processing using the forward-backward filtering technique. Finally, the beamformer output is obtained by performing the spherical harmonic expansion for look directions (θl, φl),

 

 

No additional modal weighting or global equalization is applied. We evaluate the beamformer output for θl = 90and φl [180, +180). Since the configuration is nearly rotationally symmetric with respect to (θpw, φpw), this represents the results for all possible look directions reasonably.

 

Fig. 2a and Fig. 2b show the beamformer output in the time and frequency domain, respectively. The time-domain responses are normalized to 0 dB with respect to the peak value at on-axis and the frequency-domain responses to 0 dB with respect to the on-axis magnitude response at 1 kHz. It can be seen that the temporal response of the beamformer is more compact for smaller Gmax. A strong regularisation limits the spectral speak of the radial filter, thereby reducing the pronounced ringing in the time domain. Allowing a higher boost (e.g. Gmax = 50 dB) increases the temporal extent of the beamformer output. This might lead to negative perceptual effects, such as colouration.


 

Figure 2: Impulse responses and frequency responses (magnitude) of modal beamformer using an open spherical microphone array (Q = 120 microphones, radius r = 0.25 m, modal order N = 7, maximum boost level Gmax = 10, 30, 50 dB). A single plane wave arrives from (φpw, φpw) = (90, 0). The beamformer scans the horizontal plane with look directions φl [180, +180).

 

The influence of regularisation on the spatial resolution of the beamformer is well demonstrated in Fig. 2b. At each frequency, the modal attenuation depends on the spectral shape of the spherical Bessel functions and the regularisation parameter. This leads to a frequency-dependent beam width which is determined by the relative gain of the individual modes. Irrespective of Gmax, the width of the main lobe gradually increases towards the low frequencies This is mainly attributed to the nth-order zeros of jn(·) which can only be compensated to a limited extent and requires increasingly stronger regularisation. As the frequency approaches ω = 0, the higher-order modes are attenuated and the effective order of the beamformer decreases. Some local fluctuations of the beam pattern can also be observed. Two exemplary frequencies are as indicated by ‘+’ in Fig. 2b (middle). These frequencies (f = 686.0 Hz and f = 981.2 Hz) coincide with the spectral nulls of the 0th- and 1st-order spherical Bessel functions (cf. Fig. 1), where a stronger regularisation is required to keep the magnitude response within the permissible range. A relatively uniform main lobe and a flat on-axis magnitude response is achieved for Gmax = 50 dB which comes at the expense of a smeared temporal response and a risk of boosting unwanted noise.

 

5 CONCLUSION

 

A radial filter design was proposed for open spherical microphone arrays. In order to equalise the spectral notches exhibited by the modal transfer functions, a set of peaking IIR filters were used in a parallel structure. The Tikhonov regularisation was applied to the individual IIR filters, which results in repositioning of the poles. The proposed design enables an efficient implementation of the radial filtering compared to the frequency-domain processing which commonly requires a large DFT length. The spectral properties of the filters were shown to be appropriate both in terms of modal equalisation and regularisation. As a proof of concept, the proposed radial filter was used in a plane wave decomposition beamformer and the spatial properties of the output were examined. The proposed idea can be further extended. For instance, the zeros of the IIR filters can be shifted or removed to achieve a smoother magnitude response. A minimum-phase system can also be constructed by moving the non-causal poles to the left half-plane, which requires a deeper understanding of the perceptual impact of the resulting phase distortions.

 

6 REFERENCES

 

  1. B. Rafaely. Fundamentals of Spherical Array Processing. Springer, 2015.

  2. F. Zotter and M. Frank. Ambisonics. Springer, 2019.

  3. J. Meyer and G. Elko. A highly scalable spherical microphone array based on an orthonormal decomposition of the soundfield. In Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), volume 2, pages 1781–1784, Orlando, FL, USA, May 2002.

  4. S. Lösler and F. Zotter. Comprehensive radial filter design for practical higher-order Ambisonic recording. In Proc. 41st German Annu. Conf. Acoust. (DAGA), pages 452–455, 2015.

  5. H. Sun, T. D. Abhayapala, and P. N. Samarasinghe. Time domain spherical harmonic processing with open spherical microphones recording. Appl. Sci., 11(3):1074, 2021.

  6. F. Ma, T. D. Abhayapala, and P. N. Samarasinghe. Circumvent spherical bessel function nulls for open sphere microphone arrays with physics informed neural network. In Proc. 10th Forum Acusticum, pages 2169–2175, Torino, Italy, September 2023.

  7. P. C. Hansen. Rank-deficient and Discrete Ill-posed Problems: Numerical Aspects of Linear Inversion. SIAM, 1998.

  8. O. Kirkeby and P. A. Nelson. Digital filter design for inversion problems in sound reproduction. J. Audio Eng. Soc. (JAES), 47(7/8):583–595, 1999.

  9. E. Williams. Fourier Acoustics: Sound Radiation and Nearfield Acoustical Holography. Academic Press, 1999.

  10. F. W. J. Olver, D. W. Lozier, R. F. Boisvert, and C. W. Clark. NIST Handbook of Mathematical Functions Handback. Cambridge University Press, 2010.

  11. D. Schlichthärle. Digital Filters. Springer, 2014.

  12. R. G. Lyons. Understanding Digital Signal Processing. Prentice Hall, 1997.

  13. J. G. Proakis and D. G. Manolakis. Digital Signal Processing: Principles, Algorithms, and Applications. Prentice-Hall, 1996.

  14. M. Karjalainen and T. Paatero. Equalization of loudspeaker and room responses using Kautz filters: Direct least squares design. EURASIP J. Adv. Signal Process., 2007:1–13, 2006.

  15. B. Bank. Logarithmic frequency scale parallel filter design with complex and magnitude-only specifications. IEEE Signal Process. Lett., 18(2):138–141, 2011.

  16. T. I. Laakso, V. Valimaki, M. Karjalainen, and U. K. Laine. Splitting the unit delay. IEEE Sig. Process. Mag., 13(1):30–60, 1996.