Proceedings of the Institute of Acoustics, Volume 46, Part 2

Speech babble as sound masking in open-plan offices – a statistical approach

P Leonard, Max Fordham LLP
A Chilton, Max Fordham LLP

1 INTRODUCTION

A study has been undertaken to establish the conditions under which speech babble can provide useful masking noise in open-plan office settings. Previous studies have shown that the distraction caused by speech diminishes when there are multiple simultaneous voices – effectively creating an unintelligible ‘babble’. Where speech babble is not distracting to work, it can potentially provide useful masking sound, improving privacy. This is more likely to be the case for higher occupancy offices and for work activities involving a greater proportion of speech.

A simple statistical model based on a binomial distribution of talkers has been used to determine the minimum office occupancy for which speech babble masking may be effective. The same model is used to determine the quantity of acoustic absorption per occupant that would be consistent with effective speech babble masking.

2 BASIS OF APPROACH

2.1 Motivation

The motivation for this study was the authors’ personal experience of working in the various Max Fordham open-plan offices around the country. The subjective impression is that in the larger (higher occupancy) main office, the open-plan work environment provides acceptable speech privacy and distraction conditions despite the absence of masking sound from either mechanical services or electronic masking systems. However, in the smaller (lower occupancy) offices, this is not the case. The hypothesis is that the noise generated by the occupants can provide an appropriate level of masking sound, but this is only likely to be successful if certain conditions are met.
This study seeks to investigate the key parameters that determine whether these conditions are met and to provide a simple framework that could be used for the design of open-plan offices.

2.2 Speech Babble as Masking Sound

A study by Jones & Macken [1] on Auditory Babble and Cognitive Efficiency investigated the disruptive effect of irrelevant speech babble on short-term memory. They found that the degree of disruption due to speech babble started to decrease (relative to that caused by a single voice) when the number of simultaneous voices increased beyond three. They speculate that this is because it might still be possible to ‘perceptually separate or stream out’ the individual voices when the auditory stimulus comprises a small number of voices, but that this is not possible as the number of voices increases. They conclude that “a certain degree of clamor, say, in a large office, will be less disruptive than in a smaller office, in which few sources of sound do not mask each other”.

A review of research on speech intelligibility in multiple-talker conditions [2] showed that speech is a less effective masker than steady-state noise: to achieve a 50% intelligibility level, interfering speech needs to be almost 10dB louder than steady-state noise. It was noted that this release from masking disappears when the number of interfering talkers is increased and that, in a number of studies, this occurs when there are four or more talkers.

Therefore, we can say that if there is a sufficient number of interfering voices then they will no longer be distracting and will indeed perform the function of sound masking. The required number is likely to be at least four but may depend upon the spatial location of the voices.

2.3 Masking Sound Level

If speech babble is to act as a masking sound within an office environment, then it must be at a high enough level to mask a speech signal effectively.
BS EN ISO 3382-3:2012 [3] provides a model for the effect of speech intelligibility on the performance of cognitively demanding tasks and states that “the negative effects of speech on work performance start to vanish rapidly if the STI is below 0.5”. STI is difficult to predict with accuracy; however, signal to noise ratio can be used as a proxy for STI and is a much simpler parameter to work with in a basic model. A study by Lazarus [4] provides a correlation of signal to noise ratio with STI, the values of which are reproduced in Table 1.

Table 1: Correlation of Speech Intelligibility with Signal to Noise Ratio, taken from Lazarus [4]

In a study of the Speech Transmission Index in large spaces, Liu, Ma, Kang & Wang [5] measured the relation between speech intelligibility scores and the signal to noise ratio. The correlation coefficient R² between the two parameters was 0.927 and speech intelligibility crossed the threshold of 50% at a signal to noise ratio of ~0dB.

It is suggested that distraction due to direct speech sounds is unlikely if the level of direct speech rarely exceeds the ambient noise level.

2.4 Noise Levels

High ambient noise levels need to be avoided in open-plan offices to prevent discomfort, interference with tasks requiring concentration and, ultimately, interference with speech communication between occupants. The French Standards Institute [6] recommends different average levels of ambient noise depending on the type of open-plan office. These are summarised in Table 2.

Table 2: Target average activity noise levels from NF S31-199 [6]

The values in Table 2 relate to average noise levels. It is reasonable to expect that noise levels can be higher in the short term without causing excessive interference to working tasks, but the authors are not aware of any specific information covering this situation. Reference [7] gives an overview of studies measuring ambient noise level in open-plan offices and contact centres. The measurements are summarised in Table 3.
As an upper limit, it is suggested that short-term noise levels should not be so high as to have an unacceptable effect on speech communication between nearby co-workers. Table 4 shows the maximum steady noise levels for reliable speech communication at a distance of 1m, taken from British Standard 8233 [8]. For the purposes of this study, a level of 62dBA is adopted as the level above which the effect on speech communication is likely to be unacceptable.

Table 3: Overview of studies measuring ambient noise in open-plan offices and contact centres [7]

Table 4: Maximum steady noise levels for reliable speech communication at 1m (from BS 8233 [8])

2.5 Conditions to be Satisfied

It is suggested that the following conditions are necessary for speech babble to provide effective masking:

Condition 1: It must be the case for most of the time that either a minimum of four people are talking or, alternatively, that no-one is talking. This is based on the finding [1] that at least three voices must be present simultaneously to act as an unintelligible masking babble. A minimum of four talkers is required so that at least three voices are present to mask the fourth.

Condition 2: The ambient noise level should be less than 62dBA for most of the time. This condition is intended to avoid unacceptable interference with speech communication at 1m. A lower noise limit may be considered appropriate, in which case the calculations can be adjusted accordingly.

Condition 3: It should be the case for most of the time that there is no one talking within a distance of a particular listener that would result in the direct speech sound level exceeding the total reverberant sound level at the listener. This is to avoid excessive speech disturbance and is similar to the concept of distraction distance [3]. Again, it would be possible to adapt this approach to a different signal-to-noise threshold.

It is important to note that each condition introduces a threshold.
These relate respectively to the minimum number of simultaneous talkers, the ambient noise level and the limiting signal-to-noise ratio for distraction. These thresholds are put forward by the authors of this paper for the purposes of developing the analytical equations into design relationships. It would be equally possible to develop the approach with a different set of thresholds.

It is also important to note that each condition is to be met ‘most of the time’ as opposed to ‘always’ or ‘on average’. This relates to the statistical nature of open-plan office acoustics, which depend on a population of occupants. This statistical basis is a central aspect of the approach presented in this paper. The threshold for ‘most of the time’ is again put forward by the authors (as detailed in the subsequent sections). The methodology can again be adapted to alternative thresholds if considered more appropriate.

2.6 Simplifying Assumptions

The main simplifying assumptions of the model presented are as follows:

It is assumed that all occupants of the office are identical in terms of their speech level and the amount of time spent talking. This is clearly a simplification of a real environment, where there will be significant diversity. The assumption is made to allow a simple statistical model to be used.

Sources of noise other than speech are not included in the model. It was noticeable in studies of our own office environment that noise from keyboard typing and computer equipment is significant.

For the purposes of evaluating ambient noise level, the office is treated as a reverberant space. This assumption is made because it significantly simplifies the calculations, whilst acknowledging that large office spaces with highly absorbent ceilings are not fully reverberant spaces.

It is assumed that the Lombard Effect is applicable, i.e. occupants increase their speech level in response to higher ambient noise levels.
This was found to be the case in a previous study [7] by the same authors.

Occupants are randomly distributed throughout the space.

3 CONDITION 1 – MINIMUM NUMBER OF SIMULTANEOUS TALKERS

3.1 Evaluating Minimum Occupancy

In order to estimate the probability that at least four people are talking simultaneously, we can assume that each of the office occupants speaks, on average, for a fraction, 𝑡, of the time. The value 𝑡 is termed “average fractional talk time”. For example, an occupant who speaks for 12% of the time on average during the course of their work has 𝑡 = 0.12. The total occupancy of the office is 𝑁.

The probability, 𝑃(𝑁𝑠 = 𝑘), of the number of simultaneous talkers, 𝑁𝑠, being exactly 𝑘 at any given time can be evaluated using a binomial distribution function:

\[ P(N_s = k) = \binom{N}{k}\, t^{k}\, (1 - t)^{N - k} \tag{1} \]

For Condition 1 to be met, the number of simultaneous talkers must be either zero or at least four. We are therefore looking to evaluate the cumulative total probability, 𝑃𝑏𝑎𝑏𝑏𝑙𝑒, that 𝑘 is any value excluding 1, 2 and 3:

\[ P_{babble} = 1 - \left[ P(N_s = 1) + P(N_s = 2) + P(N_s = 3) \right] \tag{2} \]

Where: 𝑃(𝑁𝑠 = 1), 𝑃(𝑁𝑠 = 2) and 𝑃(𝑁𝑠 = 3) are evaluated using Eq. 1.

We now require that this condition should be met for ‘most of the time’ so that speech babble can generally be relied on to provide masking. We can do this by evaluating the conditions for which 𝑃𝑏𝑎𝑏𝑏𝑙𝑒 exceeds 0.9 (i.e. 90% of the time). Given that 𝑃𝑏𝑎𝑏𝑏𝑙𝑒 is a function only of 𝑁 and 𝑡, we can evaluate the minimum office occupancy, 𝑁𝑚𝑖𝑛, for which 𝑃𝑏𝑎𝑏𝑏𝑙𝑒 will exceed 0.9. These values are tabulated in Table 5 for various values of 𝑡.

Table 5: Minimum office occupancy, 𝑁𝑚𝑖𝑛, for which speech babble masking can be effective, shown for various values of fractional talk time, 𝑡

It can be seen that speech babble cannot be relied upon to provide speech masking unless there is a sufficient number of people in the office. Unsurprisingly, the minimum occupancy required decreases as the amount that people tend to speak increases.
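The calculation described in Section 3.1 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' own implementation: it assumes the standard binomial probability mass function for the number of simultaneous talkers, and the function names (`p_simultaneous`, `p_babble`, `min_occupancy`) and the search limit `n_max` are hypothetical. Because the probability of nobody talking is high in very small offices, the babble probability is not monotonic in 𝑁; the sketch therefore returns the occupancy above which the 0.9 threshold is exceeded for every larger 𝑁 tested, which is an assumed reading of ‘minimum occupancy’.

```python
from math import comb


def p_simultaneous(n_occupants: int, t: float, k: int) -> float:
    """Binomial probability that exactly k of n_occupants are talking,
    where each occupant talks for a fraction t of the time."""
    return comb(n_occupants, k) * t**k * (1 - t) ** (n_occupants - k)


def p_babble(n_occupants: int, t: float) -> float:
    """Probability that the number of talkers is zero or at least four,
    i.e. one minus the probability of exactly 1, 2 or 3 talkers."""
    return 1.0 - sum(p_simultaneous(n_occupants, t, k) for k in (1, 2, 3))


def min_occupancy(t: float, target: float = 0.9, n_max: int = 2000):
    """Smallest N such that p_babble exceeds `target` for N and for every
    larger occupancy up to n_max (hypothetical search limit).
    Returns None if the target is still not met at n_max."""
    last_fail = 0
    for n in range(1, n_max + 1):
        if p_babble(n, t) <= target:
            last_fail = n  # remember the largest occupancy that fails
    return None if last_fail == n_max else last_fail + 1


if __name__ == "__main__":
    for talk_time in (0.05, 0.12, 0.25):
        print(talk_time, min_occupancy(talk_time))
```

Running the script prints one occupancy value per talk-time fraction; changing `target` shows how sensitive the result is to the chosen definition of ‘most of the time’.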
3.2 Developing an Analytical Solution

The tabulated values in Table 5 were calculated using Eq. 1 and Eq. 2 but they do not give a direct analytical relationship between 𝑁𝑚𝑖𝑛 and 𝑡. In order to find an approximate analytical solution, an alternative definition of ‘most of the time’ can be used. In this method, it is proposed that the mean of the probability distribution minus one standard deviation should be equal to four. With reference to Figure 1, a large proportion of the area under the probability graph (i.e. the cumulative probability) is then associated with the number of simultaneous talkers being at least four. Note that this method ignores the situation where there are no speakers at all (𝑘 = 0).

Figure 1: Binomial probability distribution for N=50 and t=0.125 showing the proportions of the distribution lying either side of one standard deviation (1SD) below the mean. In the example distribution shown, 1SD below the mean occurs at a value of around 3.91

For the binomial distribution described by Eq. 1, the mean is equal to 𝑁𝑡 and the variance is equal to 𝑁𝑡(1 − 𝑡). Noting that the standard deviation is the square root of the variance, we get the following equation:

\[ N t - \sqrt{N t (1 - t)} = 4 \tag{3} \]

This can be rearranged to a quadratic in \(\sqrt{N}\) and solved as:

\[ N_{min} = \frac{\left( \sqrt{1 - t} + \sqrt{17 - t} \right)^{2}}{4t} \tag{4} \]

Given that 𝑡 is small (unlikely to exceed 0.5), √(1 − 𝑡) can be approximated by the first two terms of its Taylor Series (1 − 𝑡/2). Also using (17 − 𝑡) ≈ 17, we can produce the following approximated version of Eq. 4:

\[ N_{min,approx} \approx \frac{\left( 1 + \sqrt{17} - t/2 \right)^{2}}{4t} \tag{5} \]

Comparing the values given by Eq. 5 to those evaluated previously using Eq. 1 and Eq. 2 shows that this is a useful, simplified approximation.

Table 6: A comparison between Nmin,approx values calculated using Eq. 5 and those, Nmin, calculated long-hand using Eq. 1 and Eq. 2

4 CONDITION 2 – LIMIT ON AMBIENT NOISE LEVEL

Using the model developed by Rindel [9], and assuming a Lombard coefficient (slope) of 0.5, the ambient noise level, 𝐿𝑁,𝐴, can be evaluated using the following relationship:

\[ L_{N,A} = 93 - 20 \log_{10}\left( \frac{A}{N_S} \right) \ \text{dB} \tag{6} \]

Where 𝐴 is the total acoustic absorption in the space and 𝑁𝑆 is the number of people speaking simultaneously.

To evaluate an ambient noise level that is not normally exceeded, 𝐿𝑁,𝐴[𝑀+1𝑆𝐷], we will use a value for 𝑁𝑆 that is one standard deviation above the mean number of simultaneous talkers. With reference to the discussion leading to Eq. 3, we get:

\[ N_{S[M+1SD]} = N t + \sqrt{N t (1 - t)} \tag{7} \]

Where 𝑁𝑆[𝑀+1𝑆𝐷] is the number of simultaneous speakers at one standard deviation above the mean, 𝑁 is the total occupancy of the office and 𝑡 is the fractional talk time.

Rearranging Eq. 6 and using Eq. 7, we get:

\[ A = \left( N t + \sqrt{N t (1 - t)} \right) \cdot 10^{\left( 93 - L_{N,A[M+1SD]} \right)/20} \tag{8} \]

or, alternatively:

\[ \frac{A}{N} = \left( t + \sqrt{\frac{t (1 - t)}{N}} \right) \cdot 10^{\left( 93 - L_{N,A[M+1SD]} \right)/20} \tag{9} \]

Condition 2 is that the ambient noise level does not normally exceed 62dBA. This gives us the following expression for the minimum quantity of absorption (expressed per occupant):

\[ \frac{A_{min}}{N} = 10^{1.55} \left( t + \sqrt{\frac{t (1 - t)}{N}} \right) \approx 35.5 \left( t + \sqrt{\frac{t (1 - t)}{N}} \right) \tag{10} \]

Table 7: Minimum Sabines per occupant required to keep ambient noise levels within 62dBA for most of the time. Values shown for different values of t and at extreme values of N.

5 CONDITION 3 – LIMIT ON SPEECH DISTRACTION

As in the previous section, the model developed by Rindel [9] is used with a Lombard coefficient (slope) of 0.5, so that for any given number of simultaneous speakers 𝑁𝑆, where 𝑁𝑆 > 0, the ambient noise level from people speaking, 𝐿𝑁,𝐴, is given by Eq. 6. The estimated speech level at distance 𝑟, considering the Lombard effect, is described by Eq. 11:

\[ L_{S,A}(r) = 79 - 10 \log_{10}\left( \frac{A}{N_S} \right) - 20 \log_{10}(r) \ \text{dB} \tag{11} \]

The distance, 𝑟𝑆𝑁𝑅=0, at which the direct speech level is equal to the reverberant speech level (Signal to Noise, SNR, equal to 0dB) can be calculated by setting Eq. 6 equal to Eq. 11 and making 𝑟 the subject:

\[ r_{SNR=0} = \sqrt{ 10^{-1.4} \cdot \frac{A}{N_S} } \tag{12} \]
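The relationships used for Conditions 2 and 3 can be checked numerically with a short Python sketch. It is illustrative only and rests on stated assumptions: the ambient level is taken as 93 − 20·log10(𝐴/𝑁𝑆) dB, and the speech level at 1m as 55 + 0.5·(𝐿𝑁,𝐴 − 45) dB, which are the forms understood here to come from Rindel's model [9] with a Lombard slope of 0.5; direct sound is assumed to fall by 20·log10(𝑟) with distance. All function names are hypothetical.

```python
from math import log10, sqrt

LOMBARD_SLOPE = 0.5  # Lombard coefficient stated in the text


def talkers_mean_plus_1sd(n_occupants: int, t: float) -> float:
    """Number of simultaneous talkers one standard deviation above the
    binomial mean (mean N*t, variance N*t*(1-t))."""
    return n_occupants * t + sqrt(n_occupants * t * (1 - t))


def ambient_level(absorption: float, n_talkers: float) -> float:
    """Ambient speech noise level in dBA (assumed Rindel form)."""
    return 93.0 - 20.0 * log10(absorption / n_talkers)


def min_absorption_per_occupant(n_occupants: int, t: float,
                                limit_dba: float = 62.0) -> float:
    """Absorption per occupant (m^2 sabins) that keeps the ambient level at
    `limit_dba` when mean + 1SD occupants are talking."""
    n_talkers = talkers_mean_plus_1sd(n_occupants, t)
    return n_talkers * 10 ** ((93.0 - limit_dba) / 20.0) / n_occupants


def direct_speech_level(absorption: float, n_talkers: float, r: float) -> float:
    """Direct speech level at distance r (m), with the 1 m level raised by
    the Lombard effect (assumed form: 55 + slope * (L_N,A - 45))."""
    level_1m = 55.0 + LOMBARD_SLOPE * (ambient_level(absorption, n_talkers) - 45.0)
    return level_1m - 20.0 * log10(r)


def snr0_radius(absorption: float, n_talkers: float) -> float:
    """Distance at which direct and ambient speech levels are equal."""
    return sqrt(10 ** -1.4 * absorption / n_talkers)


if __name__ == "__main__":
    n, t = 50, 0.125
    a_total = min_absorption_per_occupant(n, t) * n
    n_s = talkers_mean_plus_1sd(n, t)
    r0 = snr0_radius(a_total, n_s)
    print(round(ambient_level(a_total, n_s), 2))            # ambient level, dBA
    print(round(direct_speech_level(a_total, n_s, r0), 2))  # equal at r0
    print(round(r0, 2))                                     # radius, m
```

With the absorption set to the Condition 2 minimum, the ambient level sits at the 62dBA limit and the direct and ambient levels coincide at the computed radius, which is a useful consistency check on the two level expressions.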
The expected number of people talking, 𝐸(𝑁𝑆𝑁𝑅≥0), within the area given by radius 𝑟𝑆𝑁𝑅=0 is found by multiplying that area by the number of people speaking per unit area, 𝑁𝑠/(𝑁·𝑆𝑝𝑝), where 𝑆𝑝𝑝 is floor area per person, or occupancy density:

\[ E(N_{SNR \geq 0}) = \pi\, r_{SNR=0}^{2} \cdot \frac{N_S}{N\, S_{pp}} \tag{13} \]

Substituting Eq. 12 into Eq. 13 and simplifying (1/(π·10^−1.4) ≅ 8):

\[ E(N_{SNR \geq 0}) = \frac{A}{8\, N\, S_{pp}} \tag{14} \]

Assuming a Poisson distribution, the probability that the number of people talking within the radius is equal to zero is:

\[ P(N_{SNR \geq 0} = 0 \mid N_S > 0) = \exp\left( -\frac{A}{8\, N\, S_{pp}} \right) \tag{15} \]

This accounts for the case where the number of people speaking within the room, 𝑁𝑆, is >0, but the number of people speaking within the radius, 𝑟𝑆𝑁𝑅=0, is 0. The case for the number of people speaking within the room, 𝑁𝑆, being zero (where, by definition, the number of people speaking within the radius is also zero) must also be accounted for. With reference to Eq. 1, the probability of there being no talkers within the room is:

\[ P(N_S = 0) = (1 - t)^{N} \tag{16} \]

Therefore, the total probability of there being no talkers within the radius, including the case of there being no talkers within the room as a whole, is equal to:

\[ P(N_{SNR \geq 0} = 0) = (1 - t)^{N} + \left[ 1 - (1 - t)^{N} \right] \exp\left( -\frac{A}{8\, N\, S_{pp}} \right) \tag{17} \]

For a given value of 𝑁, 𝑡 and 𝑆𝑝𝑝, the maximum value of 𝐴/𝑁 can be found such that the probability of no-one speaking within the radius 𝑟𝑆𝑁𝑅=0 achieves some specific value. An initial suggestion is to achieve this for 90% of the time. Setting Eq. 17 to equal 90% and rearranging for 𝐴𝑚𝑎𝑥/𝑁:

\[ \frac{A_{max}}{N} = 8\, S_{pp} \ln\left( \frac{1 - (1 - t)^{N}}{0.9 - (1 - t)^{N}} \right) \tag{18} \]

Table 8: Maximum Sabines per occupant required to limit speech distraction within direct sound distraction radius for most of the time. Values shown for different values of t and at extreme values of N. Calculated for 𝑆𝑝𝑝 = 10m²

The value for 𝐴𝑚𝑎𝑥/𝑁 does not meaningfully change for varying values of 𝑡 or 𝑁 and the following simplifications can be made. Given that 𝑡 is small (unlikely to exceed 0.5) and 𝑁 is large (usually greater than 10), (1 − 𝑡)^𝑁 is negligible. Also noting that ln(1) = 0, Eq. 18 simplifies to:

\[ \frac{A_{max}}{N} = -8\, S_{pp} \ln(0.9) \approx 0.84\, S_{pp} \tag{19} \]

Therefore, in order to ensure that 𝐴𝑚𝑎𝑥 > 𝐴𝑚𝑖𝑛, a requirement for 𝑆𝑝𝑝 can be defined (combining Eq. 10 and Eq. 19) as:

\[ S_{pp} > S_{pp,min} = \frac{A_{min}/N}{-8 \ln(0.9)} \approx 42 \left( t + \sqrt{\frac{t (1 - t)}{N}} \right) \tag{20} \]

6 CONCLUSIONS

By using a simple statistical model and setting out three conditions under which speech babble may provide effective masking in an open-plan office, it has been possible to establish design limits on the minimum occupancy, 𝑁𝑚𝑖𝑛, the minimum floor area per person, 𝑆𝑝𝑝,𝑚𝑖𝑛, and on the amount of acoustic absorption per occupant, 𝐴/𝑁. These limits could potentially be used to inform the design of open-plan offices. For example, if providing a new open-plan office space for a known number (and talk time behaviour) of occupants, the sequential design steps would be as follows:

1. Check if 𝑁 exceeds 𝑁𝑚𝑖𝑛 (using Eq. 4). If it does, then continue to Step 2. If it does not, then provide a quantity of acoustic treatment exceeding 𝐴𝑚𝑖𝑛/𝑁 and consider the use of electronic or mechanical services sound masking.

2. Check if 𝑆𝑝𝑝 exceeds 𝑆𝑝𝑝,𝑚𝑖𝑛 (using Eq. 20). If it does, then continue to Step 3. If it does not, then either increase the office floor area appropriately and continue to Step 3, or provide a quantity of acoustic treatment exceeding 𝐴𝑚𝑖𝑛/𝑁 and consider the use of electronic or mechanical services sound masking. (See also Note A, below).

3. Provide a quantity of acoustic treatment exceeding 𝐴𝑚𝑖𝑛/𝑁 but not exceeding 𝐴𝑚𝑎𝑥/𝑁 (refer to Eq. 10 and Eq. 19). (See also Note A, below).

Note A: This study takes no account of the effect of barriers or screening, which may offer alternative design solutions. For example, with appropriate use of barriers, it may not be necessary for 𝑆𝑝𝑝 to exceed 𝑆𝑝𝑝,𝑚𝑖𝑛 and it may be acceptable for 𝐴𝑚𝑎𝑥/𝑁 to be exceeded.

It is suggested that further work should be undertaken to validate the model against real life situations, preferably using post-occupancy evaluation to determine if occupants consider the acoustics to adversely affect their work activities.
Validations of this type will inform the thresholds applied to each of the three conditions on which the design relationships are based.

7 REFERENCES

[1] D. M. Jones and W. J. Macken, “Auditory Babble and Cognitive Efficiency: Role of Number of Voices and Their Location,” Journal of Experimental Psychology: Applied, vol. 1, no. 3, pp. 216-226, 1995.
[2] A. W. Bronkhorst, “The Cocktail Party Phenomenon: A Review of Research on Speech Intelligibility in Multiple-Talker Conditions,” Acta Acustica, vol. 86, pp. 117-128, 2000.
[3] BSI, BS EN ISO 3382-3:2012: Acoustics - Measurement of Room Acoustics Parameters, 2012.
[4] H. Lazarus, “Prediction of verbal communication in noise - A development of generalized SIL curves and the quality of communication (Part 2),” Applied Acoustics, vol. 20, no. 4, pp. 245-261, 1987.
[5] H. Liu, H. Ma, J. Kang and C. Wang, “The speech intelligibility and applicability of the speech transmission index in large spaces,” Applied Acoustics, vol. 167, 2020.
[6] AFNOR, NF S31-199:2016: Acoustique – Performances acoustiques des espaces ouverts de bureaux, 2016.
[7] P. Leonard and A. Chilton, “The Lombard Effect in Open Plan Offices,” in Proceedings of the Institute of Acoustics, 2019.
[8] BSI, BS 8233:2014: Guidance on sound insulation and noise reduction for buildings, 2014.
[9] J. H. Rindel, “Verbal communication and noise in eating establishments,” Applied Acoustics, vol. 71, no. 12, pp. 1156-1161, 2010.