Proceedings of the Institute of Acoustics

 

Evaluating Human Response to Air Source Heat Pump Noise for Sustainable Domestic Heating

 

V Acun¹

L Barton¹

T Cox¹

S Graetzer¹

J Hargreaves¹

M Radivan¹

A J Torija Martinez¹

D Waddington¹

D Wong-Mcsweeney¹

¹ Acoustics Research Centre, University of Salford, Greater Manchester, UK

 

1 INTRODUCTION

 

Air Source Heat Pumps (ASHPs) are a key technology for reducing domestic carbon emissions in the UK. They use ambient air for heating and cooling, significantly reducing fossil fuel dependence and promoting energy conservation. Their efficiency across various climates makes them suitable for residential and commercial use, aiding the transition to a sustainable future. ASHPs extract heat from outdoor air using a refrigerant in a closed loop. This heat is transferred indoors via a coil or heat exchanger. The main noise sources are the fan and compressor, which produce both broadband and tonal noise components, particularly at the blade passing frequency and its harmonics¹. Despite their benefits, noise emissions hinder ASHP adoption in the UK due to policy and regulatory constraints². Significant shortcomings of the current approach include insufficient consideration of acoustic character, especially tonality influenced by transient operating conditions, and the absence of differentiation in background noise levels (day vs. night, seasonal variability).

 

This paper reports the findings of a listening experiment concerned with evaluating human response to steady-state and transient ASHP noise under different ambient background noise levels and source distances. The study aims to provide insights to inform and enhance UK policy and regulations, facilitating the broader adoption of ASHPs and contributing to the nation’s net zero goals. Ultimately, our research aims to improve public living conditions and promote environmental sustainability.

 

2 METHODS

 

The experiment consisted of two parts. Part One focused on assessing human responses to different steady-state ASHP operating conditions. Participants listened to 20-second recordings of distinct ASHP operating states and evaluated their experience after each recording. In Part Two, participants listened to 60-second recordings of transient ASHP noise.

 

2.1 ASHP Recordings

 

Excerpts were taken from two recordings, each lasting eight minutes and thirty-four seconds, and were used as the basis for the audio stimuli. These recordings were chosen from a larger database compiled during an acoustic holography project, which utilised a circular array of 64 microphones³. The recordings were conducted in a semi-anechoic chamber, and the two chosen recordings were acquired by a microphone positioned in front of the fan outlet at a height of 126.9 cm. The selected recordings included transitions between different ASHP operating cycles. These recordings were calibrated to match their original sound levels and then trimmed to 20-second and 60-second durations for the different parts of the experiment. The 20-second samples, referred to as “Short,” were used in Part One of the study. Each sample featured a continuous recording of one of the distinct ASHP operating conditions: Minimum (Defrost), Moderate, or Maximum capacity. Multiple samples were prepared per operating condition using Audacity for each of the two original eight-minute recordings. This resulted in a total of twenty-six samples for Part One.

 

The 60-second samples were used in Part Two of the study. These samples, referred to as “Long”, were selected from sections of the original eight-minute recordings that featured distinct transitions between operating conditions. The two transient conditions examined were the heat pump transitions from Maximum to Minimum (Defrost) capacity and from Minimum (Defrost) to Maximum capacity. Additionally, samples of steady-state conditions, referred to as “Stationary,” were prepared for comparison against the transient conditions. Ten samples were prepared for Part Two, with each transient condition represented by two samples from the original recordings.

 

2.2 Ambient Background Noise

 

The Short and Long samples of ASHP noise recordings needed to be combined with ambient background noise to evaluate the effect of variations in background noise. For this purpose, shaped pink noise was used instead of a soundscape recording. This approach avoids introducing uncontrolled variables into the experiment, such as birdsong or speech. Audacity was used to generate pink noise and to apply a filter curve simulating the spectrum of traffic noise, following the BS EN ISO 717-1:2020 standard⁴. Two durations of this signal (20 and 60 seconds) were prepared for the different parts of the experiment.

 

2.3 Audio Reproduction and Calibration

 

A listening room within the Acoustic Laboratories of the University of Salford was used to calibrate the ASHP and ambient background noise samples for the audio stimuli. During this process, two key factors were considered: distance attenuation and outdoor-to-indoor sound propagation. In terms of distance attenuation, three source distances and corresponding sound levels were selected for the ASHP noise: 36.5 dB(A) at 15 metres, 40 dB(A) at 10 metres, and 46 dB(A) at 5 metres. The ASHP unit was assumed to be ground-mounted with a directivity factor of Q = 4. Two target levels were chosen for the ambient background noise samples to reflect rural nighttime and daytime conditions: 31.5 dB(A) for nighttime and 39.5 dB(A) for daytime⁵.
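
The three level and distance pairs above are consistent with simple point-source spreading, in which the level falls by about 6 dB per doubling of distance. The sketch below is an illustrative check only, not the calculation used in the study: it assumes free-field inverse-square spreading from the 5 m level, ignores ground, air absorption and barrier effects, and the function name `level_at_distance` is hypothetical. The directivity factor does not appear because it cancels when only level differences between distances are considered.

```python
import math


def level_at_distance(ref_level_db, ref_distance_m, distance_m):
    """Level at distance_m, assuming the level falls by 20*log10(r / r_ref) dB."""
    return ref_level_db - 20.0 * math.log10(distance_m / ref_distance_m)


# Reference pair taken from the text: 46 dB(A) at 5 m.
for r in (5.0, 10.0, 15.0):
    print(f"{r:4.0f} m: {level_at_distance(46.0, 5.0, r):.1f} dB(A)")
```

Under these assumptions the sketch reproduces the stated targets of 40 dB(A) at 10 metres and 36.5 dB(A) at 15 metres to one decimal place.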

 

The listening experiment’s audio system comprised an HP Omen 16 laptop with Python and Audacity, a Motu 4Pre audio interface, a Genelec 8030A loudspeaker, and a Genelec 7020B subwoofer. Calibration equipment comprised a Brüel & Kjær 2250 Sound Level Meter (SLM), a Dewesoft SIRIUSi-8xACC data acquisition unit, and a PCB Piezotronics microphone. Calibration involved playing the ASHP recordings and shaped pink noise, measuring LAeq,20s and LAeq,60s levels, and adjusting gains in Audacity to meet the target values. For the ASHP signal, the targets were the distance attenuation levels, which resulted in three versions of each sample, one per source distance; these levels were verified with the SLM. The shaped pink noise, in contrast, was calibrated to the daytime and nighttime rural background noise levels rather than to the distance attenuation targets.
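
The gain adjustment step described above amounts to a level difference in decibels, which corresponds to a linear scale factor on the waveform. The following is a minimal sketch of that arithmetic, assuming the reproduction chain is linear; the function name and the measured value in the example are hypothetical and are not taken from the study.

```python
def gain_to_target(measured_db, target_db):
    """Return (gain in dB, linear amplitude factor) that moves measured_db to target_db."""
    gain_db = target_db - measured_db
    # Levels are 20*log10 of an amplitude ratio, so invert with 10**(dB / 20).
    amplitude_factor = 10.0 ** (gain_db / 20.0)
    return gain_db, amplitude_factor


# Hypothetical example: a sample measured at 43.2 dB(A) that should play at 40 dB(A).
gain_db, factor = gain_to_target(43.2, 40.0)
print(f"apply {gain_db:+.1f} dB (amplitude x {factor:.3f})")
```

After applying the gain, the level would be remeasured, since any nonlinearity or rounding in the chain can leave a small residual error.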

 

After calibration, the ASHP and shaped pink noise samples were combined using Python, resulting in 86 short samples (84 test samples and 2 control samples) for Part One and 31 long samples (30 test samples and 1 control sample) for Part Two, all at a 96 kHz sampling rate and 32-bit resolution. These samples were further processed to simulate sound propagation through a partially open window (0.05 m²) following a specified filter curve⁶. Once the stimuli were generated, their sound levels were verified by remeasuring them with the Dewesoft system and SLM and comparing the results with the calculated levels. Before each session, pseudo-randomly selected samples were checked for discrepancies greater than 0.5 dB(A) and adjusted as needed.
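
A calculated level for a combined stimulus can be estimated by energetic summation of the two component levels, which is the usual assumption for uncorrelated signals. The sketch below illustrates this for the six pairings of ASHP and background targets given in Section 2.3. It is an assumption-laden illustration, not the authors' script: it applies to the mixture before the open-window filter, and the function name `combine_levels` is hypothetical.

```python
import math


def combine_levels(*levels_db):
    """Energetic sum of uncorrelated sources: 10*log10(sum of 10**(L / 10))."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))


ashp_targets = {"5 m": 46.0, "10 m": 40.0, "15 m": 36.5}   # dB(A), from Section 2.3
background_targets = {"night": 31.5, "day": 39.5}          # dB(A), from Section 2.3

for distance, ashp_db in ashp_targets.items():
    for period, background_db in background_targets.items():
        total = combine_levels(ashp_db, background_db)
        print(f"ASHP at {distance}, {period} background: {total:.1f} dB(A)")
```

The summation shows why the background matters most at the larger distances: when the two levels are close, the combined level rises by up to 3 dB, whereas a background well below the ASHP level adds only a fraction of a decibel.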

 

2.4 Experimental Procedure

 

The experimental procedure for this study was approved by the University of Salford Ethics Committee (reference no. 2024-0145-228). Each participant listened to 60 audio stimuli in total. Forty-two Short stimuli were pseudo-randomly selected from a pool of 84 for Part One, and 15 Long stimuli from a pool of 30 were selected for Part Two per participant. Additionally, each participant's playlist included three control samples that contained only ambient background noise with no ASHP signal, two for Part One and one for Part Two.
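
The per-participant selection described above can be sketched as sampling without replacement from each pool and then adding the control samples. The code below is a minimal illustration under stated assumptions: the stimulus identifiers are invented, the controls are assumed to be shuffled in with the test stimuli, and seeding by participant number is a choice made here for reproducibility rather than a documented detail of the study.

```python
import random


def build_playlist(participant_id):
    """Draw one participant's stimuli: 42 of 84 Short, 15 of 30 Long, plus 3 controls."""
    rng = random.Random(participant_id)  # seeded so a playlist can be regenerated

    short_pool = [f"short_{i:02d}" for i in range(1, 85)]  # hypothetical identifiers
    long_pool = [f"long_{i:02d}" for i in range(1, 31)]

    part_one = rng.sample(short_pool, 42) + ["control_short_1", "control_short_2"]
    part_two = rng.sample(long_pool, 15) + ["control_long_1"]

    # Assumption: control samples are presented at random positions within each part.
    rng.shuffle(part_one)
    rng.shuffle(part_two)
    return part_one, part_two


part_one, part_two = build_playlist(participant_id=1)
print(len(part_one), len(part_two), len(part_one) + len(part_two))
```

Sampling without replacement guarantees that no participant hears the same stimulus twice, and the totals match the 60 stimuli per participant stated in the text.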

 

The experiment took place in the Listening Room using the audio reproduction system detailed in Section 2.3. The audio stimuli were played through a single Genelec 8030A loudspeaker paired with a 7020B subwoofer. Participants were seated 2 metres from the loudspeaker. The experiment's Graphical User Interface (GUI) was developed using Python's PyQt5 library and displayed on a screen slightly above the loudspeaker (Figure 1).

 

Before starting the experiment, participants filled out a consent form, a general demographic information questionnaire, and the NoiseQ questionnaire⁷. A short training session was conducted to familiarise the participants with the GUI and the terms used in the experiment. The experiment used three questions to evaluate participants’ reactions to ASHP noise. The first two questions were based on the Valence and Arousal dimensions of the pictorial Self-Assessment Manikin (SAM)⁸ and rated on a nine-point scale. The third question, which measured annoyance, was based on PD ISO/TS 15666:2021⁹ and asked the participants, ‘Thinking about the sound environment you just listened to, what number from 0 to 10 best represents how bothered, annoyed, or disturbed you are by the air source heat pump noise? (0-Not Annoying at all, 10-Extremely Annoying).’

 

Figure 1: The experimental setup in the listening room. A Genelec 8030A loudspeaker was mounted on a stand in the middle of the room, with a Genelec 7020B subwoofer on the floor. Participants were seated 2 metres from the loudspeaker and interacted with the survey interface via a screen above the loudspeaker.

The experiment was divided into two parts. As explained in Section 2.1, 20-second Short stimuli were used in Part One to understand people’s responses to steady-state air source heat pump (ASHP) noise at different background noise levels and source distances. After listening to each Short audio stimulus, participants answered the three questions about their experience. Part Two used 60-second Long stimuli to evaluate how people respond to transient noise in ASHP operating cycles. The stimuli for Part Two included a change in the operating condition near the temporal midpoint of the recording. For simplicity, the stimuli used in this part only included the nighttime background noise level of 31.5 dB(A). Unlike Part One, participants evaluated their experience at two time points in this part. The playback automatically stopped 10 seconds after the transition, and participants were asked to rate their experience up to that point. After submitting their responses, they resumed the playback and were asked to rate their overall reaction to the stimulus once the playback ended. Participants were explicitly instructed to consider the entirety of the stimulus, including the part before the transition, before submitting their overall response. The experiment took, on average, over one hour, with participants taking a short break between the two parts.

 

2.5 Participants

 

The participants were recruited from the Psychoacoustics Participant Database at the University of Salford, via social media platforms, and through internal advertising. In total, 50 individuals participated in the experiment: 35 male (70%) and 15 female (30%), aged between 19 and 57 years (mean = 32.1, SD = 8.95). All participants were asked if they were aware of any hearing loss or had any hearing-related medical conditions such as tinnitus. Participants who self-reported a hearing loss or hearing-related medical condition did not proceed to the experiment.

 

3 RESULTS AND DISCUSSION

 

This section reports subjective responses to the noise produced by ASHPs, examining the effects of operating conditions, sound levels, and changes in psychoacoustic parameters. All statistical analyses were conducted using the R programming language. As previously mentioned, the experiment consisted of two parts: Part One focused on steady-state operating conditions and Part Two on transient conditions. For this reason, each statistical analysis, except for the correlation analysis, concerns a specific part of the experiment rather than the combined data.

 

 

Figure 2: Distribution of subjective responses to ASHP Noise across different parts of the experiment

 

Figure 2 presents the distribution of participants' Valence, Arousal, and Annoyance responses using a histogram and a Kernel Density Estimate (KDE) plot. Valence and Arousal responses were measured on an integer scale, while Annoyance used a continuous one. Valence responses were neutral towards ASHP noise, with similar distributions in both parts of the experiment. The noise generally caused low arousal, with infrequent higher responses. Annoyance responses showed a multimodal distribution with lower and middle-range peaks and were skewed towards lower values, indicating that most participants found the noise only mildly annoying. The consistent distributions between Part One and Part Two suggest stable subjective responses throughout the experiment.

 

One of the main goals of this study was to examine people’s subjective responses under different operating conditions. Figure 3 presents a histogram and KDE plots showing the distribution of Valence, Arousal, and Annoyance levels during various operating conditions in Part One of the experiment. This section evaluated responses to three distinct conditions based on operating capacity: Minimum (Defrost), Moderate, and Maximum. The figure indicates that participants’ perception of ASHP noise differs slightly between the Minimum (Defrost) and the Moderate and Maximum Capacity conditions.

 

Before conducting further statistical analysis to confirm the observed patterns for Part One, Levene’s Test of Homogeneity of Variance was performed on the three subjective response variables. The assumption of equal variances was met only for Valence (F(7) = 1.664, p = 0.11); it was violated for Arousal (F(7) = 11.023, p < 0.0001) and Annoyance (F(7) = 3.503, p = 0.0009). Consequently, a one-way ANOVA was conducted to compare the effect of operating conditions on Valence. Despite the differences observed in Figure 3, the ANOVA indicated that Valence responses did not differ significantly across the operating conditions (F(2,1929) = 2.411, p = 0.09). The non-parametric Kruskal-Wallis test was used for Arousal and Annoyance because they violated the homogeneity of variance assumption. As with Valence, the test showed no statistically significant differences in Arousal (χ²(2) = 4.44, p = 0.11) and Annoyance (χ²(2) = 5.29, p = 0.06) across the different operating conditions.
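
For readers unfamiliar with the Kruskal-Wallis test, the sketch below shows how its H statistic is obtained from pooled ranks, including the standard correction for tied ratings. It is an illustration in plain Python, not the R analysis used in the study; the function names and the toy ratings are hypothetical. The p-value helper is restricted to three groups, where the chi-square distribution with two degrees of freedom has the closed-form tail probability exp(-H/2).

```python
import math
from collections import Counter


def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic for two or more groups of ratings."""
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)

    # Assign each distinct value the average of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0
        i = j

    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)

    # Tie correction; this is zero (and H undefined) if every value is identical.
    ties = Counter(pooled)
    correction = 1.0 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    return h / correction


def p_value_three_groups(h):
    """Upper-tail chi-square probability for df = 2 (three groups only)."""
    return math.exp(-h / 2.0)


# Toy ratings for three operating conditions (illustrative values only).
minimum, moderate, maximum = [1, 2, 3], [4, 5, 6], [7, 8, 9]
h = kruskal_wallis_h(minimum, moderate, maximum)
print(f"H = {h:.2f}, p = {p_value_three_groups(h):.4f}")
```

Because the test works on ranks rather than raw ratings, it does not rely on the equal-variance assumption that Levene's test rejected for Arousal and Annoyance.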

 

Figure 3: Distribution of subjective responses in Part One to ASHP noise across the different operating conditions: Minimum (blue), Moderate (green), and Maximum (orange) capacity.

 

 

 

 

Figure 4: Impact of ASHP SPL on subjective response (Valence, Arousal, and Annoyance) at the different background noise levels used in Part One (continuous line: nighttime; dashed line: daytime).

 

In Part Two, participants evaluated the ASHP noise at two specific time points: 10 seconds after a transition in operating condition (Transition) and at the end of the recording (End), when they assessed the entire recording. The Shapiro-Wilk test indicated that the distributions of all three subjective response variables (Valence, Arousal, and Annoyance) deviated significantly from normality at both the Transition and End time points (p < 0.0001 for all variables). Therefore, the non-parametric Wilcoxon signed-rank test was performed to compare responses between these two time points. The results showed no statistically significant differences in Valence (W = 108514.5, p = .613), Arousal (W = 97283, p = .744), or Annoyance (W = 114777, p = .555) between the Transition and End points. This finding is open to different interpretations. While it could suggest that participants’ reactions remained consistent regardless of a transition in the operation cycle, it could also suggest that the perception of the transition dominates and largely determines the response to the overall signal. It should also be noted that the transitions used in this experiment, from Minimum (Defrost) to Maximum operating capacity and vice versa, represent the most extreme change in sound levels and psychoacoustic parameters. This suggests that transitions between less extreme operating conditions (e.g., from Minimum to Moderate capacity) are even less likely to elicit a significant reaction.

 

Another aim of the experiment was to investigate the impact of relative levels of ASHP and ambient background noise, as the current policy does not consider background noise. Figure 4 illustrates the Valence, Arousal, and Annoyance responses as a function of ASHP sound levels at daytime (39.5 dB(A)) and nighttime (31.5 dB(A)) background noise levels. These responses were recorded in Part One of the experiment under steady-state operating conditions. The figure shows that as ASHP sound levels increase, both Arousal and Annoyance levels rise, indicating a positive correlation: higher ASHP noise levels are perceived as more annoying and stimulating. Conversely, Valence decreases with higher ASHP noise levels, indicating a negative correlation, as higher ASHP noise levels are perceived as less pleasant.

 

The green dashed lines in Figure 4 represent the daytime background noise level of 39.5 dB(A), while the solid line represents the nighttime level of 31.5 dB(A). The figure suggests that background noise can influence human responses to ASHP noise, with higher Valence and lower Annoyance and Arousal levels at higher background noise levels. This implies that background noise may mask some ASHP noise, leading to a more positive perception. The current policy, which determines ASHP noise levels, follows a simplistic approach that does not account for background noise levels, acoustic character, contextual factors, or seasonal variations. This oversight could create barriers to implementation and acceptance.

 

Table 1: Spearman’s correlation results between subjective response and psychoacoustic metrics and ASHP Sound Pressure Level (SPL).

 

 

 

The results of Spearman’s rho correlation analysis between participants’ subjective responses and the acoustic characteristics of the ASHP noise emissions are presented in Table 1. The psychoacoustic metrics were calculated using ArtemiS SUITE. Although the correlations with all the psychoacoustic metrics were statistically significant, Fluctuation exhibited a very low association with the subjective response variables. The remaining psychoacoustic metrics demonstrated trends consistent with those illustrated in Figure 4.

 

Arousal showed a weak to moderate positive correlation with Loudness (r(3518) = 0.359, p < 0.01), Roughness (r(3518) = 0.332, p < 0.01), and Tonality (r(3518) = 0.323, p < 0.01), and a weak negative correlation with Sharpness (r(3518) = -0.259, p < 0.01). Similarly, Annoyance had a statistically significant low to moderate positive correlation with Loudness (r(3518) = 0.388, p < 0.01), Roughness (r(3518) = 0.366, p < 0.01), and Tonality (r(3518) = 0.305, p < 0.01), as well as a weak negative correlation with Sharpness. In contrast, Valence exhibited a moderate negative correlation with Loudness (r(3518) = -0.381, p < 0.01) and Roughness (r(3518) = -0.367, p < 0.01), a low negative correlation with Tonality (r(3518) = -0.298, p < 0.01) and a low positive correlation with Sharpness (r(3518) = 0.296, p < 0.01). Fluctuation did not have more than a weak correlation with Valence, Arousal or Annoyance.

 

Additionally, correlation analysis between the three subjective response variables and ASHP sound levels supported the trends observed in Figure 4. There was a negative correlation between ASHP SPL and Valence (r(3518) = -0.323, p < 0.01), i.e. as the ASHP SPL increased, the perceived pleasantness decreased. There were moderate positive correlations between ASHP SPL and Arousal (r(3518) = 0.323, p < 0.01) and Annoyance (r(3518) = 0.359, p < 0.01).
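
Spearman's rho, used for the correlations above, is the Pearson correlation computed on ranks, with tied observations given their average rank. The sketch below illustrates the calculation in plain Python. It is not the R analysis used in the study, the function names are hypothetical, and the loudness values and ratings are invented solely to exercise the code.

```python
import math


def average_ranks(values):
    """1-based ranks of values, with ties sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        for k in order[i:j]:
            ranks[k] = (i + 1 + j) / 2.0
        i = j
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return covariance / (spread_x * spread_y)


# Invented example: loudness of five stimuli and one listener's annoyance ratings.
loudness_sone = [2.1, 3.4, 3.4, 5.0, 6.2]
annoyance = [1.0, 4.0, 3.0, 6.5, 6.0]
print(f"rho = {spearman_rho(loudness_sone, annoyance):.3f}")
```

A rank-based coefficient suits these data because the ratings are ordinal or bounded and, as reported below, not normally distributed.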

 

Lastly, a difference in perception between the two different background noise levels was observed in Part One, as shown in Figure 4, such that the response tended to be more negative when the ASHP noise was more audible relative to the background noise (i.e. less energetically masked). To confirm this pattern, further statistical analysis was conducted. The Shapiro-Wilk test indicated that none of the three subjective response variables were normally distributed across the two background noise levels. Consequently, a Wilcoxon signed-rank test was used to compare responses.

 

The test revealed no significant difference in Valence scores between the two background noise levels (W = 97,419.5, p = .357). However, a significant difference was found in Arousal scores (W = 119,609.5, p = .015). For Annoyance scores, the test indicated no significant difference (W = 149,564.5, p = .118). These findings suggest that background noise levels do not significantly affect Valence and Annoyance, but they do affect Arousal: arousal is higher when the ASHP sound emissions are more audible over the ambient background noise. This could be an important consideration in assessing the overall human response to ASHP noise.

 

4 CONCLUSION

 

This paper reported the findings from a two-part experiment on human responses to Air Source Heat Pump (ASHP) noise. The study compared participants’ Valence, Arousal, and Annoyance levels across different operating conditions, background noise levels, sound levels, and psychoacoustic metrics. Results indicated that the background noise level relative to the ASHP sound level significantly influenced Arousal but not Valence or Annoyance. Metrics like Loudness, Sharpness, Roughness, and Tonality correlated with subjective responses, while Fluctuation was only weakly correlated.

 

In Part Two, participants evaluated ASHP noise at two time points: after a transition in operating condition and at the end of the recording. Reactions to ASHP noise remained consistent, suggesting that either the perception of the transition dominated and determined participants’ overall response or their overall responses were not significantly altered even after the extreme transition.

 

The study also explored the relative levels of ASHP and ambient background noise, finding that higher background noise masks ASHP noise, leading to more positive perceptions. This also suggests that ASHP noise emissions are potentially perceived more negatively in rural areas, where background noise levels are considerably lower than in urban areas, which should be considered by policymakers. Visual inspection showed slight differences in responses during the Minimum (Defrost) condition compared to the Moderate and Maximum capacity conditions, but further analysis revealed no significant differences in overall subjective responses. These findings highlight the importance of considering background noise and acoustic character in ASHP design and placement to minimize their impact on human comfort and well-being, ensuring more positive perception and acceptance.

 

5 REFERENCES

 

  1. Wagner S, Carniel X, Rohlfing J, Bay K, Hellgren H. 3: Overview on Heat Pump Component Noise and Noise Control Techniques. 2020.
  2. Review of Air Source Heat Pump Noise Emissions, Permitted Development Guidance and Regulations. Department for Energy Security and Net Zero; 2023.
  3. Kasess CH, Reichl C, Waubke H, Majdak P. Perception Rating of the Acoustic Emissions of Heat Pumps. In: Forum Acusticum [Internet]. Lyon, France; 2020 [cited 2024 Apr 9]. p. 2453–8. Available from: https://hal.science/hal-03234210
  4. BS EN ISO 717-1:2020 Acoustics. Rating of sound insulation in buildings and of building elements. Airborne sound insulation. BSI; 2020.
  5. Skinner C, Grimwood C. The National Noise Incidence Study 2000/2001 (United Kingdom): Volume 1 - Noise Levels. United Kingdom: BRE Environment; 2000. (The National Noise Incidence Study 2000/2001). Report No.: 206344f.
  6. Waters-Fuller T, Lurcock D. NANR116: “Open/Closed Window Research” Sound Insulation Through Ventilated Domestic Windows. The Building Performance Centre, School of the Built Environment, Napier University; 2007 Apr.
  7. Schütte M, Marks A, Wenning E, Griefahn B. The development of the noise sensitivity questionnaire. Noise Health. 2007;9(34):15–24.
  8. Bradley MM, Lang PJ. Measuring emotion: The self-assessment manikin and the semantic differential. J Behav Ther Exp Psychiatry. 1994 Mar;25(1):49–59.
  9. PD ISO/TS 15666:2021 Acoustics. Assessment of noise annoyance by means of social and socio-acoustic surveys. BSI Standards Limited; 2021.

 

6 ACKNOWLEDGEMENTS

 

This research is being done as a part of the Innovate UK-funded Future Homes Project. We are grateful to Christian Kasess of the Austrian Academy of Sciences and Christoph Reichl of the Austrian Institute of Technology for their advice and for providing access to their Air Source Heat Pump recording database.