Sustainability and artificial intelligence – an energy problem or big data opportunity?

As part of the sustainability series of articles in Acoustics Bulletin, it seems time to discuss AI, which is gathering pace in its influence on the work acousticians do.

By Peter Rogers, in discussion with Dr Andrew Mitchell (Lecturer in Artificial Intelligence and Machine Learning for Sustainable Construction at the Bartlett School of Sustainable Construction, University College London)

Above: Dr Andrew Mitchell

Our digital energy footprint is a difficult thing to trace and measure. It covers the total energy consumption of any machine being used and the energy consumed while running software, as well as any energy expended in developing the tool being utilised. Every email, text, message or call therefore has associated upstream emissions which contribute to the overall carbon footprint of a business or organisation. Quantifying that meaningfully is a challenge for accurate and responsible sustainability practice, yet all of it must be considered. This is why I asked a well-respected acoustician, who now lectures in AI and whose other area of work is soundscapes, to discuss the topic with me.

Following the IOA and UKAN+ meetings held over the past year or so, I’ve had a growing concern about the associated energy use. I felt that exploring this topic through the lens of sustainability, in a conversational style, may be helpful to members. In this article we unpack this particular ‘can of digital worms’. Questions I have included:

• should the providers in the supply chain (such as Microsoft, Apple, Google, Meta etc) bear the responsibility for addressing the emissions generated by their services; or
• should emissions related to AI use remain a relevant concern for the acoustician, to be considered in order to take responsibility for digital footprints when robustly establishing baseline emissions?
Above: ChatGPT-4o-generated Above: Claude AI-generated Above: Gemini-generated

Artificial intelligence is a potential game-changer, from big data processing to image creation. This article is dedicated to exploring the use and efficacy of AI in acoustics as a credible part of the industry tool kit, which can also be part of delivering sustainable outcomes. This is a question I will return to once we have heard from Dr Mitchell.

I began by asking three generative AI tools (Gemini, Claude, ChatGPT) the same (very polite, for some reason) question: “Please generate an image I can use in an article about acoustics and AI and also tell me what energy it took to create it in GHG equivalents”. The responses on the energy used in their generation varied from ‘negligible’ from Gemini, to ‘far less than 0.001 kg CO₂e’ from Claude, and ‘~0.024 kg CO₂e (or 24 grams CO₂e)’ from ChatGPT, the last with a comparison to driving 150m in a petrol car. So, the best guess would be that the three images above could be equivalent to the emissions of a ½ km drive in a petrol car. That is worth thinking about given the scale of use now taking place across our industry.

I asked Dr Mitchell to set out how the energy use of different generative AI models varies, and which are the most energy intensive, particularly in his own field of acoustics.

AI and energy consumption – Dr Mitchell’s response:

AI is quickly infiltrating our daily professional lives, whether through large language models (LLMs) like ChatGPT that summarise documents, or through text-to-image models that help to add colour to our presentations. Concerns have been raised about the energy requirements of these generative AI models and whether they have a place alongside sustainability initiatives. Articles in the BBC¹, MIT Technology Review² and Nature³ have all pointed to the complicated issue of AI energy consumption, and to how energy demands will only grow as generative AI becomes more commonplace.
In acoustics, my work is dedicated to developing tools for advancing soundscape engineering – the application of soundscape principles to acoustics engineering practice. Towards this, I develop models which can describe urban soundscapes and predict people’s perception of, and emotional response to, sounds. We’ve used classic machine learning and statistical techniques (which typically have minimal computational and energy costs) such as clustering analysis, random forest models and multi-level regression, as well as more complex and modern AI models. We’re also working on generative AI models which allow users to feed in an image or video of a location and generate audio of the soundscape it represents. This works in a similar way to image generation models like Stable Diffusion, which transform a text prompt to produce an image.

When researchers talk about AI models in acoustics, they could be talking about generative AI (like the examples above), but more likely they mean much smaller and more specialised models. These might range from machine learning models for data analysis – often effectively no different from, or more power-intensive than, a linear regression – to BirdNET, a neural network for AI-powered bird sound recognition that can run on smartphones. AI is a very broad umbrella term, and most of its applications within acoustics will have very little energy use or carbon emissions implications beyond those of any other computer models.

Generative AI

When referring to AI in construction or acoustics, criticism is often directed towards general-purpose ‘AI tools’ (like ChatGPT) rather than industry-specific AI models. The growing size of these generative AI tools since their breakthrough is measured in parameters or storage space. When released in 2020, OpenAI’s then most powerful model, GPT-3, was made up of 175 billion parameters (requiring 350GB of storage). For reference, GPT-2 comprised ‘only’ 1.5 billion parameters.
The largest LLMs available now, such as GPT-4o, have additional functions such as image, video and audio generation, and GPT-4o is reported to comprise around 1.5 trillion parameters. This means that the amount of computing power needed to train and run these models has vastly increased. They also take vastly more computing power than essentially any other type of AI model.

Data centres

Data centres (which are not just for AI, but also power the internet and streaming services) are the lifeblood of these generative AI models and require massive amounts of power. This power requirement is often nationally significant: data centres are forecast to account for more than 20% of Ireland’s total energy consumption, for example⁴. The energy and carbon emission impacts of AI data centres are also quite complex to determine, especially where they are not paired with low-emission energy sources. An in-depth study from Hugging Face estimated that training an open LLM known as BLOOM produced 25 metric tonnes of CO₂e – roughly equivalent to 30 flights between London and New York⁵. However, this figure is comparatively low because BLOOM was trained on a French supercomputer powered primarily by nuclear energy. Training models in data centres in countries with more carbon-intensive power systems increases emissions significantly – by a factor of three to 20, depending on factors such as the model itself, the computing hardware and the energy sources. The same study estimates, for example, that training GPT-3 produced around 500 metric tonnes of CO₂e.

There’s an important distinction to make between training and use, which is especially relevant to this conversation about the sustainability concerns of companies using these models as consumers.

Energy costs

Training is the single most computationally and energy-intensive part of producing these models, but it is effectively ‘one-and-done’: all of us using ChatGPT are now benefiting from that same training effort.
On the other hand, each use of a model incurs its own new energy cost. Every time we send a query to a model in the cloud, a data centre consumes energy to run it through the trained model. For 1,000 queries, this is estimated to range from about 7 Wh for text summarisation, to 288 Wh for text generation, to 519 Wh for image generation⁴. For an individual user this may not be a concerning amount in itself – based on these numbers, relatively heavy use (like mine) could amount to the equivalent of running a laptop for about 20-30 hours – but regular use across a whole company or industry, or sustained use for things like AI agents, certainly adds up.

Ultimately, the energy and carbon impacts of these AI tools are complex and difficult to determine. They depend on the specific model and its size, the energy source for the data centres, the balance between training and usage, the task being performed and the pattern of use. What I’d most like to see is industries examining what they want these AI tools to achieve and, where possible, developing or identifying smaller, more targeted models for their specific needs. In many cases this brings other benefits, such as better task-specific performance or the ability to run smaller models locally (on your own computer or company server), which can reduce costs as well as ensuring data privacy (an extremely important topic for industrial AI use which we haven’t touched on).

That is one for another time, I think, Dr Mitchell – but it leaves everyone something to think about: what would they consider to be ‘targeted use’ of AI in acoustics?

Conclusions

The main point I took from speaking to Dr Mitchell about AI in acoustics is that yes, ‘generative AI’ tools are massively energy intensive, and are likely to become even more so in the future unless optimised for the task they are targeted at.
At first the impact may seem relatively small, but the scale of use makes it worth quantifying when taking a deep dive into the carbon-equivalent costs of your own digital footprint. It strikes me that the estimated near-doubling of energy use for image creation versus text generation deserves particular attention while fossil fuels remain in the energy mix for data centres – and that we should challenge whether image generation is needed or justified in each case.

We must be careful not to ignore the true impacts, nor to tar all forms of artificial intelligence with the same oily brush. ChatGPT and other large language models (LLMs) are currently used mainly for convenience and time-saving rather than substantial technical analysis, but that seems likely to change, and embracing that change by at least attempting to quantify the carbon footprint seems the responsible step. Industry-specific models, such as the ones Mitchell himself uses, are not comparable in scale or energy consumption to the massive data centres that power LLMs, and the direct energy use of computers is covered by Scope 1 and 2 footprinting. Even so, it may be sensible to make a science-based estimate of your digital footprint as a Scope 3 impact and report it, adding pressure on tech companies to clean up the energy that runs their data centres as part of their supply chain responsibilities.

Mitchell says (as Casey Crownhart states in her article for the MIT Technology Review⁵): “Rising electricity demand from AI is in some ways no different to rising demand from EVs, heat pumps or factory growth. It’s how we meet that demand that matters.” He concludes: “If we can move to generating energy from exclusively renewable sources, then the energy consumption of LLMs would not add to our digital energy footprint.” We are clearly not there yet, but I’ll certainly be adding an estimate of our digital footprint to our Scope 3 carbon footprint reporting.
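Putting Dr Mitchell’s per-query estimates together with a grid carbon intensity gives a sense of what such a Scope 3 estimate might look like. The following is a minimal back-of-envelope sketch in Python, assuming the per-1,000-query figures quoted above and an illustrative grid intensity of 0.2 kg CO₂e per kWh (roughly the recent UK average; carbon-intensive grids can be several times higher). The task names, laptop wattage and usage numbers are hypothetical, chosen only for illustration:

```python
# Back-of-envelope estimate of generative-AI energy use and CO2e as a
# Scope 3 line item. All figures are assumptions: the per-1,000-query
# energy costs are the estimates quoted in this article, and the grid
# carbon intensity and laptop power draw are illustrative.

WH_PER_1000_QUERIES = {
    "text_summarisation": 7.0,    # Wh per 1,000 queries (article estimate)
    "text_generation": 288.0,
    "image_generation": 519.0,
}

GRID_KG_CO2E_PER_KWH = 0.2   # assumed grid carbon intensity (approx. UK)
LAPTOP_WATTS = 50.0          # assumed average laptop power draw

def footprint(monthly_queries: dict) -> tuple[float, float, float]:
    """Return (kWh, kg CO2e, equivalent laptop-hours) for a month's use."""
    wh = sum(
        WH_PER_1000_QUERIES[task] * n / 1000.0
        for task, n in monthly_queries.items()
    )
    kwh = wh / 1000.0
    return kwh, kwh * GRID_KG_CO2E_PER_KWH, wh / LAPTOP_WATTS

# Hypothetical month of fairly heavy individual use
kwh, kg_co2e, laptop_hours = footprint({
    "text_summarisation": 1000,
    "text_generation": 3000,
    "image_generation": 1000,
})
print(f"{kwh:.2f} kWh, {kg_co2e:.3f} kg CO2e, ~{laptop_hours:.0f} laptop-hours")
```

On these assumptions, the example month comes to about 1.4 kWh – around 28 laptop-hours at 50 W, consistent with the 20-30 hour figure quoted above – and well under half a kilogram of CO₂e. The numbers only become material once multiplied across a whole company or industry, which is exactly the aggregation a Scope 3 estimate would capture.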
Digital tools should serve people

When it comes to small-scale, industry-specific AI models, energy consumption need not be a barrier to their use and development in acoustics. As LLMs continue to expand, though, we must be conscious of their impact and ask ourselves whether their benefits outweigh the significant challenges they pose to sustainable development within acoustics and across broader society. If the growth of AI leads to a fossil fuel boom, these concerns will only continue to plague this exciting and dynamic development, whereas a renewable-energy-driven infrastructure would offer a friction-free route ahead.

We leave the last word to AI (represented here by OpenAI’s ChatGPT-4o, the winner on the strength of the detail in its answer on the energy related to the image it created). I uploaded the article and asked it for a conclusion; after some coaxing to be frank, it came up with this: “The future of acoustics delivering sustainable outcomes won’t be shaped by Big Tech’s trillion-parameter models, but by professionals who demand that digital tools serve people and the planet. If tech giants won’t clean up their energy act, maybe it’s time your industry started asking louder questions.” Well said!

Dr Andrew Mitchell is a lecturer in artificial intelligence and machine learning for sustainable construction at University College London. His work ranges from the development and training of machine learning models for analysing urban soundscapes to the application of AI in the construction industry, such as its use in digital twins and construction management.

Peter Rogers is the co-chair of the Sustainable Advisory Group for the IOA and Managing Director of Sustainable Acoustics Ltd (a carbon-neutral company).
References
1 https://www.bbc.co.uk/news/articles/c20g3dr4n4no
2 https://www.technologyreview.com/2024/05/23/1092777/
3 https://www.nature.com/articles/d41586-024-00478-x
4 https://www.nature.com/articles/d41586-025-00616-z
5 https://www.technologyreview.com/2022/11/14/1063192/
6 https://chatgpt.com/