The BBC Virtual Audience and CEDAR
Interview with Matthew Page and Mark MacDonald of UK Operations For BBC News
The COVID-19 pandemic has had an impact on almost everyone, and those working in the fields of radio and television production are no exception. Not only have they had to adopt new working practices, but the nature of the job has often changed because of the need for social distancing. Nowhere is this more obvious than in TV studios, from which audiences have been banned for more than a year.
One team who have overcome this is UK Operations For BBC News, who have created a system for providing an audio Virtual Audience to shows including The News Quiz, Mock The Week, Children in Need and the BAFTAs. We talked to Matthew Page, Outside Broadcast Engineering Manager at the BBC, and to sound engineer Mark MacDonald, who explained how they are bringing remote audiences into the studio and how they overcome some of the problems that they encounter in doing so.
Matthew explained, “We have a team of ten outside broadcast engineers who are generally on the road, but in March 2020 all of our work disappeared. But we have various skills and we redeployed them, and we’re now recording up to five programmes a week with virtual audiences. For each show, BBC Audience Services sends out links to an audience who can watch the programme being recorded, and we hear them responding in real time as do the contributors to the programme. The members of the audience also hear themselves, which makes it a very immersive experience for them. So we’ve got hundreds of incoming audio streams from people’s homes, each adding some sort of noise – washing machines in the background, the hum of people’s laptop computers, you know the kind of thing. If they’re all summed together without any noise reduction it sounds pretty dreadful.”
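As a rough illustration of why summing hundreds of home feeds sounds so noisy, the sketch below estimates how much the background noise floor rises when many streams are added together. It assumes every stream carries uncorrelated noise at the same level, which real living rooms will not match exactly, and the function name is hypothetical rather than part of the BBC system.

```python
import math

def summed_noise_rise_db(num_streams: int) -> float:
    """Estimate the rise in noise floor (dB) when summing noise sources.

    Assumption: the noise in each stream is uncorrelated and equal in
    level, so noise powers add and the rise is 10 * log10(N).
    """
    if num_streams < 1:
        raise ValueError("need at least one stream")
    return 10 * math.log10(num_streams)

# An audience of 300 sources, the lower figure quoted in the interview.
rise = summed_noise_rise_db(300)
print(f"Noise floor rises by about {rise:.1f} dB")
```

Under those assumptions, 300 sources raise the noise floor by roughly 25 dB relative to a single stream.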
“We tried cleaning up the first virtual audiences in post, but the number of hours it took to produce a show was ridiculous. We could do just one or two programmes a week, and by the end of each we all had square eyes. We’ve even increased the sizes of the audiences since then, so we had to refine the process: we’ve now got up to 1,000 incoming streams and, even when we whittle them down, that generally leaves an audience of about 300 to 350 sources. It’s a huge sound and, to cope with it, we created a system that we call the BBC Virtual Audience, which can handle hundreds of incoming audio streams at the same time.”
Mark picked up the story, “We handle 40 to 45 streams on each of eight PCs. Two of the team will submix on these, which are essentially just a giant virtual audio mixer. They can mute channels, adjust levels and flag things that we need to get rid of – for example, if someone’s having a conversation in the living room and makes too much noise, we can remove that person and hook in someone else. But there’s always someone breathing really loudly or clinking a teacup, and the bigger the audience, the more stuff gets buried in the sound. The submixes then go through a CEDAR.”
“When we started with the Virtual Audience system, we didn’t really know if it would work. All ten of us would spend two days cleaning up as many as 150 tracks in post and turning them into a sort of cohesive sound, which was too time consuming. We borrowed a two-channel CEDAR DNS 2 from another team, and we used that for the live element, so at least the live sound was quite clean. It was amazing what the DNS 2 could do, so we then got permission to get the eight-channel DNS 8D, which was really a game changer because, at that point, it was so good that we no longer needed to do any post production work on the sound.”
“Now, the Virtual Audience system directs the first set of feeds to one computer; those are mixed, and that mix goes to one channel of the DNS 8D. The next set of feeds goes to a second computer; that’s mixed, and the output goes to the second channel of the DNS 8D, and so on. We then do a final mix on the eight channels that we get back from the DNS 8D.”
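The routing Mark describes can be sketched in a few lines. This is only an illustration: the function name, the even spreading of feeds across the PCs and the 45-feed limit per PC are assumptions based on the figures quoted in the interview, not details of the real BBC Virtual Audience software.

```python
def route_feeds(num_feeds: int, num_pcs: int = 8, max_per_pc: int = 45) -> dict:
    """Assign feed indices to submix PCs in contiguous blocks.

    PC n is assumed to feed channel n of the eight-channel noise
    suppressor, so the keys double as channel numbers (1-based).
    """
    if num_feeds > num_pcs * max_per_pc:
        raise ValueError("more feeds than the submix PCs can carry")
    # Spread the feeds as evenly as possible (an assumption).
    base, extra = divmod(num_feeds, num_pcs)
    routing = {}
    start = 0
    for pc in range(num_pcs):
        count = base + (1 if pc < extra else 0)
        routing[pc + 1] = list(range(start, start + count))
        start += count
    return routing

# 350 sources spread over eight PCs gives 43 or 44 feeds per PC.
routing = route_feeds(350)
for channel, feeds in routing.items():
    print(f"PC/channel {channel}: {len(feeds)} feeds")
```

With 350 sources, each PC carries 43 or 44 feeds, which sits inside the 40 to 45 streams per PC mentioned earlier.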
“We do some simple processing before the sound goes to the CEDAR – pretty standard stuff, some gating and taking off some lows and highs. When the sound comes back from the CEDAR, we add another gate and a dynamic EQ to try and reduce the sound of breathing. The mix is then passed through a couple of different reverbs. This is because, with a real audience, the microphones are quite far away. Mock The Week actually tried playing some of the virtual audience into the room through a PA and capturing it on mics, but it didn’t work for us. So we try to push away the virtual audience subtly, but without making it sound like there has been a giant load of reverb dumped on it.”
“The system is so effective now that, when we’re lining up all the PCs with a tone coming through, we sometimes forget to switch the CEDAR out. Then someone would complain that there was no tone, and we’d be saying that we had definitely switched it on. So we’d look around and someone would eventually say, ‘ah, right, OK’ and switch the CEDAR out. After the first couple of audience events I wrote some sound notes and, at the end of it, it says, ‘bow down and worship the CEDAR’. It’s fantastic. It’s really, really amazing.”
“When the DNS 8D arrived, I read the instructions and they said that, for most applications, I could just leave it in Learn mode. I thought, who am I to argue with the box – it was doing way more than I could. And you know, I did some tests and tried to improve on it, but I couldn’t. It’s amazing in Learn mode, even if I’m not tweaking it in any way. Maybe I’ll adjust the weighting sometimes, just because I can, but mostly I just decide on the attenuation – usually about 12 dB. The whole sub-mixing and mixing system’s working well for us now.”
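To put the 12 dB figure in context, the short sketch below converts an attenuation in decibels into linear amplitude and power ratios using the standard formulas. It says nothing about how the DNS 8D applies that attenuation internally, and the function names are hypothetical.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level change in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)

def db_to_power_ratio(db: float) -> float:
    """Convert a level change in dB to a linear power ratio."""
    return 10 ** (db / 10)

attenuation_db = -12.0  # the typical setting mentioned in the interview
print(f"Amplitude ratio: {db_to_amplitude_ratio(attenuation_db):.3f}")
print(f"Power ratio:     {db_to_power_ratio(attenuation_db):.3f}")
```

In other words, 12 dB of attenuation reduces the amplitude of the attenuated noise to about a quarter of its original value.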
Matthew continued, “That’s definitely where it shines. We’re sometimes doing five programmes a week, and they all take a live audience mix. Some of the programmes are live to transmitter while most of the rest are ‘next day turnaround’. So there’s no time for any post production on the sound; what Mark and the team mix on the night goes out. It works really well, and our audiences and the producers are happy. But we always want to improve what we’re doing. I had a conversation with our research and development department about six months ago, and they’re looking at machine learning too. They’re looking at detecting the actual speech, but we would want to classify things like applause or laughter and remove everything else, or even classify tea cups clinking and then remove those. But I don’t know if that’s going to be possible.”
“Moving away from the equipment, the psychology of remote audiences is also quite interesting. Our producers have found that the performers often find it easier without the studio audience. On some programmes they give the panel some virtual audience webcams to look at if they want to, and they’re still getting the audio feedback – applause and laughter and that kind of stuff – but I think they feel less inhibited.”
Mark concluded, “That’s exactly what I found too. It’s really helped the panellists and the comedians because it was flat without an audience. Now, some of them quite like being able to hear the audience clearly without having to be a few metres away from them.”