How to Record a Podcast with People in Multiple Locations

Update [4 September 2011]: Our setup at 5by5 Studios where we record and produce multiple shows every day is nothing like this. We record everything here at 5by5 HQ using a multiple-machine setup, through a 16-channel digital mixer, and using multi-track recording software. Recording the way I describe here


I recently read Joel Spolsky’s excellent description of the StackOverflow podcast setup. Although I was impressed by his process and gear, I was also a little bit surprised by its complexity. I also realized that although I’ve written a Podcasting Equipment Guide, I haven’t explained how I actually use that equipment to record a podcast.


Over the last few years of podcasting, first with the Hivelogic Radio Show (on hiatus) and later with The Talk Show (which is actually not on hiatus), I’ve learned a lot about how to record and produce good-quality audio. Unfortunately, because of storage and bandwidth limitations, most podcasts are mixed down to mono and reduced to 32 or 64 kbps. At that bit rate, they lose most of the detail and subtle nuance you’d find in a CD-quality (or better) recording. That’s part of why podcasts do well when they focus on producing excellent content.
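To put those bit rates in perspective, here is a small worked calculation. It is only a sketch: the function name is made up for illustration, and the CD figure assumes standard uncompressed CD audio (44.1 kHz, 16-bit, stereo), which the article does not spell out.

```python
def megabytes_per_hour(kbps):
    """Approximate size of one hour of audio at a constant bit rate.

    kbps is kilobits per second (1 kbps = 1000 bits per second);
    the result is in megabytes (1 MB = 1,000,000 bytes).
    """
    bits = kbps * 1000 * 3600      # bits in one hour
    return bits / 8 / 1_000_000    # bits -> bytes -> megabytes


# Assumption: uncompressed CD audio is 44,100 samples/s x 16 bits x 2 channels.
CD_KBPS = 44_100 * 16 * 2 / 1000   # 1411.2 kbps

for label, rate in [("32 kbps mono", 32), ("64 kbps mono", 64), ("CD quality", CD_KBPS)]:
    print(f"{label}: about {megabytes_per_hour(rate):.1f} MB per hour")
```

An hour at 64 kbps comes to roughly 29 MB, against more than 600 MB for the uncompressed CD-quality equivalent, which is why the mix-down is hard to avoid.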


One of the questions people often ask is how we record The Talk Show with me here in Orlando and John in Philadelphia. Many people suspect that we record using Skype, Audio Hijack, Soundflower, or something similar.


In fact, we use a much more reliable, tried-and-true method that’s so simple, it just might surprise you.


The Golden Years


Recording with good equipment, like the kind I recommend in the Podcasting Equipment Guide, and then editing and mixing down with great care, can help create the best possible result even when some of the people on the show are using less-than-ideal equipment.


When recording the Hivelogic Radio Show, which was essentially an interview show, I was podcasting with people who generally didn’t have recording equipment of their own. They would often have to bark into the built-in microphones in their MacBooks or use the USB headset mics they’d purchased for VoIP and Skype calls. In either case, it made for bad audio quality. With some creative editing techniques I’d learned, I could clean up their audio and boost the levels while simultaneously bringing my own audio down a few notches. After a little bit of compression and a subtle noise gate to remove any hiss, the result would be a balanced, even podcast. Once reduced to a podcast-friendly bit rate, nobody could tell the difference.
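For readers curious what "a noise gate and a level boost" amounts to, here is a deliberately crude sketch. It assumes audio is already loaded as a list of float samples between -1.0 and 1.0; the function names and threshold values are invented for illustration, and real audio editors use far gentler gates with attack and release times rather than this hard cut.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of float samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def noise_gate(samples, threshold=0.02):
    """Silence every sample quieter than the threshold (a crude hard gate)."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def match_level(samples, target_rms):
    """Scale samples so their RMS level equals target_rms, clipping at full scale."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_rms / current
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


# Hypothetical quiet guest track with low-level hiss around the speech.
guest = [0.01, -0.005, 0.1, -0.1, 0.01]
cleaned = match_level(noise_gate(guest), target_rms=0.2)
```

The gate removes the hiss first so that the boost that follows does not amplify it along with the voice.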


But balancing different audio outputs isn’t what this article is about.


A Simple, Direct Way


Gruber and I both use a Shure SM7B, boosted with a PreSonus Tube Pre, connected to the Mac with an M-Audio Firewire Solo. We both use Freeverse SoundStudio 3 to record the audio. This is a solid, budget-minded but professional-level setup.


We found early on that recording the output from Skype (or iChat) was less than reliable, even when using great software like Rogue Amoeba’s amazing Audio Hijack Pro. Initially, I tried using the recorded Skype channels. Then I tried recording my own audio with SoundStudio 3, pulling Gruber’s audio from Audio Hijack, and mixing them back together. I actually tried many more variations than I’m detailing here. In the end, the audio was never as nice as when I recorded directly from my mic into SoundStudio 3 (or QuickTime Pro or GarageBand).


I was talking about this with my friend Ryan Irelan, and he let me in on an old recording-industry technique with an unfortunate name: the double-ender. Although traditionally used for television, the double-ender works just as well with audio, and it’s growing in popularity within the podcasting community because of its simplicity. It works like this:



An interviewer […] would be videotaped conducting an interview via a long-distance phone call to the interviewee in another part of the world. This interviewee […] would be videotaped as he was being interviewed. This videotape would then be sent to the interviewer’s city and synchronized with the videotape of the interviewer […] and the higher-quality sound of the videotapes would be used instead of the telephone audio.

This is precisely what John and I do when recording The Talk Show, and it’s exactly what you should do any time you’re recording audio where the people involved are in different locations.

I record my audio. John records his audio. We talk to each other using Skype or iChat or the telephone (it doesn’t matter how we talk to each other, because we’re not recording the actual conversation, just our own sides). John then zips and uploads his audio, which I download and drop into a track in SoundStudio 3 (GarageBand would also work just fine).
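Conceptually, dropping the two recordings into one project and mixing down to mono is just adding the two tracks together sample by sample. The sketch below assumes each track is a list of float samples between -1.0 and 1.0 at the same sample rate; the function name is hypothetical, and this is not how SoundStudio or GarageBand work internally.

```python
from itertools import zip_longest


def mix_to_mono(track_a, track_b):
    """Mix two mono tracks into one by averaging them sample by sample.

    The shorter track is padded with silence, and averaging (rather than
    summing) keeps the result inside [-1.0, 1.0] without clipping.
    """
    return [(a + b) / 2 for a, b in zip_longest(track_a, track_b, fillvalue=0.0)]


# Two tiny made-up tracks standing in for each host's local recording.
host_track = [0.5, 0.5]
remote_track = [0.5]
mixed = mix_to_mono(host_track, remote_track)
```

Because each side was recorded locally, neither track carries the artifacts of the call itself, so the mix needs very little cleanup.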

Recording this way saves hours of time in post-production, because we end up with two high-quality audio tracks that need almost no audio editing, allowing me (or Ryan when he’s doing the editing) to focus exclusively on the content.

Wondering how we sync up the audio? You’d be amazed how well a quick “3-2-1-Start” actually works.
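If you would rather line the tracks up by ear-free means, one simple approach is to find the first loud moment in each recording and trim everything before it. This is only a sketch under a stated assumption: each person makes their own audible cue (saying "Start" or clapping) into their own microphone, which the article does not confirm. The function names and the threshold are invented for illustration, and tracks are lists of float samples between -1.0 and 1.0.

```python
def first_loud_sample(samples, threshold=0.1):
    """Index of the first sample at or above the threshold, or None if there is none."""
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            return i
    return None


def align_on_cue(track_a, track_b, threshold=0.1):
    """Trim both tracks so each begins at its own first loud sample (the cue)."""
    start_a = first_loud_sample(track_a, threshold)
    start_b = first_loud_sample(track_b, threshold)
    if start_a is None or start_b is None:
        raise ValueError("no cue louder than the threshold was found in one of the tracks")
    return track_a[start_a:], track_b[start_b:]


# Made-up tracks: the second recording started a little earlier than the first.
mine = [0.0, 0.01, 0.5, 0.2]
theirs = [0.0, 0.0, 0.0, 0.6, 0.3]
aligned_mine, aligned_theirs = align_on_cue(mine, theirs)
```

Call latency means the two cues will never be perfectly simultaneous, so a small manual nudge in the editor may still be needed.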
