Sound and Vision

FIG. 8: A typical sound-design session can include numerous individual elements. Here you can see the large number of tracks used to hold the effects in the game Star Wars Super Bombad Racing.

When I’m ready to put sound elements to a visual, I create a new Pro Tools session and roughly organize the tracks I’ll need. Keeping the materials organized by track is critical because a high-density two-minute sound cue can include hundreds of individual elements (see Fig. 8). A typical session consists of two stereo pairs of tracks for ambience, a track for each character’s dialog, a stereo pair for music, six mono tracks for Foley (two footstep tracks, two clothing tracks, two prop tracks), and four stereo pairs and four additional mono tracks for principal sound effects. At the top of the session I also include one stereo pair and one mono track into which I can load sounds. That lets me quickly pull sounds into the load tracks, adjust their timing, and drag them into an appropriate track.
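As a rough illustration, the track layout described above can be written down as a simple template. This is only a sketch in Python and has nothing to do with Pro Tools itself; the track names, the `build_session_template` helper, and the choice of mono dialog tracks are assumptions made for the example.

```python
MONO, STEREO = 1, 2  # channel widths

def build_session_template(characters):
    """Return a list of (track_name, width) tuples for the layout described above."""
    # Load tracks sit at the top of the session.
    tracks = [("Load ST", STEREO), ("Load M", MONO)]
    # Two stereo pairs for ambience.
    tracks += [(f"Ambience {i}", STEREO) for i in (1, 2)]
    # One dialog track per character (mono is an assumption).
    tracks += [(f"Dialog {name}", MONO) for name in characters]
    # One stereo pair for music.
    tracks.append(("Music", STEREO))
    # Six mono Foley tracks: two each for footsteps, clothing, and props.
    for kind in ("Footsteps", "Clothing", "Props"):
        tracks += [(f"Foley {kind} {i}", MONO) for i in (1, 2)]
    # Principal effects: four stereo pairs plus four mono tracks.
    tracks += [(f"SFX ST {i}", STEREO) for i in range(1, 5)]
    tracks += [(f"SFX M {i}", MONO) for i in range(1, 5)]
    return tracks

def channel_count(tracks):
    """Total number of audio channels across all tracks."""
    return sum(width for _, width in tracks)
```

Calling `build_session_template(["Hero", "Villain"])` with two hypothetical characters gives a starting point that can be trimmed or extended before any sound is loaded.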

Once each sound is loaded, it must be evaluated. That may seem obvious, but you must really listen to each sound element. Listen to how it sounds by itself against the picture and how it sounds with the other sonic elements. Listen deeply to the detail of the sound; then put your “big ears” on and listen objectively to the element as part of the whole. Does the general character of the sound work with the visual? What does it need? Will editing or processing improve it? Does it fill out the visual, or could it benefit from adding another layer?

Conversely, does the sound have too many details? Is the sound so busy that it distracts from the intended object of the audience’s focus? Those decisions are made almost instantly and lead you to the options of keeping the sound as is, modifying it, or starting again. The real trick here is to balance the endless options and your desire for perfection with the need to make progress.

Now that you have the process down, repeat it dozens, hundreds, or thousands of times until all your sound elements are lined up and sounding good. You are prepared for the mix!


Once you’ve completed your work, you must deliver it to the next person in the chain of project personnel. Formats and delivery standards must be spelled out in advance to minimize confusion and rework. Ask the project leader exactly what format and medium the files need to be delivered in; the all-nighter you prevent may be your own. (See the sidebar “Plan of Attack” for a discussion of additional steps to consider for delivering different types of media.)

Film and video delivery is fairly standardized. The audio standard for broadcast video is 16-bit/48 kHz, but films can be either 16-bit/44.1 kHz or 16-bit/48 kHz, depending on the project. If you are responsible for a finished stereo mix of all audio elements, including dialog and music, you can deliver an AIFF interleaved stereo or Sound Designer II split stereo file burned to CD-R or recorded on DAT. If you mix to DAT, add a 2-pop to the beginning of the recording. A 2-pop is a short beep that occurs two seconds before the audio content begins; the editor uses it to align the audio with picture.
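To make the 2-pop timing concrete, here is a minimal Python sketch that builds the leader as raw 16-bit samples. The 1 kHz tone, the one-frame duration at 30 frames per second, and the level of roughly -20 dBFS are assumptions based on common practice; none of them is specified above, so check the project's delivery spec. The function names are hypothetical.

```python
import math

def make_two_pop_leader(sample_rate=48000, fps=30, tone_hz=1000.0, level=0.1):
    """Return mono 16-bit samples: a one-frame beep, then silence up to the 2-second mark."""
    pop_len = round(sample_rate / fps)   # samples in one picture frame (assumed pop length)
    total_len = 2 * sample_rate          # content starts exactly 2 s after the pop begins
    peak = int(level * 32767)            # level=0.1 is about -20 dBFS (assumption)
    samples = [
        round(peak * math.sin(2 * math.pi * tone_hz * n / sample_rate))
        for n in range(pop_len)
    ]
    samples += [0] * (total_len - pop_len)  # silence until the program audio starts
    return samples

def prepend_two_pop(program, **kwargs):
    """Place the leader in front of the program samples."""
    return make_two_pop_leader(**kwargs) + list(program)
```

At 48 kHz the leader is 96,000 samples long, so the first sample of the program always lands two seconds after the start of the beep.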

Tascam DA-88 tape is the broadcast industry standard. I like it because the time code is stable, and its eight tracks provide enough room for a full 5.1 surround mix on the first six tracks and a stereo mix on the last two. Currently, though, it seems that Digidesign’s Pro Tools format dominates the film industry; you can typically deliver a hard drive containing your Pro Tools sessions to the mixer. When I do this, I deliver a carefully prepared session that has all the audio elements organized and roughly balanced the way I envision them sounding in the final mix. The idea is that the sound mixer, who is not familiar with the material, should be able to bring all faders up to unity gain and hear a reasonably good mix. From that point he or she can tweak levels, pans, and mutes without wasting time.
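The eight-track arrangement can be noted down as a simple track sheet. The sketch below is illustrative only: the text says the surround mix occupies tracks 1 through 6 and the stereo mix tracks 7 and 8, but the channel order within the 5.1 group (L, R, C, LFE, Ls, Rs) is an assumption, and the `da88_layout` helper is hypothetical. Confirm the order with whoever receives the tape.

```python
# Assumed 5.1 channel order; facilities differ, so confirm before delivery.
SURROUND_ORDER = ["L", "R", "C", "LFE", "Ls", "Rs"]

def da88_layout(surround_order=SURROUND_ORDER):
    """Map tape tracks 1-8 to mix channels: 5.1 on tracks 1-6, stereo on 7-8."""
    if len(surround_order) != 6:
        raise ValueError("a 5.1 mix needs exactly six channels")
    names = [f"5.1 {ch}" for ch in surround_order] + ["Stereo L", "Stereo R"]
    return {track: name for track, name in enumerate(names, start=1)}

if __name__ == "__main__":
    # Print a track sheet to include with the delivery.
    for track, name in da88_layout().items():
        print(f"Track {track}: {name}")
```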

