Rosa Westfall, Nat Condit-Schultz
In this study, we explored how compression affects listeners' perception of mix quality and loudness in popular, commercially released music. The project grew out of my experience as an assistant engineer in commercial recording studios, where I was able to observe a wide range of professional recording and mix engineers and ask them about their techniques and practices. Given how extensively dynamic range compression (DRC) is used, Nat Condit-Schultz and I ran an experiment over the last year to answer the following research question.
Do music listeners associate dynamic range compression and loudness with the quality/fidelity of a musical mix?
We hypothesized an inverted U-shaped curve: some compression (milder combined threshold/ratio settings) would be perceived as an improvement in quality over the uncompressed control samples, but beyond a certain level of compression, perceived quality would decrease as listeners noticed “over-compression” and a lack of dynamic contrast.
Dynamic range compression (DRC) is defined as “signal processing that reduces the dynamic range of audio, applied with hardware or software devices”. This is not the same as data compression; it is a process applied directly to the audio signal. Compression can be achieved by many means, including running audio through outboard gear or a digital plugin within a digital audio workstation (DAW). DRC has many uses at every step of music production, on individual tracks as well as on multiple tracks together. In our study, we focus on its application in the context of “mastering”, a process that optimizes the musical mix for all environments and sound systems/platforms without compromising the intent of the mix or the musical content. DRC is used in this process to allow an increase in the overall loudness of a song without clipping.
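Compressors can reduce the dynamic range of audio in several ways. Understanding the terminology of the parameters involved, in particular threshold and ratio, will clarify the independent variables that we manipulated in the study. The threshold is the level above which the compressor begins to reduce gain, and the ratio sets how strongly levels above the threshold are reduced.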
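To make the two manipulated parameters concrete, here is a minimal sketch of the textbook static curve of an idealized hard-knee compressor with no make-up gain. It is only an illustration: the function names are ours, attack and release behavior are ignored, and it makes no claim about how the plugin used in the study behaves internally.

```python
def compressed_level_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """Output level (dB) of an idealized hard-knee compressor, no make-up gain.

    Assumption: textbook static curve only; real compressors add attack/release
    smoothing and often a soft knee, none of which is modeled here.
    """
    if ratio < 1:
        raise ValueError("ratio must be >= 1 (1 means no compression)")
    if input_db <= threshold_db:
        # Below the threshold the signal passes through unchanged.
        return input_db
    # Above the threshold, each `ratio` dB of input overshoot yields 1 dB of output.
    return threshold_db + (input_db - threshold_db) / ratio


def gain_reduction_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """How many dB the compressor removes at a given input level."""
    return input_db - compressed_level_db(input_db, threshold_db, ratio)


if __name__ == "__main__":
    # A -2 dB peak against a -12 dB threshold at the three ratios used in the study.
    for ratio in (4, 10, 50):
        out = compressed_level_db(-2.0, -12.0, ratio)
        print(f"{ratio}:1 -> output {out:.2f} dB, reduction {-2.0 - out:.2f} dB")
```

In this idealized model, lowering the threshold or raising the ratio both increase the amount of gain reduction, which is why the two settings together determine how heavily a clip is compressed.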
This presentation focuses on the first block of a two-block experiment. In Block 1, participants were presented with an audio clip (the stimulus) and asked to rate its “mix quality” using a slider from “low quality” (0) to “high quality” (100). To avoid response bias, we did not disclose the processing (compression) applied to the musical samples. Participants were also explicitly instructed to respond based on the sonic quality, and not on the musical content or their personal preference. Each participant completed 14 trials with randomized stimuli. The responses of 43 participants were analyzed for the results presented here.
The stimuli were created especially for this study from mixed but unmastered songs that were unreleased at the time. Two 30-second segments were chosen from each of 4 songs, which varied in instrumentation and dynamic range. The 8 resulting clips were normalized to an average loudness of -17 LUFS before compression was applied. For each clip, 4 threshold levels (-6 dB, -12 dB, -18 dB, -24 dB) and 3 ratios (4:1, 10:1, 50:1) were applied in all combinations to create 104 total samples. No make-up gain was applied, so that loudness perception could also be observed in the second block of the experiment. The stimuli were processed with the R-Compressor plugin by Waves.
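The stimulus set can be enumerated with a short sketch. Crossing 8 clips with 4 thresholds and 3 ratios gives 96 compressed samples; the sketch assumes the remaining 8 of the 104 are one uncompressed control per clip, which is our reading of the control samples mentioned in the hypothesis rather than something stated explicitly. The clip labels are hypothetical placeholders.

```python
from itertools import product

# Hypothetical labels: 4 songs x 2 segments = 8 clips.
CLIPS = [f"song{song}_segment{seg}" for song in range(1, 5) for seg in (1, 2)]
THRESHOLDS_DB = (-6, -12, -18, -24)
RATIOS = (4, 10, 50)  # 4:1, 10:1, 50:1


def build_conditions(include_control: bool = True):
    """Return (clip, threshold_db, ratio) tuples for every stimulus.

    Assumption: a control is represented as (clip, None, None), i.e. the
    loudness-normalized clip with no compression applied.
    """
    conditions = []
    for clip in CLIPS:
        if include_control:
            conditions.append((clip, None, None))
        # Fully crossed design: every threshold with every ratio.
        for threshold_db, ratio in product(THRESHOLDS_DB, RATIOS):
            conditions.append((clip, threshold_db, ratio))
    return conditions


if __name__ == "__main__":
    print(len(build_conditions(include_control=False)), "compressed samples")
    print(len(build_conditions()), "samples including controls")
```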
In Block 2 of the experiment, two versions of the same musical segment, with different levels of compression, were played in an alternating loop. Participants were asked to adjust the volume of the second version to match the volume of the first as closely as possible. The same stimuli were used as in Block 1. This block of the study is still ongoing and has only been analyzed preliminarily.
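As a rough objective reference point for the matching task, the sketch below estimates the gain (in dB) that would equalize the RMS level of two signals. This is only an assumption-laden illustration and not the analysis used in the study: RMS level is not the same as perceived loudness (LUFS measurement, for example, adds frequency weighting and gating), and the function names are ours.

```python
import math


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (amplitude 1.0)."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square == 0:
        raise ValueError("signal is silent; level is undefined")
    return 10 * math.log10(mean_square)


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


def matching_gain_db(reference, adjusted):
    """Gain to apply to `adjusted` so its RMS level equals that of `reference`."""
    return rms_dbfs(reference) - rms_dbfs(adjusted)


def apply_gain(samples, gain_db):
    """Scale every sample by the linear equivalent of `gain_db`."""
    factor = db_to_linear(gain_db)
    return [x * factor for x in samples]


if __name__ == "__main__":
    # Synthetic test tone, not study data: a sine wave and a half-amplitude copy.
    reference = [math.sin(2 * math.pi * i / 100) for i in range(1000)]
    quieter = [0.5 * x for x in reference]
    print(f"gain needed: {matching_gain_db(reference, quieter):.2f} dB")
```

Comparing a participant's chosen adjustment with a level-based figure like this one is one possible way to see where perceived loudness departs from measured level.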
Our results were not statistically significant. Although the average responses do resemble an inverted-U shape, the individual responses were too widely distributed to establish this as a trend. The results come closer to significance at the more extreme compression thresholds, but the ratio does not seem to have any measurable effect.
One concern about the overall results was that the variety of songs selected, with their differing musical density and instrumentation, may have confounded the results. However, when we analyzed the individual musical segments separately, few showed any clearer trend than the overall analysis.
There is much more to extract from the collected data, such as a between-groups comparison of audio professionals and more casual listeners, and there are improvements to be made for future studies on this topic. As previously stated, the musical content of the songs could be confounding the results, so a study using more consistent musical segments could produce clearer results. The participant pool is also relatively narrow, both in the number of participants and because our recruitment methods targeted music students and music professionals. This means the results are not very representative of the average music listener. On the other hand, a more exclusive population could better limit the study to audiophiles specifically. Additionally, more parameters could be explored, such as “attack time”, which can significantly affect how compression sounds on musical excerpts.
We also collected data in the background on any volume adjustments participants made in the user interface during Block 1. As mentioned earlier, Block 2, which is still being conducted, is a more explicit study of the relationship between compression and loudness.
This study is only the tip of the iceberg, and we have a lot more to learn about how music perception can change commercial music production practices, and vice versa. The academic paper for this study is in its final stages of drafting, and we intend to submit it for publication. For more information on this study, feel free to contact Rosa Westfall (email@example.com) or Nat Condit-Schultz (firstname.lastname@example.org).