Notes by ff123
Addendum 02-23-01: Added test of Lame 3.88 alpha 9 (Feb 15, 2001) with --r3mix option, and added test of Fraunhofer VBR 90%
The castanets sound clip (which can be found at the Lame test samples page) provides a stringent test of pre-echo, an artifact that smears sharp transient attacks. The following table shows the results of subjective ABX (double-blind equivalent) listening tests I performed using various encoders. All tests were performed at the same volume level with headphones. I chose the default stereo mode for each encoder at the various bitrates. I also left the high frequency mode of Xing enabled (default).
In general, all codecs performed poorly on the castanets clip; pre-echo earns its reputation as an mp3 killer. Xing had spectacularly bad pre-echo at 128kbs, but (in my opinion) QDesign's MVP was worse overall: its ringing artifacts were more annoying to me than Xing's pre-echo. At 128kbs, the FhG codecs were all pretty similar (1.263i was slightly worse than the others, producing a wavering guitar tone in addition to pre-echo). All FhG codecs were clearly better than Xing, and the best FhG codecs were slightly better than Lame.
At 160kbs, the FhG "FastEnc" codec (as implemented by Sound Forge 4.5h, commp3.dll, version 1.0, build 219) surprised me by removing much of the reverberation in the clip (see my Bug in FhG's latest FastEnc page). At that point, I threw out Sound Forge's implementation in favor of MusicMatch 5's.
At 192kbs, I first noticed that I was starting to have a hard time identifying "X" from "A" or "B" using FhG's "Alternate" encoder (either implementation). With this clip and at any bitrate, I can't hear any evidence of the bug in MusicMatch 5's implementation of the "Alternate" codec (see my page Bug in older FhG Alternate Codec). QDesign starts introducing really bad artifacts at 192kbs, and at 256kbs becomes all but unlistenable.
Overall, I would say that either FhG's mp3enc or Alternate codecs (either implementation) perform the best on this particular clip.
Pre-echo can be seen as well as heard. On each of the clips below, I have zoomed in on the first click of the castanets to show what pre-echo looks like. In fact, one can pick out poor pre-echo control just by looking at graphs -- but don't get the idea that graphs tell the whole story. For example, what do you think is better -- larger amplitude artifacts close to the transient (QDesign), or smaller amplitude artifacts stretching out well before the transient (Xing)? At 128kbs, it turns out that QDesign sounds better to my ears than Xing in terms of pre-echo control, but worse overall because of the unsteady, ringing artifacts it also produces. And while it's true that gross differences in the graph would correlate with audibility, it wasn't true that smaller and shorter variations in the pre-echo graphs would correspond with what I heard. For example, at 192kbs, Lame had a better looking graph than the Alternate codecs, but the latter sounded better. For that reason, I have decided to only show what pre-echo looks like at 128kbs. Including the other graphs would tend to mislead the reader into thinking that a valid comparison could be made by looking at the graphs alone, which wasn't the case.
It's quite possible that one could arrive at a valid, objective (non-listening) test of pre-echo using Lame's graphical frame analyzer, MP3x, but I haven't tried it, so I can't say whether that would work.
To capture the pre-echo as clearly as I could using just a sound editor (Cool Edit 2000), I picked the left channel, then filtered out low frequencies. I created a high-pass filter in Cool Edit as shown below:
Here is a picture of the reference signal:
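For readers without Cool Edit, the preparation step described above (take the left channel, then remove low frequencies) can be roughly approximated in a few lines of Python. This is only a sketch under stated assumptions: it uses a simple first-order high-pass filter rather than Cool Edit's filter, the function names `left_channel` and `high_pass` are made up for illustration, and the 4000 Hz cutoff is a placeholder, not the setting I actually used.

```python
import math

def left_channel(interleaved):
    """Return the left channel of interleaved stereo samples (L, R, L, R, ...)."""
    return list(interleaved[0::2])

def high_pass(samples, sample_rate, cutoff_hz):
    """First-order (RC-style) high-pass filter.

    This is a stand-in for a sound editor's filter, not a copy of it.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for i, x in enumerate(samples):
        if i == 0:
            y = float(x)          # start the filter at the first sample
        else:
            y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out

if __name__ == "__main__":
    # Hypothetical interleaved stereo data; real use would read a WAV file.
    stereo = [0.0, 0.0, 0.5, 0.1, 1.0, 0.2, 0.5, 0.1]
    # 4000 Hz is an assumed cutoff chosen only for this example.
    filtered = high_pass(left_channel(stereo), 44100, 4000.0)
    print(filtered)
```

A steeper filter would isolate the pre-echo more cleanly; the first-order version is used here only because it fits in a few lines.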
Addendum (12-4-00): bAdDuDeX reports that the FhG 1.263 codec "sounds horrible with castanets.wav to me. I think both MP3Enc and LAME sound a lot better on it. It's hard to describe. Kinda like metallic hits within the castanets. If that makes any sense... It does it all the way to 256kbit/s."
At that bitrate (256kbit/s), the artifacts bAdDuDeX describes cannot be attributed to this codec's joint-stereo problems, nor to the kind of flanging that occurs when the cutoff filter is set too high. Therefore, it must be some sort of pre-echo artifact which I cannot hear because of my high-frequency limitations.
NOTICE: The listening tests performed with Lame 3.88 are for an alpha version, and are not necessarily representative of the Lame 3.88 beta version.
02-23-01: Now using Grado SR325 headphones.
| Encoder | Bitrate (kbs) | Settings | ABX results | Comment |
| FhG mp3enc 3.1 | 256 | stereo, qual 9 | 15 of 16 | subtle difference (best) |
| Lame 3.87a (RH) | 256 | stereo, -h | 16 of 16 | obvious |
| Lame 3.88 alpha 9 (Feb 15, '01) | 256 | stereo, -h | 16 of 16 | obvious, similar to Lame 3.87a (RH) |
| FhG VBR 90% using Cool Edit Pro with MP3 ME plugin, CRC disabled | 182 | j-stereo | 16 of 16 | obvious (worst of the current bunch) |
| Lame 3.88 alpha 9 (Feb 15, '01) | 187 | --r3mix | 16 of 16 | better than CBR 256; second best |
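To put the ABX scores in the table in perspective, one can compute how likely a given score would be if the listener were purely guessing. The sketch below assumes each trial is an independent 50/50 guess and sums the binomial tail; the function name `abx_p_value` is just an illustrative choice, not part of any ABX tool.

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of getting at least `correct` of `trials` right by guessing (p = 0.5 per trial)."""
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials

if __name__ == "__main__":
    for correct in (15, 16):
        print(f"{correct} of 16: p = {abx_p_value(correct, 16):.6f}")
```

Under that assumption, 16 of 16 corresponds to 1 chance in 65536 of guessing, and 15 of 16 to 17 chances in 65536, so even the "subtle difference" result is very unlikely to be luck.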