Notes

Note 1: A calibration operation typically tests the hardware and the software together. In this case, the pattern of my test results makes it clear that the calibration correction applies specifically to the software, independent of the hardware.

My measurements were done using floating point FFTs; I got essentially the same results with floating point off as long as the signal amplitude was reasonable. Time measurements on the waveform appear to be accurate to the limits implied by the sampling rate and the scale of the screen display. LPC spectral measurements appear to have significant problems. I haven't attempted to calibrate other measurements.

The Sound Edit sampling rate problem discussed in the Manual p231 ought to be independent of the calibration issues discussed here.

Note 2: The different wide-band settings look at window widths of 3.3 to 8 ms. Because of this short window, there is substantial random variation between wide-band cross-section measurements made at slightly different locations in the speech signal. If you want to make wide-band measurements of anything that lasts more than a few ms, the preferred method is to average a large number of measurements and cancel out the random variation. This is done visually in a spectrogram display, or mathematically by having the system compute an averaged spectrum.

The only case where you'd want to use a single wide-band cross-section is when you are measuring some event, like a stop burst, that is about the size of the relevant window. These very short events are aperiodic, and typically give you a blurred display where precise measurement is not possible: if you are measuring poles in a stop burst, getting them to the nearest 100 Hz is about as good as you are likely to do.
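To put rough numbers on why such short windows give a blurred display, here is a minimal sketch of the arithmetic. It assumes a 22,050 Hz sampling rate and uses the rule of thumb that frequency resolution is on the order of one over the window duration; the exact figure depends on the window shape, which these notes do not specify. The function name `window_stats` is my own label for illustration, not anything in Signalyze.

```python
def window_stats(window_ms, sample_rate_hz=22050):
    """Return (samples in the window, rough frequency resolution in Hz).

    Assumption: resolution is taken as 1 / window duration, a rule of thumb.
    The true value depends on the window shape used by the analysis software.
    """
    seconds = window_ms / 1000.0
    samples = round(seconds * sample_rate_hz)
    resolution_hz = 1.0 / seconds
    return samples, resolution_hz


# The two ends of the wide-band range mentioned in Note 2.
for ms in (3.3, 8.0):
    n, res = window_stats(ms)
    print(f"{ms} ms window: {n} samples, roughly {res:.0f} Hz resolution")
```

On this rough reckoning the wide-band windows resolve no better than one to three hundred hertz in any single cross-section, which is consistent with 100 Hz being about the practical limit for a stop burst.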


Note 3: For hand measurement of the fundamental frequency of speech signals, it is usually best to measure on the waveform, where you can typically get within one or two hertz. Occasionally it's convenient to use a narrow-band display, and I've included the relevant calibration correction for those occasions, and for formant measurements.
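As an illustration of how a waveform measurement gets to within a hertz or two, here is a small sketch. It assumes you count a whole number of periods between two cursor positions and that each such span is good to about one sample; the function names and the example figures (10 periods spanning 429 samples at 22,050 Hz) are hypothetical, not taken from my measurements.

```python
def f0_from_periods(n_samples, n_periods, sample_rate_hz=22050):
    """Fundamental frequency from a span of n_periods lasting n_samples."""
    return n_periods * sample_rate_hz / n_samples


def f0_uncertainty(n_samples, n_periods, sample_rate_hz=22050, cursor_error=1):
    """Half the spread in f0 if the span is off by +/- cursor_error samples."""
    high = f0_from_periods(n_samples - cursor_error, n_periods, sample_rate_hz)
    low = f0_from_periods(n_samples + cursor_error, n_periods, sample_rate_hz)
    return (high - low) / 2.0


# Hypothetical reading: 10 periods span 429 samples.
f0 = f0_from_periods(429, 10)
err = f0_uncertainty(429, 10)
print(f"f0 = {f0:.1f} Hz, +/- {err:.1f} Hz")
```

Measuring across several periods rather than one is what keeps the uncertainty small: the one-sample cursor error is spread over the whole span.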


Note 4: Signalyze 3.12 offers an option to compute spectra averaged over a user-selected interval. The system does this by first computing and displaying a spectrogram for the selected interval, then averaging the values that were used to draw the spectrogram and displaying that average as a spectrum. The Signalyze Manual points out (p169f, p175; also the PDF User Reference document, p69, p75) that this method of calculation means that averaged FFT spectra and regular spectral cross-sections use slightly different algorithms. This is not likely to be directly relevant to our calibration question.
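The general idea of averaging spectrogram frames into a single spectrum can be sketched as follows. This is not Signalyze's code: I am assuming a plain DFT with no window function, linear magnitudes, and a fixed hop between frames, and the 8,000 Hz sampling rate and 500 Hz test tone are illustrative values chosen so that the tone falls exactly on an analysis bin.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude of a plain DFT for bins 0 .. N/2 (no window applied)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def averaged_spectrum(signal, frame_len, hop):
    """Average the short-frame spectra bin by bin, as in a spectrogram."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    spectra = [magnitude_spectrum(signal[s:s + frame_len]) for s in starts]
    # zip(*spectra) groups the values for one frequency bin across all frames.
    return [sum(column) / len(spectra) for column in zip(*spectra)]


# Illustrative test signal: a 500 Hz sine sampled at 8000 Hz.
sample_rate = 8000
tone_hz = 500
frame_len = 64
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

avg = averaged_spectrum(signal, frame_len=frame_len, hop=32)
peak_bin = avg.index(max(avg))
peak_hz = peak_bin * sample_rate / frame_len
print(f"peak at bin {peak_bin}, i.e. {peak_hz:.0f} Hz")
```

With real speech each frame's spectrum differs a little from its neighbours, and it is the bin-by-bin average that cancels the random variation described in Note 2.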

These averaged spectra are useful for a variety of purposes; in particular, wide-band spectra averaged over intervals of 10 to 100 ms can be very useful as data for estimating formant center frequencies (see How to find a formant).


Note 5: The calibration corrections in the table are designed to bring the spectral frequency measurements into line with measurements made directly on the waveform. An alternative approach, illustrated in How to find a formant, would be to bring averaged wide-band numbers into line with the numbers from narrow-band cross-sections, since you typically wish to compare WB and NB measurements, and the NB measurements are pretty accurate. Since the narrow-band numbers calibrate a few hertz high, the correction in How to find a formant slightly understates the actual correction. For example, at a sampling rate of 22,050 Hz, with input from a nominal 512 Hz tuning fork, we get:

  • f0 measured from waveform: 514 Hz (Take this as an accurate baseline.)
  • f0 = h1 measured from VNB spectral cross-section: 519 Hz; error = +5 Hz
  • f0 = h1 measured from averaged WB spectrum: 559 Hz; error = +45 Hz relative to waveform, +40 Hz relative to VNB

My recommendation to round all spectrum-based measurements down to the next lower 10 Hz is intended to allow for this small error in the VNB and NB measurements.
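The arithmetic behind the tuning-fork figures above, and the round-down rule, can be checked with a few lines. The three measured values are the ones quoted in this note; the helper `round_down_to_10` is my own name, and its treatment of a value already on a multiple of 10 (left unchanged) is an assumption about how the rule is meant to apply.

```python
import math

# Values quoted in Note 5 (22,050 Hz sampling rate, nominal 512 Hz fork).
waveform_hz = 514   # baseline, measured on the waveform
vnb_hz = 519        # very-narrow-band spectral cross-section
wb_avg_hz = 559     # averaged wide-band spectrum

vnb_error = vnb_hz - waveform_hz            # VNB relative to waveform
wb_error_waveform = wb_avg_hz - waveform_hz  # WB relative to waveform
wb_error_vnb = wb_avg_hz - vnb_hz            # WB relative to VNB


def round_down_to_10(hz):
    """Round down to a multiple of 10 Hz (exact multiples are left as is)."""
    return int(math.floor(hz / 10.0) * 10)


print(f"VNB error: +{vnb_error} Hz")
print(f"WB error: +{wb_error_waveform} Hz vs waveform, +{wb_error_vnb} Hz vs VNB")
print(f"VNB reading rounded down: {round_down_to_10(vnb_hz)} Hz")
```

The 5 Hz gap between the two wide-band errors is the amount by which a correction referenced to the narrow-band numbers understates the correction referenced to the waveform.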

Note 6: You will notice that the results at the 44,100 Hz sampling rate are considerably more variable than those at lower sampling rates. The Apple digitizers I have tested appear to be optimized for the 22,050 Hz sampling rate, and I recommend using that rate for most speech work.

Note 7: Notice the small offset at the right-hand end of the averaged spectrum display. This occurs regularly in averaged spectra displays, and is presumably related to the software display problem that is causing the measurement error.