
How to find a formant | last updated Thursday, March 14, 2002
A procedure for measuring formant center frequencies for vowels and diphthongs using Signalyze 3.12.
© Guy Carden 1998-2001. All rights reserved.
1. Background and references This tutorial was designed to accompany a lab assignment in an undergraduate instrumental phonetics course. It therefore assumes general familiarity with source-filter analysis of speech, the sort of background that you would get from an introductory course in speech science or acoustic phonetics, or from standard textbooks like these:
The details of the instructions refer to Signalyze 3.12, but the same basic procedure would work with any interactive acoustic analysis system. We are concentrating on measuring F1 and F2 for the nucleus of a single syllable, but the same procedures could be used for measuring any formant in a voiced resonant. Do not be put off by the length of these instructions (9 pages plus notes), or by the even greater length of the sample analysis. There are lots of issues to think about in doing a formant analysis, the computer displays give you lots of data to interpret, and it takes time to talk about all this, and to learn how to do it. We spend two weeks and two lab sessions on this sort of analysis in my undergrad course. Once you have mastered the method, you'll be able to do reliable, valid measurements at about 5 to 15 minutes per syllable. You might also ask, why spend all this time doing formant analysis by hand? Aren't there acoustic analysis systems where you can push a button and get a display that says F1 = ___, F2 = ___, and so on? Yes, there certainly are such systems; and, if the parameters are set appropriately, these systems very often give you the right answers. To use such systems effectively, and to set their parameters optimally, you need to know what the system is doing, and you need to be able to look at the displays and make hand measurements to check and validate your automated results. You may also find cases, especially working with speech or voice pathologies or with children, where the automated systems can't be adjusted to give valid results; for these cases, hand analysis will be essential. 2. How to use this tutorial I assume that you already know something about acoustic phonetics, and that you want to use Signalyze (or some equivalent program) to measure formant frequencies. The best way to use this tutorial is to work interactively. If you have Signalyze, download Audio File 1 and open it in Signalyze. 
If you don't have Signalyze, but have some other acoustic analysis system, download Audio File 1 (for Macs) or Audio File 2 (for PCs), open it in your program, and follow along with equivalent analyses. In either case, print out Figure 1, and take notes on the Signalyze displays as you work through the tutorial. (I suggest printing in landscape mode at 80% reduction.) If you don't have an acoustic analysis system available, you can still get useful information working with the text and the figures. Begin by skimming through the description of the procedure (sections 3-5). Then work through the sample analysis (section 6), doing your own acoustic analysis as you go along. At each step, look at the relevant part of the general instructions (sections 5.1 - 5.6), study the displays in Figure 1 and on your computer screen, do your own analysis, and compare your results to my discussion in the relevant section of the sample analysis (sections 6.3.1 - 6.3.6). Figure 2 gives my final analysis with numbers, and I recommend saving it until you have done a substantial part of the analysis on your own. 3. General approach In most cases, we want to do formant measurements as a way of figuring out what the speaker's articulation was: If we know F1 and F2 for a vowel, we have a good estimate of the tongue body position on the high/low and front/back dimensions (setting aside issues related to lip rounding and larynx height and the role of F3). If we know F1 and F2 for a number of vowels or diphthongs, we can graph the results as an acoustic vowel quadrilateral and get a picture of the relevant vowel system. The data we want is the center frequencies of the F1 and F2 resonances; the best way to measure these frequencies is by using the power of an interactive acoustic analysis system to look at the speech signal in three different ways: a. A wide-band spectrogram: This gives an overview that lets you track the formants through time and select the right point or points to measure. 
b. A spectrum display representing an estimate of the formant values at a given measurement point. For Signalyze 3.12, this will be an averaged wide-band FFT spectrum; in another system, you might prefer an LPC spectrum. c. A spectrum display showing the harmonics at the same location; this will be a narrow-band FFT spectral cross-section. This picture of the harmonics represents the actual signal more closely than wide-band or LPC displays, and lets you estimate formant center frequencies by inspecting the amplitude of individual harmonics. Once you have these three displays, you can compare them and produce an estimate for the formant center frequencies based on your evaluation of the data in all three displays. 4. Initial set-up Signal > Scale Setup: Select Y-axis scales and horizontal grid lines, so you'll get labelled reference lines in your spectrogram. It is convenient to select a Y-axis rounding value of 500 for male voices and 600 for female voices, so that the spectrogram reference lines will correspond roughly to F1 = mid and F2 = central. (After you have displayed your spectrograms, you may wish to go back to this menu and turn off the horizontal lines so you can print uncluttered waveforms.) Spectral > Spectral Analysis Setup: Select Pre-emphasis (since this is voiced speech), Smoothing, Grayscale, and Plot Every Line. You will want to adjust Band, Range, and Darkness values to give the most informative displays as you work through your analysis. [Note 1] 5. Analysis Procedure 5.1. Step 1: Spectrogram analysis: Getting an overview (See example 6.3.1.) 5.1.a. Getting an informative display Display the waveform of the syllable that you want to analyze so that it fills some reasonable part of the screen; display an appropriate wide-band spectrogram showing the whole syllable; and evaluate your display settings: 1. Time scale: It is important to get the correct horizontal (time) scale when the formants are changing, as they usually are. 
Compress the time scale too much, and you can't see what's going on; spread things out too much, and the changes are so slow you can't interpret them. Try displaying segments of different lengths until you get a good result for your syllable. 2. Spectral Analysis Setup: A spectrogram display is very sensitive to your exact choice of display parameters; and different speakers, different syllables, and even different tokens of the same syllable will require different settings. Prepare to spend time fiddling among the many choices in Spectral Analysis Setup until you become familiar with how the options work. Very often you can take an uninterpretable mess and turn it into a textbook picture by some small setup adjustment. a. Range: A Signalyze spectrogram display is always two tracks high (in the standard analysis window), but you have three options for the range of frequencies displayed in those two tracks. The "full range" value varies depending on the sampling rate and on your choice in the "Band" selection; the "half range" and "quarter range" choices expand the display vertically by showing only the bottom half or quarter of the available "full-range" frequency range. In most cases, you want to expand the display vertically as much as you can without losing sight of the formants you are interested in. This will help to separate adjacent formants and make it easier to see where formants are steady or changing. If your signal was recorded at the common 22,050 Hz sampling rate, and you are looking at vowel formants, you will probably want to use 0-5512 Hz (half-range for a wide-band display) or 0-2756 Hz (quarter-range). Look at the 0-5512 Hz display first, and see if there is any relevant information in the top half of the display. If not, switch to 0-2756 Hz and redisplay your spectrogram. b. Band: Since you want to look at the formants, you should select one of the three wide-band settings, "Wide", "Very Wide", and "Extra Wide". 
These settings are all designed to smear the amplified harmonics together to give a representation of the formants. For male speakers, the "Wide" setting (8 ms / 125 Hz) usually gives good results; for female speakers, who usually have a higher fundamental frequency producing more widely spaced harmonics, the "Very Wide" setting (5 ms / 200 Hz) usually works better. General approach: If your formants are separating out to show you distinct harmonics, try displaying a spectrogram using a wider band-width; if two adjacent formants are hard to distinguish, try a narrower band-width within the wide-band group. c. Darkness: 30% or 40% is usually a good starting point. The Darkness control is useful in many of the same cases where you might also adjust the band-width: If two adjacent formants are smeared together into a single black band, try reducing the Darkness setting by 10 or 20%. If some expected formant isn't visible at all, try increasing the setting until it comes into sight. Very often, different parts of the same signal will call for different darkness settings: You need 70% or more to show this feeble F1, but F2 and F3 smear together at all settings above 25%. In cases like this, you often need to work with two or more spectrograms, one with each of the needed settings. (There are also other tricks to try -- for example, turning off pre-emphasis for the spectrogram might also help you see a low-amplitude F1.) 5.1.b. Interpreting the spectrogram Once you have a good spectrogram display to work with, begin your analysis. Where are the formants changing and where are they steady? The moving formants let you track how the articulators are moving; recall the relationships between F1 and F2 and articulatory configuration. What articulatory gestures produce the change in formant frequencies you observe? Which parts of the syllable represent "transitions" between consonantal closures and the vowel you are interested in? 
Which part of the syllable best represents the vowel you want to measure? When you are doing this analysis, it is essential to listen as well as look. [Note 2] Play the whole syllable; play selected parts of the syllable. If the formant pattern implies a change in vowel quality, listen closely and see if you can hear it. You want to work on training your ear as well as your eye: Your long-term goal is to have a pattern in your head that relates the vowel quality that you hear, the spectral pattern that you see, and the numbers that you get when you measure the spectral pattern. This analysis should let you select one or more points where you want to measure the vowel. If possible, you want to exclude the transitions where the formant values will show influence from both the vowel and the adjacent consonant. [Note 3] If the remaining formants show the vowel is basically steady-state, or if the vowel is so short that there is only one reasonable measurement point, then you can identify that single point and plan on plotting the vowel as a single dot on your graph of the acoustic vowel quadrilateral. If the formants are changing enough so that a single measurement could not fairly represent the syllable nucleus, then you need to treat the vowel as a diphthong and make two or more measurements representing the range of formant travel. You will then need to plot the vowel as a diphthong arrow on your AVQ chart. Eyeball your selected measurement points and note the approximate center frequencies for F1 and F2. For an adult male speaker, the 500 Hz and 1500 Hz reference lines on the spectrogram display will be useful: From our schwa analysis, you should recall that 500 Hz for F1 represents approximately a mid vowel on the height dimension, while 1500 Hz for F2 represents a central vowel on the front-back dimension. (The equivalent values for an adult female speaker are 600 and 1800 Hz.) Do the formants you have identified make sense in this context? 
If it sounds like a high front vowel for an adult male speaker, you'd better be seeing F1 below the 500 Hz line and F2 well above the 1500 Hz line. If your pattern isn't making sense, go back and reanalyze. 5.2. Step 2: Wide-band averaged spectrum: Looking at formants (See example 6.3.2.) 5.2.a. Preliminary numerical measurement. A wide-band spectrum or spectrogram is intended to show you formants directly. For this analysis procedure, we will use wide-band averaged spectra to get a preliminary numerical measurement of formant center frequencies. [Note 4] Use your analysis of the spectrogram to choose a suitable segment or segments to measure, and select the relevant part of the waveform by clicking and dragging. In Spectral Analysis Setup, select an appropriate wide-band setting ("Wide" or "Very Wide" or "Extra Wide"), select "Averaged spectrum", and have the system compute an averaged spectrum. (Warning: To compute an averaged spectrum in Signalyze, you need to push the spectrogram button, not the spectrum button. This is because Signalyze computes the averaged spectrum by first computing a spectrogram and then averaging the results -- cf. Manual p175.) The size of the segment to average will depend on the data you are looking at. In a long, steady-state, citation-form vowel, you might want to average 50 or even 100 ms from the middle of the syllable. In a short syllable or a syllable where the formants were changing rapidly, you might want to average a section as short as 5 or 10 ms. Usually 20-30 ms will give you a good sample. Note that the extra spectrogram display you get from computing the averaged spectrum (tracks 2a and 3a on the sample printouts Figures 1 and 2) shows you exactly what part of the signal got averaged for each averaged spectrum. Compare that extra spectrogram display (track 2a) with your spectrogram of the whole syllable (track 1b) to make sure that you chose a representative segment.
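The averaging described above (compute a short-window spectrogram over the selected segment, then average the frames) can be sketched in Python. This is only an illustration of the idea, not Signalyze's actual implementation: the Hamming window, the 2 ms hop between frames, the function name `averaged_spectrum`, and the direct evaluation of the transform at a caller-supplied list of frequencies are all assumptions made for this example.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n (assumed window shape, not Signalyze's)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def averaged_spectrum(samples, sample_rate, freqs, window_ms=8.0, hop_ms=2.0):
    """Average the magnitude spectra of successive short frames.

    samples     -- the selected segment of the waveform (list of floats)
    sample_rate -- sampling rate in Hz (e.g. 22050)
    freqs       -- frequencies in Hz at which to evaluate the spectrum
    window_ms   -- analysis window; 8 ms corresponds to the "Wide" setting
    hop_ms      -- spacing between frames (hypothetical value)
    Returns one averaged magnitude per entry in freqs.
    """
    win_len = int(round(window_ms * sample_rate / 1000.0))
    hop = max(1, int(round(hop_ms * sample_rate / 1000.0)))
    if len(samples) < win_len:
        raise ValueError("segment is shorter than one analysis window")
    window = hamming(win_len)
    starts = range(0, len(samples) - win_len + 1, hop)
    totals = [0.0] * len(freqs)
    for start in starts:
        # One spectrogram column: window the frame, then transform it.
        frame = [samples[start + i] * window[i] for i in range(win_len)]
        for k, f in enumerate(freqs):
            step = -2j * math.pi * f / sample_rate
            acc = sum(x * cmath.exp(step * i) for i, x in enumerate(frame))
            totals[k] += abs(acc)
    # Averaging the columns gives the averaged spectrum.
    return [t / len(starts) for t in totals]
```

With a 20-30 ms segment this produces about a dozen frames, and the largest values in the result mark the amplitude bumps you would inspect with the cursor. A longer window narrows the analysis bandwidth, so changing `window_ms` plays the role of switching among the band settings.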
Estimate the formant center frequencies by looking for the "center of gravity" of the appropriate amplitude bumps in the spectrum, and use the cursor in the spectrum display to get the numerical frequency values for those locations. Evaluate how precise your measurements are, and record your results with an appropriate rounding. When it is hard to decide on the best measurement, it can be useful to make an explicit note of the range of uncertainty: "F1 = 530 Hz [+15 Hz, -35 Hz]" 5.2.b. Calibration correction for averaged WB spectra We can calibrate our acoustic analysis system by checking the Signalyze displays against known input signals like the sound of a good-quality tuning fork. When you do that, you will find that most displays are acceptably accurate. This useful averaged-spectrum display is an exception: The frequency values you get from the averaged-spectrum display are consistently significantly higher than the input frequency. The size of the error depends on the sampling rate: For our usual 22,050 Hz sampling rate, an averaged wide-band spectrum will give you numbers that are a bit more than 40 Hz higher than the actual frequency. See Signalyze 3.12: Calibration of FFT spectra for more information about calibration and a table of recommended calibration corrections. This means that you need to correct the numbers you get from the system's averaged wide-band spectrum display by subtracting 40 Hz. This correction should bring the averaged wide-band numbers in line with the narrow-band numbers you'll get in Step 3. Record both the system number and the corrected measurement with appropriate labels. At this point in the analysis, I recommend recording numbers to the nearest Hz; at the last step, I will recommend rounding your results down to the next lower 10 Hz. 5.2.c. Evaluation Now evaluate the validity of your calibrated measurement in the same way that you did for your spectrogram interpretation. 
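The calibration correction in 5.2.b, and the rounding recommended for the final step, are simple arithmetic. The sketch below shows them in Python; the function names are hypothetical, and the 40 Hz default applies only to averaged wide-band spectra at the 22,050 Hz sampling rate discussed above.

```python
# Correction for averaged wide-band spectra at 22,050 Hz, as given in 5.2.b.
AVERAGED_WB_CORRECTION_HZ = 40


def correct_averaged_wb(system_hz, correction_hz=AVERAGED_WB_CORRECTION_HZ):
    """Subtract the calibration correction from the system's reading."""
    return system_hz - correction_hz


def round_down_to_10(hz):
    """Round a frequency down to the next lower multiple of 10 Hz."""
    return int(hz // 10 * 10)


system_reading = 573                             # hypothetical cursor reading
corrected = correct_averaged_wb(system_reading)  # 533
final = round_down_to_10(corrected)              # 530
```

Keeping the system number and the corrected number as separate values mirrors the advice to record both with appropriate labels.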
Do these numbers match up with what you expected from the spectrogram? Do they match with what you are hearing? When you look at normative data, do your numbers match up with the values you expected? 5.2.d. Possible problems: Working with hard cases In most cases, it will be clear which bumps correspond to F1 and F2 (and F3, if you are measuring F3). Sometimes, however, one formant may be very faint, or two formants may run together into one big bump in the display. Another possibility, especially with a high fundamental frequency, is that your wide-band display may be showing you one bump per harmonic instead of one bump per formant. These problems are sometimes genuinely difficult, even impossible to solve. For example, once f0 is high enough in a static signal, there's no way to identify formant resonances as distinct from the harmonics. However, in most cases you can figure out what's going on. The first thing to try is display adjustments, using the same approach as with the spectrogram display: Expand the vertical scale in the spectrum track to see if faint formants come into view. (Shift-click on the up-down arrow next to the spectrum to display a vertical-zoom pop-up menu.) Try "Wide" instead of "Very Wide" if two formants seem to have run together. Try "Very Wide" or "Extra Wide" instead of "Wide" if you seem to be seeing harmonics instead of formants. If you're still seeing one big bump where you expected two formants, look back at the spectrogram display: If you see your single peak splitting up into two peaks 50 msec later in the signal, then you can be pretty confident that there are really two resonances in there. You will also get additional information from looking directly at the harmonics in the next step of the analysis. 5.3. Step 3: Narrow-band spectral cross-section: Looking at harmonics Narrow-band spectra are intended to show you the harmonics. 
A rule-of-thumb is to use a "very-narrow-band" spectral cross-section for a male speaker (fundamental about 100-150 Hz) and a "narrow-band" section for a female speaker or a child (fundamental 200 Hz or more). If your spectra are not showing you the individual harmonics, you want to move to an analysis setting with a narrower bandwidth. Take a narrow-band or very-narrow-band spectral cross-section from a point in the middle of the window that you averaged for the wide-band display in Step 2. To make comparison easy, display this new cross-section on the same frequency range that you used for the wide-band, and try to position the spectrum tracks next to each other. [Warning: If you are using a very-narrow-band cross section, you will usually need to click a different "range" button to get the two displays to show the same frequency range.] Now estimate formant center frequencies by looking at how much the different harmonics have been amplified: Each formant corresponds to a resonance peak in the transfer function; each cluster of amplified harmonics corresponds to a formant. For each formant, ask yourself where the resonance peak must be in order to get this harmonic pattern. You would expect the resonance to be between the two biggest harmonics in the group, and closer to the bigger one of the two. [Note 5] How much closer? From this spectral cross-section display, that's a judgement call: You need to evaluate the relative amplitudes of the two harmonics, and also to look at the harmonic on the far side of the biggest one. [Note 6] The sample analysis (section 6.3.3) works you through a couple of representative examples. Deal with difficult cases using the same general approach as in 5.2.d above. Not all problems are solvable, but usually you can figure out what is going on.
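The rule of thumb above (the resonance lies between the two biggest harmonics of a cluster, closer to the bigger one) can be turned into a rough first guess. The Python sketch below uses an amplitude-weighted average of the strongest harmonic and its stronger neighbour. This weighting is an assumption chosen for illustration, not the procedure this tutorial prescribes, and it does not replace the judgement call about the harmonic on the far side of the biggest one. The function name and the sample numbers are hypothetical.

```python
def estimate_resonance(harmonics):
    """Rough resonance estimate from one cluster of amplified harmonics.

    harmonics -- list of (frequency_hz, amplitude) pairs, with linear
                 amplitudes greater than zero
    Returns a frequency between the two biggest adjacent harmonics,
    pulled toward the bigger one.
    """
    if not harmonics:
        raise ValueError("need at least one harmonic")
    ordered = sorted(harmonics)  # sort by frequency
    peak = max(range(len(ordered)), key=lambda i: ordered[i][1])
    neighbours = [i for i in (peak - 1, peak + 1) if 0 <= i < len(ordered)]
    if not neighbours:
        return ordered[peak][0]  # a single harmonic: nothing to weigh
    other = max(neighbours, key=lambda i: ordered[i][1])
    f_big, a_big = ordered[peak]
    f_other, a_other = ordered[other]
    # Amplitude-weighted mean: equal amplitudes give the midpoint.
    return (f_big * a_big + f_other * a_other) / (a_big + a_other)


# Hypothetical cluster for a 120 Hz fundamental: harmonics 5, 6, and 7.
guess = estimate_resonance([(600.0, 0.8), (720.0, 1.0), (840.0, 0.5)])
# guess is about 667 Hz: between 600 and 720, closer to 720.
```

Treat the result as a starting cursor position to check against the display, not as a measurement in its own right.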
Sometimes it can help to trace how the harmonic pattern develops over time: Display a suitable narrow-band or very-narrow-band spectrogram, or display a series of cross-sections spaced 10 or 20 ms apart. Once you've decided where the resonances are, use the cursor to measure the frequencies, and evaluate your measurements in the same way you did with the averaged wide-band spectrum. As with the averaged wide-band measurement, I recommend recording numbers to the nearest Hz; the calibration correction for narrow-band spectra is too small to worry about at this stage in the analysis. 5.4. Step 4: Combined best estimate Now evaluate the big picture: You have three partially independent estimates of the formant center frequencies. You want to fit these together to get a combined best estimate of the actual formant center frequencies, your best judgement of where the vocal tract resonances really are. Do the measurements from the spectra agree with your eyeball estimate from the spectrogram? Are the two spectrum measurements equally precise and trustworthy? Do the numbers make sense, given what the syllable sounds like? Do the numbers from the two spectra agree? You would hope to see them within 20-40 Hz, once you have made the calibration correction to the averaged wide-band. Or are there some big discrepancies? If the three estimates basically agree on numbers that make sense, you're set. You might want to report an average, or you might want to select the measurement you think is more trustworthy, or you might want to use the system to find a compromise cursor location that makes a reasonable fit for both wide-band and narrow-band displays (allowing for the averaged wide-band calibration). If there are big discrepancies, or if the numbers don't fit with what you are hearing, you want to check it out: Did you measure the same part of the signal for wide-band and narrow-band? Is there a typo in the numbers you recorded?
Did you apply the right calibration correction? Was the wide-band spectrum really an averaged spectrum, or was it a cross-section over an 8 ms or 5 ms window? (The two spectrum displays will look similar on the screen, but the cross-section over the little window may be quite unrepresentative of the larger part of the signal you are looking at. Remember that you have to push the "Spectrogram" button to get the averaged spectrum: If the system didn't show you a spectrogram for the interval you averaged over, you didn't get an averaged spectrum.) Try redisplaying each spectrum, to see if you get a pattern that fits better. If it turns out that the wide-band and narrow-band spectra are really telling you something significantly different, which do you believe? This doesn't often happen; but when it does, I recommend going with the narrow-band data, as long as it is consistent with the general pattern you're seeing in the wide-band spectrogram. The narrow-band data is closer to the actual signal than the wide-band data is. Displaying a narrow-band spectrogram can also help you to figure out what's going on. Record your final best estimate for each formant center frequency. At this stage, I recommend rounding down to the next lower 10 Hz: I recommend rounding to tens, because +/- 10 Hz represents the best precision you could hope for under favorable circumstances. [Note 7] I recommend rounding down as a small additional calibration correction, because calibration data shows that your narrow-band/wide-band match is still a few Hz too high. 5.5. Step 5: Evaluation and plotting (See example 6.3.5) Step 4 gives you a combined best estimate of numerical values for the formant center frequencies that you are interested in. My instructions have been encouraging you to evaluate each measurement as you go along, asking yourself, do these numbers make sense?
Now, at step 5, you want to pause and evaluate more systematically: Your combined best estimate numbers are based on measurements of wide-band and narrow-band spectra. Do these numbers match up with what you expected from the spectrogram? Do they match with what you are hearing? When you look at normative data, do your numbers match up with the values you expected? For example, if you are looking at North American English, are you somewhere reasonable within those big ellipses on the Peterson and Barney chart? [Note 8] In doing this sort of evaluation, it is often useful to plot your data point(s) on an acoustic vowel quadrilateral that is set up like the displays for the normative data you have been looking at. [Note 9] These displays often help to put measurement discrepancies into perspective: The 20 Hz that you are worrying about may be not much bigger than the size of a legible dot on the display. Graphical displays also help a lot in identifying patterns: If you're hearing a high front vowel, you'd expect the data point(s) to fall in the high front part of the quadrilateral. If your results don't match the normative data, that doesn't prove your measurements are wrong: Your speaker may be doing something interesting. But it does mean that you want to go back and double check. Crucially, checking should include listening as well as checking measurements and arithmetic. In the end, you have to believe your own data, once you've checked to make sure it really is different from the textbook values. Warning: Teachers of acoustic phonetics courses often pre-select data so that most examples look like the textbook, but a few are "interesting". Don't be shocked if your assignment contains something significantly off from textbook norms. If you are training to be a speech pathologist, this is useful practice: After all, if your client wasn't doing something interesting, he wouldn't require your services. 5.6. 
Step 6: Documentation: Providing the evidence In many cases, you'll just want to record your numerical data on your worksheet and continue on to the next syllable. However, some assignments will ask for documentation; and you will always want to document a case that causes problems or gives evidence for some interesting or unexpected result. To do this, arrange a pretty screen display that documents the measurements you made; Figure 2 provides a model. In the spectra, put the cursor at the most interesting formant value to display the system's reading. Leave the active cursor in the waveform display, so that the small display in the top left documents which part of the signal you were measuring. If you are using a machine with a big enough screen, use the text window to add notes explaining what the different tracks show. Make a printout, and add appropriate labels and markers. Be sure to mark the waveform display to show where your averaged spectra and cross-sections were taken. The sample analysis gives more detailed suggestions (section 6.3.6). 6. A sample analysis: /æ/ in English "had" 6.1. Materials provided: Figure 1, Fig1Had.gif: Signalyze printout plus notes, as a GIF file. This printout shows the data display you would construct to measure F1 and F2 for /æ/ in this token of "had", without actually labeling the formants or marking the formant center frequencies. I suggest printing in landscape mode, 80% reduction. Figure 2, Fig2Had.gif: Figure 1 with added information. This printout shows the same data display as Figure 1, with labels for the formants and the formant center frequencies marked and measured. This is the sort of printout you would make to document a measurement for an assignment, for a client's file, or for a research study. Audio File 1, heedHad.aiff: The data file that is analyzed in Figures 1 and 2, in Mac Audio Interchange File Format. 
Warning: To open an AIFF file in Signalyze, you go to File>Open Signal, and then select "Audio IFF" in the Format pop-up. You won't be able to see the file until you have selected the right format. AIFF files can be opened in Signalyze and a range of other Mac programs. Audio File 2, heedHad.wav: The same file, in Windows .wav format, in case you want to do a parallel analysis with a PC-based analysis system. 6.2. Introduction This section is intended to illustrate the measurement procedure by working through a real example that has been selected to illustrate both easy cases and hard cases. The best way to approach it is to download Audio File 1, open it in Signalyze, and follow along with the analysis. I also recommend printing out Figure 1, and taking notes on the Signalyze displays as you work through the steps. At the end, you can check yourself by printing out Figure 2, to see how well your formant identifications and numbers match with mine. The data to analyze is one token of English "had" spoken in slow citation form by an adult male with a mixed U.S. accent (Carden). The task is to get F1 and F2 values that are representative of this speaker's /æ/; for the students doing my lab assignment, this /æ/ would be one of the corner vowels in the acoustic vowel quadrilateral that they need to develop. 6.3. Step-by-step analysis Refer to Figure 1. This printout is a screenshot at 1152 x 870, taken from a 21" monitor; it shows 12 tracks of Signalyze data and an adjacent text window for taking notes on the data. To print the Signalyze display on the same page with the text window while you are doing analysis, it is most convenient to use a screen capture utility like Flash-It. With a significantly smaller monitor, you would need to use a lower resolution and spread the equivalent analysis over two Signalyze pages and two pages of printout. It's very useful to have a monitor big enough to allow you to display 10 or 12 tracks at an acceptable size. 6.3.1. 
Step 1: Spectrogram analysis: Getting an overview (See 5.1.) Track 1a shows the waveform [Note 10] of the syllable we are analyzing, plus one of the other vowels from the set of corner vowels. Play the syllable a couple of times, to make sure that you are analyzing what you think you are, and to begin the process of matching up sound and picture. Does this sound like a reasonably normal English "had"? It's very slow, as you would expect with citation-form data: The total duration of the syllable is about 500 ms. Do you notice anything else funny about it? Track 1b shows a wide-band spectrogram, which gives an overview of the articulation of the whole syllable. Note that we don't need to expand "had" to fill the whole screen: It's actually easier to see the formant trajectories at the given horizontal scale. The spectrogram is displayed at half-range, which is 0-5512 Hz for this wide-band display. If we were interested solely in F1 and F2 of "had", we could display at quarter range (0-2756 Hz), but we would lose some information that might be relevant to potential interaction between F2 and F3. The selected darkness (33%) gives a good display for F1, F2, and F3 of "had"; if we were interested in F1 of "heed", we might want to increase the darkness. The initial /h/ appears as about 40 ms of aperiodic friction noise, and its effect continues as 50-60 ms of breathy voice. The glottal stricture for the /h/ should not significantly affect the tongue position, so we can assume that the formants represent the /æ/ articulation from the very beginning of the syllable. F1 and F2 let us track tongue position through the syllable nucleus. F1 tracks openness (inverse tongue height for vowels): We'd expect /æ/ to be an open (low) vowel, so we'd expect to see F1 well above the nominal mid-height reference line at 500 Hz. 
This checks with the data display: From the beginning of voicing (about 840 ms) to about 1150 ms, F1 starts well above 500 Hz and gradually increases, showing that the articulation starts open and gradually becomes even more open. F2 tracks the front-back dimension: We'd expect /æ/ to be a (moderately) front vowel, so we'd expect to see F2 well above the nominal central reference line at 1500 Hz. This also checks with the data display: F2 starts substantially above the 1500 Hz line and falls gradually toward it, so that the articulation starts front, and becomes gradually less front. About 1150 ms, there is an obvious inflection point in F1 and F2. This is the beginning of the "stop transition", as the tongue begins to move from the /æ/ to the alveolar closure for the /d/: F1 stops rising and begins to fall, as the tongue blade and body move up toward the closure and the vocal tract becomes less open; F2 stops falling and begins to rise, as the tongue body moves further forward to put the blade into convenient range for the alveolar closure. Signal and spectrogram analysis part 1: Listening and evaluation. Does the display make sense? Is everything what you would have expected? Play it some more, and check things out by playing segments as well as the complete syllable: If you play just the voiceless and breathy-voiced part (try 790-890 ms), you will hear [hæ] with a very short soft [æ]. If you cut at the end of the voiceless part (try 835 ms) and play the rest, you'll still hear [hæd]. If you cut at the end of the breathy voice (ca 890 ms) and play the rest, you'll hear [æd]. Conclusion (no surprise): The /h/ and /æ/ articulations overlap from about 835 to 890 ms. If you play the whole syllable up to the beginning of the stop transition (790-1150), you'll hear just [hæ].
If you add in the transition without the stop closure or release (790-1220 ms), you'll hear [hæd] with an unreleased /d/, confirming that the transition contains information about stop place of articulation. Now try playing the vocalic part from the end of the breathy voice to the beginning of the transition (890-1150 ms). [Note 11] This should all sound like [æ], but you should also be able to hear significant diphthongization: The one surprise in the spectrogram is the significant movement of F1 and F2 during what you might have expected would be a "steady state" vowel; [Note 12] the change of vowel quality that you are hearing corresponds to this movement of the formants. Check this out some more: Now that you've noticed the diphthongization, you ought to be able to hear it easily on the full syllable. It is also useful to play short (50-100 ms) chunks and listen for the vowel quality differences. [Note 13] Spectrogram analysis part 2: Selection of measurement points. Our objective is to get F1 and F2 measurements that fairly represent this token of /æ/. Since F1 and F2 are changing significantly, no single measurement can represent this /æ/. The standard way to handle a diphthong like this is to measure twice, once at the beginning of the formant trajectory, and once at the end. [Note 14] Our job is therefore to pick the best two measurement points. Extending the metaphor in the term "cross-section", we can call our measurement points "slice 1" and "slice 2", visualizing a cut through the three-dimensional spectrogram. Figure 1 shows the points I picked, but I encourage you to try to make your own choices before you look at my choices or read the description below. Slice 2: The second measurement point, slice 2, is easy to locate: There is a clear inflection point in both formants at about 1150 ms, where the tongue begins to move toward the stop closure.
We have two goals: we want to measure /æ/ with a minimum of influence from the surrounding segments, and we want to measure as much of the travel of the /æ/ diphthong as possible. Combining these goals, we should measure just before the inflection point. Slice 1: The first measurement point is harder to decide. The glottal adjustments for /h/ should not affect the tongue position, so ideally we should measure at the very beginning of the syllable in order to get the full travel of the diphthong. On the other hand, the syllable starts out at a low amplitude, and you often find irregularities at the beginning of voicing. The right approach is to try different locations, and to measure the earliest slice that gives you cleanly interpretable results. In this example, we are lucky and you can get a clean measurement from a slice centered around 860 ms, so that our measurement window corresponds to the 2nd, 3rd, and 4th beats of the vocal folds. 6.3.2. Step 2: Wide-band averaged spectrum: Looking at formants (See 5.2.) Slice 1: Compute an averaged spectrum for slice 1. Since the formants are moving significantly, you want to average over a fairly short interval, so as not to average away too much of the formant travel; 20-30 ms would be appropriate for this example. Select an appropriate segment of the waveform in track 1a (I selected 849-869 ms), select "Averaged spectrum" in "Spectral Analysis Setup", and click the Spectrogram button. The system computes a spectrogram for the segment you selected (Track 2a), and then displays the average of the spectrogram values as a spectrum (2b). Look closely at the spectrograms, comparing the whole syllable (1b) and your selected segment (2a), and make sure you are happy with the segment you selected: Does this segment appropriately represent the beginning of the travel of F1 and F2?
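To make the idea of an averaged spectrum concrete, here is a minimal Python sketch of averaging short-frame magnitude spectra over a 20 ms segment. It is an illustration only, not a description of how Signalyze computes its display: the 8000 Hz sampling rate, the 64-sample frame, the 16-sample hop, the Hann window, and the synthetic 1000 Hz test tone are all assumptions chosen to keep the example small.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude spectrum (bins 0..N/2) of one frame, using a naive DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def averaged_spectrum(samples, frame_len, hop):
    """Average the magnitude spectra of overlapping Hann-windowed frames."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    total = [0.0] * (frame_len // 2 + 1)
    count = 0
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + frame_len], window)]
        for k, m in enumerate(dft_magnitudes(frame)):
            total[k] += m
        count += 1
    return [t / count for t in total]


# Assumed test setup: a 20 ms stretch of a pure 1000 Hz tone sampled at 8000 Hz.
RATE_HZ = 8000
FRAME_LEN = 64   # short frame = wide analysis band (125 Hz per bin here)
tone = [math.sin(2 * math.pi * 1000 * i / RATE_HZ) for i in range(160)]

spectrum = averaged_spectrum(tone, frame_len=FRAME_LEN, hop=16)
peak_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
print("strongest bin is at", peak_bin * RATE_HZ / FRAME_LEN, "Hz")
```

The short frame is what makes the analysis wide-band: each bin is too coarse to separate individual harmonics of a real voice, so neighboring harmonics merge into formant-sized bumps.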
Identify formants and measure formant center frequencies: The wide-band averaged spectrum in (2b) shows three obvious bumps, corresponding to F1, F2, and F3. [Note 15] F1: F1 stands by itself, so it's easy to measure: Place the cursor at the center of gravity of the bump, and read off the frequency value in the window to the left of the spectrum track. You judge center of gravity by eye: Place the cursor in a promising location, and then use the arrow keys to move the cursor back and forth until it looks right: 674 Hz? Definitely too far left. 681? Still a bit left. 689? This looks promising. 697? Also promising. 704? Definitely too far right. Click back and forth between 689 and 697 a few times until you decide which you prefer. Record your number, rounded to the nearest Hz, and also the appropriate calibration correction (section 5.2.b). Hint 1. Each click of an arrow key moves you one pixel (display element); this changes the frequency by 7 or 8 Hz -- the exact amount and the exact numbers will depend on your display resolution and the scale you picked for the spectrum display. Follow the numbers as you move the cursor on the screen display, so you get an idea of the sort of precision that is possible. Fractions of a Hz are clearly irrelevant, and my proposal to eventually round to tens of Hz should begin to look reasonable. Hint 2. There is no law that says you have to report a number that appears in the display window: If you think that 689 is a bit left of target, and 697 is a bit right, it's fine to split the difference and say 693. In most cases this sort of precision won't be relevant, but sometimes it takes less time to split the difference than to make a hard choice. Hint 3. Record the number along with the calibration correction: "689 - 40 = 649 Hz". Otherwise you're sure to forget whether you recorded your numbers before or after subtracting the 40 Hz. Hint 4.
When you are deciding where to place the cursor, it helps to get rid of visual clutter: Move the mouse pointer out of the way; I even find that it's useful to use an opaque ruler to block out the track below. You want to be able to focus on the bump you're measuring. Hint 5. Your objective is to find the center of gravity of the whole bump, the whole area amplified by this particular formant resonance. Often this will match with the high point of the bump, or the middle of the flat top of the bump, but not always. Evaluate the amplified area as a whole. F2 and F3: F2 and F3 are close enough in (2b) so that the amplified areas overlap. This is a very common situation: In (2b), it's easy to identify the separate resonances despite the overlap; in other cases, you may have so much overlap that separation is difficult or impossible. To get a good visual center for each formant, it helps to try to picture how each formant bump would look if the other one was removed. I often use an opaque ruler to suggest the hidden side of the formant, making the assumption that the underlying resonance is symmetrical. Make your best estimates for F2 and F3, draw appropriate cursor locations on your printout of Figure 1, and record your numbers for comparison with my numbers in Figure 2. Slice 2: Conveniently, F1, F2, and F3 all form nice separate bumps in Slice 2. Give yourself practice by making center-frequency estimates for all three formants. Draw appropriate cursor locations on your printout of Figure 1, and record your numbers for comparison with my numbers in Figure 2. By this time, the estimates should be going pretty fast. 6.3.3. Step 3: Narrow-band spectral cross-section: Looking at harmonics (See 5.3.) 
Slice 1: Compute a suitable narrow-band spectral cross-section for slice 1: Section location: You want the data from the narrow-band display to be directly comparable to the data from the averaged wide-band display, so place the cursor in the waveform in approximately the middle of the window you analyzed for the averaged spectrum. My averaged wide-band window in Figure 1 ran from 849 ms to 869 ms, so I need a cursor placement for the narrow-band of about 859 ms. Spectral Analysis Setup: Since this is a male speaker, the first thing to try is the "Very Narrow" band setting in Spectral Analysis Setup. When you look at this window, you'll see that the range numbers are different from the numbers for the wide-band setting: To make your new spectrum line up with your averaged wide-band, you want a display range of 0-5512 Hz. This was "half range" for the wide-band, but it is "full range" for the very-narrow-band, so you'll need to adjust the setting. [Note 16] Display: Click the spectrum button to display the very-narrow-band spectral cross-section. This should look like Track 2c in Figure 1. Analysis: Before we start work on the analysis, recall how our source/filter analysis of vowels works: The vibrating vocal folds are the sound source. Since the vocal folds vibrate (pretty much) periodically, the sound produced by this source is periodic and consists of a large number of harmonics spaced at intervals equal to the fundamental frequency. As the harmonics generated at the vocal folds pass through the upper vocal tract, they get filtered by the resonances of the upper vocal tract (the formants), so that some of the harmonics get amplified more than others. The resonant frequencies (the formant center frequencies) depend on the vocal tract configuration -- where the tongue is, and so on. We perceive these different patterns of harmonic amplitudes as differences in vowel quality. The actual acoustic signal for a vowel consists of harmonics.
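The source/filter recap above can be sketched numerically. The following Python fragment is a toy model under stated assumptions, not a vocal tract simulation: the fundamental of 107 Hz, the resonance center of 655 Hz, the 100 Hz bandwidth, and the symmetric (Lorentzian-shaped) gain curve are all hypothetical values picked for illustration.

```python
def resonance_gain(freq_hz, center_hz, bandwidth_hz):
    """Symmetric resonance curve: gain is 1.0 at the center and falls off on both sides."""
    half_width = bandwidth_hz / 2.0
    return 1.0 / (1.0 + ((freq_hz - center_hz) / half_width) ** 2)


F0_HZ = 107.0         # assumed fundamental frequency (source)
FORMANT_HZ = 655.0    # hypothetical resonance center (filter)
BANDWIDTH_HZ = 100.0  # hypothetical resonance bandwidth

# The source supplies harmonics at whole-number multiples of the fundamental.
harmonics = [k * F0_HZ for k in range(1, 11)]

# The filter amplifies each harmonic according to its distance from the resonance.
gains = [resonance_gain(h, FORMANT_HZ, BANDWIDTH_HZ) for h in harmonics]

strongest = harmonics[gains.index(max(gains))]
for h, g in zip(harmonics, gains):
    print(f"{h:7.1f} Hz  relative gain {g:.3f}")
print("strongest harmonic:", strongest, "Hz")
```

In this toy case the strongest harmonic (642 Hz) sits below the assumed resonance at 655 Hz, and the harmonic above it is amplified more than the harmonic below it, which is the kind of asymmetry the narrow-band analysis relies on.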
The wide-band display is intended to smear the amplified harmonics together to display the formants, so each of the three big bumps in Track 2b corresponds to a formant, and you can't see the harmonics at all. Your new narrow-band display is intended to show the harmonics themselves, so each of the many narrow bumps in Track 2c corresponds to a harmonic. The narrow-band display has been computed so that the height of a harmonic in the display is intended to correspond to how much that harmonic got amplified as it passed through the upper vocal tract (see Note 5); clusters of amplified harmonics should show you where the resonant frequencies (the formants) are. Often you can look at the harmonic pattern and make a more precise estimate of formant center frequency than you could do looking at the wide-band display alone. F1: Let's begin with F1. In Track 2c, we can see a cluster of amplified harmonics somewhat above 500 Hz. These four or five amplified harmonics are smeared together in the wide-band Track 2b to give us the big bump that we identified as F1. The biggest harmonic has a center frequency of about 636 Hz. The next biggest one is the harmonic immediately above at 743 Hz, the 3rd biggest is the one just below at 528 Hz: [Note 17] You can see that the harmonics are evenly spaced, at intervals of about 107-108 Hz, matching the fundamental as expected. [Note 18] Our job is to analyze the harmonic amplitudes and figure out where the formant center frequency is. It is tempting to select the biggest harmonic in the cluster and say that the formant center frequency is the frequency of that biggest harmonic. This is almost always wrong: The harmonic frequencies are determined by the fundamental, by how fast the vocal folds are vibrating; this is controlled (mostly) by the larynx muscles. The formant frequencies (resonances) are determined by the shape of the upper vocal tract; this is controlled (mostly) by the tongue and lip muscles. 
The two frequencies are therefore controlled independently [Note 19], and it would be an accident if a formant frequency happened to match up exactly with a harmonic frequency. Let's see what the harmonic amplitudes tell us about the location of the formant center frequency, the resonance that is amplifying the harmonics. The biggest harmonic in the cluster (636 Hz) got the most amplification, so the relevant resonance must be closer to it than to any other harmonic. Suppose the resonance was right on top of the 636 Hz harmonic: then we would expect that the harmonics on either side would get amplified equally--visualize a textbook symmetrical resonance peak superimposed on the big harmonic. But in fact the harmonic to the right (743 Hz) is substantially larger than the harmonic to the left (528 Hz). Therefore the resonance peak (the formant center frequency) must lie to the right of the biggest harmonic, so that the 743 Hz harmonic gets amplified more than the 528 Hz harmonic. How far to the right? Suppose the resonance peak was exactly half-way between the 636 Hz harmonic and the 743 Hz harmonic: Then we would expect that the two harmonics would get equal amplification and would show as the same size in the display. Since the 636 Hz harmonic is substantially bigger, the resonance peak must be substantially to the left of the mid-point. We are now zeroing in on the location of the resonance peak, the F1 center frequency: It must be somewhere to the right of the big harmonic, but substantially to the left of the half-way point. Where exactly? With our present displays, that's a judgement call [Note 20]: Two clicks to the right of the big harmonic is 651 Hz. Three clicks is 658 Hz. Four clicks is 666 Hz. These are all reasonable candidates. You want to visualize a resonance peak superimposed on the harmonics: How far to the right should it be to hit the three biggest harmonics cleanly? 
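The reasoning above can be imitated numerically with parabolic interpolation through the strongest harmonic and its two neighbors. This is a common peak-estimation technique swapped in for illustration, not the by-eye procedure the tutorial teaches. The harmonic frequencies are the ones quoted for Track 2c; the amplitude values in dB are hypothetical, since the text gives only their rank order.

```python
def interpolate_peak(freqs_hz, amps_db):
    """Estimate a resonance peak from the strongest harmonic and its two neighbours.

    Fits a parabola through three (frequency, amplitude) points and returns
    the frequency of its vertex. Assumes roughly even harmonic spacing.
    """
    i = max(range(len(amps_db)), key=lambda k: amps_db[k])
    if i == 0 or i == len(amps_db) - 1:
        raise ValueError("strongest harmonic needs a neighbour on each side")
    left, mid, right = amps_db[i - 1], amps_db[i], amps_db[i + 1]
    spacing = (freqs_hz[i + 1] - freqs_hz[i - 1]) / 2.0
    # Offset in units of harmonic spacing; positive means "to the right".
    offset = 0.5 * (left - right) / (left - 2.0 * mid + right)
    return freqs_hz[i] + offset * spacing


# Harmonic frequencies from the text; amplitudes are HYPOTHETICAL illustrations
# that respect the stated ordering (636 biggest, then 743, then 528).
harmonic_freqs = [528.0, 636.0, 743.0]
harmonic_amps_db = [-12.0, 0.0, -6.0]

# The spacing between adjacent harmonics approximates the fundamental.
f0_estimate = (harmonic_freqs[-1] - harmonic_freqs[0]) / (len(harmonic_freqs) - 1)

f1_estimate = interpolate_peak(harmonic_freqs, harmonic_amps_db)
print(f"fundamental is roughly {f0_estimate:.1f} Hz")
print(f"interpolated F1 is roughly {f1_estimate:.0f} Hz")
```

With these made-up amplitudes the estimate falls to the right of the 636 Hz harmonic and well to the left of the half-way point, in the same region as the 651-666 Hz candidates. Different amplitude values would move it, so treat the number as a cross-check on your visual judgement rather than a replacement for it.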
F2 and F3: Now let's apply the same method to locating the resonance peak (formant center frequency) for F2 and F3: For F2, we have two adjacent harmonics of nearly the same size, a slightly larger one at about 1891 Hz, and a slightly smaller one at about 1998 Hz. Since these two harmonics are almost the same size, they should have received almost equal amplification. It follows that the resonance peak should be almost exactly half-way between these harmonics. Since the lower harmonic is very slightly bigger, the resonance peak should be very slightly to the left of center. A cursor placement of 1945 Hz looks like dead center; one click left gives 1937 Hz. [Note 21] Slice 2: Compute a suitable narrow-band spectral cross-section for slice 2, and analyze it in the same way that we did for slice 1. This should be a straightforward application of the same methods we used for slice 1; you can compare your results to mine by looking at Track 3c in Figure 1 and the associated analysis in Figure 2. There is one place where the analysis of Track 3c is not straightforward: What is happening between F2 and F3 at about 1900-2000 Hz? For most of the display, the harmonics are spaced roughly every 95 Hz, which fits with the fundamental frequency you see in the waveform (note that the fundamental has decreased significantly from slice 1 to slice 2); but there's a gap between the harmonic at about 1840 Hz and the next obvious harmonic at about 2029 Hz: We're missing the harmonic you would expect to see at about 1935 Hz. Looking at the harmonics on either side of the gap, you can see that our simple source/filter model would predict that the missing harmonic should be quite substantial, fitting into the right-hand side of the F2 curve. What gives?
The most likely answer is that the missing harmonic is not related to some anomaly in the filter (the resonance pattern), but instead involves some irregularity in the source function: The vocal folds are vibrating in a complex pattern that generates little or no energy at this harmonic frequency. If the energy isn't there in the source function, the formant resonances making up the filter won't have anything to amplify. In this particular case, the irregularity is unlikely to confuse you: The F1, F2, F3 pattern is well-defined, and the gap in the narrow-band display is small. In other cases, source-function irregularities can be genuinely difficult to interpret. To see some examples, look earlier in the syllable, in the region from 20 or 30 ms in front of the big amplitude irregularity in the waveform to 20 or 30 ms after: [Note 22] For example, a narrow-band section at 914 ms will show interesting missing harmonics at about 2000 Hz and 2500 Hz, producing a display where there seem to be four peaks in the F2-F3 area, and it's quite hard to decide where the F2 center frequency lies. The best way to make sense of these local irregularities is to go back to the wide-band spectrogram, which should give you an overall picture of the resonance pattern and how it is developing: The spectrogram will tell you approximately where the formants must be, and that general location will tell you how to interpret the detailed patterns you see in the averaged spectra and the spectral cross sections. You always get your numbers off the spectral displays, but often the spectrogram will be crucial in telling you where you should measure in the spectral display. 6.3.4. Step 4: Combined best estimate (See 5.4.) In this step, you need to fit all your data together.
As you made each of your individual measurements, you should have been looking back at the spectrogram to be sure that your numbers made sense as part of the overall pattern; now you want to get the best numerical estimate you can. Table 1 gives my numbers for the first slice of "had": Table 1. Measurements for [æ] in "had", Slice 1, in Hz Data from Figure 2.
You can see that the corrected wide-band numbers come out fairly close to the narrow-band numbers; the largest difference is 29 Hz. This is a fairly typical result. You'd want to go back over the F2 and F3 measurements and verify your cursor placements, but differences of this scale don't suggest any serious problem. It does tell you something about the precision of the measurements. In this table, the raw numbers, for wide-band and narrow-band, are rounded to the nearest Hz; with measurements bouncing by 10-30 Hz, recording fractions of a Hz is plainly irrelevant. The wide-band calibration correction of -40 Hz is mechanical, fixed for a wide-band averaged spectrum at this sampling rate; if you wonder where these calibration numbers came from, see Signalyze 3.12: Calibration of FFT spectra. In addition, the final result, the combined best estimate, is rounded down to the next lower ten. We round, because reporting in tens of Hz gives an idea of the precision of the overall measurement; we round down, as a further calibration correction. The hard part is the "combined best estimate". This is not mechanical -- it requires an evaluation of the wide-band and narrow-band measurements for the particular example. Often the right answer is an average of the wide-band and narrow-band measurements, but not always. To see how this goes, let's work through a couple of examples. F1: This one is easy, because the (calibrated) wide band and narrow band are so close together: The difference of 9 Hz is about one click apart on the spectral display in Figures 1 and 2. Whether we go with the narrow band or average the two measurements, we'll come to the same result: The narrow-band 658 Hz would round down to 650; the average (654 Hz) would also round down to 650. 
My final number of 650 is closer to the (calibrated) WB measurement (649) than to the NB measurement (658); but that does not represent a judgement that the WB is more reliable in this case; it's just an effect of the "round down" calibration rule. If we had started with the WB number 649 and applied the "round down" rule, we'd have got 640. F2: Here we have quite a substantial difference (29 Hz), and no obvious problems with measuring either display. If we average the two numbers, we get (1966+1937)/2 = 1952, which rounds down to 1950. That would be an acceptable "right" answer. In my answer, I fudged the average down a bit to get 1940. I did this to match more closely with the NB result, because I generally have more faith in the NB results: They are closer to the actual signal, and they (usually) permit a more precise measurement, once you master the method. If I put a cursor in the NB display at 1950, it looks unambiguously too far to the right. If I put a cursor in the WB display at the uncalibrated equivalent of 1940 (1940 + 40 to reverse the calibration correction = 1980), it looks too far to the left, but I can imagine it as a possible placement. A combined estimate of 1940 is therefore reasonably consistent with both displays, while the mechanical average of 1950 looks clearly wrong for the NB display. Therefore I prefer the 1940 Hz estimate. F3 from Slice 1 and the measurements from Slice 2 reported in Figure 2 all illustrate the same sort of analysis. I recommend making your own combined best estimates for the three formants of Slice 2, and then checking against my version, hidden in Table 2 in Note 23. You can see that there is a significant judgement call involved, but the degree of uncertainty involved is typically about +/- 10 to 20 Hz. That's a fair estimate of the uncertainty in these measurements, under favorable circumstances.
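The mechanical part of the combined estimate (calibrate the wide-band number, average it with the narrow-band number, round down to the next lower ten) can be written out as a short Python sketch. The function names are made up for this illustration; the numbers are the F1 and F2 values discussed above. The sketch gives only the mechanical starting point, and the final choice still depends on the judgement described in the text.

```python
import math

# Wide-band calibration correction stated in the tutorial for this setup.
WIDE_BAND_CORRECTION_HZ = -40


def calibrate_wide_band(raw_hz):
    """Apply the fixed wide-band calibration correction to a raw cursor reading."""
    return raw_hz + WIDE_BAND_CORRECTION_HZ


def round_down_to_ten(value_hz):
    """Round down to the next lower multiple of 10 Hz."""
    return int(math.floor(value_hz / 10.0)) * 10


def mechanical_estimate(wb_calibrated_hz, nb_hz):
    """Average the calibrated wide-band and narrow-band values, then round down."""
    return round_down_to_ten((wb_calibrated_hz + nb_hz) / 2.0)


# F1: raw wide-band 689 Hz, narrow-band 658 Hz.
f1_mechanical = mechanical_estimate(calibrate_wide_band(689), 658)

# F2: calibrated wide-band 1966 Hz, narrow-band 1937 Hz.
f2_mechanical = mechanical_estimate(1966, 1937)

print("F1 mechanical estimate:", f1_mechanical, "Hz")
print("F2 mechanical estimate:", f2_mechanical, "Hz")
```

For F1 the mechanical result matches the final number of 650 Hz. For F2 it gives 1950 Hz, which is the "acceptable" answer; the preferred 1940 Hz comes from weighting the narrow-band display more heavily, and that step is not captured by the arithmetic.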
I should mention again that this "had" example was selected as representative of fairly easy cases, and an easy case is what you want, when you're first learning a method. 6.3.5. Step 5: Evaluation and plotting (See 5.5.) If you are working through this analysis with Signalyze, don't look at this section until you have done your own measurements and evaluated them as suggested in section 5.5. Table 3 gives my numbers from the analysis in Figure 2, along with sample normative data from Peterson and Barney 1952. Table 3. Measurements for [æ] in "had", Slices 1 and 2, plus normative data from Peterson & Barney 1952 /æ/ Data from Figure 2; measurements in Hz.
Evaluation part 1: my analysis: The textbooks tell us that the vowel in English "had" should be a low (open), front /æ/ monophthong; this sounds like a (reasonably) normal "had", so we're expecting to see numbers that fit with open and front. The vowel in the token we are analyzing is obviously diphthongized, which is an important point in itself; suitable segments around each slice still sound like [æ], so we expect both slices to have numbers that fit with open and front. Let's take it one formant at a time: F1: F1 correlates with openness; 500 Hz is a nominal adult male reference value for a mid degree of openness. The P&B value for /æ/, a maximally open vowel, shows us that 660 Hz counts as fully open for an average male. F1 for slice 1 is essentially identical to the P&B value; F1 for slice 2 is 790 Hz, implying an articulation that is very substantially more open than slice 1. What gives? Does this make sense? There are probably two factors involved: 1. Minor factor -- vocal tract length: We would expect formant frequencies to vary inversely with vocal tract length. The P&B averages presumably reflect an average US male vocal tract; our present speaker (Carden) is a small man; it would be no surprise if his vocal tract length was 5% or even 10% shorter than the average length implied by the normative data. A shorter vocal tract would resonate at higher frequencies, so a given articulation would produce higher formant center frequencies. I call this a minor factor, because it could account at best for about half the difference we are seeing: The 790 Hz in slice 2 is 20% more than the 660 Hz in the normative data, and differences in vocal tract length could account for only a 5% or 10% increase. Our speaker's articulation must be significantly different from the P&B average. 2.
Major factor -- the phonological pattern permits the observed differences in articulation: It's clear that this speaker's /æ/ articulation differs from the monophthong reported in textbooks and implied by the P&B numbers. First, the articulation glides substantially, starting open (650 Hz) and ending a lot opener (790 Hz). Second, whatever we assume about vocal tract length, the end point of the glide is 10-20% opener than the P&B average. To see why this is a possible thing for a normal speaker to do, consider the phonological pattern of English vowels: /æ/ is distinctively low (open) and front, predictably non-round (because it's front). Thinking about the low-high = open-close dimension, /æ/ is distinctively the lowest (= most open) front vowel. Phonetically, /æ/ needs to be low (open) enough to be distinct from / F2: We can do a parallel analysis for F2: F2 correlates with the front-back dimension; 1500 Hz is a nominal adult male reference value for the central position on the front-back dimension. The P&B value for /æ/, a moderately front vowel, shows us that about 1700 Hz counts as front for an average male. F2 for slice 2 is essentially identical to the P&B value; F2 for slice 1 is 1940 Hz, implying an articulation that is substantially more front than slice 2. As with F1, we are seeing significant diphthongization; for F2, the numbers show a glide within the front area from quite front to somewhat less front. As with F1, we can see two relevant factors: 1. Minor factor -- vocal tract length: The 1940 Hz number for slice 1 is 13% greater than the P&B number of 1720 Hz, so we could imaginably account for most of that difference simply on the basis of vocal tract size. 
I continue to call vocal tract size the minor factor, because we still have the diphthongization to account for: If we assume that the speaker has a short enough vocal tract that the 1940 Hz from slice 1 represents essentially the same frontness as P&B's 1720 Hz, then it follows that the 1700 Hz from slice 2 must represent an articulation noticeably less front. Whatever we assume about vocal tract length, we will still end up with a significant difference of articulation to account for. 2. Major factor -- the phonological pattern permits the observed differences in articulation: As with F1, the point is that the variation we see is all within the part of the vowel space that belongs to /æ/. /æ/ is the only low (open) front vowel. The F2 value for /æ/ can move back and forth as much as it likes in the front range, as long as it doesn't get far enough back to trespass on the territory of central [a] or back [ Summary: Compared to textbook norms like the P&B data, the unusual thing about this /æ/ is the diphthongization. Any individual slice of the /æ/ has F1 and F2 values that are appropriate for /æ/, but the vowel as a whole [cf. Note 11] becomes significantly more open and less front over its duration of almost 300 ms. It is natural to speculate that the diphthongization is related to the unusual duration; [Note 27] to test this, we'd want to look at examples at more normal speech rates and especially at examples in sentence context. Plotting: Since this /æ/ is phonetically a diphthong, we would plot it on an acoustic vowel quadrilateral as an arrow running from the [F1,F2] data point for slice 1 to the [F1,F2] data point for slice 2. If we do this on a chart like Peterson and Barney Figure 8 (reprinted as Borden, Harris, & Raphael Figure 5.30), we can see that the whole travel of the diphthong comes within the ellipse representing the range of /æ/ productions. [Note 28] This confirms our analysis of the F1 and F2 values above. 
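The percentage comparisons used in this evaluation, and the direction of the diphthong arrow, are simple enough to check in a few lines of Python. The values are the ones quoted in the text (slice 1 and slice 2 combined estimates, and the Peterson and Barney male averages for /æ/); the variable and function names are made up for this sketch.

```python
# Values quoted in the text, in Hz.
PB_MALE_AE = {"F1": 660, "F2": 1720}   # Peterson & Barney 1952 normative values
SLICE_1 = {"F1": 650, "F2": 1940}      # beginning of the formant travel
SLICE_2 = {"F1": 790, "F2": 1700}      # end of the formant travel


def percent_above(measured_hz, reference_hz):
    """How far a measurement lies above a reference value, in percent."""
    return 100.0 * (measured_hz - reference_hz) / reference_hz


# Largest departures from the normative data for each formant.
f1_excess = percent_above(SLICE_2["F1"], PB_MALE_AE["F1"])
f2_excess = percent_above(SLICE_1["F2"], PB_MALE_AE["F2"])

# The diphthong as an arrow from slice 1 to slice 2: (change in F1, change in F2).
glide = (SLICE_2["F1"] - SLICE_1["F1"], SLICE_2["F2"] - SLICE_1["F2"])

print(f"F1 at slice 2 is {f1_excess:.1f}% above the P&B value")
print(f"F2 at slice 1 is {f2_excess:.1f}% above the P&B value")
print("glide (dF1, dF2) in Hz:", glide)
```

The rising F1 and falling F2 in the glide correspond to the vowel becoming more open and less front over its duration.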
Evaluation part 2: your analysis: If you worked through this analysis on your own, your final analysis is not likely to be identical to mine. There are substantial judgement calls in deciding where to measure; in most cases, measurements at different points will give different numbers. There are also smaller judgement calls in interpreting the displays at any given measurement point, and in deciding how to weight the data from different measurement methods. Your final printouts and numbers will probably look noticeably different from my Figure 2. That's fine as long as we're both coming up with valid measurements. A valid measurement (a full-credit measurement, if you're thinking about the lab exam in Linguistics 317 or the equivalent) is a measurement where the formant values accurately represent what the speaker is doing. If we plotted two valid measurements of the same token on a typical acoustic vowel quadrilateral chart, the dots or lines might not overlap, but they'd tell us the same thing about the articulation. 6.3.6. Step 6: Documentation: Providing the evidence (See 5.6.) To document an analysis, begin by making a printout like Figure 1, showing the displays that you will use for your analysis. Then add annotations to that printout, as illustrated in Figure 2. The list below gives a sample set of specific directions for making a useful annotated printout:
a. Identify each spectral display by specifying the band ("averaged very-wide-band", "very-narrow-band", etc.), the time reference for the measurement, and the window size.
b. Add a vertical line at your proposed center frequency for each formant that you are measuring in each spectrum. This should show your measurement for that particular spectral display, not your final combined best estimate. For an averaged wide-band display, your line should show the uncalibrated location -- the objective is to provide a visual check of how you judged each center frequency. As with the time references in (2), you can use the computer system to put one of the vertical lines into the spectral display: Pick the most interesting formant, and leave the cursor in the right place when you make your printout; add the other lines by hand while you're looking at the computer display, moving the cursor around the computer display so you can remind yourself which locations match your recorded numbers.
c. If your combined best estimate for a given formant is substantially different from one or more of the measurements in (a), add a labelled line showing the combined best estimate. If you're adding that line to an averaged wide-band display, reverse the calibration correction, so that your new line matches appropriately with the visual display.
d. Label each formant you measured with F1, F2, etc.
e. Record the numbers for each measurement in some suitable location. In most cases, there won't be room to put the numbers into the track with the spectral display; you'll need to add notes next to the display, as in Figure 2. For displays that have a significant calibration correction, record the original number and the calibrated number in a format that makes it obvious what's going on: F1 = 834 - 40 = 794 Hz. |