Tenth Quarterly Progress Report
November 1, 1997 through January 31, 1998
NIH Project N01-DC-5-2103
Prepared by
Mariangeli Zerbi, Dewey Lawson and Blake Wilson
Center for Auditory Prosthesis Research
Research Triangle Institute
Research Triangle Park, NC 27709
II. Effects of Upward Extension of the Frequency Range Analyzed by CIS Processors
III. Plans for the Next Quarter
Appendix 1: Summary of Reporting Activity for this Quarter
One of the principal objectives of this project is to design, develop, and evaluate speech processors for implantable auditory prostheses. Ideally, the processors will represent the information content of speech in a way that can be perceived by implant patients. Another principal objective is to develop new test materials for the evaluation of speech processors, given the growing number of cochlear implant subjects enjoying levels of performance too high to be sensitively measured by existing tests.
Work in the present quarter included:
In this report we describe in detail studies to evaluate possible effects of extending the overall frequency range analyzed and represented by CIS processors. (Preliminary findings were presented in Quarterly Progress Report 1 for this project.)
Results from other studies and activities indicated in the list above, such as results from the studies with bilateral subject ME2, will be presented in future reports.
In tests with CIS processors we have observed consistent differences in consonant identification scores between female and male voices. This has led us to a careful comparison of the representation of male and female voices by our CIS processor implementations. Typical CIS processors have had an input spectral range of 350 to 5500 Hz. We have determined that this eliminates useful information from the high frequency consonants of female speakers and some low frequency voicing information from both male and female consonants.
We have compiled a list of the most frequent consonant confusions in data from four cochlear implant subjects using CIS processors that spanned our customary overall frequency range of 350 Hz to 5500 Hz. Tables 1 and 2 indicate the relative incidence of those confusions (among 40 presentations of each token across four subjects), and processor manipulations that might reduce their incidence. The proposed manipulations are adding a high frequency channel (one covering 5656 Hz to 9000 Hz), using no pre-emphasis equalization (allowing more voicing information in channel one), and adding a low frequency channel (80 Hz to 350 Hz).
The four subjects represented in these data included two with 6-electrode Ineraid implants and two with percutaneous versions of the Nucleus 22-electrode implant. Preliminary data for three of these four subjects -- NP1, NP2, and SR3 -- were presented in QPR1 of the current contract. More complete data for subject SR10 subsequently replaced those presented for subject SR2 in QPR1. The tests of identification of 16 medial consonants were conducted in a hearing alone condition with no feedback as to correct or incorrect responses. Each subject was using the six-channel CIS processor that had yielded his/her best overall performance to date. The processing hardware was identical except for stimulus current sources. All the processors used full-wave rectification in estimating energy levels for representation on each channel, and the same type of compression table for mapping stimuli to each subject's thresholds and most comfortable loudness levels.
Table 1. Possible ways to reduce principal confusions among medial consonants uttered by a female speaker.
| confusion | occurrences (/40) | add low freq. band | no equalization | add high freq. band | |
| t-k | 19 | x | |||
| v-vth | 18 | x | x | ||
| z-vth | 18 | x | |||
| s-vth | 10 | x | |||
| d-g | 10 | x | x | ||
| f-s | 9 | x | |||
| p-k | 9 | ||||
| n-l | 8 | x | |||
| p-t | 8 | x | |||
| f-vth | 7 | x | x | ||
| v-z | 7 | x | |||
| b-vth | 7 | x | x | ||
| n-m | 6 | x | x | ||
| f-z | 6 | x | |||
| v-b | 4 | x | x | ||
| z-d | 4 | x |
Table 2. Possible ways to reduce principal confusions among medial consonants uttered by a male speaker.
| confusion | occurrences (/40) | add low freq. band | no equalization | add high freq. band | |
| b-d | 6 | x | x | ||
| f-s | 5 | ||||
| m-n | 4 | x | |||
| v-vth | 4 | x | x |
It seemed that nine of these principal confusions might be reduced by the inclusion of higher frequency information in such CIS processors.
These observations led us to study the potential benefits of extending the overall frequency range represented by CIS processors at the high-frequency end. We present results for the five subjects with percutaneous 22-electrode implants studied so far. We compared processors analyzing an overall frequency range of 350Hz to 5500Hz with otherwise identical processors analyzing a range of 350Hz to 9500Hz, to examine the benefits or disadvantages of extending the range upward.
With each of five subjects we compared two six-channel processors that were identical except for the overall frequency range. The processors all supported good performance by their users. While they were not necessarily the best ones for each subject, they were the best pair of processors differing only in frequency range. The six-channel processors all had 12th order Butterworth bandpass filters and full-wave rectification. In every case the overall frequency range was divided into six logarithmically equal bands. Pulses were all presented positive phase first and in a staggered order of stimulation across electrodes. Four of the subjects' processors had 4th order low-pass envelope smoothing filters with a 200Hz cutoff, while subject NP2 had a 4th order smoother with a 400Hz cutoff. Electrode choice, pulse width, and stimulation rate all varied among, but not within, subjects. The processors with the extended frequency range used a 20 bit oversampling A/D while the other processors used a 12 bit A/D. Table 3 gives more detail about each pair of processors compared in this report.
Table 3. Processor descriptions
| subject | normal range processor No. | extended range processor No. | pulse rate (p/s) | pulse width (µs/phase) | smoother cutoff (Hz) | smoother order | electrodes used |
| NP2 | 92 | 100 | 833 | 33 | 400 | 4 | 1,5,9,13,15,21 |
| NP1 | 71 | 74 | 833 | 33 | 200 | 4 | 2,6,10,14,19,21 |
| NP4 | 60 | 71 | 833 | 33 | 200 | 4 | 5,9,13,15,19,21 |
| NP3 | 25d | 39 | 833 | 33 | 200 | 4 | 1,5,9,13,17,21 |
| NP5 | 16 | 15 | 500 | 100 | 200 | 4 | 1,5,10,12,14,16 |
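The pulse rates and widths in Table 3 imply a timing budget for non-simultaneous stimulation across six electrodes. The sketch below works through that arithmetic. It assumes biphasic pulses with no interphase gap and no gap between successive channels, neither of which is stated in this report; `pulse_budget` is a hypothetical helper name.

```python
# Hypothetical helper: timing budget for interleaved biphasic pulses.
# Assumptions (not stated in the report): two phases per pulse, no interphase
# gap, and no idle time between one channel's pulse and the next channel's.

def pulse_budget(rate_pps, width_us_per_phase, n_channels=6, phases=2):
    """Return (period in us, time occupied by pulses in us, occupied fraction)."""
    period_us = 1e6 / rate_pps                      # time between pulses on one channel
    busy_us = n_channels * phases * width_us_per_phase  # one pulse on every channel
    return period_us, busy_us, busy_us / period_us

# Rate and width values are taken from Table 3.
for label, rate, width in [("NP1-NP4", 833, 33), ("NP5", 500, 100)]:
    period, busy, fraction = pulse_budget(rate, width)
    print(f"{label}: period {period:.0f} us, pulses occupy {busy:.0f} us ({fraction:.0%})")
```

Under these assumptions the six pulses occupy about a third of each stimulation period at 833 p/s and 33 µs/phase, and 60% of the period at 500 p/s and 100 µs/phase.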
The performance of 3 subjects was compared with tests of identification of 16 medial consonants, while tests of identification of 24 medial consonants were used for the other two subjects because of their higher level of overall performance. The results within subjects are summarized in Tables 4 and 5.
Table 4. Percent correct scores for the 16 medial consonant identification tests
| subject | male talker, 5500 Hz | male talker, 9500 Hz | female talker, 5500 Hz | female talker, 9500 Hz |
| NP4 | 81 ± 2.8 | 84 ± 2.7 | 59 ± 2.5 | 75 ± 2.1 |
| NP5 | 82 ± 2.5 | 84 ± 2.7 | 62 ± 1.7 | 70 ± 2.8 |
| NP3 | 68 ± 3.3 | 69 ± 4.7 | 69 ± 2.5 | 78 ± 2.3 |
Table 5. Percent correct scores for the 24 medial consonant identification tests
| subject | male talker, 5500 Hz | male talker, 9500 Hz | female talker, 5500 Hz | female talker, 9500 Hz |
| NP1 | 63 ± 1.2 | 68 ± 2.3 | 63 ± 2.7 | 76 ± 1.5 |
| NP2 | 90 ± 1.5 | 89 ± 1.8 | 72 ± 2.8 | 79 ± 0.9 |
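As a quick arithmetic check on Tables 4 and 5, the change in percent correct between the two frequency ranges can be tabulated per subject and talker. The sketch below copies the mean scores from those tables and omits the ± values. Averaging across the 16 and 24 consonant tests mixes two different tests, so the means are only a rough summary and not a statistical comparison.

```python
# Percent-correct scores copied from Tables 4 and 5 (the ± values are omitted).
# Each entry: (male 5500 Hz, male 9500 Hz, female 5500 Hz, female 9500 Hz).
scores = {
    "NP4": (81, 84, 59, 75),
    "NP5": (82, 84, 62, 70),
    "NP3": (68, 69, 69, 78),
    "NP1": (63, 68, 63, 76),
    "NP2": (90, 89, 72, 79),
}

def gains(table):
    """Return {subject: (male gain, female gain)} in percentage points."""
    return {
        subject: (m_hi - m_lo, f_hi - f_lo)
        for subject, (m_lo, m_hi, f_lo, f_hi) in table.items()
    }

g = gains(scores)
for subject, (male, female) in g.items():
    print(f"{subject}: male {male:+d}, female {female:+d}")

# Rough summary only: this pools the 16 and 24 consonant tests.
mean_male = sum(m for m, _ in g.values()) / len(g)
mean_female = sum(f for _, f in g.values()) / len(g)
print(f"mean gain: male {mean_male:+.1f}, female {mean_female:+.1f}")
```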
The male test scores hardly changed with the frequency expansion; the difference between the two scores is insignificant except for subject NP1. On the other hand, the test scores for the female voice all increased by significant amounts when the upper frequency region was included. Many of the confusions in both frequency ranges are the same. For the female voice some confusions are resolved with the extended frequency range while a couple of new ones appear. We compared the most frequent consonant confusions within the two frequency ranges for the 16 consonant test and the 24 consonant test separately. The aggregate matrices from all the male voice 16 consonant tests are shown in Tables 6 and 7 for the normal and extended frequency ranges, respectively. In these and all subsequent matrices the row indicates the presented consonant token and the column the response. [Use of a boldface label distinguishes voiced th from the unvoiced form that will be included in the 24 consonant matrices.] Consonants identified correctly in fewer than 60% of presentations, for all three subjects combined, are considered "frequently misidentified". In those cases the correct identification scores are shown in red in the aggregate matrices and the percentages are noted in Tables 8a and 8b, where the most frequent confusions are also listed. These most frequent confusions are labeled with the presented token first, then the incorrect response, and also appear in bold in the aggregate matrices of Tables 6 and 7. [The same pattern, aggregate matrices for the two overall frequency ranges followed by lists of frequently misidentified tokens and most frequent confusions, is used below for the other combinations of talker gender and consonant test.]
Table 6. Aggregate matrix for male voice 16 medial consonant tests for subjects NP3, NP4, and NP5 with processors using an overall frequency range of 350Hz to 5500Hz.
| m | n | f | v | s | z | sh | **th** | p | b | t | d | g | k | j | l | ||
| m | 29 | ||||||||||||||||
| n | 6 | 21 | 3 | ||||||||||||||
| f | 25 | 2 | 1 | 2 | |||||||||||||
| v | 4 | 20 | 2 | 4 | |||||||||||||
| s | 2 | 20 | 4 | 4 | |||||||||||||
| z | 23 | 5 | 1 | 1 | |||||||||||||
| sh | 2 | 12 | 16 | ||||||||||||||
| **th** | 5 | 2 | 21 | 1 | 1 | ||||||||||||
| p | 26 | 1 | 3 | ||||||||||||||
| b | 21 | 9 | |||||||||||||||
| t | 2 | 28 | |||||||||||||||
| d | 3 | 21 | 6 | ||||||||||||||
| g | 3 | 25 | 2 | ||||||||||||||
| k | 8 | 2 | 20 | ||||||||||||||
| j | 2 | 28 | |||||||||||||||
| l | 3 | 2 | 25 |
Table 7. Aggregate matrix for male voice 16 medial consonant tests for subjects NP3, NP4, and NP5 with processors using an overall frequency range of 350Hz to 9500Hz.
| m | n | f | v | s | z | sh | **th** | p | b | t | d | g | k | j | l | ||
| m | 28 | 2 | |||||||||||||||
| n | 7 | 18 | 5 | ||||||||||||||
| f | 24 | 2 | 3 | 1 | |||||||||||||
| v | 4 | 19 | 1 | 6 | |||||||||||||
| s | 3 | 25 | 1 | 1 | |||||||||||||
| z | 27 | 1 | 1 | 1 | |||||||||||||
| sh | 2 | 1 | 27 | ||||||||||||||
| **th** | 1 | 8 | 2 | 18 | 1 | ||||||||||||
| p | 22 | 6 | 2 | ||||||||||||||
| b | 27 | 3 | |||||||||||||||
| t | 5 | 19 | 6 | ||||||||||||||
| d | 2 | 27 | 1 | ||||||||||||||
| g | 3 | 27 | |||||||||||||||
| k | 10 | 5 | 15 | ||||||||||||||
| j | 30 | ||||||||||||||||
| l | 2 | 28 |
Table 8a. For frequency range of 350Hz to 5500Hz, male talker, each token was presented 30 times (summation for all 3 subjects)
| Frequently misidentified Tokens | % correct ± 2.9 |
| sh | 53 |
| Most frequent Confusions | occurrences of errors |
| sh-s | 12 |
| b-d | 9 |
| k-p | 8 |
Table 8b. For frequency range of 350Hz to 9500Hz, male talker, each token was presented 30 times.
| Frequently misidentified Tokens | % correct ± 3.4 |
| k | 50 |
| Most frequent Confusions | occurrences of errors |
| k-p | 10 |
| **th**-v | 8 |
The /sh/-/s/ confusion occurred less often with the extended frequency range processors for all subjects, even though for the male voice both consonants seem to be well represented in the 350Hz to 5500Hz frequency range. Other confusions were hardly affected for these subjects by the expansion of the frequency range. (There have been some instances, for other subjects not in this group, where the male consonant test scores decrease somewhat with the increase of the maximum analyzed frequency to 9500Hz.)
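The tallies in Tables 8a and 8b can be reproduced from rows of the aggregate matrices in Tables 6 and 7. The sketch below uses only a few rows whose counts are confirmed by Tables 8a and 8b. Responses whose column is not recoverable from the tables as rendered here are pooled under a hypothetical `OTHER` key. The report does not state the cutoff used to select the "most frequent" confusions, so the `min_count` parameter (set to 8 here) is an assumption.

```python
OTHER = "(other)"  # pooled responses whose column is not recoverable here

# Selected rows only, male talker, 30 presentations per token.
# Inner keys are responses; the outer key is the presented token.
table6 = {  # 350-5500 Hz
    "sh": {"sh": 16, "s": 12, OTHER: 2},
    "b": {"b": 21, "d": 9},
    "k": {"k": 20, "p": 8, OTHER: 2},
}
table7 = {  # 350-9500 Hz
    "th": {"th": 18, "v": 8, OTHER: 4},  # voiced th
    "k": {"k": 15, "p": 10, OTHER: 5},
}

def percent_correct(matrix):
    """Percent of presentations of each token that were identified correctly."""
    return {tok: 100.0 * row.get(tok, 0) / sum(row.values())
            for tok, row in matrix.items()}

def frequently_misidentified(matrix, threshold=60.0):
    """Tokens scoring strictly below the threshold, with rounded scores."""
    return {tok: round(p) for tok, p in percent_correct(matrix).items()
            if p < threshold}

def confusions(matrix, min_count):
    """Presented-response pairs with at least min_count errors (assumed cutoff)."""
    found = [
        (f"{presented}-{response}", count)
        for presented, row in matrix.items()
        for response, count in row.items()
        if response not in (presented, OTHER) and count >= min_count
    ]
    return sorted(found, key=lambda item: -item[1])

print(frequently_misidentified(table6), confusions(table6, 8))
print(frequently_misidentified(table7), confusions(table7, 8))
```

With these rows, voiced th at 18 of 30 correct sits exactly at 60% and is therefore not flagged, which agrees with Table 8b listing only /k/.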
Table 9. Aggregate matrix for female voice 16 medial consonant tests for subjects NP3, NP4, and NP5 with processors using an overall frequency range of 350Hz-5500Hz.
| m | n | f | v | s | z | sh | **th** | p | b | t | d | g | k | j | l | ||
| m | 23 | 3 | 4 | ||||||||||||||
| n | 3 | 24 | 3 | ||||||||||||||
| f | 26 | 4 | |||||||||||||||
| v | 1 | 18 | 1 | 7 | 1 | 1 | 1 | ||||||||||
| s | 16 | 3 | 8 | 1 | 2 | ||||||||||||
| z | 1 | 6 | 2 | 19 | 2 | ||||||||||||
| sh | 2 | 10 | 3 | 15 | |||||||||||||
| **th** | 2 | 7 | 2 | 18 | 1 | ||||||||||||
| p | 28 | 2 | |||||||||||||||
| b | 1 | 4 | 1 | 2 | 22 | ||||||||||||
| t | 2 | 17 | 11 | ||||||||||||||
| d | 19 | 7 | 4 | ||||||||||||||
| g | 2 | 4 | 2 | 21 | |||||||||||||
| k | 5 | 2 | 23 | ||||||||||||||
| j | 5 | 25 | |||||||||||||||
| l | 3 | 12 | 15 |
Table 10. Aggregate matrix for female voice 16 medial consonant tests for subjects NP3, NP4, and NP5 with processors using an overall frequency range of 350Hz - 9500Hz.
| m | n | f | v | s | z | sh | **th** | p | b | t | d | g | k | j | l | ||
| m | 20 | 12 | 1 | 2 | |||||||||||||
| n | 28 | 2 | 5 | ||||||||||||||
| f | 23 | 9 | 1 | 2 | |||||||||||||
| v | 2 | 21 | 10 | 1 | 1 | ||||||||||||
| s | 2 | 30 | 3 | ||||||||||||||
| z | 1 | 25 | 4 | 1 | 4 | ||||||||||||
| sh | 4 | 2 | 29 | ||||||||||||||
| **th** | 1 | 14 | 1 | 19 | |||||||||||||
| p | 32 | 1 | 2 | ||||||||||||||
| b | 1 | 5 | 1 | 27 | 1 | ||||||||||||
| t | 1 | 33 | 1 | ||||||||||||||
| d | 1 | 17 | 15 | 2 | |||||||||||||
| g | 1 | 1 | 3 | 1 | 5 | 24 | |||||||||||
| k | 3 | 32 | |||||||||||||||
| j | 35 | ||||||||||||||||
| l | 2 | 15 | 18 |
Table 11a. For frequency range of 350Hz to 5500Hz, female talker, each token was presented 30 times.
| Frequently misidentified Tokens | % correct ± 2.2 |
| z | 6 |
| s | 27 |
| sh | 50 |
| l | 50 |
| t | 57 |
| Most frequent Confusions | occurrences of errors |
| z-**th** | 19 |
| s-f | 16 |
| l-n | 12 |
| t-k | 11 |
| sh-s | 10 |
Table 11b. For frequency range of 350Hz to 9500Hz, female talker, each token was presented 35 times.
| Frequently misidentified Tokens | % correct ± 2.5 |
| d | 49 |
| l | 51 |
| **th** | 54 |
| Most frequent Confusions | occurrences of errors |
| l-n | 15 |
| d-g | 15 |
| **th**-v | 14 |
| m-n | 12 |
For the female voice 16 consonant test results the most frequent confusions that we attributed to the upper frequency bound's being too low were resolved. These included: /z/-/th/, /s/-/f/, /t/-/k/, and /sh/-/s/. The /l/-/n/ confusion remained the same while some new confusions appeared, namely /d/-/g/, /th/-/v/, and /m/-/n/. In examining these confusions before the present study, we did not expect that expanding the upper band would help subjects distinguish between /l/-/n/, /th/-/v/, and /m/-/n/. Actually it seemed to hurt, possibly because of the frequency band in each channel being wider or because of where the frequency bands for each channel now happen to fall. Thus, we would expect that some of these problems would be resolved with a higher number of channels spanning the frequency range of 350Hz - 9500Hz. In other studies with this same group of subjects, in fact, we saw a dip in test scores for 8 channel processors and then scores went back up for 11 channels. The /d/-/g/ confusion also was present in the 350Hz - 5500 Hz data but increased with the frequency extension.
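The widening of each channel's band can be put in numbers. With logarithmically equal bands, every channel spans the same number of octaves, log2(f_high / f_low) divided by the number of channels. The sketch below compares the six-channel processors over both ranges with eight- and eleven-channel divisions of 350Hz - 9500Hz. The frequency range and band spacing of the 8 and 11 channel processors mentioned above are not given in this report, so those two rows are assumptions for illustration; `octaves_per_band` is a hypothetical helper name.

```python
import math

def octaves_per_band(f_low_hz, f_high_hz, n_channels):
    """Width of each band in octaves, assuming logarithmically equal bands."""
    return math.log2(f_high_hz / f_low_hz) / n_channels

# (upper limit in Hz, number of channels); the 8- and 11-channel rows assume
# the extended range and log-equal spacing, which the report does not state.
for f_high, n in [(5500, 6), (9500, 6), (9500, 8), (9500, 11)]:
    width = octaves_per_band(350, f_high, n)
    print(f"350-{f_high} Hz, {n} channels: {width:.2f} octaves per band")
```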
Table 12. Aggregate matrix for male voice 24 medial consonant tests for subjects NP1 and NP2 with processors using an overall frequency range of 350Hz - 5500Hz.
| m | n | f | v | s | z | sh | **th** | p | b | t | d | g | k | j | l | r | w | y | ng | h | zh | th | ch | ||
| m | 27 | 8 | |||||||||||||||||||||||
| n | 30 | 2 | 3 | ||||||||||||||||||||||
| f | 21 | 3 | 4 | 1 | 6 | ||||||||||||||||||||
| v | 12 | 8 | 1 | 2 | 3 | 9 | |||||||||||||||||||
| s | 1 | 24 | 6 | 1 | 3 | ||||||||||||||||||||
| z | 17 | 17 | 1 | ||||||||||||||||||||||
| sh | 1 | 2 | 4 | 26 | 2 | ||||||||||||||||||||
| **th** | 2 | 14 | 6 | 13 | |||||||||||||||||||||
| p | 35 | ||||||||||||||||||||||||
| b | 1 | 1 | 29 | 4 | |||||||||||||||||||||
| t | 35 | ||||||||||||||||||||||||
| d | 35 | ||||||||||||||||||||||||
| g | 1 | 32 | 2 | ||||||||||||||||||||||
| k | 2 | 3 | 30 | ||||||||||||||||||||||
| j | 31 | 4 | |||||||||||||||||||||||
| l | 35 | ||||||||||||||||||||||||
| r | 1 | 29 | 4 | 1 | |||||||||||||||||||||
| w | 11 | 24 | |||||||||||||||||||||||
| y | 35 | ||||||||||||||||||||||||
| ng | 1 | 16 | 1 | 1 | 16 | ||||||||||||||||||||
| h | 35 | ||||||||||||||||||||||||
| zh | 4 | 5 | 1 | 1 | 4 | 1 | 19 | ||||||||||||||||||
| th | 11 | 2 | 7 | 1 | 2 | 12 | |||||||||||||||||||
| ch | 12 | 23 |
Table 13. Aggregate matrix for male voice 24 medial consonant tests for subjects NP1 and NP2 with processors using an overall frequency range of 350Hz - 9500Hz.
| m | n | f | v | s | z | sh | **th** | p | b | t | d | g | k | j | l | r | w | y | ng | h | zh | th | ch | ||
| m | 20 | ||||||||||||||||||||||||
| n | 20 | ||||||||||||||||||||||||
| f | 15 | 3 | 1 | 1 | |||||||||||||||||||||
| v | 11 | 4 | 5 | ||||||||||||||||||||||
| s | 11 | 5 | 4 | ||||||||||||||||||||||
| z | 7 | 10 | 3 | ||||||||||||||||||||||
| sh | 19 | 1 | |||||||||||||||||||||||
| **th** | 3 | 5 | 12 | ||||||||||||||||||||||
| p | 20 | ||||||||||||||||||||||||
| b | 19 | 1 | |||||||||||||||||||||||
| t | 20 | ||||||||||||||||||||||||
| d | 20 | ||||||||||||||||||||||||
| g | 1 | 18 | 1 | ||||||||||||||||||||||
| k | 2 | 18 | |||||||||||||||||||||||
| j | 19 | 1 | |||||||||||||||||||||||
| l | 19 | 1 | |||||||||||||||||||||||
| r | 11 | 8 | 1 | ||||||||||||||||||||||
| w | 5 | 15 | |||||||||||||||||||||||
| y | 20 | ||||||||||||||||||||||||
| ng | 8 | 1 | 1 | 10 | |||||||||||||||||||||
| h | 20 | ||||||||||||||||||||||||
| zh | 7 | 1 | 12 | ||||||||||||||||||||||
| th | 5 | 2 | 1 | 7 | 1 | 4 | |||||||||||||||||||
| ch | 1 | 19 |
Table 14a. For frequency range of 350Hz to 5500Hz, male talker, each token was presented 35 times (summation for both subjects)
| Frequently misidentified Tokens | % correct ± 1.3 |
| v | 34 |
| th | 34 |
| **th** | 40 |
| ng | 46 |
| z | 49 |
| zh | 54 |
| Most frequent Confusions | occurrences of errors |
| z-s | 17 |
| ng-l | 16 |
| **th**-th | 13 |
| ch-j | 12 |
| th-**th** | 11 |
| w-r | 11 |
Table 14b. For frequency range of 350Hz to 9500Hz, male talker, each token was presented 20 times.
| Frequently misidentified Tokens | % correct ± |
| th | 20 |
| **th** | 25 |
| ng | 50 |
| z | 50 |
| v | 55 |
| s | 55 |
| r | 55 |
| zh | 60 |
| Most frequent Confusions | occurrences of errors |
| **th**-th | 12 |
| r-w | 8 |
| ng-l | 8 |
| z-s | 7 |
| zh-y | 7 |
| th-**th** | 7 |
Percent correct scores for subjects NP1 and NP2 did not differ much with the male voice for the two frequency ranges. The most frequent confusions include many of the same tokens for both frequency ranges.
Table 15. Aggregate matrix for female voice 24 medial consonant tests for subjects NP1 and NP2 with processors using an overall frequency range of 350Hz -5500Hz.
| m | n | f | v | s | z | sh | **th** | p | b | t | d | g | k | j | l | r | w | y | ng | h | zh | th | ch | ||
| m | 25 | ||||||||||||||||||||||||
| n | 24 | 1 | |||||||||||||||||||||||
| f | 6 | 5 | 5 | 9 | |||||||||||||||||||||
| v | 2 | 1 | 7 | 10 | 3 | 1 | 1 | ||||||||||||||||||
| s | 2 | 8 | 6 | 7 | 1 | 1 | |||||||||||||||||||
| z | 1 | 2 | 5 | 3 | 4 | 1 | 2 | 1 | 6 | ||||||||||||||||
| sh | 1 | 21 | 2 | 1 | |||||||||||||||||||||
| **th** | 1 | 19 | 2 | 1 | 2 | ||||||||||||||||||||
| p | 20 | 2 | 3 | ||||||||||||||||||||||
| b | 3 | 20 | 1 | 1 | |||||||||||||||||||||
| t | 2 | 21 | 2 | ||||||||||||||||||||||
| d | 1 | 23 | 1 | ||||||||||||||||||||||
| g | 1 | 1 | 12 | 3 | 5 | 2 | 1 | ||||||||||||||||||
| k | 2 | 3 | 20 | ||||||||||||||||||||||
| j | 25 | ||||||||||||||||||||||||
| l | 5 | 1 | 16 | 3 | |||||||||||||||||||||
| r | 1 | 3 | 13 | 8 | |||||||||||||||||||||
| w | 5 | 7 | 13 | ||||||||||||||||||||||
| y | 1 | 1 | 23 | ||||||||||||||||||||||
| ng | 1 | 1 | 1 | 22 | |||||||||||||||||||||
| h | 25 | ||||||||||||||||||||||||
| zh | 7 | 18 | |||||||||||||||||||||||
| th | 2 | 1 | 1 | 2 | 3 | 2 | 4 | 5 | 5 | ||||||||||||||||
| ch | 25 |
Table 16. Aggregate matrix for female voice 24 medial consonant tests for subjects NP1 and NP2 with processors using an overall frequency range of 350Hz - 9500Hz.
| m | n | f | v | s | z | sh | **th** | p | b | t | d | g | k | j | l | r | w | y | ng | h | zh | th | ch | ||
| m | 30 | ||||||||||||||||||||||||
| n | 30 | ||||||||||||||||||||||||
| f | 22 | 1 | 1 | 5 | 1 | ||||||||||||||||||||
| v | 4 | 23 | 2 | 1 | |||||||||||||||||||||
| s | 27 | 3 | |||||||||||||||||||||||
| z | 6 | 8 | 13 | 1 | 1 | 1 | |||||||||||||||||||
| sh | 27 | 2 | 1 | ||||||||||||||||||||||
| **th** | 1 | 13 | 5 | 1 | 4 | 1 | 5 | ||||||||||||||||||
| p | 29 | 1 | |||||||||||||||||||||||
| b | 29 | 1 | |||||||||||||||||||||||
| t | 30 | ||||||||||||||||||||||||
| d | 1 | 29 | |||||||||||||||||||||||
| g | 2 | 23 | 5 | ||||||||||||||||||||||
| k | 2 | 28 | |||||||||||||||||||||||
| j | 30 | ||||||||||||||||||||||||
| l | 12 | 17 | 1 | ||||||||||||||||||||||
| r | 2 | 2 | 3 | 17 | 5 | 1 | |||||||||||||||||||
| w | 1 | 1 | 2 | 4 | 7 | 14 | 1 | ||||||||||||||||||
| y | 10 | 1 | 19 | ||||||||||||||||||||||
| ng | 2 | 28 | |||||||||||||||||||||||
| h | 1 | 29 | |||||||||||||||||||||||
| zh | 1 | 2 | 1 | 26 | |||||||||||||||||||||
| th | 14 | 2 | 1 | 3 | 8 | 2 | |||||||||||||||||||
| ch | 30 |
Table 17a. For frequency range of 350Hz to 5500Hz, female talker, each token was presented 25 times.
| Frequently misidentified Tokens | % correct ± |
| **th** | 8 |
| z | 12 |
| th | 20 |
| f | 24 |
| s | 32 |
| v | 40 |
| g | 48 |
| r | 52 |
| w | 52 |
| Most frequent Confusions | occurrences of errors |
| **th**-v | 19 |
| f-p | 9 |
| r-w | 8 |
Table 17b. For frequency range of 350Hz to 9500Hz, female talker, each token was presented 30 times.
| Frequently misidentified Tokens | % correct ± |
| th | 7 |
| **th** | 17 |
| z | 43 |
| w | 47 |
| l | 57 |
| r | 57 |
| Most frequent Confusions | occurrences of errors |
| th-f | 14 |
| **th**-v | 13 |
| l-n | 12 |
| y-r | 10 |
Errors for the female 24 consonant test were scattered for both frequency ranges. But the /s/ token clearly was better resolved once the high frequency information in the 350Hz-9500Hz range was included. The /z/ token, even though not fully resolved, also benefited from the increased frequency range. Once these consonants are easier to identify, other consonants that were confused with them are better resolved as well, /f/ for example.
We now compare the outputs of two processors -- one using a frequency range of 350Hz to 5500Hz and the other a range of 350Hz to 9500Hz. The figures were produced by a device developed in our laboratory, called the wavegrabber, for the verification of correct processor operation and archiving of output waveforms. The processors analyzed are identical to the ones used by NP2, except for the absence of compression. In the following figures the scales are zoomed to show more detail for each consonant; the scale is indicated in each figure caption. Figures 1a and 1b compare the output of the female token /s/ for both frequency ranges. It is obvious that the /s/ high frequency energy present in the processor including higher frequency analysis (1b) is largely absent in the other processor (1a). The /s/ in the 350Hz-5500Hz processor would have sounded very quiet, like the token /f/ with which it was most frequently misidentified by the subjects. The token /f/ is shown in Figures 2a and 2b for comparison. There is no indication that the extension of the frequency range would help identify /f/ other than, as mentioned above, that other tokens with which it was frequently misidentified are well resolved by the higher frequency range processor. Figures 3a and 3b compare the outputs for the female token /z/ for both frequency ranges. Again we see that for this particular talker the 5500Hz upper limit cuts off most of the high frequency energy in the consonant.
As we can see in the test results, however, there were some disadvantages in increasing the frequency range, partly due to the widening of the frequency bands of each channel and the shifting of bandpass edges. A possible example of this is the confusion between female voice tokens /d/ and /g/ seen with the 16 consonant test subjects using the expanded frequency processor. Figures 4a and 4b show the token /d/ in both frequency ranges and Figures 5a and 5b show the token /g/. In both cases the consonant's burst energy gets split between two bands by the 9500Hz processor. Also notice how the shape of the envelope of channel 1 gets smoothed by the 9500Hz processor, particularly for the /d/ token. This also is evident in several other tokens. For a six channel processor spanning 350Hz-5500Hz, the 3 dB points for the bandpass filters are: 350, 554, 877, 1387, 2196, 3475, and 5500 Hz. For the six channel processor spanning 350Hz-9500Hz, the 3 dB points are at: 350, 607, 1052, 1823, 3161, 5480, and 9500 Hz.
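The 3 dB points listed above follow from dividing each overall range into six logarithmically equal bands, so that adjacent edges differ by a constant ratio (f_high / f_low)^(1/6). A minimal sketch of that calculation is given below; `band_edges` is a hypothetical helper name, and the rounding to whole Hz is an assumption made to match the values as printed.

```python
def band_edges(f_low_hz, f_high_hz, n_channels):
    """Edges of n_channels logarithmically equal bands between the two limits."""
    ratio = f_high_hz / f_low_hz
    # Edge k sits k/n_channels of the way from f_low to f_high on a log axis.
    return [f_low_hz * ratio ** (k / n_channels) for k in range(n_channels + 1)]

for f_high in (5500, 9500):
    edges = band_edges(350, f_high, 6)
    print(f"350-{f_high} Hz:", [round(e) for e in edges])
```

Rounded to the nearest Hz, both sets of edges agree with the values given in the text. The first band widens from 350-554 Hz to 350-607 Hz, and the edge between channels 3 and 4 moves from 1387 Hz to 1823 Hz.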
Figures 6, 7, and 8 show the processor responses to tokens /m/, /n/, and /l/, respectively. Expanding the frequency range decreased the scores for these tokens, possibly due to the widening of the first bandpass filter. Even though the /l/ token has strong voicing information in channel 1 for both processors, it is frequently confused with /n/.
Some limited testing has been done with a frequency range of 150Hz to 9500Hz. Only NP1 and NP2 were tested with processors which included these low frequencies. Their test scores were very poor for this condition. These limited data show that with the 150 Hz lower frequency limit /m/ and /n/ tokens frequently are misidentified while /l/ becomes more distinct. Limited data for these two subjects also indicate that simply removing the equalization filter (at least when using an overall frequency range of 350Hz-9500Hz), improves /l/ token identification with a little decrement in identification of other tokens. The overall test scores for this condition were equivalent to the test scores of an otherwise identical processor with the equalization filter. Further studies with different equalization filters may prove useful in improving identification of consonants characterized by small low frequency energies.
On the male tests, the frequency expansion improved the /s/-/sh/ distinction, probably by providing more redundant information. Figures 9 and 10 display the processor responses for /s/ and /sh/, respectively, for both frequency ranges.
The panels within each figure are arranged differently than is the case in the printed version of this report.
Figure 1a & 1b Processor response to female token input /s/, 1st exemplar, using scale x:480-840, y: ± 2.9 for both. Part a (left) is for a processor spanning the frequency range 350-5500Hz, while b is for a processor spanning the frequency range 350-9500Hz.
Figure 2a & 2b Processor response to female token input /f/, 1st exemplar, using scale x:460-740, y: ± 2.9 for both. Part a (left) is for a 350-5500Hz processor and b for a 350-9500Hz processor.
Figure 3a & 3b Processor response to female token input /z/, 1st exemplar, using scale x:460-760, y: ± 2.9 for both. Part a (left) is for a 350-5500Hz processor and b for a 350-9500Hz processor.
Figure 4a & 4b Processor response to female token input /d/, 1st exemplar, using scale x:480-780, y: ± 2.9 for both. Part a (left) is for a 350-5500Hz processor and b for a 350-9500Hz processor.
Figure 5a & 5b Processor response to female token input /g/, 1st exemplar, using scale x:480-780, y: ± 2.9 for both. Part a (left) is for a 350-5500Hz processor and b for a 350-9500Hz processor. Notice how the energy from channel 4 in the narrow range processor gets distributed between channels 3 and 4 in the expanded frequency processor. Also channel 1 here looks more distinct than channel 1 in the expanded version.
Figure 6a & 6b Processor response to female token input /m/, 1st exemplar, using scale x:460-740, y: ± 2.9 for both. Part a (left) is for a 350-5500Hz processor and b for a 350-9500Hz processor.
Figure 7a & 7b Processor response to female token input /n/, 1st exemplar, using scale x:460-740, y: ± 2.9 for both. Part a (left) is for a 350-5500Hz processor and b for a 350-9500Hz processor.
Figure 8a & 8b Processor response to female token input /l/, 1st exemplar, using scale x:460-740, y: ± 2.9 for both. Part a (left) is for a 350-5500Hz processor and b for a 350-9500Hz processor.
Figure 9a & 9b Processor response to male token input /s/, 1st exemplar, using scale x:376-716, y: ± 4.9 for both. Part a (left) is for a 350-5500Hz processor and b for a 350-9500Hz processor.
Figure 10a & 10b Processor response to male token input /sh/, 1st exemplar, using scale x:376-716, y: ± 4.9 for both. Part a (left) is for a 350-5500Hz processor and b for a 350-9500Hz processor.
In general, upward extension of the overall frequency range to 9500 Hz does not degrade overall processor performance for these subjects, while producing substantial improvements in medial consonant identification for the female voice.
9500 Hz probably is a higher upper frequency limit than necessary to achieve these results. Examination of spectrographs of the female voice tokens in the Iowa videodisc recordings indicates that 8000 Hz would be sufficient. Other voices, including those of children, should be included in future studies, as should various alternatives to the present preemphasis equalization filter.
Our plans for the next quarter include the following:
We thank subjects ME2 and SR2 for their participation in the studies of this quarter. We also are most grateful for the many and important contributions made by co-investigators Stefan Brill and Joachim Müller, and by consultant Sigfrid Soli, to the studies with bilateral subject ME2.
Reporting activity for the last quarter, covering the period of November 1, 1997 through January 31, 1998, included the following:
Wilson BS, Finley CC, Lawson DT, Zerbi M: Temporal representations with cochlear implants. Am J Otol 18: S30-34, 1997.
Wilson BS: Review of studies at RTI with recipients of bilateral cochlear implants. University of Iowa, Department of Otolaryngology, Head & Neck Surgery, Iowa City, IA, January 27, 1998.