Eighth Quarterly Progress Report
 
 

July 1 through September 30, 2000

NIH Project N01-DC-8-2105
 
 

Speech Processors for Auditory Prostheses
 
 
 
 
 
 

Prepared by

Dewey Lawson, Blake Wilson, Robert Wolford

Stefan Brill, and Reinhold Schatzer
 
 

Center for Auditory Prosthesis Research

Research Triangle Institute

Research Triangle Park, NC 27709
 
 
 


CONTENTS

I. Introduction

II. Combined electric and acoustic stimulation of the same cochlea

III. References

IV. Plans for the next quarter

V. Acknowledgments

Appendix 1: MusiCI: A complex tone synthesizer for cochlear implant users


I. Introduction

The main objective of this project is to design, develop, and evaluate speech processors for implantable auditory prostheses. Ideally, such processors will represent the information content of speech in a way that can be perceived and utilized by implant patients. An additional objective is to record responses of the auditory nerve to a variety of electrical stimuli in studies with patients. Results from such recordings can provide important information on the physiological function of the nerve, on an electrode-by-electrode basis, and also can be used to evaluate the ability of speech processing strategies to produce desired spatial or temporal patterns of neural activity.

Work in this quarter included:


In this report we describe the recent studies with subject ME6 mentioned above. These studies included evaluation of many combinations of acoustic and electric stimulation of the same ear, along with control measures of speech reception performance with each type of stimulation alone. Results for the other studies indicated in the list above will be presented in future reports.

II. Combined electric and acoustic stimulation of the same cochlea

Until recently, people with any significant amount of residual hearing were not considered candidates for cochlear implantation. More recently, cochlear implant candidates with a significant degree of residual hearing in one ear either would be implanted in the other ear or -- if specifically electing to implant the "better ear" -- would expect to sacrifice the residual acoustic hearing in the hope of greater potential for cochlear implant performance. Published reports have indicated the destruction of usable residual hearing capacity in a majority of such intracochlear implantations, at least over the range of frequencies corresponding to the length of the inserted electrode array. [Rizer 1988; Bogess, Baker and Balkany 1989; Brimacombe, Arndt, Staller and Beiter 1994; Hodges, Schloffman and Balkany 1997; and Shin, Deguine, Laborde and Fraysse 1997.]

In studies presently under way at the Johann Wolfgang Goethe University in Frankfurt [von Ilberg, Kiefer, Tillein, Pfennigdorff, Hartmann, Stürzebecher and Klinke, 1999] and at the University of Iowa, electrode arrays are being inserted a relatively short distance into scala tympani in an effort to preserve residual low frequency acoustic hearing (apical to the apicalmost position of the arrays) while allowing high frequency speech information to be conveyed to the basal end of the cochlea by electrical stimulation. A traditional hearing aid and a cochlear implant speech processor then are employed simultaneously and cooperatively to convey speech to the same ear. In this report we describe our initial studies with one of the Frankfurt subjects.

The subject, ME6, was born in 1959. Her sudden hearing loss occurred in 1978 during treatment of a severe infection using an ototoxic drug. A right ear hearing aid was employed beginning in 1990. In 1999 Dr. Jan Kiefer inserted a standard Med-El Combi40+ electrode array 20 mm into her right cochlea, placing 8 of the device's 12 electrodes within scala tympani. After initial studies in Frankfurt, Dr. Kiefer and his colleague Dr. Thomas Pfennigdorff referred the subject to our laboratory for further research. Dr. Pfennigdorff accompanied the subject during her two-week visit and participated in the studies described below.

Pre- and post-implantation clinical audiograms were within 5-10 dB of each other over the frequency range of 125-500 Hz. The most recent clinical data indicated a hearing loss sloping gently from 45 to 55 dB HL over that range, followed by a much steeper decline to 95 dB HL at 1 kHz and 110 dB HL at 1.5 kHz. [A clinical audiogram for the unoperated left ear, which was never aided except during an unsuccessful trial of binaural aids, indicates 55 dB HL at 125 and 250 Hz, and 75 dB HL at 500 Hz.]

The subject related a history of tinnitus from about the time of her sudden hearing loss, periodically requiring drug therapy. Her tinnitus was especially severe immediately post surgery and seems to have remained louder than generally experienced before the cochlear implant, though perhaps softening somewhat recently. The tinnitus is perceived as a tone, which has become higher in pitch since implantation. The subject also has a pre- and post-implant history of episodic vertigo, for which she continues to receive medication.

Clinically, ME6 uses an 8-channel CIS strategy running on a Tempo+ BTE cochlear implant processor, along with a Resound ITE hearing aid. She reports that her speech understanding is better with the cochlear implant alone than with the hearing aid alone, and best with simultaneous use of both devices.

In addition to our routine initial assessments of threshold and MCL levels for stimulation with electrical pulse trains at various rates and durations, we undertook a number of preliminary acoustic assessments with this subject. One was a determination of relative pure tone thresholds on a finer frequency scale. This was done for both ears at 100 Hz intervals from 100 Hz to 1 kHz, as shown in Figure 1. The test tones were digitally synthesized samples played through the same system later used to present acoustic stimuli via circumaural earphones in a quiet laboratory. [With the earphones in place the subject denied hearing any room background noises such as HVAC sounds or equipment cooling fans.]


Figure 1. Relative pure tone thresholds via earphone in a quiet laboratory. Subject ME6.

It was anticipated that such data might support construction of digital compensating filters to allow spectrally flat transmission of acoustic signals to the subject.

Other preliminary acoustical studies involved relative pitch perception. We tested ME6's ability to discriminate between MCL pure tone stimuli differing by an equal-tempered semitone (1/12 octave) as a function of absolute frequency. We found a clear boundary at 784 Hz (a musical G re. a 440 Hz concert A); the subject could discriminate sequential semitone intervals at and below F#-G (740-784 Hz) but not at and above G-G# (784-830 Hz). Another indication of the practical upper frequency limit for information transmission via this subject's residual hearing was obtained through a sequence of consonant identification tests using acoustic signals alone. In the sequence of four tests, the acoustic signal was limited by low-pass filters to frequencies below 2 kHz, 1 kHz, 500 Hz, and 250 Hz respectively. The corresponding percent correct scores for identification of 16 medial consonants were 40±2, 44±2, 40±2, and 28±3.
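The semitone frequencies quoted above can be checked with a minimal Python sketch. It assumes only the equal-tempered relation of 1/12 octave per semitone and the 440 Hz concert A named in the text; the octave labels (F#5, G5, G#5) and the function name are our own illustrative choices, not notation used in the studies.

```python
A4_HZ = 440.0  # concert A reference stated in the text


def equal_tempered_hz(semitones_from_a4: int) -> float:
    """Frequency of the note lying the given number of equal-tempered
    semitones above (positive) or below (negative) the 440 Hz A."""
    return A4_HZ * 2.0 ** (semitones_from_a4 / 12.0)


# Notes around the reported discrimination boundary. The octave labels are an
# assumption: F#, G, and G# lie 9, 10, and 11 semitones above the 440 Hz A.
boundary_notes = {
    name: equal_tempered_hz(steps)
    for name, steps in (("F#5", 9), ("G5", 10), ("G#5", 11))
}

for name, hz in boundary_notes.items():
    print(f"{name}: {hz:.1f} Hz")
```

The computed values fall within 1 Hz of the 740, 784, and 830 Hz figures given above.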

A pitch matching study between the two ears, with spectral compensation based on the threshold measurements and stimuli at 100 Hz intervals from 200 to 600 Hz, yielded the results shown in Figure 2.


Figure 2. Pitch matching of acoustic pure tones between the two ears. Subject ME6. The black diagonal line corresponds to common pitch percepts for common frequencies in the two ears, while the red line is provided merely for comparison purposes.

Pure tone stimuli near 400 Hz produced the same pitch sensations in the two ears, with pitch changing more rapidly as a function of frequency in the left ear than the right. [For reference, the top point in Figure 2 (LE 600 Hz pitch matched with RE 530 Hz) corresponds to a frequency ratio of 1.132, more than a two semitone musical interval.]
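The size of that interval can be verified with a short calculation. The sketch below simply converts a frequency ratio to equal-tempered semitones; the function name is illustrative.

```python
import math


def interval_in_semitones(f_high_hz: float, f_low_hz: float) -> float:
    """Size of the interval between two frequencies, in equal-tempered semitones."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


# Top point in Figure 2: left ear 600 Hz matched with right ear 530 Hz.
ratio = 600.0 / 530.0
semitones = interval_in_semitones(600.0, 530.0)
print(f"ratio = {ratio:.3f}, interval = {semitones:.2f} semitones")
```

The result is slightly more than two semitones, consistent with the statement above.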

In contrast to the strong frequency dependence of thresholds observed in the detailed audiograms, a subsequent right-ear loudness balancing study at MCL for 100 Hz interval pure tone stimuli revealed a spectral response flat to within 5 dB, so our supra-threshold pitch matching results may have been influenced by loudness differences. In view of this observed flatness at MCL, no spectral compensation was employed for our combined electric and acoustic speech reception studies with this subject.

In designing prostheses that involve simultaneous acoustic and electrical stimulation of the same ear, three distinct ranges of frequency and/or place merit consideration: (1) the range of frequencies conveyed in the acoustic signal and the associated range of locations along the organ of Corti, (2) the frequency range analyzed by the cochlear implant speech processor, and (3) the range of locations along the spiral ganglion addressed by stimulating electrodes. Notice that the conveyed spectrum may be limited by (1) and (2), while the pitches at which that spectrum is represented depend on (1) and (3).

It is possible to represent all three of these ranges graphically on a common scale by associating electrode locations with approximate characteristic frequencies along the cochlea. It is important to note that the accuracy of such associations is severely limited in that: (a) it is primarily spiral ganglion cells that respond to electrical stimulation, (b) the helical path of the spiral ganglion approximates that of scala tympani only in the basal turn, and (c) the perceived pitch elicited by electrical stimulation probably depends on electrode distance from the modiolar wall and on the local pattern of neural survival as well as on electrode distance along the scala tympani.
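One way such an association might be sketched is with a Greenwood-type place-to-frequency function. The report does not state which mapping was used, so everything in the following Python sketch other than the 20 mm insertion depth and the count of 8 intracochlear electrodes is an assumption: the Greenwood constants commonly quoted for the human cochlea, a 35 mm cochlear length, a 2.4 mm contact spacing, and placement of the apicalmost contact exactly at the insertion depth. Given limitations (a) through (c) above, the output is at best a rough orientation.

```python
# Illustrative only: a Greenwood-type place-to-frequency map. The mapping and
# most constants below are assumptions, not values taken from the report.
COCHLEA_LENGTH_MM = 35.0    # assumed typical human value
CONTACT_SPACING_MM = 2.4    # assumed spacing between adjacent contacts
INSERTION_DEPTH_MM = 20.0   # insertion depth given in the report
N_INTRACOCHLEAR = 8         # electrodes within scala tympani, from the report


def greenwood_hz(distance_from_apex_mm: float,
                 length_mm: float = COCHLEA_LENGTH_MM) -> float:
    """Approximate characteristic frequency at a place along the cochlea."""
    x = distance_from_apex_mm / length_mm  # proportion of length from the apex
    return 165.4 * (10.0 ** (2.1 * x) - 0.88)


def electrode_place_frequencies() -> list:
    """Approximate place frequencies, apicalmost contact first."""
    freqs = []
    for i in range(N_INTRACOCHLEAR):
        depth_from_base_mm = INSERTION_DEPTH_MM - i * CONTACT_SPACING_MM
        freqs.append(greenwood_hz(COCHLEA_LENGTH_MM - depth_from_base_mm))
    return freqs


place_hz = electrode_place_frequencies()
for number, hz in enumerate(place_hz, start=1):
    print(f"contact {number} (apical to basal): about {hz:.0f} Hz")
```

Under these assumptions the function spans roughly 20 Hz at the apex to about 20 kHz at the base, matching the overall scale used in the figures that follow.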

Using such a graphic representation, Figures 3 and 4 illustrate different qualitative relationships possible among the three ranges described above. In each case, the overall scale (shown in blue in the HTML version of this report) represents the approximate frequency limits of normal hearing -- 20 Hz to 20 kHz. To the left of this scale is shown the range of the acoustic signal -- which may be limited by the subject's pattern of residual hearing and/or by application of a low-pass filter to the input signal. Symbols on the scale indicate approximate tonotopic locations of implanted electrodes (in examples to follow, subsets of electrodes being stimulated in a particular situation will be shown in red or a contrasting gray). Finally, to the right of the scale, the range of frequencies analyzed by the cochlear implant speech processor is indicated (in examples to follow, this overall range will be subdivided into the appropriate number of bands, each connected by a line to the symbol for the active electrode stimulated by the corresponding processing channel).


Figure 3. Graphical representations of qualitative relationships between the frequency band conveyed in an acoustic signal and that analyzed by a cochlear implant speech processor.

In Figure 3 we illustrate two possible relationships between ranges (1) and (2), the frequency range conveyed in the acoustic signal and that analyzed by the cochlear implant speech processor. There might be a significant gap between those two frequency ranges, in which case a region of the spectrum would not be conveyed to the listener ("analysis gap"). Alternatively, some common region of the input spectrum might both be included in the acoustic signal and also influence signals to one or more cochlear implant electrodes ("analysis overlap"). While a substantial gap would seem certain to impact prosthesis performance negatively, one can imagine an overlap being either advantageous or damaging to speech understanding. Neither of the ranges has precise edges, of course: the acoustic upper limit is influenced by the subject's audiogram and/or the order of any low-pass filter on the acoustic channel, and the CI analysis lower edge by the order of its lowest bandpass filter.


Figure 4. Graphical representations of qualitative relationships between the region of the cochlea stimulated by an acoustic signal and that stimulated by cochlear implant electrodes.

In Figure 4 we similarly illustrate possible relationships between ranges (1) and (3), the region of the cochlea subject to acoustic stimulation and that responding to electrical stimulation. Here a "stimulation gap" is the anticipated condition, since the surgical intention is to preserve residual hearing by not advancing the electrode array into a region associated with those frequencies. It is likely that presence of electrodes in a given region of the cochlea precludes any sensitivity there to acoustic stimulation. We note, however, that if a "stimulation overlap" were discovered to exist in a given patient it would constitute an important opportunity for research. Such a situation would imply, for instance, spontaneous activity among neurons that also were subject to electrical stimulation.

In Figure 5 we use the same graphical representation methods to display nine specific relationships among electric and acoustic stimulation parameters that were included in our initial studies with subject ME6.


Figure 5. Graphical representations of specific relationships included in initial studies of hearing prostheses utilizing simultaneous acoustic and electric stimulation of the same ear.

Among these conditions are instances in which the acoustic signal spectrum extended up to 1 kHz and other cases in which it was limited to 500 Hz and below. The electrode locations, of course, were constant across all conditions for this subject. While the upper limit of frequencies analyzed by the cochlear implant speech processor remained fixed at 5.5 kHz, lower limits used included 300 Hz, 600 Hz, and 1 kHz. Both four channel and eight channel configurations were included for the cochlear implant part of the prosthesis. When the cochlear implant was restricted to four channels, they were assigned to the apicalmost four usable electrodes in some cases, the basalmost four in others, and the odd-numbered electrodes (spanning almost the full range occupied by all eight) in still others. Common parameters shared by all these cochlear implant designs included presentation of 25 µs/phase pulses at a rate of 2273 p/s to each of the activated electrodes, and smoothing filters limiting the envelope response on each channel to 200 Hz and below.
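To make these processor parameters concrete, the following Python sketch divides an analysis range into channel bands and checks the pulse timing. Two assumptions are ours rather than the report's: that the bands are spaced logarithmically between the lower limit and 5.5 kHz, and that pulses are biphasic and interleaved across electrodes without overlap, as is typical of CIS-type processors.

```python
PULSE_RATE_PPS = 2273.0   # pulses per second on each electrode (from the text)
PHASE_US = 25.0           # microseconds per phase (from the text)
UPPER_LIMIT_HZ = 5500.0   # fixed upper analysis limit (from the text)


def log_spaced_band_edges(low_hz: float,
                          high_hz: float = UPPER_LIMIT_HZ,
                          n_channels: int = 8) -> list:
    """Band edges for n_channels contiguous bands, ASSUMING logarithmic spacing."""
    ratio = (high_hz / low_hz) ** (1.0 / n_channels)
    return [low_hz * ratio ** i for i in range(n_channels + 1)]


def pulses_fit_without_overlap(n_channels: int) -> bool:
    """True if one biphasic pulse per channel fits within a single pulse
    period, ASSUMING non-simultaneous (interleaved) stimulation."""
    period_us = 1e6 / PULSE_RATE_PPS
    return n_channels * 2.0 * PHASE_US <= period_us


for low in (300.0, 600.0, 1000.0):
    edges = log_spaced_band_edges(low)
    print(f"{low:.0f} Hz lower limit:", [round(e) for e in edges])

print("8 channels fit in one period:", pulses_fit_without_overlap(8))
```

Under the interleaving assumption, eight 50 µs biphasic pulses occupy 400 µs of the roughly 440 µs pulse period.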

As this list suggests, a variety of interesting performance comparisons were possible involving different subsets of these nine conditions. In the figures that follow, bar charts will indicate relative speech reception performance levels while copies of the appropriate graphical representations from Figure 5 will be displayed below to indicate the relationships being compared.

Our primary indicator of speech reception performance was identification of 16 medial consonants. These were presented with both male and female talkers, in quiet and at S/N ratios of +10 and +5 dB. Token presentations were not repeated and there was no feedback as to correct or incorrect responses. Tokens were presented in randomized sets, allowing the score for each set to be treated as a measurement and standard deviations to be calculated on that basis. In addition, information transmission scores were derived from the consonant confusion matrices -- both overall and for the individual features of voicing, envelope, frication, place, and duration.
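A relative information transmission score of this general kind can be computed from a confusion matrix as the mutual information between stimulus and response divided by the stimulus entropy. The Python sketch below shows one such calculation, together with the pooling step used to score an individual feature. The exact procedure used in our analyses is not spelled out here, so the formulation is an assumption, and the four-token matrix and voicing labels are made-up illustrations rather than data from subject ME6.

```python
import math


def relative_information_transmitted(confusions) -> float:
    """Mutual information between stimulus (rows) and response (columns),
    divided by the stimulus entropy. Ranges from 0 to 1."""
    total = sum(sum(row) for row in confusions)
    n_cols = len(confusions[0])
    row_p = [sum(row) / total for row in confusions]
    col_p = [sum(row[j] for row in confusions) / total for j in range(n_cols)]
    transmitted = 0.0
    for i, row in enumerate(confusions):
        for j, count in enumerate(row):
            if count:
                p = count / total
                transmitted += p * math.log2(p / (row_p[i] * col_p[j]))
    stimulus_entropy = -sum(p * math.log2(p) for p in row_p if p > 0)
    return transmitted / stimulus_entropy


def collapse_by_feature(confusions, feature_of):
    """Pool rows and columns whose tokens share a feature value."""
    values = sorted(set(feature_of))
    index = {value: k for k, value in enumerate(values)}
    pooled = [[0] * len(values) for _ in values]
    for i, row in enumerate(confusions):
        for j, count in enumerate(row):
            pooled[index[feature_of[i]]][index[feature_of[j]]] += count
    return pooled


# Hypothetical 4-token example: tokens 0-1 voiceless, tokens 2-3 voiced.
toy_confusions = [[8, 2, 0, 0],
                  [2, 8, 0, 0],
                  [0, 0, 8, 2],
                  [0, 0, 2, 8]]
toy_voicing = [0, 0, 1, 1]

overall_score = relative_information_transmitted(toy_confusions)
voicing_score = relative_information_transmitted(
    collapse_by_feature(toy_confusions, toy_voicing))
print(f"overall: {overall_score:.2f}, voicing: {voicing_score:.2f}")
```

In the toy example all errors stay within a voicing class, so the voicing feature is transmitted perfectly even though the overall score is below 1.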

Since ME6 is a native German speaker, the 16 medial consonant token set used was the one routinely employed by our lab in such cases. This involves relabeling five of our 16 standard English consonants (f label changed to v, v to w, s to ss, z to s, and sh to sch), substitution of h for the voiced th used in our English list, and substitution of the English y sound to correspond to a German interpretation of the j label. Three exemplars of each token were digitally recorded, using our own onset and offset addresses and the published Iowa videodisc materials. The noise used was the CCITT long term averaged speech spectrum.

In selected cases, we also administered sentence tests at +10 dB S/N using the same noise. ME6's English skills made it possible to employ the CUNY sentence lists for this purpose.

Analysis gap vs. Analysis overlap

Many of the patterns present in our data from this first research subject are illustrated in Figure 6. The common elements in the data summarized in that figure are the presentation of medial consonant tokens at a S/N ratio of +10 dB with respect to CCITT noise, the use of a fixed acoustic signal (low-pass filtered to exclude frequencies above 1 kHz), and the use of 8-channel cochlear implant speech processors.


Figure 6. Percent correct scores for identification of 16 medial consonants in CCITT noise at +10 dB S/N. Subject ME6. Acoustic alone, acoustic plus electric, and electric stimulation alone. Acoustical signal includes 1 kHz and below. Male and female talkers. Cochlear implant analysis bands with three different low frequency limits. Ten presentations of each token in each condition, bars indicate standard deviation of the mean.
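The error bars can be understood through a minimal Python sketch in which each randomized set yields one percent correct score and the standard deviation of the mean is the sample standard deviation of those scores divided by the square root of their number. The set scores shown are hypothetical placeholders, not measurements from this subject, and the function name is illustrative.

```python
import math
import statistics


def mean_and_sd_of_mean(set_scores):
    """Mean of per-set percent correct scores and the standard deviation
    of that mean (sample standard deviation / sqrt(number of sets))."""
    mean = statistics.mean(set_scores)
    sd_of_mean = statistics.stdev(set_scores) / math.sqrt(len(set_scores))
    return mean, sd_of_mean


# Hypothetical per-set scores, for illustration only.
example_scores = [50.0, 60.0, 70.0]
example_mean, example_sd_of_mean = mean_and_sd_of_mean(example_scores)
print(f"{example_mean:.1f} +/- {example_sd_of_mean:.1f} percent correct")
```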

These data include comparisons across male and female talkers; conditions of acoustic signal only, acoustic and electrical stimulation combined, and electrical stimulation alone; and three different low frequency boundaries for the overall band analyzed by the cochlear implant speech processor.

Let us begin by considering the six three-bar groups comparing acoustic only, combined, and electric only prostheses for various combinations of talker and processing strategy. In each of these six cases (and in all cases we examined) performance with the acoustic signal alone was significantly poorer than with electrical stimulation alone or with both combined. This subject's speech reception would have been improved by her cochlear implant surgery even if that intervention had caused the total loss of her residual acoustic hearing. In five of these six cases performance is significantly better with combined electric and acoustic stimulation than with the cochlear implant alone. This pattern of an advantage for simultaneous use of both types of stimulation is representative of our other data in noise. In quiet, however, we have seen some instances in which there was a significant advantage to the use of both electrical and acoustic stimulation for one of the two talkers but no significant difference for the other.

Now we turn to considering the data of Figure 6 in terms of the three different cochlear implant analysis ranges. The lower frequency limits of these ranges -- 300 Hz, 600 Hz, and 1 kHz -- were chosen to explore the possible performance impact of analysis overlap and/or analysis gap. While the upper frequency limit of the acoustic signal was nominally 1 kHz in each case, the acoustic-alone consonant recognition study as a function of upper frequency limit and the semitone pitch discrimination study, both discussed above, suggest an effective upper frequency limit on information that lies somewhere between 500 and 750 Hz. In that case, the 300 Hz to 5.5 kHz range represents a clear analysis overlap, the range with a 1 kHz lower limit represents a clear analysis gap, and the 600 Hz to 5.5 kHz range may approximate a minimum-overlap minimum-gap case. For combined acoustic and electric stimulation, that 600 Hz to 5.5 kHz range does support significantly better consonant recognition than either alternative for the male talker, but there is no significant difference among the three analysis ranges for the female talker. In the case of electric stimulation alone, analysis only of frequencies above 1 kHz is associated with the poorest consonant recognition with both talkers. For the female talker and electric stimulation alone the 300 Hz lower limit shows an advantage over the 600, but there is no significant difference between those two ranges for the male talker.
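The classification of the three analysis ranges described above can be expressed as a short Python sketch. It treats the effective acoustic upper limit as the 500 to 750 Hz interval suggested by the preliminary studies; the function name and the three category labels are illustrative choices.

```python
def analysis_relationship(ci_lower_limit_hz: float,
                          acoustic_limit_range_hz=(500.0, 750.0)) -> str:
    """Classify a cochlear implant analysis range relative to the estimated
    effective upper limit of the acoustic signal (given as an interval)."""
    lowest_estimate, highest_estimate = acoustic_limit_range_hz
    if ci_lower_limit_hz < lowest_estimate:
        return "analysis overlap"
    if ci_lower_limit_hz > highest_estimate:
        return "analysis gap"
    return "minimum overlap, minimum gap"


for lower_limit in (300.0, 600.0, 1000.0):
    print(f"{lower_limit:.0f} Hz lower limit: {analysis_relationship(lower_limit)}")
```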

Stimulation Range and the Degree of Stimulation Gap

In Figures 7 and 8 we hold the degree of analysis overlap constant while varying the range of stimulation or the stimulation gap. In each figure, the left two pairs of bars represent the same stimulation gap but different stimulation ranges, while the right two pairs of bars represent the same range of stimulation but different stimulation gaps. The fixed analysis relationships are different in Figure 7 and Figure 8, representing a clear analysis overlap and a minimal analysis gap, respectively. All data in these two figures are for combined electric and acoustic stimulation and medial consonants in quiet.


Figure 7. Identification of 16 medial consonants in quiet, male and female talkers, simultaneous acoustic signal below 1 kHz and four-channel electric stimulation, based on 300 Hz to 5.5 kHz analysis range. Three different assignments of implanted electrodes to channels. Subject ME6.

For the analysis overlap case in Figure 7, we have different patterns for the male and female talkers. For both male and female talkers, the smaller stimulation range (apical four) produced significantly poorer results than the wider range (odd electrodes) for the same stimulation gap (same apicalmost electrode in each case). For the male talker there were significantly better results for the larger stimulation gap (basal four) than the smaller one (apical four) for the same stimulation range (four adjacent electrodes), while the opposite was true for the female talker.


Figure 8. Identification of 16 medial consonants in quiet, male and female talkers, simultaneous acoustic signal below 500 Hz and four-channel electric stimulation, based on 600 Hz to 5.5 kHz analysis range. Three different assignments of implanted electrodes to channels. Subject ME6.

For the analysis gap case in Figure 8, both male and female talkers show the same patterns observed for the male talker in Figure 7. The differences for both talkers between scores for constant analysis gap and different stimulation range (left two groups) are only marginally significant, however.

Effects of Increasing Noise

In Figure 9, we group results obtained with cochlear implant speech processor number 7 (see Figure 5 for configuration) to demonstrate the effect of signal-to-noise ratio. We show consonant identification scores for male and female talkers, under both electric only and electric plus acoustic stimulation, in quiet and under S/N conditions of +10 and +5 dB with respect to CCITT speech spectrum noise.


Figure 9. Identification of 16 medial consonants as a function of signal-to-noise ratio, for RTI processor 7, Subject ME6. Male and female talkers. Electric stimulation only (via cochlear implant) and combined electric and acoustic stimulation. Measurements in quiet and at S/N ratios of +10 and +5 dB with respect to CCITT noise.

While these results are not without ambiguities (no significant difference, for instance, in the scores at +10 and +5 dB S/N for the female talker with combined electric and acoustic stimulation), they seem consistent with the combination of electric and acoustic stimulation being less sensitive to the impact of speech spectrum noise than electric stimulation alone. It is important to note, however, that a cochlear implant processor designed to be used alone, and accordingly analyzing a range that included frequencies lower than 600 Hz, might have a different noise dependence as well as higher overall performance levels than this electric-only condition.

Summary

This subject's post-implantation residual hearing supports semitone pure tone pitch discriminations at MCL below 750 Hz. Her aided acoustic-only consonant identification scores are not improved by including frequencies above 500 Hz.

Both electric-only and combined electric and acoustic modes support speech reception scores that are significantly better than those obtained under aided acoustic-only conditions. In terms of improved speech reception, implantation would have been considered successful whether or not her residual acoustic hearing was preserved.

In +10 dB and +5 dB noise conditions, combined electric and acoustic stimulation provides a significant advantage to this subject over use of the same cochlear implant processor in an electric-only condition. In quiet, evaluated with a variety of different speech processor designs and both male and female talkers, the combined condition has been associated with the better performance whenever a significant difference has been observed.

Speech reception performance with combined electric and acoustic stimulation in noise seems to be optimized for a minimum-overlap, minimum-gap relationship between acoustic signal and electric analysis bands, based primarily on an advantage with the male talker.

Comparison of speech reception performance under controlled manipulations of stimulation range and degree of stimulation gap -- under both analysis overlap and analysis gap conditions -- reveals interactions among those variables that indicate a need for further study and comparisons across subjects.

Comparisons of medial consonant identification under electric-only and combined electric and acoustic stimulation in quiet and at S/N ratios of +10 and +5 dB produced data consistent with the combined mode being somewhat less sensitive to the negative impact of increasing levels of noise.

III. References

Bogess WJ, Baker JE, Balkany TJ: Loss of residual hearing after cochlear implantation. Laryngoscope 1989;99:1002-1005.

Brimacombe JA, Arndt PL, Staller SJ, Beiter AL: Multichannel cochlear implantation in adults with severe-to-profound sensorineural hearing loss; in Hochmair-Desoyer IJ, Hochmair E (eds.): Advances in Cochlear Implants. Wien, Manz, 1994, pp 387-392.

Hodges AV, Schloffman J, Balkany T: Conservation of residual hearing with cochlear implantation. Am J Otol 1997;18:179-183.

Rizer FM: Postoperative audiometric evaluation of cochlear implant patients. Otolaryngol Head Neck Surg 1988;98:203-206.

Shin YJ, Deguine O, Laborde JL, Fraysse B: Conservation of residual hearing after cochlear implantation (in French). Rev Laryngol Otol Rhinol 1997;118:233-238.

von Ilberg C, Kiefer J, Tillein J, Pfennigdorff T, Hartmann R, Stürzebecher E, Klinke R: Electric-Acoustic Stimulation of the Auditory System. ORL 1999;61:334-340.

IV. Plans for the next quarter

Our plans for the next quarter include the following:

V. Acknowledgments

We thank subjects ME4, ME5, ME6, and NU6 for their participation in the studies of this quarter. We are grateful for the contributions made by Joachim Müller in the studies with subject ME5, and for the contributions made by Thomas Pfennigdorff in the studies with subject ME6.

Appendix 1. MusiCI: A complex tone synthesizer for cochlear implant users

MusiCI

A Musical Tone Synthesizer for Cochlear Implant Users

Dewey T. Lawson, Ph.D.

Center for Auditory Prosthesis Research
Research Triangle Institute

MusiCI is a computer program created specifically to allow users of cochlear implants to design, synthesize, listen to, and compare musical tones. [The name comes from Music for Cochlear Implants, and is pronounced "MOO-see-chee" like the Italian word for musicians.] It is intended both for the enjoyment of cochlear implant users and as a tool for research. The program can be used on any PC that is running the Windows 98 or Windows 95 operating system and that is capable of playing .WAV sound files at a 44.1 ks/s rate ("CD sound"). Appropriate versions of Visual BASIC and Windows Common Dialog support files are supplied with the program.

The control interface by which the cochlear implant user can construct a wide variety of musical tones, intervals, and chords has been designed to allow those with some training in music theory and/or acoustics to apply their knowledge and experience, while also allowing the most musically naïve listener to explore and enjoy a wide range of musical sounds.

The analog audio signals created by a computer running this program are intended for direct injection to the speech processor electronics of the user's cochlear implant. As with any other audio source, if the computer running this program is connected to the AC power mains, its audio output should not be connected to the speech processor of a cochlear implant system except through an approved electrical isolator device.

MusiCI was developed as part of a research effort supported by the Neural Prosthesis Program of the National Institutes of Health, and is provided to research subjects free of charge by Research Triangle Institute. It is not a commercial product and is not supported as such. Copies of the program are custom tailored to the characteristics of each user's cochlear implant processing device and strategy. Inquiries should be directed to Dewey T. Lawson, Ph.D., Center for Auditory Prosthesis Research, Research Triangle Institute, P.O. Box 12194, Research Triangle Park, NC 27709; Voice telephone +919.541.6801, facsimile +919.990.8385, e-mail dtl@rti.org.

Two types of documentation are available for MusiCI: (1) an operator's manual for cochlear implant users, describing how to run and use the program, and (2) a technical manual for researchers describing the architecture of the program, many of its features, its uses in research, and the various data formats involved. Within these documents, hypertext links have been provided to glossary definitions for many musical terms. Such terms usually will be in italics, and will carry whatever additional characteristics (color, underline) your browser uses for links. Use your browser's "Back" button to return to your place in the main text. The full glossary in alphabetical order may be found at the end.

The core of the program is a highly flexible synthesizer, capable of combining up to nine harmonic partials for each of three independent fundamentals -- as many as 27 simultaneous pure wave partials constituting single complex tones, two-tone intervals, and triad chords. The ability to control each partial makes it possible to avoid bothersome interactions among speech processor channels, and the program suggests to the user which partials might be omitted for such reasons. The program can keep any two synthesized combinations available for rapid comparisons by the user. When a user discovers a particularly interesting comparison, the program can record the synthesis prescriptions for both sounds, along with any comments the user would like to enter. All such records and comments are collected in a computer file, for later analysis by researchers in the hope of learning more about music (and speech) perception with cochlear implants. Users can preserve prescriptions for any individual sounds they like in named files for later reference.
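The additive synthesis idea described above can be illustrated with a minimal Python sketch. It is not the MusiCI implementation (which is written in Visual BASIC); the function name, the amplitude-list format, and the peak normalization are assumptions made for illustration. Only the limits of nine partials and three fundamentals and the 44.1 ks/s rate come from the text.

```python
import math

SAMPLE_RATE_HZ = 44100   # the "CD sound" rate named in the text
MAX_PARTIALS = 9         # harmonic partials per fundamental (from the text)
MAX_FUNDAMENTALS = 3     # independent fundamentals (from the text)


def synthesize(fundamentals_hz, partial_amplitudes, duration_s=0.5):
    """Sum sine-wave harmonic partials for up to three fundamentals.

    partial_amplitudes[k][h] weights harmonic h + 1 of fundamental k;
    an amplitude of 0.0 omits that partial. Output is peak-normalized.
    """
    if len(fundamentals_hz) > MAX_FUNDAMENTALS:
        raise ValueError("at most three fundamentals are supported")
    if len(partial_amplitudes) != len(fundamentals_hz):
        raise ValueError("one amplitude list is needed per fundamental")
    if any(len(amps) > MAX_PARTIALS for amps in partial_amplitudes):
        raise ValueError("at most nine partials per fundamental")

    n_samples = round(duration_s * SAMPLE_RATE_HZ)
    samples = [0.0] * n_samples
    for f0, amps in zip(fundamentals_hz, partial_amplitudes):
        for harmonic, amp in enumerate(amps, start=1):
            if amp == 0.0:
                continue  # partial omitted, e.g. to avoid channel interactions
            step = 2.0 * math.pi * f0 * harmonic / SAMPLE_RATE_HZ
            for i in range(n_samples):
                samples[i] += amp * math.sin(step * i)

    peak = max((abs(s) for s in samples), default=0.0) or 1.0
    return [s / peak for s in samples]


# Two-tone interval: 440 Hz with harmonics 1-3, 660 Hz with harmonic 2 omitted.
tone = synthesize([440.0, 660.0], [[1.0, 0.5, 0.25], [1.0, 0.0, 0.3]],
                  duration_s=0.1)
```

The returned list of samples in the range -1 to 1 could then be scaled to 16-bit integers and written to a .WAV file with Python's standard `wave` module.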

Several accessories have been integrated with the core program. One allows the user to construct full chromatic musical scales based on selected tones generated in the core program, and to play music with those scales by clicking on the keys of a computer graphics keyboard. Individual notes of the scales can be adjusted in the core program, and prescriptions for the full scales can be stored as files for repeated use. Another accessory can automatically administer formal psychophysical tests to the user, reading computer files containing test scripts prepared by researchers, and collecting user responses in other files to be returned to the researchers for analysis. A wide variety of standard test types are supported. Since only synthesis parameters need be provided to the program for each complex tone stimulus (rather than much more lengthy digital recordings of each sound) it will be easy for users and researchers to exchange tests and results by e-mail or diskette.

A hidden feature of MusiCI, accessible by "hot keys," allows researchers to use the core interface to display the characteristics of sounds commented on by users in the files they supply for analysis. Another hidden feature allows the production of complex tones, intervals, and chords based on a non-traditional musical scale that provides many parallels to the consonances and dissonances of traditional music but in a different context. At present, the latter feature is not documented in the operator's manual, to avoid possible contamination of ongoing research studies.

A few technical details for musicians

The absolute frequency of the fundamental for single complex tones (and for the lower fundamental in intervals and the lowest in chords) is based on the equal tempered musical scale most commonly used for keyboard and fretted string instruments in the West today. In that scale, each semitone ("half step") corresponds to a frequency ratio of the twelfth root of two: thus all semitones are identical intervals, with twelve successive ones exactly equaling an octave. Absolute frequencies are based on the 440 Hz concert A now commonly used in the West.
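The equal tempered relationship described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from MusiCI; the function and constant names are hypothetical.

```python
CONCERT_A_HZ = 440.0  # reference pitch stated in the text


def equal_tempered_hz(semitones_from_concert_a: int) -> float:
    """Frequency of the pitch lying the given number of semitones
    above (positive) or below (negative) the 440 Hz concert A.

    Each semitone is a frequency ratio of the twelfth root of two,
    so twelve semitones give exactly a factor of two (one octave).
    """
    return CONCERT_A_HZ * 2 ** (semitones_from_concert_a / 12)


if __name__ == "__main__":
    for steps in (-24, 0, 4, 12):
        print(steps, round(equal_tempered_hz(steps), 2))
```

Twelve semitones above concert A give 880 Hz and 24 semitones below give 110 Hz, the lowest fundamental the program offers.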

For the intervals between simultaneous fundamentals, however, MusiCI uses two different systems. For the less consonant musical intervals (minor second, major second, tritone, minor seventh, and major seventh) the same equal tempered scale is used. For more consonant intervals (octave, perfect fifth, perfect fourth, major third, minor third, minor sixth, and major sixth) the more consonant "Just" tuning is used. The fundamental frequencies in such intervals are related by ratios of small integers. Just intervals between simultaneously sounded notes occur frequently in modern performances, whenever musicians are able to achieve them.
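A minimal sketch of this two-system scheme follows. The text states only that just intervals use "ratios of small integers"; the specific ratios in the table are the customary ones and are an assumption here, as are all of the names. Only the intervals named in the paragraph above are covered.

```python
from fractions import Fraction

# Less consonant intervals: equal tempered, given as semitone counts.
EQUAL_TEMPERED_SEMITONES = {
    "minor second": 1,
    "major second": 2,
    "tritone": 6,
    "minor seventh": 10,
    "major seventh": 11,
}

# More consonant intervals: just tuning.
# ASSUMPTION: these are the customary small-integer ratios; the report
# does not list the exact values MusiCI uses.
JUST_RATIOS = {
    "minor third": Fraction(6, 5),
    "major third": Fraction(5, 4),
    "perfect fourth": Fraction(4, 3),
    "perfect fifth": Fraction(3, 2),
    "minor sixth": Fraction(8, 5),
    "major sixth": Fraction(5, 3),
    "octave": Fraction(2, 1),
}


def interval_ratio(name: str) -> float:
    """Frequency ratio of the upper fundamental to the lower one."""
    if name in JUST_RATIOS:
        return float(JUST_RATIOS[name])
    return 2 ** (EQUAL_TEMPERED_SEMITONES[name] / 12)


def upper_fundamental_hz(lower_hz: float, name: str) -> float:
    """Fundamental of the upper note for a given lower fundamental."""
    return lower_hz * interval_ratio(name)
```

With these assumed ratios, a major third above 440 Hz is 550 Hz, slightly below the equal tempered value of about 554.4 Hz.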

In chord mode, buttons are available to generate major, minor, diminished, and augmented triads automatically with respect to the root. Any three-note combination can be constructed manually, including triad inversions. Pitch specification for both upper tones is in terms of musical intervals above the lowest note, whether or not that note is musically the root of the chord.

A few technical details for scientists

The complex musical tones designed with MusiCI are stored as compact ASCII prescriptions (typically 86 bytes in length) and synthesized just before being played for the user. For playing, standard RIFF-compatible .WAV files are synthesized. The sample rate for the 16-bit words of these digital recordings is 44,100 per second. Each .WAV file generated by MusiCI is one half-second in duration, monophonic, and thus approximately 44 kB in length.

The relative amplitudes of successive harmonics of each fundamental can be specified as inversely proportional to frequency, to the square of frequency, or to the cube of frequency. The first two of these options are typical of the spectra of common musical instruments, while the third is included to allow compensation for the preemphasis filters built into many cochlear implant speech processors (to make higher frequency consonant sounds more competitive with louder low frequency vowel sounds). This dependence of harmonic amplitude on frequency (which is itself proportional to harmonic number) can be specified separately for each fundamental in the case of complex intervals and chords, as can the overall relative amplitude of each fundamental and its harmonics within an interval or chord.

A file customizes MusiCI to each user's own speech processor. Designed principally for use with continuous interleaved sampling (CIS) processors, this file provides the program with data as to the frequency pass-bands utilized by the particular processor in analyzing sounds. It allows the program to determine the extent to which presence of a given harmonic partial of a given fundamental will result in a response in only one analysis band (and thus only in the corresponding stimulus channel and implanted electrode) or in two adjacent bands. Motivations for providing such control include the expectation that uncontrolled interactions among speech processor channels are a major contributor to the unpleasantness of many musical sounds as experienced by cochlear implant users, and the observation that beating between adjacent harmonics when both lie unambiguously within a single analysis channel can support accurate pitch perception (beating between any two adjacent harmonics occurs at the fundamental frequency).

Where possible, data are supplied for two different separation criteria: to identify situations in which response to a harmonic partial will be down by at least 10 dB in adjacent channels, and situations in which it will be even more strongly limited to a single analysis band -- down by at least 20 dB in any other. Either of these criteria may be used to advise the user in selecting partials, and can be used as the basis for automatic selection of partials as a starting point for each complex stimulus. In every case, however, the user is free to include or exclude any of the first nine harmonics of each fundamental. Display of advice based on the criteria can be disabled.
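The two separation criteria can be illustrated with a short sketch. It assumes that the response of every analysis band to a given harmonic is already known in dB; the format of the customization file is not described here, so the input list, the numbers in the example, and all function names are hypothetical. As a simplification, the margin is measured against every other band rather than adjacent bands only.

```python
from typing import List, Optional


def dominant_band(responses_db: List[float]) -> int:
    """Index of the analysis band that responds most strongly."""
    return max(range(len(responses_db)), key=lambda i: responses_db[i])


def separation_db(responses_db: List[float]) -> float:
    """Margin in dB between the strongest band and the next strongest."""
    best = dominant_band(responses_db)
    others = [r for i, r in enumerate(responses_db) if i != best]
    return responses_db[best] - max(others)


def strictest_criterion_met(responses_db: List[float]) -> Optional[str]:
    """Return "20 dB", "10 dB", or None for a single harmonic."""
    margin = separation_db(responses_db)
    if margin >= 20.0:
        return "20 dB"
    if margin >= 10.0:
        return "10 dB"
    return None


if __name__ == "__main__":
    # Hypothetical responses (dB) of four bands to one harmonic.
    example = [-35.0, -2.0, -14.0, -40.0]
    print(dominant_band(example), strictest_criterion_met(example))
```

In the example, the harmonic lies mainly in the second band and is 12 dB down in its strongest neighbor, so it would qualify under the 10 dB criterion but not the 20 dB one.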

Following are drafts for an Operator's Manual and a Researcher's Manual.

MusiCI Operator's Manual

Virtually all operations in MusiCI can be accomplished by pointing your computer's mouse to a control in the program's window on your screen and clicking the left-hand mouse button once. That is what we will mean when we ask you to "click on" a particular control. In most cases, if you just point to a control with the mouse for a second or two, a helpful label will appear explaining its function in more detail.

The controls include several different types of buttons:

Radio buttons are round, with labels to their right. Only one of them within a particular box can be selected at a time. The selected one will have a dark dot at its center. When you select one by clicking on it, any previous selection in the same box is canceled, but there may not be any other immediate response on the screen.

Check boxes are small squares, also with labels to the right, each of which can be selected or not, regardless of whether other ones nearby are. Selected boxes contain a check mark: clicking on one that is not selected will select it, and clicking on one that is already selected will cancel its selection.

Command Buttons are larger rectangles with labels printed on the buttons themselves.

Other controls found in this program include faders, used to adjust relative loudness. The fader is a small rectangle that can be slid up and down anywhere along a vertical range. There is an "up" arrow button at the top of the range and a "down" arrow at the bottom. The fader can be moved by clicking on one of those arrows, or by pointing to the fader itself and then holding down the left mouse button to "grab" it while moving it along the scale. Release the mouse button once the fader is where you want it. If you happen to have a separate scroll feature on your mouse or other pointing device, you can simply point to the fader and scroll up or down.

Finally among the controls are lists from which you can select one or more items by clicking on them. In some cases all the items on a list can be seen at once, but if there is a pair of "up" and "down" arrows to the right of the list those can be used to bring other parts of the list into view. Some lists allow the selection of only one item, with any prior selection automatically cancelled by a new one. The selected item or items will appear as light printing on a dark background.

When the program MusiCI is launched on your computer the window that initially appears will resemble Figure 1 below.

Figure 1. Initial MusiCI window in Single-note mode.

This window includes examples of each of the types of control discussed above.

In this initial condition, the program is ready to synthesize and play a single musical note, the "A" above "middle C" on the piano (highlighted in the small visible portion of the upper list), using only harmonics that will produce the simplest kind of responses in your speech processor. (In the example in Figure 1 this means only the four harmonics -- numbers 1, 4, 6, and 9 -- highlighted on the lower list. Your version of the program has information unique to your own processor -- the title at the top of the window will reflect that -- and MusiCI may have selected a somewhat different set of harmonics.) To synthesize this sound (create a digital recording of it that you can listen to) click on the "One" command button in the box at the upper right marked "Make As." Within a fraction of a second, the computer will have made a recording of the "A" and stored it as "One" of a pair of sounds that are always available for you to compare (the other is called "Two"). To hear it, just click on the "One" command button in the "Play" box near the bottom right of the window.

Note that the "Single note" radio button in the "Mode" box at the top of the window is selected. For the next step in our introductory tour of the program, click on the "Two-note interval" radio button in the same box. The window will immediately change into one like that shown in Figure 2.

Figure 2. Window in Two-note interval mode.

Now we have a second pair of lists. The upper of this new pair selects the musical interval by which our additional note's fundamental will differ from the first in pitch (a Major third in this case -- a C# above the lower A), while selected harmonics of this second fundamental are highlighted in the lower one.

Try clicking on the "Two" button in the "Make As" box to make a recording of this two-tone interval. Then compare the sound to the Single-note tone recorded earlier, by clicking on "One" and "Two" in the "Play" box.

Next, proceed with the tour by clicking on the third radio button in the "Mode" box -- the "Three-note chord" button -- to change the window to one like that shown in Figure 3 below.

Figure 3. Window in Three-note chord mode. The chord is a major triad.

Two more lists have appeared for our third note. Their functions are exactly like those of the pair for the second tone. The upper list specifies the interval of the third note's fundamental above the first (a Perfect fifth in this case, an E above the lower A and C#), with its selected harmonics highlighted in the list below.

Replace either sample "One" or sample "Two" with a recording of this three-tone chord by clicking on the appropriate button in the "Make As" box, then compare the result with the remaining sample by clicking on both buttons in the "Play" box. Depending on the speed of your computer, you may notice that synthesis of two and three-tone samples takes longer. You can go ahead and click on a sequence of command buttons and they will be executed in the right order as soon as the computer catches up.

Now let's begin to explore some of the more subtle ways in which you can produce different musical tones with this program. Click on the "Single note" radio button in the "Mode" box to return to the conditions shown in Figure 1. If you stored the three note chord in sample "One" in the previous step, click on "One" in the "Make As" box now to restore it to the original Single note sound.

Now notice the three options represented by radio buttons in the "Channel Separation" box in the upper left corner of the window. The "20 dB criterion" has been selected up to this point. Watch what happens in the lower list box as you click on the "10 dB criterion" radio button, as illustrated in Figure 4.

Figure 4. Conditions of Figure 1, but with the 10 dB channel separation criterion.

Additional harmonics have now been automatically selected. Click on the "Two" button in the "Make As" box and then use the buttons in the "Play" box to compare the two sounds. You can also alternate between the two harmonic lists on the display by clicking on the corresponding buttons in the "Show" box. Clicking on the "One" button there will always show you the display corresponding to the sound you'll hear by clicking on the "One" button in the "Play" box, and similarly for sample "Two."

Let's take a moment here to explain the meaning of these different "Channel Separation" criteria. Any single harmonic in a musical tone produced by this program will have some single frequency. That frequency may happen to fall near the middle of one of the frequency bands with which your processor analyzes sounds: in that case the harmonic will produce a response only in the single channel and electrode corresponding to that single band. On the other hand, the harmonic's frequency may happen to fall about half way between the centers of two adjacent analysis bands, producing similar smaller responses in both the corresponding channels and electrodes. Thus the effect that a single harmonic has on your processor's output can be quite different depending on the relationship between the harmonic's exact frequency and the frequency band settings of your processor.

When you consider that a typical musical tone may be composed of at least several such harmonics, there could be enormous differences between the simplest case in which each harmonic affected only one band, channel, and electrode and the opposite extreme in which each harmonic affected a pair of channels and electrodes, the effects overlapping in unpredictable ways across the channels. Such uncontrolled channel interactions are a likely source of distracting variations in music heard via a cochlear implant.

When the "20 dB criterion" is in effect, the harmonic lists will indicate those harmonics that lie near the center of one channel's frequency band and that have less than 1/10 as much effect on any adjacent channel. The "10 dB criterion" will include some additional harmonics that, while still mainly affecting one channel, have a bit of an effect on an adjacent one as well. Each harmonic selected under either of these criteria will be labeled on the list with both its harmonic number and the number of the channel that responds to it (e.g. H6 Ch5). [Your version of MusiCI contains detailed information about the frequency bands of your own processor. If your audiologist ever significantly alters the programming of your device -- changing the number of channels, for instance, or the nature of the bandpass filters -- it may be necessary to update the information in MusiCI accordingly.]
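For readers who want the arithmetic behind "less than 1/10 as much effect," the following sketch converts a separation in dB to a ratio. It assumes that "effect" refers to signal amplitude, which the text does not state explicitly; the function name is hypothetical.

```python
def db_down_to_amplitude_ratio(db_down: float) -> float:
    """Amplitude ratio corresponding to a response that is
    `db_down` decibels below the strongest channel's response.

    ASSUMPTION: the decibel figures refer to amplitude (20 * log10),
    not power.
    """
    return 10 ** (-db_down / 20)


if __name__ == "__main__":
    for criterion in (10.0, 20.0):
        print(criterion, round(db_down_to_amplitude_ratio(criterion), 3))
```

Under that assumption, 20 dB down corresponds to one tenth of the amplitude, and 10 dB down to a little under one third.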

When the "Autoselect" box at the bottom of the window is checked, all harmonics satisfying the current criterion will be selected automatically for inclusion in the synthesized tone. You can always alter those selections by clicking on the locations of individual harmonics within the list. Or you can turn off the "Autoselect" feature and make all selections manually.

A full set of nine harmonics is always available, and all nine can be selected by default if you leave the "Autoselect" checked and choose the "No restriction" radio button in the "Channel Separation" box, as shown in Figure 5.

Figure 5. Conditions of Figure 1, but with no separation restriction.

Notice that in this case channel numbers are not indicated.

You may want to spend some time experimenting with different choices among the harmonics and different fundamental pitches. Do the sets of harmonics selected under the 20 dB criterion seem to sound consistently better or worse than the somewhat larger sets fulfilling the 10 dB criterion? How much variation do you hear from note to note as you select different note pitches from the upper list while leaving the channel separation criterion the same? Can you improve the sound of some notes by adding harmonics that do not fulfill the criteria?

In the upper list in the left column where the pitch of the fundamental is selected, by the way, the numbers after the note names identify the octave in which the pitch occurs. The octave numbers change at each A, which is additionally labeled with its frequency in Hz (cycles per second). The lowest fundamental available is A1, corresponding to a frequency of 110 Hz.

Returning now to our general tour of the program's features, if you click on the 20 dB criterion, a Root pitch of A3, the Three-note chord mode, and specify a Major triad by clicking on "Maj" in the "Chord" box, you will recreate the situation of Figure 3. The intervals of the upper two notes -- a Major Third and a Perfect Fifth above the Root A3 respectively -- result in those upper notes being C#3 and E3. Clicking on the "Min" radio button in the "Chord" box will change over to a minor triad, as shown in Figure 6.

Figure 6. Conditions of Figure 3, but with a minor triad specified.

The principal difference here is that the middle note is changed from C#3 to C3, i.e. it is lowered a half step to be a minor third above the root. The channel separation criterion, of course, may select different harmonics as well. See whether you can hear a difference between the traditionally "bright" major triad and the traditionally "darker" or "sadder" minor one. Can the addition or removal of harmonics make the difference more appropriate sounding?

Two additional options may be specified from the "Chord" box -- a diminished triad (with upper notes a minor third and a tritone above the root) and an augmented triad (with upper notes a major third and a minor sixth above the root).
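The four triad types can be summarized as pairs of intervals above the root, as in the sketch below. The "Maj" and "Min" labels come from the "Chord" box as described above; the "Dim" and "Aug" labels and all other names are assumptions made for illustration. The semitone counts only identify the intervals; they do not imply that every interval is equal tempered.

```python
# Size of each interval in semitones, used here only to name the notes.
SEMITONES = {
    "minor third": 3,
    "major third": 4,
    "tritone": 6,
    "perfect fifth": 7,
    "minor sixth": 8,
}

# Upper two notes of each triad, as intervals above the root.
# "Dim" and "Aug" are hypothetical button labels.
TRIADS = {
    "Maj": ("major third", "perfect fifth"),
    "Min": ("minor third", "perfect fifth"),
    "Dim": ("minor third", "tritone"),
    "Aug": ("major third", "minor sixth"),
}


def triad_semitones(kind: str) -> tuple:
    """Semitone offsets of the three notes of a triad above its root."""
    middle, upper = TRIADS[kind]
    return (0, SEMITONES[middle], SEMITONES[upper])


if __name__ == "__main__":
    for kind in TRIADS:
        print(kind, triad_semitones(kind))
```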

As you explore the effects of various manipulations on the musical sounds you hear, we hope that you will encounter some especially interesting contrasts. When that happens, we hope that you will click on the "Record Comment" command button and type in a description of what you hear when you compare samples "One" and "Two". A window like that shown in Figure 7 will open.

Figure 7. Window opened by the Record Comment button.

You can type any comments you like in this window, and go back and edit them with the mouse and the keyboard if you like. Once you are finished, just click on the "DONE" command button. That will add your comments to a computer file, along with exact technical descriptions of the two samples you were comparing. The window will disappear. From time to time we'll ask you to let us study that file. We hope in that way to discover new correlations between attributes of cochlear implant stimulus patterns and qualities of sound perceived by listeners.

Another feature of the program is the ability to preserve any sounds you'd like for future reference, by storing them as named files. Clicking on the "Save" command button opens a dialog box like that in Figure 8.

Figure 8. Dialog window opened by the Save button.

All existing files with the standard specification file extension .SPC will be displayed automatically. This will include 1.spc and 2.spc, the specifications of the current samples "One" and "Two." It may also include a set of 25 files -- with names not1.spc through not25.spc -- created by the Full Scale feature from its window (see below). Any others will be ones you have created and named in the past. You may enter a new name in the File name blank as shown in Figure 8, or click on an existing file name to replace its contents. Do not enter a file extension.

Use the "Recall" command button to return a previously saved specification to the display. This will open a window like the one in Figure 9.

Figure 9. Dialog window opened by the Recall button.

Again, all files with the .SPC extension will be displayed. Click on the one you want to load into the display and then click the "Open" command button, or simply double click on the file name. Once the recalled specification appears in the main window's display, it can be synthesized, recorded, and listened to as "One" or "Two" in the usual way.

Beyond comparisons of "One" - "Two" pairs of sounds, MusiCI supports the construction of full chromatic scales. The "Full Scale" command button gives you access to the window illustrated in Figure 10, in which scales are made and played. Before clicking on that button, however, you need to be sure that the tone, interval, or chord on which you want to base your scale is displayed in the main window.

Figure 10. Window opened by the Full Scale button.

A single click on the "Make" button in this window will synthesize 25 notes, spanning two full chromatic octaves. Those notes will be available as files not1.spc through not25.spc and not1.wav through not25.wav. Before accessing this window you selected a reference tone, interval, or chord by displaying it in the main window. Before clicking on the "Make" button you also have two other choices to consider: in the "Pattern" box you can choose whether to apply the same criterion (as used in the reference tone) across all notes in the scale or to use the same harmonics in each case, and in the "Octaves" box you can choose between two different two-octave ranges. Depending on the speed of your computer, it may take some time to complete synthesis of all 25 notes. (If you have the Scale window in a position on your screen that leaves the root pitch list visible in the main MusiCI window, you can watch the root pitch change as the 25 notes are synthesized and recorded.) As soon as the task is done you may play any of the notes by clicking on the appropriate keys of the musical keyboard displayed in the Scale window. Each note will last a half second, but shorter effective durations can be obtained by interrupting one note with the next. A brief noise may be noticed at each such interruption.

[The first time you make a full scale, approximately 1.1 MB of additional disk storage will be tied up on your computer's disk, primarily to contain the 25 .WAV files. Subsequently, making or retrieving scales will change storage requirements only by the size of the much smaller .SPC and .SCS specification files.]
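The storage figure can be checked with a short calculation based on the recording format given earlier in this appendix (44,100 samples per second, 16-bit monophonic samples, one half-second per file). The 44-byte header size is an assumption (the usual minimal PCM .WAV header), and the names are hypothetical.

```python
SAMPLE_RATE = 44_100      # samples per second
BYTES_PER_SAMPLE = 2      # 16-bit words, monophonic
DURATION_S = 0.5          # each note lasts one half-second
NOTES_PER_SCALE = 25
WAV_HEADER_BYTES = 44     # ASSUMPTION: minimal PCM RIFF header


def wav_file_bytes() -> int:
    """Approximate size of one synthesized .WAV file."""
    samples = int(SAMPLE_RATE * DURATION_S)
    return samples * BYTES_PER_SAMPLE + WAV_HEADER_BYTES


def full_scale_bytes() -> int:
    """Approximate size of the 25 .WAV files of a full scale."""
    return NOTES_PER_SCALE * wav_file_bytes()


if __name__ == "__main__":
    print(wav_file_bytes(), full_scale_bytes())
```

Each file comes to about 44 kB, and 25 of them to about 1.1 MB.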

Additional features of this screen include the possibility of taking an individual note back to the main window as sample "One" for adjustment and then bringing it back into the context of the full scale. This is done with the command buttons "Edit as One" and "Replace with One." In both cases the scale note involved will be the last one played from the Full Scale window's graphic keyboard.

A note for musicians: If your reference "tone" from the main window was in fact a two-note interval, each key on the keyboard in this window will play such an interval. You can make "scales" that let you play parallel perfect fifths, for instance (you might call that one "Medieval"), or parallel major thirds ("Brahms"?). As another possibility, if you choose a major chord as your reference and use that to make a scale, you'll have a keyboard that will harmonize with you (I, IV, V, I, etc.)! Then, by editing individual chords in the main window, you could even equip your virtual accompanist keyboard with inversions of some of the chords, use one of the keys for three notes from a V7 chord, etc.

Finally, provision is made for storing and retrieving all 25 specifications for a full scale as a single file. Such files carry the extension .SCS and consist of 25 lines, each of which is identical in format to a single .SPC file. Naming, generating, and saving such a file is accomplished using the "Save" command button in the "Entire Scale" box, which opens the dialog window shown in Figure 11.

Figure 11. Dialog window opened by the Save button in the Scale window.

Use of this window is essentially identical to that discussed for Figure 8 above. The corresponding retrieval and resynthesis are accomplished with the "Recall" command button in the "Entire Scale" box, which opens a dialog window like that illustrated in Figure 12.

Figure 12. Dialog window opened by the Recall button in the Scale window.

Use of this window is essentially identical to that discussed above regarding Figure 9. Note that when you recall a full scale in this way, the program will replace the previous scale in files not1.spc through not25.spc and not1.wav through not25.wav. Consider saving the specifications for the previous scale before taking this step.

A final feature of the MusiCI program is the capability to administer a variety of formal tests designed by researchers and provided to you via e-mail or on diskette. Each such test will generate a results file to be returned to the research institution for scoring and analysis.

After copying the test script file(s) to the same directory that contains your other MusiCI files, you can invoke the test administration feature by typing Alt-T from the keyboard. The first window to appear will be the one shown in Figure 13.

Figure 13. Initial window opened by entering the Alt-T code from the keyboard.

If you encounter this window by accident, close it by clicking on the X in the upper right corner. You will receive additional specific instructions about any test you are asked to take using this feature of MusiCI.

There are several controls on the main window that we haven't discussed yet, and they all have to do with relative loudness. The "1/f", "1/ff", and "1/fff" radio buttons that appear in "Weight" boxes below each of the pairs of lists control how quickly the loudness of upper harmonics drops off with increasing pitch -- the relative "weight" of the harmonics. [These scientific labels mean that the relative strength of each harmonic varies inversely with its frequency, the square of its frequency, or the cube of its frequency, respectively. This has nothing to do with the musical loudness notations f, ff, and fff!] Most common musical instruments produce complex tones that correspond to the "1/f" or "1/ff" category, so those settings are likely to produce tones that most resemble other musical sounds you hear around you through your implant. Many speech processors, however, are designed to be more sensitive to higher frequencies (to better detect soft, high consonant sounds in the presence of loud, low vowels in speech), so use of the "1/ff" and "1/fff" settings in MusiCI may produce a balance more similar to that of normal hearing.
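The three "Weight" settings can be illustrated numerically. Because the frequency of the nth harmonic is n times the fundamental, the strength of each harmonic relative to the fundamental depends only on n. The sketch below assumes amplitudes are expressed relative to the fundamental; the names are hypothetical.

```python
from typing import List

# Exponent applied to frequency for each "Weight" radio button.
WEIGHT_EXPONENTS = {"1/f": 1, "1/ff": 2, "1/fff": 3}


def harmonic_weights(setting: str, count: int = 9) -> List[float]:
    """Relative amplitudes of harmonics 1..count for a weight setting.

    ASSUMPTION: amplitudes are normalized so the fundamental is 1.0.
    """
    exponent = WEIGHT_EXPONENTS[setting]
    return [1 / n ** exponent for n in range(1, count + 1)]


if __name__ == "__main__":
    for setting in WEIGHT_EXPONENTS:
        print(setting, [round(w, 4) for w in harmonic_weights(setting)])
```

Under the "1/f" setting the ninth harmonic has one ninth the amplitude of the fundamental; under "1/fff" it has only 1/729.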

The fader controls that appear to the right of each "Weight" box are at their maximum settings by default, but can be moved to reduce the relative loudness of one or more tones within intervals and chords. The larger fader near the upper right corner of the main window is a master gain control that can reduce the loudness of the overall sound. If an error message about an "overflow" ever appears while MusiCI is synthesizing a tone, lower the position of the master gain fader and try again. Check the settings of the faders if the program's sounds are suddenly strangely soft or silent. Note that changing a fader setting will not affect the loudness of an existing sample until that sample is resynthesized with one of the command buttons in the "Make As" box.

Earlier in this manual we discussed the possible advantages of excluding individual harmonics that produced responses in more than one channel of your speech processor. We should also mention that there is a potential advantage to including adjacent pairs of harmonics that produce responses in the same channel. [When that happens, a phenomenon called beating may convey the fundamental pitch particularly well in that channel. As an example, look at the second tone in Figure 2. Two sets of adjacent harmonics have been selected in that example: while harmonics 2 and 3 produce responses in different channels (channel 3 and channel 4) and thus wouldn't offer this advantage, harmonics 7, 8, and 9 all produce responses in channel 6 and might prove particularly helpful in conveying the pitch of the whole note. As a further example, compare the selected harmonics in Figures 1 and 4. In that case, relaxing the channel separation criterion from 20 dB to 10 dB adds two harmonics, the 3rd and 7th. While adding the 3rd harmonic may or may not produce problems because of channel interactions, adding the 7th offers the extra potential benefit of beating with the 6th harmonic in their common channel 5.]
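The reason adjacent harmonics beat at the fundamental frequency can be shown with a little arithmetic. The sketch below assumes two harmonics of equal amplitude and uses the identity sin A + sin B = 2 sin((A + B)/2) cos((A - B)/2); the function names are hypothetical.

```python
import math


def beat_rate_hz(f0: float, h_low: int, h_high: int) -> float:
    """Beat rate between two harmonics of the same fundamental."""
    return (h_high - h_low) * f0


def two_harmonic_sum(f0: float, h_low: int, h_high: int, t: float) -> float:
    """Sum of two equal-amplitude harmonics at time t (seconds)."""
    return (math.sin(2 * math.pi * h_low * f0 * t)
            + math.sin(2 * math.pi * h_high * f0 * t))


def carrier_times_envelope(f0: float, h_low: int, h_high: int,
                           t: float) -> float:
    """The same sum written as a fast carrier times a slow envelope.

    The envelope term, cos(pi * (h_high - h_low) * f0 * t), rises and
    falls in magnitude at the beat rate.
    """
    carrier = math.sin(math.pi * (h_low + h_high) * f0 * t)
    envelope = 2 * math.cos(math.pi * (h_high - h_low) * f0 * t)
    return carrier * envelope
```

For the 6th and 7th harmonics of a 440 Hz fundamental, the beat rate is 440 Hz, the fundamental itself.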

A final remark: As you will have noticed, this program offers MANY possible adjustments. Don't let that intimidate you. Feel free to ignore the musical and scientific issues and just play with the sounds. MusiCI may offer almost as many confusing options as the fitting system for your speech processor, but at least you don't need your audiologist along to play with them! Try one type of change at a time, and save the improvements.

MusiCI Researcher's Manual

Main Program

MusiCI is object oriented, with a main window that displays a set of parameters defining a half-second complex tone stimulus involving as many as three independent fundamental frequencies, each with up to nine harmonic partials. The window elements that display those parameters and allow control of them are fully described in the Operator's Manual above. Additional objects associated with the main screen allow bidirectional conversion between the displayed parameter set and compact ASCII vector representations of such sets; creation and playing of the stimuli themselves in .WAV file form; easy access to any two stimulus designs for rapid comparisons; a facility for collecting, documenting, and archiving anecdotal user comments about such comparisons; a facility that allows users to construct, use, save, and manage two-octave chromatic scales based on stimuli they have designed; and a utility that can synthesize the necessary stimuli and automatically administer formal tests involving comparisons of complex tones, based on flexible ASCII scripts. An addendum at the end of this Researcher's Manual describes a non-traditional musical scale system also available as an option within MusiCI -- information that should not be shared with cochlear implant users who are potential naïve subjects for testing with such stimuli.

The 48 equal tempered pitches available for use as lowest (root) fundamentals range from A1 at 110 Hz to G#4 at 1661.2 Hz. [Note that our octave numbering system is not the standard musical one, which has C as the lowest pitch within each octave and identifies A 440 Hz as A4.] The frequency ratios relating upper notes to the lowest fundamental in our synthesized intervals and triad chords range from minor second (a single semitone) through a musical twelfth (a factor of three, an octave plus a perfect fifth, approximately 19 equal tempered semitones). The Operator's Manual discusses which of those simultaneous intervals are equal tempered and which are just.
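The list of root pitches and the octave numbering described above can be reproduced with a brief sketch. The spelling of accidentals as sharps throughout is an assumption (the manuals mention only C# and G#), and the function name is hypothetical.

```python
from typing import List, Tuple

# Note names within one octave, starting at A because the MusiCI
# octave number changes at each A. ASSUMPTION: sharps throughout.
NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D",
              "D#", "E", "F", "F#", "G", "G#"]

LOWEST_HZ = 110.0   # A1 in the MusiCI numbering
PITCH_COUNT = 48    # four chromatic octaves


def root_pitches() -> List[Tuple[str, float]]:
    """(label, frequency in Hz) for each available root fundamental."""
    pitches = []
    for index in range(PITCH_COUNT):
        octave = 1 + index // 12
        label = f"{NOTE_NAMES[index % 12]}{octave}"
        frequency = LOWEST_HZ * 2 ** (index / 12)
        pitches.append((label, frequency))
    return pitches


if __name__ == "__main__":
    for label, frequency in root_pitches():
        print(f"{label:4s} {frequency:8.1f} Hz")
```

In this numbering the 440 Hz concert A is labeled A3, and the highest root, G#4, falls at about 1661.2 Hz.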

When instructed to create a stimulus .WAV file, the synthesis engine begins by building a sine wave lookup table for each of the three fundamental frequencies, corresponding to a sample rate of 44.1 kS/s. The parameter set also is examined to determine the relative amplitudes of each of the 27 harmonic partials. A pointer for each harmonic is incremented through the lookup table for its respective fundamental, the pointer for the nth harmonic being advanced n steps in the table for each successive sample interval. Pointers are reset individually to cyclically follow the lookup table waveforms. All the contributions are summed for each sample interval. Digital gain factors are applied at the level of each harmonic (derived from the weighting category and the harmonic number) and, derived from the display fader settings, at the level of each of the three distinct harmonic series and of the overall signal. A conservative fixed gain factor specified in each user's device customization file also multiplies the overall signal; any overflow of the 32767 amplitude limit is detected and an error message is generated. Linear ramps, 500 samples in length, are applied to the overall signal at the onset and offset of each stimulus.
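The table-lookup scheme described above can be sketched as follows. This is an illustrative reconstruction, not the MusiCI source: the table length (one fundamental period rounded to a whole number of samples), the default gain values, and all names are assumptions.

```python
import math

SAMPLE_RATE = 44_100
N_SAMPLES = SAMPLE_RATE // 2   # one half-second stimulus
RAMP_SAMPLES = 500             # linear onset and offset ramps
FULL_SCALE = 32_767            # 16-bit amplitude limit


def synthesize(series, master_gain=1.0, fixed_gain=0.1):
    """Return a list of 16-bit sample values for one stimulus.

    `series` holds up to three tuples of
    (fundamental_hz, series_gain, weight_exponent, harmonic_numbers).
    `fixed_gain` stands in for the per-user factor from the device
    customization file (hypothetical default).
    """
    mix = [0.0] * N_SAMPLES
    for f0, series_gain, exponent, harmonics in series:
        # ASSUMPTION: the table holds one period of the fundamental,
        # rounded to a whole number of samples.
        length = round(SAMPLE_RATE / f0)
        table = [math.sin(2 * math.pi * k / length) for k in range(length)]
        for n in harmonics:
            gain = series_gain / n ** exponent
            pointer = 0
            for i in range(N_SAMPLES):
                mix[i] += gain * table[pointer]
                # The nth harmonic advances n steps per sample and wraps.
                pointer = (pointer + n) % length

    samples = []
    for i, value in enumerate(mix):
        # Linear ramp up over the first 500 samples, down over the last.
        ramp = min(1.0, i / RAMP_SAMPLES, (N_SAMPLES - 1 - i) / RAMP_SAMPLES)
        sample = round(value * ramp * master_gain * fixed_gain * FULL_SCALE)
        if abs(sample) > FULL_SCALE:
            raise OverflowError("amplitude limit exceeded; reduce the gain")
        samples.append(sample)
    return samples


if __name__ == "__main__":
    tone = synthesize([(440.0, 1.0, 1, [1, 4, 6, 9])])
    print(len(tone), max(abs(s) for s in tone))
```

Rounding the table length makes the synthesized fundamental differ very slightly from the nominal frequency; whether MusiCI handles this differently is not stated in this report.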

Automated Test Administration Module

When, after pressing Alt-T, a subject clicks on the START button shown in Figure 13 above (in the Operator's Manual), the MusiCI program searches for a file named pending.tst in the current directory. The existence of such a file will indicate that an ongoing test was suspended earlier by the subject (by clicking on a "Recess" command button in a test window). If such a file is found, it will be read and the suspended test resumed immediately.

If no pending.tst file is found, the program will close the main MusiCI window and open a window like that shown in Figure 14 below.

Figure 14. Dialog window opened by the Start button in the Automated Test window, if no pending.tst file has been found.

This window will display a list of all files in the current directory that have .TST extensions. Each such file is expected to contain instructions for administering a formal complex tone test, using a scripting format outlined in Tables 1 through 3 below. Through such a file, the name of the testing window can be specified for the individual test, and labels assigned to each of eight multiple choice check boxes for a "Choose all that apply" panel and each of eight linked radio buttons in a "Choose one" panel. Then, as the test proceeds, the file can specify the contents of an instruction line within the test window, whether the multiple choice or forced choice panel is displayed, exactly which check boxes or radio buttons are made visible within that panel, which command buttons (among six) are available, the nature and number of stimuli to be delivered, the minimum delay between stimuli, and the number and nature of repeat presentations to be allowed for each response sequence within the test.

Table 1. Test Script (.TST) File Format Summary:
title of the test (displayed atop test window and written to response file) [dedicated line, spaces permitted]
name of response file to be created [dedicated line, any valid file name under Windows 95 and 98]
initial instruction line for display in test window [dedicated line, spaces permitted]
{8 lines containing captions for check boxes in the multiple choice panel [dedicated lines, spaces permitted]}
{8 lines containing captions for radio buttons in the forced choice panel [dedicated lines, spaces permitted]}
command line for 1st condition (see Table 2 below)
optional instruction line and stimulus specification line(s) for 1st condition (see Table 3 below)
command line for 2nd condition
optional instruction line and stimulus specification line(s) for 2nd condition
etc.
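
The layout in Table 1 suggests a simple sequential reader. The Python sketch below is one possible reading of that layout, not the program's own parser. It relies on the second and third integers of each command line (the instruction flag and the number of stimuli, per Table 2) to decide how many lines follow; the function and key names are hypothetical.

```python
def read_tst(lines):
    """Split the lines of a .TST script into a header and a list of conditions."""
    lines = [ln.rstrip("\n") for ln in lines]
    header = {
        "title": lines[0],
        "response_file": lines[1],
        "instruction": lines[2],
        "check_captions": lines[3:11],    # 8 multiple choice captions
        "radio_captions": lines[11:19],   # 8 forced choice captions
    }
    conditions = []
    pos = 19
    while pos < len(lines) and lines[pos].strip():
        command = [int(x) for x in lines[pos].split(",")]
        pos += 1
        instruction = None
        if command[1] > 0:                # instruction flag set
            instruction = lines[pos]
            pos += 1
        n_stimuli = command[2]
        stimuli = lines[pos:pos + n_stimuli]
        pos += n_stimuli
        conditions.append(
            {"command": command, "instruction": instruction, "stimuli": stimuli}
        )
    return header, conditions
```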

Table 2. Command Line Format for .TST Files (single line, comma delimited integers)
condition number (arbitrary, recorded in response file),
instruction flag (if > 0 contents of the next line will replace any instructions already displayed in the window),
number of stimuli in this condition (that many lines will be read in as stimulus specifications),
number of repeats permitted (if zero, PLAY button disappears after one use, otherwise PLAY changes to REPEAT for the specified number of uses; if Play1 and Play2 buttons enabled instead, Play1 is allowed the specified number of repeats before disappearing and the Play2 button remains active until its first use after Play1's demise),
mode (forced choice if 1, loudness balance if 2, otherwise multiple selections allowed; for mode 2 the second stimulus master gain is initially set to the stimulus specification value and then is controlled by the displayed fader),
multiple choice mask (bit mask to enable any combination of the 8 captioned check boxes: sum 1=Check1 . . . 128=Check8),
forced choice mask (bit mask to enable any combination of the 8 captioned radio buttons: sum 1=Button1 . . . 128=Button8),
response options mask (bit mask: sum 1=Next, 2=Recess, 4=Play/Repeat, 8=Play1, 16=Play2),
minimum silence between successive stimulus presentations (in half seconds)
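
To make the bit masks in Table 2 concrete, the following Python sketch decodes a command line into named fields and lists of enabled controls. The field names and helper functions are ours; only the field order and mask values come from the table.

```python
FIELDS = [
    "condition", "instruction_flag", "n_stimuli", "repeats", "mode",
    "check_mask", "radio_mask", "options_mask", "silence_half_seconds",
]
OPTION_NAMES = ["Next", "Recess", "Play/Repeat", "Play1", "Play2"]


def bits_set(mask, width):
    """Return the 1-based positions of the bits set in `mask`."""
    return [i + 1 for i in range(width) if mask & (1 << i)]


def parse_command_line(line):
    """Decode one comma delimited .TST command line."""
    values = [int(x) for x in line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError("expected %d integers" % len(FIELDS))
    cmd = dict(zip(FIELDS, values))
    cmd["check_boxes"] = bits_set(cmd["check_mask"], 8)
    cmd["radio_buttons"] = bits_set(cmd["radio_mask"], 8)
    cmd["buttons"] = [OPTION_NAMES[i - 1] for i in bits_set(cmd["options_mask"], 5)]
    cmd["silence_seconds"] = 0.5 * cmd["silence_half_seconds"]
    return cmd


cmd = parse_command_line("3,1,3,0,1, 0,224,7,2")
print(cmd["radio_buttons"], cmd["buttons"], cmd["silence_seconds"])
```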

Table 3. Stimulus Specification Line Format (single line, comma delimited integers)
channel separation criterion (1=20 dB, 2=10 dB, 3=None),
number of fundamentals (1, 2, or 3),
master gain setting (0-99),
scale system switch (0=standard, 1=non-traditional [see Addendum below]),
code for root pitch (ET semitone number beginning with 1=A at 110 Hz: 1-48),
{9 comma delimited binary digits flagging the 9 harmonics of the root, 1-9 order},
root gain setting (0-99),
upper harmonic weight for root (0=1/f, 1=1/ff, 2=1/fff),
code for second fundamental's interval above root (in semitones, 1-19),
{9 comma delimited binary digits flagging the 9 harmonics of the second tone, 1-9 order},
second tone gain setting (0-99),
upper harmonic weight for second tone (0=1/f, 1=1/ff, 2=1/fff),
code for third fundamental's interval above root (in semitones, 1-19),
{9 comma delimited binary digits flagging the 9 harmonics of the third tone, 1-9 order},
third tone gain setting (0-99),
upper harmonic weight for third tone (0=1/f, 1=1/ff, 2=1/fff)
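
Table 3 implies 40 integers per line: four global settings followed by three groups of twelve (a pitch or interval code, nine harmonic flags, a gain, and a weight code). The Python sketch below unpacks a line on that assumption. The dictionary keys are hypothetical names chosen for illustration.

```python
def parse_stimulus_line(line):
    """Unpack a comma delimited stimulus specification line into a dictionary."""
    values = [int(x) for x in line.strip().rstrip(",").split(",")]
    if len(values) != 40:
        raise ValueError("expected 40 integers, found %d" % len(values))
    spec = {
        "separation": values[0],        # 1=20 dB, 2=10 dB, 3=None
        "n_fundamentals": values[1],
        "master_gain": values[2],
        "scale_system": values[3],      # 0=standard, 1=non-traditional
        "tones": [],
    }
    pos = 4
    for name in ("root", "second", "third"):
        spec["tones"].append({
            "name": name,
            "code": values[pos],        # root pitch code, or interval above root
            "harmonics": values[pos + 1:pos + 10],
            "gain": values[pos + 10],
            "weight": values[pos + 11],
        })
        pos += 12
    return spec
```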

To illustrate possible uses of this test scripting format, a rather strange .TST file has been constructed. It is shown, line by line, in Table 4 below and produces the windows shown in Figures 15 through 19.

Table 4. Example .TST File Used to Generate Figures 15 through 19
Demonstration Test Script
demo.rlt
A full set of multiple choices, 3 repeats allowed
Bright
Dull
Smooth
Rough
Simple
Complex
Pleasant
Annoying
Second pitch higher
Second pitch lower
Can't tell
Second louder
Second softer
First Different
Second Different
Third Different
1,0,1,3,0,255, 0,7,1
1,1,99,0,01,1,1,1,1,1,1,1,1,1,99,0,0,0,0,0,0,0,0,0,0,0,99,0,0,0,0,0,0,0,0,0,0,0,99,0
2,1,2,3,1, 0, 3,7,2
Now a two-alternative forced choice, 3 repeats allowed
1,1,99,0,25,1,1,1,1,1,1,1,1,1,99,0,0,0,0,0,0,0,0,0,0,0,99,0,0,0,0,0,0,0,0,0,0,0,99,0
1,1,99,0,26,1,1,1,1,1,1,1,1,1,99,0,0,0,0,0,0,0,0,0,0,0,99,0,0,0,0,0,0,0,0,0,0,0,99,0
3,1,3,0,1, 0,224,7,2
Finally, a 3-alternative odd-man-out with single presentation
1,1,99,0,18,1,1,1,1,1,1,1,1,1,99,0,0,0,0,0,0,0,0,0,0,0,99,0,0,0,0,0,0,0,0,0,0,0,99,0
1,1,99,0,19,1,1,1,1,1,1,1,1,1,99,0,0,0,0,0,0,0,0,0,0,0,99,0,0,0,0,0,0,0,0,0,0,0,99,0
1,1,99,0,18,1,1,1,1,1,1,1,1,1,99,0,0,0,0,0,0,0,0,0,0,0,99,0,0,0,0,0,0,0,0,0,0,0,99,0
4,1,1,10,2,0,0,11,1
Adjust the volume control on your processor so that Tone 1 is Most Comfortable
1,1,99,0,25,1,0,0,0,0,0,0,0,0,99,0,0,0,0,0,0,0,0,0,0,0,99,0,0,0,0,0,0,0,0,0,0,0,99,0
5,1,2,10,2,0,0,27,1
Adjust the fader to make Tone 2 the same loudness as Tone 1
1,1,99,0,25,1,0,0,0,0,0,0,0,0,99,0,0,0,0,0,0,0,0,0,0,0,99,0,0,0,0,0,0,0,0,0,0,0,99,0
1,1,20,0,25,1,1,1,1,1,1,1,1,1,99,0,0,0,0,0,0,0,0,0,0,0,99,0,0,0,0,0,0,0,0,0,0,0,99,0

Figure 15. Example of a multiple choice test response window.

For the initial test condition shown in Figure 15 the command line code is "1,0,1,3,0,255, 0,7,1". The condition number is 1. No additional instruction line is read in, leaving the previous (original) one displayed. A single stimulus is involved. No more than three repeat presentations will be allowed. The multiple choice panel is displayed. All eight check boxes in that panel are made visible (and, thus, available for user selection). None of the radio buttons in the other panel are displayed. Three command buttons -- NEXT, Recess, and PLAY/REPEAT -- are displayed. The minimum delay between successive stimulus presentations is set at 0.5 second. [Delays are imposed by the playing of the half-second file silence.wav a specified number of times.]

Figure 16. Example of a two alternative forced choice test response window.

For the next test condition, shown in Figure 16, the command line code is "2,1,2,3,1, 0, 3,7,2". The condition number is 2. This time a new instruction line is read in. Comparison of a pair of stimuli is involved. Again, no more than three repeat presentations will be allowed. The forced choice panel is displayed. None of the check boxes in the other panel are displayed. The first two radio buttons in the forced choice panel are made available. The same three command buttons -- NEXT, Recess, and PLAY/REPEAT -- are displayed. The minimum delay between successive stimulus presentations (including the interval between the two stimuli in each presented pair) is set at 1.0 second.

Figure 17. Example of a three alternative forced choice test response window.

Figure 17 displays the window as it appears for the next test condition in our example .TST file. The command line code is "3,1,3,0,1, 0,224,7,2". The condition number is 3. A flag again causes the reading in of a new instruction line to be displayed for this condition. Comparison of three stimuli is involved. No repeat presentations will be allowed. The forced choice panel is selected again. No check boxes are displayed in the other panel. The last three radio buttons in the forced choice panel are selected as possible responses for this condition. The same three command buttons -- NEXT, Recess, and PLAY/REPEAT -- again are displayed. The minimum delay between successive stimulus presentations (including the interval between adjacent stimuli in each presented set) is again set at 1.0 second.

The PLAY/REPEAT command button is initially displayed with the label PLAY, and the program waits for the subject to click on it before first delivering a stimulus. If the command line has indicated that repeats are to be allowed, the PLAY button is relabeled as REPEAT until the specified number of repeats has been met. The PLAY/REPEAT button then disappears. The subject is free to make and alter response selection(s) until he/she clicks on the NEXT command button.

As an alternative to the PLAY/REPEAT button, a pair of command buttons labeled "Play1" and "Play2" can be used. This will allow the subject to play (and, optionally, repeat) two stimuli in any order. If the parameter is set to allow a certain number of repeats, that many additional presentations of the first stimulus will be allowed before the Play1 button disappears. The Play2 button will remain visible and active until it is used once after the Play1 button disappears. The minimum delay between successive stimuli functions in this response mode as well.

Figure 18. Example of Loudness Balancing Task: Preliminary Processor Adjustment

As a final example of ways in which .TST file scripts can be used, we illustrate how mode 2 can support automated loudness balancing among stimuli prior to testing with those stimuli. Figure 18 displays a preliminary window produced by the next condition in our example .TST file. The command line code is "4,1,1,10,2,0,0,11,1". The condition number is 4. A flag again causes the reading in of a new instruction line to be displayed for this condition. Only a single stimulus is involved, with up to 10 repeats allowed as the subject adjusts the volume control on his/her processor to achieve a most comfortable level for the reference sound associated with the Play 1 button. Mode 2 causes the fader (which will be used in the following step) to appear in the window. No check boxes or radio buttons are displayed. Three command buttons -- NEXT, Recess, and Play 1 -- are displayed. The minimum delay between successive stimulus presentations is set at 0.5 second. Typically, the reference sound supplied here will have been chosen as softer than any of the other sounds to be balanced with it, and will be presented at the maximum master gain setting (99).

Figure 19. Example of Loudness Balancing Task: Stimulus Loudness Adjustment

Once the subject's processor is adjusted to provide a most comfortable level (MCL) for the reference sound, the final window shown in Fig. 19 can be repeated for each sound to be loudness balanced with respect to it. The command line code is "5,1,2,10,2,0,0,27,1". The condition number is 5, and a new instruction line is provided. There are two stimuli in this case, the first being the same reference sound used in the preliminary step, and the second being a stimulus to be loudness balanced against that reference, a process for which as many as 10 repeat playings of the reference sound will be allowed. The mode setting of 2 displays and enables the fader control, while no radio buttons or check boxes are shown. The Play 2 button is added to those from the preliminary step. A minimum delay of 0.5 s is specified between successive tones. The initial setting of the fader control is 20%, as specified in the stimulus specification line, to ensure that the sound being balanced will be audible initially, but not louder than the reference sound. As the subject moves the graphic fader control, a flag is set indicating that a change has occurred. When the Play 2 button next is activated, the status of that flag leads to resynthesis of the test sound with the new master gain setting before it is played.

The NEXT button causes the subject's response(s) to be recorded in the file specified in the .TST file's header, along with a date/time stamp and an indication of the number of repeats the subject actually used. An outline of the format of a response file is shown below in Table 5. After recording the response(s) to the previous condition, the program then reads the next command line, if any, and proceeds accordingly. If there is no further command line, the program places an end of test entry in the response file, deletes the .TST file, closes the automated test window, and reopens the main MusiCI window.

Table 5. Response File Format [comma delimited, one line per recorded event]
Event Code (b=begin or resume test, c=response to condition, s=suspend test, e=end test),
For events type b, s, and e: Test title (s also includes message "suspended by recess button"),
For event type c: condition number (from .TST file), mode (0=multiple choice, 1=forced choice, 2=loudness balance),
For event type c, mode 0 or 1: eight binary digit response vector for response buttons 1 through 8 of the appropriate type,
For event type c, mode 2: final master gain value for stimulus 2,
For event type c, all modes: number of presentations delivered (by PLAY/REPEAT or by Play1),
Date/Time stamp

The command button marked "Recess" records the subject's response(s) to the previous condition just as the NEXT button does. It then, however, constructs a new .TST file, named pending.tst, that specifies the same response file name, has a header designed to reproduce the instruction line and label settings as of the time the test was suspended, and then includes all the condition command lines, instruction lines, and specification lines remaining to be acted on. After creating this new file, the program enters a suspension record in the response file, deletes the original .TST file, closes the automated test window, and reopens the main MusiCI window.

ARCHIVE.TXT files as a Research Tool

When the user clicks on the Record Comment command button in the main MusiCI window, enters text into the Comments window, and clicks on the DONE command button in that window (see Figure 7), the file archive.txt is opened, or created if necessary, and an entry is appended to it. The format of that entry is indicated in Table 6.

Table 6. ARCHIVE.TXT Comments File Format
Date/Time stamp [dedicated line, includes spaces]
stimulus specification line for Sample "One" (see Table 3 above)
stimulus specification line for Sample "Two"
comments entered by user [dedicated line, may contain spaces and special characters]
blank line
blank line
(sequence repeated for each use of comment feature)
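
A researcher working through an archive.txt file outside MusiCI might read the entries with something like the Python sketch below. It assumes exactly the four content lines per entry listed in Table 6, separated by blank lines, and that the date/time line is never empty. The function and key names are ours.

```python
def read_archive(text):
    """Return a list of comment entries from the text of an archive.txt file."""
    lines = text.splitlines()
    entries = []
    pos = 0
    while pos + 3 < len(lines):
        if not lines[pos].strip():
            pos += 1                    # skip the blank separator lines
            continue
        entries.append({
            "timestamp": lines[pos],
            "sample_one": lines[pos + 1],   # stimulus specification line
            "sample_two": lines[pos + 2],   # stimulus specification line
            "comment": lines[pos + 3],
        })
        pos += 4
    return entries
```

Either specification line of an entry can then be copied to the Clipboard or passed to other analysis code.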

A hidden utility has been included in MusiCI for the convenience of researchers examining archive.txt files. While reading comments from such files using a word processor, a researcher can place any stimulus specification line in the Windows Clipboard memory (typically by highlighting the single line and pressing Ctrl-C). Then, with focus on the MusiCI main window, the researcher can press Alt-1 [or Alt-2] to adopt the stimulus specification as 1.spc and 1.wav [or as 2.spc and 2.wav] and display its parameters. Because this overwrites the previous file(s), the previous 1.spc or 2.spc should first be saved as a named file if it is to be kept.

Customization of the Program

Table 7 documents the format used in preparing the files that customize MusiCI to individual speech processors and underlie the program's information about each harmonic in terms of channel separation criteria. Such specification files are named musici.cus and also contain the main window title line displayed for each customized version of the program.

Table 7. MUSICI.CUS Processor-Specific Customization File Format
Processor Name (displayed atop main window) [dedicated line, may contain spaces]
two comma-delimited integer parameters: a master gain multiplier (default = 2000) and a scale switch (off=0) to grant the user "hot-key" (Alt-B) access to a window for designing tones based on a non-traditional musical scale [see Addendum below]
nine comma-delimited integers for harmonics 1 - 9 of fundamental A1 at 110 Hz: each contains channel number for which the respective harmonic satisfies a 10 dB criterion, or 100 + that number if it satisfies the 20 dB criterion as well. A zero indicates that the harmonic does not satisfy either criterion for any channel.
nine comma-delimited integers for harmonics 1 - 9 of equal tempered fundamental Bb1
nine comma-delimited integers for harmonics 1 - 9 of equal tempered fundamental B1
etc. through 48th note G#4
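
The encoding of each harmonic entry in Table 7 can be unpacked as in the following Python sketch. The function name and the tuple layout are assumptions made for illustration; the rule itself (channel number, plus 100 if the 20 dB criterion is also met, zero for neither) is taken from the table.

```python
def decode_harmonic_entry(value):
    """Return (channel, meets_10_dB, meets_20_dB) for one musici.cus entry."""
    if value == 0:
        return (None, False, False)       # satisfies neither criterion
    if value > 100:
        return (value - 100, True, True)  # satisfies both criteria
    return (value, True, False)           # satisfies the 10 dB criterion only


print(decode_harmonic_entry(0))
print(decode_harmonic_entry(5))
print(decode_harmonic_entry(105))
```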

.WAV Files Generated by MusiCI

The 44000 bytes of data, less significant byte first, begin with the 45th byte of the file. Thus the file length is 44,044 bytes. The RIFF file header, identical in the case of each .WAV file generated by MusiCI, specifies PCM coding, a single (monophonic) channel, a 44 ks/s sample rate, a 16 bit/sample data length, and corresponding chunk length, data rate, and block alignment parameters.
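
The file layout described here matches the minimal 44-byte PCM header written by Python's standard wave module, so the arithmetic can be checked with a short sketch. It takes the 44 ks/s rate and 44000-byte data length stated in this section at face value and writes silence rather than a real stimulus. The helper name is ours.

```python
import io
import wave


def make_wav_bytes(samples, sample_rate):
    """Build a mono, 16-bit PCM .WAV image in memory from integer samples."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)               # monophonic
        w.setsampwidth(2)               # 16 bits per sample
        w.setframerate(sample_rate)
        # Less significant byte first, as in the files described above.
        w.writeframes(
            b"".join(int(s).to_bytes(2, "little", signed=True) for s in samples)
        )
    return buf.getvalue()


data = make_wav_bytes([0] * 22000, 44000)   # half a second of silence
print(len(data))                            # header bytes plus data bytes
print(data[:4], data[8:12], data[36:40])    # chunk identifiers in the header
```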

Glossary of Musical Terms

augmented triad
a chord composed of three notes: a root, a note a major third above the root, and a note a minor sixth above the root. The upper two notes also differ by a major third.
beating
an acoustical and musical phenomenon in which the simultaneous combination of two pure tones has a loudness that fluctuates at a rate equal to the difference in their frequencies.
chord
a simultaneous combination of three or more complex tones, based on different fundamental pitches
chromatic
a type of musical scale that includes pitches at semitone intervals (twelve distinct pitches per octave -- including both the black and white keys of a piano), in contrast, for instance, to the seven pitches per octave (white keys only -- "do, re, mi, fa, sol, la, ti") of a diatonic scale
complex tone
a musical tone that is comprised of more than one partial
diminished triad
a chord composed of three notes: a root, a note a minor third above the root, and a note a tritone above the root. The upper two notes also differ by a minor third.
equal tempered
a system of musical tuning in which every semitone interval corresponds to a frequency ratio of the twelfth root of two; all larger intervals are defined as combinations of various numbers of equal tempered semitones. This system, often called "equal temperament," is the one most commonly used today on keyboard and fretted string instruments in the West.
fundamental
the lowest frequency partial of a harmonic series; the first harmonic. The fundamental does not have to be present in a complex tone to define its pitch, but can be implied by the presence of a pattern of upper harmonics consistent with that pitch.
harmonic
a partial whose frequency is an integer multiple of some fundamental's frequency, making it part of the harmonic series of that fundamental. The frequency of the nth harmonic of a fundamental is equal to n times the frequency of the fundamental, and n is often referred to as the harmonic number.
interval
a musical difference between two pitches, corresponding to a particular ratio of frequencies between the two fundamentals involved; also used to refer to a simultaneous combination of two complex tones based on different fundamental pitches.
just
a system of musical tuning that emphasizes intervals corresponding to frequency ratios that are exact ratios of small integers. Several musically important intervals sound smoother (more consonant) when tuned justly rather than in equal temperament. Examples include the perfect fifth, perfect fourth, major third, minor sixth, minor third, and major sixth.
major second
an interval equal to two semitones; a "whole step".
major seventh
an interval equal to eleven semitones, one semitone less than an octave.
major sixth
an interval equal to nine semitones. A Just interval corresponding to a frequency ratio of 5/3.
major third
an interval equal to four semitones. A Just interval corresponding to a frequency ratio of 5/4.
major triad
a chord composed of three notes: a root, a note a major third above the root, and a note a perfect fifth above the root. The upper two notes differ by a minor third.
minor second
an interval of a single semitone; a "half step."
minor seventh
an interval equal to ten semitones.
minor sixth
an interval equal to eight semitones. A Just interval corresponding to a frequency ratio of 8/5.
minor third
an interval equal to three semitones. A Just interval corresponding to a frequency ratio of 6/5.
minor triad
a chord composed of three notes: a root, a note a minor third above the root, and a note a perfect fifth above the root. The upper two notes differ by a major third.
octave
an interval corresponding to a factor of two in frequency; an interval of twelve semitones; also sometimes used to refer collectively to all notes within the span of an octave.
partial
a pure tone (sine wave at a single frequency), a constituent of a complex musical tone
perfect fifth
an interval equal to seven semitones. A Just interval corresponding to a frequency ratio of 3/2.
perfect fourth
an interval equal to five semitones. A Just interval corresponding to a frequency ratio of 4/3.
pure tone
a sine wave of a particular frequency; a single partial
root
the note that forms the basis for a chord. The root typically is the lowest note of the chord, but inversions of the chord may shift one or more of its notes by an octave without changing the identity of the root pitch. In terms of MusiCI's main window labeling, root always applies to the lowest fundamental.
semitone
a "half step"; the interval between adjacent notes in a chromatic scale; the smallest interval in a diatonic scale. In equal temperament, the interval corresponding to a frequency ratio of the twelfth root of two.
triad
any chord composed of complex tones based on three different fundamentals
tritone
an interval of six semitones. In equal temperament, exactly half of an octave.
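
The difference between the just and equal tempered versions of the intervals defined in this glossary can be expressed in cents (hundredths of an equal tempered semitone). The Python sketch below computes that difference from the semitone counts and frequency ratios given in the entries above. The names are ours.

```python
import math

# (equal tempered semitones, just ratio numerator, just ratio denominator)
JUST_INTERVALS = {
    "minor third": (3, 6, 5),
    "major third": (4, 5, 4),
    "perfect fourth": (5, 4, 3),
    "perfect fifth": (7, 3, 2),
    "minor sixth": (8, 8, 5),
    "major sixth": (9, 5, 3),
}


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)


def tempered_minus_just(name):
    """Equal tempered size minus just size, in cents."""
    semitones, num, den = JUST_INTERVALS[name]
    return 100.0 * semitones - cents(num / den)


for name in JUST_INTERVALS:
    print("%-15s %+7.2f cents" % (name, tempered_minus_just(name)))
```

The fourth and fifth differ by only about two cents between the two systems, while the thirds and sixths differ by roughly fourteen to sixteen cents.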

MusiCI Addendum

A Non-Traditional Scale Option

N. B.: The information in this section should not be shared with any user who is a potential naïve subject for research involving the use of such stimuli in formal tests.

An additional hidden feature of MusiCI is a synthesizer capable of producing complex tones, intervals, and triads based on a non-traditional musical scale that has been called the Bohlen-Pierce (B-P) scale [see, for instance, M. V. Mathews and J. R. Pierce, "The Bohlen-Pierce Scale," in Current Directions in Computer Music Research, M. V. Mathews and J. R. Pierce, Eds., MIT Press, Cambridge, 1989, pp. 165-173]. Whereas the traditional Equal Tempered scale has a uniform semitone interval corresponding to a frequency ratio equal to the twelfth root of two, the Bohlen-Pierce scale has a uniform semitone interval corresponding to a frequency ratio equal to the thirteenth root of three. Thus, where the traditional scale has twelve semitones per factor of two in frequency (octave), the B-P scale has thirteen semitones per factor of three in frequency.

For listeners with normal hearing, performances with the B-P scale are easily distinguished from "normal music." So long as complex tones are composed only of odd harmonics, however, intervals and chords in the B-P system exhibit patterns of consonance and dissonance quite analogous to those found in the traditional system. Thus availability of a B-P scale option makes it possible to test for and examine some of the percepts related to traditional judgments of consonance both within and without a traditional musical context.

The traditional system uses seven-note diatonic scales -- subsets of the twelve-note chromatic scale -- to span an octave (e.g., in the key of C Major, the white keys of a piano). Similarly, the non-traditional B-P system uses nine-note subsets to span what we call a "decave" by analogy. [Others have dubbed this interval a "tritave" since it corresponds to a factor of three in frequency.]

The traditional equal tempered scale provides close approximations to consonant just intervals with frequency ratios of 3:2, 4:3, 5:4, and 6:5 [perfect 5th, perfect 4th, Major 3rd, and minor 3rd, respectively]. Similarly, the equal-tempered B-P scale provides close approximations to intervals with frequency ratios of 5:3, 7:5, and 9:7, which are quite consonant if only odd harmonics are used. Analogous to the traditional exact Major triad, whose fundamentals have frequency ratios of 4:5:6, are the B-P "Major" triad 3:5:7 and "minor" triad 5:7:9.

The B-P scale option is contained within every copy of MusiCI and is always available to automated test scripts, at the level of individual stimulus specification lines [via the scale system switch, see Table 3 above]. Root fundamental frequencies are specified in B-P equal-tempered semitones over a three-decave range (0-38) with respect to a 110-Hz A1. Second and third fundamentals are specified in B-P semitone intervals above the root (1-18), but are synthesized as the more consonant small integer frequency ratios where possible. All other stimulus specification parameters are the same as for the traditional system.
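
For reference, the B-P root codes described above map to frequencies as in the following Python sketch, which assumes only that code 0 is the 110-Hz A1 and that each step is one B-P equal-tempered semitone (the thirteenth root of three). The function name is ours.

```python
def bp_root_frequency_hz(step: int) -> float:
    """Frequency of B-P root code `step` (0 = 110 Hz, 13 steps per decave)."""
    if not 0 <= step <= 38:
        raise ValueError("B-P root code must be in the range 0-38")
    return 110.0 * 3.0 ** (step / 13.0)


for step in (0, 13, 26, 38):
    print(step, round(bp_root_frequency_hz(step), 1))
```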

A switch in each user's customization file [see Table 7 above] can be set to allow access to a window to design, compare, save and recall complex tones, intervals, and triads within the B-P system. That window -- opened by the "hot key" Alt-B and shown in Fig. 20 below -- in turn allows access to a comments window and to a utility for constructing, editing, saving, and recalling two-decave B-P scales, complete with an appropriate graphic musical keyboard. The latter window is illustrated in Fig. 21 below. These additional windows are essentially identical to those described in detail above for the traditional system. The significant variations are outlined in the following paragraphs.

Figure 20. The main window for designing complex tones based on the Bohlen-Pierce nontraditional scale.

The nine B-P note names extend from A to I, with the four "black notes" labeled B-sharp, E-flat, G-flat, and I-flat. We have chosen the following labels for the B-P intervals, in successive semitones from the unison: minor 2nd, Major 2nd, 3rd (9:7), perfect 4th (7:5), minor 5th, Major 5th (5:3), 6th, minor 7th, Major 7th, 8th, minor 9th, Major 9th, and Decave (3:1).
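
Counting the interval labels above in semitones from the unison places 9:7 at three steps, 7:5 at four, 5:3 at six, and 3:1 at thirteen. The Python sketch below compares each of those small integer ratios with the corresponding equal-tempered B-P interval, in cents. The names are ours and the step assignments are read from the list above.

```python
import math

# B-P semitone steps mapped to the small integer ratios named above.
NAMED_RATIOS = {3: (9, 7), 4: (7, 5), 6: (5, 3), 13: (3, 1)}


def bp_equal_ratio(steps):
    """Frequency ratio of `steps` equal-tempered B-P semitones."""
    return 3.0 ** (steps / 13.0)


def deviation_cents(steps):
    """Equal-tempered B-P interval minus its small integer ratio, in cents."""
    num, den = NAMED_RATIOS[steps]
    return 1200.0 * math.log2(bp_equal_ratio(steps) / (num / den))


for steps, (num, den) in NAMED_RATIOS.items():
    print("%2d steps vs %d:%d  %+6.2f cents" % (steps, num, den, deviation_cents(steps)))
```

All of the deviations are smaller than seven cents.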

Automatic selection of harmonics meeting the channel separation criteria is restricted to odd harmonics, although the user is always free to add even harmonics for comparison purposes. A separate musiciBP.cus file supplies customization information for the B-P fundamentals, and is identical in format to musici.cus except for the absence of a line supplying a master gain multiplier and a B-P window enabling switch [see Table 7 above].

Figure 21. The Scale window for use with the Bohlen-Pierce nontraditional scale.

Stimulus specification files for individual B-P stimuli carry a .SBP extension, and specification files for full B-P scales carry a .BPS extension. These files have formats that are identical to those of .SPC and .SCS files, respectively. A full B-P scale includes 27 .SBP files and 27 .WAV files, with each set of names extending from nob1 to nob27. There are 27 lines in each .BPS file.

When switching back and forth between traditional and B-P design and comparison windows, one needs to keep in mind that there can be separate 1.SBP and 1.SPC files, but only one 1.WAV file at any moment (and similarly for 2.SBP, 2.SPC, and 2.WAV). So both specification files are preserved, but redisplay and resynthesis may be necessary when switching from one window to the other. Things were designed this way to allow users to listen to rapid "One" - "Two" comparisons across the scale types.

MusiCI

Possible future enhancements:

MIDI keyboard control over looped .WAV files prepared by the Full Scale options.

Generation of recordings for use in sampled keyboards (including ADSR envelopes). Such recordings would have four address pointers: (1) Attack segment onset, the beginning of the samples, (2) Sustain onset, the beginning of the looped segment, (3) Loop-back point, the end of the sustain loop and beginning of the release segment -- loop-back to point (2) would occur from here unless note release had been signaled, in which case playback would continue on into the release segment, and (4) End of the release segment and the final sample.