Open Speech Signal Processing Platform Workshop

October 17, 2014
National Institutes of Health, Bethesda, MD

Workshop Report

On this page:

Background and Charge of Workshop
Panel Presentations and Discussion
Poll
Recommendations

Background and Charge of Workshop

The National Institute on Deafness and Other Communication Disorders (NIDCD) held an internet-based workshop to consider the need and potential uses for a portable signal processing tool that provides substantial computing power for real-time processing of the acoustic environment. The vision for this research tool would be to incorporate open-source design principles to enable new types of basic psychophysical research studies beyond what is widely done today. Ideally, the research community would use this tool to lower barriers for hardware and software refinement, accelerate studies with novel acoustic processing algorithms, and facilitate translation of these advances into widespread use in hearing aids, cochlear implants, and consumer electronics devices.

The concept of hardware/software tools that can be readily reconfigured has become increasingly popular with the biomedical engineering community. Embedded microprocessor software development tools have emerged that provide high-quality integrated development environments (IDEs) at costs far less than were available just a decade ago; some of these are open-source software tools. Low-cost single board computers are commercially available that are both extensible and open. They support easy data acquisition, provide substantial computing power for signal processing of signals from multiple physiological sensors, and enable studies to be done outside of the lab environment.

Moore’s Law describes the ongoing trend of transistor density within integrated circuits doubling at a predictable interval. This trend has been clearly evident in consumer electronics devices that provide regular increases in the computational power available in portable devices and lowered costs for the computer chips that power these devices. Semiconductor chip manufacturers provide regular advances in the technology ready for use in commercial devices that could be used to accelerate research. New paradigms, such as the “internet of things,” are emerging based on low-cost, mobile sensors that provide increasingly sophisticated streams of data likely to enable development of new processing algorithms for mobile health. All of these technology trends could be used to accelerate research and development for acoustic signal processing algorithms.

Workshop participants were asked to consider these trends in preparation for the workshop. The meeting began with the charge to “consider the potential to accelerate research and development of bold new algorithms for speech signal processing through the use of a research tool that employs open design principles and provides next generation computing power in a portable form factor.”

Panel Presentations and Discussion

Six presenters described their experiences with the invention of new signal processing algorithms that showed promise for speech enhancement, as well as research studies based on well-known algorithms that required the investigators to reinvent that signal processing to support the research studies. Panel members were each asked to speak for five to ten minutes to provide their perspective on the concept. Their experiences indicated that this tool would have been used to accelerate research studies and improve translation of those results into widespread use. This was followed by an open discussion by all participants, and a poll was used to gather specific recommendations. The results are summarized below.

It was clear that the community is effectively using desktop computer tools to support rapid development and revision of signal processing software used in research studies. High-quality commercial and open-source tools already exist that provide powerful signal processing libraries and facilitate sharing of small sections or large libraries of computer source code. These integrated development environments are very capable, widely used by experts in the field, and represent a de facto standard for acoustic signal processing across academic groups.

Converting insights gained from research into a well-documented library of source code able to support real-time processing, however, requires an additional level of development that is difficult to implement within the constraints of a research lab. The knowledge, skills, and abilities required for extensive optimization, verification, and formal documentation of the source code library are substantial. The type of redevelopment needed for robust real-time execution requires a specialized skill set that is typically tied to a computer chip’s architecture. It is difficult for an academic lab to sustain this level of technical expertise for what is typically an intermittent task that ends once a research report is published in the peer-reviewed literature. However, there would be substantial value and widespread use for high-quality “research grade” software, if it were available.

Manufacturers of commercial hearing aids and personal sound processors have extensive experience with design, optimization, and marketing of sound-processing algorithms based on proprietary technologies. Proprietary integrated circuits are often used because of market constraints and the need to satisfy user demands for long battery life and concealability of the device. Intellectual property concerns limit the ability of outside groups to determine the computational resources available in these devices and the amount of engineering effort that might be required to adapt a novel algorithm coded in a high-level computer language into a working prototype based on a proprietary hearing aid.

The result is a wide “valley of death” that limits the ability to translate innovations from academic research into widespread commercial use. This could be narrowed by creating an open and extensible research platform that implements the basic functions of a personal sound processor or hearing aid with documented source code, optimized to function in real time. Over time, this effort would allow the research community to accumulate a large library of algorithms. Some of these could be shared as open-source code in the public domain, but the system must also allow use of code libraries that remain proprietary so that commercial translation is possible. Open code libraries are also needed to provide a base set of functions that could be readily ported to the technology of tomorrow, creating a research tool that could be widely used to lower the barriers for future research and development projects.

The panel noted a compelling need to sustain research that extracts every possible user benefit from existing signal processing algorithms currently known to the field. A shared library of well-documented computer source code would save researchers the time and expense required to write their own implementations, ensure that the details of its operation were not changed unintentionally, and make selection of operating parameters available for careful inspection and modification when needed for a particular study. The base code library should include basic hearing aid functions; multiband amplitude compression and acoustic feedback cancellation were noted as essential elements. Other software feature requests included: speech enhancement/noise reduction algorithms, spectral analysis tools, gain prescription tools, directional microphone processing libraries, frequency lowering functions, real-time learning algorithms, automated environment classification and logging, and binaural processing functions.
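As an illustration of the kind of function such a base library might document, the following is a minimal sketch of the static gain rule for a single compression band. It is not taken from the workshop. The struct, the function name, and the parameter values are hypothetical, and a full multiband compressor would also need a filter bank to split the signal into bands and attack/release smoothing of the level estimate, both omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameters for one compression band (illustrative only). */
typedef struct {
    double threshold_db;   /* input level at which compression begins */
    double ratio;          /* compression ratio, e.g. 2.0 means 2:1 */
    double linear_gain_db; /* gain applied at or below the threshold */
} band_compressor;

/*
 * Return the gain in dB to apply to a band whose input level is level_db.
 * Below the threshold the gain is constant. Above it, each extra dB of
 * input produces only 1/ratio dB of extra output.
 */
double compressor_gain_db(const band_compressor *c, double level_db)
{
    assert(c != NULL && c->ratio >= 1.0);
    if (level_db <= c->threshold_db)
        return c->linear_gain_db;
    double excess_db = level_db - c->threshold_db;
    return c->linear_gain_db - excess_db * (1.0 - 1.0 / c->ratio);
}
```

Keeping the threshold, ratio, and gain in a plain struct is one way to make the operating parameters available for inspection and modification, as the panel requested.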

Specific software recommendations were made. The computer programming language thought to be most suitable for this task was C or C++. This would allow easy portability to new hardware platforms and the possibility of providing optimized sections of code in assembly language when absolutely necessary for real-time execution. It must be possible to incorporate commercial function libraries as compiled code. Response latency was noted as a key tradeoff for real-time operation. A period of 8 milliseconds is considered the maximum acceptable delay for an acoustic input, while other signals, such as a telephone input, could still function with as much as a 30 millisecond processing delay. Standardized benchtop verification test measures would be required, complete with test data and documentation of the procedures, so that validation procedures could be repeated and extended by other researchers.
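The latency limits above translate directly into a ceiling on how many samples can be buffered. The sketch below shows that arithmetic. The function name is hypothetical, and the sample rates used in the usage note are assumptions, since the report does not specify one.

```c
#include <assert.h>

/*
 * Largest whole number of samples that fits inside a latency budget.
 * Integer division rounds down, so the result never exceeds the budget.
 */
long max_samples_for_latency(long sample_rate_hz, long budget_ms)
{
    assert(sample_rate_hz > 0 && budget_ms >= 0);
    return sample_rate_hz * budget_ms / 1000;
}
```

At an assumed 16 kHz sample rate, the 8 millisecond acoustic limit allows at most 128 samples of total delay, and the 30 millisecond limit allows 480. This treats the entire budget as buffering, which is optimistic: converter delay and computation time would also count against it, so a practical block size would be smaller.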

Participants emphasized the need to continually incorporate new components into the signal processing chain as an equally compelling requirement for both the hardware and software architecture. This is an ambitious goal, since modifications in the signal made by one function are likely to affect operation of a later component. Parallel processing followed by selection of the most desirable output is another desirable feature. The provision for extensive software redevelopment, however, is a key requirement to support development of novel algorithms that provide speech enhancement, noise reduction, automated adjustment of the device based on environmental factors, and other new ideas not yet conceived.
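One common way to let new components be added to a processing chain is to give every stage the same function signature and run the stages in order. The sketch below assumes that design. It is not an architecture specified by the workshop, and all type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Every stage processes a block of samples in place. */
typedef void (*stage_fn)(float *samples, size_t n, void *state);

/* A stage pairs a processing function with its private state. */
typedef struct {
    stage_fn fn;
    void *state;
} stage;

/* Run each stage in order over the same block of samples. */
void run_chain(const stage *stages, size_t n_stages, float *samples, size_t n)
{
    assert(stages != NULL && samples != NULL);
    for (size_t i = 0; i < n_stages; ++i)
        stages[i].fn(samples, n, stages[i].state);
}

/* Example stage: scale every sample by the gain that state points to. */
void gain_stage(float *samples, size_t n, void *state)
{
    const float gain = *(const float *)state;
    for (size_t i = 0; i < n; ++i)
        samples[i] *= gain;
}
```

Because a stage is only a function pointer plus opaque state, a stage could in principle be supplied as compiled code without its source. The sketch does not address the harder problem noted above, that a change made by one stage can affect the behavior of later ones.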

Extensibility of the base hardware platform was identified as another key requirement, in order to access sustained increases in computing power and incorporate novel hardware components. Optional use of a field programmable gate array in the signal processing chain was one example cited. Provisions should be made to allow adoption of new generations of processors to provide regular increases in the computation power of the device over time. The ability to add wireless connectivity was of interest to many, and the architecture should allow incorporation of future wireless chip sets to access predictable advances from this field. Most participants believed that a wired earpiece would be sufficient for initial research studies, and this would avoid additional delays in the real-time signal processing loop required for wireless transmission of the signal from the device to the ear.

Portability was another key design requirement. This would enable field testing with novel signal processing functions in various environments beyond the acoustic isolation chamber, which is the test environment used for most academic studies. Limitations imposed by real-world environments can be identified and ultimately resolved if they are readily included in the development cycle. Portable devices also enable distribution of a complete working system to outside labs. This would facilitate collaborations between signal processing designers and investigators with the audiological skills needed to run studies with volunteers who have hearing loss.

Participants agreed that this research tool offered great potential to accelerate research and development efforts for acoustic signal processing. There will be many issues to address in the development of the device and the infrastructure required to use it effectively. Once developed, however, it would be widely used as a reference platform and could become a de facto standard for the field. It would lower barriers for development of new software and speed the incorporation of new hardware capabilities as they become available from commercial semiconductor chip manufacturers. Ideally, it would narrow the gap between academic development and commercial use of acoustic signal processing algorithms.

Poll

All workshop participants answered the following questions via a poll at the end of the workshop. The results of the short answer questions are reflected in the summary of the workshop discussion provided above.

Can you envision this tool lowering barriers or increasing efficiency for new or existing studies in your lab? (Check all types that apply.)

a. Tests with normal hearing human subjects

14/16 (88%)

b. Tests with hearing impaired human subjects

15/16 (94%)

c. Tests with sound-processing functions developed by outside labs

12/16 (75%)

d. Sharing sound-processing functions developed in your lab with outside labs

15/16 (94%)

Which feature set is most interesting for your group?

a. Easy modification of research grade, open software provided with the device

3/17 (18%)

b. Increased computational power to enable real-time tests with novel algorithms

7/17 (41%)

c. Portability to allow mobile testing outside of the lab environment

7/17 (41%)

d. Wearability to allow longer-term studies that include learning and adaptation

0/17 (0%)

Development of this platform will include creating a base set of reference software with open-source code making those functions easily and fully changeable by the user. Please specify the sound-processing functions that are essential for the base feature set, for example: frequency lowering, multiband compression, etc. [Responses are summarized above.]

What additional features would be “nice to have?” [Responses are summarized above.]

Is the ability to wirelessly send/receive real-time control parameters into the device essential?

Yes 9/16 (56%)

No 7/16 (44%)

Is the ability to wirelessly send an audio signal to the earpiece essential?

Yes 4/16 (25%)

No 12/16 (75%)

What studies might this device be used for, or what obstacles might this device overcome, that are currently difficult or impossible to do with a commercial sound processor or hearing aid? [Responses are summarized above.]

What essential validation test results should be provided with the hardware/software in order to provide sufficient assurances to investigators that this device can be used in a research study? [Responses are summarized above.]

Which sound-processing functions would be of most interest for your field?

a. Novel speech sound-processing strategies

5/17 (29%)

b. Novel noise removal sound-processing strategies

4/17 (24%)

c. Hearing aid functions but with the ability to modify code to control details of each operation

7/17 (41%)

d. Other (please describe)

Binaural sound-processing strategies

How much would you be willing to spend, in U.S. dollars, for each research device?

a. $100 or less

0/17 (0%)

b. $100 to $200

0/17 (0%)

c. $200 to $400

2/17 (12%)

d. $400 to $1000

9/17 (53%)

e. $1000 to $2000

6/17 (35%)

What other activities might the NIDCD support to foster adoption of this device? [Responses are summarized above.]

Recommendations

The following actions were recommended by workshop participants:

The NIDCD-supported research community has a critical need for an open, extensible, and portable device that supports acoustic signal processing in real time. The platform should provide substantial computing resources in its initial design and use an architecture that allows incorporation of faster processors and other integrated circuits, such as those that support wireless connectivity, in the near future.

Research studies would be accelerated through the release of an open and extensible reference library of sound-processing functions that implement basic hearing aid functions, including fitting algorithms. The base library would consist of open-source software that can be optimized to satisfy real-time sound-processing constraints. This code should be easily portable to new hardware platforms, include substantial documentation of its operation within the source code, and provide test data with documented verification procedures.

It must be extensible so that new signal processing libraries can be added as either open-source code or as binary code. Developers must be able to use the system to create new signal processing functions that can be either shared freely or licensed as proprietary software that is commercialized for profit.

Today’s hearing aids provide limited improvements of speech intelligibility in noise; this represents a critical need of the hearing aid user. This research tool is needed to shorten the research and development cycle for innovative acoustic signal processing, speech enhancement, and noise reduction algorithms. Ideally, this tool will facilitate translation of algorithms from the research laboratory into widespread use by personal sound processors, hearing aids, cochlear implants, and consumer electronics devices.

Report authors:

Roger L. Miller, Ph.D.
Program Director, Neural Prosthesis Development
National Institute on Deafness and Other Communication Disorders
National Institutes of Health

Amy Donahue, Ph.D.
Program Officer, Hearing and Balance
Deputy Director, Division of Scientific Programs
National Institute on Deafness and Other Communication Disorders
National Institutes of Health

Last Updated Date:

June 16, 2015