Adrian Georgescu, 07/25/2010 09:14 am


= Acoustic Echo Cancellation =


SIP SIMPLE client is a Python software library that allows for easy development of Internet communications end-points based on SIP and related protocols for voice, rich presence, session-based instant messaging (IM), file transfers and desktop sharing. Other media types can be easily added by using an extensible high-level API. SIP SIMPLE client uses the [http://www.pjsip.org/pjmedia/docs/html/index.htm PJSIP media library] for audio processing (sound card abstraction, audio codecs and acoustic echo cancellation).

For more information see http://sipsimpleclient.com. The project is maintained by AG Projects, which can help the developer integrate the AEC implementation.

== Background ==

The present implementation of the Acoustic Echo Canceller (AEC) in the PJSIP media library does not perform at the desired level of quality. In practice, the speakerphone function does not work satisfactorily. As a result, the software phone cannot be used without a headset, which makes people use proprietary applications like Skype or Google Talk.

[[Image(http://www.pjsip.org/images/media-flow.jpg)]]

== Project goal ==

Replace the existing AEC in the PJMEDIA library (shown as echo.h in the above diagram, which employs the AEC of the Speex project) with a solution, still to be developed, that provides a high quality speakerphone user experience. The acoustic echo cancellation should perform comparably to proven commercial VoIP solutions like Skype or iChat, making a headset unnecessary.

The developed software will be released under an Open Source licence and distributed with the SIP SIMPLE client library. The deliverable is a program written in C that can be applied to the pjsip cvs trunk 1.0 with the '''patch''' command.

== Resources ==

General

=== PJMEDIA ===

This is the actual audio library used by SIP SIMPLE client. It is part of PJSIP, a complete framework for building SIP clients, available under an open source license.

=== Speex AEC ===

This is the AEC algorithm used by PJMEDIA that needs to be replaced by a better solution. The code is maintained by the http://speex.org project.

=== Andre Adrian AEC ===

A search for other implementations revealed a well documented algorithm, together with source code in the form of a C++ implementation.

The white paper gave enough insight into its author's understanding and implementation ability to consider his blueprint as an alternative. Unfortunately, the author's claims about its high quality could not be tested, because his C++ code dates back to 2004 and the application that used it could not be compiled on today's systems.

Eventually, this AEC was implemented by a student in plain C and integrated into the SIP SIMPLE project. It does not work properly: audio artifacts are present, and further debugging requires DSP knowledge which the developer did not have.

The current implementation can be used as a working example of how a third party AEC algorithm can be integrated with the SIP SIMPLE project. See the attached file for more information.

== Next steps ==

At this stage the problem has not been solved, because the developers lack knowledge in this specialized DSP area.

There are two choices for going further:

1. Implement an AEC from scratch and integrate it with PJSIP based on its API
2. Debug and fix the Andre Adrian AEC; an analysis of the code can reveal whether the concept is valid but poorly implemented

The AEC software solution should basically have three blocks:

=== The adaptive filter ===

The NLMS and Kalman algorithms are to be tested here; they approximate the transfer function of the room at the LE (local end).
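
As a rough illustration of the NLMS idea, the sketch below adapts a short FIR filter so that its output tracks the echo of the far-end signal in the microphone signal, and returns the residual. It is written in Python (the language of SIP SIMPLE client) for readability only; the actual deliverable would be C code inside PJMEDIA. The function name `nlms_cancel` and the parameters `taps`, `mu` and `eps` are assumptions made for this example and are not taken from PJMEDIA, Speex or the attached code.

```python
def nlms_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Cancel the far-end echo in `mic` with an NLMS adaptive FIR filter.

    far_end, mic: equal-length sequences of float samples.
    Returns (residual, weights). All names here are hypothetical.
    """
    weights = [0.0] * taps   # current estimate of the echo path (room response)
    history = [0.0] * taps   # last `taps` far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        # Echo estimate: far-end history filtered by the current weights.
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate
        # Normalise the step by the input power; eps avoids division by zero.
        power = sum(h * h for h in history) + eps
        step = mu * e / power
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(e)
    return residual, weights
```

With a noise-free microphone signal that is just a short FIR echo of the far end, the weights should converge towards that echo path and the residual towards zero. A real room needs far more taps (hundreds to thousands at typical sample rates) and a frame-based C implementation.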

=== Double-talk detector ===

The DTD block will be used to detect double talk, that is, both ends (LE and FE, the far end) speaking at the same time, as needed for full duplex operation. The following algorithms can be considered for implementing the DTD: Geigel, cross-correlation, VIRE (variable impulse response) and the Gansler optimization (used in Speex).
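
Of the algorithms listed, Geigel is the simplest to sketch: double talk is declared when the microphone sample is larger than a fraction of the loudest recent far-end sample. The Python sketch below is only an illustration; the class name `GeigelDTD` and the parameters `window` and `threshold` are assumptions made for this example, and the value 0.5 assumes the echo path attenuates the far-end signal by at least 6 dB.

```python
class GeigelDTD:
    """Geigel double-talk detector (illustrative sketch, hypothetical names)."""

    def __init__(self, window=128, threshold=0.5):
        self.threshold = threshold      # 0.5 assumes >= 6 dB echo path loss
        self.recent = [0.0] * window    # magnitudes of recent far-end samples

    def update(self, far_sample, mic_sample):
        """Feed one sample pair; return True if double talk is declared."""
        self.recent = self.recent[1:] + [abs(far_sample)]
        return abs(mic_sample) > self.threshold * max(self.recent)
```

Typically the adaptive filter's weight update would be skipped while `update()` returns True, so that local speech does not disturb the echo path estimate. Note that this sketch also returns True when the far end is silent and the microphone is not, which is harmless because there is no echo to adapt to in that case.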

=== Non-linear processor ===

The NLP block will eliminate the residual echo left after cancellation, along with sounds that are considered non-speech noise (the laptop's fan, the keyboard, the HDD and similar). This block can also host processing tasks that are deeply nonlinear, such as increasing the SNR for low-volume speech signals, eliminating clipping, smoothing transitions between quiet and louder signals (voices only) and correcting errors from the previous processing algorithms.
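
One classic and very simple form of non-linear processing is a center clipper, which zeroes samples whose magnitude stays below a threshold and passes everything else unchanged. The Python sketch below shows only that one technique, not the full set of tasks described above; the function name `suppress_residual` and its parameters are assumptions made for this example.

```python
def suppress_residual(residual, threshold=0.01, double_talk=False):
    """Center-clip the residual signal (illustrative sketch, hypothetical names).

    Samples with magnitude at or below `threshold` are replaced by 0.0.
    During double talk the signal is passed through untouched, so that
    quiet local speech is not clipped away.
    """
    if double_talk:
        return list(residual)
    return [0.0 if abs(s) <= threshold else s for s in residual]
```

For example, `suppress_residual([0.005, -0.002, 0.5, -0.3])` returns `[0.0, 0.0, 0.5, -0.3]`. A production NLP would normally adapt the threshold to the far-end level and add comfort noise instead of hard silence.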

The idea of the supplementary block is to extend the performance of the component and to perform robust cancellation (add more control logic, improve the DTD using mutual information for stereo devices, improve the stability of the adaptive filter using the Wiener-Hammerstein method to maintain filter convergence, and use a Volterra model), and to use the maximum information from all the codecs that are included.

ag-aec.tgz - Andre Adrian AEC implementation integrated with PJSIP (35.8 kB) Adrian Georgescu, 11/16/2009 06:15 pm