LanScape Support Forum: DC offset compensation


Topic: DC offset compensation


hermes
Junior

Joined: October 27 2006
Posts: 64

Posted: June 16 2008 at 4:44pm

Most sound cards introduce some DC offset, and the amount differs from card to card. For an ordinary phone call this is not a problem, but if you are doing speech recognition over VoIP it is a handicap. There are many algorithms to compensate for DC offset. We use a fast dynamic compensation algorithm based on the FFT. If you are interested in integrating this improvement, feel free to ask us any questions.

Code:

Mic --> Offset Compensation module --> Codec module --> RTP packet
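The FFT-based algorithm mentioned above is not shown in this thread. As a rough illustration of what an offset compensation stage can look like, here is a different and simpler technique: a one-pole DC-blocking high-pass filter, y[n] = x[n] - x[n-1] + r*y[n-1]. The names DcBlocker, dc_blocker_init and dc_blocker_process are hypothetical, and the pole radius r is a tuning assumption (values just below 1.0 are typical).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Filter state carried between blocks so the output stays continuous. */
typedef struct {
    float prev_in;   /* x[n-1] */
    float prev_out;  /* y[n-1] */
    float r;         /* pole radius, slightly below 1.0 (assumed tuning value) */
} DcBlocker;

static void dc_blocker_init(DcBlocker *s, float r)
{
    assert(s != NULL);
    s->prev_in = 0.0f;
    s->prev_out = 0.0f;
    s->r = r;
}

/* Filters signed 16-bit PCM samples in place. */
static void dc_blocker_process(DcBlocker *s, int16_t *samples, size_t count)
{
    assert(s != NULL && samples != NULL);
    for (size_t i = 0; i < count; ++i) {
        float x = (float)samples[i];
        float y = x - s->prev_in + s->r * s->prev_out;
        s->prev_in = x;
        s->prev_out = y;
        /* Clamp to the 16-bit range before converting back. */
        if (y > 32767.0f) y = 32767.0f;
        if (y < -32768.0f) y = -32768.0f;
        samples[i] = (int16_t)y;
    }
}
```

A constant input decays toward zero at the output, while speech-band content passes almost unchanged. A smaller r removes the offset faster but attenuates more of the low frequencies.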

hermes
Junior

Joined: October 27 2006
Posts: 64

Posted: June 17 2008 at 4:17pm

Perhaps it would be better if there were a 'Microphone Sample Callback', like the 'RTP Callback', to modify wave samples.

support
Administrator

Joined: January 26 2005
Location: United States
Posts: 1666

Posted: June 18 2008 at 9:36am

Hi Hermes,

Good post.

You know, we have not paid much attention to DC offset in local multimedia recorded sample block data. If there is a DC offset component to the signal, it must be small, right?

If you can, elaborate a bit more on the DC offset characteristics you have experienced in your development career. Also elaborate on how this affects speech recognition engines you have used. We are curious about your findings.

In the meantime, you may want to experiment with the following undocumented API procedure:

Code:

// data struct passed to a user registered callback proc. It allows application
// software to access raw recorded audio buffers.
//
typedef struct
{
    AUDIO_BANDWIDTH AudioBandwidth;     // represents the format and rate of the sampled data.
    void *pSampleBuffer;                // address of the sample buffer.
    unsigned long BufferLengthInBytes;  // the number of bytes in the sample buffer.
    void *pUserData;                    // the user supplied callback data. this will be set to the same value
                                        // that was specified when SetAudioRecordCallback() proc was called
} AUDIO_RECORD_DATA;

typedef BOOL (VOIP_API *AUDIO_RECORD_CALLBACK_PROC)(AUDIO_RECORD_DATA *pAudioRecordData);

TELEPHONY_RETURN_VALUE VOIP_API SetAudioRecordCallback(
            SIPHANDLE hStateMachine,
            AUDIO_BANDWIDTH DesiredBandwidth,
            AUDIO_RECORD_CALLBACK_PROC pAudioRecordCallback,
            void *pUserData
            );

It will allow your app to gain access to locally recorded multimedia sample blocks just after they have been recorded and just before they get used by the media engine. The “DesiredBandwidth” parameter can be set to anything because it is ignored.

The callback that this API registers is called every 20 ms with a block of recorded PCM data. The samples are recorded by the multimedia hardware and consist of 20 ms of 22050 Hz PCM data. Samples have a data type of “short” (signed 16-bit samples).

The sample block passed to the app can be modified however the app wants. This is where you could place your DC offset compensation logic.
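As a minimal sketch of such a callback, the code below tracks a slowly varying DC estimate across blocks and subtracts it from each sample in place. Several things here are assumptions rather than facts from the media engine: the typedefs at the top are stand-ins so the sketch compiles on its own (a real build would take them from the engine header), DcTracker and RemoveDcCallback are hypothetical names, and returning TRUE is a guess because the meaning of the BOOL return value is not described in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so this sketch compiles on its own. A real build would take
   these from the media engine header instead (assumption). */
typedef int BOOL;
#ifndef TRUE
#define TRUE 1
#endif
#define VOIP_API
typedef int AUDIO_BANDWIDTH;

typedef struct {
    AUDIO_BANDWIDTH AudioBandwidth;
    void *pSampleBuffer;
    unsigned long BufferLengthInBytes;
    void *pUserData;
} AUDIO_RECORD_DATA;

/* Hypothetical application state, handed to the engine as pUserData. */
typedef struct {
    float dc_estimate;  /* tracked DC level, in sample units */
    float alpha;        /* per-block smoothing factor, 0 < alpha <= 1 */
} DcTracker;

static BOOL VOIP_API RemoveDcCallback(AUDIO_RECORD_DATA *pData)
{
    assert(pData != NULL && pData->pSampleBuffer != NULL && pData->pUserData != NULL);

    DcTracker *t = (DcTracker *)pData->pUserData;
    short *samples = (short *)pData->pSampleBuffer;
    size_t count = pData->BufferLengthInBytes / sizeof(short);
    if (count == 0)
        return TRUE;

    /* Mean of this block, then move the running estimate toward it. */
    long sum = 0;
    for (size_t i = 0; i < count; ++i)
        sum += samples[i];
    float block_mean = (float)sum / (float)count;
    t->dc_estimate += t->alpha * (block_mean - t->dc_estimate);

    /* Subtract the estimate in place, clamping to the 16-bit range. */
    for (size_t i = 0; i < count; ++i) {
        float y = (float)samples[i] - t->dc_estimate;
        if (y > 32767.0f) y = 32767.0f;
        if (y < -32768.0f) y = -32768.0f;
        samples[i] = (short)y;
    }
    return TRUE;  /* assumed to mean "continue"; not confirmed in this thread */
}
```

Registration would then look something like SetAudioRecordCallback(hStateMachine, bandwidth, RemoveDcCallback, &tracker), with tracker kept alive for as long as the callback is registered. A small alpha (for example 0.05) makes the estimate follow the offset over roughly a second without reacting to individual words.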

Support

Notes:

This post discusses VOIP Media Engine undocumented API procedures that are used for internal test purposes. Do not use these API procedures in your VOIP applications.

hermes
Junior

Joined: October 27 2006
Posts: 64

Posted: June 18 2008 at 11:26am

Thanks a lot!

It's just what I was looking for. Undocumented API procedures are a lifesaver for us.

We've tested a lot of sound cards. Most of them have a small DC component.

Voice recognition engines are first trained with a lot of different voices. These voice files are recorded without DC offset. When you use one of these engines, it is better to run an adaptation process before you dictate.
When you use a sound card with a DC offset, the voice formants differ from the 'base formants' the engine was trained on, and the offset can even cause the voice signal to saturate. For this reason, compensating for DC offset is good practice.
I don't know if I have explained it clearly.

support
Administrator

Joined: January 26 2005
Location: United States
Posts: 1666

Posted: June 19 2008 at 8:00am

Good explanation.

We did not know that voice recognition accuracy and “training” are sensitive to DC offset in the voice signal. You would think the speech recognition engines would “filter” out a DC offset component themselves. Hmmm…

hermes
Junior

Joined: October 27 2006
Posts: 64

Posted: June 19 2008 at 9:11am

Not all speech recognition engines have a DC compensation filter. DC offset isn't critical for recognition, but removing it can help.
We have worked with all kinds of speech recognition engines and we prefer to compensate for DC offset ourselves.
Thanks again.
