
(8.D.2.5) GCCV7

Overview

Computes the 2D angular direction of arrival (DOA) of audio sources arriving at the mic array

Discussion

This module computes the 2D angular direction of arrival (DOA) of audio sources arriving at the mic array.

Input Pins:

    Input Pin 1:   Multichannel frequency domain data, usually the output of a WOLA Analysis module.
                          The number of channels must match the number of microphones in the micGeometry argument.

Output Pins:

    Output Pin 1:   Estimated angular direction of arrival for this WOLA block in degrees. This will be an integer between 0 and 360.

    Output Pin 2:    Confidence value for this angle estimate. This will be an integer. Larger values mean more confidence.

    Output Pin 3:    Array of 360 floats denoting a histogram of possible angles of arrival for this block. Output pin 1 will be the peak of this histogram.
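
The relationship between Output Pin 3 and Output Pin 1 can be illustrated with a minimal sketch in C. The function name `histogram_peak_angle` and the tie-breaking rule are assumptions made for illustration; the module's actual peak picking (which also involves smoothing, averaging, and minimum-distance checks) is not shown here.

```c
#define NUM_DIRS 360  /* one histogram entry per degree, as on Output Pin 3 */

/* Hypothetical helper: returns the angle (array index, in degrees) of the
 * largest histogram value. Ties resolve to the lowest angle (an assumption). */
int histogram_peak_angle(const float hist[NUM_DIRS])
{
    int peak = 0;
    for (int i = 1; i < NUM_DIRS; ++i) {
        if (hist[i] > hist[peak]) {
            peak = i;
        }
    }
    return peak;
}
```

A caller would pass the 360-float histogram for the current block and treat the returned index as the angle estimate.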

Module Arguments:

    numMics:   number of microphones being used

    interpFactor:   Interpolation factor for better angular resolution at the cost of more CPU resources

    fftSize:    FFT size used by the WOLA Analysis block preceding the GCCV7 module

    Fs:    Sampling rate of time domain data entering WOLA Analysis block

    LowerBin:    Frequency bin index of lowest WOLA bin entering the GCCV7 input pin
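
Several of the module's size variables appear to follow from these arguments. The sketch below is an assumption, not the module's documented constructor logic: the helper names are hypothetical, and the relations were chosen only because they reproduce the default values listed under Variables (numBins = 129 for fftSize = 256 when all bins from 0 to fftSize/2 are passed in, numPairs = 6 for numMics = 4, interpSR = 64000 for Fs = 16000 and interpFactor = 4).

```c
/* Hypothetical helpers; each relation is an assumption that matches the
 * documented defaults, not a confirmed formula from the module source. */

/* Number of non-negative frequency bins of a real FFT: 256 -> 129. */
int num_bins_from_fft(int fft_size)
{
    return fft_size / 2 + 1;
}

/* Number of distinct microphone pairs: 4 mics -> 6 pairs. */
int num_mic_pairs(int num_mics)
{
    return num_mics * (num_mics - 1) / 2;
}

/* Sample rate after interpolation: 16000 Hz with factor 4 -> 64000 Hz. */
float interp_sample_rate(float fs, int interp_factor)
{
    return fs * (float)interp_factor;
}
```

Checking the helpers against the defaults is a quick way to sanity-check a custom configuration before building the module.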

 

Inspector Tuning Variables:

    BlocksPerHistogram: For each WOLA block, the module produces an angle estimate for each bin. These estimates are
    accumulated over BlocksPerHistogram WOLA blocks into a histogram, and the peak of the histogram is output as the
    angle estimate. Larger values of BlocksPerHistogram give more accuracy but slower response time. The value of
    BlocksPerHistogram does not affect MIPS.

    smoothingCoefficient: The smoothing coefficient (SC) is a float which takes on values between 0 and 1.
    SC operates on the relevant portion of the time domain correlation data between every mic pair. It performs
    exponential smoothing/averaging in time over each point of the correlation functions. The equation is:
    state(n) = SC*(new data) + (1-SC)*state(n-1). This smoothing increases the accuracy of the DOA algorithm.
    SC trades off accuracy against responsiveness. The smaller the value of SC, the longer the tail
    of the exponential smoothing/averaging.
    SC smaller ---> more smoothing, more accuracy, but less responsiveness
    SC larger ---> less smoothing, more responsiveness, but less accuracy
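
The two tuning variables above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the module's implementation: the function names are hypothetical, and the histogram update adds an unweighted count of 1 per estimate, whereas the module may apply weighting.

```c
#include <stddef.h>

#define NUM_DIRS 360  /* histogram entries, one per degree */

/* state(n) = SC*(new data) + (1-SC)*state(n-1), applied to each point of the
 * correlation data. state is updated in place. */
void smooth_correlation(float *state, const float *new_data, size_t len, float sc)
{
    for (size_t i = 0; i < len; ++i) {
        state[i] = sc * new_data[i] + (1.0f - sc) * state[i];
    }
}

/* Adds one count per per-bin angle estimate for one WOLA block.
 * Angles are wrapped into 0..359 (wrapping is an assumption). */
void accumulate_block(float hist[NUM_DIRS], const int *bin_angles, size_t num_bins)
{
    for (size_t i = 0; i < num_bins; ++i) {
        int a = bin_angles[i] % NUM_DIRS;
        if (a < 0) {
            a += NUM_DIRS;
        }
        hist[a] += 1.0f;
    }
}
```

Calling `accumulate_block` once per WOLA block for BlocksPerHistogram blocks, then reading the histogram peak and clearing the histogram, mirrors the accumulate-then-report cycle described above. With SC = 1 the smoothing state simply tracks the new data; with SC near 0 it changes slowly.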

Type Definition

typedef struct _ModuleGCCV7
{
    ModuleInstanceDescriptor instance;  // Common Audio Weaver module instance structure
    INT32 numMics;  // Number of mics in the array
    INT32 interpFactor;  // Interpolation factor
    INT32 fftSize;  // Size of FFT upstream from this module
    INT32 MultiSource;  // Select single or multiple audio sources reported per block
    FLOAT32 Fs;  // Time domain sampling rate before WOLA/FFT
    INT32 LowerBin;  // Frequ bin index of lowest frequ component in the WOLA block
    FLOAT32 smoothingCoefficient;  // Smoothing coefficient for exponential smoothing of frequ domain CC data
    INT32 BlocksPerHistogram;  // Number of blocks used to fill histogram before doing smoothing and averaging
    INT32 interpBS;  // Block Size at interpolated rate
    FLOAT32 interpSR;  // Sample rate at interpolated rate
    INT32 numDirs;  // Number of look directions
    INT32 maxNumSources;  // Maximum number of sources to identify
    INT32 useInputWeightingPin;  // Use input weighting pin
    INT32 smoothing;  // 0 No Smoothing 1 Smoothing
    INT32 useSmoothingPin;  // Select use control pin for smoothing coefficient
    INT32 combineCongruentMicPairs;  // Select combining congruent mic pairs for MIPS savings
    INT32 boxcarDelay;  // Half length of rectangular window for histogram smoothing
    INT32 gaussDelay;  // Half length of gaussian window for histogram smoothing
    FLOAT32 gaussSigma;  // Standard Deviation for Gaussian window
    INT32 NumHistogramStates;  // Number of full histograms used for temporal averaging
    INT32 QCOpt;  // Select optimization. DONT CARE, NOT IMPLEMENTED YET
    INT32 doHistogramProcessing;  // Add histogram processing to module outputs
    INT32 numBins;  // Number of bins. This should be the input block size
    INT32 numPairs;  // Number of microphone pairs
    INT32 xCorrsZeroSamp;  // 0th sample index in the xcorr data
    INT32 statesSize;  // Size of states array for frequency domain smoothing
    INT32 tdCurrSize;  // Size of temp td array for time domain smoothing
    INT32 maxDelay;  // Maximum of gaussDelay, boxcarDelay
    INT32 HistStatePtr;  // Index of oldest entry in histogram buffer
    INT32 blockCtr;  // Counter to keep track of number of blocks used to update histogram
    INT32 blockCtrReset;  // Counter to keep track of number of blocks used between reset to zero of avgHistogram
    INT32 blockCtrMultiOpt;  // Counter for multisource/multiblock optimization
    INT32 numPairsToUse;  // Number of non-redundant microphone pairs
    INT32 NSRMP;  // Number of sets of redundant mic pairs
    INT32 maxNMPiS;  // max number of redundant pairs in all sets of redundant pairs
    INT32 NNRMP;  // Number of non-redundant mic pairs
    FLOAT32 epsilon;  // Small number
    FLOAT32 c;  // Speed of sound in meters per sec
    FLOAT32 windowSize;  // How much to extend the look window when comparing measured delays with theoretical time delays. This is a relative number.
    INT32 removalWindowDelta;  // Number of samples to remove when filtering DOA Peaks
    INT32 doHistSmoothing;  // Specifies whether to do smoothing of histogram
    INT32 doHistAveraging;  // Specifies whether to do temporal smoothing of histogram
    FLOAT32 MinPeakThresh;  // Minimum height of a histogram peak to be eligible to be reported as a DOA
    INT32 MinDist;  // Distance in integer units of degrees, between peaks in histogram to count as a distinct DOA
    FLOAT32 noiseFloorDBOffset;  // Offset, in DB, for noise floor weighting. Higher value will weight noise floor less.
    INT32 lookDirLow;  // Min look direction, in degrees. The low end of the DOA look direction.
    INT32 lookDirHigh;  // Max look direction, in degrees. The high end of the DOA look direction.
    INT32 tauMaxMax;  // Maximum theoretical time delays for all mic pair
    INT32 tauMinMin;  // Minimum theoretical time delays for all mic pair
    INT32 tdZeroSamp;  // windowed xcorr 0 delay sample index
    INT32 tdLength;  // Length of the xcorr vector, after windowing
    FLOAT32* micArrayCoords;  // Microphone geometry in cartesian coordinates and in meters. The vector going from (0,0) to (1,0) is considered 0 degrees.
    FLOAT32* noiseFloorVar;  // Microphone noise floor measurement, in db10 units.
    FLOAT32* HistEnergies;  // Holds energy of sources at each source direction for a data block
    FLOAT32* HistAngles;  // Holds angle for each source for a data block
    FLOAT32* HistState;  // Holds NumHistogramStates histograms
    FLOAT32* avgHistogram;  // Scratch array to hold average histogram
    FLOAT32* currentHistogram;  // Scratch array to hold current histogram
    FLOAT32* smoothedHistogram;  // Scratch array to hold smoothed histogram
    FLOAT32* histExt;  // Scratch array to hold smoothed histogram
    FLOAT32* histCnv;  // Scratch array to hold smoothed histogram
    INT32* histSortedAngles;  // Scratch array for histogram peak picking
    FLOAT32* histSortedValues;  // Scratch array for histogram peak picking
    INT32* histAnglesUsed;  // Scratch array for histogram peak picking
    FLOAT32* valuesOutState;  // Scratch array to hold output histogram counts
    INT32* anglesOutState;  // Scratch array to hold output angles
    FLOAT32* GaussianWindow;  // Gaussian window for smoothing histogram
    FLOAT32* BoxcarWindow;  // Rectangular window for smoothing histogram
    INT32* taus;  // Theoretical time delays based on the mic geometry and other factors.
    FLOAT32* deltaTaus;  // Difference between taus and taus rounded to nearest sample delay.
    FLOAT32* lookDirs;  // Look directions, in degrees.
    INT32* tauMax;  // Maximum theoretical time delays for each mic pair
    INT32* tauMin;  // Minimum theoretical time delays for each mic pair
    FLOAT32* doaValues;  // Temporary variable from the processing function
    FLOAT32* td;  // Temporary variable from the processing function
    INT32* Map;  // Map from pair processed to index of pair in micPairsNR = array of non-redundant mic pairs
    FLOAT32* states;  // Holds current estimate of Rxx
    FLOAT32* tdCurr;  // Temp array for new td data when doing time domain smoothing
    INT32* NMPiS;  // Array containing number of redundant mic pairs in each set
    INT32* CIR;  // Array containing channel indices of redundant mic pairs
    INT32* CINR;  // Array containing channel indices of non-redundant mic pairs
    void * ifft_struct_pointer;  // Points to an instance of an IFFT module
} ModuleGCCV7Class;
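
The `taus` and `deltaTaus` fields describe theoretical inter-microphone delays derived from `micArrayCoords`, `c`, and the look directions. As a rough illustration only, the sketch below uses a standard far-field (plane wave) model. The helper names, the sign convention, the rounding rule, and the use of the interpolated sample rate are all assumptions and may differ from what the module actually computes.

```c
/* Hypothetical helper: far-field delay, in samples, of mic b relative to
 * mic a. (ux, uy) is the unit vector pointing from the array toward the
 * source, i.e. (cos(theta), sin(theta)) with 0 degrees along the direction
 * from (0,0) to (1,0). Coordinates are in meters, c in meters per second. */
float pair_delay_samples(const float a[2], const float b[2],
                         float ux, float uy, float c, float sample_rate)
{
    /* Projection of the baseline (a - b) onto the source direction, in meters.
     * Positive when mic a is closer to the source, so mic b hears it later. */
    float proj = (a[0] - b[0]) * ux + (a[1] - b[1]) * uy;
    return proj / c * sample_rate;
}

/* Rounds a fractional delay to the nearest whole sample
 * (halves round away from zero; this rule is an assumption). */
int round_delay(float tau)
{
    return (int)(tau >= 0.0f ? tau + 0.5f : tau - 0.5f);
}
```

For example, two mics 0.1 m apart on the x axis, with c = 343 and a sample rate of 64000, give a delay of about 18.66 samples for a source at 0 degrees, which rounds to 19, and a delay of 0 for a source at 90 degrees. The fractional remainder (here about -0.34) corresponds in spirit to what the `deltaTaus` comment describes.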

Variables

Properties

Name                      Type    Usage      isHidden  Default value  Range         Units
numMics                   int     const      0         4              Unrestricted
interpFactor              int     const      0         4              Unrestricted
fftSize                   int     const      0         256            Unrestricted
MultiSource               int     const      0         0              Unrestricted
Fs                        float   const      0         16000          Unrestricted
LowerBin                  int     const      0         0              Unrestricted
smoothingCoefficient      float   parameter  0         0.2            0:1
BlocksPerHistogram        int     parameter  0         32             1:200
interpBS                  int     derived    0         512            Unrestricted
interpSR                  float   derived    0         64000          Unrestricted
numDirs                   int     const      1         360            Unrestricted
maxNumSources             int     const      1         1              Unrestricted
useInputWeightingPin      int     const      1         0              Unrestricted
smoothing                 int     const      1         2              Unrestricted
useSmoothingPin           int     const      1         0              Unrestricted
combineCongruentMicPairs  int     const      1         1              Unrestricted
boxcarDelay               int     const      1         2              Unrestricted
gaussDelay                int     const      1         8              Unrestricted
gaussSigma                float   const      1         3.2            Unrestricted
NumHistogramStates        int     const      1         4              Unrestricted
QCOpt                     int     const      1         1              Unrestricted
doHistogramProcessing     int     const      1         1              Unrestricted
numBins                   int     const      1         129            Unrestricted
numPairs                  int     const      1         6              Unrestricted
xCorrsZeroSamp            int     const      1         64             Unrestricted
statesSize                int     const      1         1              Unrestricted
tdCurrSize                int     const      1         1              Unrestricted
maxDelay                  int     const      1         8              Unrestricted
HistStatePtr              int     state      1         0              Unrestricted
blockCtr                  int     state      1         0              Unrestricted
blockCtrReset             int     state      1         0              Unrestricted
blockCtrMultiOpt          int     state      1         0              Unrestricted
numPairsToUse             int     const      1         6              Unrestricted
NSRMP                     int     const      1         1              Unrestricted
maxNMPiS                  int     const      1         1              Unrestricted
NNRMP                     int     const      1         1              Unrestricted
epsilon                   float   parameter  1         1e-16          Unrestricted
c                         float   parameter  1         343            Unrestricted
windowSize                float   parameter  1         0              Unrestricted
removalWindowDelta        int     parameter  1         2              Unrestricted
doHistSmoothing           int     parameter  1         1              Unrestricted
doHistAveraging           int     parameter  1         1              Unrestricted
MinPeakThresh             float   parameter  1         1              0:100
MinDist                   int     parameter  1         10             0:360         degrees
noiseFloorDBOffset        float   parameter  1         5              -20:20
lookDirLow                int     parameter  1         0              0:360
lookDirHigh               int     parameter  1         360            0:360
tauMaxMax                 int     derived    1         0              Unrestricted
tauMinMin                 int     derived    1         0              Unrestricted
tdZeroSamp                int     derived    1         1              Unrestricted
tdLength                  int     derived    1         1              Unrestricted
micArrayCoords            float*  parameter  0         [4 x 2]        Unrestricted
noiseFloorVar             float*  parameter  1         [129 x 1]      -500:500
HistEnergies              float*  state      0         [1 x 1]        Unrestricted
HistAngles                float*  state      0         [1 x 1]        Unrestricted
HistState                 float*  state      0         [1440 x 1]     Unrestricted
avgHistogram              float*  state      0         [360 x 1]      Unrestricted
currentHistogram          float*  state      0         [360 x 1]      Unrestricted
smoothedHistogram         float*  state      0         [360 x 1]      Unrestricted
histExt                   float*  state      0         [376 x 1]      Unrestricted
histCnv                   float*  state      0         [392 x 1]      Unrestricted
histSortedAngles          int*    state      0         [360 x 1]      Unrestricted
histSortedValues          float*  state      0         [360 x 1]      Unrestricted
histAnglesUsed            int*    state      0         [360 x 1]      Unrestricted
valuesOutState            float*  state      0         [10 x 1]       Unrestricted
anglesOutState            int*    state      0         [10 x 1]       Unrestricted
GaussianWindow            float*  parameter  1         [17 x 1]       Unrestricted
BoxcarWindow              float*  parameter  1         [5 x 1]        Unrestricted
taus                      int*    derived    1         [6 x 360]      Unrestricted
deltaTaus                 float*  derived    1         [6 x 360]      Unrestricted
lookDirs                  float*  derived    1         [1 x 360]      Unrestricted
tauMax                    int*    derived    1         [6 x 1]        Unrestricted
tauMin                    int*    derived    1         [6 x 1]        Unrestricted
doaValues                 float*  state      0         [360 x 1]      Unrestricted
td                        float*  state      0         [1 x 6]        Unrestricted
Map                       int*    derived    0         [6 x 1]        Unrestricted
states                    float*  state      0         [1 x 1]        Unrestricted
tdCurr                    float*  state      0         [1 x 1]        Unrestricted
NMPiS                     int*    derived    0         [100 x 1]      Unrestricted
CIR                       int*    derived    0         [100 x 1]      Unrestricted
CINR                      int*    derived    0         [100 x 1]      Unrestricted
ifft_struct_pointer       void *  state      1                        Unrestricted

Pins

Input Pins

Name: micIn

Description: Microphone array input - in frequency domain

Data type: float

Channel range: 4

Block size range: Unrestricted

Sample rate range: Unrestricted

Complex support: Complex

Output Pins

Name: estDirs

Description: Estimated directions, in degrees

Data type: float

Name: estEnergy

Description: Estimated relative energy for each estimated direction

Data type: float

Name: HistValues

Description: Histogram output values

Data type: float

Scratch Pins

Channel count: 1

Block size: 513

Sample rate: 48000

Channel count: 1

Block size: 1024

Sample rate: 48000

Channel count: 1

Block size: 1024

Sample rate: 48000

MATLAB Usage

File Name: gcc_v7_module.m

M = gcc_v7_module(NAME, numMics, interpFactor, fftSize, FS, LOWERBIN)

GCCV7 module identifies the 2D angular direction of an audio source received at the mic array. The module takes multichannel frequency domain data as input (usually the output of a WOLA Analysis module). The module produces 2 outputs per block:

    1.) Angle = estimated angular direction of arrival in degrees. This will be an integer between 0 and 360.

    2.) Confidence value for this angle estimate. Larger values mean more confidence in the angle estimate.

Arguments:

    NAME - name of the module.

    numMics - number of microphones in the microphone array

    interpFactor - how much to interpolate for better estimation at the cost of more MIPS

    fftSize - Size of the FFT operation done upstream from this module.

    FS - Sampling rate of time domain data entering WOLA Analysis block

    LOWERBIN - Frequency bin index of lowest WOLA bin entering GCCV7 input pin
