
(8.D.2.7) SbDOAV2

Overview

Computes the angular 2D direction of arrival (DOA) of audio sources arriving at the mic array

Discussion

This module estimates the angular 2D direction of arrival of audio sources arriving at the mic array, working from multichannel frequency domain (WOLA) data.

Input Pins:

    Input Pin 1:    Multichannel frequency domain data, usually the output of a WOLA Analysis module.
                    The number of channels must match the number of microphones in the micGeometry argument.

    Input Pin 2:    Bit map used to include or exclude input bins from processing. This will usually be set to all ones.
                    Its block size must equal the block size (number of complex values) of Pin 1.
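
As an illustration of how the Pin 2 bit map might be filled, the sketch below assumes one integer flag per frequency bin (1 = process the bin, 0 = skip it). The helper name fill_bin_map and the range-based selection are hypothetical and not part of the module's API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fill a per-bin include/exclude map.
 * Assumes one int per frequency bin, 1 = include, 0 = exclude.
 * Bins firstOn..lastOn (inclusive) are included; all others are excluded. */
static void fill_bin_map(int *map, size_t numBins, size_t firstOn, size_t lastOn)
{
    assert(map != NULL);
    for (size_t k = 0; k < numBins; ++k) {
        map[k] = (k >= firstOn && k <= lastOn) ? 1 : 0;
    }
}
```

Calling fill_bin_map(map, numBins, 0, numBins - 1) gives the usual all-ones setting; a narrower range excludes bins outside the band of interest.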

Output Pins:

    Output Pin 1:   Estimated angular direction of arrival for this WOLA block in degrees. This will be an integer between 0 and 360.

    Output Pin 2:    Confidence value for this angle estimate. This will be an integer. Larger values mean more confidence.

    Output Pin 3:    Array of 360 floats denoting a histogram of possible angles of arrival for this block. Output pin 1 will be the peak of this histogram.
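
The relationship between Output Pin 3 and Output Pin 1 can be sketched as a simple peak search. This is a minimal illustration, assuming histogram index i corresponds to an angle of i degrees; the function name and the tie-breaking rule (lowest angle wins) are assumptions, not the module's documented behavior.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_ANGLES 360

/* Return the index (taken here as degrees) of the largest histogram entry.
 * Ties resolve to the lowest index, which is an illustrative choice.
 * If peakValue is non-NULL, the height of the peak is written to it. */
static int histogram_peak(const float hist[NUM_ANGLES], float *peakValue)
{
    assert(hist != NULL);
    int best = 0;
    for (int a = 1; a < NUM_ANGLES; ++a) {
        if (hist[a] > hist[best]) {
            best = a;
        }
    }
    if (peakValue != NULL) {
        *peakValue = hist[best];
    }
    return best;
}
```

A consumer of Pin 3 could use a search like this to cross-check the angle reported on Pin 1.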

Module Arguments:

    micGeometry:   Drop down menu to select mic geometry being used

    fftSize:    FFTSize used by WOLA Analysis block preceding the SbDOAV2 module

    Fs:    Sampling rate of time domain data entering WOLA Analysis block

    MicGeomSize:    Dimension in meters of the selected mic geometry. Its meaning depends on the geometry:
        4_Mic_Square: MicGeomSize specifies the side length of the square.
        4_Mic_Linear: MicGeomSize specifies the distance between mics.
        4_Mic_Trillium: MicGeomSize specifies the radius of the circle in which the triangle is inscribed.
        4_Mics_of_6_7_Mic_Circle: MicGeomSize specifies the radius of the circle in which the 4 mics are inscribed.
        Equilateral_Triangle: MicGeomSize specifies the side length of the triangle.
        2_Mic: MicGeomSize specifies the distance between mics.
        7_Mic_Circle: MicGeomSize specifies the radius of the circle.
        6_Mic_Circle: MicGeomSize specifies the radius of the circle.
        Rectangular: MicGeomSize specifies the length of the shorter side of the 2x1 rectangle.

    LowerBin:    Frequency bin index of lowest WOLA bin entering SbDOAV2 input pin

    MultiSource: Allows reporting of multiple audio sources, and controls optimization level.
        MultiSource = 1: Report up to 5 audio sources per WOLA block. Angles are always reported in the range 0-360 degrees,
        overriding any setting of OutputAngles. This setting is not optimized and produces large MIPS spikes, so it is
        suitable only for native mode, not for embedded targets.
        MultiSource = 0: Report only 1 audio source. This setting is optimized for low MIPS and is suitable for embedded and
        native targets. The range of angles reported is controlled by the value of OutputAngles.

    OutputAngles: Selects the range of angles reported, 0-360 or 180-360.
        OutputAngles = 0: Report angles in the range 0-360.
        OutputAngles = 1: Report angles in the range 180-360. This setting uses slightly fewer MIPS because a smaller
        range of angles is reported. It is intended for applications that need only a 180 degree range of angles
        (e.g. a conference room soundbar mounted against a wall).
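
The fftSize, Fs and LowerBin arguments are tied together by the usual FFT bin spacing of Fs / fftSize. The sketch below assumes the WOLA bins are uniformly spaced with bin 0 at DC; the helper names are hypothetical and only illustrate how a LowerBin value could be derived from a cutoff frequency.

```c
#include <assert.h>

/* Center frequency in Hz of a bin, assuming uniform spacing Fs / fftSize
 * with bin 0 at DC. */
static float bin_to_hz(int bin, float Fs, int fftSize)
{
    assert(fftSize > 0);
    return (float)bin * Fs / (float)fftSize;
}

/* Hypothetical helper: index of the first bin whose center frequency is at
 * or above the requested cutoff, a candidate value for LowerBin. */
static int hz_to_lower_bin(float hz, float Fs, int fftSize)
{
    assert(Fs > 0.0f && hz >= 0.0f && fftSize > 0);
    int k = (int)(hz * (float)fftSize / Fs); /* truncates toward zero */
    if (bin_to_hz(k, Fs, fftSize) < hz) {
        ++k; /* round up when the truncated bin lies below the cutoff */
    }
    return k;
}
```

With the default values Fs = 48000 and fftSize = 512 listed under Variables, the bins are 93.75 Hz apart, so a 300 Hz cutoff maps to bin 4 (375 Hz).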

Inspector Tuning Variables:

    BlocksPerHistogram: Every WOLA block yields an angle estimate for each bin. These estimates are accumulated over
    BlocksPerHistogram WOLA blocks into a histogram, and the peak of the histogram is then output as the angle estimate.
    Larger values of BlocksPerHistogram give more accuracy but a slower response time. BlocksPerHistogram does not affect MIPS.

    DT1: Dynamic thresholding parameter. Controls the sensitivity of the algorithm to the global power of the input signal.
    DT1 specifies the number of dB above the global threshold that the average signal power of a block must reach for an
    angle to be reported. This parameter can be used to avoid reporting angles for small background nuisance noises.
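
The two tuning variables above can be sketched together as follows. This is a simplified illustration, not the module's implementation: the DoaAccumulator type, the function names, and the use of one vote per bin are all assumptions, and the module's smoothing, temporal averaging and noise floor tracking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_ANGLES 360

/* Hypothetical accumulator; not the module's actual state layout. */
typedef struct {
    float hist[NUM_ANGLES];  /* votes collected since the last report */
    int blockCtr;            /* blocks accumulated so far */
    int blocksPerHistogram;  /* blocks per reported angle */
} DoaAccumulator;

static void doa_init(DoaAccumulator *s, int blocksPerHistogram)
{
    assert(s != NULL && blocksPerHistogram >= 1);
    memset(s, 0, sizeof *s);
    s->blocksPerHistogram = blocksPerHistogram;
}

/* DT1 gate: true when the block power is at least dt1 dB above the threshold. */
static bool passes_dt1(float blockPowerDb, float globalThresholdDb, float dt1)
{
    return blockPowerDb >= globalThresholdDb + dt1;
}

/* Add one block of per-bin angle estimates (degrees). Returns true and writes
 * the histogram peak to *angleOut once blocksPerHistogram blocks have arrived,
 * then clears the histogram for the next round. */
static bool doa_push_block(DoaAccumulator *s, const int *binAngles, int numBins,
                           int *angleOut)
{
    assert(s != NULL && angleOut != NULL && numBins >= 0);
    for (int i = 0; i < numBins; ++i) {
        /* wrap any integer angle into 0..359 */
        int a = ((binAngles[i] % NUM_ANGLES) + NUM_ANGLES) % NUM_ANGLES;
        s->hist[a] += 1.0f;
    }
    if (++s->blockCtr < s->blocksPerHistogram) {
        return false;
    }
    int best = 0;
    for (int a = 1; a < NUM_ANGLES; ++a) {
        if (s->hist[a] > s->hist[best]) {
            best = a;
        }
    }
    *angleOut = best;
    memset(s->hist, 0, sizeof s->hist);
    s->blockCtr = 0;
    return true;
}
```

In this sketch a caller would report the angle only when passes_dt1 holds. The accumulation loop shows why a larger BlocksPerHistogram slows the response: a new angle is available only once every BlocksPerHistogram blocks.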

Type Definition

typedef struct _ModuleSbDOAV2
{
    ModuleInstanceDescriptor instance; // Common Audio Weaver module instance structure
    FLOAT32 Fs; // Sample rate of time domain data entering WOLA analysis block
    INT32 fftSize; // FFT size used in computing WOLA block
    INT32 micGeometry; // Specifies which 4 mic geometry we are working with
    INT32 MultiSource; // MultiSource = 1, multiple audio sources reported each block. = 0, one audio source reported
    INT32 OutputAngles; // OutputAngles = 0, angles in range 0-360 reported, = 1 180-360 reported
    INT32 LowerBin; // Frequ bin index of lowest frequ component in the WOLA block
    FLOAT32 MicGeomSize; // Length in meters of each side of square mic array
    FLOAT32 DT1; // Dynamic threshold level
    INT32 BlocksPerHistogram; // Number of blocks used to fill histogram before doing smoothing and averaging
    INT32 boxcarDelay; // Half length of rectangular window for histogram smoothing
    INT32 gaussDelay; // Half length of gaussian window for histogram smoothing
    FLOAT32 gaussSigma; // Standard Deviation for Gaussian window
    INT32 NumHistogramStates; // Number of full histograms used for temporal averaging
    INT32 numStates; // Number of states in state buffer
    INT32 Opt; // Optimization level
    INT32 numMics; // Number of microphones for this mic geometry
    INT32 maxDelay; // Maximum of gaussDelay, boxcarDelay
    INT32 HistStatePtr; // Index of oldest entry in histogram buffer
    INT32 XStatePtr; // Index of oldest entry in state buffer
    INT32 blockCtr; // Counter to keep track of number of blocks used to update histogram
    INT32 blockCtrReset; // Counter to keep track of number of blocks used between reset to zero of Rxx and avgHistogram
    INT32 aboveThresh; // Input signal dBA greater than noise floor dBA. Changes once per BlocksPerHistogram blocks
    INT32 doHistSmoothing; // Specifies whether to do smoothing of histogram
    INT32 doHistAveraging; // Specifies whether to do temporal smoothing of histogram
    INT32 doNFTracking; // Specifies whether to do noise floor tracking
    INT32 doOnsetTracking; // Specifies whether to do onset tracking
    FLOAT32 speedSound; // Speed of sound in m/s
    FLOAT32 Test2Thresh; // Threshold for passing square mic array steering vector test 2
    FLOAT32 Test1Thresh; // Threshold for passing square mic array steering vector test 1
    FLOAT32 CTFact; // Coherence Test Factor, scale factor specifying how much larger dominant eigenvalue must be than the sum of all the other eigenvalues
    FLOAT32 MinPeakThresh; // Minimum height of a histogram peak to be eligible to be reported as a DOA
    INT32 MinDist; // Distance in integer units of degrees, between peaks in histogram to count as a distinct DOA
    INT32 BlocksPerReset; // Number of blocks between resets to zero of Rxx and avgHistogram
    FLOAT32 NFUp; // Noise floor ramp up coefficient
    FLOAT32 NFUpSlow; // Noise floor ramp up slow coefficient
    FLOAT32 NFDown; // Noise floor ramp down coefficient
    FLOAT32 NFMin; // Minimum allowable value for noise floor
    FLOAT32 NFSNR; // Noise Floor SNR
    INT32 NFFrameCount; // Max number of frames to ramp up fast
    FLOAT32 OSUp; // Onset ramp up coefficient
    FLOAT32 OSDown; // Onset ramp down coefficient
    FLOAT32 OSJump; // Onset jump coefficient
    INT32 OSDur; // Number of frames for onset decay
    FLOAT32* Rxx; // Holds current estimate of Rxx
    FLOAT32* RxxTemp; // Temporary buffer needed for squaring Rxx
    FLOAT32* RxxTemp2; // Temporary buffer needed for squaring Rxx
    FLOAT32* XState; // Holds numStates sets of numMics input data
    FLOAT32* HistState; // Holds NumHistogramStates histograms
    FLOAT32* TempReal; // Scratch array of numSB real values
    FLOAT32* noise_floor; // Noise floor per bin, array of numSB real values
    INT32* sig_count_bw; // Frame count for fast ramp up per bin, array of numSB real values
    FLOAT32* onset_threshold; // Onset threshold per bin, array of numSB real values
    INT32* count_onset; // Onset count per bin, array of numSB real values
    FLOAT32* xMagAvg; // Average magnitude per bin, array of numSB real values
    FLOAT32* NFAvg; // Noise floor average per bin, array of numSB real values
    FLOAT32* AWeights; // A weighting values
    FLOAT32* TempComplex; // Scratch array of numSB complex values
    FLOAT32* EvalEst; // Scratch array to hold estimates of dominant eigenvalues
    FLOAT32* Trace; // Scratch array to hold trace of Rxx for each subband
    INT32* freqBinMap; // Scratch array to hold mapping between valid bins and discrete frequency indices
    FLOAT32* avgHistogram; // Scratch array to hold average histogram
    FLOAT32* currentHistogram; // Scratch array to hold current histogram
    FLOAT32* smoothedHistogram; // Scratch array to hold smoothed histogram
    FLOAT32* histExt; // Scratch array to hold smoothed histogram
    FLOAT32* histCnv; // Scratch array to hold smoothed histogram
    INT32* histSortedAngles; // Scratch array for histogram peak picking
    FLOAT32* histSortedValues; // Scratch array for histogram peak picking
    INT32* histAnglesUsed; // Scratch array for histogram peak picking
    FLOAT32* valuesOutState; // Scratch array to hold output histogram counts
    INT32* anglesOutState; // Scratch array to hold output angles
    FLOAT32* GaussianWindow; // Gaussian window for smoothing histogram
    FLOAT32* BoxcarWindow; // Rectangular window for smoothing histogram
} ModuleSbDOAV2Class;
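
The comment on CTFact describes a coherence test that compares the dominant eigenvalue with the sum of all the other eigenvalues. Because the trace of a matrix equals the sum of its eigenvalues, that sum can be obtained as the trace of Rxx minus the dominant eigenvalue, which may be why the structure keeps both EvalEst and Trace. The sketch below shows one way to express such a test; the function name, the strict comparison and the clamping of negative remainders are assumptions, not the module's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical coherence check for one subband.
 * dominantEval: estimate of the largest eigenvalue of Rxx
 * trace:        trace of Rxx, i.e. the sum of all eigenvalues
 * ctFact:       required ratio between the dominant eigenvalue and the rest */
static bool passes_coherence_test(float dominantEval, float trace, float ctFact)
{
    assert(dominantEval >= 0.0f && trace >= 0.0f);
    float others = trace - dominantEval; /* sum of the remaining eigenvalues */
    if (others < 0.0f) {
        others = 0.0f; /* guard against estimation error in dominantEval */
    }
    return dominantEval > ctFact * others;
}
```

With the default CTFact of 10, a subband passes in this sketch only when the dominant eigenvalue is more than ten times the combined size of the others.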

Variables

Properties

Name                Type    Usage      isHidden  Default value  Range         Units
Fs                  float   const      0         48000          Unrestricted
fftSize             int     const      0         512            Unrestricted
micGeometry         int     const      0         0              Unrestricted
MultiSource         int     const      0         0              Unrestricted
OutputAngles        int     const      0         0              Unrestricted
LowerBin            int     const      0         0              Unrestricted
MicGeomSize         float   const      0         0.04           Unrestricted
DT1                 float   parameter  0         3              -5000:5000
BlocksPerHistogram  int     parameter  0         32             1:200
boxcarDelay         int     const      1         2              Unrestricted
gaussDelay          int     const      1         8              Unrestricted
gaussSigma          float   const      1         3.2            Unrestricted
NumHistogramStates  int     const      1         4              Unrestricted
numStates           int     const      1         4              Unrestricted
Opt                 int     const      1         1              0:2
numMics             int     const      1         4              Unrestricted
maxDelay            int     const      1         8              Unrestricted
HistStatePtr        int     state      1         0              Unrestricted
XStatePtr           int     state      1         0              Unrestricted
blockCtr            int     state      1         0              Unrestricted
blockCtrReset       int     state      1         0              Unrestricted
aboveThresh         int     state      1         0              Unrestricted
doHistSmoothing     int     parameter  1         1              Unrestricted
doHistAveraging     int     parameter  1         1              Unrestricted
doNFTracking        int     parameter  1         0              Unrestricted
doOnsetTracking     int     parameter  1         0              Unrestricted
speedSound          float   parameter  1         343            300:400       metersPerSecond
Test2Thresh         float   parameter  1         0.2            0:2
Test1Thresh         float   parameter  1         0.2            0:2
CTFact              float   parameter  1         10             -1:1e+30
MinPeakThresh       float   parameter  1         1              -1:1000000
MinDist             int     parameter  1         10             0:360         degrees
BlocksPerReset      int     parameter  1         100            100:100000
NFUp                float   parameter  1         1.01           0:2
NFUpSlow            float   parameter  1         1.001          0:2
NFDown              float   parameter  1         0.99           0:1
NFMin               float   parameter  1         0.0001         0:1
NFSNR               float   parameter  1         2              0:10
NFFrameCount        int     parameter  1         3              0:100
OSUp                float   parameter  1         1.3            0:10
OSDown              float   parameter  1         0.95           0:1
OSJump              float   parameter  1         1              0:10
OSDur               int     parameter  1         2              0:25
Rxx                 float*  state      0         [1024 x 1]     Unrestricted
RxxTemp             float*  state      0         [1024 x 1]     Unrestricted
RxxTemp2            float*  state      0         [1024 x 1]     Unrestricted
XState              float*  state      0         [1024 x 1]     Unrestricted
HistState           float*  state      0         [1440 x 1]     Unrestricted
TempReal            float*  state      0         [32 x 1]       Unrestricted
noise_floor         float*  state      0         [32 x 1]       Unrestricted
sig_count_bw        int*    state      0         [32 x 1]       Unrestricted
onset_threshold     float*  state      0         [32 x 1]       Unrestricted
count_onset         int*    state      0         [32 x 1]       Unrestricted
xMagAvg             float*  state      0         [32 x 1]       Unrestricted
NFAvg               float*  state      0         [32 x 1]       Unrestricted
AWeights            float*  parameter  0         [1 x 257]      Unrestricted
TempComplex         float*  state      0         [64 x 1]       Unrestricted
EvalEst             float*  state      0         [64 x 1]       Unrestricted
Trace               float*  state      0         [64 x 1]       Unrestricted
freqBinMap          int*    state      0         [32 x 1]       Unrestricted
avgHistogram        float*  state      0         [360 x 1]      Unrestricted
currentHistogram    float*  state      0         [360 x 1]      Unrestricted
smoothedHistogram   float*  state      0         [360 x 1]      Unrestricted
histExt             float*  state      0         [376 x 1]      Unrestricted
histCnv             float*  state      0         [392 x 1]      Unrestricted
histSortedAngles    int*    state      0         [360 x 1]      Unrestricted
histSortedValues    float*  state      0         [360 x 1]      Unrestricted
histAnglesUsed      int*    state      0         [360 x 1]      Unrestricted
valuesOutState      float*  state      0         [10 x 1]       Unrestricted
anglesOutState      int*    state      0         [10 x 1]       Unrestricted
GaussianWindow      float*  parameter  0         [17 x 1]       Unrestricted
BoxcarWindow        float*  parameter  0         [5 x 1]        Unrestricted
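
The window sizes listed above are consistent with a window length of 2 * halfLength + 1: boxcarDelay = 2 gives a BoxcarWindow of 5 values, and gaussDelay = 8 gives a GaussianWindow of 17 values. As a sketch of what rectangular histogram smoothing with such a window could look like, the example below averages each angle with its neighbors. The wrap-around at 0/360 degrees, the unit-gain normalization and the function name are assumptions and not taken from the module source.

```c
#include <assert.h>

#define NUM_ANGLES 360

/* Smooth a 360-entry histogram with a rectangular window of length
 * 2 * halfLen + 1. Angles are treated as circular here, so entry 359 is
 * adjacent to entry 0 (an assumption). The window is normalized to unit gain. */
static void boxcar_smooth_circular(const float in[NUM_ANGLES],
                                   float out[NUM_ANGLES], int halfLen)
{
    assert(in != out); /* in-place operation is not supported */
    assert(halfLen >= 0 && 2 * halfLen + 1 <= NUM_ANGLES);
    const float scale = 1.0f / (float)(2 * halfLen + 1);
    for (int a = 0; a < NUM_ANGLES; ++a) {
        float sum = 0.0f;
        for (int d = -halfLen; d <= halfLen; ++d) {
            int idx = (a + d + NUM_ANGLES) % NUM_ANGLES; /* circular index */
            sum += in[idx];
        }
        out[a] = sum * scale;
    }
}
```

Smoothing of this kind spreads an isolated spike over neighboring angles, so a cluster of nearby estimates forms a single broader peak rather than several competing ones.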

 

Pins

Input Pins

Name: Audio
Description: audio input
Data type: float
Channel range: 4
Block size range: Unrestricted
Sample rate range: Unrestricted
Complex support: Complex

Name: BitMap
Description: bit map
Data type: int
Channel range: 1
Block size range: Unrestricted
Sample rate range: Unrestricted
Complex support: Real

Output Pins

Name: Angle
Description: DOA Angle
Data type: float

Name: ConfidenceValue
Description: DOA Confidence Value
Data type: float

Name: Histogram
Description: Histogram Data
Data type: float

MATLAB Usage

File Name: sb_doa_v2_module.m