...

‘Block by Block’ profiling provides the MHz and memory consumption of each individual element of the .awd at the time of profiling, while ‘Peak’ profiling provides the average and peak CPU cycle usage over a user-specified time.

...

[Image: image-20240426-184052.png]

Selecting ‘Profile Block by Block’ automatically runs profiling on the entire running .awd layout at the time of selection, while ‘Profile Peak’ is started manually once the desired ‘Sampling Period’ and ‘Test Length’ fields are set:

...

  • Total ticks per block process available (calculated and measured)

  • Average ticks per block

  • Instantaneous ticks per block

  • Peak ticks per block

  • Total memory usage of the system

  • Shared heap memory for multi-instance architectures

...

[Image: image-20240430-034601.png]

Profile Block by Block Terminology

...

Field

Definition

Unit

Total ticks per block process available

Measurement from the end of one CPU interrupt to the end of the next CPU interrupt (i.e., how many clock cycles have elapsed on the processor). This is a rough indication of how many clock cycles are available for processing. (You will not be able to utilize 100% of the cycles because the audio interrupt handler itself requires some processing.)

This is an especially important number to check when you are bringing up new hardware. The value shown here should be close to:

(Processor Speed) x (Block Size of Processing) / (Sample Rate)

If this doesn’t match, then there could be a mismatch in your processor speed, the audio sample rate, or the underlying “fundamental block size” of your implementation.

Lastly, there are two separate line items for ‘Total ticks per block process available’.

  • Calculated: Manually calculated based on the processor clock speed and the block size and sample rate of the layout

  • Measured: Processor clock ticks measured between blocks, based on the calling rate of the audio pump

These two values should nearly match. However, be aware that if there are CPU overflows (>100% processing load), the ‘Measured’ value will be doubled: because the audio pump does not complete during the current block of processing, the measurement also includes the next block.

CPU clock cycles

Average ticks per block used

Average of CPU clock cycles per block of audio data (10x)

CPU clock cycles

Instantaneous ticks per block used

CPU clock cycles required to process the last block of audio. This is the instantaneous measurement without smoothing. This number will change every time you profile.

CPU clock cycles

Peak ticks per block used

Peak instantaneous CPU clock cycles consumed when processing a block of audio data. This is a “sticky measurement” that shows the peak value since system startup. Reprofiling resets this value.

CPU clock cycles

Fast Heap

Memory usage from the memory allocated in the Fast Heap

Words

Fast Heap B

Memory usage from the memory allocated in the Fast Heap B

Words

Slow Heap

Memory usage from the memory allocated in the Slow Heap

Words

Total Memory

Fast Heap + Fast Heap B + Slow Heap

Words

Shared Heap

Memory usage from the allocated Shared Heap

Words
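As a rough illustration of the ‘Total ticks per block process available’ check described in the table above, the sketch below computes the calculated value from the formula (Processor Speed) x (Block Size of Processing) / (Sample Rate) and compares it against a measured value. The function names, the 5% tolerance, and the 600 MHz / 32-sample / 48 kHz figures are assumptions chosen for illustration; they do not come from Audio Weaver.

```python
def calculated_ticks_per_block(clock_hz: float, block_size: int, sample_rate_hz: float) -> float:
    """(Processor Speed) x (Block Size of Processing) / (Sample Rate)."""
    return clock_hz * block_size / sample_rate_hz


def compare_measured(calculated: float, measured: float, tolerance: float = 0.05) -> str:
    """Classify a measured tick count against the calculated one.

    The tolerance is a hypothetical threshold, not an Audio Weaver setting.
    """
    ratio = measured / calculated
    if abs(ratio - 1.0) <= tolerance:
        return "ok"
    if abs(ratio - 2.0) <= tolerance:
        # A doubled 'Measured' value suggests a CPU overflow (>100% load).
        return "possible overflow"
    return "mismatch"


# Hypothetical target: 600 MHz core, 32-sample blocks, 48 kHz sample rate.
expected = calculated_ticks_per_block(clock_hz=600e6, block_size=32, sample_rate_hz=48_000)
print(expected)                              # 400000.0
print(compare_measured(expected, 398_000))   # ok
print(compare_measured(expected, 800_000))   # possible overflow
```

A "mismatch" result would point to the causes listed above: the processor speed, the audio sample rate, or the fundamental block size.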

Heaps

At initialization time, memory to be used by the AWE Core instance for signal processing is allocated. The AWE Core refers to this memory as the heap. By default, AWE Core supports three heaps (FASTA, FASTB, and SLOW) for which the BSP is responsible for allocating storage. Multi-instance architectures also use a SHARED heap. Most commonly, heaps are allocated statically as large arrays. The heaps are:

  • FASTA: storage accessible using the least time

  • FASTB: a secondary bank of fast storage. Useful for memory that can be accessed concurrently with the FASTA heap

  • SLOW: storage usually external and so more slowly accessed

  • SHARED: storage shared by and accessible to multiple Audio Weaver instances, used for multi-instance architectures or inter-process communication (IPC)

To calculate Memory in MB: ((Total Heap Memory) * 4)/1000000
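The conversion above can be sketched as follows. The factor of 4 is taken to mean 4 bytes per word, and the heap sizes used here are made-up values for illustration, not figures from a real target.

```python
BYTES_PER_WORD = 4  # assumed from the "* 4" in the formula above


def heap_words_to_mb(total_words: int) -> float:
    """Convert a heap size in words to megabytes (1 MB = 1,000,000 bytes)."""
    return total_words * BYTES_PER_WORD / 1_000_000


# Hypothetical heap usage, in words.
fast_heap, fast_heap_b, slow_heap = 250_000, 125_000, 1_000_000
total_memory = fast_heap + fast_heap_b + slow_heap  # Fast Heap + Fast Heap B + Slow Heap

print(heap_words_to_mb(total_memory))  # 5.5
```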

...

The rest of the profiling window provides profiling information for each individual module and wire (audio buffer) in the running .awd layout:

...

[Image: image-20240426-145935.png]

The ‘Top_0’ Module Name line item contains profiling information for the entire .awd layout’s processing.  In a multi-instance architecture, aggregate profiling information for each discrete Audio Weaver instance is labeled as ‘Top_<AWE Instance #>’.  If the .awd utilizes multiple audio processing threads, aggregate profiling information for each discrete thread is labeled as ‘Top_<AWE Instance #>_<Thread ID>’:

...

In a multi-instance architecture, by default the profiling pop-up window displays profiling information for all of the Audio Weaver instances used:

...

[Image: image-20240426-145629.png]

To display profiling information for only one Audio Weaver instance, select the desired instance from the Instance drop-down menu in the upper left-hand corner:

...

[Image: image-20240426-185955.png]

Peak Profiling

Selecting the Profile Peak real-time profiling option opens the Peak Profile window:

...

[Image: image-20240426-190054.png]

As discussed above, the ‘Sampling Period’ and ‘Test Length’ fields must be entered in order to run the peak profiling. The time unit for both fields is seconds, and the default values are a 0.5 s sampling period and a 10 s test length. When ready to start the peak profiling, simply click the ‘Start’ button.
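To get a feel for how the two fields relate, the sketch below estimates how many measurement points a run would produce and summarizes a set of CPU-load readings. It assumes one reading is taken per sampling period for the duration of the test, which this note does not state explicitly; the function names and the sample readings are hypothetical.

```python
def expected_sample_count(sampling_period_s: float, test_length_s: float) -> int:
    """Estimate the number of readings, assuming one per sampling period."""
    return round(test_length_s / sampling_period_s)


def summarize(cpu_percent_samples: list[float]) -> tuple[float, float]:
    """Return (peak, average) of a list of CPU-load percentages."""
    peak = max(cpu_percent_samples)
    average = sum(cpu_percent_samples) / len(cpu_percent_samples)
    return peak, average


# Default field values: 0.5 s sampling period, 10 s test length.
print(expected_sample_count(0.5, 10))  # 20

# Hypothetical CPU-load readings, in percent.
print(summarize([41.0, 43.5, 42.0, 47.5]))  # (47.5, 43.5)
```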

...

The ‘Peak vs Average Cycles’ graph at the top of the Peak Profiling pop-up window displays peak and average CPU usage percentages of the processor Audio Weaver is running on.  If the running .awd has a multi-instance architecture or contains multi-threading, multiple profiling measurements for each Audio Weaver instance and thread will also display.  The x-axis is time in seconds and the y-axis is CPU percentage:

...

[Image: image-20240426-191149.png]

The bottom portion of the Peak Profiling pop-up window displays a Legend for the Peak vs Average Cycles line graph and an Instance List to select which Audio Weaver instance profiling measurements to display:

...

As mentioned earlier in this application note, Audio Weaver also features Manual Profiling, which collects the same profiling information as the real-time block by block profiling, but in a different manner. Rather than profiling the runtime input audio stream, manual profiling halts real-time processing on the target and processes all layouts present in the AWD/AWJ signal flow one at a time through tuning commands (pump_layout), for a user-specified number of audio frames. The resulting profiling results may be preferred in some cases, because this approach decouples discrete layout processing (and thus layout thread priority) from real-time audio device interrupts. This allows accurate profiling even when the target is not yet configured for real-time audio, or when a layout cannot finish executing in the allotted clock cycles (the calculated total ticks per block process available).

To manually profile an .awd layout, navigate to ‘Tools > Profile Running Layout > Manual Profile Layout’ in the Audio Weaver Designer toolbar while in Tuning Mode:

...

For multi-instance architectures, every Audio Weaver profiling utility enables users to export either profiling data for all instances or individual profiling data for a selected Audio Weaver instance:

...

...

[Image: image-20240429-145818.png]

[Image: image-20240429-145854.png]

For block by block and manual profiling of multi-instance architectures, if profiling data for all instances is selected for export, Audio Weaver generates one aggregate profiling CSV file and an individual CSV file for each Audio Weaver instance in the system:

...

Below are some diagrams that further illustrate the real-time profiling function in Audio Weaver.

...

[Image]

[Image]

CycleBurner, BiquadLoading, and FIRLoading Modules

...

This module is used to check the memory bandwidth of the target. At instantiation time you specify the size of the memory buffer (memSize) and the heap in which it should be allocated (memHeap). Then, at run time, the module writes the current value of a block counter into every element of the array. It repeats this blockWriteCount times per block process. That is, every time the processing function is called, the module performs a total of blockWriteCount*memSize memory write operations.

[Image: image-20240323-010748.png]
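The write pattern described above can be modeled with a short sketch. This is not the module's source code: the class and function names are hypothetical, heap selection (memHeap) is omitted, and incrementing the block counter after each call is an assumption. The second function estimates the resulting write rate from the layout's sample rate and block size.

```python
class MemoryWriteSketch:
    """Toy model of the memory-bandwidth module's per-block behavior."""

    def __init__(self, mem_size: int, block_write_count: int) -> None:
        self.buffer = [0] * mem_size          # stands in for the heap buffer
        self.block_write_count = block_write_count
        self.block_counter = 0
        self.total_writes = 0

    def process(self) -> None:
        """One block process: blockWriteCount passes over the whole buffer."""
        for _ in range(self.block_write_count):
            for i in range(len(self.buffer)):
                self.buffer[i] = self.block_counter
                self.total_writes += 1
        self.block_counter += 1               # assumed: advance once per block


def writes_per_second(mem_size: int, block_write_count: int,
                      sample_rate_hz: float, block_size: int) -> float:
    """Write operations per second = writes per block x blocks per second."""
    blocks_per_second = sample_rate_hz / block_size
    return mem_size * block_write_count * blocks_per_second


module = MemoryWriteSketch(mem_size=1024, block_write_count=4)
module.process()
print(module.total_writes)                     # 4096
print(writes_per_second(1024, 4, 48_000, 32))  # 6144000.0
```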

Additional Notes on Audio Weaver Profiling

...