Audio (Signal, TimeData, FrequencyData)

The following documents the audio classes and arithmetic operations for audio data. More details and background are given in the concepts (audio classes, Fourier transform, arithmetic operations).

Classes:

FrequencyData(data, frequencies[, comment, ...])

Class for frequency data.

Signal(data, sampling_rate[, n_samples, ...])

Class for audio signals.

TimeData(data, times[, comment, dtype])

Class for time data.

Functions:

add(data[, domain])

Add pyfar audio objects, array likes, and scalars.

divide(data[, domain])

Divide pyfar audio objects, array likes, and scalars.

multiply(data[, domain])

Multiply pyfar audio objects, array likes, and scalars.

power(data[, domain])

Power of pyfar audio objects, array likes, and scalars.

subtract(data[, domain])

Subtract pyfar audio objects, array likes, and scalars.

class pyfar.classes.audio.FrequencyData(data, frequencies, comment=None, dtype=complex)

Class for frequency data.

Objects of this class contain frequency data which is not directly convertible to the time domain, i.e., non-equidistantly spaced bins or incomplete spectra.

Methods:

__getitem__(key)

Get copied slice of the audio object at key.

__init__(data, frequencies[, comment, dtype])

Create FrequencyData with data, and frequencies.

__setitem__(key, value)

Set channels of audio object at key.

copy()

Return a copy of the audio object.

find_nearest_frequency(value)

Return the index that is closest to the query frequency.

flatten()

Return flattened copy of the audio object.

reshape(newshape)

Return reshaped copy of the audio object.

Attributes:

comment

Get comment.

cshape

Return channel shape.

domain

The domain the data is stored in.

dtype

The data type of the audio object.

freq

Return the data in the frequency domain.

frequencies

Frequencies of the discrete signal spectrum.

n_bins

Number of frequency bins.

__getitem__(key)

Get copied slice of the audio object at key.

Examples

Get the first channel of a multichannel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]

__init__(data, frequencies, comment=None, dtype=complex)

Create FrequencyData with data, and frequencies.

Parameters
  • data (array, double) – Raw data in the frequency domain. The memory layout of data is ‘C’. E.g. data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 frequency bins each.

  • frequencies (array, double) – Frequencies of the data in Hz. The number of frequencies must match the size of the last dimension of data.

  • comment (str, optional) – A comment related to the data. The default is None.

  • dtype (string, optional) – Raw data type of the audio object. The default is complex.

Notes

FrequencyData objects do not support an FFT norm, because this requires knowledge about the sampling rate or the number of samples of the time signal [1].

References

[1] J. Ahrens, C. Andersson, P. Höstmad, and W. Kropp, “Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals,” Vienna, Austria, May 2020, p. e-Brief 600.

__setitem__(key, value)

Set channels of audio object at key.

Examples

Set the first channel of a multichannel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)

property comment

Get comment.

copy()

Return a copy of the audio object.

property cshape

Return channel shape.

The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.
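As a plain NumPy sketch of this definition (the array sizes are made up for illustration, and this is not pyfar's implementation), the channel shape is the data shape without its last dimension:

```python
import numpy as np

# Hypothetical raw data: 3 x 2 channels with 1024 frequency bins each
data = np.zeros((3, 2, 1024), dtype=complex)

cshape = data.shape[:-1]   # channel shape: everything but the last dimension
n_bins = data.shape[-1]    # size of the last dimension
```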

property domain

The domain the data is stored in.

property dtype

The data type of the audio object. This can be any data type and precision supported by numpy.

find_nearest_frequency(value)

Return the index that is closest to the query frequency.

Parameters

value (float, array-like) – The frequencies for which the indices are to be returned

Returns

indices – The index for the given frequency. If the input was an array like, a numpy array of indices is returned.

Return type

int, array-like
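The lookup can be pictured as a search for the smallest absolute distance between the query and the available frequencies. The following is a minimal NumPy sketch; nearest_index is a hypothetical helper and not necessarily how pyfar implements the method:

```python
import numpy as np

def nearest_index(axis_values, query):
    """Index of the entry in axis_values closest to each query value."""
    axis_values = np.asarray(axis_values, dtype=float)
    query = np.asarray(query, dtype=float)
    # distances between every query value and every axis value
    distance = np.abs(axis_values - query[..., np.newaxis])
    index = np.argmin(distance, axis=-1)
    # scalar query -> plain int, array-like query -> numpy array
    return int(index) if index.ndim == 0 else index

frequencies = [0, 100, 250, 1000]            # non-equidistant bins in Hz
single = nearest_index(frequencies, 240)
multiple = nearest_index(frequencies, [90, 900])
```

A scalar query yields a single integer, an array-like query yields an array of indices, which mirrors the return types listed above.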

flatten()

Return flattened copy of the audio object.

Returns

flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)

Return type

Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.

property freq

Return the data in the frequency domain.

property frequencies

Frequencies of the discrete signal spectrum.

property n_bins

Number of frequency bins.

reshape(newshape)

Return reshaped copy of the audio object.

Parameters

newshape (int, tuple) – new cshape of the audio object. One entry of newshape dimension can be -1. In this case, the value is inferred from the remaining dimensions.

Returns

reshaped – reshaped copy of the audio object.

Return type

Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same.
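How a -1 entry is resolved can be illustrated with numpy.reshape, which follows the same convention. This sketch works on a bare array with made-up sizes and is not pyfar's implementation:

```python
import numpy as np

data = np.zeros((4, 3, 512))      # cshape = (4, 3), 512 samples or bins
newshape = (2, -1)                # the -1 entry is inferred as 6

# append the untouched last dimension to the requested channel shape
reshaped = data.reshape(newshape + (data.shape[-1],))

# flattening is the special case newshape = (-1, )
flat = data.reshape((-1,) + (data.shape[-1],))
```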

class pyfar.classes.audio.Signal(data, sampling_rate, n_samples=None, domain='time', fft_norm='none', comment=None, dtype=numpy.float64)

Class for audio signals.

Objects of this class contain data which is directly convertible between time and frequency domain (equally spaced samples and frequency bins).

Methods:

__getitem__(key)

Get copied slice of the audio object at key.

__init__(data, sampling_rate[, n_samples, ...])

Create Signal with data, and sampling rate.

__iter__()

Iterator for Signal objects.

__setitem__(key, value)

Set channels of audio object at key.

copy()

Return a copy of the audio object.

find_nearest_frequency(value)

Return the index that is closest to the query frequency.

find_nearest_time(value)

Return the index that is closest to the query time.

flatten()

Return flattened copy of the audio object.

reshape(newshape)

Return reshaped copy of the audio object.

Attributes:

comment

Get comment.

cshape

Return channel shape.

domain

The domain the data is stored in.

dtype

The data type of the audio object.

fft_norm

The normalization for the Discrete Fourier Transform (DFT).

freq

Return the normalized frequency domain data.

freq_raw

Return the frequency domain data without normalization.

frequencies

Frequencies of the discrete signal spectrum.

n_bins

Number of frequency bins.

n_samples

The number of samples.

sampling_rate

The sampling rate of the signal.

signal_length

The length of the data in seconds.

signal_type

The signal type is 'energy' if fft_norm is 'none' and 'power' otherwise.

time

Return the data in the time domain.

times

Time instances the signal is sampled at.

__getitem__(key)

Get copied slice of the audio object at key.

Examples

Get the first channel of a multichannel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]

__init__(data, sampling_rate, n_samples=None, domain='time', fft_norm='none', comment=None, dtype=numpy.float64)

Create Signal with data, and sampling rate.

Parameters
  • data (ndarray, double) – Raw data of the signal in the time or frequency domain. The memory layout of data is ‘C’. E.g. data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 samples or frequency bins each. Frequency data must be provided as single sided spectra, i.e., for all frequencies between 0 Hz and half the sampling rate.

  • sampling_rate (double) – Sampling rate in Hz.

  • n_samples (int, optional) – Number of samples of the time signal. Only relevant if domain is 'freq', because a single-sided spectrum does not reveal whether the time signal has an even or odd number of samples. The default is None, which assumes an even number of samples if the data is provided in the frequency domain.

  • domain ('time', 'freq', optional) – Domain of data. The default is 'time'.

  • fft_norm (str, optional) – The normalization of the Discrete Fourier Transform (DFT). Can be 'none', 'unitary', 'amplitude', 'rms', 'power', or 'psd'. See normalization and [2] for more information. The default is 'none', which is typically used for energy signals, such as impulse responses.

  • comment (str, optional) – A comment related to data. The default is None.

  • dtype (string, optional) – Raw data type of the audio object. The default is float64.

References

[2] J. Ahrens, C. Andersson, P. Höstmad, and W. Kropp, “Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals,” Vienna, Austria, May 2020, p. e-Brief 600.

__iter__()

Iterator for Signal objects.

Iterate across the first dimension of a Signal. The actual iteration is handled through numpy’s array iteration.

Examples

Iterate channels of a Signal

>>> import pyfar as pf
>>> signal = pf.signals.impulse(2, amplitude=[1, 1, 1])
>>> for idx, channel in enumerate(signal):
...     channel.time *= idx
...     signal[idx] = channel

__setitem__(key, value)

Set channels of audio object at key.

Examples

Set the first channel of a multichannel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)

property comment

Get comment.

copy()

Return a copy of the audio object.

property cshape

Return channel shape.

The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.

property domain

The domain the data is stored in.

property dtype

The data type of the audio object. This can be any data type and precision supported by numpy.

property fft_norm

The normalization for the Discrete Fourier Transform (DFT).

See normalization and FFT concepts for more information.

find_nearest_frequency(value)

Return the index that is closest to the query frequency.

Parameters

value (float, array-like) – The frequencies for which the indices are to be returned

Returns

indices – The index for the given frequency. If the input was an array like, a numpy array of indices is returned.

Return type

int, array-like

find_nearest_time(value)

Return the index that is closest to the query time.

Parameters

value (float, array-like) – The times for which the indices are to be returned

Returns

indices – The index for the given time instance. If the input was an array like, a numpy array of indices is returned.

Return type

int, array-like

flatten()

Return flattened copy of the audio object.

Returns

flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)

Return type

Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.

property freq

Return the normalized frequency domain data.

The normalized data is usually used for inspecting the data, e.g., using plots or when extracting information such as the amplitude of harmonic components. Most processing operations, e.g., frequency domain convolution, require the non-normalized data stored as freq_raw.

property freq_raw

Return the frequency domain data without normalization.

Most processing operations, e.g., frequency domain convolution, require the non-normalized data. The normalized data stored as freq is usually used for inspecting the data, e.g., using plots or when extracting information such as the amplitude of harmonic components.
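The difference between raw and normalized spectra can be sketched with NumPy alone. The example assumes an 'amplitude'-style normalization that divides by the number of samples and doubles all bins except 0 Hz and the Nyquist frequency to account for the single-sided spectrum; this is an assumption for illustration, see normalization for the definitions pyfar actually applies:

```python
import numpy as np

n_samples = 64
sampling_rate = 64                       # Hz, so that bin k sits at k Hz
times = np.arange(n_samples) / sampling_rate
sine = np.sin(2 * np.pi * 4 * times)     # 4 Hz sine with amplitude 1

# raw single-sided spectrum, as needed for processing
raw = np.fft.rfft(sine)

# assumed amplitude normalization, convenient for inspection
normalized = raw / n_samples
normalized[1:-1] *= 2                    # skip 0 Hz and Nyquist (even n_samples)

raw_peak = abs(raw[4])                   # scales with n_samples
normalized_peak = abs(normalized[4])     # equals the sine amplitude
```

The raw peak grows with the signal length, while the normalized peak directly shows the amplitude of the harmonic component.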

property frequencies

Frequencies of the discrete signal spectrum.

property n_bins

Number of frequency bins.

property n_samples

The number of samples.
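For single-sided spectra, as required for frequency domain data of a Signal, the number of bins does not determine the number of samples uniquely. A short NumPy sketch using numpy.fft.rfft (not pyfar code; n_bins_from_n_samples is a hypothetical helper) shows that 8 and 9 samples both yield 5 bins:

```python
import numpy as np

def n_bins_from_n_samples(n_samples):
    """Number of bins of a single-sided spectrum."""
    return n_samples // 2 + 1

even = np.fft.rfft(np.ones(8))    # 8 samples
odd = np.fft.rfft(np.ones(9))     # 9 samples

# going back from bins to samples needs extra information;
# assuming an even number of samples gives 8 in both cases
assumed_n_samples = 2 * (even.size - 1)
```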

reshape(newshape)

Return reshaped copy of the audio object.

Parameters

newshape (int, tuple) – new cshape of the audio object. One entry of newshape dimension can be -1. In this case, the value is inferred from the remaining dimensions.

Returns

reshaped – reshaped copy of the audio object.

Return type

Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same.

property sampling_rate

The sampling rate of the signal.

property signal_length

The length of the data in seconds.

property signal_type

The signal type is 'energy' if fft_norm is 'none' and 'power' otherwise.

property time

Return the data in the time domain.

property times

Time instances the signal is sampled at.
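For equidistant sampling, the time and frequency axes follow from the sampling rate and the number of samples. A minimal NumPy sketch with example values, meant as an illustration and not necessarily pyfar's exact implementation:

```python
import numpy as np

sampling_rate = 44100    # Hz, example value
n_samples = 8            # example value

# time instances in seconds, starting at 0
times = np.arange(n_samples) / sampling_rate

# frequencies of the single-sided spectrum in Hz, from 0 Hz to half the sampling rate
frequencies = np.fft.rfftfreq(n_samples, 1 / sampling_rate)
```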

class pyfar.classes.audio.TimeData(data, times, comment=None, dtype=numpy.float64)

Class for time data.

Objects of this class contain time data which is not directly convertible to the frequency domain, i.e., non-equidistant samples.
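Whether a set of sampling times is equidistant, and could thus be described by a single sampling rate, can be checked with a few lines of NumPy. is_equidistant is a hypothetical helper for illustration and not part of pyfar:

```python
import numpy as np

def is_equidistant(times):
    """True if all neighboring time instances have the same spacing."""
    steps = np.diff(np.asarray(times, dtype=float))
    return bool(np.allclose(steps, steps[0]))

regular = is_equidistant([0.0, 0.1, 0.2, 0.3])      # constant step of 0.1 s
irregular = is_equidistant([0.0, 0.1, 0.25, 0.7])   # varying steps
```

Data sampled at times like the second list is the typical use case for TimeData.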

Methods:

__getitem__(key)

Get copied slice of the audio object at key.

__init__(data, times[, comment, dtype])

Create TimeData object with data, and times.

__setitem__(key, value)

Set channels of audio object at key.

copy()

Return a copy of the audio object.

find_nearest_time(value)

Return the index that is closest to the query time.

flatten()

Return flattened copy of the audio object.

reshape(newshape)

Return reshaped copy of the audio object.

Attributes:

comment

Get comment.

cshape

Return channel shape.

domain

The domain the data is stored in.

dtype

The data type of the audio object.

n_samples

The number of samples.

signal_length

The length of the data in seconds.

time

Return the data in the time domain.

times

Time in seconds at which the signal is sampled.

__getitem__(key)

Get copied slice of the audio object at key.

Examples

Get the first channel of a multichannel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]

__init__(data, times, comment=None, dtype=numpy.float64)

Create TimeData object with data, and times.

Parameters
  • data (array, double) – Raw data in the time domain. The memory layout of data is ‘C’. E.g. data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 samples each.

  • times (array, double) – Times in seconds at which the data is sampled. The number of times must match the size of the last dimension of data.

  • comment (str, optional) – A comment related to data. The default is None.

  • dtype (string, optional) – Raw data type of the audio object. The default is float64.

__setitem__(key, value)

Set channels of audio object at key.

Examples

Set the first channel of a multichannel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)

property comment

Get comment.

copy()

Return a copy of the audio object.

property cshape

Return channel shape.

The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.

property domain

The domain the data is stored in.

property dtype

The data type of the audio object. This can be any data type and precision supported by numpy.

find_nearest_time(value)

Return the index that is closest to the query time.

Parameters

value (float, array-like) – The times for which the indices are to be returned

Returns

indices – The index for the given time instance. If the input was an array like, a numpy array of indices is returned.

Return type

int, array-like

flatten()

Return flattened copy of the audio object.

Returns

flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)

Return type

Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.

property n_samples

The number of samples.

reshape(newshape)

Return reshaped copy of the audio object.

Parameters

newshape (int, tuple) – new cshape of the audio object. One entry of newshape dimension can be -1. In this case, the value is inferred from the remaining dimensions.

Returns

reshaped – reshaped copy of the audio object.

Return type

Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same.

property signal_length

The length of the data in seconds.

property time

Return the data in the time domain.

property times

Time in seconds at which the signal is sampled.

pyfar.classes.audio.add(data: tuple, domain='freq')

Add pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters
  • data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be added. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.

  • domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See pyfar.dsp.fft.normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array
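The rule for the fft_norm of the result can be written down in a few lines of plain Python. resulting_fft_norm is a hypothetical helper that only mirrors the sentence above; it does not check whether a combination of normalizations is allowed (see arithmetic operations for that):

```python
def resulting_fft_norm(fft_norms):
    """Return the first norm that is not 'none', or 'none' if there is none."""
    return next((norm for norm in fft_norms if norm != 'none'), 'none')

all_none = resulting_fft_norm(['none', 'none'])
mixed = resulting_fft_norm(['none', 'rms', 'none'])
```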

pyfar.classes.audio.divide(data: tuple, domain='freq')

Divide pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters
  • data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be divided. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.

  • domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array

pyfar.classes.audio.multiply(data: tuple, domain='freq')

Multiply pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters
  • data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be multiplied. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.

  • domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array
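The choice of domain matters for multiplication: a product of spectra corresponds to a circular convolution of the time signals, whereas a product in the time domain is sample-wise. The following NumPy sketch on bare arrays (no pyfar objects and no FFT normalization involved) illustrates the difference:

```python
import numpy as np

x = np.array([1.0, 2.0, 0.0, 0.0])
h = np.array([1.0, -1.0, 0.0, 0.0])

# multiplication in the frequency domain, transformed back to the time domain:
# equals the circular convolution of x and h
freq_product = np.fft.irfft(np.fft.rfft(x) * np.fft.rfft(h), n=x.size)

# multiplication in the time domain: sample-wise product
time_product = x * h
```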

pyfar.classes.audio.power(data: tuple, domain='freq')

Power of pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters
  • data (tuple of the form (data_1, data_2, ..., data_N)) – The base for which the power is calculated. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.

  • domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array

pyfar.classes.audio.subtract(data: tuple, domain='freq')

Subtract pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters
  • data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be subtracted. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.

  • domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array
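For tuples with more than two entries, the order of evaluation matters for non-commutative operations such as subtraction. The sketch below assumes that the entries are processed from left to right, which is not stated explicitly above, and uses plain Python numbers instead of audio objects:

```python
from functools import reduce
import operator

data = (10, 3, 2)

# left-to-right evaluation: (10 - 3) - 2
result = reduce(operator.sub, data)
```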