Audio (Signal, TimeData, FrequencyData)¶

The following documents the audio classes and arithmetic operations for audio data. More details and background are given in the concepts (audio classes, Fourier transform, arithmetic operations).

Classes:

`FrequencyData`(data, frequencies[, comment, ...])	Class for frequency data.
`Signal`(data, sampling_rate[, n_samples, ...])	Class for audio signals.
`TimeData`(data, times[, comment, dtype])	Class for time data.

Functions:

`add`(data[, domain])	Add pyfar audio objects, array likes, and scalars.
`divide`(data[, domain])	Divide pyfar audio objects, array likes, and scalars.
`multiply`(data[, domain])	Multiply pyfar audio objects, array likes, and scalars.
`power`(data[, domain])	Power of pyfar audio objects, array likes, and scalars.
`subtract`(data[, domain])	Subtract pyfar audio objects, array likes, and scalars.

class pyfar.classes.audio.FrequencyData(data, frequencies, comment=None, dtype=<class 'complex'>)[source]¶

Class for frequency data.

Objects of this class contain frequency data which is not directly convertible to the time domain, i.e., non-equidistantly spaced bins or incomplete spectra.

Methods:

`__getitem__`(key)	Get copied slice of the audio object at key.
`__init__`(data, frequencies[, comment, dtype])	Create FrequencyData with data, and frequencies.
`__setitem__`(key, value)	Set channels of audio object at key.
`copy`()	Return a copy of the audio object.
`find_nearest_frequency`(value)	Return the index that is closest to the query frequency.
`flatten`()	Return flattened copy of the audio object.
`reshape`(newshape)	Return reshaped copy of the audio object.

Attributes:

`comment`	Get comment.
`cshape`	Return channel shape.
`domain`	The domain the data is stored in.
`dtype`	The data type of the audio object.
`freq`	Return the data in the frequency domain.
`frequencies`	Frequencies of the discrete signal spectrum.
`n_bins`	Number of frequency bins.

__getitem__(key)¶

Get copied slice of the audio object at key.

Examples

Get the first channel of a multi channel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]

__init__(data, frequencies, comment=None, dtype=<class 'complex'>)[source]¶

Create FrequencyData with data, and frequencies.

Parameters

data (array, double) – Raw data in the frequency domain. The memory layout of data is ‘C’. E.g., data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 frequency bins each.
frequencies (array, double) – Frequencies of the data in Hz. The number of frequencies must match the size of the last dimension of data.
comment (str, optional) – A comment related to the data. The default is None.
dtype (string, optional) – Raw data type of the audio object. The default is complex.

Notes

FrequencyData objects do not support an FFT norm, because this requires knowledge about the sampling rate or the number of samples of the time signal [1].

References

[1] J. Ahrens, C. Andersson, P. Höstmad, and W. Kropp, “Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals,” Vienna, Austria, May 2020, p. e-Brief 600.

__setitem__(key, value)¶

Set channels of audio object at key.

Examples

Set the first channel of a multi channel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)

property comment¶: Get comment.

copy()¶: Return a copy of the audio object.

property cshape¶

Return channel shape.

The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.
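
As a rough illustration with plain NumPy (not pyfar itself), the channel shape is the shape of the data array without its last axis. The array `data` below is a hypothetical stand-in for the raw data of an audio object.

```python
import numpy as np

# hypothetical raw data: 3 x 2 channels with 1024 frequency bins each
data = np.zeros((3, 2, 1024), dtype=complex)

# channel shape: all dimensions except the last one
cshape = data.shape[:-1]
# size of the last dimension (n_bins for frequency domain data)
n_bins = data.shape[-1]

print(cshape, n_bins)
```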

property domain¶: The domain the data is stored in.

property dtype¶: The data type of the audio object. This can be any data type and precision supported by numpy.

find_nearest_frequency(value)[source]¶

Return the index that is closest to the query frequency.

Parameters: value (float, array-like) – The frequencies for which the indices are to be returned
Returns: indices – The index for the given frequency. If the input was an array like, a numpy array of indices is returned.
Return type: int, array-like
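
The lookup described above can be sketched with NumPy as a search for the smallest absolute distance. The helper `nearest_index` is a hypothetical name used for illustration and is not necessarily how pyfar implements the method.

```python
import numpy as np

def nearest_index(grid, value):
    """Return the index (or indices) of the entries in `grid` closest to `value`."""
    grid = np.asarray(grid, dtype=float)
    value = np.asarray(value, dtype=float)
    # distance between every query value and every grid point
    distance = np.abs(grid - value[..., np.newaxis])
    index = np.argmin(distance, axis=-1)
    # return a plain int for scalar queries, an array otherwise
    return int(index) if index.ndim == 0 else index

frequencies = [100., 200., 400., 800.]   # non-equidistant bins in Hz
single = nearest_index(frequencies, 390)
several = nearest_index(frequencies, [90, 750])
```

A scalar query returns a single integer, while an array-like query returns one index per entry, matching the return types listed above.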

flatten()¶

Return flattened copy of the audio object.

Returns: flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)
Return type: Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.
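
The note above can be reproduced on a bare NumPy array. This is a minimal sketch under the assumption that flattening merges all channel dimensions and leaves the last dimension untouched. It does not use pyfar.

```python
import numpy as np

# hypothetical time data with cshape (4, 3) and 512 samples
data = np.zeros((4, 3, 512))

# merge all channel dimensions, keep the last (sample) dimension
flat = data.reshape(-1, data.shape[-1])

print(flat.shape[:-1], flat.shape[-1])
```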

property freq¶: Return the data in the frequency domain.

property frequencies¶: Frequencies of the discrete signal spectrum.

property n_bins¶: Number of frequency bins.

reshape(newshape)¶

Return reshaped copy of the audio object.

Parameters: newshape (int, tuple) – new cshape of the audio object. One entry of newshape dimension can be -1. In this case, the value is inferred from the remaining dimensions.
Returns: reshaped – reshaped copy of the audio object.
Return type: Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same.
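
A minimal NumPy sketch of the same idea, assuming that `newshape` only describes the channel dimensions and that the last dimension is appended unchanged. The variable names are illustrative and the code does not call pyfar.

```python
import numpy as np

data = np.zeros((4, 3, 512))       # cshape (4, 3), 512 samples
newshape = (2, -1)                 # requested channel shape, one entry inferred

# append the untouched last dimension to the new channel shape
reshaped = data.reshape(newshape + (data.shape[-1],))

print(reshaped.shape)
```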

class pyfar.classes.audio.Signal(data, sampling_rate, n_samples=None, domain='time', fft_norm='none', comment=None, dtype=<class 'numpy.float64'>)[source]¶

Class for audio signals.

Objects of this class contain data which is directly convertible between time and frequency domain (equally spaced samples and frequency bins).

Methods:

`__getitem__`(key)	Get copied slice of the audio object at key.
`__init__`(data, sampling_rate[, n_samples, ...])	Create Signal with data, and sampling rate.
`__iter__`()	Iterator for `Signal` objects.
`__setitem__`(key, value)	Set channels of audio object at key.
`copy`()	Return a copy of the audio object.
`find_nearest_frequency`(value)	Return the index that is closest to the query frequency.
`find_nearest_time`(value)	Return the index that is closest to the query time.
`flatten`()	Return flattened copy of the audio object.
`reshape`(newshape)	Return reshaped copy of the audio object.

Attributes:

`comment`	Get comment.
`cshape`	Return channel shape.
`domain`	The domain the data is stored in.
`dtype`	The data type of the audio object.
`fft_norm`	The normalization for the Discrete Fourier Transform (DFT).
`freq`	Return the normalized frequency domain data.
`freq_raw`	Return the frequency domain data without normalization.
`frequencies`	Frequencies of the discrete signal spectrum.
`n_bins`	Number of frequency bins.
`n_samples`	The number of samples.
`sampling_rate`	The sampling rate of the signal.
`signal_length`	The length of the data in seconds.
`signal_type`	The signal type is `'energy'` if `fft_norm = 'none'` and `'power'` otherwise.
`time`	Return the data in the time domain.
`times`	Time instances the signal is sampled at.

__getitem__(key)¶

Get copied slice of the audio object at key.

Examples

Get the first channel of a multi channel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]

__init__(data, sampling_rate, n_samples=None, domain='time', fft_norm='none', comment=None, dtype=<class 'numpy.float64'>)[source]¶

Create Signal with data, and sampling rate.

Parameters

data (ndarray, double) – Raw data of the signal in the time or frequency domain. The memory layout of data is ‘C’. E.g. data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 samples or frequency bins each. Frequency data must be provided as single sided spectra, i.e., for all frequencies between 0 Hz and half the sampling rate.
sampling_rate (double) – Sampling rate in Hz
n_samples (int, optional) – Number of samples of the time signal. Only relevant if domain is 'freq'. The default is None, in which case an even number of samples is assumed if the data is provided in the frequency domain.
domain ('time', 'freq', optional) – Domain of data. The default is 'time'
fft_norm (str, optional) – The normalization of the Discrete Fourier Transform (DFT). Can be 'none', 'unitary', 'amplitude', 'rms', 'power', or 'psd'. See normalization and [2] for more information. The default is 'none', which is typically used for energy signals, such as impulse responses.
comment (str, optional) – A comment related to data. The default is None.
dtype (string, optional) – Raw data type of the audio object. The default is float64.

References

[2] J. Ahrens, C. Andersson, P. Höstmad, and W. Kropp, “Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals,” Vienna, Austria, May 2020, p. e-Brief 600.
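
The role of n_samples for frequency domain input can be illustrated with NumPy's real-valued FFT. This sketch does not use pyfar. It only shows why the number of bins of a single sided spectrum does not determine the number of samples on its own.

```python
import numpy as np

even = np.arange(8, dtype=float)   # 8 samples
odd = np.arange(9, dtype=float)    # 9 samples

# single sided spectra: bins from 0 Hz up to half the sampling rate
spec_even = np.fft.rfft(even)
spec_odd = np.fft.rfft(odd)

# both spectra have 5 bins, so the bin count alone is ambiguous
print(spec_even.shape[-1], spec_odd.shape[-1])

# without n, irfft assumes an even length of 2 * (n_bins - 1) = 8 samples
assumed_even = np.fft.irfft(spec_odd)
# passing the number of samples recovers the odd-length signal
recovered = np.fft.irfft(spec_odd, n=9)
```

Passing the number of samples explicitly is therefore the safe choice whenever the time signal may have an odd length.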

__iter__()[source]¶

Iterator for Signal objects.

Iterate across the first dimension of a Signal. The actual iteration is handled through numpy’s array iteration.

Examples

Iterate channels of a Signal

>>> import pyfar as pf
>>> signal = pf.signals.impulse(2, amplitude=[1, 1, 1])
>>> for idx, channel in enumerate(signal):
...     channel.time *= idx
...     signal[idx] = channel

__setitem__(key, value)¶

Set channels of audio object at key.

Examples

Set the first channel of a multi channel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)

property comment¶: Get comment.

copy()¶: Return a copy of the audio object.

property cshape¶

Return channel shape.

The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.

property domain¶: The domain the data is stored in.

property dtype¶: The data type of the audio object. This can be any data type and precision supported by numpy.

property fft_norm¶

The normalization for the Discrete Fourier Transform (DFT).

See normalization and FFT concepts for more information.

find_nearest_frequency(value)¶

Return the index that is closest to the query frequency.

Parameters: value (float, array-like) – The frequencies for which the indices are to be returned
Returns: indices – The index for the given frequency. If the input was an array like, a numpy array of indices is returned.
Return type: int, array-like

find_nearest_time(value)¶

Return the index that is closest to the query time.

Parameters: value (float, array-like) – The times for which the indices are to be returned
Returns: indices – The index for the given time instance. If the input was an array like, a numpy array of indices is returned.
Return type: int, array-like

flatten()¶

Return flattened copy of the audio object.

Returns: flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)
Return type: Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.

property freq¶

Return the normalized frequency domain data.

The normalized data is usually used for inspecting the data, e.g., using plots or when extracting information such as the amplitude of harmonic components. Most processing operations, e.g., frequency domain convolution, require the non-normalized data stored as freq_raw.

property freq_raw¶

Return the frequency domain data without normalization.

Most processing operations, e.g., frequency domain convolution, require the non-normalized data. The normalized data stored as freq is usually used for inspecting the data, e.g., using plots or when extracting information such as the amplitude of harmonic components.
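
The difference between raw and normalized spectra can be sketched with NumPy. The scaling below is an assumption made for illustration: an amplitude-type normalization that divides the single sided spectrum by the number of samples and doubles all bins except 0 Hz and the Nyquist bin. It is not a copy of pyfar's code. See the normalization documentation for the exact definitions.

```python
import numpy as np

n_samples = 64
n = np.arange(n_samples)
# sine with amplitude 1 and exactly 4 periods inside the signal
sine = np.sin(2 * np.pi * 4 * n / n_samples)

# raw single sided spectrum without normalization
freq_raw = np.fft.rfft(sine)

# assumed amplitude-type normalization: scale by 1 / n_samples and
# double all bins except 0 Hz and the Nyquist bin (even n_samples)
freq = freq_raw / n_samples
freq[1:-1] *= 2

print(abs(freq_raw[4]), abs(freq[4]))
```

The raw magnitude depends on the signal length, whereas the normalized magnitude directly shows the amplitude of the sine, which is why normalized data is convenient for inspection.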

property frequencies¶: Frequencies of the discrete signal spectrum.

property n_bins¶: Number of frequency bins.

property n_samples¶: The number of samples.

reshape(newshape)¶

Return reshaped copy of the audio object.

Parameters: newshape (int, tuple) – new cshape of the audio object. One entry of newshape dimension can be -1. In this case, the value is inferred from the remaining dimensions.
Returns: reshaped – reshaped copy of the audio object.
Return type: Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same.

property sampling_rate¶: The sampling rate of the signal.

property signal_length¶: The length of the data in seconds.

property signal_type¶: The signal type is 'energy' if fft_norm = 'none' and 'power' otherwise.

property time¶: Return the data in the time domain.

property times¶: Time instances the signal is sampled at.
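
For equally spaced data, the time instances and frequency bins follow from the sampling rate and the number of samples. The following NumPy sketch shows the usual relations. It is an illustration and not pyfar's implementation.

```python
import numpy as np

sampling_rate = 1000   # Hz
n_samples = 8

# equally spaced sampling times in seconds
times = np.arange(n_samples) / sampling_rate
# equally spaced bins of the single sided spectrum in Hz
frequencies = np.fft.rfftfreq(n_samples, d=1 / sampling_rate)

print(times)
print(frequencies)
```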

class pyfar.classes.audio.TimeData(data, times, comment=None, dtype=<class 'numpy.float64'>)[source]¶

Class for time data.

Objects of this class contain time data which is not directly convertible to frequency domain, i.e., non-equidistant samples.
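
Whether a set of sampling times is equidistant can be checked with a few lines of NumPy. The helper `is_equidistant` is a hypothetical function for illustration and is not part of pyfar.

```python
import numpy as np

def is_equidistant(times):
    """Check whether consecutive sampling times have a constant spacing."""
    steps = np.diff(np.asarray(times, dtype=float))
    return bool(np.allclose(steps, steps[0]))

regular = is_equidistant([0.0, 0.1, 0.2, 0.3])
irregular = is_equidistant([0.0, 0.1, 0.25, 0.6])

print(regular, irregular)
```

Data like the second example cannot be transformed with a standard FFT and is the kind of data this class is intended for.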

Methods:

`__getitem__`(key)	Get copied slice of the audio object at key.
`__init__`(data, times[, comment, dtype])	Create TimeData object with data, and times.
`__setitem__`(key, value)	Set channels of audio object at key.
`copy`()	Return a copy of the audio object.
`find_nearest_time`(value)	Return the index that is closest to the query time.
`flatten`()	Return flattened copy of the audio object.
`reshape`(newshape)	Return reshaped copy of the audio object.

Attributes:

`comment`	Get comment.
`cshape`	Return channel shape.
`domain`	The domain the data is stored in.
`dtype`	The data type of the audio object.
`n_samples`	The number of samples.
`signal_length`	The length of the data in seconds.
`time`	Return the data in the time domain.
`times`	Time in seconds at which the signal is sampled.

__getitem__(key)¶

Get copied slice of the audio object at key.

Examples

Get the first channel of a multi channel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]

__init__(data, times, comment=None, dtype=<class 'numpy.float64'>)[source]¶

Create TimeData object with data, and times.

Parameters

data (array, double) – Raw data in the time domain. The memory layout of data is ‘C’. E.g. data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 samples each.
times (array, double) – Times in seconds at which the data is sampled. The number of times must match the size of the last dimension of data.
comment (str, optional) – A comment related to data. The default is None.
dtype (string, optional) – Raw data type of the audio object. The default is float64.

__setitem__(key, value)¶

Set channels of audio object at key.

Examples

Set the first channel of a multi channel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)

property comment¶: Get comment.

copy()¶: Return a copy of the audio object.

property cshape¶

Return channel shape.

The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.

property domain¶: The domain the data is stored in.

property dtype¶: The data type of the audio object. This can be any data type and precision supported by numpy.

find_nearest_time(value)[source]¶

Return the index that is closest to the query time.

Parameters: value (float, array-like) – The times for which the indices are to be returned
Returns: indices – The index for the given time instance. If the input was an array like, a numpy array of indices is returned.
Return type: int, array-like

flatten()¶

Return flattened copy of the audio object.

Returns: flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)
Return type: Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.

property n_samples¶: The number of samples.

reshape(newshape)¶

Return reshaped copy of the audio object.

Parameters: newshape (int, tuple) – new cshape of the audio object. One entry of newshape dimension can be -1. In this case, the value is inferred from the remaining dimensions.
Returns: reshaped – reshaped copy of the audio object.
Return type: Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same.

property signal_length¶: The length of the data in seconds.

property time¶: Return the data in the time domain.

property times¶: Time in seconds at which the signal is sampled.

pyfar.classes.audio.add(data: tuple, domain='freq')[source]¶

Add pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters

data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be added. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See pyfar.dsp.fft.normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array
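
The rule for the fft_norm of the result, as stated under Returns, can be written as a short plain Python function. `resulting_fft_norm` is a hypothetical helper that only mirrors the sentence above. It does not check whether a combination of normalizations is allowed.

```python
def resulting_fft_norm(fft_norms):
    """Return 'none' if all norms are 'none', else the first other norm."""
    for norm in fft_norms:
        if norm != 'none':
            return norm
    return 'none'

all_none = resulting_fft_norm(['none', 'none'])
mixed = resulting_fft_norm(['none', 'rms', 'power'])

print(all_none, mixed)
```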

pyfar.classes.audio.divide(data: tuple, domain='freq')[source]¶

Divide pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters

data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be divided. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array

pyfar.classes.audio.multiply(data: tuple, domain='freq')[source]¶

Multiply pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters

data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be multiplied. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array
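
The choice of domain changes the result of a multiplication. The NumPy sketch below does not use pyfar. It compares a sample-wise product in the time domain with a bin-wise product of the spectra, which corresponds to a circular convolution of the time signals.

```python
import numpy as np

a = np.array([1., 2., 0., 0.])
b = np.array([1., 0., 3., 0.])

# time domain: sample-wise product
product_time = a * b

# frequency domain: bin-wise product of the spectra, which corresponds
# to a circular convolution of the two time signals
product_freq = np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=a.size)

# circular convolution computed directly for comparison
circular = np.array([
    sum(a[m] * b[(k - m) % a.size] for m in range(a.size))
    for k in range(a.size)])

print(product_time)
print(product_freq)
```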

pyfar.classes.audio.power(data: tuple, domain='freq')[source]¶

Power of pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters

data (tuple of the form (data_1, data_2, ..., data_N)) – The base for which the power is calculated. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array

pyfar.classes.audio.subtract(data: tuple, domain='freq')[source]¶

Subtract pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters

data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be subtracted. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array