Audio (Signal, TimeData, FrequencyData)¶

Container classes and arithmetic operations for audio data.

The classes TimeData and FrequencyData are intended to store incomplete or non-equidistant audio data in the time and frequency domain. The class Signal can be used to store equidistant and complete audio data that can be converted between the time and frequency domain by means of the Fourier transform.

Arithmetic operations can be applied in the time and frequency domain and are implemented in the methods add, subtract, multiply, divide, and power. For example, two Signal, TimeData, or FrequencyData instances can be added in the time domain by

>>> result = pyfar.classes.audio.add((signal_1, signal_2), 'time')

and in the frequency domain by

>>> result = pyfar.classes.audio.add((signal_1, signal_2), 'freq')

This also works with more than two instances and supports array likes and scalar values, e.g.,

>>> result = pyfar.classes.audio.add((signal_1, 1), 'time')

In this case the scalar 1 is broadcast, i.e., it is added to every sample of signal_1 (or to every bin in case of a frequency domain operation).
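The difference between broadcasting a scalar in the time domain and in the frequency domain can be sketched with plain NumPy. This is an illustration only and not pyfar's implementation: it assumes an unnormalized single-sided spectrum (comparable to fft_norm 'none'), and the variable names are made up for the example.

```python
import numpy as np

n_samples = 8
x = np.arange(n_samples, dtype=float)  # toy time signal

# Time domain: the scalar is added to every sample.
time_result = x + 1

# Frequency domain: the scalar is added to every bin of the
# single-sided spectrum before transforming back to the time domain.
spectrum = np.fft.rfft(x)
freq_result = np.fft.irfft(spectrum + 1, n=n_samples)

# Adding a constant to all bins is the same as adding a unit impulse
# at the first sample of the time signal.
impulse = np.zeros(n_samples)
impulse[0] = 1.0
```

The two results differ: the time domain operation offsets every sample, whereas the frequency domain operation changes only the first sample.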

The operators +, -, *, /, and ** are overloaded for convenience. Note, however, that their behavior depends on the Audio object. Frequency domain operations are applied for Signal and FrequencyData objects, i.e.,

>>> result = signal1 + signal2

is equivalent to

>>> result = pyfar.classes.audio.add((signal1, signal2), 'freq')

Time domain operations are applied for TimeData objects, i.e.,

>>> result = time_data_1 + time_data_2

is equivalent to

>>> result = pyfar.classes.audio.add((time_data_1, time_data_2), 'time')

In addition to the arithmetic operations, the equality operator is overloaded to allow comparisons

>>> signal_1 == signal_2
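How overloaded operators can map to a domain-specific operation is sketched below with a toy container. ToyTimeData is a hypothetical stand-in written for this example; it is not a pyfar class and does not reproduce pyfar's checks or metadata handling.

```python
import numpy as np


class ToyTimeData:
    """Hypothetical stand-in for a time domain container (not a pyfar class)."""

    def __init__(self, time):
        self.time = np.asarray(time, dtype=float)

    def __add__(self, other):
        # Accept another container, an array like, or a scalar.
        other_time = other.time if isinstance(other, ToyTimeData) else other
        return ToyTimeData(self.time + other_time)

    def __eq__(self, other):
        # Two containers are equal if all of their samples match.
        return (isinstance(other, ToyTimeData)
                and np.array_equal(self.time, other.time))


a = ToyTimeData([1.0, 2.0, 3.0])
b = ToyTimeData([0.5, 0.5, 0.5])
total = a + b    # container + container
shifted = a + 1  # container + scalar, broadcast to every sample
```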

Classes:

`FrequencyData`(data, frequencies[, fft_norm, ...])	Class for frequency data.
`Signal`(data, sampling_rate[, n_samples, ...])	Class for audio signals.
`TimeData`(data, times[, comment, dtype])	Class for time data.

Functions:

`add`(data[, domain])	Add pyfar audio objects, array likes, and scalars.
`divide`(data[, domain])	Divide pyfar audio objects, array likes, and scalars.
`multiply`(data[, domain])	Multiply pyfar audio objects, array likes, and scalars.
`power`(data[, domain])	Power of pyfar audio objects, array likes, and scalars.
`subtract`(data[, domain])	Subtract pyfar audio objects, array likes, and scalars.

class pyfar.classes.audio.FrequencyData(data, frequencies, fft_norm=None, comment=None, dtype=<class 'complex'>)[source]¶

Bases: pyfar.classes.audio._Audio

Class for frequency data.

Objects of this class contain frequency data which is not directly convertible to the time domain, i.e., non-equidistantly spaced bins or incomplete spectra.

Methods:

`__getitem__`(key)	Get copied slice of the audio object at key.
`__init__`(data, frequencies[, fft_norm, ...])	Create FrequencyData with data, and frequencies.
`__setitem__`(key, value)	Set channels of audio object at key.
`copy`()	Return a copy of the audio object.
`find_nearest_frequency`(value)	Return the index that is closest to the query frequency.
`flatten`()	Return flattened copy of the audio object.
`reshape`(newshape)	Return reshaped copy of the audio object.

Attributes:

`comment`	Get comment.
`cshape`	Return channel shape.
`domain`	The domain the data is stored in.
`dtype`	The data type of the audio object.
`fft_norm`	The normalization for the Discrete Fourier Transform (DFT).
`freq`	Return the data in the frequency domain.
`frequencies`	Frequencies of the discrete signal spectrum.
`n_bins`	Number of frequency bins.

__getitem__(key)¶

Get copied slice of the audio object at key.

Examples

Get the first channel of a multi channel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]

__init__(data, frequencies, fft_norm=None, comment=None, dtype=<class 'complex'>)[source]¶

Create FrequencyData with data, and frequencies.

Parameters

data (array, double) – Raw data in the frequency domain. The memory layout of data is ‘C’. E.g. data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 frequency bins each.
frequencies (array, double) – Frequencies of the data in Hz. The number of frequencies must match the size of the last dimension of data.
fft_norm (str, optional) – The normalization of the Discrete Fourier Transform (DFT). Can be 'none', 'unitary', 'amplitude', 'rms', 'power', or 'psd'. See normalization and 1 for more information. The default is 'none', which is typically used for energy signals, such as impulse responses.
comment (str, optional) – A comment related to the data. The default is None.
dtype (string, optional) – Raw data type of the audio object. The default is complex.

References

1: J. Ahrens, C. Andersson, P. Höstmad, and W. Kropp, “Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals,” Vienna, Austria, May 2020, p. e-Brief 600.

__setitem__(key, value)¶

Set channels of audio object at key.

Examples

Set the first channel of a multi channel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)

property comment¶: Get comment.

copy()¶: Return a copy of the audio object.

property cshape¶

Return channel shape.

The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.

property domain¶: The domain the data is stored in.

property dtype¶: The data type of the audio object. This can be any data type and precision supported by numpy.

property fft_norm¶

The normalization for the Discrete Fourier Transform (DFT).

See normalization for more information.

find_nearest_frequency(value)[source]¶

Return the index that is closest to the query frequency.

Parameters: value (float, array-like) – The frequencies for which the indices are to be returned
Returns: indices – The index for the given frequency. If the input was an array like, a numpy array of indices is returned.
Return type: int, array-like
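A nearest-index lookup of this kind can be sketched with NumPy as follows. The helper nearest_index is a hypothetical name introduced for this example and is not part of pyfar; it only mirrors the documented behavior of returning an int for a scalar query and an array of indices for an array like.

```python
import numpy as np


def nearest_index(grid, value):
    """Hypothetical helper: index of the grid entry closest to each query."""
    grid = np.asarray(grid, dtype=float)
    query = np.atleast_1d(np.asarray(value, dtype=float))
    # Distance of every query value (rows) to every grid point (columns).
    distance = np.abs(grid[np.newaxis, :] - query[:, np.newaxis])
    indices = np.argmin(distance, axis=-1)
    return int(indices[0]) if np.isscalar(value) else indices


frequencies = np.array([100.0, 250.0, 1000.0, 4000.0])  # non-equidistant bins
single = nearest_index(frequencies, 300)         # scalar query
several = nearest_index(frequencies, [90, 3000])  # array-like query
```

Here single is 1 (250 Hz is closest to 300 Hz) and several holds the indices 0 and 3.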

flatten()¶

Return flattened copy of the audio object.

Returns: flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)
Return type: Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.
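The effect on the underlying array can be sketched with NumPy, assuming the data is stored with the samples (or bins) in the last dimension as described for the data parameter. This is only an illustration of the shapes involved, not pyfar's code.

```python
import numpy as np

data = np.zeros((4, 3, 512))  # cshape (4, 3) with 512 samples per channel
cshape = data.shape[:-1]      # channel shape: everything but the last axis

# Collapse all channel dimensions into one and keep the last axis untouched.
flat = data.reshape(-1, data.shape[-1]).copy()
flat_cshape = flat.shape[:-1]
```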

property freq¶: Return the data in the frequency domain.

property frequencies¶: Frequencies of the discrete signal spectrum.

property n_bins¶: Number of frequency bins.

reshape(newshape)¶

Return reshaped copy of the audio object.

Parameters: newshape (int, tuple) – new cshape of the audio object. One entry of newshape dimension can be -1. In this case, the value is inferred from the remaining dimensions.
Returns: reshaped – reshaped copy of the audio object.
Return type: Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same.
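How a -1 entry in newshape is inferred can be illustrated with NumPy, again assuming that the samples or bins occupy the last dimension of the data. The variable names are chosen for this sketch only.

```python
import numpy as np

data = np.zeros((4, 3, 512))  # cshape (4, 3) with 512 samples per channel
newshape = (2, -1)            # requested cshape, one entry left open

# Append the untouched last axis, then let NumPy infer the open entry.
reshaped = data.reshape(newshape + (data.shape[-1],)).copy()
new_cshape = reshaped.shape[:-1]
```

The 12 channels are rearranged to a cshape of (2, 6), while the last dimension keeps its 512 entries.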

class pyfar.classes.audio.Signal(data, sampling_rate, n_samples=None, domain='time', fft_norm=None, comment=None, dtype=<class 'numpy.float64'>)[source]¶

Bases: pyfar.classes.audio.FrequencyData, pyfar.classes.audio.TimeData

Class for audio signals.

Objects of this class contain data which is directly convertible between the time and frequency domain (equally spaced samples and frequency bins).

Methods:

`__getitem__`(key)	Get copied slice of the audio object at key.
`__init__`(data, sampling_rate[, n_samples, ...])	Create Signal with data, and sampling rate.
`__iter__`()	Iterator for `Signal` objects.
`__setitem__`(key, value)	Set channels of audio object at key.
`copy`()	Return a copy of the audio object.
`find_nearest_frequency`(value)	Return the index that is closest to the query frequency.
`find_nearest_time`(value)	Return the index that is closest to the query time.
`flatten`()	Return flattened copy of the audio object.
`reshape`(newshape)	Return reshaped copy of the audio object.

Attributes:

`comment`	Get comment.
`cshape`	Return channel shape.
`domain`	The domain the data is stored in.
`dtype`	The data type of the audio object.
`fft_norm`	The normalization for the Discrete Fourier Transform (DFT).
`freq`	Return the data in the frequency domain.
`frequencies`	Frequencies of the discrete signal spectrum.
`n_bins`	Number of frequency bins.
`n_samples`	The number of samples.
`sampling_rate`	The sampling rate of the signal.
`signal_length`	The length of the data in seconds.
`signal_type`	The signal type is `'energy'` if the `fft_norm = None` and `'power'` otherwise.
`time`	Return the data in the time domain.
`times`	Time instances the signal is sampled at.

__getitem__(key)¶

Get copied slice of the audio object at key.

Examples

Get the first channel of a multi channel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]

__init__(data, sampling_rate, n_samples=None, domain='time', fft_norm=None, comment=None, dtype=<class 'numpy.float64'>)[source]¶

Create Signal with data, and sampling rate.

Parameters

data (ndarray, double) – Raw data of the signal in the time or frequency domain. The memory layout of data is ‘C’. E.g. data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 samples or frequency bins each. Frequency data must be provided as single sided spectra, i.e., for all frequencies between 0 Hz and half the sampling rate.
sampling_rate (double) – Sampling rate in Hz
n_samples (int, optional) – Number of samples of the time signal. Required if domain is 'freq'. The default is None, which assumes an even number of samples if the data is provided in the frequency domain.
domain ('time', 'freq', optional) – Domain of data. The default is 'time'
fft_norm (str, optional) – The normalization of the Discrete Fourier Transform (DFT). Can be 'none', 'unitary', 'amplitude', 'rms', 'power', or 'psd'. See normalization and 2 for more information. The default is 'none', which is typically used for energy signals, such as impulse responses.
comment (str) – A comment related to data. The default is None.
dtype (string, optional) – Raw data type of the audio object. The default is float64
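Why n_samples matters for frequency domain data can be seen from a small NumPy sketch: a single-sided spectrum of a given length can stem from either an even or an odd number of samples. The sketch uses NumPy's real-valued FFT and is not taken from pyfar's source.

```python
import numpy as np

# Single-sided spectra of an 8-sample and a 9-sample signal.
even = np.fft.rfft(np.zeros(8))
odd = np.fft.rfft(np.zeros(9))

# Both have the same number of bins, so the number of samples
# cannot be recovered from the spectrum alone.
n_bins = even.shape[-1]

# Assuming an even number of samples, as described for n_samples=None.
assumed_n_samples = 2 * (n_bins - 1)
```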

References

2: J. Ahrens, C. Andersson, P. Höstmad, and W. Kropp, “Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals,” Vienna, Austria, May 2020, p. e-Brief 600.

__iter__()[source]¶

Iterator for Signal objects.

Iterate across the first dimension of a Signal. The actual iteration is handled through numpy’s array iteration.

Examples

Iterate channels of a Signal

>>> import pyfar as pf
>>> signal = pf.signals.impulse(2, amplitude=[1, 1, 1])
>>> for idx, channel in enumerate(signal):
...     channel.time *= idx
...     signal[idx] = channel

__setitem__(key, value)¶

Set channels of audio object at key.

Examples

Set the first channel of a multi channel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)

property comment¶: Get comment.

copy()¶: Return a copy of the audio object.

property cshape¶

Return channel shape.

The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.

property domain¶: The domain the data is stored in.

property dtype¶: The data type of the audio object. This can be any data type and precision supported by numpy.

property fft_norm¶

The normalization for the Discrete Fourier Transform (DFT).

See normalization for more information.

find_nearest_frequency(value)¶

Return the index that is closest to the query frequency.

Parameters: value (float, array-like) – The frequencies for which the indices are to be returned
Returns: indices – The index for the given frequency. If the input was an array like, a numpy array of indices is returned.
Return type: int, array-like

find_nearest_time(value)¶

Return the index that is closest to the query time.

Parameters: value (float, array-like) – The times for which the indices are to be returned
Returns: indices – The index for the given time instance. If the input was an array like, a numpy array of indices is returned.
Return type: int, array-like

flatten()¶

Return flattened copy of the audio object.

Returns: flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)
Return type: Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.

property freq¶: Return the data in the frequency domain.

property frequencies¶: Frequencies of the discrete signal spectrum.

property n_bins¶: Number of frequency bins.

property n_samples¶: The number of samples.

reshape(newshape)¶

Return reshaped copy of the audio object.

Parameters: newshape (int, tuple) – new cshape of the audio object. One entry of newshape dimension can be -1. In this case, the value is inferred from the remaining dimensions.
Returns: reshaped – reshaped copy of the audio object.
Return type: Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same.

property sampling_rate¶: The sampling rate of the signal.

property signal_length¶: The length of the data in seconds.

property signal_type¶: The signal type is 'energy' if the fft_norm = None and 'power' otherwise.

property time¶: Return the data in the time domain.

property times¶: Time instances the signal is sampled at.
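For equidistant data, the sampling times and the frequencies of a single-sided spectrum follow from the sampling rate and the number of samples. The NumPy sketch below assumes the usual DFT conventions, with the first sample at 0 s; it is an illustration and is not copied from pyfar.

```python
import numpy as np

sampling_rate = 44100  # in Hz
n_samples = 8

# Equidistant sampling times in seconds, starting at 0 s.
times = np.arange(n_samples) / sampling_rate

# Frequencies of the single-sided spectrum, from 0 Hz to half the sampling rate.
frequencies = np.fft.rfftfreq(n_samples, d=1 / sampling_rate)
```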

class pyfar.classes.audio.TimeData(data, times, comment=None, dtype=<class 'numpy.float64'>)[source]¶

Bases: pyfar.classes.audio._Audio

Class for time data.

Objects of this class contain time data which is not directly convertible to the frequency domain, i.e., non-equidistant samples.

Methods:

`__getitem__`(key)	Get copied slice of the audio object at key.
`__init__`(data, times[, comment, dtype])	Create TimeData object with data, and times.
`__setitem__`(key, value)	Set channels of audio object at key.
`copy`()	Return a copy of the audio object.
`find_nearest_time`(value)	Return the index that is closest to the query time.
`flatten`()	Return flattened copy of the audio object.
`reshape`(newshape)	Return reshaped copy of the audio object.

Attributes:

`comment`	Get comment.
`cshape`	Return channel shape.
`domain`	The domain the data is stored in.
`dtype`	The data type of the audio object.
`n_samples`	The number of samples.
`signal_length`	The length of the data in seconds.
`time`	Return the data in the time domain.
`times`	Time in seconds at which the signal is sampled.

__getitem__(key)¶

Get copied slice of the audio object at key.

Examples

Get the first channel of a multi channel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]

__init__(data, times, comment=None, dtype=<class 'numpy.float64'>)[source]¶

Create TimeData object with data, and times.

Parameters

data (array, double) – Raw data in the time domain. The memory layout of data is ‘C’. E.g. data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 samples each.
times (array, double) – Times in seconds at which the data is sampled. The number of times must match the size of the last dimension of data.
comment (str, optional) – A comment related to data. The default is None.
dtype (string, optional) – Raw data type of the audio object. The default is float64.

__setitem__(key, value)¶

Set channels of audio object at key.

Examples

Set the first channel of a multi channel audio object

>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)

property comment¶: Get comment.

copy()¶: Return a copy of the audio object.

property cshape¶

Return channel shape.

The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.

property domain¶: The domain the data is stored in.

property dtype¶: The data type of the audio object. This can be any data type and precision supported by numpy.

find_nearest_time(value)[source]¶

Return the index that is closest to the query time.

Parameters: value (float, array-like) – The times for which the indices are to be returned
Returns: indices – The index for the given time instance. If the input was an array like, a numpy array of indices is returned.
Return type: int, array-like

flatten()¶

Return flattened copy of the audio object.

Returns: flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)
Return type: Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.

property n_samples¶: The number of samples.

reshape(newshape)¶

Return reshaped copy of the audio object.

Parameters: newshape (int, tuple) – new cshape of the audio object. One entry of newshape dimension can be -1. In this case, the value is inferred from the remaining dimensions.
Returns: reshaped – reshaped copy of the audio object.
Return type: Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same.

property signal_length¶: The length of the data in seconds.

property time¶: Return the data in the time domain.

property times¶: Time in seconds at which the signal is sampled.

pyfar.classes.audio.add(data: tuple, domain='freq')[source]¶

Add pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters

data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be added. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See pyfar.dsp.fft.normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array
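The rule for the fft_norm of the result can be written down in a few lines. The function resolve_fft_norm is a hypothetical name used for this sketch only; it restates the rule given under Returns and is not pyfar's implementation.

```python
def resolve_fft_norm(fft_norms):
    """Return 'none' if all norms are 'none', else the first one that is not."""
    for norm in fft_norms:
        if norm != 'none':
            return norm
    return 'none'


all_none = resolve_fft_norm(['none', 'none'])
first_set = resolve_fft_norm(['none', 'rms', 'power'])
```

With this rule, the order of the entries in data decides which normalization the result carries when several different norms are present.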

pyfar.classes.audio.divide(data: tuple, domain='freq')[source]¶

Divide pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters

data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be divided. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array

pyfar.classes.audio.multiply(data: tuple, domain='freq')[source]¶

Multiply pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters

data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be multiplied. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array

pyfar.classes.audio.power(data: tuple, domain='freq')[source]¶

Power of pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters

data (tuple of the form (data_1, data_2, ..., data_N)) – The base for which the power is calculated. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array

pyfar.classes.audio.subtract(data: tuple, domain='freq')[source]¶

Subtract pyfar audio objects, array likes, and scalars.

Pyfar audio objects are: Signal, TimeData, and FrequencyData.

Parameters

data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be subtracted. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects can not be mixed, e.g., TimeData and FrequencyData objects do not work together.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (See normalization). The default is 'freq'.

Returns

results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.

Return type

Signal, TimeData, FrequencyData, numpy array