Audio (Signal, TimeData, FrequencyData)¶
Container classes and arithmetic operations for audio data.
The classes TimeData and FrequencyData are intended to
store incomplete or non-equidistant audio data in the time and frequency
domain. The class Signal can be used to store equidistant and
complete audio data that can be converted between the time and frequency
domain by means of the Fourier transform.
Arithmetic operations can be applied in the time and frequency domain and
are implemented in the methods add, subtract, multiply, divide,
and power. For example, two Signal, TimeData, or
FrequencyData instances can be added in the time domain by
>>> result = pyfar.classes.audio.add((signal_1, signal_2), 'time')
and in the frequency domain by
>>> result = pyfar.classes.audio.add((signal_1, signal_2), 'freq')
This also works with more than two instances and supports array likes and scalar values, e.g.,
>>> result = pyfar.classes.audio.add((signal_1, 1), 'time')
In this case the scalar 1 is broadcast, i.e., it is added to every sample of signal_1 (or to every bin in case of a frequency domain operation).
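The broadcasting of a scalar can be sketched with plain NumPy arrays. This is only an illustration of the arithmetic, not pyfar's implementation: the array `time` is made up, and using `np.fft.rfft` to obtain a single-sided spectrum is an assumption.

```python
import numpy as np

# Hypothetical two-channel time data with 4 samples per channel
time = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0]])

# Time domain: the scalar is added to every sample
time_result = time + 1

# Frequency domain: the scalar is added to every bin of the
# single-sided spectrum (assumed here to come from a real-valued FFT)
freq = np.fft.rfft(time, axis=-1)
freq_result = freq + 1

print(time_result[0])     # [2. 1. 1. 1.]
print(freq_result.shape)  # (2, 3)
```

The two domains give different results for the same scalar: adding 1 to every bin corresponds to adding a unit impulse at the first sample in the time domain, not to adding 1 to every sample.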
The operators +, -, *, /, and ** are overloaded for
convenience. Note, however, that their behavior depends on the Audio object.
Frequency domain operations are applied for Signal and
FrequencyData objects, i.e.,
>>> result = signal1 + signal2
is equivalent to
>>> result = pyfar.classes.audio.add((signal1, signal2), 'freq')
Time domain operations are applied for TimeData objects, i.e.,
>>> result = time_data_1 + time_data_2
is equivalent to
>>> result = pyfar.classes.audio.add((time_data_1, time_data_2), 'time')
In addition to the arithmetic operations, the equality operator is overloaded to allow comparisons, e.g.,
>>> signal_1 == signal_2
Classes:
FrequencyData – Class for frequency data.
Signal – Class for audio signals.
TimeData – Class for time data.
Functions:
add – Add pyfar audio objects, array likes, and scalars.
divide – Divide pyfar audio objects, array likes, and scalars.
multiply – Multiply pyfar audio objects, array likes, and scalars.
power – Power of pyfar audio objects, array likes, and scalars.
subtract – Subtract pyfar audio objects, array likes, and scalars.
- class pyfar.classes.audio.FrequencyData(data, frequencies, fft_norm=None, comment=None, dtype=<class 'complex'>)[source]¶
Bases: pyfar.classes.audio._Audio
Class for frequency data.
Objects of this class contain frequency data which is not directly convertible to the time domain, i.e., non-equidistantly spaced bins or incomplete spectra.
Methods:
__getitem__(key) – Get copied slice of the audio object at key.
__init__(data, frequencies[, fft_norm, ...]) – Create FrequencyData with data and frequencies.
__setitem__(key, value) – Set channels of audio object at key.
copy() – Return a copy of the audio object.
find_nearest_frequency(value) – Return the index that is closest to the query frequency.
flatten() – Return flattened copy of the audio object.
reshape(newshape) – Return reshaped copy of the audio object.
Attributes:
comment – Get comment.
cshape – Return channel shape.
domain – The domain the data is stored in.
dtype – The data type of the audio object.
fft_norm – The normalization for the Discrete Fourier Transform (DFT).
freq – Return the data in the frequency domain.
frequencies – Frequencies of the discrete signal spectrum.
n_bins – Number of frequency bins.
- __getitem__(key)¶
Get copied slice of the audio object at key.
Examples
Get the first channel of a multi channel audio object
>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]
- __init__(data, frequencies, fft_norm=None, comment=None, dtype=<class 'complex'>)[source]¶
Create FrequencyData with data and frequencies.
- Parameters
data (array, double) – Raw data in the frequency domain. The memory layout of data is ‘C’. E.g., data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 frequency bins each.
frequencies (array, double) – Frequencies of the data in Hz. The number of frequencies must match the size of the last dimension of data.
fft_norm (str, optional) – The normalization of the Discrete Fourier Transform (DFT). Can be 'none', 'unitary', 'amplitude', 'rms', 'power', or 'psd'. See normalization and [1] for more information. The default is 'none', which is typically used for energy signals, such as impulse responses.
comment (str, optional) – A comment related to the data. The default is None.
dtype (string, optional) – Raw data type of the audio object. The default is complex.
References
[1] J. Ahrens, C. Andersson, P. Höstmad, and W. Kropp, “Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals,” Vienna, Austria, May 2020, p. e-Brief 600.
- __setitem__(key, value)¶
Set channels of audio object at key.
Examples
Set the first channel of a multi channel audio object
>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)
- property comment¶
Get comment.
- copy()¶
Return a copy of the audio object.
- property cshape¶
Return channel shape.
The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.
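As a minimal sketch of this definition, the channel shape can be read off a plain NumPy array by dropping its last dimension. The array `data` is hypothetical and stands in for the raw data of an audio object.

```python
import numpy as np

# Hypothetical raw data: 3 x 2 channels with 1024 samples (or bins) each
data = np.zeros((3, 2, 1024))

# The channel shape is the data shape without the last dimension
cshape = data.shape[:-1]
n_last = data.shape[-1]

print(cshape)  # (3, 2)
print(n_last)  # 1024
```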
- property domain¶
The domain the data is stored in.
- property dtype¶
The data type of the audio object. This can be any data type and precision supported by numpy.
- property fft_norm¶
The normalization for the Discrete Fourier Transform (DFT).
See normalization for more information.
- find_nearest_frequency(value)[source]¶
Return the index that is closest to the query frequency.
- Parameters
value (float, array-like) – The frequencies for which the indices are to be returned
- Returns
indices – The index for the given frequency. If the input was an array like, a numpy array of indices is returned.
- Return type
int, array-like
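A nearest-index lookup of this kind can be sketched with NumPy as follows. The helper `nearest_index` and the frequency grid are hypothetical and only illustrate the idea; pyfar's actual implementation may differ.

```python
import numpy as np


def nearest_index(grid, value):
    """Index of the entry in `grid` closest to each query in `value`."""
    grid = np.asarray(grid, dtype=float)
    query = np.atleast_1d(np.asarray(value, dtype=float))
    # Distance of every query to every grid point; argmin picks the closest
    distance = np.abs(grid[np.newaxis, :] - query[:, np.newaxis])
    indices = np.argmin(distance, axis=-1)
    # Scalar in, int out; array like in, numpy array out
    return int(indices[0]) if np.ndim(value) == 0 else indices


# Hypothetical, non-equidistant frequency grid in Hz
frequencies = np.array([100.0, 250.0, 1000.0, 4000.0])

print(nearest_index(frequencies, 900))          # 2
print(nearest_index(frequencies, [120, 3000]))  # [0 3]
```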
- flatten()¶
Return flattened copy of the audio object.
- Returns
flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)
Notes
The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.
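The effect of flattening on the shapes can be illustrated with a plain NumPy array. This is a sketch under the assumption that the channel dimensions are collapsed while the last dimension is kept; the array `data` is hypothetical.

```python
import numpy as np

# Hypothetical raw data with cshape = (4, 3) and n_samples = 512
data = np.zeros((4, 3, 512))

# Collapse all channel dimensions into one and keep the last dimension
flat = data.reshape(-1, data.shape[-1])

print(flat.shape[:-1])  # (12,)
print(flat.shape[-1])   # 512
```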
- property freq¶
Return the data in the frequency domain.
- property frequencies¶
Frequencies of the discrete signal spectrum.
- property n_bins¶
Number of frequency bins.
- reshape(newshape)¶
Return reshaped copy of the audio object.
- Parameters
newshape (int, tuple) – New cshape of the audio object. One entry of newshape can be -1. In this case, the value is inferred from the remaining dimensions.
- Returns
reshaped – Reshaped copy of the audio object.
Notes
The number of samples and frequency bins always remains the same.
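How a -1 entry is inferred can be sketched with NumPy's own reshape, which follows the same convention. The helper `reshape_channels` is hypothetical and only reshapes the channel dimensions of a raw array; it is not pyfar's implementation.

```python
import numpy as np


def reshape_channels(data, newshape):
    """Reshape the channel dimensions of `data`, keeping the last dimension."""
    if isinstance(newshape, int):
        newshape = (newshape,)
    # Append the untouched last dimension (samples or bins)
    return data.reshape(tuple(newshape) + (data.shape[-1],))


# Hypothetical raw data with cshape = (4, 3) and 512 samples
data = np.zeros((4, 3, 512))

print(reshape_channels(data, (2, -1)).shape)  # (2, 6, 512)
print(reshape_channels(data, 12).shape)       # (12, 512)
```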
- class pyfar.classes.audio.Signal(data, sampling_rate, n_samples=None, domain='time', fft_norm=None, comment=None, dtype=<class 'numpy.float64'>)[source]¶
Bases: pyfar.classes.audio.FrequencyData, pyfar.classes.audio.TimeData
Class for audio signals.
Objects of this class contain data which is directly convertible between time and frequency domain (equally spaced samples and frequency bins).
Methods:
__getitem__(key) – Get copied slice of the audio object at key.
__init__(data, sampling_rate[, n_samples, ...]) – Create Signal with data and sampling rate.
__iter__() – Iterator for Signal objects.
__setitem__(key, value) – Set channels of audio object at key.
copy() – Return a copy of the audio object.
find_nearest_frequency(value) – Return the index that is closest to the query frequency.
find_nearest_time(value) – Return the index that is closest to the query time.
flatten() – Return flattened copy of the audio object.
reshape(newshape) – Return reshaped copy of the audio object.
Attributes:
comment – Get comment.
cshape – Return channel shape.
domain – The domain the data is stored in.
dtype – The data type of the audio object.
fft_norm – The normalization for the Discrete Fourier Transform (DFT).
freq – Return the data in the frequency domain.
frequencies – Frequencies of the discrete signal spectrum.
n_bins – Number of frequency bins.
n_samples – The number of samples.
sampling_rate – The sampling rate of the signal.
signal_length – The length of the data in seconds.
signal_type – The signal type is 'energy' if the fft_norm = None and 'power' otherwise.
time – Return the data in the time domain.
times – Time instances the signal is sampled at.
- __getitem__(key)¶
Get copied slice of the audio object at key.
Examples
Get the first channel of a multi channel audio object
>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]
- __init__(data, sampling_rate, n_samples=None, domain='time', fft_norm=None, comment=None, dtype=<class 'numpy.float64'>)[source]¶
Create Signal with data and sampling rate.
- Parameters
data (ndarray, double) – Raw data of the signal in the time or frequency domain. The memory layout of data is ‘C’. E.g., data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 samples or frequency bins each. Frequency data must be provided as single sided spectra, i.e., for all frequencies between 0 Hz and half the sampling rate.
sampling_rate (double) – Sampling rate in Hz.
n_samples (int, optional) – Number of samples of the time signal. Only relevant if domain is 'freq'. The default is None, which assumes an even number of samples if the data is provided in the frequency domain.
domain ('time', 'freq', optional) – Domain of data. The default is 'time'.
fft_norm (str, optional) – The normalization of the Discrete Fourier Transform (DFT). Can be 'none', 'unitary', 'amplitude', 'rms', 'power', or 'psd'. See normalization and [2] for more information. The default is 'none', which is typically used for energy signals, such as impulse responses.
comment (str, optional) – A comment related to data. The default is None.
dtype (string, optional) – Raw data type of the audio object. The default is float64.
References
[2] J. Ahrens, C. Andersson, P. Höstmad, and W. Kropp, “Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals,” Vienna, Austria, May 2020, p. e-Brief 600.
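Why n_samples matters for frequency domain input can be seen with NumPy's real-valued FFT. This is a sketch that assumes single sided spectra behave like the output of `np.fft.rfft`; it is not taken from pyfar's implementation, and the test signal is made up.

```python
import numpy as np

# An even and an odd number of samples give the same number of bins,
# so the number of samples cannot be inferred from the spectrum alone
n_bins_even = np.fft.rfft(np.zeros(8)).size
n_bins_odd = np.fft.rfft(np.zeros(9)).size
print(n_bins_even, n_bins_odd)  # 5 5

# Hypothetical time signal with an odd number of samples
time = np.arange(9.0)
freq = np.fft.rfft(time)

# Without the number of samples, an even length 2 * (n_bins - 1) is assumed
assumed = np.fft.irfft(freq)
# With the correct number of samples, the original signal is recovered
recovered = np.fft.irfft(freq, n=time.size)

print(assumed.size)                  # 8
print(np.allclose(recovered, time))  # True
```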
- __iter__()[source]¶
Iterator for Signal objects.
Iterate across the first dimension of a Signal. The actual iteration is handled through numpy’s array iteration.
Examples
Iterate channels of a Signal
>>> import pyfar as pf
>>> signal = pf.signals.impulse(2, amplitude=[1, 1, 1])
>>> for idx, channel in enumerate(signal):
...     channel.time *= idx
...     signal[idx] = channel
- __setitem__(key, value)¶
Set channels of audio object at key.
Examples
Set the first channel of a multi channel audio object
>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)
- property comment¶
Get comment.
- copy()¶
Return a copy of the audio object.
- property cshape¶
Return channel shape.
The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.
- property domain¶
The domain the data is stored in.
- property dtype¶
The data type of the audio object. This can be any data type and precision supported by numpy.
- property fft_norm¶
The normalization for the Discrete Fourier Transform (DFT).
See normalization for more information.
- find_nearest_frequency(value)¶
Return the index that is closest to the query frequency.
- Parameters
value (float, array-like) – The frequencies for which the indices are to be returned
- Returns
indices – The index for the given frequency. If the input was an array like, a numpy array of indices is returned.
- Return type
int, array-like
- find_nearest_time(value)¶
Return the index that is closest to the query time.
- Parameters
value (float, array-like) – The times for which the indices are to be returned
- Returns
indices – The index for the given time instance. If the input was an array like, a numpy array of indices is returned.
- Return type
int, array-like
- flatten()¶
Return flattened copy of the audio object.
- Returns
flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)
Notes
The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.
- property freq¶
Return the data in the frequency domain.
- property frequencies¶
Frequencies of the discrete signal spectrum.
- property n_bins¶
Number of frequency bins.
- property n_samples¶
The number of samples.
- reshape(newshape)¶
Return reshaped copy of the audio object.
- Parameters
newshape (int, tuple) – New cshape of the audio object. One entry of newshape can be -1. In this case, the value is inferred from the remaining dimensions.
- Returns
reshaped – Reshaped copy of the audio object.
Notes
The number of samples and frequency bins always remains the same.
- property sampling_rate¶
The sampling rate of the signal.
- property signal_length¶
The length of the data in seconds.
- property signal_type¶
The signal type is 'energy' if the fft_norm = None and 'power' otherwise.
- property time¶
Return the data in the time domain.
- property times¶
Time instances the signal is sampled at.
- class pyfar.classes.audio.TimeData(data, times, comment=None, dtype=<class 'numpy.float64'>)[source]¶
Bases: pyfar.classes.audio._Audio
Class for time data.
Objects of this class contain time data which is not directly convertible to the frequency domain, i.e., non-equidistant samples.
Methods:
__getitem__(key) – Get copied slice of the audio object at key.
__init__(data, times[, comment, dtype]) – Create TimeData object with data and times.
__setitem__(key, value) – Set channels of audio object at key.
copy() – Return a copy of the audio object.
find_nearest_time(value) – Return the index that is closest to the query time.
flatten() – Return flattened copy of the audio object.
reshape(newshape) – Return reshaped copy of the audio object.
Attributes:
comment – Get comment.
cshape – Return channel shape.
domain – The domain the data is stored in.
dtype – The data type of the audio object.
n_samples – The number of samples.
signal_length – The length of the data in seconds.
time – Return the data in the time domain.
times – Time in seconds at which the signal is sampled.
- __getitem__(key)¶
Get copied slice of the audio object at key.
Examples
Get the first channel of a multi channel audio object
>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]
- __init__(data, times, comment=None, dtype=<class 'numpy.float64'>)[source]¶
Create TimeData object with data and times.
- Parameters
data (array, double) – Raw data in the time domain. The memory layout of data is ‘C’. E.g., data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 samples each.
times (array, double) – Times in seconds at which the data is sampled. The number of times must match the size of the last dimension of data.
comment (str, optional) – A comment related to data. The default is None.
dtype (string, optional) – Raw data type of the audio object. The default is float64.
- __setitem__(key, value)¶
Set channels of audio object at key.
Examples
Set the first channel of a multi channel audio object
>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)
- property comment¶
Get comment.
- copy()¶
Return a copy of the audio object.
- property cshape¶
Return channel shape.
The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.
- property domain¶
The domain the data is stored in.
- property dtype¶
The data type of the audio object. This can be any data type and precision supported by numpy.
- find_nearest_time(value)[source]¶
Return the index that is closest to the query time.
- Parameters
value (float, array-like) – The times for which the indices are to be returned
- Returns
indices – The index for the given time instance. If the input was an array like, a numpy array of indices is returned.
- Return type
int, array-like
- flatten()¶
Return flattened copy of the audio object.
- Returns
flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)
Notes
The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.
- property n_samples¶
The number of samples.
- reshape(newshape)¶
Return reshaped copy of the audio object.
- Parameters
newshape (int, tuple) – New cshape of the audio object. One entry of newshape can be -1. In this case, the value is inferred from the remaining dimensions.
- Returns
reshaped – Reshaped copy of the audio object.
Notes
The number of samples and frequency bins always remains the same.
- property signal_length¶
The length of the data in seconds.
- property time¶
Return the data in the time domain.
- property times¶
Time in seconds at which the signal is sampled.
- pyfar.classes.audio.add(data: tuple, domain='freq')[source]¶
Add pyfar audio objects, array likes, and scalars.
Pyfar audio objects are: Signal, TimeData, and FrequencyData.
- Parameters
data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be added. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects cannot be mixed, e.g., TimeData and FrequencyData objects do not work together.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (see pyfar.dsp.fft.normalization). The default is 'freq'.
- Returns
results – Result of the operation as numpy array if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.
- Return type
Signal, TimeData, FrequencyData, numpy array
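The rule for the fft_norm of the result can be written down in a few lines of plain Python. The helper `resulting_fft_norm` is hypothetical and only restates the rule from the Returns description; it is not part of pyfar.

```python
def resulting_fft_norm(fft_norms):
    """Return 'none' if all norms are 'none', else the first that is not."""
    for norm in fft_norms:
        if norm != 'none':
            return norm
    return 'none'


# All operands are energy signals without normalization
print(resulting_fft_norm(['none', 'none']))          # none
# The first normalization that is not 'none' wins
print(resulting_fft_norm(['none', 'rms', 'power']))  # rms
```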
- pyfar.classes.audio.divide(data: tuple, domain='freq')[source]¶
Divide pyfar audio objects, array likes, and scalars.
Pyfar audio objects are: Signal, TimeData, and FrequencyData.
- Parameters
data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be divided. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects cannot be mixed, e.g., TimeData and FrequencyData objects do not work together.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (see normalization). The default is 'freq'.
- Returns
results – Result of the operation as numpy array if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.
- Return type
Signal, TimeData, FrequencyData, numpy array
- pyfar.classes.audio.multiply(data: tuple, domain='freq')[source]¶
Multiply pyfar audio objects, array likes, and scalars.
Pyfar audio objects are: Signal, TimeData, and FrequencyData.
- Parameters
data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be multiplied. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects cannot be mixed, e.g., TimeData and FrequencyData objects do not work together.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (see normalization). The default is 'freq'.
- Returns
results – Result of the operation as numpy array if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.
- Return type
Signal, TimeData, FrequencyData, numpy array
- pyfar.classes.audio.power(data: tuple, domain='freq')[source]¶
Power of pyfar audio objects, array likes, and scalars.
Pyfar audio objects are: Signal, TimeData, and FrequencyData.
- Parameters
data (tuple of the form (data_1, data_2, ..., data_N)) – The base for which the power is calculated. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects cannot be mixed, e.g., TimeData and FrequencyData objects do not work together.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (see normalization). The default is 'freq'.
- Returns
results – Result of the operation as numpy array if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.
- Return type
Signal, TimeData, FrequencyData, numpy array
- pyfar.classes.audio.subtract(data: tuple, domain='freq')[source]¶
Subtract pyfar audio objects, array likes, and scalars.
Pyfar audio objects are: Signal, TimeData, and FrequencyData.
- Parameters
data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be subtracted. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects cannot be mixed, e.g., TimeData and FrequencyData objects do not work together.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (see normalization). The default is 'freq'.
- Returns
results – Result of the operation as numpy array if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.
- Return type
Signal, TimeData, FrequencyData, numpy array