Audio (Signal, TimeData, FrequencyData)¶
The following documents the audio classes and arithmetic operations for
audio data. More details and background are given in the concepts (
audio classes,
Fourier transform,
arithmetic operations).
Classes:
FrequencyData – Class for frequency data.
Signal – Class for audio signals.
TimeData – Class for time data.
Functions:
add – Add pyfar audio objects, array likes, and scalars.
divide – Divide pyfar audio objects, array likes, and scalars.
multiply – Multiply pyfar audio objects, array likes, and scalars.
power – Power of pyfar audio objects, array likes, and scalars.
subtract – Subtract pyfar audio objects, array likes, and scalars.
- class pyfar.classes.audio.FrequencyData(data, frequencies, comment=None, dtype=<class 'complex'>)[source]¶
Class for frequency data.
Objects of this class contain frequency data which is not directly convertible to the time domain, i.e., non-equidistantly spaced bins or incomplete spectra.
Methods:
__getitem__(key) – Get copied slice of the audio object at key.
__init__(data, frequencies[, comment, dtype]) – Create FrequencyData with data, and frequencies.
__setitem__(key, value) – Set channels of audio object at key.
copy() – Return a copy of the audio object.
find_nearest_frequency(value) – Return the index that is closest to the query frequency.
flatten() – Return flattened copy of the audio object.
reshape(newshape) – Return reshaped copy of the audio object.
Attributes:
comment – Get comment.
cshape – Return channel shape.
domain – The domain the data is stored in.
dtype – The data type of the audio object.
freq – Return the data in the frequency domain.
frequencies – Frequencies of the discrete signal spectrum.
n_bins – Number of frequency bins.
- __getitem__(key)¶
Get copied slice of the audio object at key.
Examples
Get the first channel of a multi channel audio object
>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]
- __init__(data, frequencies, comment=None, dtype=<class 'complex'>)[source]¶
Create FrequencyData with data, and frequencies.
- Parameters
data (array, double) – Raw data in the frequency domain. The memory layout of data is ‘C’. E.g., data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 frequency bins each.
frequencies (array, double) – Frequencies of the data in Hz. The number of frequencies must match the size of the last dimension of data.
comment (str, optional) – A comment related to the data. The default is None.
dtype (string, optional) – Raw data type of the audio object. The default is complex.
Notes
FrequencyData objects do not support an FFT norm, because this requires knowledge about the sampling rate or the number of samples of the time signal [1].
References
[1] J. Ahrens, C. Andersson, P. Höstmad, and W. Kropp, “Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals,” Vienna, Austria, May 2020, p. e-Brief 600.
- __setitem__(key, value)¶
Set channels of audio object at key.
Examples
Set the first channel of a multi channel audio object
>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)
- property comment¶
Get comment.
- copy()¶
Return a copy of the audio object.
- property cshape¶
Return channel shape.
The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.
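As a rough illustration of this convention, the sketch below splits a data shape into the channel shape and the size of the last dimension. The helper `split_shape` is hypothetical and not part of pyfar; it only mirrors the rule stated above for data with at least two dimensions.

```python
def split_shape(shape):
    """Split a data shape into (cshape, size of the last dimension).

    Hypothetical helper for illustration, not a pyfar function.
    """
    *cshape, n_last = shape
    return tuple(cshape), n_last


# Data of shape (3, 2, 1024): 3 x 2 channels with 1024 samples or bins each.
cshape, n_last = split_shape((3, 2, 1024))
print(cshape, n_last)  # (3, 2) 1024
```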
- property domain¶
The domain the data is stored in.
- property dtype¶
The data type of the audio object. This can be any data type and precision supported by numpy.
- find_nearest_frequency(value)[source]¶
Return the index that is closest to the query frequency.
- Parameters
value (float, array-like) – The frequencies for which the indices are to be returned
- Returns
indices – The index for the given frequency. If the input was an array like, a numpy array of indices is returned.
- Return type
int, array-like
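The lookup described above can be pictured with a plain Python sketch. This is not pyfar's implementation: the function `find_nearest` and the example frequencies are made up for illustration, and resolving ties in favor of the lower index is an assumption.

```python
def find_nearest(grid, value):
    """Return the index of the entry in grid closest to value.

    For a list or tuple of query values, a list of indices is returned.
    Hypothetical helper; ties resolve to the lowest index.
    """
    def nearest(query):
        return min(range(len(grid)), key=lambda i: abs(grid[i] - query))

    if isinstance(value, (list, tuple)):
        return [nearest(query) for query in value]
    return nearest(value)


# Non-equidistant frequencies in Hz, as FrequencyData allows.
frequencies = [100.0, 200.0, 400.0, 800.0]
print(find_nearest(frequencies, 250.0))          # 1
print(find_nearest(frequencies, [90.0, 700.0]))  # [0, 3]
```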
- flatten()¶
Return flattened copy of the audio object.
- Returns
flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)
- Return type
Signal, FrequencyData, TimeData
Notes
The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.
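The shape arithmetic behind flattening can be checked with a short sketch. The helper `flattened_cshape` is hypothetical; it only reproduces the channel count rule from the note above and does not touch any audio data.

```python
import math


def flattened_cshape(cshape):
    """Channel shape after flattening: all channels in a single dimension."""
    return (math.prod(cshape),)


# cshape (4, 3) becomes (12,); the last data dimension is left untouched.
print(flattened_cshape((4, 3)))  # (12,)
```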
- property freq¶
Return the data in the frequency domain.
- property frequencies¶
Frequencies of the discrete signal spectrum.
- property n_bins¶
Number of frequency bins.
- reshape(newshape)¶
Return reshaped copy of the audio object.
- Parameters
newshape (int, tuple) – New cshape of the audio object. One entry of newshape can be -1. In this case, the value is inferred from the remaining dimensions.
- Returns
reshaped – Reshaped copy of the audio object.
- Return type
Signal, FrequencyData, TimeData
Notes
The number of samples and frequency bins always remains the same.
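How a -1 entry is inferred can be sketched without any audio data. The function `resolve_cshape` below is a hypothetical stand-in, assuming the same convention as numpy's reshape; it is not the code pyfar runs.

```python
import math


def resolve_cshape(cshape, newshape):
    """Resolve a requested channel shape, inferring at most one -1 entry.

    Hypothetical helper: the total number of channels must be preserved.
    """
    n_channels = math.prod(cshape)
    newshape = (newshape,) if isinstance(newshape, int) else tuple(newshape)
    if newshape.count(-1) > 1:
        raise ValueError("only one entry of newshape can be -1")
    if -1 in newshape:
        known = math.prod(dim for dim in newshape if dim != -1)
        if known == 0 or n_channels % known:
            raise ValueError("cannot infer the missing dimension")
        newshape = tuple(
            n_channels // known if dim == -1 else dim for dim in newshape)
    if math.prod(newshape) != n_channels:
        raise ValueError("newshape does not match the number of channels")
    return newshape


print(resolve_cshape((4, 3), (2, -1)))  # (2, 6)
print(resolve_cshape((4, 3), -1))       # (12,)
```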
- class pyfar.classes.audio.Signal(data, sampling_rate, n_samples=None, domain='time', fft_norm='none', comment=None, dtype=<class 'numpy.float64'>)[source]¶
Class for audio signals.
Objects of this class contain data which is directly convertible between time and frequency domain (equally spaced samples and frequency bins).
Methods:
__getitem__(key) – Get copied slice of the audio object at key.
__init__(data, sampling_rate[, n_samples, ...]) – Create Signal with data, and sampling rate.
__iter__() – Iterator for Signal objects.
__setitem__(key, value) – Set channels of audio object at key.
copy() – Return a copy of the audio object.
find_nearest_frequency(value) – Return the index that is closest to the query frequency.
find_nearest_time(value) – Return the index that is closest to the query time.
flatten() – Return flattened copy of the audio object.
reshape(newshape) – Return reshaped copy of the audio object.
Attributes:
comment – Get comment.
cshape – Return channel shape.
domain – The domain the data is stored in.
dtype – The data type of the audio object.
fft_norm – The normalization for the Discrete Fourier Transform (DFT).
freq – Return the normalized frequency domain data.
freq_raw – Return the frequency domain data without normalization.
frequencies – Frequencies of the discrete signal spectrum.
n_bins – Number of frequency bins.
n_samples – The number of samples.
sampling_rate – The sampling rate of the signal.
signal_length – The length of the data in seconds.
signal_type – The signal type is 'energy' if fft_norm is 'none' and 'power' otherwise.
time – Return the data in the time domain.
times – Time instances the signal is sampled at.
- __getitem__(key)¶
Get copied slice of the audio object at key.
Examples
Get the first channel of a multi channel audio object
>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]
- __init__(data, sampling_rate, n_samples=None, domain='time', fft_norm='none', comment=None, dtype=<class 'numpy.float64'>)[source]¶
Create Signal with data, and sampling rate.
- Parameters
data (ndarray, double) – Raw data of the signal in the time or frequency domain. The memory layout of data is ‘C’. E.g., data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 samples or frequency bins each. Frequency data must be provided as single sided spectra, i.e., for all frequencies between 0 Hz and half the sampling rate.
sampling_rate (double) – Sampling rate in Hz.
n_samples (int, optional) – Number of samples of the time signal. Required if domain is 'freq'. The default is None, which assumes an even number of samples if the data is provided in the frequency domain.
domain ('time', 'freq', optional) – Domain of data. The default is 'time'.
fft_norm (str, optional) – The normalization of the Discrete Fourier Transform (DFT). Can be 'none', 'unitary', 'amplitude', 'rms', 'power', or 'psd'. See normalization and [2] for more information. The default is 'none', which is typically used for energy signals, such as impulse responses.
comment (str) – A comment related to data. The default is None.
dtype (string, optional) – Raw data type of the audio object. The default is float64.
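Why n_samples matters for frequency domain input follows from the bin count of a single sided spectrum. The sketch below assumes the usual relation for the DFT of real-valued signals, n_bins = n_samples // 2 + 1; both function names are hypothetical and the code is not taken from pyfar.

```python
def n_bins_from_n_samples(n_samples):
    """Number of bins of a single sided spectrum (0 Hz up to half the
    sampling rate) for a real-valued time signal."""
    return n_samples // 2 + 1


def assumed_even_n_samples(n_bins):
    """Number of samples if an even signal length is assumed."""
    return 2 * (n_bins - 1)


# An even and an odd signal length can share the same number of bins,
# so the number of samples cannot be recovered from the spectrum alone.
print(n_bins_from_n_samples(1022))  # 512
print(n_bins_from_n_samples(1023))  # 512
print(assumed_even_n_samples(512))  # 1022
```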
References
[2] J. Ahrens, C. Andersson, P. Höstmad, and W. Kropp, “Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals,” Vienna, Austria, May 2020, p. e-Brief 600.
- __iter__()[source]¶
Iterator for Signal objects.
Iterate across the first dimension of a Signal. The actual iteration is handled through numpy’s array iteration.
Examples
Iterate channels of a Signal
>>> import pyfar as pf
>>> signal = pf.signals.impulse(2, amplitude=[1, 1, 1])
>>> for idx, channel in enumerate(signal):
...     channel.time *= idx
...     signal[idx] = channel
- __setitem__(key, value)¶
Set channels of audio object at key.
Examples
Set the first channel of a multi channel audio object
>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)
- property comment¶
Get comment.
- copy()¶
Return a copy of the audio object.
- property cshape¶
Return channel shape.
The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.
- property domain¶
The domain the data is stored in.
- property dtype¶
The data type of the audio object. This can be any data type and precision supported by numpy.
- property fft_norm¶
The normalization for the Discrete Fourier Transform (DFT).
See normalization and FFT concepts for more information.
- find_nearest_frequency(value)¶
Return the index that is closest to the query frequency.
- Parameters
value (float, array-like) – The frequencies for which the indices are to be returned
- Returns
indices – The index for the given frequency. If the input was an array like, a numpy array of indices is returned.
- Return type
int, array-like
- find_nearest_time(value)¶
Return the index that is closest to the query time.
- Parameters
value (float, array-like) – The times for which the indices are to be returned
- Returns
indices – The index for the given time instance. If the input was an array like, a numpy array of indices is returned.
- Return type
int, array-like
- flatten()¶
Return flattened copy of the audio object.
- Returns
flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)
- Return type
Signal, FrequencyData, TimeData
Notes
The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.
- property freq¶
Return the normalized frequency domain data.
The normalized data is usually used for inspecting the data, e.g., using plots or when extracting information such as the amplitude of harmonic components. Most processing operations, e.g., frequency domain convolution, require the non-normalized data stored as freq_raw.
- property freq_raw¶
Return the frequency domain data without normalization.
Most processing operations, e.g., frequency domain convolution, require the non-normalized data. The normalized data stored as freq is usually used for inspecting the data, e.g., using plots or when extracting information such as the amplitude of harmonic components.
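One way to see why processing needs the non-normalized data: if the normalization is modeled as a plain scale factor per bin, multiplying two normalized spectra applies that factor twice. The sketch below uses a made-up factor `c` and made-up bin values; the actual factors depend on fft_norm and are not reproduced here.

```python
# Hypothetical normalization factor for one frequency bin (assumption).
c = 0.5

# Made-up raw (non-normalized) values of two spectra at that bin.
x_raw = 2 + 1j
h_raw = 0.5 + 0.25j

# Frequency domain convolution: multiply the raw spectra.
y_raw = x_raw * h_raw

# Multiplying the normalized spectra instead picks up the factor twice.
y_from_normalized = (c * x_raw) * (c * h_raw)

print(abs(y_from_normalized - c**2 * y_raw) < 1e-12)  # True
print(abs(y_from_normalized - c * y_raw) < 1e-12)     # False
```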
- property frequencies¶
Frequencies of the discrete signal spectrum.
- property n_bins¶
Number of frequency bins.
- property n_samples¶
The number of samples.
- reshape(newshape)¶
Return reshaped copy of the audio object.
- Parameters
newshape (int, tuple) – New cshape of the audio object. One entry of newshape can be -1. In this case, the value is inferred from the remaining dimensions.
- Returns
reshaped – Reshaped copy of the audio object.
- Return type
Signal, FrequencyData, TimeData
Notes
The number of samples and frequency bins always remains the same.
- property sampling_rate¶
The sampling rate of the signal.
- property signal_length¶
The length of the data in seconds.
- property signal_type¶
The signal type is 'energy' if fft_norm is 'none' and 'power' otherwise.
- property time¶
Return the data in the time domain.
- property times¶
Time instances the signal is sampled at.
- class pyfar.classes.audio.TimeData(data, times, comment=None, dtype=<class 'numpy.float64'>)[source]¶
Class for time data.
Objects of this class contain time data which is not directly convertible to frequency domain, i.e., non-equidistant samples.
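Whether a set of sampling times is equidistant can be tested with a few lines of plain Python. The helper `is_equidistant` and its tolerance are hypothetical and only illustrate the distinction drawn above; pyfar itself does not provide this function.

```python
import math


def is_equidistant(times, rel_tol=1e-9):
    """Return True if consecutive times are spaced by a constant step."""
    if len(times) < 3:
        return True
    step = times[1] - times[0]
    return all(
        math.isclose(later - earlier, step, rel_tol=rel_tol)
        for earlier, later in zip(times, times[1:]))


print(is_equidistant([0.0, 0.5, 1.0, 1.5]))  # True
print(is_equidistant([0.0, 0.1, 0.4, 1.0]))  # False
```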
Methods:
__getitem__(key) – Get copied slice of the audio object at key.
__init__(data, times[, comment, dtype]) – Create TimeData object with data, and times.
__setitem__(key, value) – Set channels of audio object at key.
copy() – Return a copy of the audio object.
find_nearest_time(value) – Return the index that is closest to the query time.
flatten() – Return flattened copy of the audio object.
reshape(newshape) – Return reshaped copy of the audio object.
Attributes:
comment – Get comment.
cshape – Return channel shape.
domain – The domain the data is stored in.
dtype – The data type of the audio object.
n_samples – The number of samples.
signal_length – The length of the data in seconds.
time – Return the data in the time domain.
times – Time in seconds at which the signal is sampled.
- __getitem__(key)¶
Get copied slice of the audio object at key.
Examples
Get the first channel of a multi channel audio object
>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> first_channel = signal[0]
- __init__(data, times, comment=None, dtype=<class 'numpy.float64'>)[source]¶
Create TimeData object with data, and times.
- Parameters
data (array, double) – Raw data in the time domain. The memory layout of data is ‘C’. E.g., data of shape = (3, 2, 1024) has 3 x 2 channels with 1024 samples each.
times (array, double) – Times in seconds at which the data is sampled. The number of times must match the size of the last dimension of data.
comment (str, optional) – A comment related to data. The default is None.
dtype (string, optional) – Raw data type of the audio object. The default is float64.
- __setitem__(key, value)¶
Set channels of audio object at key.
Examples
Set the first channel of a multi channel audio object
>>> import pyfar as pf
>>> signal = pf.signals.noise(10, rms=[1, 1])
>>> signal[0] = pf.signals.noise(10, rms=2)
- property comment¶
Get comment.
- copy()¶
Return a copy of the audio object.
- property cshape¶
Return channel shape.
The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.
- property domain¶
The domain the data is stored in.
- property dtype¶
The data type of the audio object. This can be any data type and precision supported by numpy.
- find_nearest_time(value)[source]¶
Return the index that is closest to the query time.
- Parameters
value (float, array-like) – The times for which the indices are to be returned
- Returns
indices – The index for the given time instance. If the input was an array like, a numpy array of indices is returned.
- Return type
int, array-like
- flatten()¶
Return flattened copy of the audio object.
- Returns
flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)
- Return type
Signal, FrequencyData, TimeData
Notes
The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.
- property n_samples¶
The number of samples.
- reshape(newshape)¶
Return reshaped copy of the audio object.
- Parameters
newshape (int, tuple) – New cshape of the audio object. One entry of newshape can be -1. In this case, the value is inferred from the remaining dimensions.
- Returns
reshaped – Reshaped copy of the audio object.
- Return type
Signal, FrequencyData, TimeData
Notes
The number of samples and frequency bins always remains the same.
- property signal_length¶
The length of the data in seconds.
- property time¶
Return the data in the time domain.
- property times¶
Time in seconds at which the signal is sampled.
- pyfar.classes.audio.add(data: tuple, domain='freq')[source]¶
Add pyfar audio objects, array likes, and scalars.
Pyfar audio objects are: Signal, TimeData, and FrequencyData.
- Parameters
data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be added. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects cannot be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (see pyfar.dsp.fft.normalization). The default is 'freq'.
- Returns
results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.
- Return type
Signal, TimeData, FrequencyData, numpy array
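The rule for the fft_norm of the result can be written down in a few lines. The function `result_fft_norm` is a hypothetical sketch of the rule stated under Returns; it does not check whether a combination of normalizations is permitted, which is covered in the arithmetic operations concept.

```python
def result_fft_norm(fft_norms):
    """Return 'none' if all norms are 'none', else the first other norm."""
    return next((norm for norm in fft_norms if norm != 'none'), 'none')


print(result_fft_norm(['none', 'none']))         # none
print(result_fft_norm(['none', 'rms', 'none']))  # rms
```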
- pyfar.classes.audio.divide(data: tuple, domain='freq')[source]¶
Divide pyfar audio objects, array likes, and scalars.
Pyfar audio objects are: Signal, TimeData, and FrequencyData.
- Parameters
data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be divided. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects cannot be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (see normalization). The default is 'freq'.
- Returns
results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.
- Return type
Signal, TimeData, FrequencyData, numpy array
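With more than two entries in data, the order of evaluation matters for division. The sketch below assumes that the entries are combined from left to right, which the description above does not state explicitly; `divide_left_to_right` is a hypothetical helper working on plain numbers.

```python
from functools import reduce
import operator


def divide_left_to_right(data):
    """Divide the entries of data in order: ((data_1 / data_2) / data_3) ..."""
    return reduce(operator.truediv, data)


# (8 / 2) / 2 = 2, whereas 8 / (2 / 2) would give 8.
print(divide_left_to_right((8.0, 2.0, 2.0)))  # 2.0
```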
- pyfar.classes.audio.multiply(data: tuple, domain='freq')[source]¶
Multiply pyfar audio objects, array likes, and scalars.
Pyfar audio objects are: Signal, TimeData, and FrequencyData.
- Parameters
data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be multiplied. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects cannot be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (see normalization). The default is 'freq'.
- Returns
results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.
- Return type
Signal, TimeData, FrequencyData, numpy array
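The choice of domain changes the result of a multiplication: in the time domain samples are multiplied one by one, while multiplying spectra bin by bin corresponds to a circular convolution of the time signals. The sketch below demonstrates this with a naive DFT in plain Python. It uses full two sided spectra and made-up data, so it illustrates the principle rather than pyfar's implementation.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]


def circular_convolution(x, h):
    """Circular convolution of two sequences of equal length."""
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]


x = [1.0, 2.0, 0.0, 0.0]
h = [1.0, 0.5, 0.0, 0.0]

# Time domain: sample-wise product.
time_product = [a * b for a, b in zip(x, h)]

# Frequency domain: bin-wise product, transformed back to the time domain.
freq_product = [a * b for a, b in zip(dft(x), dft(h))]
back_in_time = [value.real for value in idft(freq_product)]

print(time_product)                # [1.0, 1.0, 0.0, 0.0]
print(circular_convolution(x, h))  # [1.0, 2.5, 1.0, 0.0]
```

Up to rounding errors, `back_in_time` equals the circular convolution and differs from the sample-wise product.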
- pyfar.classes.audio.power(data: tuple, domain='freq')[source]¶
Power of pyfar audio objects, array likes, and scalars.
Pyfar audio objects are: Signal, TimeData, and FrequencyData.
- Parameters
data (tuple of the form (data_1, data_2, ..., data_N)) – The base for which the power is calculated. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects cannot be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (see normalization). The default is 'freq'.
- Returns
results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.
- Return type
Signal, TimeData, FrequencyData, numpy array
- pyfar.classes.audio.subtract(data: tuple, domain='freq')[source]¶
Subtract pyfar audio objects, array likes, and scalars.
Pyfar audio objects are: Signal, TimeData, and FrequencyData.
- Parameters
data (tuple of the form (data_1, data_2, ..., data_N)) – Data to be subtracted. Can contain pyfar audio objects, array likes, and scalars. Pyfar audio objects cannot be mixed, e.g., TimeData and FrequencyData objects do not work together. See arithmetic operations for possible combinations of Signal FFT normalizations.
domain ('time', 'freq', optional) – Flag to indicate if the operation should be performed in the time or frequency domain. If working in the frequency domain, the FFT normalization is removed before the operation (see normalization). The default is 'freq'.
- Returns
results – Result of the operation as numpy array, if data contains only array likes and numbers. Result as pyfar audio object if data contains an audio object. The fft_norm is 'none' if all FFT norms are 'none'. Otherwise the first fft_norm that is not 'none' is used.
- Return type
Signal, TimeData, FrequencyData, numpy array