INTRODUCTION

On various occasions, algorithms to estimate instantaneous-frequency from a cyclic (seasonal) sequence to detect slow changes are needed. Examples of this kind of situations comprise the estimation of the variations of the respiratory rate or the pitch variation during a sustained vowel phonation for diagnostic purposes; or the estimate of the angular velocity of an engine for speed control or fault detection; as well as the assessment of seasonal component variations in time series of financial/economic or environmental/climatic events.

Respiratory movement signals and respiration rate

The time-varying respiration rate (RR) or instantaneous-respiratory frequency is an important variable to characterize the physiological condition of a subject in clinical and fitness applications. Numerous measurement methods can acquire the respiration information from which to estimate the instantaneous-frequency. For instance, spirometry directly measures the flow rate of the breathing air but estimating certain parameters of the breathing airflow (for example, temperature, pressure, humidity, and CO_{2} concentration) can also provide alternative procedures (^{Al‐Khalidi et al., 2011}). Nasal or oral/nasal thermistors (^{Verginis et al., 2010}) or infrared thermography (^{Abbas et al., 2011}) can measure the temperature of the inhaling (cooler) and exhaling (warmer) air, as well as mouthpieces or facemasks can be used for monitoring the associated pressures (^{Thurnheer et al., 2001}). Electrical impedance pneumography (^{Freundlich and Erickson, 1974}), (^{Jeyhani et al., 2017}), (^{Wang et al., 2015}), inductance pneumography (^{Hill et al., 1982}), and capnography (^{Von Schéele and Von Schéele, 1999}) have been extensively used, along with signals as ECG (^{Lee and Kim, 2018}), (^{Mirmohamadsadeghi and Vesin, 2014}), and PPG (^{Leonard et al., 2006}), (^{Mirmohamadsadeghi et al., 2016}) from which respiration movement can be also derived. More recently, volumetric surrogate signals have been acquired using radars (^{Nejadgholi et al., 2016}), (^{Ren et al., 2015}), or videocams (^{Reyes et al., 2017}), (^{Vázquez-Segura et al., 2018}). The correlation indexes between respiratory movement signals obtained by the different monitoring methods should be high.

This respiratory movement signal oscillates at around 12 breath cycles per minute (0.2 Hz), continuously varying following several conditions. The upper plot of Figure **1** shows that slow-drifts or trends can affect this signal, here obtained with a temperature sensor, and distort the waveform. However, the instantaneous-frequency successfully shows the evolution of the frequency content of this mono-component respiratory movement signal with time. From that signal, a proper method can estimate the instantaneous-respiratory frequency or respiration rate (lower plot in Figure **1**), showing the variations in time.

Although the instantaneous-frequency estimation problem has been studied in general (^{Boashash, 1992}a), (^{Boashash, 1992b}), (^{O’Shea, 2002}), or some other specific contexts even nowadays (^{Kowalski et al., 2018}), it has not been sufficiently addressed regarding respiration rate estimation except for electrical impedance pneumography signals (^{Jeyhani et al., 2017}). This work tries to select ‘the best’ approach for estimating the instantaneous-respiratory frequency from a volumetric surrogate signal.

METHODOLOGY

There are several approaches that have been used to estimate the instantaneous-frequency in different environments (^{Boashash, 1992}a), (^{O’Shea, 2002}), (^{Kowalski et al., 2018}), including respiratory rate (^{Jeyhani et al., 2017}), (^{Wang et al., 2015}), (^{Lee and Kim, 2018}), (^{Mirmohamadsadeghi and Vesin, 2014}), (^{Leonard et al., 2006}), (^{Mirmohamadsadeghi et al., 2016}), (^{Reyes et al., 2017}). We can easily implement and use most of those algorithms in MATLAB. That includes an analytic signal-based algorithm, and one which estimates the instantaneous-frequency as the first conditional spectral moment of the time-frequency distribution of the input signal, within the instfreq.m function of MATLAB. For this work, some other functions were implemented in MATLAB, comprising: one based on peaks detection in the time domain, several ones based on auto-regressive (AR) models, and on the Eigenvectors and the root MUltiple SIgnal Classification (MUSIC) algorithm.

We generated some synthetic signals resembling the real-world respiratory volume surrogated signals, from which to estimate the known respiration rates, to compare the algorithms for instantaneous-respiratory frequency estimation. The measures for assessment were the maximum absolute and mean-squared errors.

The algorithm based on the analytic signal (Hilbert)

Despite certain criticism (^{Boashash, 1992}a), (^{Huang et al., 2009}), the analytic signal via the Hilbert transform is still used to compute the instantaneous-frequency. For any real-valued signal, x(t), an analytical signal, xa(t), is associated as a complex-valued signal defined as

where HT(*x*) is the Hilbert transform of *x*. In the frequency domain, *X*
_{
a
} , which is the Fourier transform of *x*
_{
a
} , is a single-sided Fourier transform, that is, with the negative frequency values removed, doubling the strictly positive ones, and the DC component unchanged:

Therefore, the analytic signal, *x*
_{
a
} , can be obtained from the real signal, *x*(*t*), by forcing to zero its spectrum for the negative frequencies, without altering the information content since, for a real signal, *X*(*-(*) = *X*
^{
*
} (*(*).

It is then possible to uniquely define the concept of instantaneous-frequency, *f*
_{
inst
} (*t*), by the first derivative of the argument or phase of the analytic signal as:

This method is inside the MATLAB function instfreq.m, as ‘Hilbert’ ‘Method’, and within a function of similar name in the Time-Frequency Toolbox (^{Auger et al., 1996}).

The algorithm based on the moment of the time-frequency distribution

The ‘tfmoment’ ‘Method’ in the instfreq.m MATLAB function estimates the instantaneous-frequency as the first conditional spectral moment of the time-frequency distribution of the input signal. This method computes the spectrogram power spectrum, *P*(*t*,*f*), of the input using the pspectrum.m function, which uses the spectrum as a time-frequency distribution and estimates the instantaneous-frequency as

Remember that the spectrogram is defined as the squared modulus of the Short-Time Fourier Transform (*STFT*) and interpreted as a measure of the energy signal contained in the *t*-*f* domain, centered on (*t*,*f*). The Short-Time Fourier Transform is:

where *x*(*t*) is the input signal in the time domain, *X*(*f*) is its equivalent representation in the frequency domain, and *w*(*t*) is the short-time analysis window around *t*=0 and *f*=0 (*W*(*f*) is its version in the frequency domain). The *STFT* is a local spectrum of the signal *x*(*(*) around *t*.

For a sampled signal, *x*[*n*], as in this case of the respiratory movement sequence, we can efficiently implement the *STFT* by using the FFT algorithm as

where *k* is a multiple of the sampling period step to displace *w* for windowing the data.

Algorithms based on peaks detection in the time domain

An intuitive way of estimating the instantaneous-frequency consists of counting the number of peaks of the respiration movement signal in a known time interval and computing how many may occur in a 1-min window, assuming that the frequency is constant in that period. We measure the time interval between two consecutive peaks (for example, *t*
_{1} between the 1^{st} and the 2^{nd} peaks in Figura 2) and compute the frequency as its inverse (multiplied by 60 to express it in breaths per minute, bpm):

The original estimated instantaneous-frequencies coincide with the positions right between the peaks. We need some interpolation to assign values of instantaneous-frequency to other regions. We solved this by using interp1.m MATLAB function with several interpolation methods:

Linear interpolation ('linear') of the values at neighboring grid points in each respective dimension obtains the value at a query point.

Nearest neighbor interpolation (‘nearest') is the value at a query point.

Next neighbor interpolation ('next') is the value at a query point.

Previous neighbor interpolation ('previous') is the value at a query point.

Modified Akima cubic Hermite interpolation (‘makima'), a piecewise function of polynomials with a degree at most three, obtains the value at a query point.

Spline interpolation using not-a-knot end conditions (‘spline’), where a cubic interpolation of the values at neighboring grid points in each respective dimension obtains the value at a query point.

Shape-preserving piecewise cubic interpolation (‘pchip') of the values at neighboring grid points obtains the value at a query point. That is used as default here.

Algorithms based on Auto-Regressive (AR) models

The respiratory movement sequence exhibits serial autocorrelation, that is, a linear association between lagged observations. The autoregressive (AR), or process models the conditional mean of*x*(*t*) as a function of past observations,*x*(*t-*1), *x*(*t-*2),…, *x*(*t-p*). An AR model of degree*p*, AR(*p*), depends on*p*past observations. The AR equation may model spectra with narrow peaks by placing zeroes of the *A* polynomial (*A*(*w*)=1+*a*
_{1}
*e*
^{
-jw
} +…+*a*
_{
p
}
*e*
^{
-jpw
} ) in

close to the unit circle.

We could consider the respiration movement signal as a mono-component signal and, therefore, we can model it with an AR(2). We solve a system of linear equations to estimate the parameters of the AR signal models, guarantying the stability of the AR polynomial. In this work, we implemented in MATLAB:

Burg’s lattice-based method (‘burg’), which solves the lattice filter equations using the harmonic mean of forward- and backward- squared prediction errors.

The geometric lattice approach (‘gl’), which is similar to Burg’s method, but uses the geometric mean instead of the harmonic mean during minimization.

The least-squares approach (‘ls’) minimizes the standard sum of squared forward-prediction errors.

The Yule-Walker approach (‘yw’) solves the Yule-Walker equations, formed from sample covariances.

The forward-backward approach (‘fb’) minimizes the sum of a least-squares criterion for a forward model and the analogous criterion for a time-reversed model. We used this algorithm by default.

We specified the information about the data outside the measured time interval (past and future values) as:

Post-windowing (‘pow’), where zeros replace missing end values, and the summation extends to time

*N*+*n*(*N*is the number of observations).Pre-windowing (‘prw’), where zeros replace missing past values and, therefore, the summation in the criteria can start at a time equal to zero.

Pre- and post-windowing (‘ppw’), which the Yule-Walker approach uses.

No windowing (‘now’), where the regression vectors formation uses only measured data. The summation in the criteria starts at the sample index equal to

*n*+1. That is the default value, except for ‘yw’.

For this AR(2) model,

With this method, we compute the instantaneous-frequency for an interval of *T* samples every *dt* samples. The length of the analysis window (*T*) is not critical here. We suggest a number-of-samples close to the expected period of the respiration cycle, that is, *T* = 128, which is around 5 s at *f*
_{
s
} = 25 sps (one period of the common 12-bpm breathing). The updating interval should be around 1 s, assuring an overlapping of about 80 % of two consecutive analysis windows. After this step, we use an interpolation method similar to those in the algorithm based on peaks.

First of all, an optional decimation process, which includes low-pass filtering (FIR window method with a Hamming window of length 30), could reduce the number of samples in the analysis-window to better catch the dynamics of the signal with the AR(2) model.

The approach based on the root MUSIC algorithm

The root MUSIC algorithm, which is an improvement to Pisarenko’s method, can estimate the instantaneous-frequency of a mono-component signal as the respiration movement using an eigenspace. That assumes that the signal comprises *p* = 2 complex exponentials (one for positive frequency and one for negative frequency) in the presence of Gaussian white noise. We sort the eigenvalues in decreasing order. Therefore, the eigenvectors corresponding to the two largest eigenvalues span the signal subspace while the rest is considered only noise. The frequency estimation function here is:

where *p* is the number of complex exponentials (2 here), *M* is the order of the autocorrelation matrix of the signal (*M* × *M* autocorrelation matrix of the respiration movement signal, *R*
_{
rm
} ), the v i are the eigenvectors of noise, while e is the steering vector,

By locating the highest peaks of the estimation function 𝑃 , sometimes called the pseudo spectrum, we find the frequency estimates for the p components. In this case, we keep the positive frequency and ignore the negative one.

Pisarenko’s method uses only a single eigenvector. It takes a set of autoregressive coefficients to analytically find its zeros or by polynomial root-finding algorithms. Root MUSIC, however, assumes that zeros may not be present. It locates local minima instead by searching the estimation function for peaks, which can evaluate any frequency, not just those of DFT bins. That estimates instantaneous-frequencies with an accuracy higher than one sample, being a super-resolution method.

For this implementation, similar to that of the AR models, we used an analysis window of T length (around 5 s), an updating period of dt length (approximately 1 s), and an interpolation stage to get an instantaneous-respiratory frequency signal at the same sampling frequency as the respiration movement signal (f_{
s
} = 25 sps in this experiments). We could use a previous decimation stage as well.

Evaluation of the algorithms

For the evaluation of the implemented algorithms for instantaneous-respiration frequency estimation, we used sets of 1000 runs with synthetic signals from the simplest to the completest realistic model. First, we generated segments of 2 min, 0.2-Hz sinusoids (12 bpm), sampled at 25 sps, and with low-level noise (with high signal-to-noise ratio, around 10:1, though) and with a random original phase.

Then, we reran experiments, all the implemented algorithms competing to estimate the instantaneous-frequency in more realistic scenarios, including:

different signal-to-noise ratios (SNR around 100:1 and 10:1),

baseline wandering (15 times slower than the center frequency of 0.2 Hz),

along with several levels of (amplitude and frequency) modulations at known variation laws:

sinusoids about 0.02 Hz, producing variations of ( six bpm by frequency modulation,

and amplitude changes of around 30 %.

We compared all the estimated instantaneous-frequency signals (obtained by the different algorithms) with the known generating law of the instantaneous-frequency by using the maximum absolute and the mean-squared errors (MAE and MSE, respectively) according to:

And

where x(t) is the ideal (known) instantaneous-frequency signal, 𝑥 𝑡 is the estimated instantaneous-frequency signal by the algorithm under assessment, and N is the number of samples in the 2-min segment (around 3 000 samples at 25 Hz).

In the end, we used the best algorithms to find the instantaneous-frequency signals from real-world respiratory movement signals acquired from two subjects (one male and one female). Volunteers were performing three different respiratory maneuvers: with controlled respiration at a fixed rate of 12 breaths per minute (that is, 0.2 Hz, following a metronome), still and moving back and forth, and with spontaneous frequency respiration as well.

RESULTS AND DISCUSSION

In this section, we discuss the results of the instantaneous-frequency estimators under assessment with ideal sinusoidal signals, more realistic signals resembling the respiratory volume surrogated signals, and real-life signals.

Instantaneous frequency estimators in the presence of ideal sinusoidal signals

For the ideally clean sinusoid of constant instantaneous-frequency (f_{
o
} = 0.2019 Hz, that is, 12.114 bpm), sampled at a much higher sampling frequency (f_{
s
} = 25 Hz), several approaches reach a perfect estimation of the instantaneous-frequency during the whole 2-min sequence. In this experiment, with 1 000 repetitions of the random initial phase, the forward-backward auto-regressive (AR) approach and the least-square AR, with no windowing, as well as the root MUSIC algorithm, got zero mean-squared and maximum absolute errors.

Most of the algorithms under assessment acceptably estimate the instantaneous-frequency. However, even for this easy signal, some well-established methods had some troubles.

The method based on the first conditional spectral moment of the time-frequency distribution (TF moment) and the one based on the derivative of the phase of the analytic signal using Hilbert transform (Hilbert) show a kind of oscillatory behavior around the perfect estimation value during the 2-min sequence. The instantaneous-frequency estimated by the spectrogram (based on the Short-Time Fourier Transform) depends on the specific segment of the signal analyzed (not time-invariant). Similar behavior, but even more pronounced, exhibits the method that computes the derivative of the phase of the analytic signal. The maximum absolute error of the TF moment method is above 1.9 bpm, with a mean squared error in the order of 0.9 bpm^{2}. The maximum absolute error of the Hilbert method is 0.6 bpm, with a mean squared error of 0.01 bpm^{2}. Recall that for this simple signal, the forward-backward AR approach with no windowing, the least-square AR with no windowing, and the root MUSIC algorithm performed impeccably.

Instantaneous frequency estimators in the presence of more realistic signals

The AR methods are seriously affected by more realistic respiratory sequences that could contain combinations of additive slow-drifts or trends, some amplitude modulation, and frequency modulation, as well as some additive noise. Before the instantaneous-frequency estimation, we attenuate the ‘high-frequency’ additive noise and the ‘low-frequency’ drift by using proper bandpass filtering or using detrending techniques as the empirical mode decomposition (^{Wang et al., 2015}), (^{Reyes et al., 2017}). The frequency band was approximately 0.1 to 0.6 Hz, where the respiration rate appears (that is, between 6 and 36 bpm). However, certain levels of them will remain, affecting the estimation. We can see the amplitude modulation as a multiplicative effect, almost impossible to get rid of it. The frequency modulation carries the useful information of the time-varying instantaneous-frequency.

Different experiments, with 1 000 runs each, were performed at different SNR and different levels of (amplitude and frequency) modulations, in which, consistently, three algorithms under test outperform the rest. These algorithms were Hilbert, TF moment, and root MUSIC ranked in that same order, being the root MUSIC the best of all.

Actually, for these realistic signals, with a combination of effects (baseline wandering, amplitude modulation, frequency modulation, etc.), but with high SNR (around 100:1), the best performing algorithms were again, as in the case of the perfect sinusoids, the forward-backward AR approach with no windowing, the least-square AR with no windowing, the eigenvector-based method and the root MUSIC algorithm. However, for SNR in the order of 10:1 and lower, the AR estimators deteriorate and are no longer among the best-performing ones, and then Hilbert, TF moment, and root MUSIC ranked the best.

Figure 3 illustrates an example (one run) of the respiration movement signal generated with effects resembling the real-world signal (upper plot) and the resulting instantaneous-frequency signals estimated by the best performing methods: Hilbert, TF moment, and root MUSIC. Observe that root MUSIC (yellow) estimated a respiration rate signal looking like the ideal reference (magenta) in the lower plot of Figure 3 .

Figure 4 shows that even when the root MUSIC algorithm implementation has an execution time longer than TF moment and Hilbert, this is not that critical, taking less than 200 ms to execute a run. That is, the estimation of the 2-min instantaneous-respiration frequency takes less than 200 ms, although the TF moment and the Hilbert implementations ran faster. The maximum absolute error, which is the maximum one in the whole length estimated signal, of the root MUSIC algorithm, however, is much lower than the other two (that is, TF moment and Hilbert). That is, less than two bpm as an average, for root MUSIC, versus estimates of 5 bpm and more than 14 bpm, for TF moment and Hilbert, respectively. The mean-squared error (MSE) of the root MUSIC algorithm (less than 0.5 bpm^{2} as an average) is also much lower than the MSE of the TF moment algorithm (4 bpm^{2} as an average) and the Hilbert algorithm (~ 10.5 bpm^{2}).

The box plots obtained for 1 000 runs (Figure 4) suggest that the differences between the root MUSIC algorithm and the others under assessment here, according to MAE and MSE, are statistically significant. We corroborated that by different statistical tests obtaining all p-values of 0 or very close to 0. Therefore, after this experimentation, we can recommend the root MUSIC algorithm for estimating the instantaneous-respiration frequency from respiratory volume surrogated signals.

Instantaneous-frequency estimators in the presence of real-world signals

We, finally, estimated the respiration rate signals from real-world respiratory movement signals (by using a thermistor close to the mouth and the nose) provided for the CLAIB 2019 Scientific Challenge held in Cancun, Mexico, in October 2019 (^{Paz Reyes et al., 2019}). This dataset comprises six respiratory-movement (reference) signals and six related instantaneous-frequency (inst_freq) signals for two subjects (s11 and s34) and three conditions (resting and moving, with a metronome guiding a 0.2-Hz respiration rate, and spontaneous breathing at rest). Then, we computed the mean-squared error of the estimated instantaneous-frequency signals from the reference signals, assuming the inst_freq signals as references. Table 1 shows the results.

We can see from Table 1 that there are several here implemented estimators of the instantaneous-frequency signals reaching comparable results to the inst_freq provided with the dataset. Among the estimators, the auto-regressive based approaches (Burg, Geometric Lattice, Least-Squares, and Forward-backward), the eigenvectors, and the root MUSIC got the best performances, assuming the inst_freq signals provided as gold standards. Root MUSIC is always ranked among the best-performing algorithms, reinforcing the idea of the previous results with the simulated signals.

CONCLUSIONS

Different algorithms for instantaneous-frequency estimation, or even the same algorithms with other input parameters, may obtain very different results.

Most algorithms have limited frequency resolution and are shift-variant. However, the AR-like methods, based on the roots of the time-varying polynomials (for example, AR, eigenvectors, root MUSIC), are better in this regard. However, the AR estimators fail in the presence of a certain level of noise (SNR<10) because it is almost impossible to obtain acceptably instantaneous-covariance estimations.

AR, eigenvectors, and root MUSIC algorithms are not as seriously affected by the values of the length of the analysis window (T) and the updating period (dt) as some other algorithms as the TF moment.

Some algorithms, as TF moment and Hilbert, exhibit oscillatory behavior, generating peaks and oscillations not due to the nature of the signals from which we estimated the instantaneous-frequency but because of the estimation process itself.

The root MUSIC algorithm outperforms the others under assessment, showing its superiority for instantaneous-respiratory frequency estimation from a volumetric surrogate signal. Therefore, root MUSIC is the algorithm of choice to estimate the time-varying respiration rate.

In future researches, we should implement and assess some other more recent high-resolution techniques, as the Iterative Sparse Asymptotic Minimum Variance Based Approach, and extend their use beyond the mono-component signals. We need this study for several applications, for example, for estimating the time-varying instantaneous-frequency of the alpha peak in the electroencephalography power spectrum.