Time Series

Filters

Apodization Window

@author: kkappler

Module to manage windowing prior to FFT. Intended to support most apodization windows available via scipy.signal.get_window()

Supported Window types = [‘boxcar’, ‘triang’, ‘blackman’, ‘hamming’, ‘hann’,
‘bartlett’, ‘flattop’, ‘parzen’, ‘bohman’, ‘blackmanharris’,
‘nuttall’, ‘barthann’, ‘kaiser’, ‘gaussian’, ‘general_gaussian’,
‘slepian’, ‘chebwin’]

have_additional_args = {
‘kaiser’ : ‘beta’,
‘gaussian’ : ‘std’,
‘general_gaussian’ : (‘power’, ‘width’),
‘slepian’ : ‘width’,
‘chebwin’ : ‘attenuation’,
}

The Taper Config has 2 possible forms: 1. Standard form for accessing scipy.signal: [“taper_family”, “num_samples_window”, “additional_args”] 2. User-defined : for defining custom tapers

Example 1 : Standard form “taper_family” = “hamming” “num_samples_window” = 128 “additional_args” = {}

Example 2 : Standard form “taper_family” = “kaiser” “num_samples_window” = 64 “additional_args” = {“beta”:8}

Examples 3 : User Defined 2. user-defined: [“array”] In this case num_samples_window is defined by the array. “array” = [1, 2, 3, 4, 5, 4, 3, 2, 1] If “array” is non-empty then assume the user-defined case.

It is a little bit unsatisfying that the args need to be ordered for scipy.signal.get_window(). Probably use OrderedDict() for any windows that have more than one additional args.

For example “taper_family” = ‘general_gaussian’ “additional_args” = OrderedDict(“power”:1.5, “sigma”:7)

class aurora.time_series.apodization_window.ApodizationWindow(**kwargs)[source]

Bases: object

Instantiate an apodization window object. Example usages: apod_window = ApodizationWindow() taper=ApodizationWindow(taper_family=’hanning’, num_samples_window=55 )

Window factors S1, S2, CG, ENBW are modelled after Heinzel et al. p12-14 [1] Spectrum and spectral density estimation by the Discrete Fourier transform (DFT), including a comprehensive list of window functions and some new flat-top windows. G. Heinzel, A. Roudiger and R. Schilling, Max-Planck Institut fur Gravitationsphysik (Albert-Einstein-Institut) Teilinstitut Hannover February 15, 2002 See Also [2] Harris FJ. On the use of windows for harmonic analysis with the discrete Fourier transform. Proceedings of the IEEE. 1978 Jan;66(1):51-83.

Nomenclature from Heinzel et al. ENBW: Effective Noise BandWidth, see Equation (22) NENBW Normalized Equivalent Noise BandWidth, see Equation (21)

Parameters:
  • taper_family (string) – Specify the taper type - boxcar, kaiser, hanning, etc

  • num_samples_window (int) – The number of samples in the taper

  • taper (numpy array) – The actual window coefficients themselves. This can be passed if a particular custom window is desired.

  • additional_args (dictionary) – These are any additional requirements scipy needs in order to generate the window.

Attributes:
S1

sum of the window coefficients

S2

sum of squares of the window coefficients

apodization_factor
coherent_gain

DC gain of the window normalized by window length

nenbw

NENBW Normalized Equivalent Noise BandWidth, see Equation (21) in

num_samples_window
summary

Returns

taper

Methods

enbw(fs)

Notes that unlike NENBW, CG, S1, S2, this is not a pure property of the window -- but instead this is a property of the window combined with the sample rate.

make()

this is just a wrapper call to scipy.signal Note: see scipy.signal.get_window for a description of what is expected in args[1:].

test_linear_spectral_density_factor()

This is just a test to verify some algebra Claim: The lsd_calibration factors A (1./coherent_gain)*np.sqrt((2*dt)/(nenbw*N)) and B np.sqrt(2/(sample_rate*self.S2))

property S1

sum of the window coefficients

property S2

sum of squares of the window coefficients

property apodization_factor
property coherent_gain

DC gain of the window normalized by window length

enbw(fs)[source]

Notes that unlike NENBW, CG, S1, S2, this is not a pure property of the window – but instead this is a property of the window combined with the sample rate. :param fs: :type fs: sampling frequency (1/dt)

make()[source]

this is just a wrapper call to scipy.signal Note: see scipy.signal.get_window for a description of what is expected in args[1:]. http://docs.scipy.org/doc/scipy/reference/ generated/scipy.signal.get_window.html

note: this is just repackaging the args so that scipy.signal.get_window() accepts all cases.

property nenbw

NENBW Normalized Equivalent Noise BandWidth, see Equation (21) in Heinzel et al 2002

property num_samples_window
property summary

returns: out_str – String comprised of the taper_family, number_of_samples, and True/False if self.taper is not None :rtype: str

property taper
test_linear_spectral_density_factor()[source]

This is just a test to verify some algebra Claim: The lsd_calibration factors A (1./coherent_gain)*np.sqrt((2*dt)/(nenbw*N)) and B np.sqrt(2/(sample_rate*self.S2))

are identical.

Note sqrt(2*dt)==sqrt(2*sample_rate) so we can cancel these terms and A=B IFF

(1./coherent_gain) * np.sqrt(1/(nenbw*N)) == 1/np.sqrt(S2) which I show in githib aurora issue #3 via . (CG**2) * NENBW *N = S2

Decorators

Here is the decorator pattern
def decorator(func):
@functools.wraps(func)
def wrapper_decorator(*args, **kwargs):
# Do something before
value = func*args, **kwargs)
# Do something after
return value
return wrapper_decorator
aurora.time_series.decorators.can_use_xr_dataarray(func)[source]

Intended as a decorator. Most of the windowed time series methods are intended to work with xarray.Dataset class. But I would like to be able to pass them xarray.DataArray objects. This class casts a DataArray to a Dataset, runs it through func and casts back to a DataArray.

A similar decorator could be written for numpy arrays. :param func:

Frequency Band Helpers

aurora.time_series.frequency_band_helpers.adjust_band_for_coherence_sorting(frequency_band, spectrogram, rule='min3')[source]
Parameters:
  • frequency_band

  • spectrogram (Spectrogram) –

  • rule

aurora.time_series.frequency_band_helpers.check_time_axes_synched(X, Y)[source]

Utility function for checking that time axes agree

Parameters:
  • X (xarray) –

  • Y (xarray) –

aurora.time_series.frequency_band_helpers.cross_spectra(X, Y)[source]
aurora.time_series.frequency_band_helpers.extract_band(frequency_band, fft_obj, channels=[], epsilon=1e-07)[source]

Stand alone method that operates on an xr.DataArray, and is wrapped with Spectrogram

Note #1: 20230902 drop=True does not play nice with h5py and Dataset, results in a type error. File “stringsource”, line 2, in h5py.h5r.Reference.__reduce_cython__ TypeError: no default __reduce__ due to non-trivial __cinit__ However, it works OK with DataArray, so maybe use data array in general

Parameters:
  • frequency_band (mt_metadata.transfer_functions.processing.aurora.band.Band) – Specifies interval corresponding to a frequency band

  • fft_obj (xarray.core.dataset.Dataset) – To be replaced with an fft_obj() class in future

  • epsilon (float) – Use this when you are worried about missing a frequency due to round off error. This is in general not needed if we use a df/2 pad around true harmonics.

Returns:

band – The frequencies within the band passed into this function

Return type:

xr.DataArray

aurora.time_series.frequency_band_helpers.get_band_for_tf_estimate(band, dec_level_config, local_stft_obj, remote_stft_obj)[source]

Returns spectrograms X, Y, RR for harmonics within the given band

Parameters:
  • band (mt_metadata.transfer_functions.processing.aurora.FrequencyBands) – object with lower_bound and upper_bound to tell stft object which subarray to return

  • config (mt_metadata.transfer_functions.processing.aurora.decimation_level.DecimationLevel) – information about the input and output channels needed for TF estimation problem setup

  • local_stft_obj (xarray.core.dataset.Dataset or None) – Time series of Fourier coefficients for the station whose TF is to be estimated

  • remote_stft_obj (xarray.core.dataset.Dataset or None) – Time series of Fourier coefficients for the remote reference station

Returns:

X, Y, RR – data structures as local_stft_object and remote_stft_object, but restricted only to input_channels, output_channels, reference_channels and also the frequency axes are restricted to being within the frequency band given as an input argument.

Return type:

xarray.core.dataset.Dataset or None

Time Axis Helpers

aurora.time_series.time_axis_helpers.decide_time_axis_method(sample_rate)[source]
aurora.time_series.time_axis_helpers.do_some_tests()[source]
aurora.time_series.time_axis_helpers.fast_arange(t0, n_samples, sample_rate)[source]
aurora.time_series.time_axis_helpers.main()[source]
aurora.time_series.time_axis_helpers.make_time_axis(t0, n_samples, sample_rate)[source]
aurora.time_series.time_axis_helpers.slow_comprehension(t0, n_samples, sample_rate)[source]
aurora.time_series.time_axis_helpers.test_generate_time_axis(t0, n_samples, sample_rate)[source]

Two obvious ways to generate an axis of timestamps here. One method is slow and more precise, the other is fast but drops some nanoseconds due to integer roundoff error.

To see this, consider the example of say 3Hz, we are 333333333ns between samples, which drops 1ns per second if we scale a nanoseconds=np.arange(N) The issue here is that the nanoseconds granularity forces a roundoff error

Probably will use logic like: | if there_are_integer_ns_per_sample: | time_stamps = do_it_the_fast_way() | else: | time_stamps = do_it_the_slow_way() | return time_stamps

Parameters:
  • t0 (_type_) – _description_

  • n_samples (_type_) – _description_

  • sample_rate (_type_) – _description_

Window Helpers

Notes in google doc: https://docs.google.com/document/d/1CsRhSLXsRG8HQxM4lKNqVj-V9KA9iUQAvCOtouVzFs0/edit?usp=sharing

aurora.time_series.window_helpers.apply_fft_to_windowed_array(windowed_array)[source]

This will operate row-wise as well :param windowed_array:

aurora.time_series.window_helpers.available_number_of_windows_in_array(n_samples_array, n_samples_window, n_advance)[source]
Parameters:
  • n_samples_array (int) – The length of the time series

  • n_samples_window (int) – The length of the window (in samples)

  • n_advance (int) – The number of samples the window advances at each step

Returns:

available_number_of_strides – The number of windows the time series will yield

Return type:

int

aurora.time_series.window_helpers.check_all_sliding_window_functions_are_equivalent()[source]

simple sanity check that runs each sliding window function on a small array and confirms the results are numerically identical. Note that striding window will return int types where others return float.

aurora.time_series.window_helpers.do_some_tests()[source]
aurora.time_series.window_helpers.main()[source]
aurora.time_series.window_helpers.sliding_window_crude(data, num_samples_window, num_samples_advance, num_windows=None)[source]
Parameters:
  • data (np.ndarray) – The time series data to be windowed

  • num_samples_window (int) – The length of the window (in samples)

  • num_samples_advance (int) – The number of samples the window advances at each step

  • num_windows (int) – The number of windows to “take”. Must be less or equal to the number of available windows.

Returns:

output_array – The windowed time series

Return type:

numpy.ndarray

aurora.time_series.window_helpers.sliding_window_numba(data, num_samples_window, num_samples_advance, num_windows)[source]
Parameters:
  • data (np.ndarray) – The time series data to be windowed

  • num_samples_window (int) – The length of the window (in samples)

  • num_samples_advance (int) – The number of samples the window advances at each step

  • num_windows (int) – The number of windows to “take”.

Returns:

output_array – The windowed time series

Return type:

numpy.ndarray

aurora.time_series.window_helpers.striding_window(data, num_samples_window, num_samples_advance, num_windows=None)[source]

Applies a striding window to an array. We use 1D arrays here. Note that this method is extendable to N-dimensional arrays as was once shown at http://www.johnvinyard.com/blog/?p=268

Karl has an implementation of this code but chose to restict to 1D here. This is becuase of several warnings encountered, on the notes of stride_tricks.py, as well as for example here: https://stackoverflow.com/questions/4936620/using-strides-for-an-efficient-moving-average-filter

While we can possibly setup Aurora so that no copies of the strided window are made downstream, we cannot guarantee that another user may not add methods that require copies. For robustness we will use 1d implementation only for now.

Another clean example of this method can be found in the razorback codes from brgm.

result is 2d: result[i] is the i-th window

>>> sliding_window(np.arange(15), 4, 3, 2)
array([[0, 1, 2],
       [2, 3, 4],
       [4, 5, 6],
       [6, 7, 8]])
Parameters:
  • data (np.ndarray) – The time series data to be windowed

  • num_samples_window (int) – The length of the window (in samples)

  • num_samples_advance (int) – The number of samples the window advances at each step

  • num_windows (int) – The number of windows to “take”. Must be less or equal to the number of available windows.

Returns:

strided_window – The windowed time series

Return type:

numpy.ndarray

aurora.time_series.window_helpers.test_apply_taper()[source]

Windowed Time Series

class aurora.time_series.windowed_time_series.WindowedTimeSeries[source]

Bases: object

Time series that has been chopped into (possibly) overlapping windows.

This is a place where we can put methods that operate on these sorts of objects.

The assumption is that we take xarrays keyed by “channel”

Specific methods:

Demean Detrend Prewhiten stft invert_prewhitening

probably make these @staticmethod s so we import WindowedTimeSeries and then call the static methods

Methods

apply_stft([data, sample_rate, ...])

Only supports xr.Dataset at this point

apply_taper

staticmethod(function) -> method

delay_correction(dataset, run_obj)

param dataset:

detrend([data, detrend_axis, detrend_type, ...])

Notes: overwrite data=True probably best for most applications but be careful

static apply_stft(data=None, sample_rate=None, detrend_type=None, spectral_density_calibration=1.0, fft_axis=None)[source]

Only supports xr.Dataset at this point

Parameters:
  • data

  • sample_rate

  • detrend_type

apply_taper()

staticmethod(function) -> method

Convert a function to be a static method.

A static method does not receive an implicit first argument. To declare a static method, use this idiom:

class C:

@staticmethod def f(arg1, arg2, …):

It can be called either on the class (e.g. C.f()) or on an instance (e.g. C().f()). Both the class and the instance are ignored, and neither is passed implicitly as the first argument to the method.

Static methods in Python are similar to those found in Java or C++. For a more advanced concept, see the classmethod builtin.

delay_correction(dataset, run_obj)[source]
Parameters:
  • dataset (xr.Dataset) –

  • run_obj

static detrend(data=None, detrend_axis=None, detrend_type=None, inplace=True)[source]
Notes: overwrite data=True probably best for most applications but be careful

with that. Do we want to avoid this in general? Could we be possibly overwriting stuff on MTH5 in future? Also, is overwrite even working how I think it is here? Overwrite_data not working right in scipy.signal, dont use it for now

Parameters:
  • data (xarray Dataset) –

  • detrend_axis (string) –

  • detrend_type (string) – “linear” or “constant” This argument is provided to scipy.signal.detrend

aurora.time_series.windowed_time_series.get_time_coordinate_axis(dataset)[source]

It is common to pass an argument to scipy.signal methods axis=int where that integer specifies along which dimension we are applying the operator. This method helps ensure that we have the correct axis. :param dataset: :type dataset: xarray.Dataset

aurora.time_series.windowed_time_series.validate_coordinate_ordering_time_domain(dataset)[source]

Check that the data dimensions are what you expect. This may evolve, but for now, just want to make sure that we are operating along the correct axes when we demean, detrend, taper, etc. :param dataset: :type dataset: xarray.Dataset

Windowing Scheme

The windowing scheme defines the chunking and chopping of the time series for the Short Time Fourier Transform. Often referred to as a “sliding window” or a “striding window”. Iin its most basic form it is a taper with a rule to say how far to advance at each stride (or step).

To generate an array of data-windows from a data series we only need the two parameters window_length (L) and window_overlap (V). The parameter “window_advance” (L-V) can be used in lieu of overlap. Sliding windows are normally described terms of overlap but it is cleaner to code in terms of advance.

Choices L and V are usually made with some knowledge of time series sample rate, duration, and the frequency band of interest. In aurora because this is used to prep for STFT, L is typically a power of 2.

In general we will need one instance of this class per decimation level, but in practice often leave the windowing scheme the same for each decimation level.

This class is a key part of the “gateway” to frequency domain, so it has been given a sampling_rate attribute. While sampling rate is a property of the data, and not the windowing scheme per se, it is good for this class to be aware of the sampling rate.

Future modifications could involve: - binding this class with a time series. - Making a subclass with only L, V, and then having an extension with sample_rate

When 2D arrays are generated how should we index them? | [[ 0 1 2] | [ 2 3 4] | [ 4 5 6] | [ 6 7 8] | [ 8 9 10] | [10 11 12] | [12 13 14]]

In this example the rows are indexing the individual windows … and so they should be associated with the time of each window. We will need to set a standard for this. Obvious options are center_time of window and time_of_first sample. I prefer time_of_first sample. This can always be transformed to center time or another standard later. We can call this the “window time axis”. The columns are indexing “steps of delta-t”. The actual times are different for every row, so it would be best to use something like [0, dt, 2*dt] for that axis to keep it general. We can call this the “within-window sample time axis”

TODO: Regarding the optional time_vector input to self.apply_sliding_window() … this current implementation takes as input numpy array data. We need to also allow for an xarray to be implemented. In the simplest case we would take an xarray in and extract its “time” axis as time vector

20210529 This class is going to be modified to only accept xarray as input data. We can force any incoming numpy arrays to be either xr.DataArray or xr.Dataset. Similarly, output will be only xr.DataArray or xr.Dataset

class aurora.time_series.windowing_scheme.WindowingScheme(**kwargs)[source]

Bases: ApodizationWindow

20210415: Casting window length, overlap, advance, etc. in terms of number of samples or “points” here as this is common signal processing the nomenclature. We may provide an interface to define these things in terms of percent, duration in seconds etc. in a supporting module.

Note that sample_rate is actually a property of the data and not of the window … still not sure if we want to make sample_rate an attr here or if its better to put properties like window_duration() as a method of some composition of time series and windowing scheme.

Attributes:
S1

sum of the window coefficients

S2

sum of squares of the window coefficients

apodization_factor
coherent_gain

DC gain of the window normalized by window length

dt

comes from data

duration_advance
linear_spectral_density_calibration_factor

Returns ——- float calibration_factor: Following Hienzel et al 2002, Equations 24 and 25 for Linear Spectral Density correction for a single sided spectrum.

nenbw

NENBW Normalized Equivalent Noise BandWidth, see Equation (21) in

num_samples_advance

A derived property.

num_samples_window
summary

Returns

taper
window_duration

units are SI seconds assuming dt is SI seconds

Methods

apply_fft(data[, ...])

param data:

apply_sliding_window(data[, time_vector, ...])

I would like this method to support numpy arrays as well as xarrays.

apply_spectral_density_calibration(dataset)

param dataset:

apply_taper(data)

modifies the data in place by applying a taper to each window TODO: consider adding an option to return a copy of the data without the taper applied

available_number_of_windows(num_samples_data)

param num_samples_data:

The number of samples in the time series to be windowed by self.

cast_windowed_data_to_xarray(windowed_array, ...)

TODO?: Factor this method to a standalone function in window_helpers?

compute_window_edge_indices(num_samples_data)

This has been useful in the past but maybe not needed here

downsample_time_axis(time_axis)

param time_axis:

This is the time axis associated with the time-series prior to

enbw(fs)

Notes that unlike NENBW, CG, S1, S2, this is not a pure property of the window -- but instead this is a property of the window combined with the sample rate.

make()

this is just a wrapper call to scipy.signal Note: see scipy.signal.get_window for a description of what is expected in args[1:].

test_linear_spectral_density_factor()

This is just a test to verify some algebra Claim: The lsd_calibration factors A (1./coherent_gain)*np.sqrt((2*dt)/(nenbw*N)) and B np.sqrt(2/(sample_rate*self.S2))

clone

frequency_axis

left_hand_window_edge_indices

apply_fft(data, spectral_density_correction=True, detrend_type='linear')[source]
Parameters:
  • data (xarray.core.dataset.Dataset) –

  • spectral_density_correction (boolean) –

  • detrend_type (string) –

Return type:

spectral_ds

Assume we have already applied sliding window and taper. Things to think about: We want to assign the frequency axis during this method

apply_sliding_window(data, time_vector=None, dt=None, return_xarray=False)[source]

I would like this method to support numpy arrays as well as xarrays.

Parameters:
  • data (1D numpy array, xr.DataArray, xr.Dataset) – The data to break into ensembles.

  • time_vector (1D numpy array) – The time axis of the data.

  • dt (float) – The sample interval of the data (reciprocal of sample_rate)

  • return_xarray (boolean) – If True will return an xarray object, even if the input object was a numpy array

Returns:

windowed_obj – Normally an object of type xarray.core.dataarray.DataArray Could be numpy array as well.

Return type:

arraylike

apply_spectral_density_calibration(dataset)[source]
Parameters:

dataset

apply_taper(data)[source]

modifies the data in place by applying a taper to each window TODO: consider adding an option to return a copy of the data without the taper applied

available_number_of_windows(num_samples_data)[source]
Parameters:

num_samples_data (int) – The number of samples in the time series to be windowed by self.

Returns:

number_of_windows – Count of the number of windows returned from time series of num_samples_data. Only take as many windows as available without wrapping. Start with one window for free, move forward by num_samples_advance and don’t walk over the cliff.

Return type:

int

cast_windowed_data_to_xarray(windowed_array, time_vector, dt=None)[source]

TODO?: Factor this method to a standalone function in window_helpers?

Parameters:
  • windowed_array

  • time_vector

  • dt

clone()[source]
compute_window_edge_indices(num_samples_data)[source]

This has been useful in the past but maybe not needed here

downsample_time_axis(time_axis)[source]
Parameters:

time_axis (arraylike) – This is the time axis associated with the time-series prior to the windowing operation.

Returns:

window_time_axis – This is a time axis for the windowed data. Say that we had 1Hz data starting at t=0 and 100 samples. Then we window, with window length 10, and advance 10, the window time axis is [0, 10, 20 , … 90]. Say the same window length, but now advance is 5. Then [0, 5, 10, 15, … 90] is the result.

Return type:

array-like

property dt

comes from data

property duration_advance
frequency_axis(dt)[source]
left_hand_window_edge_indices(num_samples_data)[source]
property linear_spectral_density_calibration_factor
Returns:

Following Hienzel et al 2002,

Equations 24 and 25 for Linear Spectral Density correction for a single sided spectrum.

Return type:

calibration_factor

Return type:

float

property num_samples_advance

A derived property. If we made this a fundamental defined property then overlap would become a derived property. Overlap is more conventional than advance in the literature however so we choose it as our property label.

property window_duration

units are SI seconds assuming dt is SI seconds

aurora.time_series.windowing_scheme.fft_xr_ds(dataset, sample_rate, detrend_type=None, prewhitening=None)[source]

This should call window_helpers.apply_fft_to_windowed_array or get moved to window_helpers.py

The returned harmonics do not include the Nyquist frequency. To modify this add +1 to n_fft_harmonics. Also, only 1-sided ffts are returned.

For each channel within the Dataset, fft is applied along the within-window-time axis of the associated numpy array

Parameters:
  • dataset (xr.Dataset) – Data are 2D (windowed univariate time series).

  • sample_rate (float) –

  • detrend_type

  • prewhitening

aurora.time_series.windowing_scheme.window_scheme_from_decimation(decimation)[source]

Helper function to workaround mt_metadata to not import form aurora

Parameters:
  • decimation (mt_metadata.transfer_function.processing.aurora.decimation_level) –

  • .DecimationLevel

Return type:

windowing_scheme aurora.time_series.windowing_scheme.WindowingScheme