Synthetic Data Tutorial

This notebook shows how to process MTH5 data from a synthetic dataset.

It also shows how to modify processing so that Fourier coefficients are saved in the mth5. These FCs can be used to perform TF estimation using different regression settings. The same FCs will be used for feature extraction in future (This section is a work in Progress).

Contents:

  1. Process Synthetic Data with Aurora

  2. Fourier coefficient storage in MTH5

Process Synthetic Data with Aurora

Here is a minimal example of running aurora processing on an mth5 populated with synthetic time series.

Steps:

  1. Create the synthetic mth5

  2. Get a Run Summary from the mth5

  3. Select the station to process and optionally the remote reference station

  4. Create a processing config

  5. Generate TFs

  6. Archive the TFs (in emtf_xml or z-file)

Here are the modules we will need to import

[ ]:
import pathlib
import warnings

from aurora.config.config_creator import ConfigCreator
from aurora.pipelines.process_mth5 import process_mth5
from mth5.data.make_mth5_from_asc import create_test12rr_h5
from mth5.data.paths import SyntheticTestPaths
from mth5.processing import RunSummary, KernelDataset

warnings.filterwarnings('ignore')
/home/kkappler/software/irismt/mtpy-v2/mtpy/modeling/simpeg/recipes/inversion_2d.py:39: UserWarning: Pardiso not installed see https://github.com/simpeg/pydiso/blob/main/README.md.
  warnings.warn(

Define target folder and mth5 path

By default, the synthetic mth5 file is used for testing in aurora/tests/synthetic/ and probably already exists on your system if you have run the tests. In the code below, we check if the file exists already, and if not we make it.

NOTE: If using a read-only HPC installation, you may not be able to write to the directory where aurora is installed. In that case, defining your target path as somewhere you have write permission. In that case, uncommment the READ ONLY INSTALLATION block below.

[2]:
synthetic_test_paths = SyntheticTestPaths()
target_folder = synthetic_test_paths.mth5_path

## READ ONLY INSTALLATION
# home = pathlib.Path.home()
# target_folder = home.joinpath("aurora_test_folder")
# target_folder.mkdir(parents=True, exist_ok=True)

mth5_path = target_folder.joinpath("test12rr.h5")

If the mth5 doesn’t already exist, or you want to re-make it, call create_test12rr_h5()

[3]:
# Uncomment this to start with a fresh mth5 file

# mth5_path.unlink()
[4]:
if not mth5_path.exists():
    create_test12rr_h5(target_folder=target_folder)

Get a Run Summary

Note that we didn’t need to explicitly open the mth5 to do that, we can pass the path if we want. Run summary takes a list of mth5 paths as input argument.

[5]:
mth5_run_summary = RunSummary()
mth5_run_summary.from_mth5s([mth5_path,])
run_summary = mth5_run_summary.clone()
run_summary.mini_summary
25:03:08T12:58:52 | INFO | line:777 |mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/mth5/mth5/data/mth5/test12rr.h5
[5]:
survey station run start end duration
0 EMTF Synthetic test1 001 1980-01-01 00:00:00+00:00 1980-01-01 11:06:39+00:00 39999.0
1 EMTF Synthetic test2 001 1980-01-01 00:00:00+00:00 1980-01-01 11:06:39+00:00 39999.0

Define a Kernel Dataset

[6]:
kernel_dataset = KernelDataset()
kernel_dataset.from_run_summary(run_summary, "test1", "test2")
kernel_dataset.mini_summary
25:03:08T12:58:52 | INFO | line:262 |mtpy.processing.kernel_dataset | _add_columns | KernelDataset DataFrame needs column fc, adding and setting dtype to <class 'pandas._libs.missing.NAType'>.
25:03:08T12:58:52 | INFO | line:262 |mtpy.processing.kernel_dataset | _add_columns | KernelDataset DataFrame needs column remote, adding and setting dtype to <class 'bool'>.
25:03:08T12:58:52 | INFO | line:262 |mtpy.processing.kernel_dataset | _add_columns | KernelDataset DataFrame needs column run_dataarray, adding and setting dtype to <class 'NoneType'>.
25:03:08T12:58:52 | INFO | line:262 |mtpy.processing.kernel_dataset | _add_columns | KernelDataset DataFrame needs column stft, adding and setting dtype to <class 'NoneType'>.
25:03:08T12:58:52 | INFO | line:262 |mtpy.processing.kernel_dataset | _add_columns | KernelDataset DataFrame needs column mth5_obj, adding and setting dtype to <class 'NoneType'>.
[6]:
survey station run start end duration
0 EMTF Synthetic test1 001 1980-01-01 00:00:00+00:00 1980-01-01 11:06:39+00:00 39999.0
1 EMTF Synthetic test2 001 1980-01-01 00:00:00+00:00 1980-01-01 11:06:39+00:00 39999.0

Now define the processing Configuration

The only things we need to provide are our band processing scheme, and the data sample rate to generate a default processing configuration.

The config will get its information about the specific stations to process via the kernel dataset.

NOTE: When doing only single station processing you need to specify RME processing (rather than remote reference processing which expects extra time series from another station)

[7]:
cc = ConfigCreator()
config = cc.create_from_kernel_dataset(kernel_dataset)

# you can export the config to a json by uncommenting the following line
# cfg_json = config.to_json()
25:03:08T12:58:52 | INFO | line:108 |aurora.config.config_creator | determine_band_specification_style | Bands not defined; setting to EMTF BANDS_DEFAULT_FILE

Take a look at the processing configuration

[8]:
config
[8]:
{
    "processing": {
        "band_setup_file": "/home/kkappler/software/irismt/aurora/aurora/config/emtf_band_setup/bs_test.cfg",
        "band_specification_style": "EMTF",
        "channel_nomenclature.ex": "ex",
        "channel_nomenclature.ey": "ey",
        "channel_nomenclature.hx": "hx",
        "channel_nomenclature.hy": "hy",
        "channel_nomenclature.hz": "hz",
        "decimations": [
            {
                "decimation_level": {
                    "bands": [
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 0,
                                "frequency_max": 0.23828125,
                                "frequency_min": 0.19140625,
                                "index_max": 30,
                                "index_min": 25
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 0,
                                "frequency_max": 0.19140625,
                                "frequency_min": 0.15234375,
                                "index_max": 24,
                                "index_min": 20
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 0,
                                "frequency_max": 0.15234375,
                                "frequency_min": 0.12109375,
                                "index_max": 19,
                                "index_min": 16
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 0,
                                "frequency_max": 0.12109375,
                                "frequency_min": 0.09765625,
                                "index_max": 15,
                                "index_min": 13
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 0,
                                "frequency_max": 0.09765625,
                                "frequency_min": 0.07421875,
                                "index_max": 12,
                                "index_min": 10
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 0,
                                "frequency_max": 0.07421875,
                                "frequency_min": 0.05859375,
                                "index_max": 9,
                                "index_min": 8
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 0,
                                "frequency_max": 0.05859375,
                                "frequency_min": 0.04296875,
                                "index_max": 7,
                                "index_min": 6
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 0,
                                "frequency_max": 0.04296875,
                                "frequency_min": 0.03515625,
                                "index_max": 5,
                                "index_min": 5
                            }
                        }
                    ],
                    "decimation.anti_alias_filter": "default",
                    "decimation.factor": 1.0,
                    "decimation.level": 0,
                    "decimation.method": "default",
                    "decimation.sample_rate": 1.0,
                    "estimator.engine": "RME_RR",
                    "estimator.estimate_per_channel": true,
                    "input_channels": [
                        "hx",
                        "hy"
                    ],
                    "output_channels": [
                        "ex",
                        "ey",
                        "hz"
                    ],
                    "reference_channels": [
                        "hx",
                        "hy"
                    ],
                    "regression.max_iterations": 10,
                    "regression.max_redescending_iterations": 2,
                    "regression.minimum_cycles": 10,
                    "regression.r0": 1.5,
                    "regression.tolerance": 0.005,
                    "regression.u0": 2.8,
                    "regression.verbosity": 0,
                    "save_fcs": false,
                    "stft.harmonic_indices": [
                        -1
                    ],
                    "stft.method": "fft",
                    "stft.min_num_stft_windows": 2,
                    "stft.per_window_detrend_type": "linear",
                    "stft.pre_fft_detrend_type": "linear",
                    "stft.prewhitening_type": "first difference",
                    "stft.recoloring": true,
                    "stft.window.clock_zero_type": "ignore",
                    "stft.window.normalized": true,
                    "stft.window.num_samples": 128,
                    "stft.window.overlap": 32,
                    "stft.window.type": "boxcar"
                }
            },
            {
                "decimation_level": {
                    "bands": [
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 1,
                                "frequency_max": 0.0341796875,
                                "frequency_min": 0.0263671875,
                                "index_max": 17,
                                "index_min": 14
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 1,
                                "frequency_max": 0.0263671875,
                                "frequency_min": 0.0205078125,
                                "index_max": 13,
                                "index_min": 11
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 1,
                                "frequency_max": 0.0205078125,
                                "frequency_min": 0.0166015625,
                                "index_max": 10,
                                "index_min": 9
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 1,
                                "frequency_max": 0.0166015625,
                                "frequency_min": 0.0126953125,
                                "index_max": 8,
                                "index_min": 7
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 1,
                                "frequency_max": 0.0126953125,
                                "frequency_min": 0.0107421875,
                                "index_max": 6,
                                "index_min": 6
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 1,
                                "frequency_max": 0.0107421875,
                                "frequency_min": 0.0087890625,
                                "index_max": 5,
                                "index_min": 5
                            }
                        }
                    ],
                    "decimation.anti_alias_filter": "default",
                    "decimation.factor": 4.0,
                    "decimation.level": 1,
                    "decimation.method": "default",
                    "decimation.sample_rate": 0.25,
                    "estimator.engine": "RME_RR",
                    "estimator.estimate_per_channel": true,
                    "input_channels": [
                        "hx",
                        "hy"
                    ],
                    "output_channels": [
                        "ex",
                        "ey",
                        "hz"
                    ],
                    "reference_channels": [
                        "hx",
                        "hy"
                    ],
                    "regression.max_iterations": 10,
                    "regression.max_redescending_iterations": 2,
                    "regression.minimum_cycles": 10,
                    "regression.r0": 1.5,
                    "regression.tolerance": 0.005,
                    "regression.u0": 2.8,
                    "regression.verbosity": 0,
                    "save_fcs": false,
                    "stft.harmonic_indices": [
                        -1
                    ],
                    "stft.method": "fft",
                    "stft.min_num_stft_windows": 2,
                    "stft.per_window_detrend_type": "linear",
                    "stft.pre_fft_detrend_type": "linear",
                    "stft.prewhitening_type": "first difference",
                    "stft.recoloring": true,
                    "stft.window.clock_zero_type": "ignore",
                    "stft.window.normalized": true,
                    "stft.window.num_samples": 128,
                    "stft.window.overlap": 32,
                    "stft.window.type": "boxcar"
                }
            },
            {
                "decimation_level": {
                    "bands": [
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 2,
                                "frequency_max": 0.008544921875,
                                "frequency_min": 0.006591796875,
                                "index_max": 17,
                                "index_min": 14
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 2,
                                "frequency_max": 0.006591796875,
                                "frequency_min": 0.005126953125,
                                "index_max": 13,
                                "index_min": 11
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 2,
                                "frequency_max": 0.005126953125,
                                "frequency_min": 0.004150390625,
                                "index_max": 10,
                                "index_min": 9
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 2,
                                "frequency_max": 0.004150390625,
                                "frequency_min": 0.003173828125,
                                "index_max": 8,
                                "index_min": 7
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 2,
                                "frequency_max": 0.003173828125,
                                "frequency_min": 0.002685546875,
                                "index_max": 6,
                                "index_min": 6
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 2,
                                "frequency_max": 0.002685546875,
                                "frequency_min": 0.002197265625,
                                "index_max": 5,
                                "index_min": 5
                            }
                        }
                    ],
                    "decimation.anti_alias_filter": "default",
                    "decimation.factor": 4.0,
                    "decimation.level": 2,
                    "decimation.method": "default",
                    "decimation.sample_rate": 0.0625,
                    "estimator.engine": "RME_RR",
                    "estimator.estimate_per_channel": true,
                    "input_channels": [
                        "hx",
                        "hy"
                    ],
                    "output_channels": [
                        "ex",
                        "ey",
                        "hz"
                    ],
                    "reference_channels": [
                        "hx",
                        "hy"
                    ],
                    "regression.max_iterations": 10,
                    "regression.max_redescending_iterations": 2,
                    "regression.minimum_cycles": 10,
                    "regression.r0": 1.5,
                    "regression.tolerance": 0.005,
                    "regression.u0": 2.8,
                    "regression.verbosity": 0,
                    "save_fcs": false,
                    "stft.harmonic_indices": [
                        -1
                    ],
                    "stft.method": "fft",
                    "stft.min_num_stft_windows": 2,
                    "stft.per_window_detrend_type": "linear",
                    "stft.pre_fft_detrend_type": "linear",
                    "stft.prewhitening_type": "first difference",
                    "stft.recoloring": true,
                    "stft.window.clock_zero_type": "ignore",
                    "stft.window.normalized": true,
                    "stft.window.num_samples": 128,
                    "stft.window.overlap": 32,
                    "stft.window.type": "boxcar"
                }
            },
            {
                "decimation_level": {
                    "bands": [
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 3,
                                "frequency_max": 0.00274658203125,
                                "frequency_min": 0.00213623046875,
                                "index_max": 22,
                                "index_min": 18
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 3,
                                "frequency_max": 0.00213623046875,
                                "frequency_min": 0.00164794921875,
                                "index_max": 17,
                                "index_min": 14
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 3,
                                "frequency_max": 0.00164794921875,
                                "frequency_min": 0.00115966796875,
                                "index_max": 13,
                                "index_min": 10
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 3,
                                "frequency_max": 0.00115966796875,
                                "frequency_min": 0.00079345703125,
                                "index_max": 9,
                                "index_min": 7
                            }
                        },
                        {
                            "band": {
                                "center_averaging_type": "geometric",
                                "closed": "left",
                                "decimation_level": 3,
                                "frequency_max": 0.00079345703125,
                                "frequency_min": 0.00054931640625,
                                "index_max": 6,
                                "index_min": 5
                            }
                        }
                    ],
                    "decimation.anti_alias_filter": "default",
                    "decimation.factor": 4.0,
                    "decimation.level": 3,
                    "decimation.method": "default",
                    "decimation.sample_rate": 0.015625,
                    "estimator.engine": "RME_RR",
                    "estimator.estimate_per_channel": true,
                    "input_channels": [
                        "hx",
                        "hy"
                    ],
                    "output_channels": [
                        "ex",
                        "ey",
                        "hz"
                    ],
                    "reference_channels": [
                        "hx",
                        "hy"
                    ],
                    "regression.max_iterations": 10,
                    "regression.max_redescending_iterations": 2,
                    "regression.minimum_cycles": 10,
                    "regression.r0": 1.5,
                    "regression.tolerance": 0.005,
                    "regression.u0": 2.8,
                    "regression.verbosity": 0,
                    "save_fcs": false,
                    "stft.harmonic_indices": [
                        -1
                    ],
                    "stft.method": "fft",
                    "stft.min_num_stft_windows": 2,
                    "stft.per_window_detrend_type": "linear",
                    "stft.pre_fft_detrend_type": "linear",
                    "stft.prewhitening_type": "first difference",
                    "stft.recoloring": true,
                    "stft.window.clock_zero_type": "ignore",
                    "stft.window.normalized": true,
                    "stft.window.num_samples": 128,
                    "stft.window.overlap": 32,
                    "stft.window.type": "boxcar"
                }
            }
        ],
        "id": "test1_rr_test2_sr1",
        "stations.local.id": "test1",
        "stations.local.mth5_path": "/home/kkappler/software/irismt/mth5/mth5/data/mth5/test12rr.h5",
        "stations.local.remote": false,
        "stations.local.runs": [
            {
                "run": {
                    "id": "001",
                    "input_channels": [
                        {
                            "channel": {
                                "id": "hx",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "hy",
                                "scale_factor": 1.0
                            }
                        }
                    ],
                    "output_channels": [
                        {
                            "channel": {
                                "id": "ex",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "ey",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "hz",
                                "scale_factor": 1.0
                            }
                        }
                    ],
                    "sample_rate": 1.0,
                    "time_periods": [
                        {
                            "time_period": {
                                "end": "1980-01-01T11:06:39+00:00",
                                "start": "1980-01-01T00:00:00+00:00"
                            }
                        }
                    ]
                }
            }
        ],
        "stations.remote": [
            {
                "station": {
                    "id": "test2",
                    "mth5_path": "/home/kkappler/software/irismt/mth5/mth5/data/mth5/test12rr.h5",
                    "remote": true,
                    "runs": [
                        {
                            "run": {
                                "id": "001",
                                "input_channels": [
                                    {
                                        "channel": {
                                            "id": "hx",
                                            "scale_factor": 1.0
                                        }
                                    },
                                    {
                                        "channel": {
                                            "id": "hy",
                                            "scale_factor": 1.0
                                        }
                                    }
                                ],
                                "output_channels": [
                                    {
                                        "channel": {
                                            "id": "ex",
                                            "scale_factor": 1.0
                                        }
                                    },
                                    {
                                        "channel": {
                                            "id": "ey",
                                            "scale_factor": 1.0
                                        }
                                    },
                                    {
                                        "channel": {
                                            "id": "hz",
                                            "scale_factor": 1.0
                                        }
                                    }
                                ],
                                "sample_rate": 1.0,
                                "time_periods": [
                                    {
                                        "time_period": {
                                            "end": "1980-01-01T11:06:39+00:00",
                                            "start": "1980-01-01T00:00:00+00:00"
                                        }
                                    }
                                ]
                            }
                        }
                    ]
                }
            }
        ]
    }
}

Call process_mth5

[9]:
show_plot = True
tf_cls = process_mth5(config,
                    kernel_dataset,
                    units="MT",
                    show_plot=show_plot,
                    z_file_path=None,
                )
25:03:08T12:58:52 | INFO | line:291 |aurora.pipelines.transfer_function_kernel | show_processing_summary | Processing Summary Dataframe:
25:03:08T12:58:52 | INFO | line:292 |aurora.pipelines.transfer_function_kernel | show_processing_summary |
   duration  has_data  n_samples  run station          survey       run_hdf5_reference   station_hdf5_reference    fc  remote  stft mth5_obj dec_level  dec_factor  sample_rate  window_duration  num_samples_window  num_samples  num_stft_windows
0   39999.0      True      40000  001   test1  EMTF Synthetic  <HDF5 object reference>  <HDF5 object reference>  <NA>   False  None     None         0         1.0     1.000000            128.0                 128      39999.0             416.0
1   39999.0      True      40000  001   test1  EMTF Synthetic  <HDF5 object reference>  <HDF5 object reference>  <NA>   False  None     None         1         4.0     0.250000            512.0                 128       9999.0             103.0
2   39999.0      True      40000  001   test1  EMTF Synthetic  <HDF5 object reference>  <HDF5 object reference>  <NA>   False  None     None         2         4.0     0.062500           2048.0                 128       2499.0              25.0
3   39999.0      True      40000  001   test1  EMTF Synthetic  <HDF5 object reference>  <HDF5 object reference>  <NA>   False  None     None         3         4.0     0.015625           8192.0                 128        624.0               6.0
4   39999.0      True      40000  001   test2  EMTF Synthetic  <HDF5 object reference>  <HDF5 object reference>  <NA>    True  None     None         0         1.0     1.000000            128.0                 128      39999.0             416.0
5   39999.0      True      40000  001   test2  EMTF Synthetic  <HDF5 object reference>  <HDF5 object reference>  <NA>    True  None     None         1         4.0     0.250000            512.0                 128       9999.0             103.0
6   39999.0      True      40000  001   test2  EMTF Synthetic  <HDF5 object reference>  <HDF5 object reference>  <NA>    True  None     None         2         4.0     0.062500           2048.0                 128       2499.0              25.0
7   39999.0      True      40000  001   test2  EMTF Synthetic  <HDF5 object reference>  <HDF5 object reference>  <NA>    True  None     None         3         4.0     0.015625           8192.0                 128        624.0               6.0
25:03:08T12:58:52 | INFO | line:680 |aurora.pipelines.transfer_function_kernel | memory_check | Total memory: 62.74 GB
25:03:08T12:58:52 | INFO | line:684 |aurora.pipelines.transfer_function_kernel | memory_check | Total Bytes of Raw Data: 0.001 GB
25:03:08T12:58:52 | INFO | line:687 |aurora.pipelines.transfer_function_kernel | memory_check | Raw Data will use: 0.001 % of memory
25:03:08T12:58:52 | INFO | line:733 |aurora.pipelines.transfer_function_kernel | mth5_has_fcs | Fourier coefficients not detected for survey: EMTF Synthetic, station: test1, run: 001-- Fourier coefficients will be computed
25:03:08T12:58:52 | INFO | line:777 |mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/mth5/mth5/data/mth5/test12rr.h5
25:03:08T12:58:52 | INFO | line:733 |aurora.pipelines.transfer_function_kernel | mth5_has_fcs | Fourier coefficients not detected for survey: EMTF Synthetic, station: test2, run: 001-- Fourier coefficients will be computed
25:03:08T12:58:52 | INFO | line:777 |mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/mth5/mth5/data/mth5/test12rr.h5
25:03:08T12:58:52 | INFO | line:262 |aurora.pipelines.transfer_function_kernel | check_if_fcs_already_exist | FC levels not present
25:03:08T12:58:52 | INFO | line:552 |aurora.pipelines.process_mth5 | process_mth5_legacy | Processing config indicates 4 decimation levels
25:03:08T12:58:52 | INFO | line:463 |aurora.pipelines.transfer_function_kernel | valid_decimations | After validation there are 4 valid decimation levels
25:03:08T12:58:53 | INFO | line:900 |mtpy.processing.kernel_dataset | initialize_dataframe_for_processing | Dataset dataframe initialized successfully
25:03:08T12:58:53 | INFO | line:157 |aurora.pipelines.transfer_function_kernel | update_dataset_df | Dataset Dataframe Updated for decimation level 0 Successfully
25:03:08T12:58:53 | INFO | line:387 |aurora.pipelines.process_mth5 | save_fourier_coefficients | Skip saving FCs. dec_level_config.save_fc =  False
25:03:08T12:58:53 | INFO | line:387 |aurora.pipelines.process_mth5 | save_fourier_coefficients | Skip saving FCs. dec_level_config.save_fc =  False
25:03:08T12:58:53 | INFO | line:46 |aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Accessing band 25.728968s  (0.038867Hz)
25:03:08T12:58:54 | INFO | line:208 |aurora.time_series.frequency_band_helpers | get_band_for_coherence_sorting | Processing band 25.728968s  (0.038867Hz)
25:03:08T12:58:54 | WARNING | line:213 |aurora.time_series.frequency_band_helpers | get_band_for_coherence_sorting | Cant evaluate coherence with only 1 harmonic
25:03:08T12:58:54 | INFO | line:214 |aurora.time_series.frequency_band_helpers | get_band_for_coherence_sorting | Widening band according to min3 rule
25:03:08T12:58:54 | INFO | line:46 |aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Accessing band 26.836092s  (0.037263Hz)
25:03:08T12:58:54 | INFO | line:116 |mth5.helpers | close_open_files | /home/kkappler/software/irismt/mth5/mth5/data/mth5/test12rr.h5, Closed File
25:03:08T12:58:54 | INFO | line:116 |mth5.helpers | close_open_files | /home/kkappler/software/irismt/mth5/mth5/data/mth5/test12rr.h5, Closed File
25:03:08T12:58:54 | ERROR | line:665 |aurora.pipelines.process_mth5 | process_mth5 | Failed to run legacy processing
closing all open mth5 files and exitingThe encountered exception was 'Band' object has no attribute 'bands'
25:03:08T12:58:54 | ERROR | line:666 |aurora.pipelines.process_mth5 | process_mth5 | Failed to run legacy processing
closing all open mth5 files and exitingThe encountered exception was 'Band' object has no attribute 'bands'
Traceback (most recent call last):

  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
           │         │     └ {'__name__': '__main__', '__doc__': 'Entry point for launching an IPython kernel.\n\nThis is separate from the ipykernel pack...
           │         └ <code object <module> at 0x7cf02c8ee450, file "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/ipykernel_lau...
           └ <function _run_code at 0x7cf02c8e7e50>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
         │     └ {'__name__': '__main__', '__doc__': 'Entry point for launching an IPython kernel.\n\nThis is separate from the ipykernel pack...
         └ <code object <module> at 0x7cf02c8ee450, file "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/ipykernel_lau...
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/ipykernel_launcher.py", line 18, in <module>
    app.launch_new_instance()
    │   └ <bound method Application.launch_instance of <class 'ipykernel.kernelapp.IPKernelApp'>>
    └ <module 'ipykernel.kernelapp' from '/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/ipykernel/kernelapp.py'>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/traitlets/config/application.py", line 1075, in launch_instance
    app.start()
    │   └ <function IPKernelApp.start at 0x7cf0299de670>
    └ <ipykernel.kernelapp.IPKernelApp object at 0x7cf02cfd3af0>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/ipykernel/kernelapp.py", line 739, in start
    self.io_loop.start()
    │    │       └ <function BaseAsyncIOLoop.start at 0x7cf029925550>
    │    └ <tornado.platform.asyncio.AsyncIOMainLoop object at 0x7cf02993c400>
    └ <ipykernel.kernelapp.IPKernelApp object at 0x7cf02cfd3af0>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/tornado/platform/asyncio.py", line 205, in start
    self.asyncio_loop.run_forever()
    │    │            └ <function BaseEventLoop.run_forever at 0x7cf02c09e430>
    │    └ <_UnixSelectorEventLoop running=True closed=False debug=False>
    └ <tornado.platform.asyncio.AsyncIOMainLoop object at 0x7cf02993c400>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/asyncio/base_events.py", line 601, in run_forever
    self._run_once()
    │    └ <function BaseEventLoop._run_once at 0x7cf02c09ff70>
    └ <_UnixSelectorEventLoop running=True closed=False debug=False>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/asyncio/base_events.py", line 1905, in _run_once
    handle._run()
    │      └ <function Handle._run at 0x7cf02c0da5e0>
    └ <Handle <TaskWakeupMethWrapper object at 0x7cefbdbf6130>(<Future finis...3B)>, ...],))>)>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
    │    │            │    │           │    └ <member '_args' of 'Handle' objects>
    │    │            │    │           └ <Handle <TaskWakeupMethWrapper object at 0x7cefbdbf6130>(<Future finis...3B)>, ...],))>)>
    │    │            │    └ <member '_callback' of 'Handle' objects>
    │    │            └ <Handle <TaskWakeupMethWrapper object at 0x7cefbdbf6130>(<Future finis...3B)>, ...],))>)>
    │    └ <member '_context' of 'Handle' objects>
    └ <Handle <TaskWakeupMethWrapper object at 0x7cefbdbf6130>(<Future finis...3B)>, ...],))>)>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 545, in dispatch_queue
    await self.process_one()
          │    └ <function Kernel.process_one at 0x7cf029a17700>
          └ <ipykernel.ipkernel.IPythonKernel object at 0x7cf02993ca60>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 534, in process_one
    await dispatch(*args)
          │         └ ([<zmq.Frame(b'19410a80-f30'...36B)>, <zmq.Frame(b'<IDS|MSG>')>, <zmq.Frame(b'aea1c3e7b498'...64B)>, <zmq.Frame(b'{"date":"20...
          └ <bound method Kernel.dispatch_shell of <ipykernel.ipkernel.IPythonKernel object at 0x7cf02993ca60>>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 437, in dispatch_shell
    await result
          └ <coroutine object IPythonKernel.execute_request at 0x7cf02847b7c0>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/ipykernel/ipkernel.py", line 362, in execute_request
    await super().execute_request(stream, ident, parent)
                                  │       │      └ {'header': {'date': datetime.datetime(2025, 3, 8, 20, 58, 47, 601000, tzinfo=tzutc()), 'msg_id': 'c417b299-e8a6-4e18-a64c-c08...
                                  │       └ [b'19410a80-f302-4536-b66f-4b7ff0d5f633']
                                  └ <zmq.eventloop.zmqstream.ZMQStream object at 0x7cf02993c160>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 778, in execute_request
    reply_content = await reply_content
                          └ <coroutine object IPythonKernel.do_execute at 0x7cf02847b840>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/ipykernel/ipkernel.py", line 449, in do_execute
    res = shell.run_cell(
          │     └ <function ZMQInteractiveShell.run_cell at 0x7cf0299ccb80>
          └ <ipykernel.zmqshell.ZMQInteractiveShell object at 0x7cf02847e8b0>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/ipykernel/zmqshell.py", line 549, in run_cell
    return super().run_cell(*args, **kwargs)
                             │       └ {'store_history': True, 'silent': False, 'cell_id': '667bf6f1-9faf-4dd7-b7cb-15e65c50ba0f'}
                             └ ('show_plot = True\ntf_cls = process_mth5(config,\n                    kernel_dataset,\n                    units="MT",\n    ...
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3048, in run_cell
    result = self._run_cell(
             │    └ <function InteractiveShell._run_cell at 0x7cf02a530040>
             └ <ipykernel.zmqshell.ZMQInteractiveShell object at 0x7cf02847e8b0>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3103, in _run_cell
    result = runner(coro)
             │      └ <coroutine object InteractiveShell.run_cell_async at 0x7cf02847ba40>
             └ <function _pseudo_sync_runner at 0x7cf02a51d8b0>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner
    coro.send(None)
    │    └ <method 'send' of 'coroutine' objects>
    └ <coroutine object InteractiveShell.run_cell_async at 0x7cf02847ba40>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3308, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
                       │    │             │        │     └ '/tmp/ipykernel_2837066/2463673138.py'
                       │    │             │        └ [<ast.Assign object at 0x7cefbdb1bb50>, <ast.Assign object at 0x7cefbdb1b190>]
                       │    │             └ <ast.Module object at 0x7cefbdb1b550>
                       │    └ <function InteractiveShell.run_ast_nodes at 0x7cf02a530310>
                       └ <ipykernel.zmqshell.ZMQInteractiveShell object at 0x7cf02847e8b0>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3490, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
             │    │        │     │              └ False
             │    │        │     └ <ExecutionResult object at 7cefbdb1bbe0, execution_count=9 error_before_exec=None error_in_exec=None info=<ExecutionInfo obje...
             │    │        └ <code object <module> at 0x7cefbdb1d9d0, file "/tmp/ipykernel_2837066/2463673138.py", line 2>
             │    └ <function InteractiveShell.run_code at 0x7cf02a5303a0>
             └ <ipykernel.zmqshell.ZMQInteractiveShell object at 0x7cf02847e8b0>
  File "/home/kkappler/anaconda3/envs/aurora/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3550, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
         │         │    │               │    └ {'__name__': '__main__', '__doc__': 'Automatically created module for IPython interactive environment', '__package__': None, ...
         │         │    │               └ <ipykernel.zmqshell.ZMQInteractiveShell object at 0x7cf02847e8b0>
         │         │    └ <property object at 0x7cf02a6fcef0>
         │         └ <ipykernel.zmqshell.ZMQInteractiveShell object at 0x7cf02847e8b0>
         └ <code object <module> at 0x7cefbdb1d9d0, file "/tmp/ipykernel_2837066/2463673138.py", line 2>

  File "/tmp/ipykernel_2837066/2463673138.py", line 2, in <module>
    tf_cls = process_mth5(config,
             │            └ {
        "processing": {
            "band_setup_file": "/home/kkappler/software/irismt/aurora/aurora/config/emtf_band_setup/bs_test...
    <function process_mth5 at 0x7cefbdbfb5e0>

> File "/home/kkappler/software/irismt/aurora/aurora/pipelines/process_mth5.py", line 652, in process_mth5
    return process_mth5_legacy(
    <function process_mth5_legacy at 0x7cefbdbfb550>

  File "/home/kkappler/software/irismt/aurora/aurora/pipelines/process_mth5.py", line 567, in process_mth5_legacy
    ttfz_obj = process_tf_decimation_level(
    <function process_tf_decimation_level at 0x7cefbdbd95e0>

  File "/home/kkappler/software/irismt/aurora/aurora/pipelines/process_mth5.py", line 200, in process_tf_decimation_level
    transfer_function_obj = process_transfer_functions(
    <function process_transfer_functions at 0x7cefd9b75550>

  File "/home/kkappler/software/irismt/aurora/aurora/pipelines/transfer_function_helpers.py", line 290, in process_transfer_functions
    spgm.cross_powers(frequency_bands=(feature_band))
    │    │                             └ {
    │    │                                   "band": {
    │    │                                       "center_averaging_type": "geometric",
    │    │                                       "closed": "left",
    │    │                                       "decimation_level": 0,
    │    │                                     ...
    │    └ <function Spectrogram.cross_powers at 0x7cefd9bac310>
    Spectrogram:
      3 harmonics, 0.0078125Hz spaced 
       from 0.03125 to 0.046875 Hz.
      
      416 Time observations 
      Start: 1980-01-01T00:00:0...

  File "/home/kkappler/software/irismt/mth5/mth5/timeseries/spectre/spectrogram.py", line 265, in cross_powers
    valid_frequency_bands = self._validate_frequency_bands(frequency_bands)
                            │    │                         └ {
                            │    │                               "band": {
                            │    │                                   "center_averaging_type": "geometric",
                            │    │                                   "closed": "left",
                            │    │                                   "decimation_level": 0,
                            │    │                                 ...
                            │    └ <function Spectrogram._validate_frequency_bands at 0x7cefd9bac280>
    Spectrogram:
                              3 harmonics, 0.0078125Hz spaced 
                               from 0.03125 to 0.046875 Hz.
                              
                              416 Time observations 
                              Start: 1980-01-01T00:00:0...

  File "/home/kkappler/software/irismt/mth5/mth5/timeseries/spectre/spectrogram.py", line 219, in _validate_frequency_bands
    valid_bands = [x for x in frequency_bands.bands() if self.frequency_band.contains(x)]
                              │                          │    └ <property object at 0x7cefd9ba3e50>
                              │                          └ Spectrogram:
    3 harmonics, 0.0078125Hz spaced 
     from 0.03125 to 0.046875 Hz.
    
    416 Time observations 
    Start: 1980-01-01T00:00:0...
    {
                                    "band": {
                                        "center_averaging_type": "geometric",
                                        "closed": "left",
                                        "decimation_level": 0,
                                      ...

AttributeError: 'Band' object has no attribute 'bands'

Export TF to a file

[10]:
xml_file_base = f"synthetic_test1.xml"
tf_cls.write(fn=xml_file_base, file_type="emtfxml")

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[10], line 2
      1 xml_file_base = f"synthetic_test1.xml"
----> 2 tf_cls.write(fn=xml_file_base, file_type="emtfxml")

AttributeError: 'NoneType' object has no attribute 'write'
[ ]:
edi_file_base = f"synthetic_test1.edi"
tf_cls.write(fn=edi_file_base, file_type="edi")

NOTE Fourier coefficient section below here is a work in progress.

Fourier coefficient storage in MTH5

The capability to store Fourier coeffficients (FCs) in MTH5 is now available. This will enable some different approaches to processing and data quality control (QC). The data QC tools are a work in progress. The following examples show how to add some FC levels to an MTH5, providing a starting point for processing or feature extraction from these data.

There are currently two main ways to add FCs to the MTH5.

  1. Store on the fly while processing with Aurora

  2. Use a dedicated method to make FCs only.

Storing while processing with Aurora

We can store FCs while processing by changing the processing config’s save_fcs attribute to True

If we do that with the current processing config however we encouter a warning:

Saving FCs for remote reference processing is not supported
 - To save FCs, process as single station, then you can use the FCs for RR processing
 - See issue #319 for details
 - forcing processing config save_fcs = False

There are two workarounds for this:

  1. Process each station as a single station, and then the FCs (TODO: Add an example of this)

  2. Explicitly call add_fcs function.

[ ]:
file_size_before_adding_fcs = mth5_path.stat().st_size
print(f"file_size_before_adding_fcs: {file_size_before_adding_fcs}")

Using the “single-station processing method” of FC generation

[ ]:
# set  kernel_dataset to work on one station only
kernel_dataset = KernelDataset()
kernel_dataset.from_run_summary(run_summary, "test1", None)

# Update the processing config

cc = ConfigCreator()
config = cc.create_from_kernel_dataset(kernel_dataset)

# tell the config to save the FCs
for dec in config.decimations:
    dec.save_fcs = True
    dec.save_fcs_type = "h5"

# you can export the config to a json by uncommenting the following line
# cfg_json = config.to_json()
kernel_dataset.mini_summary
[ ]:
show_plot = False
tf_cls = process_mth5(config,
                    kernel_dataset,
                    units="MT",
                    show_plot=show_plot,
                    z_file_path=None,
                )
[ ]:
file_size_after_adding_fcs_station_1 = mth5_path.stat().st_size
print(f"file_size_after_adding_fcs_station_1: {file_size_after_adding_fcs_station_1}")
[ ]:
# Now the other station

# set  kernel_dataset to work on one station only
kernel_dataset = KernelDataset()
kernel_dataset.from_run_summary(run_summary, "test2", None)

# Update the processing config

cc = ConfigCreator()
config = cc.create_from_kernel_dataset(kernel_dataset)

# tell the config to save the FCs
for dec in config.decimations:
    dec.save_fcs = True
    dec.save_fcs_type = "h5"


[ ]:
show_plot = False
tf_cls = process_mth5(config,
                    kernel_dataset,
                    units="MT",
                    show_plot=show_plot,
                    z_file_path=None,
                )

Now, the FCs are there (for the specific processing configuration).

If you were to re-process the data, the FCs are already there,

[ ]:
file_size_after_adding_fcs_station_2 = mth5_path.stat().st_size
print(f"file_size_after_adding_fcs_station_2: {file_size_after_adding_fcs_station_2}")
[ ]:
# set  kernel_dataset to work on both stations again
kernel_dataset = KernelDataset()
kernel_dataset.from_run_summary(run_summary, "test1", "test2")

# Update the processing config

cc = ConfigCreator()
config = cc.create_from_kernel_dataset(kernel_dataset)



[ ]:

from aurora.pipelines.transfer_function_kernel import TransferFunctionKernel tfk = TransferFunctionKernel(dataset=kernel_dataset, config=config) tfk.update_processing_summary() tfk.check_if_fcs_already_exist() tfk.dataset_df
[ ]:
show_plot = False
tf_cls = process_mth5(config,
                    kernel_dataset,
                    units="MT",
                    show_plot=show_plot,
                    z_file_path=None,
                )

Using Dedicated FC Generation (add_fcs function)

[ ]:
from mth5.timeseries.spectre.helpers import add_fcs_to_mth5
from mth5.timeseries.spectre.helpers import fc_decimations_creator
from mth5.timeseries.spectre.helpers import read_back_fcs

Build a fresh copy of the synthetic data file

[ ]:
mth5_path.unlink()
create_test12rr_h5(target_folder=target_folder)

In this case we need to prescribe the decimation configuration.

If we dont want to decimate, we can just pass fc_decimations == "degenerate" to add_fcs

Here is how the decimations can be created

[ ]:
sample_rate = 1.0
fc_decimations = fc_decimations_creator(initial_sample_rate=sample_rate, time_period=None)

And here is what they look like, … a list, with each element specifying the needed info for decimation and FFT.

[ ]:
fc_decimations

Now add the decimation levels:

[ ]:
add_fcs_to_mth5(mth5_path, fc_decimations=fc_decimations)

Now take a look at the mth5

[ ]:
import mth5
m = mth5.mth5.MTH5(mth5_path)
m.open_mth5()
[ ]:
m.close_mth5()

Alternatively, we could have translated the Aurora processing config to an mt_metadata FC decimation and gnerated FCs with the same function.

[ ]:
mth5_path.unlink()
create_test12rr_h5(target_folder=target_folder)
[ ]:
# Extract FC decimations from processing config and build the layer

fc_decimations = [
    x.to_fc_decimation() for x in config.decimations
]

add_fcs_to_mth5(mth5_path, fc_decimations=fc_decimations)
[ ]:

read_back_fcs(mth5_path)

And we can see that when processing, aurora detects that the FC layer is already there, and skips creating it.

[ ]:
show_plot = False
tf_cls = process_mth5(config,
                    kernel_dataset,
                    units="MT",
                    show_plot=show_plot,
                    z_file_path=None,
                )

NOTE Feature storage section below is a work in progress.

Feature Storage Experimental work in progress

Any spectrogram (FC array at some decimation level) can be passed to a feature extraction method. These features can, in turn be stored in the mth5.

This will have the following steps:

  1. select the source FC decimation level (Will use “0” to start, but also test with “1”,)

  2. Select the frequency bands (probably make these wider than normal so we have plenty of FCs per time window)

  • Note these are feature_extraction_bands, which are possibly different from TF estimation bands.

To do this we will want to look at the frequencies associated with each FCGroup, and each decimation level.

  • Group Runs on Sample Rate

    • tabulate decimation levels

    • group decimation levels by id (and validate that the decimation_level.metadata.sample_rate_decimations agree)

    • for each decimation level, we should get the frequencies associated – this can be deduced from the shape of the dataset, and the min/max frequecies of the metadata, but for now we can just unpack the xarray and take the frequncey axis)

  1. loop over the bands (for each band:)

  • 2a. extract the band from the FClayer in the mth5 (shape will be nch x ntime x nharmonics)

  1. loop over time window (for each time):

  • 3a) extract feature

As a placeholder, we will compute the channel crosspowers, and or channel energy.

  • 3b) place the feature into an xarray

  1. save teh xarray as “feature_cross_power”

[ ]:
# Imports

from mth5.mth5 import MTH5

import numpy as np
import pandas as pd
[ ]:
# # Selection of frequency bands

# active_decimation_level = "0"

# m = MTH5(mth5_path)
# m.open_mth5()
# run_summary_df = m.run_summary  # This fails on current main but works on add_aurora_tools branch.
# run_summary_df

It would be nice if we could get a summary of the fcs available in the mth5. here is a prototype for how that could be done.

[ ]:
# # Prototype for fc_summary

# fc_summary = None

# for i, row in run_summary_df.iterrows():
#     print(row.station, row.run)
#     station_obj = m.get_station(row.station)
#     run_fc_group = station_obj.fourier_coefficients_group.get_fc_group(row.run)
#     if fc_summary is None:
#         fc_summary = run_fc_group.decimation_level_summary
#         fc_summary["station"] = row.station
#         fc_summary["run"] = row.run
#         fc_summary["sample_rate_time_series"] = 0.0
#         fc_summary["window_num_samples"] = 0
#         fc_summary["harmonic_indices"] = None
#     else:
#         tmp = run_fc_group.decimation_level_summary
#         tmp["station"] = row.station
#         tmp["run"] = row.run
#         tmp["sample_rate_time_series"] = 0.0
#         tmp["window_num_samples"] = 0
#         tmp["harmonic_indices"] = None
#         fc_summary = pd.concat([fc_summary, tmp], ignore_index=True)

#         #print(run_fc_group)
#     for dec_level_id in run_fc_group.groups_list:
#         dec_level = run_fc_group.get_decimation_level(dec_level_id)
#         cond1 = fc_summary.station==row.station
#         cond2 = fc_summary.run==row.run
#         cond3 = fc_summary.component==dec_level_id
#         fc_summary.loc[cond1 & cond2 & cond3]  # this should be a unique row
#         ndx = fc_summary.loc[cond1 & cond2 & cond3].index
#         assert(len(ndx))==1
#         #fc_summary["sample_rate_time_series"].at[ndx] = dec_level.metadata.i
#         fc_summary["window_num_samples"].at[ndx[0]] = dec_level.metadata.window.num_samples
#         fc_summary["sample_rate_time_series"].at[ndx[0]] = dec_level.metadata.sample_rate
#         fc_summary["harmonic_indices"].at[ndx[0]] = dec_level.metadata.harmonic_indices


#         #print(dec_level_id, dec_level)
#         xr_dec_level = dec_level.to_xarray()
#         print("station", row.station, "run", row.run, "decimation level", dec_level_id, "SAMPLE rate", dec_level.metadata.sample_rate) # xr_dec_level.frequency)
[ ]:
# fc_summary["delta_f"] = fc_summary["sample_rate_time_series"] / fc_summary["window_num_samples"]
# fc_summary

The above table tells us what we need to know to get frequency axes .. and actually, we can make _frequencies() a method of FC_dec_level.

Note that when harmonic_indices is [-1,], that means all the (1-sided) FCs are kept.

  • this can be sanity checked by asserting that the dataset size is window_num_samples//2

So now pick a decimation level, … In this case, the component column is telling us the decimation level.

  • It would seem sensible to make sure that the sample rates, (and delta_f) values are same as well.

[ ]:
# cond0 = fc_summary["component"] == active_decimation_level
# fc_runs_to_featureize_df = fc_summary[cond0]
# fc_runs_to_featureize_df
[ ]:
# frqs = np.fft.fftfreq?

Now pick a scheme for defining the feature frequency bands

  • we prefer not to use the lowest few harmonics … this is like setting the minimum number of cycles per window, say 10

  • we also prefer not to take harmonics very close to the Nyquist frequency as these maybe attenuated by AAF, say we go up to 80% Nyq

  • For TF estimation, bands may be only a few FCs wide,but for feature extraction we make them much wider.

  • lets try with around half an octave

  • how about 2 bands per octave, mean

[ ]:
# min_cycles = 10
# nyquist_fraction = 0.8
# n_feature_bands_per_octave = 2
[ ]:
# for i, row in fc_runs_to_featureize_df.iterrows():
#     frequencies = np.fft.rfftfreq(row.window_num_samples, row.sample_rate_time_series)
#     lower_index = int(min_cycles)
#     upper_index = int(nyquist_fraction*(row.window_num_samples//2))
#     print(f" lower_index={lower_index}, upper_index={upper_index}")
#     freq_min = frequencies[lower_index]
#     freq_max = frequencies[upper_index]
#     print(f"Available band: {freq_min:3f} to {freq_max:3f}")
#     frequency_ratio = freq_max/freq_min
#     available_decades = np.log10(frequency_ratio)
#     available_ocataves = np.log2(frequency_ratio)
#     print(f"Available decades: {available_decades:3f}")
#     print(f"Available octaves: {available_ocataves:3f}")
#     num_feature_bands = available_ocataves * n_feature_bands_per_octave
[ ]:
# dec_level.metadata
# dec_level.to_xarray()
[ ]:
# cond1 = fc_summary.station=="test1"
# cond2 = fc_summary.run=="001"
# cond3 = fc_summary.component=="0"
# fc_summary.loc[cond1 & cond2 & cond3].index
[ ]:
# fc_summary["sample_rate_time_series"].at[0]
[ ]:
# m.close_mth5()
[ ]: