{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build an MTH5 and Operate the Aurora Pipeline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook pulls MT miniSEED data from the IRIS Dataselect web service and converts it to an MTH5 file. It outlines the process of making the MTH5 file, generating a processing config, and running the Aurora processor.\n",
"\n",
"It assumes that aurora, mth5, and mt_metadata have all been installed.\n",
"\n",
"In this \"new\" version, the workflow has changed somewhat. \n",
"\n",
"1. The process_mth5 call works with a dataset dataframe, rather than a single run_id\n",
"2. The config object is now based on the mt_metadata.base Base class\n",
"3. Remote reference processing is supported (at least in theory)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 0. Flow of this notebook\n",
"\n",
"Section 1: Here we do imports and construct a table of the data that we will access to build the MTH5. Note that there is no explanation here as to the source of the table -- a future update can show how to create such a table from IRIS data-availability tools.\n",
"\n",
"Section 2: The metadata and the data are accessed, and the MTH5 is created and stored.\n",
"\n",
"Section 3: Aurora is used to process the data."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# # Uncomment while developing\n",
"# %load_ext autoreload\n",
"# %autoreload 2\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# Required imports for the program. \n",
"from pathlib import Path\n",
"import pandas as pd\n",
"import warnings\n",
"\n",
"from mth5 import mth5, timeseries\n",
"from mth5.clients.fdsn import FDSN\n",
"from mth5.clients.make_mth5 import MakeMTH5\n",
"from mth5.utils.helpers import initialize_mth5\n",
"from mt_metadata.utils.mttime import get_now_utc, MTime\n",
"from aurora.config import BANDS_DEFAULT_FILE\n",
"from aurora.config.config_creator import ConfigCreator\n",
"from aurora.pipelines.process_mth5 import process_mth5\n",
"from aurora.transfer_function.kernel_dataset import KernelDataset\n",
"from aurora.pipelines.run_summary import RunSummary\n",
"\n",
"warnings.filterwarnings('ignore')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Build an MTH5 file from information extracted by IRIS\n",
"\n",
"- If you have already built an MTH5 file, you can skip this section.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Set the path so the MTH5 file is built in the current working directory."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"PosixPath('/home/kkappler/software/irismt/aurora/docs/examples')"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"default_path = Path().cwd()\n",
"default_path"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Select the MTH5 file version"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# mth5_version = '0.1.0'\n",
"mth5_version = '0.2.0'\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# Initialize the MakeMTH5 object. \n",
"maker = MakeMTH5(mth5_version=mth5_version)\n",
"maker.client = \"IRIS\"\n",
"maker.interact = True\n",
"\n",
"# Initialize an FDSN object to access column names for the request dataframe\n",
"fdsn_obj = FDSN()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1A: Specify the data to access from IRIS\n",
"\n",
"Note that here we explicitly prescribe the data, but this dataframe could be built from IRIS data-availability tools in a programmatic way."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# Generate a dataframe of FDSN Network, Station, Location, Channel, Starttime, Endtime codes of interest\n",
"\n",
"CAS04LQE = ['8P', 'CAS04', '', 'LQE', '2020-06-02T19:00:00', '2020-07-13T19:00:00']\n",
"CAS04LQN = ['8P', 'CAS04', '', 'LQN', '2020-06-02T19:00:00', '2020-07-13T19:00:00']\n",
"CAS04LFE = ['8P', 'CAS04', '', 'LFE', '2020-06-02T19:00:00', '2020-07-13T19:00:00']\n",
"CAS04LFN = ['8P', 'CAS04', '', 'LFN', '2020-06-02T19:00:00', '2020-07-13T19:00:00']\n",
"CAS04LFZ = ['8P', 'CAS04', '', 'LFZ', '2020-06-02T19:00:00', '2020-07-13T19:00:00']\n",
"\n",
"request_list = [CAS04LQE, CAS04LQN, CAS04LFE, CAS04LFN, CAS04LFZ]\n",
"\n",
"# Turn list into dataframe\n",
"request_df = pd.DataFrame(request_list, columns=fdsn_obj.request_columns)"
]
},
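{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same dataframe can also be built programmatically rather than typing one list per channel. The sketch below is illustrative only: the helper name `make_request_rows` is hypothetical, and the hard-coded column list mirrors `fdsn_obj.request_columns`.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"request_columns = ['network', 'station', 'location', 'channel', 'start', 'end']\n",
"\n",
"def make_request_rows(network, station, channels, start, end, location=''):\n",
"    # One request row per channel, sharing the station and time window\n",
"    return [[network, station, location, ch, start, end] for ch in channels]\n",
"\n",
"rows = make_request_rows('8P', 'CAS04', ['LQE', 'LQN', 'LFE', 'LFN', 'LFZ'],\n",
"                         '2020-06-02T19:00:00', '2020-07-13T19:00:00')\n",
"request_df = pd.DataFrame(rows, columns=request_columns)\n",
"```\n"
]
},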
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" network station location channel start end\n",
"0 8P CAS04 LQE 2020-06-02T19:00:00 2020-07-13T19:00:00\n",
"1 8P CAS04 LQN 2020-06-02T19:00:00 2020-07-13T19:00:00\n",
"2 8P CAS04 LFE 2020-06-02T19:00:00 2020-07-13T19:00:00\n",
"3 8P CAS04 LFN 2020-06-02T19:00:00 2020-07-13T19:00:00\n",
"4 8P CAS04 LFZ 2020-06-02T19:00:00 2020-07-13T19:00:00"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Inspect the dataframe\n",
"request_df"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"# Request the inventory information from IRIS\n",
"inventory = fdsn_obj.get_inventory_from_df(request_df, data=False)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(Inventory created at 2024-01-14T18:42:02.056231Z\n",
"\tCreated by: ObsPy 1.4.0\n",
"\t\t https://www.obspy.org\n",
"\tSending institution: MTH5\n",
"\tContains:\n",
"\t\tNetworks (1):\n",
"\t\t\t8P\n",
"\t\tStations (1):\n",
"\t\t\t8P.CAS04 (Corral Hollow, CA, USA)\n",
"\t\tChannels (8):\n",
"\t\t\t8P.CAS04..LFZ, 8P.CAS04..LFN, 8P.CAS04..LFE, 8P.CAS04..LQN (2x), \n",
"\t\t\t8P.CAS04..LQE (3x),\n",
" 0 Trace(s) in Stream:\n",
")"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Inspect the inventory\n",
"inventory"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Build an MTH5 file from the user-defined request dataframe."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With the request dataframe set, we are ready to actually request the data from the FDSN client (IRIS) and save it to an MTH5 file. This process builds the MTH5 and can take some time, depending on how much data is requested. \n",
"\n",
"Note: `interact=True` keeps the MTH5 open after it is done building\n"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[33m\u001b[1m2024-01-14T10:42:02.807633-0800 | WARNING | mth5.mth5 | open_mth5 | 8P_CAS04.h5 will be overwritten in 'w' mode\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:03.186485-0800 | INFO | mth5.mth5 | _initialize_file | Initialized MTH5 0.2.0 file /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5 in mode w\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:56.151432-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:56.162071-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:56.211669-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:56.221759-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:56.291045-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:56.302011-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:56.353822-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:56.362284-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:56.414744-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:56.423305-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:57.723419-0800 | INFO | mth5.groups.base | _add_group | RunGroup a already exists, returning existing group.\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:42:57.866211-0800 | WARNING | mth5.timeseries.run_ts | validate_metadata | start time of dataset 2020-06-02T19:00:00+00:00 does not match metadata start 2020-06-02T18:41:43+00:00 updating metatdata value to 2020-06-02T19:00:00+00:00\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:42:58.111401-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id a. Setting to ch.run_metadata.id to a\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:42:58.338249-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id a. Setting to ch.run_metadata.id to a\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:42:58.546531-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id a. Setting to ch.run_metadata.id to a\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:42:58.756314-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id a. Setting to ch.run_metadata.id to a\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:42:58.959951-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id a. Setting to ch.run_metadata.id to a\u001b[0m\n",
"\u001b[1m2024-01-14T10:42:59.029104-0800 | INFO | mth5.groups.base | _add_group | RunGroup b already exists, returning existing group.\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:42:59.455798-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id b. Setting to ch.run_metadata.id to b\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:42:59.658208-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id b. Setting to ch.run_metadata.id to b\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:42:59.857993-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id b. Setting to ch.run_metadata.id to b\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:00.063593-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id b. Setting to ch.run_metadata.id to b\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:00.267380-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id b. Setting to ch.run_metadata.id to b\u001b[0m\n",
"\u001b[1m2024-01-14T10:43:00.343973-0800 | INFO | mth5.groups.base | _add_group | RunGroup c already exists, returning existing group.\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:00.876907-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id c. Setting to ch.run_metadata.id to c\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:01.096488-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id c. Setting to ch.run_metadata.id to c\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:01.299819-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id c. Setting to ch.run_metadata.id to c\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:01.507180-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id c. Setting to ch.run_metadata.id to c\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:01.713057-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id c. Setting to ch.run_metadata.id to c\u001b[0m\n",
"\u001b[1m2024-01-14T10:43:01.804340-0800 | INFO | mth5.groups.base | _add_group | RunGroup d already exists, returning existing group.\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:02.091286-0800 | WARNING | mth5.timeseries.run_ts | validate_metadata | end time of dataset 2020-07-13T19:00:00+00:00 does not match metadata end 2020-07-13T21:46:12+00:00 updating metatdata value to 2020-07-13T19:00:00+00:00\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:02.276636-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id d. Setting to ch.run_metadata.id to d\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:02.500754-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id d. Setting to ch.run_metadata.id to d\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:02.711015-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id d. Setting to ch.run_metadata.id to d\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:02.911784-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id d. Setting to ch.run_metadata.id to d\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:03.115840-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id d. Setting to ch.run_metadata.id to d\u001b[0m\n",
"\u001b[1m2024-01-14T10:43:03.232392-0800 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5\u001b[0m\n",
"\u001b[33m\u001b[1m2024-01-14T10:43:03.255150-0800 | WARNING | mth5.mth5 | filename | MTH5 file is not open or has not been created yet. Returning default name\u001b[0m\n"
]
}
],
"source": [
"mth5_object = maker.from_fdsn_client(request_df)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1B: Examine and Update the MTH5 object\n",
"\n",
"With the open MTH5 object, we can start to examine what is in it. For example, we can retrieve the filename and file_version. We can also get the station metadata and edit it by setting a new value, in this case the declination model."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"/:\n",
"====================\n",
" |- Group: Experiment\n",
" --------------------\n",
" |- Group: Reports\n",
" -----------------\n",
" |- Group: Standards\n",
" -------------------\n",
" --> Dataset: summary\n",
" ......................\n",
" |- Group: Surveys\n",
" -----------------\n",
" |- Group: CONUS_South\n",
" ---------------------\n",
" |- Group: Filters\n",
" -----------------\n",
" |- Group: coefficient\n",
" ---------------------\n",
" |- Group: electric_analog_to_digital\n",
" ------------------------------------\n",
" |- Group: electric_dipole_92.000\n",
" --------------------------------\n",
" |- Group: electric_si_units\n",
" ---------------------------\n",
" |- Group: magnetic_analog_to_digital\n",
" ------------------------------------\n",
" |- Group: fap\n",
" -------------\n",
" |- Group: fir\n",
" -------------\n",
" |- Group: time_delay\n",
" --------------------\n",
" |- Group: electric_time_offset\n",
" ------------------------------\n",
" |- Group: hx_time_offset\n",
" ------------------------\n",
" |- Group: hy_time_offset\n",
" ------------------------\n",
" |- Group: hz_time_offset\n",
" ------------------------\n",
" |- Group: zpk\n",
" -------------\n",
" |- Group: electric_butterworth_high_pass\n",
" ----------------------------------------\n",
" --> Dataset: poles\n",
" ....................\n",
" --> Dataset: zeros\n",
" ....................\n",
" |- Group: electric_butterworth_low_pass\n",
" ---------------------------------------\n",
" --> Dataset: poles\n",
" ....................\n",
" --> Dataset: zeros\n",
" ....................\n",
" |- Group: magnetic_butterworth_low_pass\n",
" ---------------------------------------\n",
" --> Dataset: poles\n",
" ....................\n",
" --> Dataset: zeros\n",
" ....................\n",
" |- Group: Reports\n",
" -----------------\n",
" |- Group: Standards\n",
" -------------------\n",
" --> Dataset: summary\n",
" ......................\n",
" |- Group: Stations\n",
" ------------------\n",
" |- Group: CAS04\n",
" ---------------\n",
" |- Group: Fourier_Coefficients\n",
" ------------------------------\n",
" |- Group: Transfer_Functions\n",
" ----------------------------\n",
" |- Group: a\n",
" -----------\n",
" --> Dataset: ex\n",
" .................\n",
" --> Dataset: ey\n",
" .................\n",
" --> Dataset: hx\n",
" .................\n",
" --> Dataset: hy\n",
" .................\n",
" --> Dataset: hz\n",
" .................\n",
" |- Group: b\n",
" -----------\n",
" --> Dataset: ex\n",
" .................\n",
" --> Dataset: ey\n",
" .................\n",
" --> Dataset: hx\n",
" .................\n",
" --> Dataset: hy\n",
" .................\n",
" --> Dataset: hz\n",
" .................\n",
" |- Group: c\n",
" -----------\n",
" --> Dataset: ex\n",
" .................\n",
" --> Dataset: ey\n",
" .................\n",
" --> Dataset: hx\n",
" .................\n",
" --> Dataset: hy\n",
" .................\n",
" --> Dataset: hz\n",
" .................\n",
" |- Group: d\n",
" -----------\n",
" --> Dataset: ex\n",
" .................\n",
" --> Dataset: ey\n",
" .................\n",
" --> Dataset: hx\n",
" .................\n",
" --> Dataset: hy\n",
" .................\n",
" --> Dataset: hz\n",
" .................\n",
" --> Dataset: channel_summary\n",
" ..............................\n",
" --> Dataset: tf_summary\n",
" ........................."
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mth5_object"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"mth5_path = mth5_object.filename"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'0.2.0'"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mth5_object.file_version\n"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1m2024-01-14T10:43:03.348442-0800 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5\u001b[0m\n"
]
}
],
"source": [
"mth5_object.close_mth5()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"mth5_object = initialize_mth5(mth5_path)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1C: Optionally Update Metadata"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"IGRF-13\n",
"IGRF\n"
]
}
],
"source": [
"# Edit and update the MTH5 metadata \n",
"s = mth5_object.get_station(\"CAS04\", survey=\"CONUS_South\")\n",
"print(s.metadata.location.declination.model)\n",
"s.metadata.location.declination.model = 'IGRF'\n",
"print(s.metadata.location.declination.model)\n",
"s.write_metadata() # writes to file mth5_filename"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Filename: /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5 \n",
" Version: 0.2.0\n"
]
}
],
"source": [
"# Print some info about the mth5 \n",
"mth5_filename = mth5_object.filename\n",
"version = mth5_object.file_version\n",
"print(f\" Filename: {mth5_filename} \\n Version: {version}\")\n"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Get the available stations and runs from the MTH5 object\n",
"mth5_object.channel_summary.summarize()\n",
"ch_summary = mth5_object.channel_summary.to_dataframe()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2: Process Data\n",
"If the MTH5 file already exists, you can start here rather than executing the previous code to download the data again."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"# If the MTH5 is still open from Section 1 (maker.interact=True), reuse it;\n",
"# otherwise, open the file from disk.\n",
"interact = False\n",
"if interact:\n",
"    pass  # mth5_object and ch_summary already exist from Section 1\n",
"else:\n",
"    h5_path = default_path.joinpath(\"8P_CAS04.h5\")\n",
"    mth5_object = initialize_mth5(h5_path, mode=\"a\", file_version=mth5_version)\n",
"    ch_summary = mth5_object.channel_summary.to_dataframe()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate an Aurora Configuration file using MTH5 as an input\n",
"\n",
"Up to this point we have used mth5 and mt_metadata, but we haven't yet used aurora. Now we will use the MTH5 that we just created (and examined and updated) as input to Aurora.\n",
"\n"
]
},
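{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an outline, the aurora workflow that follows uses the classes imported at the top of this notebook. Treat the exact signatures as assumptions; they may differ between aurora versions, so this is a sketch rather than an executable cell.\n",
"\n",
"```python\n",
"# Outline of the aurora processing flow (not executed in this cell)\n",
"run_summary = RunSummary()\n",
"run_summary.from_mth5s([mth5_path])                    # index the available runs\n",
"\n",
"kernel_dataset = KernelDataset()\n",
"kernel_dataset.from_run_summary(run_summary, \"CAS04\")  # single-station dataset\n",
"\n",
"cc = ConfigCreator()\n",
"config = cc.create_from_kernel_dataset(kernel_dataset)\n",
"\n",
"tf_cls = process_mth5(config, kernel_dataset)\n",
"```\n"
]
},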
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Channel Summary\n",
"\n",
"This is a very useful data structure inside the MTH5. It acts as an index of the available data at the channel-run level, i.e. there is one row for every contiguous chunk of time series recorded by an electric dipole or magnetometer."
]
},
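{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because the channel summary is a plain pandas dataframe, it is easy to query. For example, it can be collapsed to one row per run with a groupby. The sketch below runs on a toy frame that mimics a few `channel_summary` columns, since the real `ch_summary` needs the open MTH5:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Toy stand-in for ch_summary with a subset of its columns\n",
"toy = pd.DataFrame({\n",
"    'station': ['CAS04'] * 4,\n",
"    'run': ['a', 'a', 'b', 'b'],\n",
"    'component': ['ex', 'hx', 'ex', 'hx'],\n",
"    'n_samples': [11267, 11267, 847649, 847649],\n",
"})\n",
"\n",
"# One row per run: number of channels and samples in that run\n",
"runs = toy.groupby(['station', 'run']).agg(\n",
"    n_channels=('component', 'nunique'),\n",
"    n_samples=('n_samples', 'max'),\n",
").reset_index()\n",
"```\n"
]
},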
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
" survey station run latitude longitude elevation component \\\n",
"0 CONUS South CAS04 a 37.633351 -121.468382 329.3875 ex \n",
"1 CONUS South CAS04 a 37.633351 -121.468382 329.3875 ey \n",
"2 CONUS South CAS04 a 37.633351 -121.468382 329.3875 hx \n",
"3 CONUS South CAS04 a 37.633351 -121.468382 329.3875 hy \n",
"4 CONUS South CAS04 a 37.633351 -121.468382 329.3875 hz \n",
"5 CONUS South CAS04 b 37.633351 -121.468382 329.3875 ex \n",
"6 CONUS South CAS04 b 37.633351 -121.468382 329.3875 ey \n",
"7 CONUS South CAS04 b 37.633351 -121.468382 329.3875 hx \n",
"8 CONUS South CAS04 b 37.633351 -121.468382 329.3875 hy \n",
"9 CONUS South CAS04 b 37.633351 -121.468382 329.3875 hz \n",
"10 CONUS South CAS04 c 37.633351 -121.468382 329.3875 ex \n",
"11 CONUS South CAS04 c 37.633351 -121.468382 329.3875 ey \n",
"12 CONUS South CAS04 c 37.633351 -121.468382 329.3875 hx \n",
"13 CONUS South CAS04 c 37.633351 -121.468382 329.3875 hy \n",
"14 CONUS South CAS04 c 37.633351 -121.468382 329.3875 hz \n",
"15 CONUS South CAS04 d 37.633351 -121.468382 329.3875 ex \n",
"16 CONUS South CAS04 d 37.633351 -121.468382 329.3875 ey \n",
"17 CONUS South CAS04 d 37.633351 -121.468382 329.3875 hx \n",
"18 CONUS South CAS04 d 37.633351 -121.468382 329.3875 hy \n",
"19 CONUS South CAS04 d 37.633351 -121.468382 329.3875 hz \n",
"\n",
" start end n_samples \\\n",
"0 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 \n",
"1 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 \n",
"2 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 \n",
"3 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 \n",
"4 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 \n",
"5 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 \n",
"6 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 \n",
"7 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 \n",
"8 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 \n",
"9 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 \n",
"10 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 \n",
"11 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 \n",
"12 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 \n",
"13 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 \n",
"14 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 \n",
"15 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 \n",
"16 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 \n",
"17 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 \n",
"18 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 \n",
"19 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 \n",
"\n",
" sample_rate measurement_type azimuth tilt units \\\n",
"0 1.0 electric 13.2 0.0 digital counts \n",
"1 1.0 electric 103.2 0.0 digital counts \n",
"2 1.0 magnetic 13.2 0.0 digital counts \n",
"3 1.0 magnetic 103.2 0.0 digital counts \n",
"4 1.0 magnetic 0.0 90.0 digital counts \n",
"5 1.0 electric 13.2 0.0 digital counts \n",
"6 1.0 electric 103.2 0.0 digital counts \n",
"7 1.0 magnetic 13.2 0.0 digital counts \n",
"8 1.0 magnetic 103.2 0.0 digital counts \n",
"9 1.0 magnetic 0.0 90.0 digital counts \n",
"10 1.0 electric 13.2 0.0 digital counts \n",
"11 1.0 electric 103.2 0.0 digital counts \n",
"12 1.0 magnetic 13.2 0.0 digital counts \n",
"13 1.0 magnetic 103.2 0.0 digital counts \n",
"14 1.0 magnetic 0.0 90.0 digital counts \n",
"15 1.0 electric 13.2 0.0 digital counts \n",
"16 1.0 electric 103.2 0.0 digital counts \n",
"17 1.0 magnetic 13.2 0.0 digital counts \n",
"18 1.0 magnetic 103.2 0.0 digital counts \n",
"19 1.0 magnetic 0.0 90.0 digital counts \n",
"\n",
" hdf5_reference run_hdf5_reference station_hdf5_reference \n",
"0 \n",
"1 \n",
"2 \n",
"3 \n",
"4 \n",
"5 \n",
"6 \n",
"7 \n",
"8 \n",
"9 \n",
"10 \n",
"11 \n",
"12 \n",
"13 \n",
"14 \n",
"15 \n",
"16 \n",
"17 \n",
"18 \n",
"19 "
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ch_summary"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Channel summary has a lot of uses, below we use it to check if the data have mixed sample rates, and to get a list of available stations"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Available stations: ['CAS04']\n"
]
}
],
"source": [
"available_runs = ch_summary.run.unique()\n",
"sr = ch_summary.sample_rate.unique()\n",
"if len(sr) != 1:\n",
" print('Only one sample rate per run is available')\n",
" \n",
"available_stations = ch_summary.station.unique()\n",
"print(f\"Available stations: {available_stations}\")"
]
},
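{
"cell_type": "markdown",
"metadata": {},
"source": [
"If mixed sample rates were present, a quick way to see the breakdown (plain pandas, nothing aurora-specific) is to group the channel summary by station and sample rate:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Count channels per (station, sample_rate) pair in the channel summary\n",
"ch_summary.groupby([\"station\", \"sample_rate\"]).size()"
]
},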
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"### Run Summary\n",
"\n",
"A cousin of the channel summary is the Run Summary.\n",
"This is a condensed version of the channel summary, with one row per continuous acquistion run at a station.\n",
"\n",
"The run summary can be accessed from an open mth5 object, or from an iterable of h5 paths as in the example below\n"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1m2024-01-14T10:43:04.066603-0800 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5\u001b[0m\n"
]
},
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
survey
\n",
"
station_id
\n",
"
run_id
\n",
"
start
\n",
"
end
\n",
"
sample_rate
\n",
"
input_channels
\n",
"
output_channels
\n",
"
channel_scale_factors
\n",
"
valid
\n",
"
mth5_path
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
a
\n",
"
2020-06-02 19:00:00+00:00
\n",
"
2020-06-02 22:07:46+00:00
\n",
"
1.0
\n",
"
[hx, hy]
\n",
"
[ex, ey, hz]
\n",
"
{'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '...
\n",
"
True
\n",
"
/home/kkappler/software/irismt/aurora/docs/exa...
\n",
"
\n",
"
\n",
"
1
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
b
\n",
"
2020-06-02 22:24:55+00:00
\n",
"
2020-06-12 17:52:23+00:00
\n",
"
1.0
\n",
"
[hx, hy]
\n",
"
[ex, ey, hz]
\n",
"
{'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '...
\n",
"
True
\n",
"
/home/kkappler/software/irismt/aurora/docs/exa...
\n",
"
\n",
"
\n",
"
2
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
c
\n",
"
2020-06-12 18:32:17+00:00
\n",
"
2020-07-01 17:32:59+00:00
\n",
"
1.0
\n",
"
[hx, hy]
\n",
"
[ex, ey, hz]
\n",
"
{'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '...
\n",
"
True
\n",
"
/home/kkappler/software/irismt/aurora/docs/exa...
\n",
"
\n",
"
\n",
"
3
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
d
\n",
"
2020-07-01 19:36:55+00:00
\n",
"
2020-07-13 19:00:00+00:00
\n",
"
1.0
\n",
"
[hx, hy]
\n",
"
[ex, ey, hz]
\n",
"
{'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '...
\n",
"
True
\n",
"
/home/kkappler/software/irismt/aurora/docs/exa...
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" survey station_id run_id start \\\n",
"0 CONUS South CAS04 a 2020-06-02 19:00:00+00:00 \n",
"1 CONUS South CAS04 b 2020-06-02 22:24:55+00:00 \n",
"2 CONUS South CAS04 c 2020-06-12 18:32:17+00:00 \n",
"3 CONUS South CAS04 d 2020-07-01 19:36:55+00:00 \n",
"\n",
" end sample_rate input_channels output_channels \\\n",
"0 2020-06-02 22:07:46+00:00 1.0 [hx, hy] [ex, ey, hz] \n",
"1 2020-06-12 17:52:23+00:00 1.0 [hx, hy] [ex, ey, hz] \n",
"2 2020-07-01 17:32:59+00:00 1.0 [hx, hy] [ex, ey, hz] \n",
"3 2020-07-13 19:00:00+00:00 1.0 [hx, hy] [ex, ey, hz] \n",
"\n",
" channel_scale_factors valid \\\n",
"0 {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... True \n",
"1 {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... True \n",
"2 {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... True \n",
"3 {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... True \n",
"\n",
" mth5_path \n",
"0 /home/kkappler/software/irismt/aurora/docs/exa... \n",
"1 /home/kkappler/software/irismt/aurora/docs/exa... \n",
"2 /home/kkappler/software/irismt/aurora/docs/exa... \n",
"3 /home/kkappler/software/irismt/aurora/docs/exa... "
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mth5_run_summary = RunSummary()\n",
"h5_path = default_path.joinpath(\"8P_CAS04.h5\")\n",
"mth5_run_summary.from_mth5s([h5_path,])\n",
"run_summary = mth5_run_summary.clone()\n",
"run_summary.df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we have a dataframe of the available runs to process from the MTH5 \n",
"\n",
"Sometimes we just want to look at the survey, station, run, and time intervals\n",
"we can for that we can call mini_summary"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
survey
\n",
"
station_id
\n",
"
run_id
\n",
"
start
\n",
"
end
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
a
\n",
"
2020-06-02 19:00:00+00:00
\n",
"
2020-06-02 22:07:46+00:00
\n",
"
\n",
"
\n",
"
1
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
b
\n",
"
2020-06-02 22:24:55+00:00
\n",
"
2020-06-12 17:52:23+00:00
\n",
"
\n",
"
\n",
"
2
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
c
\n",
"
2020-06-12 18:32:17+00:00
\n",
"
2020-07-01 17:32:59+00:00
\n",
"
\n",
"
\n",
"
3
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
d
\n",
"
2020-07-01 19:36:55+00:00
\n",
"
2020-07-13 19:00:00+00:00
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" survey station_id run_id start \\\n",
"0 CONUS South CAS04 a 2020-06-02 19:00:00+00:00 \n",
"1 CONUS South CAS04 b 2020-06-02 22:24:55+00:00 \n",
"2 CONUS South CAS04 c 2020-06-12 18:32:17+00:00 \n",
"3 CONUS South CAS04 d 2020-07-01 19:36:55+00:00 \n",
"\n",
" end \n",
"0 2020-06-02 22:07:46+00:00 \n",
"1 2020-06-12 17:52:23+00:00 \n",
"2 2020-07-01 17:32:59+00:00 \n",
"3 2020-07-13 19:00:00+00:00 "
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"run_summary.mini_summary"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But here are the columns in the run summary"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Index(['survey', 'station_id', 'run_id', 'start', 'end', 'sample_rate',\n",
" 'input_channels', 'output_channels', 'channel_scale_factors', 'valid',\n",
" 'mth5_path'],\n",
" dtype='object')"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"run_summary.df.columns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" Make your own mini summary by choosing columns"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
survey
\n",
"
station_id
\n",
"
run_id
\n",
"
start
\n",
"
end
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
a
\n",
"
2020-06-02 19:00:00+00:00
\n",
"
2020-06-02 22:07:46+00:00
\n",
"
\n",
"
\n",
"
1
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
b
\n",
"
2020-06-02 22:24:55+00:00
\n",
"
2020-06-12 17:52:23+00:00
\n",
"
\n",
"
\n",
"
2
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
c
\n",
"
2020-06-12 18:32:17+00:00
\n",
"
2020-07-01 17:32:59+00:00
\n",
"
\n",
"
\n",
"
3
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
d
\n",
"
2020-07-01 19:36:55+00:00
\n",
"
2020-07-13 19:00:00+00:00
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" survey station_id run_id start \\\n",
"0 CONUS South CAS04 a 2020-06-02 19:00:00+00:00 \n",
"1 CONUS South CAS04 b 2020-06-02 22:24:55+00:00 \n",
"2 CONUS South CAS04 c 2020-06-12 18:32:17+00:00 \n",
"3 CONUS South CAS04 d 2020-07-01 19:36:55+00:00 \n",
"\n",
" end \n",
"0 2020-06-02 22:07:46+00:00 \n",
"1 2020-06-12 17:52:23+00:00 \n",
"2 2020-07-01 17:32:59+00:00 \n",
"3 2020-07-13 19:00:00+00:00 "
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"coverage_short_list_columns = [\"survey\", 'station_id', 'run_id', 'start', 'end', ]\n",
"run_summary.df[coverage_short_list_columns]"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"### Kernel Dataset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is like a run summary, but for a single station or a pair of stations.\n",
"It is used to specify the inputs to aurora processing.\n",
"\n",
"It takes a run_summary and a station name, and optionally, a remote reference station name\n",
"\n",
"It is made _based on the available data_ in the MTH5 archive.\n",
"\n",
"Syntax:\n",
"kernel_dataset.from_run_summary(run_summary, local_station_id, reference_station_id)\n",
"\n",
"By Default, all runs will be processed\n",
"\n",
"To restrict to processing a single run, or a list of runs, we can either tell KernelDataset to keep or drop a station_run dictionary. \n"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
survey
\n",
"
station_id
\n",
"
run_id
\n",
"
start
\n",
"
end
\n",
"
duration
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
a
\n",
"
2020-06-02 19:00:00+00:00
\n",
"
2020-06-02 22:07:46+00:00
\n",
"
11266.0
\n",
"
\n",
"
\n",
"
1
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
b
\n",
"
2020-06-02 22:24:55+00:00
\n",
"
2020-06-12 17:52:23+00:00
\n",
"
847648.0
\n",
"
\n",
"
\n",
"
2
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
c
\n",
"
2020-06-12 18:32:17+00:00
\n",
"
2020-07-01 17:32:59+00:00
\n",
"
1638042.0
\n",
"
\n",
"
\n",
"
3
\n",
"
CONUS South
\n",
"
CAS04
\n",
"
d
\n",
"
2020-07-01 19:36:55+00:00
\n",
"
2020-07-13 19:00:00+00:00
\n",
"
1034585.0
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" survey station_id run_id start \\\n",
"0 CONUS South CAS04 a 2020-06-02 19:00:00+00:00 \n",
"1 CONUS South CAS04 b 2020-06-02 22:24:55+00:00 \n",
"2 CONUS South CAS04 c 2020-06-12 18:32:17+00:00 \n",
"3 CONUS South CAS04 d 2020-07-01 19:36:55+00:00 \n",
"\n",
" end duration \n",
"0 2020-06-02 22:07:46+00:00 11266.0 \n",
"1 2020-06-12 17:52:23+00:00 847648.0 \n",
"2 2020-07-01 17:32:59+00:00 1638042.0 \n",
"3 2020-07-13 19:00:00+00:00 1034585.0 "
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"kernel_dataset = KernelDataset()\n",
"kernel_dataset.from_run_summary(run_summary, \"CAS04\")\n",
"kernel_dataset.mini_summary"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Here is one way to select a single run:\n"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" survey station_id run_id start \\\n",
"0 CONUS South CAS04 a 2020-06-02 19:00:00+00:00 \n",
"\n",
" end \n",
"0 2020-06-02 22:07:46+00:00 \n"
]
}
],
"source": [
"station_runs_dict = {}\n",
"station_runs_dict[\"CAS04\"] = [\"a\", ]\n",
"keep_or_drop = \"keep\"\n",
"\n",
"kernel_dataset.select_station_runs(station_runs_dict, keep_or_drop)\n",
"print(kernel_dataset.df[coverage_short_list_columns])"
]
},
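{
"cell_type": "markdown",
"metadata": {},
"source": [
"The complementary operation discards the listed runs instead of keeping them. The sketch below is left commented out so it does not modify the dataset above; it assumes the same `select_station_runs` interface, with \"drop\" passed as the second argument:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch (assumes \"drop\" is a supported keep_or_drop value): drop run \"a\" rather than keeping it\n",
"# station_runs_dict = {\"CAS04\": [\"a\", ]}\n",
"# kernel_dataset.select_station_runs(station_runs_dict, \"drop\")\n",
"# kernel_dataset.df[coverage_short_list_columns]"
]
},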
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## To discard runs that are not very long"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"