{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Build an MTH5 and Operate the Aurora Pipeline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook pulls MT miniSEED data from the IRIS Dataselect web service and builds an MTH5 file from it. It outlines the process of making an MTH5 file, generating a processing config, and running the Aurora processor.\n", "\n", "It assumes that aurora, mth5, and mt_metadata have all been installed.\n", "\n", "In this \"new\" version, the workflow has changed somewhat:\n", "\n", "1. The process_mth5 call works with a dataset dataframe, rather than a single run_id.\n", "2. The config object is now based on the mt_metadata.base Base class.\n", "3. Remote reference processing is supported (at least in theory)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 0. Flow of this notebook\n", "\n", "Section 1: Here we do imports and construct a table of the data that we will access to build the mth5. Note that there is no explanation here as to the table source; a future update can show how to create such a table from IRIS data_availability tools.\n", "\n", "Section 2: The metadata and the data are accessed, and the mth5 is created and stored.\n", "\n", "Section 3: Aurora is used to process the data." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# # Uncomment while developing\n", "# %load_ext autoreload\n", "# %autoreload 2\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Required imports for the program. 
\n", "from pathlib import Path\n", "import pandas as pd\n", "import warnings\n", "\n", "from mth5 import mth5, timeseries\n", "from mth5.clients.fdsn import FDSN\n", "from mth5.clients.make_mth5 import MakeMTH5\n", "from mth5.utils.helpers import initialize_mth5\n", "from mt_metadata.utils.mttime import get_now_utc, MTime\n", "from aurora.config import BANDS_DEFAULT_FILE\n", "from aurora.config.config_creator import ConfigCreator\n", "from aurora.pipelines.process_mth5 import process_mth5\n", "from aurora.transfer_function.kernel_dataset import KernelDataset\n", "from aurora.pipelines.run_summary import RunSummary\n", "\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Build an MTH5 file from information extracted by IRIS\n", "\n", "- If you have already built an MTH5 you can skip this section \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set path so MTH5 file builds to current working directory. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PosixPath('/home/kkappler/software/irismt/aurora/docs/examples')" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "default_path = Path().cwd()\n", "default_path" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Select mth5 file version" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# mth5_version = '0.1.0'\n", "mth5_version = '0.2.0'\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Initialize the Make MTH5 code. 
\n", "maker = MakeMTH5(mth5_version=mth5_version)\n", "maker.client = \"IRIS\"\n", "maker.interact = True\n", "\n", "# Initialize an FDSN object to access column names for request df\n", "fdsn_obj = FDSN()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1A: Specify the data to access from IRIS\n", "\n", "Note that here we explicitly prescribe the data, but this dataframe could be built programmatically from IRIS data availability tools." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Generate data frame of FDSN Network, Station, Location, Channel, Starttime, Endtime codes of interest\n", "\n", "CAS04LQE = ['8P', 'CAS04', '', 'LQE', '2020-06-02T19:00:00', '2020-07-13T19:00:00']\n", "CAS04LQN = ['8P', 'CAS04', '', 'LQN', '2020-06-02T19:00:00', '2020-07-13T19:00:00']\n", "CAS04LFE = ['8P', 'CAS04', '', 'LFE', '2020-06-02T19:00:00', '2020-07-13T19:00:00']\n", "CAS04LFN = ['8P', 'CAS04', '', 'LFN', '2020-06-02T19:00:00', '2020-07-13T19:00:00']\n", "CAS04LFZ = ['8P', 'CAS04', '', 'LFZ', '2020-06-02T19:00:00', '2020-07-13T19:00:00']\n", "\n", "request_list = [CAS04LQE, CAS04LQN, CAS04LFE, CAS04LFN, CAS04LFZ]\n", "\n", "# Turn list into dataframe\n", "request_df = pd.DataFrame(request_list, columns=fdsn_obj.request_columns)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", 
"<table border=\"1\" class=\"dataframe\">\n", 
"  <thead>\n", 
"    <tr style=\"text-align: right;\"><th></th><th>network</th><th>station</th><th>location</th><th>channel</th><th>start</th><th>end</th></tr>\n", 
"  </thead>\n", 
"  <tbody>\n", 
"    <tr><th>0</th><td>8P</td><td>CAS04</td><td></td><td>LQE</td><td>2020-06-02T19:00:00</td><td>2020-07-13T19:00:00</td></tr>\n", 
"    <tr><th>1</th><td>8P</td><td>CAS04</td><td></td><td>LQN</td><td>2020-06-02T19:00:00</td><td>2020-07-13T19:00:00</td></tr>\n", 
"    <tr><th>2</th><td>8P</td><td>CAS04</td><td></td><td>LFE</td><td>2020-06-02T19:00:00</td><td>2020-07-13T19:00:00</td></tr>\n", 
"    <tr><th>3</th><td>8P</td><td>CAS04</td><td></td><td>LFN</td><td>2020-06-02T19:00:00</td><td>2020-07-13T19:00:00</td></tr>\n", 
"    <tr><th>4</th><td>8P</td><td>CAS04</td><td></td><td>LFZ</td><td>2020-06-02T19:00:00</td><td>2020-07-13T19:00:00</td></tr>\n", 
"  </tbody>\n", 
"</table>\n", 
"</div>
" ], "text/plain": [ " network station location channel start end\n", "0 8P CAS04 LQE 2020-06-02T19:00:00 2020-07-13T19:00:00\n", "1 8P CAS04 LQN 2020-06-02T19:00:00 2020-07-13T19:00:00\n", "2 8P CAS04 LFE 2020-06-02T19:00:00 2020-07-13T19:00:00\n", "3 8P CAS04 LFN 2020-06-02T19:00:00 2020-07-13T19:00:00\n", "4 8P CAS04 LFZ 2020-06-02T19:00:00 2020-07-13T19:00:00" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Inspect the dataframe\n", "request_df" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# Request the inventory information from IRIS\n", "inventory = fdsn_obj.get_inventory_from_df(request_df, data=False)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(Inventory created at 2024-01-14T18:42:02.056231Z\n", "\tCreated by: ObsPy 1.4.0\n", "\t\t https://www.obspy.org\n", "\tSending institution: MTH5\n", "\tContains:\n", "\t\tNetworks (1):\n", "\t\t\t8P\n", "\t\tStations (1):\n", "\t\t\t8P.CAS04 (Corral Hollow, CA, USA)\n", "\t\tChannels (8):\n", "\t\t\t8P.CAS04..LFZ, 8P.CAS04..LFN, 8P.CAS04..LFE, 8P.CAS04..LQN (2x), \n", "\t\t\t8P.CAS04..LQE (3x),\n", " 0 Trace(s) in Stream:\n", ")" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Inspect the inventory\n", "inventory" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Builds an MTH5 file from the user defined database. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the mth5 object set, we are ready to actually request the data from the fdsn client (IRIS) and save it to an MTH5 file. This process builds an MTH5 file and can take some time depending on how much data is requested. 
\n", "\n", "Note: `interact` keeps the MTH5 open after it is done building\n" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[33m\u001b[1m2024-01-14T10:42:02.807633-0800 | WARNING | mth5.mth5 | open_mth5 | 8P_CAS04.h5 will be overwritten in 'w' mode\u001b[0m\n", "\u001b[1m2024-01-14T10:42:03.186485-0800 | INFO | mth5.mth5 | _initialize_file | Initialized MTH5 0.2.0 file /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5 in mode w\u001b[0m\n", "\u001b[1m2024-01-14T10:42:56.151432-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.\u001b[0m\n", "\u001b[1m2024-01-14T10:42:56.162071-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.\u001b[0m\n", "\u001b[1m2024-01-14T10:42:56.211669-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.\u001b[0m\n", "\u001b[1m2024-01-14T10:42:56.221759-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.\u001b[0m\n", "\u001b[1m2024-01-14T10:42:56.291045-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.\u001b[0m\n", "\u001b[1m2024-01-14T10:42:56.302011-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.\u001b[0m\n", "\u001b[1m2024-01-14T10:42:56.353822-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting 
PoleZerosResponseStage electric_si_units to a CoefficientFilter.\u001b[0m\n", "\u001b[1m2024-01-14T10:42:56.362284-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.\u001b[0m\n", "\u001b[1m2024-01-14T10:42:56.414744-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.\u001b[0m\n", "\u001b[1m2024-01-14T10:42:56.423305-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.\u001b[0m\n", "\u001b[1m2024-01-14T10:42:57.723419-0800 | INFO | mth5.groups.base | _add_group | RunGroup a already exists, returning existing group.\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:42:57.866211-0800 | WARNING | mth5.timeseries.run_ts | validate_metadata | start time of dataset 2020-06-02T19:00:00+00:00 does not match metadata start 2020-06-02T18:41:43+00:00 updating metatdata value to 2020-06-02T19:00:00+00:00\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:42:58.111401-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id a. Setting to ch.run_metadata.id to a\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:42:58.338249-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id a. Setting to ch.run_metadata.id to a\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:42:58.546531-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id a. Setting to ch.run_metadata.id to a\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:42:58.756314-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id a. 
Setting to ch.run_metadata.id to a\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:42:58.959951-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id a. Setting to ch.run_metadata.id to a\u001b[0m\n", "\u001b[1m2024-01-14T10:42:59.029104-0800 | INFO | mth5.groups.base | _add_group | RunGroup b already exists, returning existing group.\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:42:59.455798-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id b. Setting to ch.run_metadata.id to b\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:42:59.658208-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id b. Setting to ch.run_metadata.id to b\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:42:59.857993-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id b. Setting to ch.run_metadata.id to b\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:00.063593-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id b. Setting to ch.run_metadata.id to b\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:00.267380-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id b. Setting to ch.run_metadata.id to b\u001b[0m\n", "\u001b[1m2024-01-14T10:43:00.343973-0800 | INFO | mth5.groups.base | _add_group | RunGroup c already exists, returning existing group.\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:00.876907-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id c. Setting to ch.run_metadata.id to c\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:01.096488-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id c. Setting to ch.run_metadata.id to c\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:01.299819-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id c. 
Setting to ch.run_metadata.id to c\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:01.507180-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id c. Setting to ch.run_metadata.id to c\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:01.713057-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id c. Setting to ch.run_metadata.id to c\u001b[0m\n", "\u001b[1m2024-01-14T10:43:01.804340-0800 | INFO | mth5.groups.base | _add_group | RunGroup d already exists, returning existing group.\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:02.091286-0800 | WARNING | mth5.timeseries.run_ts | validate_metadata | end time of dataset 2020-07-13T19:00:00+00:00 does not match metadata end 2020-07-13T21:46:12+00:00 updating metatdata value to 2020-07-13T19:00:00+00:00\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:02.276636-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id d. Setting to ch.run_metadata.id to d\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:02.500754-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id d. Setting to ch.run_metadata.id to d\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:02.711015-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id d. Setting to ch.run_metadata.id to d\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:02.911784-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id d. Setting to ch.run_metadata.id to d\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:03.115840-0800 | WARNING | mth5.groups.run | from_runts | Channel run.id sr1_001 != group run.id d. 
Setting to ch.run_metadata.id to d\u001b[0m\n", "\u001b[1m2024-01-14T10:43:03.232392-0800 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5\u001b[0m\n", "\u001b[33m\u001b[1m2024-01-14T10:43:03.255150-0800 | WARNING | mth5.mth5 | filename | MTH5 file is not open or has not been created yet. Returning default name\u001b[0m\n" ] } ], "source": [ "mth5_object = maker.from_fdsn_client(request_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1B: Examine and Update the MTH5 object\n", "\n", "With the open MTH5 Object, we can start to examine what is in it. For example, retrieve the filename and file_version. You can additionally do things such as getting the station information and edit it by setting a new value, in this case the declination model. " ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "/:\n", "====================\n", " |- Group: Experiment\n", " --------------------\n", " |- Group: Reports\n", " -----------------\n", " |- Group: Standards\n", " -------------------\n", " --> Dataset: summary\n", " ......................\n", " |- Group: Surveys\n", " -----------------\n", " |- Group: CONUS_South\n", " ---------------------\n", " |- Group: Filters\n", " -----------------\n", " |- Group: coefficient\n", " ---------------------\n", " |- Group: electric_analog_to_digital\n", " ------------------------------------\n", " |- Group: electric_dipole_92.000\n", " --------------------------------\n", " |- Group: electric_si_units\n", " ---------------------------\n", " |- Group: magnetic_analog_to_digital\n", " ------------------------------------\n", " |- Group: fap\n", " -------------\n", " |- Group: fir\n", " -------------\n", " |- Group: time_delay\n", " --------------------\n", " |- Group: electric_time_offset\n", " ------------------------------\n", " |- Group: hx_time_offset\n", " ------------------------\n", " |- 
Group: hy_time_offset\n", " ------------------------\n", " |- Group: hz_time_offset\n", " ------------------------\n", " |- Group: zpk\n", " -------------\n", " |- Group: electric_butterworth_high_pass\n", " ----------------------------------------\n", " --> Dataset: poles\n", " ....................\n", " --> Dataset: zeros\n", " ....................\n", " |- Group: electric_butterworth_low_pass\n", " ---------------------------------------\n", " --> Dataset: poles\n", " ....................\n", " --> Dataset: zeros\n", " ....................\n", " |- Group: magnetic_butterworth_low_pass\n", " ---------------------------------------\n", " --> Dataset: poles\n", " ....................\n", " --> Dataset: zeros\n", " ....................\n", " |- Group: Reports\n", " -----------------\n", " |- Group: Standards\n", " -------------------\n", " --> Dataset: summary\n", " ......................\n", " |- Group: Stations\n", " ------------------\n", " |- Group: CAS04\n", " ---------------\n", " |- Group: Fourier_Coefficients\n", " ------------------------------\n", " |- Group: Transfer_Functions\n", " ----------------------------\n", " |- Group: a\n", " -----------\n", " --> Dataset: ex\n", " .................\n", " --> Dataset: ey\n", " .................\n", " --> Dataset: hx\n", " .................\n", " --> Dataset: hy\n", " .................\n", " --> Dataset: hz\n", " .................\n", " |- Group: b\n", " -----------\n", " --> Dataset: ex\n", " .................\n", " --> Dataset: ey\n", " .................\n", " --> Dataset: hx\n", " .................\n", " --> Dataset: hy\n", " .................\n", " --> Dataset: hz\n", " .................\n", " |- Group: c\n", " -----------\n", " --> Dataset: ex\n", " .................\n", " --> Dataset: ey\n", " .................\n", " --> Dataset: hx\n", " .................\n", " --> Dataset: hy\n", " .................\n", " --> Dataset: hz\n", " .................\n", " |- Group: d\n", " -----------\n", " --> Dataset: ex\n", 
" .................\n", " --> Dataset: ey\n", " .................\n", " --> Dataset: hx\n", " .................\n", " --> Dataset: hy\n", " .................\n", " --> Dataset: hz\n", " .................\n", " --> Dataset: channel_summary\n", " ..............................\n", " --> Dataset: tf_summary\n", " ........................." ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mth5_object" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "mth5_path = mth5_object.filename" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'0.2.0'" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mth5_object.file_version\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m2024-01-14T10:43:03.348442-0800 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5\u001b[0m\n" ] } ], "source": [ "mth5_object.close_mth5()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "mth5_object = initialize_mth5(mth5_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1C: Optionally Update Metdata:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "IGRF-13\n", "IGRF\n" ] } ], "source": [ "# Edit and update the MTH5 metadata \n", "s = mth5_object.get_station(\"CAS04\", survey=\"CONUS_South\")\n", "print(s.metadata.location.declination.model)\n", "s.metadata.location.declination.model = 'IGRF'\n", "print(s.metadata.location.declination.model)\n", "s.write_metadata() # writes to file mth5_filename" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", 
"output_type": "stream", "text": [ " Filename: /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5 \n", " Version: 0.2.0\n" ] } ], "source": [ "# Print some info about the mth5 \n", "mth5_filename = mth5_object.filename\n", "version = mth5_object.file_version\n", "print(f\" Filename: {mth5_filename} \\n Version: {version}\")\n" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Get the available stations and runs from the MTH5 object\n", "mth5_object.channel_summary.summarize()\n", "ch_summary = mth5_object.channel_summary.to_dataframe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2: Process Data\n", "If MTH5 file already exists you can start here if you dont want to execute the previous code to get data again." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "interact = False\n", "if interact:\n", " pass\n", "else:\n", " h5_path = default_path.joinpath(\"8P_CAS04.h5\")\n", " mth5_object = initialize_mth5(h5_path, mode=\"a\", file_version=mth5_version)\n", " ch_summary = mth5_object.channel_summary.to_dataframe()\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Generate an Aurora Configuration file using MTH5 as an input\n", "\n", "Up to this point, we have used mth5 and mt_metadata, but haven't yet used aurora. So we will use the MTH5 that we just created (and examined and updated) as input into Aurora.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Channel Summary\n", "\n", "This is a very useful datastructure inside the mth5. It acts basically like an index of available data at the channel-run level, i.e. there is one row for every contiguous chunk of time-series recorded by an electric dipole or magnetometer" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<div>\n", 
"<table border=\"1\" class=\"dataframe\">\n", 
"  <thead>\n", 
"    <tr style=\"text-align: right;\"><th></th><th>survey</th><th>station</th><th>run</th><th>latitude</th><th>longitude</th><th>elevation</th><th>component</th><th>start</th><th>end</th><th>n_samples</th><th>sample_rate</th><th>measurement_type</th><th>azimuth</th><th>tilt</th><th>units</th><th>hdf5_reference</th><th>run_hdf5_reference</th><th>station_hdf5_reference</th></tr>\n", 
"  </thead>\n", 
"  <tbody>\n", 
"    <tr><th>0</th><td>CONUS South</td><td>CAS04</td><td>a</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>ex</td><td>2020-06-02 19:00:00+00:00</td><td>2020-06-02 22:07:46+00:00</td><td>11267</td><td>1.0</td><td>electric</td><td>13.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>1</th><td>CONUS South</td><td>CAS04</td><td>a</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>ey</td><td>2020-06-02 19:00:00+00:00</td><td>2020-06-02 22:07:46+00:00</td><td>11267</td><td>1.0</td><td>electric</td><td>103.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>2</th><td>CONUS South</td><td>CAS04</td><td>a</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>hx</td><td>2020-06-02 19:00:00+00:00</td><td>2020-06-02 22:07:46+00:00</td><td>11267</td><td>1.0</td><td>magnetic</td><td>13.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>3</th><td>CONUS South</td><td>CAS04</td><td>a</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>hy</td><td>2020-06-02 19:00:00+00:00</td><td>2020-06-02 22:07:46+00:00</td><td>11267</td><td>1.0</td><td>magnetic</td><td>103.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>4</th><td>CONUS South</td><td>CAS04</td><td>a</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>hz</td><td>2020-06-02 19:00:00+00:00</td><td>2020-06-02 22:07:46+00:00</td><td>11267</td><td>1.0</td><td>magnetic</td><td>0.0</td><td>90.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>5</th><td>CONUS South</td><td>CAS04</td><td>b</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>ex</td><td>2020-06-02 22:24:55+00:00</td><td>2020-06-12 17:52:23+00:00</td><td>847649</td><td>1.0</td><td>electric</td><td>13.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>6</th><td>CONUS South</td><td>CAS04</td><td>b</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>ey</td><td>2020-06-02 22:24:55+00:00</td><td>2020-06-12 17:52:23+00:00</td><td>847649</td><td>1.0</td><td>electric</td><td>103.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>7</th><td>CONUS South</td><td>CAS04</td><td>b</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>hx</td><td>2020-06-02 22:24:55+00:00</td><td>2020-06-12 17:52:23+00:00</td><td>847649</td><td>1.0</td><td>magnetic</td><td>13.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>8</th><td>CONUS South</td><td>CAS04</td><td>b</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>hy</td><td>2020-06-02 22:24:55+00:00</td><td>2020-06-12 17:52:23+00:00</td><td>847649</td><td>1.0</td><td>magnetic</td><td>103.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>9</th><td>CONUS South</td><td>CAS04</td><td>b</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>hz</td><td>2020-06-02 22:24:55+00:00</td><td>2020-06-12 17:52:23+00:00</td><td>847649</td><td>1.0</td><td>magnetic</td><td>0.0</td><td>90.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>10</th><td>CONUS South</td><td>CAS04</td><td>c</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>ex</td><td>2020-06-12 18:32:17+00:00</td><td>2020-07-01 17:32:59+00:00</td><td>1638043</td><td>1.0</td><td>electric</td><td>13.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>11</th><td>CONUS South</td><td>CAS04</td><td>c</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>ey</td><td>2020-06-12 18:32:17+00:00</td><td>2020-07-01 17:32:59+00:00</td><td>1638043</td><td>1.0</td><td>electric</td><td>103.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>12</th><td>CONUS South</td><td>CAS04</td><td>c</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>hx</td><td>2020-06-12 18:32:17+00:00</td><td>2020-07-01 17:32:59+00:00</td><td>1638043</td><td>1.0</td><td>magnetic</td><td>13.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>13</th><td>CONUS South</td><td>CAS04</td><td>c</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>hy</td><td>2020-06-12 18:32:17+00:00</td><td>2020-07-01 17:32:59+00:00</td><td>1638043</td><td>1.0</td><td>magnetic</td><td>103.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>14</th><td>CONUS South</td><td>CAS04</td><td>c</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>hz</td><td>2020-06-12 18:32:17+00:00</td><td>2020-07-01 17:32:59+00:00</td><td>1638043</td><td>1.0</td><td>magnetic</td><td>0.0</td><td>90.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>15</th><td>CONUS South</td><td>CAS04</td><td>d</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>ex</td><td>2020-07-01 19:36:55+00:00</td><td>2020-07-13 19:00:00+00:00</td><td>1034586</td><td>1.0</td><td>electric</td><td>13.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>16</th><td>CONUS South</td><td>CAS04</td><td>d</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>ey</td><td>2020-07-01 19:36:55+00:00</td><td>2020-07-13 19:00:00+00:00</td><td>1034586</td><td>1.0</td><td>electric</td><td>103.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>17</th><td>CONUS South</td><td>CAS04</td><td>d</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>hx</td><td>2020-07-01 19:36:55+00:00</td><td>2020-07-13 19:00:00+00:00</td><td>1034586</td><td>1.0</td><td>magnetic</td><td>13.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>18</th><td>CONUS South</td><td>CAS04</td><td>d</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>hy</td><td>2020-07-01 19:36:55+00:00</td><td>2020-07-13 19:00:00+00:00</td><td>1034586</td><td>1.0</td><td>magnetic</td><td>103.2</td><td>0.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"    <tr><th>19</th><td>CONUS South</td><td>CAS04</td><td>d</td><td>37.633351</td><td>-121.468382</td><td>329.3875</td><td>hz</td><td>2020-07-01 19:36:55+00:00</td><td>2020-07-13 19:00:00+00:00</td><td>1034586</td><td>1.0</td><td>magnetic</td><td>0.0</td><td>90.0</td><td>digital counts</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td><td>&lt;HDF5 object reference&gt;</td></tr>\n", 
"  </tbody>\n", 
"</table>\n", 
"</div>
" ], "text/plain": [ " survey station run latitude longitude elevation component \\\n", "0 CONUS South CAS04 a 37.633351 -121.468382 329.3875 ex \n", "1 CONUS South CAS04 a 37.633351 -121.468382 329.3875 ey \n", "2 CONUS South CAS04 a 37.633351 -121.468382 329.3875 hx \n", "3 CONUS South CAS04 a 37.633351 -121.468382 329.3875 hy \n", "4 CONUS South CAS04 a 37.633351 -121.468382 329.3875 hz \n", "5 CONUS South CAS04 b 37.633351 -121.468382 329.3875 ex \n", "6 CONUS South CAS04 b 37.633351 -121.468382 329.3875 ey \n", "7 CONUS South CAS04 b 37.633351 -121.468382 329.3875 hx \n", "8 CONUS South CAS04 b 37.633351 -121.468382 329.3875 hy \n", "9 CONUS South CAS04 b 37.633351 -121.468382 329.3875 hz \n", "10 CONUS South CAS04 c 37.633351 -121.468382 329.3875 ex \n", "11 CONUS South CAS04 c 37.633351 -121.468382 329.3875 ey \n", "12 CONUS South CAS04 c 37.633351 -121.468382 329.3875 hx \n", "13 CONUS South CAS04 c 37.633351 -121.468382 329.3875 hy \n", "14 CONUS South CAS04 c 37.633351 -121.468382 329.3875 hz \n", "15 CONUS South CAS04 d 37.633351 -121.468382 329.3875 ex \n", "16 CONUS South CAS04 d 37.633351 -121.468382 329.3875 ey \n", "17 CONUS South CAS04 d 37.633351 -121.468382 329.3875 hx \n", "18 CONUS South CAS04 d 37.633351 -121.468382 329.3875 hy \n", "19 CONUS South CAS04 d 37.633351 -121.468382 329.3875 hz \n", "\n", " start end n_samples \\\n", "0 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 \n", "1 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 \n", "2 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 \n", "3 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 \n", "4 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 \n", "5 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 \n", "6 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 \n", "7 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 \n", "8 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 \n", "9 2020-06-02 22:24:55+00:00 
2020-06-12 17:52:23+00:00 847649 \n", "10 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 \n", "11 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 \n", "12 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 \n", "13 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 \n", "14 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 \n", "15 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 \n", "16 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 \n", "17 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 \n", "18 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 \n", "19 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 \n", "\n", " sample_rate measurement_type azimuth tilt units \\\n", "0 1.0 electric 13.2 0.0 digital counts \n", "1 1.0 electric 103.2 0.0 digital counts \n", "2 1.0 magnetic 13.2 0.0 digital counts \n", "3 1.0 magnetic 103.2 0.0 digital counts \n", "4 1.0 magnetic 0.0 90.0 digital counts \n", "5 1.0 electric 13.2 0.0 digital counts \n", "6 1.0 electric 103.2 0.0 digital counts \n", "7 1.0 magnetic 13.2 0.0 digital counts \n", "8 1.0 magnetic 103.2 0.0 digital counts \n", "9 1.0 magnetic 0.0 90.0 digital counts \n", "10 1.0 electric 13.2 0.0 digital counts \n", "11 1.0 electric 103.2 0.0 digital counts \n", "12 1.0 magnetic 13.2 0.0 digital counts \n", "13 1.0 magnetic 103.2 0.0 digital counts \n", "14 1.0 magnetic 0.0 90.0 digital counts \n", "15 1.0 electric 13.2 0.0 digital counts \n", "16 1.0 electric 103.2 0.0 digital counts \n", "17 1.0 magnetic 13.2 0.0 digital counts \n", "18 1.0 magnetic 103.2 0.0 digital counts \n", "19 1.0 magnetic 0.0 90.0 digital counts \n", "\n", " hdf5_reference run_hdf5_reference station_hdf5_reference \n", "0 \n", "1 \n", "2 \n", "3 \n", "4 \n", "5 \n", "6 \n", "7 \n", "8 \n", "9 \n", "10 \n", "11 \n", "12 \n", "13 \n", "14 \n", "15 \n", "16 \n", "17 \n", "18 \n", "19 " ] }, "execution_count": 20, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "ch_summary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Channel summary has a lot of uses, below we use it to check if the data have mixed sample rates, and to get a list of available stations" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Available stations: ['CAS04']\n" ] } ], "source": [ "available_runs = ch_summary.run.unique()\n", "sr = ch_summary.sample_rate.unique()\n", "if len(sr) != 1:\n", " print('Only one sample rate per run is available')\n", " \n", "available_stations = ch_summary.station.unique()\n", "print(f\"Available stations: {available_stations}\")" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "### Run Summary\n", "\n", "A cousin of the channel summary is the Run Summary.\n", "This is a condensed version of the channel summary, with one row per continuous acquistion run at a station.\n", "\n", "The run summary can be accessed from an open mth5 object, or from an iterable of h5 paths as in the example below\n" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m2024-01-14T10:43:04.066603-0800 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5\u001b[0m\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
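<table>
<thead><tr><th></th><th>survey</th><th>station_id</th><th>run_id</th><th>start</th><th>end</th><th>sample_rate</th><th>input_channels</th><th>output_channels</th><th>channel_scale_factors</th><th>valid</th><th>mth5_path</th></tr></thead>
<tbody>
<tr><th>0</th><td>CONUS South</td><td>CAS04</td><td>a</td><td>2020-06-02 19:00:00+00:00</td><td>2020-06-02 22:07:46+00:00</td><td>1.0</td><td>[hx, hy]</td><td>[ex, ey, hz]</td><td>{'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '...</td><td>True</td><td>/home/kkappler/software/irismt/aurora/docs/exa...</td></tr>
<tr><th>1</th><td>CONUS South</td><td>CAS04</td><td>b</td><td>2020-06-02 22:24:55+00:00</td><td>2020-06-12 17:52:23+00:00</td><td>1.0</td><td>[hx, hy]</td><td>[ex, ey, hz]</td><td>{'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '...</td><td>True</td><td>/home/kkappler/software/irismt/aurora/docs/exa...</td></tr>
<tr><th>2</th><td>CONUS South</td><td>CAS04</td><td>c</td><td>2020-06-12 18:32:17+00:00</td><td>2020-07-01 17:32:59+00:00</td><td>1.0</td><td>[hx, hy]</td><td>[ex, ey, hz]</td><td>{'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '...</td><td>True</td><td>/home/kkappler/software/irismt/aurora/docs/exa...</td></tr>
<tr><th>3</th><td>CONUS South</td><td>CAS04</td><td>d</td><td>2020-07-01 19:36:55+00:00</td><td>2020-07-13 19:00:00+00:00</td><td>1.0</td><td>[hx, hy]</td><td>[ex, ey, hz]</td><td>{'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '...</td><td>True</td><td>/home/kkappler/software/irismt/aurora/docs/exa...</td></tr>
</tbody>
</table>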
\n", "
" ], "text/plain": [ " survey station_id run_id start \\\n", "0 CONUS South CAS04 a 2020-06-02 19:00:00+00:00 \n", "1 CONUS South CAS04 b 2020-06-02 22:24:55+00:00 \n", "2 CONUS South CAS04 c 2020-06-12 18:32:17+00:00 \n", "3 CONUS South CAS04 d 2020-07-01 19:36:55+00:00 \n", "\n", " end sample_rate input_channels output_channels \\\n", "0 2020-06-02 22:07:46+00:00 1.0 [hx, hy] [ex, ey, hz] \n", "1 2020-06-12 17:52:23+00:00 1.0 [hx, hy] [ex, ey, hz] \n", "2 2020-07-01 17:32:59+00:00 1.0 [hx, hy] [ex, ey, hz] \n", "3 2020-07-13 19:00:00+00:00 1.0 [hx, hy] [ex, ey, hz] \n", "\n", " channel_scale_factors valid \\\n", "0 {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... True \n", "1 {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... True \n", "2 {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... True \n", "3 {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... True \n", "\n", " mth5_path \n", "0 /home/kkappler/software/irismt/aurora/docs/exa... \n", "1 /home/kkappler/software/irismt/aurora/docs/exa... \n", "2 /home/kkappler/software/irismt/aurora/docs/exa... \n", "3 /home/kkappler/software/irismt/aurora/docs/exa... " ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mth5_run_summary = RunSummary()\n", "h5_path = default_path.joinpath(\"8P_CAS04.h5\")\n", "mth5_run_summary.from_mth5s([h5_path,])\n", "run_summary = mth5_run_summary.clone()\n", "run_summary.df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have a dataframe of the available runs to process from the MTH5 \n", "\n", "Sometimes we just want to look at the survey, station, run, and time intervals\n", "we can for that we can call mini_summary" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
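<table>
<thead><tr><th></th><th>survey</th><th>station_id</th><th>run_id</th><th>start</th><th>end</th></tr></thead>
<tbody>
<tr><th>0</th><td>CONUS South</td><td>CAS04</td><td>a</td><td>2020-06-02 19:00:00+00:00</td><td>2020-06-02 22:07:46+00:00</td></tr>
<tr><th>1</th><td>CONUS South</td><td>CAS04</td><td>b</td><td>2020-06-02 22:24:55+00:00</td><td>2020-06-12 17:52:23+00:00</td></tr>
<tr><th>2</th><td>CONUS South</td><td>CAS04</td><td>c</td><td>2020-06-12 18:32:17+00:00</td><td>2020-07-01 17:32:59+00:00</td></tr>
<tr><th>3</th><td>CONUS South</td><td>CAS04</td><td>d</td><td>2020-07-01 19:36:55+00:00</td><td>2020-07-13 19:00:00+00:00</td></tr>
</tbody>
</table>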
\n", "
" ], "text/plain": [ " survey station_id run_id start \\\n", "0 CONUS South CAS04 a 2020-06-02 19:00:00+00:00 \n", "1 CONUS South CAS04 b 2020-06-02 22:24:55+00:00 \n", "2 CONUS South CAS04 c 2020-06-12 18:32:17+00:00 \n", "3 CONUS South CAS04 d 2020-07-01 19:36:55+00:00 \n", "\n", " end \n", "0 2020-06-02 22:07:46+00:00 \n", "1 2020-06-12 17:52:23+00:00 \n", "2 2020-07-01 17:32:59+00:00 \n", "3 2020-07-13 19:00:00+00:00 " ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run_summary.mini_summary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But here are the columns in the run summary" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['survey', 'station_id', 'run_id', 'start', 'end', 'sample_rate',\n", " 'input_channels', 'output_channels', 'channel_scale_factors', 'valid',\n", " 'mth5_path'],\n", " dtype='object')" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "run_summary.df.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " Make your own mini summary by choosing columns" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
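<table>
<thead><tr><th></th><th>survey</th><th>station_id</th><th>run_id</th><th>start</th><th>end</th></tr></thead>
<tbody>
<tr><th>0</th><td>CONUS South</td><td>CAS04</td><td>a</td><td>2020-06-02 19:00:00+00:00</td><td>2020-06-02 22:07:46+00:00</td></tr>
<tr><th>1</th><td>CONUS South</td><td>CAS04</td><td>b</td><td>2020-06-02 22:24:55+00:00</td><td>2020-06-12 17:52:23+00:00</td></tr>
<tr><th>2</th><td>CONUS South</td><td>CAS04</td><td>c</td><td>2020-06-12 18:32:17+00:00</td><td>2020-07-01 17:32:59+00:00</td></tr>
<tr><th>3</th><td>CONUS South</td><td>CAS04</td><td>d</td><td>2020-07-01 19:36:55+00:00</td><td>2020-07-13 19:00:00+00:00</td></tr>
</tbody>
</table>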
\n", "
" ], "text/plain": [ " survey station_id run_id start \\\n", "0 CONUS South CAS04 a 2020-06-02 19:00:00+00:00 \n", "1 CONUS South CAS04 b 2020-06-02 22:24:55+00:00 \n", "2 CONUS South CAS04 c 2020-06-12 18:32:17+00:00 \n", "3 CONUS South CAS04 d 2020-07-01 19:36:55+00:00 \n", "\n", " end \n", "0 2020-06-02 22:07:46+00:00 \n", "1 2020-06-12 17:52:23+00:00 \n", "2 2020-07-01 17:32:59+00:00 \n", "3 2020-07-13 19:00:00+00:00 " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "coverage_short_list_columns = [\"survey\", 'station_id', 'run_id', 'start', 'end', ]\n", "run_summary.df[coverage_short_list_columns]" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "### Kernel Dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is like a run summary, but for a single station or a pair of stations.\n", "It is used to specify the inputs to aurora processing.\n", "\n", "It takes a run_summary and a station name, and optionally, a remote reference station name\n", "\n", "It is made _based on the available data_ in the MTH5 archive.\n", "\n", "Syntax:\n", "kernel_dataset.from_run_summary(run_summary, local_station_id, reference_station_id)\n", "\n", "By Default, all runs will be processed\n", "\n", "To restrict to processing a single run, or a list of runs, we can either tell KernelDataset to keep or drop a station_run dictionary. \n" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
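<table>
<thead><tr><th></th><th>survey</th><th>station_id</th><th>run_id</th><th>start</th><th>end</th><th>duration</th></tr></thead>
<tbody>
<tr><th>0</th><td>CONUS South</td><td>CAS04</td><td>a</td><td>2020-06-02 19:00:00+00:00</td><td>2020-06-02 22:07:46+00:00</td><td>11266.0</td></tr>
<tr><th>1</th><td>CONUS South</td><td>CAS04</td><td>b</td><td>2020-06-02 22:24:55+00:00</td><td>2020-06-12 17:52:23+00:00</td><td>847648.0</td></tr>
<tr><th>2</th><td>CONUS South</td><td>CAS04</td><td>c</td><td>2020-06-12 18:32:17+00:00</td><td>2020-07-01 17:32:59+00:00</td><td>1638042.0</td></tr>
<tr><th>3</th><td>CONUS South</td><td>CAS04</td><td>d</td><td>2020-07-01 19:36:55+00:00</td><td>2020-07-13 19:00:00+00:00</td><td>1034585.0</td></tr>
</tbody>
</table>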
\n", "
" ], "text/plain": [ " survey station_id run_id start \\\n", "0 CONUS South CAS04 a 2020-06-02 19:00:00+00:00 \n", "1 CONUS South CAS04 b 2020-06-02 22:24:55+00:00 \n", "2 CONUS South CAS04 c 2020-06-12 18:32:17+00:00 \n", "3 CONUS South CAS04 d 2020-07-01 19:36:55+00:00 \n", "\n", " end duration \n", "0 2020-06-02 22:07:46+00:00 11266.0 \n", "1 2020-06-12 17:52:23+00:00 847648.0 \n", "2 2020-07-01 17:32:59+00:00 1638042.0 \n", "3 2020-07-13 19:00:00+00:00 1034585.0 " ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "kernel_dataset = KernelDataset()\n", "kernel_dataset.from_run_summary(run_summary, \"CAS04\")\n", "kernel_dataset.mini_summary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Here is one way to select a single run:\n" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " survey station_id run_id start \\\n", "0 CONUS South CAS04 a 2020-06-02 19:00:00+00:00 \n", "\n", " end \n", "0 2020-06-02 22:07:46+00:00 \n" ] } ], "source": [ "station_runs_dict = {}\n", "station_runs_dict[\"CAS04\"] = [\"a\", ]\n", "keep_or_drop = \"keep\"\n", "\n", "kernel_dataset.select_station_runs(station_runs_dict, keep_or_drop)\n", "print(kernel_dataset.df[coverage_short_list_columns])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## To discard runs that are not very long" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
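<table>
<thead><tr><th></th><th>survey</th><th>station_id</th><th>run_id</th><th>start</th><th>end</th></tr></thead>
<tbody>
<tr><th>0</th><td>CONUS South</td><td>CAS04</td><td>b</td><td>2020-06-02 22:24:55+00:00</td><td>2020-06-12 17:52:23+00:00</td></tr>
<tr><th>1</th><td>CONUS South</td><td>CAS04</td><td>c</td><td>2020-06-12 18:32:17+00:00</td><td>2020-07-01 17:32:59+00:00</td></tr>
<tr><th>2</th><td>CONUS South</td><td>CAS04</td><td>d</td><td>2020-07-01 19:36:55+00:00</td><td>2020-07-13 19:00:00+00:00</td></tr>
</tbody>
</table>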
\n", "
" ], "text/plain": [ " survey station_id run_id start \\\n", "0 CONUS South CAS04 b 2020-06-02 22:24:55+00:00 \n", "1 CONUS South CAS04 c 2020-06-12 18:32:17+00:00 \n", "2 CONUS South CAS04 d 2020-07-01 19:36:55+00:00 \n", "\n", " end \n", "0 2020-06-12 17:52:23+00:00 \n", "1 2020-07-01 17:32:59+00:00 \n", "2 2020-07-13 19:00:00+00:00 " ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "kernel_dataset = KernelDataset()\n", "kernel_dataset.from_run_summary(run_summary, \"CAS04\")\n", "cutoff_duration_in_seconds = 15000\n", "kernel_dataset.drop_runs_shorter_than(cutoff_duration_in_seconds)\n", "kernel_dataset.df[coverage_short_list_columns]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Select only runs \"b\" & \"d\"" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
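<table>
<thead><tr><th></th><th>survey</th><th>station_id</th><th>run_id</th><th>start</th><th>end</th></tr></thead>
<tbody>
<tr><th>0</th><td>CONUS South</td><td>CAS04</td><td>b</td><td>2020-06-02 22:24:55+00:00</td><td>2020-06-12 17:52:23+00:00</td></tr>
<tr><th>1</th><td>CONUS South</td><td>CAS04</td><td>d</td><td>2020-07-01 19:36:55+00:00</td><td>2020-07-13 19:00:00+00:00</td></tr>
</tbody>
</table>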
\n", "
" ], "text/plain": [ " survey station_id run_id start \\\n", "0 CONUS South CAS04 b 2020-06-02 22:24:55+00:00 \n", "1 CONUS South CAS04 d 2020-07-01 19:36:55+00:00 \n", "\n", " end \n", "0 2020-06-12 17:52:23+00:00 \n", "1 2020-07-13 19:00:00+00:00 " ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "kernel_dataset = KernelDataset()\n", "kernel_dataset.from_run_summary(run_summary, \"CAS04\")\n", "station_runs_dict = {}\n", "station_runs_dict[\"CAS04\"] = [\"b\",\"d\"]\n", "keep_or_drop = \"keep\"\n", "kernel_dataset.select_station_runs(station_runs_dict, keep_or_drop)\n", "kernel_dataset.df[coverage_short_list_columns]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The same result can be obtained by _excluding_ runs a & c" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
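<table>
<thead><tr><th></th><th>survey</th><th>station_id</th><th>run_id</th><th>start</th><th>end</th></tr></thead>
<tbody>
<tr><th>0</th><td>CONUS South</td><td>CAS04</td><td>b</td><td>2020-06-02 22:24:55+00:00</td><td>2020-06-12 17:52:23+00:00</td></tr>
<tr><th>1</th><td>CONUS South</td><td>CAS04</td><td>d</td><td>2020-07-01 19:36:55+00:00</td><td>2020-07-13 19:00:00+00:00</td></tr>
</tbody>
</table>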
\n", "
" ], "text/plain": [ " survey station_id run_id start \\\n", "0 CONUS South CAS04 b 2020-06-02 22:24:55+00:00 \n", "1 CONUS South CAS04 d 2020-07-01 19:36:55+00:00 \n", "\n", " end \n", "0 2020-06-12 17:52:23+00:00 \n", "1 2020-07-13 19:00:00+00:00 " ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "kernel_dataset = KernelDataset()\n", "kernel_dataset.from_run_summary(run_summary, \"CAS04\")\n", "station_runs_dict = {}\n", "station_runs_dict[\"CAS04\"] = [\"a\",\"c\"]\n", "keep_or_drop = \"drop\"\n", "kernel_dataset.select_station_runs(station_runs_dict, keep_or_drop)\n", "kernel_dataset.df[coverage_short_list_columns]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Make an aurora configuration file (and then save that json file.)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "cc = ConfigCreator()\n", "config = cc.create_from_kernel_dataset(kernel_dataset, \n", " emtf_band_file=BANDS_DEFAULT_FILE,)\n" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "for decimation in config.decimations:\n", " decimation.estimator.engine = \"RME\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Take a look at the config:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "{\n", " \"processing\": {\n", " \"band_setup_file\": \"/home/kkappler/software/irismt/aurora/aurora/config/emtf_band_setup/bs_test.cfg\",\n", " \"band_specification_style\": \"EMTF\",\n", " \"channel_nomenclature.ex\": \"ex\",\n", " \"channel_nomenclature.ey\": \"ey\",\n", " \"channel_nomenclature.hx\": \"hx\",\n", " \"channel_nomenclature.hy\": \"hy\",\n", " \"channel_nomenclature.hz\": \"hz\",\n", " \"decimations\": [\n", " {\n", " \"decimation_level\": {\n", " \"anti_alias_filter\": \"default\",\n", " \"bands\": [\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": 
\"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 0,\n", " \"frequency_max\": 0.23828125,\n", " \"frequency_min\": 0.19140625,\n", " \"index_max\": 30,\n", " \"index_min\": 25\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 0,\n", " \"frequency_max\": 0.19140625,\n", " \"frequency_min\": 0.15234375,\n", " \"index_max\": 24,\n", " \"index_min\": 20\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 0,\n", " \"frequency_max\": 0.15234375,\n", " \"frequency_min\": 0.12109375,\n", " \"index_max\": 19,\n", " \"index_min\": 16\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 0,\n", " \"frequency_max\": 0.12109375,\n", " \"frequency_min\": 0.09765625,\n", " \"index_max\": 15,\n", " \"index_min\": 13\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 0,\n", " \"frequency_max\": 0.09765625,\n", " \"frequency_min\": 0.07421875,\n", " \"index_max\": 12,\n", " \"index_min\": 10\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 0,\n", " \"frequency_max\": 0.07421875,\n", " \"frequency_min\": 0.05859375,\n", " \"index_max\": 9,\n", " \"index_min\": 8\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 0,\n", " \"frequency_max\": 0.05859375,\n", " \"frequency_min\": 0.04296875,\n", " \"index_max\": 7,\n", " \"index_min\": 6\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 0,\n", " 
\"frequency_max\": 0.04296875,\n", " \"frequency_min\": 0.03515625,\n", " \"index_max\": 5,\n", " \"index_min\": 5\n", " }\n", " }\n", " ],\n", " \"decimation.factor\": 1.0,\n", " \"decimation.level\": 0,\n", " \"decimation.method\": \"default\",\n", " \"decimation.sample_rate\": 1.0,\n", " \"estimator.engine\": \"RME\",\n", " \"estimator.estimate_per_channel\": true,\n", " \"extra_pre_fft_detrend_type\": \"linear\",\n", " \"input_channels\": [\n", " \"hx\",\n", " \"hy\"\n", " ],\n", " \"method\": \"fft\",\n", " \"min_num_stft_windows\": 2,\n", " \"output_channels\": [\n", " \"hz\",\n", " \"ex\",\n", " \"ey\"\n", " ],\n", " \"pre_fft_detrend_type\": \"linear\",\n", " \"prewhitening_type\": \"first difference\",\n", " \"recoloring\": true,\n", " \"reference_channels\": [\n", " \"hx\",\n", " \"hy\"\n", " ],\n", " \"regression.max_iterations\": 10,\n", " \"regression.max_redescending_iterations\": 2,\n", " \"regression.minimum_cycles\": 10,\n", " \"save_fcs\": false,\n", " \"window.clock_zero_type\": \"ignore\",\n", " \"window.num_samples\": 128,\n", " \"window.overlap\": 32,\n", " \"window.type\": \"boxcar\"\n", " }\n", " },\n", " {\n", " \"decimation_level\": {\n", " \"anti_alias_filter\": \"default\",\n", " \"bands\": [\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 1,\n", " \"frequency_max\": 0.0341796875,\n", " \"frequency_min\": 0.0263671875,\n", " \"index_max\": 17,\n", " \"index_min\": 14\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 1,\n", " \"frequency_max\": 0.0263671875,\n", " \"frequency_min\": 0.0205078125,\n", " \"index_max\": 13,\n", " \"index_min\": 11\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 1,\n", " \"frequency_max\": 0.0205078125,\n", " \"frequency_min\": 
0.0166015625,\n", " \"index_max\": 10,\n", " \"index_min\": 9\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 1,\n", " \"frequency_max\": 0.0166015625,\n", " \"frequency_min\": 0.0126953125,\n", " \"index_max\": 8,\n", " \"index_min\": 7\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 1,\n", " \"frequency_max\": 0.0126953125,\n", " \"frequency_min\": 0.0107421875,\n", " \"index_max\": 6,\n", " \"index_min\": 6\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 1,\n", " \"frequency_max\": 0.0107421875,\n", " \"frequency_min\": 0.0087890625,\n", " \"index_max\": 5,\n", " \"index_min\": 5\n", " }\n", " }\n", " ],\n", " \"decimation.factor\": 4.0,\n", " \"decimation.level\": 1,\n", " \"decimation.method\": \"default\",\n", " \"decimation.sample_rate\": 0.25,\n", " \"estimator.engine\": \"RME\",\n", " \"estimator.estimate_per_channel\": true,\n", " \"extra_pre_fft_detrend_type\": \"linear\",\n", " \"input_channels\": [\n", " \"hx\",\n", " \"hy\"\n", " ],\n", " \"method\": \"fft\",\n", " \"min_num_stft_windows\": 2,\n", " \"output_channels\": [\n", " \"hz\",\n", " \"ex\",\n", " \"ey\"\n", " ],\n", " \"pre_fft_detrend_type\": \"linear\",\n", " \"prewhitening_type\": \"first difference\",\n", " \"recoloring\": true,\n", " \"reference_channels\": [\n", " \"hx\",\n", " \"hy\"\n", " ],\n", " \"regression.max_iterations\": 10,\n", " \"regression.max_redescending_iterations\": 2,\n", " \"regression.minimum_cycles\": 10,\n", " \"save_fcs\": false,\n", " \"window.clock_zero_type\": \"ignore\",\n", " \"window.num_samples\": 128,\n", " \"window.overlap\": 32,\n", " \"window.type\": \"boxcar\"\n", " }\n", " },\n", " {\n", " \"decimation_level\": {\n", " \"anti_alias_filter\": \"default\",\n", " 
\"bands\": [\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 2,\n", " \"frequency_max\": 0.008544921875,\n", " \"frequency_min\": 0.006591796875,\n", " \"index_max\": 17,\n", " \"index_min\": 14\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 2,\n", " \"frequency_max\": 0.006591796875,\n", " \"frequency_min\": 0.005126953125,\n", " \"index_max\": 13,\n", " \"index_min\": 11\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 2,\n", " \"frequency_max\": 0.005126953125,\n", " \"frequency_min\": 0.004150390625,\n", " \"index_max\": 10,\n", " \"index_min\": 9\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 2,\n", " \"frequency_max\": 0.004150390625,\n", " \"frequency_min\": 0.003173828125,\n", " \"index_max\": 8,\n", " \"index_min\": 7\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 2,\n", " \"frequency_max\": 0.003173828125,\n", " \"frequency_min\": 0.002685546875,\n", " \"index_max\": 6,\n", " \"index_min\": 6\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 2,\n", " \"frequency_max\": 0.002685546875,\n", " \"frequency_min\": 0.002197265625,\n", " \"index_max\": 5,\n", " \"index_min\": 5\n", " }\n", " }\n", " ],\n", " \"decimation.factor\": 4.0,\n", " \"decimation.level\": 2,\n", " \"decimation.method\": \"default\",\n", " \"decimation.sample_rate\": 0.0625,\n", " \"estimator.engine\": \"RME\",\n", " \"estimator.estimate_per_channel\": true,\n", " \"extra_pre_fft_detrend_type\": \"linear\",\n", " 
\"input_channels\": [\n", " \"hx\",\n", " \"hy\"\n", " ],\n", " \"method\": \"fft\",\n", " \"min_num_stft_windows\": 2,\n", " \"output_channels\": [\n", " \"hz\",\n", " \"ex\",\n", " \"ey\"\n", " ],\n", " \"pre_fft_detrend_type\": \"linear\",\n", " \"prewhitening_type\": \"first difference\",\n", " \"recoloring\": true,\n", " \"reference_channels\": [\n", " \"hx\",\n", " \"hy\"\n", " ],\n", " \"regression.max_iterations\": 10,\n", " \"regression.max_redescending_iterations\": 2,\n", " \"regression.minimum_cycles\": 10,\n", " \"save_fcs\": false,\n", " \"window.clock_zero_type\": \"ignore\",\n", " \"window.num_samples\": 128,\n", " \"window.overlap\": 32,\n", " \"window.type\": \"boxcar\"\n", " }\n", " },\n", " {\n", " \"decimation_level\": {\n", " \"anti_alias_filter\": \"default\",\n", " \"bands\": [\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 3,\n", " \"frequency_max\": 0.00274658203125,\n", " \"frequency_min\": 0.00213623046875,\n", " \"index_max\": 22,\n", " \"index_min\": 18\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 3,\n", " \"frequency_max\": 0.00213623046875,\n", " \"frequency_min\": 0.00164794921875,\n", " \"index_max\": 17,\n", " \"index_min\": 14\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 3,\n", " \"frequency_max\": 0.00164794921875,\n", " \"frequency_min\": 0.00115966796875,\n", " \"index_max\": 13,\n", " \"index_min\": 10\n", " }\n", " },\n", " {\n", " \"band\": {\n", " \"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 3,\n", " \"frequency_max\": 0.00115966796875,\n", " \"frequency_min\": 0.00079345703125,\n", " \"index_max\": 9,\n", " \"index_min\": 7\n", " }\n", " },\n", " {\n", " \"band\": {\n", " 
\"center_averaging_type\": \"geometric\",\n", " \"closed\": \"left\",\n", " \"decimation_level\": 3,\n", " \"frequency_max\": 0.00079345703125,\n", " \"frequency_min\": 0.00054931640625,\n", " \"index_max\": 6,\n", " \"index_min\": 5\n", " }\n", " }\n", " ],\n", " \"decimation.factor\": 4.0,\n", " \"decimation.level\": 3,\n", " \"decimation.method\": \"default\",\n", " \"decimation.sample_rate\": 0.015625,\n", " \"estimator.engine\": \"RME\",\n", " \"estimator.estimate_per_channel\": true,\n", " \"extra_pre_fft_detrend_type\": \"linear\",\n", " \"input_channels\": [\n", " \"hx\",\n", " \"hy\"\n", " ],\n", " \"method\": \"fft\",\n", " \"min_num_stft_windows\": 2,\n", " \"output_channels\": [\n", " \"hz\",\n", " \"ex\",\n", " \"ey\"\n", " ],\n", " \"pre_fft_detrend_type\": \"linear\",\n", " \"prewhitening_type\": \"first difference\",\n", " \"recoloring\": true,\n", " \"reference_channels\": [\n", " \"hx\",\n", " \"hy\"\n", " ],\n", " \"regression.max_iterations\": 10,\n", " \"regression.max_redescending_iterations\": 2,\n", " \"regression.minimum_cycles\": 10,\n", " \"save_fcs\": false,\n", " \"window.clock_zero_type\": \"ignore\",\n", " \"window.num_samples\": 128,\n", " \"window.overlap\": 32,\n", " \"window.type\": \"boxcar\"\n", " }\n", " }\n", " ],\n", " \"id\": \"CAS04-None\",\n", " \"stations.local.id\": \"CAS04\",\n", " \"stations.local.mth5_path\": \"/home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5\",\n", " \"stations.local.remote\": false,\n", " \"stations.local.runs\": [\n", " {\n", " \"run\": {\n", " \"id\": \"b\",\n", " \"input_channels\": [\n", " {\n", " \"channel\": {\n", " \"id\": \"hx\",\n", " \"scale_factor\": 1.0\n", " }\n", " },\n", " {\n", " \"channel\": {\n", " \"id\": \"hy\",\n", " \"scale_factor\": 1.0\n", " }\n", " }\n", " ],\n", " \"output_channels\": [\n", " {\n", " \"channel\": {\n", " \"id\": \"ex\",\n", " \"scale_factor\": 1.0\n", " }\n", " },\n", " {\n", " \"channel\": {\n", " \"id\": \"ey\",\n", " \"scale_factor\": 
1.0\n", " }\n", " },\n", " {\n", " \"channel\": {\n", " \"id\": \"hz\",\n", " \"scale_factor\": 1.0\n", " }\n", " }\n", " ],\n", " \"sample_rate\": 1.0,\n", " \"time_periods\": [\n", " {\n", " \"time_period\": {\n", " \"end\": \"2020-06-12T17:52:23+00:00\",\n", " \"start\": \"2020-06-02T22:24:55+00:00\"\n", " }\n", " }\n", " ]\n", " }\n", " },\n", " {\n", " \"run\": {\n", " \"id\": \"d\",\n", " \"input_channels\": [\n", " {\n", " \"channel\": {\n", " \"id\": \"hx\",\n", " \"scale_factor\": 1.0\n", " }\n", " },\n", " {\n", " \"channel\": {\n", " \"id\": \"hy\",\n", " \"scale_factor\": 1.0\n", " }\n", " }\n", " ],\n", " \"output_channels\": [\n", " {\n", " \"channel\": {\n", " \"id\": \"ex\",\n", " \"scale_factor\": 1.0\n", " }\n", " },\n", " {\n", " \"channel\": {\n", " \"id\": \"ey\",\n", " \"scale_factor\": 1.0\n", " }\n", " },\n", " {\n", " \"channel\": {\n", " \"id\": \"hz\",\n", " \"scale_factor\": 1.0\n", " }\n", " }\n", " ],\n", " \"sample_rate\": 1.0,\n", " \"time_periods\": [\n", " {\n", " \"time_period\": {\n", " \"end\": \"2020-07-13T19:00:00+00:00\",\n", " \"start\": \"2020-07-01T19:36:55+00:00\"\n", " }\n", " }\n", " ]\n", " }\n", " }\n", " ],\n", " \"stations.remote\": []\n", " }\n", "}" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "config" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run the Aurora pipeline using the input MTH5 and the configuration" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m2024-01-14T10:43:04.266683-0800 | INFO | aurora.pipelines.transfer_function_kernel | show_processing_summary | Processing Summary Dataframe:\u001b[0m\n", "\u001b[1m2024-01-14T10:43:04.272434-0800 | INFO | aurora.pipelines.transfer_function_kernel | show_processing_summary | survey station_id run_id valid remote duration fc dec_level dec_factor sample_rate window_duration num_samples_window 
num_samples num_stft_windows\n", "0 CONUS South CAS04 b True False 847648.0 False 0 1.0 1.000000 128.0 128 847648.0 8829.0\n", "1 CONUS South CAS04 b True False 847648.0 False 1 4.0 0.250000 512.0 128 211912.0 2207.0\n", "2 CONUS South CAS04 b True False 847648.0 False 2 4.0 0.062500 2048.0 128 52978.0 551.0\n", "3 CONUS South CAS04 b True False 847648.0 False 3 4.0 0.015625 8192.0 128 13244.0 137.0\n", "4 CONUS South CAS04 d True False 1034585.0 False 0 1.0 1.000000 128.0 128 1034585.0 10776.0\n", "5 CONUS South CAS04 d True False 1034585.0 False 1 4.0 0.250000 512.0 128 258646.0 2693.0\n", "6 CONUS South CAS04 d True False 1034585.0 False 2 4.0 0.062500 2048.0 128 64661.0 673.0\n", "7 CONUS South CAS04 d True False 1034585.0 False 3 4.0 0.015625 8192.0 128 16165.0 168.0\u001b[0m\n", "\u001b[1m2024-01-14T10:43:04.273880-0800 | INFO | aurora.pipelines.transfer_function_kernel | memory_warning | Total memory: 62.73 GB\u001b[0m\n", "\u001b[1m2024-01-14T10:43:04.274445-0800 | INFO | aurora.pipelines.transfer_function_kernel | memory_warning | Total Bytes of Raw Data: 0.014 GB\u001b[0m\n", "\u001b[1m2024-01-14T10:43:04.274901-0800 | INFO | aurora.pipelines.transfer_function_kernel | memory_warning | Raw Data will use: 0.022 % of memory\u001b[0m\n", "\u001b[1m2024-01-14T10:43:04.294047-0800 | INFO | aurora.pipelines.transfer_function_kernel | check_if_fc_levels_already_exist | Fourier coefficients not detected for survey: CONUS South, station_id: CAS04, run_id: b-- Fourier coefficients will be computed\u001b[0m\n", "\u001b[1m2024-01-14T10:43:04.308964-0800 | INFO | aurora.pipelines.transfer_function_kernel | check_if_fc_levels_already_exist | Fourier coefficients not detected for survey: CONUS South, station_id: CAS04, run_id: d-- Fourier coefficients will be computed\u001b[0m\n", "\u001b[1m2024-01-14T10:43:04.310055-0800 | INFO | aurora.pipelines.process_mth5 | process_mth5 | fc_levels_already_exist = False\u001b[0m\n", "\u001b[1m2024-01-14T10:43:04.310457-0800 | INFO 
| aurora.pipelines.process_mth5 | process_mth5 | Processing config indicates 4 decimation levels\u001b[0m\n", "\u001b[1m2024-01-14T10:43:04.311219-0800 | INFO | aurora.pipelines.transfer_function_kernel | valid_decimations | After validation there are 4 valid decimation levels\u001b[0m\n", "\u001b[1m2024-01-14T10:43:05.684536-0800 | INFO | aurora.transfer_function.kernel_dataset | initialize_dataframe_for_processing | Dataset dataframe initialized successfully\u001b[0m\n", "\u001b[1m2024-01-14T10:43:05.685088-0800 | INFO | aurora.pipelines.transfer_function_kernel | update_dataset_df | Dataset Dataframe Updated for decimation level 0 Successfully\u001b[0m\n", "\u001b[1m2024-01-14T10:43:06.799551-0800 | INFO | aurora.pipelines.process_mth5 | save_fourier_coefficients | Skip saving FCs. dec_level_config.save_fc = False False\u001b[0m\n", "\u001b[1m2024-01-14T10:43:07.973010-0800 | INFO | aurora.pipelines.process_mth5 | save_fourier_coefficients | Skip saving FCs. dec_level_config.save_fc = False False\u001b[0m\n", "\u001b[1m2024-01-14T10:43:08.008574-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 25.728968s (0.038867Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:08.199503-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 19.929573s (0.050177Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:08.454349-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 15.164131s (0.065945Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:08.738590-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 11.746086s (0.085135Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:09.074259-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 9.195791s (0.108745Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:09.421783-0800 | INFO | aurora.time_series.frequency_band_helpers | 
get_band_for_tf_estimate | Processing band 7.362526s (0.135823Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:09.840684-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 5.856115s (0.170762Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:10.309801-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 4.682492s (0.213562Hz)\u001b[0m\n" ] }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m2024-01-14T10:43:11.549673-0800 | INFO | aurora.pipelines.transfer_function_kernel | update_dataset_df | DECIMATION LEVEL 1\u001b[0m\n", "\u001b[1m2024-01-14T10:43:11.742180-0800 | INFO | aurora.pipelines.transfer_function_kernel | update_dataset_df | Dataset Dataframe Updated for decimation level 1 Successfully\u001b[0m\n", "\u001b[1m2024-01-14T10:43:12.233537-0800 | INFO | aurora.pipelines.process_mth5 | save_fourier_coefficients | Skip saving FCs. dec_level_config.save_fc = False False\u001b[0m\n", "\u001b[1m2024-01-14T10:43:12.686413-0800 | INFO | aurora.pipelines.process_mth5 | save_fourier_coefficients | Skip saving FCs. dec_level_config.save_fc = False False\u001b[0m\n", "\u001b[1m2024-01-14T10:43:12.702055-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 102.915872s (0.009717Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:12.793054-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 85.631182s (0.011678Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:12.915422-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 68.881694s (0.014518Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:13.083985-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 54.195827s (0.018452Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:13.244329-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 43.003958s (0.023254Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:13.398642-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 33.310722s (0.030020Hz)\u001b[0m\n" ] }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m2024-01-14T10:43:14.037690-0800 | INFO | aurora.pipelines.transfer_function_kernel | update_dataset_df | DECIMATION LEVEL 2\u001b[0m\n", "\u001b[1m2024-01-14T10:43:14.109079-0800 | INFO | aurora.pipelines.transfer_function_kernel | update_dataset_df | Dataset Dataframe Updated for decimation level 2 Successfully\u001b[0m\n", "\u001b[1m2024-01-14T10:43:14.423959-0800 | INFO | aurora.pipelines.process_mth5 | save_fourier_coefficients | Skip saving FCs. dec_level_config.save_fc = False False\u001b[0m\n", "\u001b[1m2024-01-14T10:43:14.754151-0800 | INFO | aurora.pipelines.process_mth5 | save_fourier_coefficients | Skip saving FCs. dec_level_config.save_fc = False False\u001b[0m\n", "\u001b[1m2024-01-14T10:43:14.764705-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 411.663489s (0.002429Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:14.806179-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 342.524727s (0.002919Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:14.847611-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 275.526776s (0.003629Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:14.962995-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 216.783308s (0.004613Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:15.064856-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 172.015831s (0.005813Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:15.179677-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 133.242890s (0.007505Hz)\u001b[0m\n" ] }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m2024-01-14T10:43:15.771183-0800 | INFO | aurora.pipelines.transfer_function_kernel | update_dataset_df | DECIMATION LEVEL 3\u001b[0m\n", "\u001b[1m2024-01-14T10:43:15.799960-0800 | INFO | aurora.pipelines.transfer_function_kernel | update_dataset_df | Dataset Dataframe Updated for decimation level 3 Successfully\u001b[0m\n", "\u001b[1m2024-01-14T10:43:16.108674-0800 | INFO | aurora.pipelines.process_mth5 | save_fourier_coefficients | Skip saving FCs. dec_level_config.save_fc = False False\u001b[0m\n", "\u001b[1m2024-01-14T10:43:16.403179-0800 | INFO | aurora.pipelines.process_mth5 | save_fourier_coefficients | Skip saving FCs. dec_level_config.save_fc = False False\u001b[0m\n", "\u001b[1m2024-01-14T10:43:16.411907-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 1514.701336s (0.000660Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:16.453957-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 1042.488956s (0.000959Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:16.495345-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 723.371271s (0.001382Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:16.537654-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 532.971560s (0.001876Hz)\u001b[0m\n", "\u001b[1m2024-01-14T10:43:16.578356-0800 | INFO | aurora.time_series.frequency_band_helpers | get_band_for_tf_estimate | Processing band 412.837995s (0.002422Hz)\u001b[0m\n" ] }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m2024-01-14T10:43:17.251452-0800 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5\u001b[0m\n" ] } ], "source": [ "show_plot = True\n", "tf_cls = process_mth5(config,\n", " kernel_dataset,\n", " units=\"MT\",\n", " show_plot=show_plot,\n", " z_file_path=None,\n", " )" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "mt_metadata.transfer_functions.core.TF" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "type(tf_cls)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Write the transfer functions generated by the Aurora pipeline" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "EMTFXML(station='CAS04', latitude=37.63, longitude=-121.47, elevation=329.39)" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ " tf_cls.write(fn=\"emtfxml_test.xml\", file_type=\"emtfxml\")" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "EMTFXML(station='CAS04', latitude=37.63, longitude=-121.47, elevation=329.39)" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tf_cls.write(fn=\"emtfxml_test.xml\", file_type=\"edi\")" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "EMTFXML(station='CAS04', latitude=37.63, longitude=-121.47, elevation=329.39)" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ " tf_cls.write(fn=\"emtfxml_test.xml\", file_type=\"zmm\")" ] } ], "metadata": { "kernelspec": { "display_name": "aurora-test", "language": "python", "name": 
"aurora-test" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.10" } }, "nbformat": 4, "nbformat_minor": 4 }