SeisIO¶
SeisIO is a collection of utilities for reading and downloading geophysical timeseries data.
Introduction¶
SeisIO is a framework for working with univariate geophysical data on 64-bit systems. SeisIO is designed around three basic principles:
- Ease of use: one shouldn’t need a PhD to understand command syntax.
- Fluidity: working with data shouldn’t feel clumsy.
- Performance: speed and efficient memory usage matter.
The project is home to an expanding set of web clients, file format readers, and analysis utilities.
Overview¶
SeisIO stores data in minimalist containers that track the bare necessities for analysis. New data are easily added with basic operators like +. Unwanted channels can be removed just as easily. Data can be written to a number of file formats.
Installation¶
From the Julia prompt: press ]
to enter the Pkg environment, then type
add SeisIO; build; precompile
Dependencies should install automatically. To verify that everything works correctly, type
test SeisIO
and allow 10-20 minutes for tests to complete. Exit the Pkg environment by pressing Backspace or Control + C.
Getting Started¶
At the Julia prompt, type
using SeisIO
You’ll need to repeat this step whenever you restart Julia, as with any command-line interpreter (CLI) language.
Learning SeisIO¶
An interactive tutorial using Jupyter notebooks in a web browser can be accessed from the Julia prompt with these commands:
p = pathof(SeisIO)
d = dirname(realpath(p))
cd(d)
include("../tutorial/install.jl")
SeisIO also has an online tutorial guide, intended as a gentle introduction for people less familiar with the Julia language. The two are intentionally redundant; Jupyter isn’t compatible with all systems and browsers.
For a faster start, skip to any of these topics:
- Working with Data: learn how to manage data using SeisIO
- Reading Data: learn how to read data from file
- Web Requests: learn how to download data
Updating¶
From the Julia prompt: press ]
to enter the Pkg environment, then type
update
. Once package updates finish, restart Julia to use them.
First Steps¶
SeisIO is designed around easy, fluid, and fast data access. At the most basic level, SeisIO uses an array-like structure called a SeisChannel for single-channel data, and a multichannel structure named SeisData.
Start Here¶
Create a new, empty SeisChannel object with
Ch = SeisChannel()
The meanings of the field names are explained here; you can also type
?SeisChannel
at the Julia prompt. You can edit field values manually, e.g.,
returning to the code block above,
Ch = SeisChannel()
Ch.loc = GeoLoc(lat=-90.0, lon=0.0, el=2835.0, az=0.0, inc=0.0)
Ch.name = "South pole"
or you can set them with keywords at creation:
Ch = SeisChannel(name="Templo de San José de La Floresta, Ajijic", fs=40.0)
Note that Strings in field names support full Unicode, so even graffiti can be saved and read back in. This is useful for data containing non-English characters.
SeisData structures are collections of channel data. They can be created with the SeisData() command, which can optionally create any number of empty channels at a time, e.g.,
S = SeisData() # empty structure, no channels
S1 = SeisData(12) # empty 12-channel structure
They can be explored similarly:
S = SeisData(1)
S.name[1] = "South pole"
S.loc[1] = GeoLoc(lat=-90.0, lon=0.0, el=2835.0, az=0.0, inc=0.0)
A collection of channels becomes a SeisData structure; for example,
L = GeoLoc(lat=46.1967, lon=-122.1875, el=1440.0, az=0.0, inc=0.0)
S = SeisData(SeisChannel(), SeisChannel())
S = SeisData(randSeisData(5), SeisChannel(),
SeisChannel(id="UW.SEP..EHZ", name="Darth Exploded", loc=L))
You can push channels onto existing SeisData structures, like adding one key to a dictionary. For example,
push!(S, Ch)
copies Ch to a new channel in S. Note that the new S[3] is not a view into Ch.
This is deliberate, as otherwise the workspace quickly becomes a mess of
redundant channels. Clean up with Ch = []
to free memory before moving on.
Operations on SeisData structures¶
We’re now ready for a short tutorial of what we can do with data structures. In the commands below, as in most of this documentation, Ch is a SeisChannel object and S is a SeisData object.
Adding channels to a SeisData structure¶
You’ve already seen one way to add a channel to SeisData: push!(S, SeisChannel()) adds an empty channel. Here are others:
S = randSeisData(4)
n = 3
append!(S, SeisData(n))
This initializes an empty n-channel SeisData structure and appends it to the end of S, similar to appending one array to another. S will have 7 channels, but the last three are empty.
The addition operator, +, calls push! and add!, with one key difference: to ensure reflexivity (i.e., S1 + S2 == S2 + S1), the + operator sorts the output (command sort!) and prunes empty channels (command prune!). Thus,
S = SeisData(2)
S1 = randSeisData(3)
S += S1
outputs only the three channels of random data initialized in the second line of the code block; in addition, they’re sorted by ID, so it’s likely that S != S1.
Search, Sort, and Prune¶
The easiest way to find channels of interest in a data structure is to use
findid or findchan. findid(id, S)
returns the numeric index i of the first channel in S where S.id[i] == id
. findchan(cha, S)
returns an array of numeric indices in S to all channels whose IDs satisfy occursin(cha, S.id[i])
.
For example:
L = GeoLoc(lat=46.1967, lon=-122.1875, el=1440.0, az=0.0, inc=0.0)
S = SeisData(randSeisData(5), SeisChannel(id="YY.STA1..EHZ"),
SeisChannel(id="UW.SEP..EHZ", name="Darth Exploded", loc=L))
findid("UW.SEP..EHZ", S) # 7
findchan("EHZ", S) # [6, 7], maybe others (depending on randSeisData)
You can sort channels in a structure by channel ID with the sort! command.
Several functions exist to prune empty and unwanted channels from SeisData structures. Revisiting the previous code block, for example, try these:
deleteat!(S, 1:2) # Delete first two channels of S
S -= 3 # Delete third channel of S
# Extract S[1] as a SeisChannel, removing it from S
C = pull(S, 1)
# Delete channels containing ".SEP."
delete!(S, ".SEP.", exact=false)
# Delete all channels whose S.x is empty
prune!(S)
S
S should have one channel left.
In the delete! command, specifying exact=false means that any channel whose
ID partly matches the string “.SEP.” gets deleted; by default,
delete!(S, str)
only matches channels where str is the exact ID. This is
an efficient way to remove unresponsive subnets and unwanted channel types, but
beware of accidental over-matching.
Next Steps¶
Because tracking arbitrary operations can be difficult, several functions have been written to keep track of data and operations in a semi-automated way. See the next section, working with data, for detailed discussion of managing data from the Julia command prompt.
Working with Data¶
This section is a tutorial for tracking and managing SeisIO data.
Creating Data Containers¶
Create a new, empty object using any of the following commands:
Object | Purpose |
---|---|
SeisChannel() | A single channel of univariate (usually time-series) data |
SeisData() | Multichannel univariate (usually time-series) data |
SeisHdr() | Header structure for discrete seismic events |
SeisEvent() | Discrete seismic events; includes SeisHdr and SeisData objects |
Acquiring Data¶
Keeping Track¶
A number of auxiliary functions exist to keep track of channels:
-
findchan
(id::String, S::SeisData)¶
-
findchan
(S::SeisData, id::String)
Get all channel indices i in S with id \(\in\) S.id[i]. Can do partial id matches, e.g. findchan(S, “UW.”) returns indices to all channels whose IDs begin with “UW.”.
-
findid
(S::SeisData, id)¶
Return the index of the first channel in S where id = id. Requires an exact string match; faster than Julia findfirst on small structures.
-
findid
(S::SeisData, Ch::SeisChannel)
Equivalent to findfirst(S.id.==Ch.id).
-
namestrip!(S[, convention])
Remove bad characters from the :name fields of S. Specify convention as a string (default is “File”):
Convention | Characters Removed:sup:(a) |
---|---|
“File” | "$*/:<>?@\^|~DEL |
“HTML” | "&';<>©DEL |
“Julia” | $\DEL |
“Markdown” | !#()*+-.[\]_`{} |
“SEED” | .DEL |
“Strict” | !"#$%&'()*+,-./:;<=>?@[\]^`{|}~DEL |
(a) DEL
here is \x7f (ASCII/Unicode U+007f).
-
timestamp
()¶
Return current UTC time formatted yyyy-mm-ddTHH:MM:SS.
-
track_off!(S)
Turn off tracking in S and return a boolean vector of which channels were added or changed.
-
track_on!(S)
Begin tracking changes in S. Tracks changes to :id, channel additions, and changes to data vector sizes in S.x.
Does not track data processing operations on any channel i unless length(S.x[i]) changes for channel i (e.g. filtering is not tracked).
Warning: If you have or suspect gapped data in any channel, calling ungap! while tracking is active will always flag a channel as changed.
Source Logging¶
The :src field records the last data source used to populate each channel; usually a file name or URL.
When a data source is added to a channel, including the first time data are added, it’s also recorded in the :notes field. Use show_src(S, i) to print all data sources for channel S[i] to stdout (see below for details).
Channel Maintenance¶
A few functions exist specifically to simplify data maintenance:
-
prune!(S::SeisData)
Delete all channels from S that have no data (i.e. S.x is empty or non-existent).
-
C = pull(S::SeisData, id::String)
Extract the first channel with id=id from S and return it as a new SeisChannel structure. The corresponding channel in S is deleted.
-
C = pull(S::SeisData, i::integer)
Extract channel i from S as a new SeisChannel object C, and delete the corresponding channel from S.
Taking Notes¶
Functions that add and process data note these operations in the :notes field of each object affected. One can also add custom notes with the note! command:
-
note!(S, i, str)
Append str with a timestamp to the :notes field of channel number i of S.
-
note!(S, id, str)
As above for the first channel in S whose id is an exact match to id.
-
note!(S, str)
if str* mentions a channel name or ID, only the corresponding channel(s) in **S is annotated; otherwise, all channels are annotated.
Clear all notes from channel i
of S
.
clear_notes!(S, id)
Clear all notes from the first channel in S
whose id field exactly matches id
.
clear_notes!(S)
Clear all notes from every channel in S
.
Checking Your Work¶
If you need to check what’s been done to a channel, or the sources present in the channel data, these commands are helpful:
-
show_processing
(S::SeisData)¶
-
show_processing
(S::SeisData, i::Int)
-
show_processing
(C::SeisChannel)
Tabulate and print all processing steps in :notes to stdout in human-readable format.
-
show_src
(S::SeisData)¶
-
show_src
(S::SeisData, i::Int)
-
show_src
(C::SeisChannel)
Tabulate and print all data sources in :notes to stdout.
-
show_writes
(S::SeisData)¶
-
show_writes
(S::SeisData, i::Int)
-
show_writes
(C::SeisChannel)
Tabulate and print all write operations in :notes to stdout in human-readable format.
Getting Help¶
In addition to the Juypter notebooks and online tutorial guide, other sources of help are available:
Examples¶
Several worked examples exist throughout these documents, in addition to examples.jl and the interactive tutorial.
Invoke the command-prompt examples with the following command sequence:
p = pathof(SeisIO)
d = dirname(realpath(p))
cd(d)
include("../test/examples.jl")
Tests¶
The commands in tests/ can be used as templates; to install test data and run all tests, execute these commands:
using Pkg
Pkg.test("SeisIO") # lunch break recommended. Tests can take 20 minutes.
# 99.5% code coverage wasn't an accident...
p = pathof(SeisIO)
cd(realpath(dirname(p) * "/../test/"))
Command-Line Help¶
A great deal of additional help functions are available at the Julia command prompt. All SeisIO functions and structures have their own docstrings. For example, typing ?SeisData at the Julia prompt produces the following:
SeisData
A custom structure designed to contain the minimum necessary information
for processing univariate geophysical data.
SeisChannel
A single channel designed to contain the minimum necessary information
for processing univariate geophysical data.
Fields
========
Field Description
–––––– ––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
:n Number of channels [^1]
:c TCP connections feeding data to this object [^1]
:id Channel id. Uses NET.STA.LOC.CHAN format when possible
:name Freeform channel name
:loc Location (position) vector; any subtype of InstrumentPosition
:fs Sampling frequency in Hz; fs=0.0 for irregularly-sampled data.
:gain Scalar gain
:resp Instrument response; any subtype of InstrumentResponse
:units String describing data units. UCUM standards are assumed.
:src Freeform string describing data source.
:misc Dictionary for non-critical information.
:notes Timestamped notes; includes automatically-logged info.
:t Matrix of time gaps in integer μs, formatted [Sample# Length]
:x Time-series data
Dedicated Help Functions¶
These functions take no arguments and dump information to stdout.
Submodule SEED¶
-
dataless_support
()¶
Output lists of supported blockettes in dataless SEED to stdout.
-
mseed_support
()¶
Output lists of supported blockettes in mini-SEED to stdout.
-
resp_wont_read
()¶
Output “hall of shame” of known examples of broken RESP to stdout.
-
seed_support
()¶
Output full information on SEED support to stdout.
Formats Guide¶
formats is a constant static dictionary with descriptive entries of each data format. Access the list of formats with sort(keys(formats)). Then try a command like formats[“slist”] for detailed info. on the slist format.
Help-Only Functions¶
These functions contain help docstrings but execute nothing. They exist to answer common questions.
-
?web_chanspec
Answers: how do I specify channels in a web request? Outputs channel id syntax to stdout.
-
?seis_www
Answers: which servers are available for FDSN queries? Outputs the FDSN server list to stdout.
-
?TimeSpec
All About Keywords¶
Invoke keywords help with ?SeisIO.KW for complete information on SeisIO shared keywords and meanings.
Time-Series Files¶
-
read_data!(S, fmt::String, filepat [, KWs])
-
S = read_data(fmt::String, filepat [, KWs])
filepat
. Supports wildcards.?SeisIO.KW
. See table below for the list.Supported File Formats¶
File Format | String | Strict Match |
---|---|---|
AH-1 | ah1 | id, fs, gain, loc, resp, units |
AH-2 | ah2 | id, fs, gain, loc, resp |
Bottle (UNAVCO) | bottle | id, fs, gain |
GeoCSV, time-sample pair | geocsv | id |
GeoCSV, sample list | geocsv.slist | id |
Lennartz ASCII | lenartz | id, fs |
Mini-SEED | mseed | id, fs |
PASSCAL SEG Y | passcal | id, fs, gain, loc |
SAC | sac | id, fs, gain |
SEG Y (rev 0 or rev 1) | segy | id, fs, gain, loc |
SEISIO | seisio | id, fs, gain, loc, resp, units |
SLIST (ASCII sample list) | slist | id, fs |
SUDS | suds | id |
UW data file | uw | id, fs, gain, units |
Win32 | win32 | id, fs, gain, loc, resp, units |
Strings are case-sensitive to prevent any performance impact from using matches and/or lowercase().
Note that read_data with file format “seisio” largely exists as a convenience
wrapper; it reads only the first SeisIO object from each file that can be
converted to a SeisData structure. For more complicated read operations,
rseis
should be used.
Warning: GeoCSV files must be Unix text files; DOS text files, whose lines end in “\r\n”, will not read properly. Convert with dos2unix or equivalent Windows Powershell commands.
Supported Keywords¶
Keyword | Used By | Type | Default | Meaning |
---|---|---|---|---|
cf | win32 | String | “” | win32 channel info filestr |
full | [1] | Bool | false | read full header into :misc? |
ll | segy | UInt8 | 0x00 | set loc in :id? (see below) |
memmap | * | Bool | false | use Mmap.mmap to buffer file? |
nx_add | [2] | Int64 | 360000 | minimum size increase of x |
nx_new | [3] | Int64 | 86400000 | length(x) for new channels |
jst | win32 | Bool | true | are sample times JST (UTC+9)? |
swap | [4] | Bool | true | byte swap? |
strict | * | Bool | true | use strict match? |
v | * | Integer | 0 | verbosity |
vl | * | Bool | 0 | verbose source logging? [5] |
Table Footnotes
[1] | used by ah1, ah2, sac, segy, suds, uw; information read into :misc varies by file format. |
[2] | see table below. |
[3] | used by bottle, mseed, suds, win32 |
[4] | used by bottle, mseed, suds, win32 |
[5] | used by mseed, passcal, segy; swap is automatic for sac. |
[6] | adds one line to :notes per file read. It is not guaranteed that files listed in S.notes[i] contain data for channel i; rather, all files listed are from the read operation(s) that populated i. |
Performance Tips¶
- mmap=true improves read speed for some formats, particularly ASCII readers, but requires caution. In our benchmarks, the following significant (>3%) speed changes are observed:
- Significant speedup: ASCII formats, including metadata formats
- Slight speedup: mini-SEED
- Significant slowdown: SAC
2. With mseed or win32 data, adjust nx_new and nx_add based on the sizes of the data vectors that you expect to read. If the largest has Nmax samples, and the smallest has Nmin, we recommend nx_new=Nmin and nx_add=Nmax-Nmin.
Default values can be changed in SeisIO keywords, e.g.,
SeisIO.KW.nx_new = 60000
SeisIO.KW.nx_add = 360000
The system-wide defaults are nx_new=86400000 and nx_add=360000. Using these values with very small jobs will greatly decrease performance.
3. strict=true may slow read_data based on the fields matched as part of the file format. In general, any file format that can match on more than id and fs will read slightly slower with this option.
Channel Matching¶
By default, read_data continues a channel if data read from file matches the
channel id (field :id). In some cases this is not enough to guarantee a good match. With strict=true
, read_data matches against fields :id, :fs, :gain, :loc, :resp, and :units. However, not all of these fields are stored natively in all file formats. Column “Strict Match” in the first table lists which fields are stored (and can be logically matched) in each format with strict=true.
Examples¶
S = read_data("uw", "99011116541W", full=true)
- Read UW-format data file
99011116541W
- Store full header information in
:misc
- Read UW-format data file
read_data!(S, "sac", "MSH80*.SAC")
- Read SAC-format files matching string pattern MSH80*.SAC
- Read into existing SeisData object
S
S = read_data("win32", "20140927*.cnt", cf="20140927*ch", nx_new=360000)
- Read win32-format data files with names matching pattern
2014092709*.cnt
- Use ASCII channel information filenames that match pattern
20140927*ch
- Assign new channels an initial size of
nx_new
samples
- Read win32-format data files with names matching pattern
Memory Mapping¶
memmap=true is considered unsafe because Julia language handling of SIGBUS/SIGSEGV and associated risks is undocumented as of SeisIO v1.0.0. Thus, for example, we don’t know what a connection failure during memory-mapped file I/O does. In some languages, this situation without additional signal handling was notorious for corrupting files.
Under no circumstances should mmap=true be used to read files directly from a drive whose host device power management is independent of the destination computer’s. This includes all work flows that involve reading files directly into memory from a connected data logger. It is not a sufficient workaround to set a data logger to “always on”.
Format Descriptions and Notes¶
Additional format information can be accessed from the command line by typing
SeisIO.formats("FMT")
where FMT is the format name; keys(SeisIO.formats)
for a list.
- AH (Ad-Hoc) was developed as a machine-independent seismic data format based on External Data Representation (XDR).
- Bottle is a single-channel format maintained by UNAVCO (USA).
- GeoCSV: an extension of “human-readable”, tabular file format Comma-Separated Values (CSV).
- Lennartz: a variant of sample list (SLIST) used by Lennartz portable digitizers.
- PASSCAL: A single- channel variant of SEG Y with no file header, developed by PASSCAL/New Mexico Tech and used with PASSCAL field equipment.
- SAC: the Seismic Analysis Code data format, originally developed by LLNL for the eponymous command-line interpreter.
- SEED: adopted by the International Federation of Digital Seismograph Networks (FDSN) as an omnibus seismic data standard. mini-SEED is a data-only variant that uses only data blockettes.
- SEG Y: Society of Exploration Geophysicists data format. Common in the energy industry. Developed and maintained by SEG.
- SLIST: An ASCII file with a one-line header and data written to file in ASCII string format.
- SUDS: A similar format to SEED, developed by the US Geological Survey (USGS) in the late 1980s.
- UW: created in the 1970s by the Pacific Northwest Seismic Network (PNSN), USA, for event archival; used until the early 2000s.
- Win32: maintained by the National Research Institute for Earth Science and Disaster Prevention (NIED), Japan. Continuous data are divided into files that contain a minute of data from multiple channels stored in one-second segments.
Format-Specific Information¶
SEG Y¶
Only SEG Y rev 0 and rev 1 with standard headers are supported. The following are known support limitations:
- A few SEG Y headers are partially implemented or unused. These will be refined as we obtain more test data with standardized SEG Y headers and known results.
- Not all SEG Y files use the gain formula in the SEG Y rev 1 manual. Users are urged to consult equipment manufacturers and/or coders whose software converts proprietary data formats to SEG Y.
- SeisIO does not use the Textual File Header (file bytes 1-3600) or Extended Textual File Header records, as these were never standardized. Specify full=true to read the raw bytes into vectors in :misc. These byte vectors can be parsed manually by the user after file read.
Setting the Location Subfield¶
The location subfield within :id (“LL” in NN.SSSS.LL.CC) is normally blank, but can be set from an arbitrary Int32 quantity in SEG Y. The reason for this behavior is that SEG Y has at least six “recommended” quantities that can indicate a unique channel. Use one by passing the corresponding value from the table below to keyword “ll”:
Code | U | Bytes | :misc | Usual trace header quantity |
---|---|---|---|---|
0x00 | None (Default); don’t set LL | |||
0x01 | Y | 001-004 | trace_seq_line | Trace sequence number within line |
0x02 | Y | 005-008 | trace_seq_file | Trace sequence number within SEG Y file |
0x03 | 009-012 | rec_no | Original field record number | |
0x04 | Y | 013-016 | channel_no | Trace number within original field record |
0x05 | 017-020 | energy_src_pt | Energy source point number | |
0x06 | 021-024 | cdp | Ensemble number | |
0x07 | ? | 025-028 | trace_in_ensemble | Trace number within the ensemble |
0x08 | 037-040 | src-rec_dist | Distance from center of source point | |
0x09 | 041-044 | rec_ele | Receiver group elevation | |
0x0a | 045-048 | src_ele | Surface elevation at source | |
0x0b | 049-052 | src_dep | Source depth below surface (positive) | |
0x0c | 053-056 | rec_datum_ele | Datum elevation at receiver group | |
0x0d | 057-060 | src_datum_ele | Datum elevation at source | |
0x0e | 061-064 | src_water_dep | Water depth at source | |
0x0f | 065-068 | rec_water_dep | Water depth at group | |
0x10 | 073-076 | src_x | Source coordinate - X | |
0x11 | 077-080 | src_y | Source coordinate - Y | |
0x12 | 081-084 | rec_x | Group coordinate - X | |
0x13 | 085-088 | rec_y | Group coordinate - Y | |
0x14 | 181-184 | cdp_x | X coordinate of ensemble (CDP) position | |
0x15 | 185-188 | cdp_y | Y coordinate of ensemble (CDP) position | |
0x16 | 189-192 | inline_3d | For 3-D poststack data, in-line number | |
0x17 | 193-196 | crossline_3d | For 3-D poststack data, cross-line number | |
0x18 | 197-200 | shot_point | Shotpoint number (2-D post-stack data) | |
0x19 | 205-208 | trans_mant | Transduction Constant (mantissa) | |
0x1a | ? | 233-236 | unassigned_1 | Unassigned — For optional information |
0x1b | ? | 237-240 | unassigned_2 | Unassigned — For optional information |
A SEG Y file usually increments one (or more) of 0x01, 0x02, or 0x04 for each trace. Unfortunately, we can’t imagine any way to use all three, or even two, in a SEGY-compliant channel ID.
Warning: for any quantity above,
- Numeric values >1296 lead to nonstandard characters in the LL subfield
- Numeric values >7200 lead to non-unique :id fields, with undefined results
- Numeric values >9216 cause read_data to throw an InexactError
UW¶
Only UW v2 (UW-2) data files are supported. We have no reason to believe that any UW-1 data files are in circulation, and external converters to UW-2 exist.
Win32¶
Use older channel files with caution. They were not controlled by any central authority until the late 2010s. Inconsistencies between different versions of the same channel file were found by SeisIO developers as recently as 2015.
Metadata Files¶
-
read_meta!(S, fmt::String, filepat [, KWs])
-
S = read_meta(fmt::String, filepat [, KWs])
filepat
. Supports wildcards.?SeisIO.KW
.Supported File Formats¶
File Format | String |
---|---|
Dataless SEED | dataless |
FDSN Station XML | sxml |
SACPZ | sacpz |
SEED RESP | resp |
Warning: Dataless SEED, SACPZ, and RESP files must be Unix text files; DOS text files, whose lines end in “\r\n”, will not read properly. Convert with dos2unix or equivalent Windows Powershell commands.
Supported Keywords¶
KW | Used By | Type | Default | Meaning |
---|---|---|---|---|
memmap | all | Bool | false | use Mmap.mmap to buffer file? |
msr | sxml | Bool | false | read full MultiStageResp? |
s | all | TimeSpec | Start time | |
t | all | TimeSpec | Termination (end) time | |
units | resp | Bool | false | fill in MultiStageResp units? |
dataless | ||||
v | all | Integer | 0 | verbosity |
Note: mmap=true improves read speed for ASCII formats but requires caution. Julia language handling of SIGBUS/SIGSEGV and associated risks is unknown and undocumented.
HDF5 Files¶
Of the increasingly popular HDF5-based formats for geophysical data, only ASDF is supported at present. Support for other (sub)formats is planned.
-
S = read_hdf5(fname::String, s::TimeSpec, t::TimeSpec, [, KWs])
-
read_hdf5!(S::GphysData, fname::String, s::TimeSpec, t::TimeSpec, [, KWs])
?SeisIO.KW
.This has one fundamental design difference from read_data: HDF5 archives are assumed to be large files with data from multiple channels; they are scanned selectively for data of interest to read, rather than read into memory in their entirety.
Supported Keywords¶
KW | Type | Default | Meaning |
---|---|---|---|
id | String | “...*” | id pattern, formated nn.sss.ll.ccc |
(net.sta.loc.cha); FDSN-style wildcards (a) | |||
msr | Bool | true | read full (MultiStageResp) instrument resp? |
v | Integer | 0 | verbosity |
(a) A question mark (‘?’) is a wildcard for a single character (exactly one); an asterisk (‘*’) is a wildcard for zero or more characters.
Writing to HDF5 volumes is supported through write_hdf5, described in Writing to File.
XML Meta-Data¶
SeisIO can read and write the following XML metadata formats:
- QuakeML Version 1.2
- StationXML Version 1.1
StationXML¶
-
read_sxml
(fpat[, KWs])¶
Read FDSN StationXML files matching string pattern fpat into a new SeisData object.
s
: start time. Format “YYYY-MM-DDThh:mm:ss”, e.g., “0001-01-01T00:00:00”.t
: termination (end) time. Format “YYYY-MM-DDThh:mm:ss”.msr
: (Bool) read instrument response info as MultiStageResp?QuakeML¶
-
read_qml
(fpat)
Read QuakeML files matching string pattern fpat. Returns a tuple containing an array of SeisHdr objects H and an array of SeisSrc objects R. Each pair (H[i], R[i]) describes the preferred location (origin, SeisHdr) and event source (focal mechanism or moment tensor, SeisSrc) of event i.
If multiple focal mechanisms, locations, or magnitudes are present in a single Event element of the XML file(s), the following rules are used to select one of each per event:
Non-essential QuakeML data are saved to misc in each SeisHdr or SeisSrc object as appropriate.
Web Services¶
Data requests use get_data!
for FDSN, IRISWS, and IRIS PH5WS data services; for (near)
real-time streaming, see SeedLink.
Time-Series Data¶
-
get_data!(S, method, channels; KWs)
-
S = get_data(method, channels; KWs)
?chanspec
at the Julia prompt for more info.Keywords¶
unscale
: divide gain from data after downloaddemean
: demean data after downloaddetrend
: detrend data after downloadtaper
: taper data after downloadungap
: remove gaps in data after downloadrr
: remove seismic instrument response after download
autoname
: determine file names from channel ID?msr
: get instrument responses asMultiStageResonse
? (“FDSN” only)s
: start timet
: termination (end) timexf
: XML file name for output station XML
Special Behavior¶
- autoname=true attempts to emulate IRISWS channel file naming conventions. For this to work, however, each request must return exactly one channel. A wildcard (“*” or “?”) in a channel string deactivates
autoname=true
. - Seismic processing keywords follow an order of operations that matches the ordering of the above list.
- IRISWS requests always remove the stage zero gain on the server side, because the service doesn’t include the gain constant in the request. This ensures that :gain is accurate in SeisIO.
- IRISWS requests don’t fill :loc or :resp fields in mini-SEED and don’t fill the :resp field in SAC. For cross-format consistency, the stage-zero (scalar) gain is removed from any request to IRISWS and the :gain field in such channels is 1.0.
Data Formats¶
SeisIO supports the following data format strings in timeseries web requests, subject to the limitations of the web service:
- “miniseed” or “mseed” for mini-SEED
- “sac” or “sacbl” for binary little-endian SAC
- “geocsv” for two-column (tspair) GeoCSV
Station Metadata¶
-
FDSNsta!(S, chans, KW)
-
S = FDSNsta(chans, KW)
Fill channels chans of SeisData structure S with information retrieved from remote station XML files by web query.
msr
: get instrument responses asMultiStageResonse
?s
: start timet
: termination (end) timexf
: XML file name for output station XML
Examples¶
Note that the “src” keyword is used by FDSNWS dataselect queries, but not by IRISWS or PH5WS timeseries queries.
1. Download 10 minutes of data from four stations at Mt. St. Helens (WA, USA), delete the low-gain channels, and save as SAC files in the current directory.
S = get_data("FDSN", "CC.VALT, UW.SEP, UW.SHW, UW.HSR", src="IRIS", t=-600)
S -= "UW.SHW..ELZ"
S -= "UW.HSR..ELZ"
writesac(S)
2. Get 5 stations, 2 networks, all channels, last 600 seconds of data at IRIS
CHA = "CC.PALM, UW.HOOD, UW.TIMB, CC.HIYU, UW.TDH"
TS = u2d(time())
TT = -600
S = get_data("FDSN", CHA, src="IRIS", s=TS, t=TT)
3. A request to FDSN Potsdam, time-synchronized, with some verbosity
ts = "2011-03-11T06:00:00"
te = "2011-03-11T06:05:00"
R = get_data("FDSN", "GE.BKB..BH?", src="GFZ", s=ts, t=te, v=1, y=true)
4. Get channel information for strain and seismic channels at station PB.B001:
S = FDSNsta("CC.VALT..,PB.B001..BS?,PB.B001..E??")
- Get trace data from IRISws from
TS
toTT
at channelsCHA
S = SeisData()
CHA = "UW.TDH..EHZ, UW.VLL..EHZ, CC.VALT..BHZ"
TS = u2d(time()-86400)
TT = 600
get_data!(S, "IRIS", CHA, s=TS, t=TT)
6. Get synchronized trace data from IRISws with a 55-second timeout on HTTP requests, written directly to disk.
CHA = "UW.TDH..EHZ, UW.VLL..EHZ, CC.VALT..BHZ"
TS = u2d(time())
TT = -600
S = get_data("IRIS", CHA, s=TS, t=TT, y=true, to=55, w=true)
7. Request 10 minutes of continuous vertical-component data from a small May 2016 earthquake swarm at Mt. Hood, OR, USA, and cosine taper after download:
STA = "UW.HOOD.--.BHZ,CC.TIMB.--.EHZ"
TS = "2016-05-16T14:50:00"; TE = 600
S = get_data("IRIS", STA, s=TS, t=TE)
8. Grab data from a predetermined time window in two different formats
ts = "2016-03-23T23:10:00"
te = "2016-03-23T23:17:00"
S = get_data("IRIS", "CC.JRO..BHZ", s=ts, t=te, fmt="sacbl")
T = get_data("IRIS", "CC.JRO..BHZ", s=ts, t=te, fmt="miniseed")
Bad Requests¶
Failed data requests are saved to special channels whose IDs begin with “XX.FAIL”. The HTTP response message is stored as a String in :misc["msg"]
; display to STDOUT with println(stdout, S.misc[i]["msg"])
.
Unparseable data requests are saved to special channels whose IDs begin with “XX.FMT”. The raw response bytes are stored as an Array{UInt8,1} in :misc["raw"]
and can be dumped to file or parsed with external programs as needed.
One special channel is created per bad request.
SeedLink¶
SeedLink is a TCP/IP-based data transmission protocol that allows near-real-time access to data from thousands of geophysical monitoring instruments. See data keywords list and channel id syntax for options.
-
seedlink!(S, mode, chans, KWs)
-
seedlink!(S, mode, chans, patts, KWs)
-
S = seedlink(mode, chans, KWs)
Initiate a SeedLink session in DATA mode to feed data from channels chans
with selection patterns patts
to SeisData structure S
. A handle to a TCP connection is appended to S.c
. Data are periodically parsed until the connection is closed. One SeisData object can support multiple connections, provided that each connection’s streams feed unique channels.
Keywords¶
kw | def | type | meaning |
---|---|---|---|
gap | 3600 | R | a stream with no data in >gap seconds is considered offline |
kai | 600 | R | keepalive interval (s) |
port | 18000 | I | port number |
refresh | 20 | R | base refresh interval (s) [1] |
seq | S | Starting sequence hex value, like “5BE37A” | |
u | rtserve.iris.washington.edu | S | base SeedLink service URL, no “http://” |
x_on_err | true | Bool | exit on error? |
Table Footnotes
[1] | This value is a base value; a small amount is added to this number by each new SeedLink session to minimize the risk of congestion |
Change these with SeisIO.KW.SL.[key] = value, e.g., SeisIO.KW.SL.refresh = 30.
Special Behavior¶
- SeedLink follows unusual rules for wild cards in
sta
andpatts
: *
is not a valid SeedLink wild card.- The LOC and CHA fields can be left blank in
sta
to select all locations and channels.
- SeedLink follows unusual rules for wild cards in
- DO NOT feed one data channel from multiple SeedLink connections. This leads to TCP congestion on your computer, which can have severe consequences:
- A channel fed by multiple SeedLink connections will have many small segments, all out of order. merge! might fix this if caught quickly, but with hundreds of disjoint segments, expect memory and CPU issues.
- Your SeedLink connection will probably reset.
- Julia may freeze, requiring
kill -9
. To the best of our knowledge Julia has no special handling to mitigate TCP congestion. - Your data may be corrupted, including disk writes from w=true.
Special Methods¶
close(S.c[i])
ends SeedLink connectioni
.!deleteat(S.c, i)
removes a handle to closed SeedLink connectioni
.
SeedLink Utilities¶
-
SL_info
(v, url)¶
Retrieve SeedLink information at verbosity level v from url. Returns XML as a string. Valid strings for L are ID, CAPABILITIES, STATIONS, STREAMS, GAPS, CONNECTIONS, ALL.
-
has_sta
(sta[, u=url, port=n])¶
u
specifies the URL without “http://”Check that streams exist at url for stations sta, formatted NET.STA. Use “?” to match any single character. Returns true for stations that exist. sta can also be the name of a valid config file or a 1d string array.
Returns a BitArray with one value per entry in sta.
-
has_stream
(cha::Union{String, Array{String, 1}}, u::String)¶
-
has_stream
(sta::Array{String, 1}, sel::Array{String, 1}, u::String, port=N::Int, gap=G::Real)
u
specifies the URL without “http://”Check that streams with recent data exist at url u for channel spec cha, formatted NET.STA.LOC.CHA.DFLAG, e.g. “UW.TDH..EHZ.D, CC.HOOD..BH?.E”. Use “?” to match any single character. Returns true for streams with recent data. cha can also be the name of a valid config file.
If two arrays are passed to has_stream, the first should be formatted as SeedLink STATION patterns (SSSSS NN, e.g. [“TDH UW”, “VALT CC”]); the second should be an array of SeedLink selector patterns (LLCCC.D, e.g. [“??EHZ.D”, “??BH?.?”]).
Write Suppport¶
The table below sumamrizes the current write options for SeisIO. Each function is described in detail in this chapter.
Structure/Description | Output Format | Function | Submodule |
---|---|---|---|
GphysChannel | ASDF | write_hdf5 | |
GphysChannel | SAC timeseries | writesac | |
GphysChannel channel metadata | StationXML | write_sxml | |
GphysChannel instrument response | SAC polezero | writesacpz | |
GphysData | ASDF | write_hdf5 | |
GphysData | SAC timeseries | writesac | |
GphysData channel metadata | StationXML | write_sxml | |
GphysData instrument response | SAC polezero | writesacpz | |
SeisEvent | ASDF | write_hdf5 | |
SeisEvent header and source info | ASDF QuakeML | asdf_wqml | |
SeisEvent header and source info | QuakeML | write_qml | |
SeisEvent trace data only | SAC timeseries | writesac | |
Array{SeisEvent, 1} | ASDF QuakeML | asdf_wqml | |
Array{SeisHdr, 1} | QuakeML | write_qml | |
Array{SeisHdr, 1}, Array{SeisSrc, 1} | ASDF QuakeML | asdf_wqml | |
Array{SeisHdr, 1}, Array{SeisSrc, 1} | QuakeML | write_qml | |
SeisHdr | QuakeML | write_qml | |
SeisHdr, SeisSrc | ASDF QuakeML | asdf_wqml | |
SeisHdr, SeisSrc | QuakeML | wqml | |
any SeisIO structure | SeisIO file | wseis | |
primitive data type or array | ASDF AuxiliaryData | asdf_waux |
Methods for SeisEvent, SeisHdr, or SeisSrc are part of submodule SeisIO.Quake. asdf_waux and asdf_wqml are part of SeisIO.SeisHDF..
Write Functions¶
Functions are organized by file format.
HDF5/ASDF¶
-
write_hdf5
(fname, S)¶
Write data from S to file fname in a seismic HDF5 format. The default file format is ASDF.
With ASDF files, if typeof(S) == SeisEvent, S.hdr and S.source are written (appended) to the “QuakeML ” element.
Supported Keywords¶
KW | Type | Default | Meaning |
---|---|---|---|
add | Bool | false | Add new traces to file as needed? |
chans | ChanSpec | 1:S.n | Channels to write to file |
len | Period | Day(1) | Length of new traces added to file |
ovr | Bool | false | Overwrite data in existing traces? |
v | Integer | 0 | verbosity |
Write Method: Add (add=true)¶
Initializes new traces (filled with NaNs) of length = len as needed, and overwrite with data in appropriate places.
add=true follows these steps in this order: 1. Determine times of all data in S[chans] and all traces in “Waveforms/”. 2. For all data in S[chans] that cannot be written to an existing trace, a new trace of length = len sampled at S.fs[i] is initialized (filled with NaNs). 3. If a segment in S[chans] overlaps a trace in “Waveforms/” (including newly- created traces): + Merge the header data in S[chans] into the relevant station XML. + Overwrite the relevant segment(s) of the trace.
Unless len exactly matches the time boundaries of each segment in S, new traces will contain more data than S, with the extra samples initialized to NaNs. Presumably these will be replaced with real data in subsequent overwrites.
Write Method: Overwrite (ovr = true)¶
If ovr=true is specified, but add=false, write_hdf5 only overwrites existing data in hdf_out. * No new trace data objects are created in hdf_out. * No new file is created. If hdf_out doesn’t exist, nothing happens. * If no traces in hdf_out overlap segments in S, hdf_out isn’t modified. * Station XML is merged in channels that are partly overwritten.
QuakeML¶
-
write_qml
(fname, Ev::SeisEvent; v::Integer=0)¶
Write event metadata from SeisEvent Ev to file fname.
-
write_qml
(fname, SHDR::SeisHdr; v::Integer=0)
-
write_qml
(fname, SHDR::Array{SeisHdr, 1}; v::Integer=0)
Write QML to file fname from SHDR.
If fname exists, and is QuakeML, SeisIO appends the existing XML. If the
file exists, but is NOT QuakeML, an error is thrown; the file isn’t overwritten.
write_qml(fname, SHDR::SeisHdr, SSRC::SeisSrc; v::Integer=0) write_qml(fname, SHDR::Array{SeisHdr,1}, SSRC::Array{SeisSrc,1}; v::Integer=0)
Write QML to file fname from SHDR and SSRC.
Warning: To write data from SeisSrc structure R in array SSRC, it must be true that R.eid == H.id for some H in array SHDR.
SAC¶
-
writesac
(S::GphysData, chans=CC, v=V)¶
-
writesac
(C::GphysChannel; chans=CC, fname=FF, v=V)
Write SAC data to SAC files with auto-generated names. With any GphysChannel subtype, specifying fname=FF sets the filename to FF.
-
writesacpz
(pzf, S[, chans=CC])¶
Write fields from SeisIO structure S to SACPZ file pzf. Specify which channels to write in a GphysDaya structure with chans=CC.
SeisIO Native¶
-
wseis
(fname, S)¶
-
wseis
(fname, S, T, U...)
Write SeisIO data to file fname. Multiple objects can be written at once.
Station XML¶
-
write_sxml
(fname, S[, chans=CC])¶
Write station XML from the fields of S to file fname. Specify channel numbers to write in a GphysData object with chans=CC.
Use keyword chans=Cha to restrict station XML write to Cha. This keyword can accept an Integer, UnitRange, or Array{Int64,1} argument.
Data Processing¶
Supported processing operations are described below.
In most cases, a “safe” version of each function can be invoked to create a new object with the processed output.
Any function that can logically operate on a single-channel object will do so. Any function that operates on a SeisData object can be applied to the :data field of a SeisEvent object.
Basic Operations¶
These functions have no keywords that fundamentally change their behavior.
Remove the mean from all channels i with S.fs[i] > 0.0. Specify irr=true to also
remove the mean from irregularly sampled channels. Specify chans=CC to restrict
to channel number(s) CC. Ignores NaNs.
Remove the polynomial trend of degree n from every regularly-sampled channel
in S using a least-squares polynomial fit. Specify chans=CC to restrict
to channel number(s) CC. Ignores NaNs.
Compute the envelope of channel data in S. Only affects regularly-sampled
channels.
-
nanfill!(S)
For each channel i in S, replace all NaNs in S.x[i] with the mean
of non-NaN values.
-
resample!(S::GphysData [, chans=CC, fs=FS])
-
resample!(C::SeisChannel, fs::Float64)
Resample data in S to FS Hz. If keyword fs is not specified, data are resampled to the lowest non-zero value in S.fs[CC]. Note that a poor choice of fs can lead to upsampling and other bad behaviors.
Use keyword chans=CC to only resample channel numbers CC. By default,
all channels i with S.fs[i] > 0.0 are resampled.
-
unscale!(S[, chans=CC, irr=false])
Divide the gains from all channels i with S.fs[i] > 0.0. Specify chans=CC to restrict to channel number(s) CC. Specify irr=true to also remove gains of irregularly-sampled channels.
Customizable Operations¶
Convert Seismograms¶
Seismograms can be converted to or from displacement, velocity, or acceleration using convert_seis:
-
convert_seis!(S[, chans=CC, units_out=UU, v=V])
-
convert_seis!(C[, units_out=UU, v=V])
Converts all seismic data channels in S to velocity seismograms, differentiating or integrating as needed.
Keywords¶
- units_out=UU: specify output units: “m”, “m/s” (default), or “m/s²”
- chans=CC: restrict seismogram conversion to seismic data channels in CC
- v=V: verbosity
Behavior and Usage Warnings¶
Long Seismograms: convert_seis becomes less reversible as seismograms lengthen, especially at Float32 precision, due to loss of significance. At single (Float32) precision, seismograms with N ~ 10^6 samples are reconstructable after one conversion (e.g. “m” ==> “m/s” can be reversed, with output approximately equal to the original data). After multiple conversions (i.e., “m” ==> “m/s²” or “m/s²” ==> “m”), Float32 data cannot be perfectly reconstructed in this way, though reconstruction errors are typically small.
Rectangular Integration: integration is always rectangular; irregularly-spaced seismic data are not processed by convert_seis. Summation uses an in-place variant of Kahan-Babuška-Neumaier summation.
Fill Gaps¶
-
ungap!(S[, chans=CC, m=true, tap=false])
-
ungap!(C[, m=true, tap=false])
Fill time gaps in each channel with the mean of the channel data.
Keywords¶
- chans=CC: only ungap channels CC.
- m=false: this flag fills gaps with NaNs instead of the mean.
- tap=true: taper data before filling gaps.
Merge¶
-
merge!(S::GphysData, U::GphysData)
Merge two GphysData structures. For timeseries data, a single-pass merge-and-prune
operation is applied to value pairs whose sample times are separated by less than
half the sampling interval.
-
merge!(S::GphysData)
“Flatten” a GphysData structure by merging data from identical channels.
Merge Behavior¶
Which channels merge?¶
- Channels merge if they have identical values for
:id
,:fs
,:loc
,:resp
, and:units
. - An unset
:loc
,:resp
, or:units
field matches any set value in the corresponding field of another channel.
What happens to merged fields?¶
- The essential properties above are preserved.
- Other fields are combined.
- Merged channels with different :name values use the name of the channel with the latest data before the merge; other names are logged to :notes.
What does merge!
resolve?¶
Issue | Resolution |
---|---|
Empty channels | Delete |
Duplicated channels | Delete duplicate channels |
Duplicated windows in channel(s) | Delete duplicate windows |
Multiple channels, same properties(a) | Merge to a single channel |
Channel with out-of-order time windows | Sort in chronological order |
Overlapping windows, identical data, time-aligned | Windows merged |
Overlapping windows, identical data, small time offset(a) | Time offset corrected, windows merged |
Overlapping windows, non-identical data | Samples averaged, windows merged |
(a) “Properties” here are :id
, :fs
, :loc
, :resp
, and :units
.
(b) Data offset >4 sample intervals are treated as overlapping and non-identical.
When SeisIO Won’t Merge¶
SeisIO does not combine data channels if any of the five fields above are non-empty and different. For example, if a GphysData object S contains two channels, each with id “XX.FOO..BHZ”, but one has fs=100 Hz and the other fs=50 Hz, merge! does nothing.
It’s best to merge only unprocessed data. Data segments that were processed
independently (e.g. detrended) will be averaged pointwise when merged, which
can easily leave data in an unusuable state.
-
mseis!(S::GphysData, U::GphysData, ...)
Merge multiple GphysData structures into S.
Seismic Instrument Response¶
-
translate_resp!(S, resp_new[, chans=CC, wl=g])
-
translate_resp!(Ch, resp_new[, wl=g])
Translate the instrument response of seismic data channels to resp_new.
Replaces field :resp with resp_new for all affected channels.
-
remove_resp!(S, chans=CC, wl=g])
-
remove_resp!(Ch, wl=g])
Remove (flatten to DC) the instrument response of Ch, or of seismic data channels CC in S. Replaces :resp with the appropriate (all-pass) response.
Keywords¶
- C=cha restricts response translation for GphysData object S to channel(s) cha. Accepts an Integer, UnitRange, or Array{Int64,1} argument; does not accept string IDs. By default, all seismic data channels in S have their responses translated to resp_new.
- wl=g sets the waterlevel to g (default: g = eps(Float32) ~ 1.1f-7). The waterlevel is the minimum magnitude (absolute value) of the normalized old frequency response; in other words, if the old frequency response has a maximum magnitude of 1.0, then no response coefficient can be lower than g. This is useful to prevent “divide by zero” errors, but setting it too high will cause errors.
Precision and Memory Optimization¶
To optimize speed and memory use, instrument response translation maps data to Complex{Float32} before translation; thus, with Float64 data, there can be minor rounding errors.
Instrument responses are also memory-intensive. The minimum memory consumption to translate the response of a gapless Float32 SeisChannel object is ~7x the size of the object itself.
More precisely, for an object S (of Type <: GphysData or GphysChannel), translation requires memory ~ 2 kB + the greater of (7x the size of the longest Float32 segment, or 3.5x the size of the longest Float64 segment). Translation uses four vectors – three complex and one real – that are updated and dynamically resized as the algorithm loops over each segment:
- Old response container: Array{Complex{Float32,1}}(undef, Nx)
- New response container: Array{Complex{Float32,1}}(undef, Nx)
- Complex data container: Array{Complex{Float32,1}}(undef, Nx)
- Real frequencies for FFT: Array{Float32,1}(undef, Nx)
…where Nx is the number of samples in the longest segment in S.
Causality¶
Response translation adds no additional processing to guarantee causality. At
a minimum, most users will want to call detrend!
and taper!
before
translating instrument responses.
Synchronize¶
-
sync!(S::GphysData[, s=ST, t=EN, v=VV])
Synchronize all data in S to start at ST and terminate at EN with verbosity level VV.
For regularly-sampled channels, gaps between the specified and true times are filled with the mean; this isn’t possible with irregularly-sampled data.
Specifying start time (s)¶
- s=”last”: (Default) sync to the last start time of any channel in S.
- s=”first”: sync to the first start time of any channel in S.
- A numeric value is treated as an epoch time (?time for details).
- A DateTime is treated as a DateTime. (see Dates.DateTime for details.)
- Any string other than “last” or “first” is parsed as a DateTime.
Specifying end time (t)¶
- t=”none”: (Default) end times are not synchronized.
- t=”last”: synchronize all channels to end at the last end time in S.
- t=”first” synchronize to the first end time in S.
- numeric, datetime, and non-reserved strings are treated as for -s.
Taper¶
-
taper!(S[, keywords])
Cosine taper each channel in S around time gaps. Specify chans=CC to restrict
to channel number(s) CC. Does not modify irregularly-sampled data channels.
-
taper!(C[, keywords])
Cosine taper each segment of time-series data in GphysChannel object C that contains at least N_min total samples. Returns if C is irregularly sampled.
Keywords¶
- chans: Only taper the specified channels.
- N_min: Data segments with N < N_min total samples are not tapered.
- t_max: Maximum taper edge in seconds.
- α: Taper edge area; as for a Tukey window, the first and last 100*:math:alpha`% of samples in each window are tapered, up to `t_max seconds of data.
Zero-Phase Filter¶
-
filtfilt!(S::GphysData[; KWs])
Apply a zero-phase filter to regularly-sampled data in S. Irregularly-sampled data are never processed by filtfilt!.
-
filtfilt!(C::SeisChannel[; KWs])
Apply zero-phase filter to C.x. Filtering is applied to each contiguous data segment in C separately.
Keywords¶
KW | Default | Type | Description |
---|---|---|---|
chans | [] | (a) | channel numbers to filter |
fl | 1.0 | Float64 | lower corner frequency [Hz] (b) |
fh | 15.0 | Float64 | upper corner frequency [Hz] (b) |
np | 4 | Int64 | number of poles |
rp | 10 | Int64 | pass-band ripple (dB) |
rs | 30 | Int64 | stop-band ripple (dB) |
rt | “Bandpass” | String | response type (type of filter) |
dm | “Butterworth” | String | design mode (name of filter) |
Default filtering KW values can be changed by adjusting the shared keywords, e.g., SeisIO.KW.Filt.np = 2
changes the default number of poles to 2.
Submodules¶
Name | Purpose |
---|---|
ASCII | ASCII file formats (includes GeoCSV, SLIST, and variants) |
FastIO | Replacement low-level I/O functions to avoid thread locking |
Quake | Earthquake seismology |
RandSeis | Generate SeisIO structures with quasi-random entries |
SEED | Standard for the Exchange of Earthquake Data (SEED) file format |
SUDS | Seismic Unified Data System (SUDS) file format |
SeisHDF | Dedicated support for seismic HDF5 subformats |
UW | University of Washington data format |
Using Submodules¶
At the Julia prompt, type using SeisIO.NNNN where NNNN is the submodule name; for example, using SeisIO.Quake
loads the Quake submodule.
RandSeis¶
This submodule is used to quickly generate SeisIO objects with quasi-random field contents. Access it with “using SeisIO.RandSeis”
The following are true of all random data objects generated by the RandSeis module:
- Channels have SEED-compliant IDs, sampling frequencies, and data types.
- Random junk fills :notes and :misc.
- Sampling frequency (:fs) is chosen from a set of common values.
- Channel data are randomly generated.
- Time gaps are automatically inserted into regularly-sampled data.
-
randPhaseCat
()¶
Generate a random seismic phase catalog suitable for testing EventChannel,
EventTraceData, and SeisEvent objects.
-
randSeisChannel
([c=false, s=false])¶
Generate a SeisChannel of random data. Specify c=true for campaign-style
(irregularly-sampled) data (fs = 0.0); specify s=true to guarantee seismic data.
s=true overrides c=true.
-
randSeisData
([c=0.2, s=0.6])¶
-
randSeisData
(N[, c=0.2, s=0.6])
Generate 8 to 24 channels of random seismic data as a SeisData object. Specify N for a fixed number of channels.
- 100*c% of channels will have irregularly-sampled data (fs = 0.0)
- 100*s% of channels are guaranteed to have seismic data.
-
randSeisEvent
([c=0.2, s=0.6])¶
-
randSeisEvent
(N[, c=0.2, s=0.6])
Generate a SeisEvent structure filled with random values. Specify N for a fixed number of channels.
- 100*c% of channels will have irregularly-sampled data (fs = 0.0)
- 100*s% of channels are guaranteed to have seismic data.
-
randSeisHdr
()¶
Generate a SeisHdr structure filled with random values.
-
randSeisSrc
()¶
Generate a SeisSrc structure filled with random values.
SEED¶
Submodule for the Standard for the Exchange of Earthquake Data (SEED) file format; includes additional functionality.
-
dataless_support
()¶
Dump status of dataless SEED blockette support to stdout.
-
mseed_support
()¶
Dump status of mini-SEED blockette support to stdout.
-
seed_support
()¶
Dump status of SEED blockette support to stdout.
Scanning SEED Volumes¶
-
scan_seed
(fname)¶
Scan a single SEED file and report on the contents. Much faster than read_data as no samples are decoded/read and most blockettes are skipped. Control output behavior with keywords.
Supported Keywords¶
KW | Type | Default | Meaning |
---|---|---|---|
memmap | Bool | false | memory-map file? |
npts | Bool | true | output samples per channel? |
ngaps | Bool | true | output time gaps per channel? |
nfs | Bool | true | output number of fs vals per channel? |
quiet | Bool | false | true prints nothing to stdout |
seg_times | Bool | false | output exact gap times? |
fs_times | Bool | false | output exact times of fs changes? |
v | Integer | 0 | verbosity |
Note that seg_times and fs_times dump verbose per-channel tabulation to stdout.
Users are encourage to submit feature request Issues if there’s a need to scan for other changes within a SEED volume.
Interaction with Online Requests¶
scan_seed cannot interact directly with online SEED requests. As a workaround, do get_data(..., w=true)
to dump the raw request directly to disk, then scan the file(s) created.
UW¶
The UW submodule extends functionality for the University of Washington (UW) file format(s).
The UW data format was created in the 1970s by the Pacific Northwest Seismic Network (PNSN), USA, for event archival. It remained in use through the 1990s. A UW event is described by a pickfile and a corresponding data file, whose filenames were identical except for the last character. The data file is self-contained; the pick file is not required to read raw trace data. However, station locations were stored in an external text file.
Only UW-2 data files are supported by SeisIO. We have only seen UW-1 data files in Exabyte tapes from the 1980s.
-
uwpf
(pf[, v])¶
Read UW-format seismic pick file pf. Returns a tuple of (SeisHdr, SeisSrc).
-
uwpf!(W, pf[, v::Int64=KW.v])
Read UW-format seismic pick info from pickfile f into SeisEvent object W. Overwrites W.source and W.hdr with pickfile information. Keyword v controls verbosity.
Nodal¶
The Nodal submodule is intended to handle data from nodal arrays. Nodal arrays differ from standard seismic data in that the start and end times of data segments are usually synchronized.
Reading Nodal Data Files¶
-
S = read_nodal(fmt, fname [, KWs])
Read data in file format fmt from file fname into memory. Returns a NodalData object.
Supported Keywords¶
KW | Type | Default | Meaning |
---|---|---|---|
chans | ChanSpec | Int64[] | channels to read |
nn | String | N0 | network subfield in :id |
s | TimeSpec | 0001-01-01T00:00:00 | start time |
t | TimeSpec | 9999-12-31T12:59:59 | end time |
v | Integer | 0 | verbosity |
Non-Standard Behavior¶
Real values supplied to keywords s=
and t=
are treated as seconds relative to file begin time. Most SeisIO functions that accept TimeSpec arguments treat Real values as seconds relative to now()
.
Supported File Formats¶
File Format | String | Notes |
---|---|---|
Silixa TDMS | silixa | Limited support; see below |
SEG Y | segy | Field values are different from read_data output |
Silixa TDMS Support Status¶
- Currently only reads file header and samples from first block
- Not yet supported (test files needed):
- first block additional samples
- second block
- second block additional samples
- Awaiting manufacturer clarification:
- parameters in :info
- position along cable; currently loc.(x,y,z) = 0.0 for all channels
- frequency response; currently
:resp
is an all-pass placeholder
Nodal SEG Y Support Status¶
See SEG Y Support.
Working with NodalData objects¶
NodalData objects have one major structural difference from SeisData objects: the usual data field :x is a set of views to an Array{Float32, 2} (equivalent to a Matrix{Float32}) stored in field :data. This allows the user to apply two-dimensional data processing operations directly to the data matrix.
NodalData Assumptions¶
S.t[i]
is the same for all i.S.fs[i]
is constant for all i.length(S.x[i])
is constant for all i.
Other Differences from SeisData objects¶
- Operations like push! and append! must regenerate
:data
usinghcat()
, and therefore consume a lot of memory. - Attempting to push! or append! channels of unequal length throws an error.
- Attempting to push! or append! same-length channels with different
:t
or:fs
won’t synchronize them! You will instead have columns in:data
that aren’t time-aligned. - Irregularly-sampled data (
:fs = 0.0
) are not supported.
Types¶
See docstrings for field names and descriptions.
-
NodalLoc
()¶
Nodal location. Currently only stores position along optical cable.
-
NodalData
()¶
Structure to hold nodal array data. Similar to a SeisData object.
-
NodalChannel
()¶
A single channel of data from a nodal array. Similar to a SeisChannel object.
Quake¶
The Quake submodule was introduced in SeisIO v0.3.0 to isolate handling of discrete earthquake events from handling of continuous geophysical data. While the channel data are similar, fully describing an earthquake event requires many additional Types (objects) and more information (fields) in channel descriptors.
Types¶
See Type help text for field descriptions and SeisIO behavior.
-
EQMag
()¶
Earthquake magnitude object.
-
EQLoc
()¶
Structure to hold computed earthquake location data.
-
EventChannel
()¶
A single channel of trace data (digital seismograms) associated with a
discrete event (earthquake).
-
EventTraceData
()¶
A custom structure designed to describe trace data (digital seismograms)
associated with a discrete event (earthquake).
-
PhaseCat
()¶
A seismic phase catalog is a dictionary with phase names for keys (e.g. “pP”, “PKP”)
and SeisPha objects for values.
-
SeisEvent
()¶
A compound Type comprising a SeisHdr (event header), SeisSrc (source process),
and EventTraceData (digital seismograms.)
-
SeisHdr
()¶
Earthquake event header object.
-
SeisPha
()¶
A description of a seismic phase measured on a data channel.
-
SeisSrc
()¶
Seismic source process description.
-
SourceTime
()¶
QuakeML-compliant seismic source-time parameterization.
Web Queries¶
Keyword descriptions for web queries appear at the end of this section.
-
FDSNevq
(ot)¶
Event header query. Multi-server query for the event(s) with origin time(s) closest to ot. Returns a tuple consisting of an Array{SeisHdr,1} and an Array{SeisSrc,1}, so that the i`th entry of each array describes the header and source process of event `i.
Keywords: evw, mag, nev, rad, reg, src, to, v
Notes¶
- Specify ot as a string formatted YYYY-MM-DDThh:mm:ss in UTC (e.g. “2001-02-08T18:54:32”).
- Incomplete string queries are read to the nearest fully-specified time constraint; thus, FDSNevq(“2001-02-08”) returns the nearest event to 2001-02-08T00:00:00.
- If no event is found in the specified search window, FDSNevq exits with an error.
- For FDSNevq, keyword src can be a comma-delineated list of sources, provided each has a value in ?seis_www; for example,
src="IRIS, INGV, NCEDC"
is valid.
-
FDSNevt
(ot::String, chans::String)¶
Get header and trace data for the event closest to origin time ot on channels chans. Returns a SeisEvent structure.
Keywords: evw, fmt, len, mag, model, nd, opts, pha, rad, reg, src, to, v, w
Notes¶
- Specify ot as a string formatted YYYY-MM-DDThh:mm:ss in UTC (e.g. “2001-02-08T18:54:32”).
- Incomplete string queries are read to the nearest fully-specified time constraint; thus, FDSNevq(“2001-02-08”) returns the nearest event to 2001-02-08T00:00:00.
- If no event is found in the specified search window, FDSNevt exits with an error.
- Unlike FDSNevq, number of events cannot be specified and src must be a single source String in ?seis_www.
-
get_pha!(S::Data[, keywords])
Command-line interface to IRIS online travel time calculator, which calls TauP. Returns a matrix of strings.
Keywords: pha, model, to, v
References¶
- TauP manual: http://www.seis.sc.edu/downloads/TauP/taup.pdf
- Crotwell, H. P., Owens, T. J., & Ritsema, J. (1999). The TauP Toolkit: Flexible seismic travel-time and ray-path utilities, SRL 70(2), 154-160.
Web Query Keywords¶
KW | Default | T [1] | Meaning |
---|---|---|---|
evw | [600.0, 600.0] | A{F,1} | search window in seconds [2] |
fmt | “miniseed” | S | request data format |
len | 120.0 | I | desired trace length [s] |
mag | [6.0, 9.9] | A{F,1} | magnitude range for queries |
model | “iasp91” | S | Earth velocity model for phase times |
nd | 1 | I | number of days per subrequest |
nev | 0 | I | number of events returned per query [3] |
opts | “” | S | user-specified options [4] |
pha | “P” | S | phases to get [5] |
rad | [] | A{F,1} | radial search region [6] |
reg | [] | A{F,1} | rectangular search region [7] |
src | “IRIS” | S | data source; type ?seis_www for list |
to | 30 | I | read timeout for web requests [s] |
v | 0 | I | verbosity |
w | false | B | write requests to disk? [8] |
Table Footnotes
[1] | Types: A = Array, B = Boolean, C = Char, DT = DateTime, F = Float, I = Integer, S = String, U8 = Unsigned 8-bit integer (UInt8) |
[2] | search range is always ot-|evw[1]| ≤ t ≤ ot+|evw[2]| |
[3] | nev=0 returns all events in the query |
[4] | String is passed as-is, e.g. “szsrecs=true&repo=realtime” for FDSN. String should not begin with an ampersand. |
[5] | Comma-separated String, like “P, pP”; use “ttall” for all phases |
[6] | Specify region [center_lat, center_lon, min_radius, max_radius, dep_min, dep_max], with lat, lon, and radius in decimal degrees (°) and depth in km with + = down. Depths are only used for earthquake searches. |
[7] | Specify region [lat_min, lat_max, lon_min, lon_max, dep_min, dep_max], with lat, lon in decimal degrees (°) and depth in km with + = down. Depths are only used for earthquake searches. |
[8] | If w=true, a file name is automatically generated from the request parameters, in addition to parsing data to a SeisData structure. Files are created from the raw download even if data processing fails, in contrast to get_data(… wsac=true). |
Example¶
Get seismic and strainmeter records for the P-wave of the Tohoku-Oki great earthquake on two borehole stations and write to native SeisData format:
S = FDSNevt("201103110547", "PB.B004..EH?,PB.B004..BS?,PB.B001..BS?,PB.B001..EH?")
wseis("201103110547_evt.seis", S)
Utility Functions¶
-
distaz!(Ev::SeisEvent)
Compute distance, azimuth, and backazimuth by the Haversine formula.
Overwrites Ev.data.dist, Ev.data.az, and Ev.data.baz.
-
gcdist
([lat_src, lon_src, ]rec)¶
Compute great circle distance, azimuth, and backazimuth from a single source
with coordinates [s_lat, s_lon] to receivers rec with coordinates
[r_lat r_lon] in each row.
-
show_phases
(P::PhaseCat)¶
Formatted display of seismic phases in dictionary P.
-
fill_sac_evh!(Ev::SeisEvent, fname[; k=N])
Fill (overwrite) values in Ev.hdr with data from SAC file fname. Keyword k=i specifies the reference channel i from which the absolute origin time Ev.hdr.ot is set. Potentially affects header fields :id, :loc (subfields .lat, .lon, .dep only), and :ot.
Reading Earthquake Data Files¶
-
S = read_quake(fmt::String, filename [, KWs])
?SeisIO.KW
.Supported File Formats¶
File Format | String | Notes |
---|---|---|
PC-SUDS | suds | |
QuakeML | qml, quakeml | only reads first event from file |
UW | uw |
Supported Keywords¶
KW | Used By | Type | Default | Meaning |
---|---|---|---|---|
full | suds, uw | Bool | false | read full header into :misc? |
v | all | Integer | 0 | verbosity |
QuakeML¶
-
read_qml
(fpat::String)¶
Read QuakeML files matching string pattern fpat. Returns a tuple containing an array of SeisHdr objects H and an array of SeisSrc objects R. Each pair (H[i], R[i]) describes the preferred location (origin, SeisHdr) and event source (focal mechanism or moment tensor, SeisSrc) of event i.
If multiple focal mechanisms, locations, or magnitudes are present in a single Event element of the XML file(s), the following rules are used to select one of each per event:
Non-essential QuakeML data are saved to misc in each SeisHdr or SeisSrc object as appropriate.
-
write_qml
(fname, Ev::SeisEvent; v::Integer=0)
See writing.
SeisHDF¶
This submodule contains dedicated support for seismic subformats of the HDF5 file format.
Additional Functions¶
-
asdf_waux
(fname, path, X)¶
Write X to AuxiliaryData/path in file fname. If an object already exists at AuxiliaryData/path, it will be deleted and overwritten with X.
-
asdf_rqml
(fpat)¶
Read QuakeXML (qml) from ASDF file(s) matching file string pattern fpat. Returns:
- H, Array{SeisHdr,1}
- R, Array{SeisSrc,1}
-
asdf_wqml
(fname, H, R[, keywords])¶
-
asdf_wqml
(fname, EV[, keywords])
Write to ASDF “QuakeML ” group in file fname. In the above function calls, H can be a SeisHdr or Array{SeisHdr, 1}; R can be a SeisSource or Array{SeisSource, 1}; EV can be a SeisEvent or Array{SeisEvent, 1}.
Supported Keywords¶
KW | Type | Default | Meaning |
---|---|---|---|
ovr | Bool | false | Overwrite data in existing traces? |
v | Integer | 0 | verbosity |
-
read_asdf_evt
(filestr, event_id[, keywords])¶
-
read_asdf_evt
(filestr[, keywords])
Read data in seismic HDF5 format with ids matching event_id from files matching pattern filestr. Returns an array of SeisEvent structures. With only one input argument, all event IDs are read.
Keywords:
- msr: (Bool) read full (MultiStageResp) instrument response?
- v: (Integer) verbosity level
-
scan_hdf5
(h5file)¶
-
scan_hdf5
(hdf, level="trace")
Scan HDF5 archive h5file and return station names with waveform data contained therein as a list of strings formatted “nn.sssss” (network.station).
Set level=”trace” to return channel names with waveform data as a list of strings formatted “nn.sssss.ll.ccc” (network.station.location.channel).
Appendices¶
Time Syntax¶
Functions that allow time specification use two reserved keywords or arguments to track time:
- s: Start (begin) time
- t: Termination (end) time
Specify each as a DateTime, Real, or String.
- Real numbers are interpreted as seconds. Special behavior is invoked when both s and t are of Type Real.
- DateTime values should follow Julia documentation
- Strings have the expected format spec
YYYY-MM-DDThh:mm:ss.ssssss
- Fractional second is optional and accepts up to 6 decimal places (μs)
- Incomplete time Strings treat missing fields as 0.
- Example: s=”2016-03-23T11:17:00.333”
It isn’t necessary to choose values so that s ≤ t. The two values are always sorted, so that t < s doesn’t error.
Time Types and Behavior¶
typeof(s) | typeof(t) | Behavior |
---|---|---|
DateTime | DateTime | convert to String, then sort |
DateTime | Real | add t seconds to s, convert to String, then sort |
DateTime | String | convert s to String, then sort |
Real | DateTime | add s seconds to t, convert to String, then sort |
Real | Real | treat as relative (see below), convert to String, sort |
Real | String | add s seconds to t, convert to String, then sort |
String | DateTime | convert t to String, then sort |
String | Real | add t seconds to s, convert to String, then sort |
String | String | sort |
Special Behavior with two Real arguments¶
If s and t are both Real numbers, they’re treated as seconds measured relative to the start of the current minute. This convention may seem unusual, but it greatly simplifies web requests; for example, specifying s=-1200.0, t=0.0 is a convenient shorthand for “the last 20 minutes of data”.
Data Requests Syntax¶
Channel ID Syntax¶
NN.SSSSS.LL.CC
(net.sta.loc.cha, separated by periods) is the expected syntax for all web functions. The maximum field width in characters corresponds to the length of each field (e.g. 2 for network). Fields can’t contain whitespace.
NN.SSSSS.LL.CC.T
(net.sta.loc.cha.tflag) is allowed in SeedLink. T
is a single-character data type flag and must be one of DECOTL
: Data, Event, Calibration, blOckette, Timing, or Logs. Calibration, timing, and logs are not in the scope of SeisIO and may crash SeedLink sessions.
The table below specifies valid types and expected syntax for channel lists.
Type | Description | Example |
---|---|---|
String | Comma-delineated list of IDs | “PB.B004.01.BS1, PB.B002.01.BS1” |
Array{String,1} | String array, one ID string per entry | [“PB.B004.01.BS1”, “PB.B002.01.BS1”] |
Array{String,2} | String array, one set of IDs per row | [“PB” “B004” “01” “BS1”; |
“PB” “B002” “01” “BS1”] |
The expected component order is always network, station, location, channel; thus, “UW.TDH..EHZ” is OK, but “UW.TDH.EHZ” fails.
-
chanspec
()¶
Type ?chanspec
in Julia to print the above info. to stdout.
Wildcards and Blanks¶
Allowed wildcards are client-specific.
- The LOC field can be left blank in any client:
"UW.ELK..EHZ"
and["UW" "ELK" "" "EHZ"]
are all valid. Blank LOC fields are set to--
in IRIS,*
in FDSN, and??
in SeedLink. ?
acts as a single-character wildcard in FDSN & SeedLink. Thus,CC.VALT..???
is valid.*
acts as a multi-character wildcard in FDSN. Thus,CC.VALT..*
andCC.VALT..???
behave identically in FDSN.- Partial specifiers are OK, but a network and station are always required:
"UW.EL?"
is OK,".ELK.."
fails.
Channel Configuration Files¶
One entry per line, ASCII text, format NN.SSSSS.LL.CCC.D. Due to client-specific wildcard rules, the most versatile configuration files are those that specify each channel most completely:
# This only works with SeedLink
GE.ISP..BH?.D
NL.HGN
MN.AQU..BH?
MN.AQU..HH?
UW.KMO
CC.VALT..BH?.D
# This works with FDSN and SeedLink, but not IRIS
GE.ISP..BH?
NL.HGN
MN.AQU..BH?
MN.AQU..HH?
UW.KMO
CC.VALT..BH?
# This works with all three:
GE.ISP..BHZ
GE.ISP..BHN
GE.ISP..BHE
MN.AQU..BHZ
MN.AQU..BHN
MN.AQU..BHE
MN.AQU..HHZ
MN.AQU..HHN
MN.AQU..HHE
UW.KMO..EHZ
CC.VALT..BHZ
CC.VALT..BHN
CC.VALT..BHE
Server List¶
String Source BGR http://eida.bgr.de EMSC http://www.seismicportal.eu ETH http://eida.ethz.ch GEONET http://service.geonet.org.nz GFZ http://geofon.gfz-potsdam.de ICGC http://ws.icgc.cat INGV http://webservices.ingv.it IPGP http://eida.ipgp.fr IRIS http://service.iris.edu ISC http://isc-mirror.iris.washington.edu KOERI http://eida.koeri.boun.edu.tr LMU http://erde.geophysik.uni-muenchen.de NCEDC http://service.ncedc.org NIEP http://eida-sc3.infp.ro NOA http://eida.gein.noa.gr ORFEUS http://www.orfeus-eu.org RESIF http://ws.resif.fr SCEDC http://service.scedc.caltech.edu TEXNET http://rtserve.beg.utexas.edu USGS http://earthquake.usgs.gov USP http://sismo.iag.usp.br
SeisIO Standard Keywords¶
SeisIO.KW is a memory-resident structure of default values for common keywords used by package functions. KW has one substructure, SL, with keywords specific to SeedLink. These defaults can be modified, e.g., SeisIO.KW.nev=2 changes the default for nev to 2.
KW | Default | T [1] | Meaning |
---|---|---|---|
comp | 0x00 | U8 | compress data on write? [2] |
fmt | “miniseed” | S | request data format [3] |
mag | [6.0, 9.9] | A{F,1} | magnitude range for queries |
n_zip | 100000 | I | compress if length(:x) > n_zip |
nd | 1 | I | number of days per subrequest |
nev | 1 | I | number of events returned per query |
nx_add | 360000 | I | length increase of undersized data array |
nx_new | 8640000 | I | number of samples for a new channel |
opts | “” | S | user-specified options [4] |
prune | true | B | call prune! after get_data? |
rad | [] | A{F,1} | radial search region [5] |
reg | [] | A{F,1} | rectangular search region [6] |
si | true | B | autofill station info on data req? [7] |
src | “IRIS” | S | data source; type ?seis_www for list |
to | 30 | I | read timeout for web requests (s) |
v | 0 | I | verbosity |
w | false | B | write requests to disk? [8] |
y | false | B | sync data after web request? |
Table Footnotes
[1] | Types: A = Array, B = Boolean, C = Char, DT = DateTime, F = Float, I = Integer, S = String, U8 = Unsigned 8-bit integer (UInt8) |
[2] | If KW.comp == 0x00, never compress data; if KW.comp == 0x01, only compress channel i if length(S.x[i]) > KW.n_zip; if comp == 0x02, always compress data. |
[3] | Strings have the same names and spellings as file formats in read_data. Note that “sac” in a web request is aliased to “sacbl”, i.e., binary little-endian SAC, to match the native endianness of the Julia language. |
[4] | String is passed as-is, e.g. “szsrecs=true&repo=realtime” for FDSN. String should not begin with an ampersand. |
[5] | Specify region [center_lat, center_lon, min_radius, max_radius, dep_min, dep_max], with lat, lon, and radius in decimal degrees (°) and depth in km with + = down. Depths are only used for earthquake searches. |
[6] | Specify region [lat_min, lat_max, lon_min, lon_max, dep_min, dep_max], with lat, lon in decimal degrees (°) and depth in km with + = down. Depths are only used for earthquake searches. |
[7] | FDSNWS timeseries only. |
[8] | If w=true, a file name is automatically generated from the request parameters, in addition to parsing data to a SeisData structure. Files are created from the raw download even if data processing fails, in contrast to get_data(… wsac=true). |
Utility Functions¶
This appendix covers utility functions that belong in no other category.
-
d2u
(DT::DateTime)¶
Aliased to Dates.datetime2unix
.
-
fctoresp
(fc, c)¶
Generate a generic PZResp object for a geophone with critical frequency fc
and damping constant c
. If no damping constant is specified, assumes c = 1/sqrt(2).
-
find_regex
(path::String, r::Regex)¶
OS-agnostic equivalent to Linux find. First argument is a path string, second is a Regex. File strings are postprocessed using Julia’s native PCRE Regex engine. By design, find_regex only returns file names.
-
getbandcode
(fs, fc=FC)¶
Get SEED-compliant one-character band code corresponding to instrument sample rate fs
and corner frequency FC
. If unset, FC
is assumed to be 1 Hz.
-
get_file_ver
(fname::String)¶
Get the version of a SeisIO native format file.
-
get_seis_channels
(S::GphysData)¶
Get numeric indices of channels in S whose instrument codes indicate seismic data.
-
guess
(fname::String)¶
Attempt to guess data file format and endianness using known binary file markers.
-
inst_code
(C::GphysChannel)¶
-
inst_code
(S::GphysData, i::Int64)
-
inst_code
(S::GphysData)
Get instrument codes.
-
ls
(s::String)¶
Similar functionality to Bash ls with OS-agnostic output. Accepts wildcards in paths and file names. * Always returns the full path and file name. * Partial file name wildcards (e.g. “ls(data/2006*.sac)) invoke glob. * Path wildcards (e.g. ls(/data/*/*.sac)) invoke find_regex to circumvent glob limitations. * Passing ony “*” as a filename (e.g. “ls(/home/*)) invokes find_regex to recursively search subdirectories, as in the Bash shell.
-
ls
()
Return full path and file name of files in current working directory.
-
j2md
(y, j)¶
Convert Julian day j of year y to month, day.
-
md2j
(y, m, d)¶
Convert month m, day d of year y to Julian day j.
Remove unwanted characters from S.
-
parsetimewin
(s, t)¶
Convert times s and t to strings \(\alpha, \omega\) sorted \(\alpha < \omega\). s and t can be real numbers, DateTime objects, or ASCII strings. Expected string format is “yyyy-mm-ddTHH:MM:SS.nnn”, e.g. 2016-03-23T11:17:00.333.
-
resp_a0!(R::InstrumentResponse)
-
resp_a0!(S::GphysData)
Update sensitivity :a0 of PZResp/PZResp64 responses.
-
resptofc
(R)¶
Attempt to guess critical frequency from poles and zeros of a PZResp/PZResp64.
-
set_file_ver
(fname::String)¶
Sets the SeisIO file version of file fname.
-
u2d
(x)¶
Alias to Dates.unix2datetime
.
-
validate_units
(S::GphysData)¶
Validate strings in :units field to ensure UCUM compliance.
-
vucum
(str::String)¶
Check whether str
contains valid UCUM units.
.. _seisio_file_format:
SeisIO Native Format¶
Invoking the command wseis writes SeisIO structures to a native data format in little-endian byte order. This page documents the low-level file format. Abbreviations used:
Type | Meaning | C | Fortran 77 |
---|---|---|---|
Char | Unicode character | wchar | CHARACTER*4 |
Float32 | 32-bit float | float | REAL |
Float64 | 64-bit float | double | REAL*8 |
Int8 | signed 8-bit int | short | INTEGER |
Int16 | signed 16-bit int | int | INTEGER*2 |
Int32 | signed 32-bit int | long | INTEGER*4 |
Int64 | signed 64-bit integer | long long | INTEGER*8 |
UInt8 | unsigned 8-bit int | unsigned short | CHARACTER |
UInt16 | unsigned 16-bit int | unsigned | |
UInt32 | unsigned 32-bit int | unsigned long | |
UInt64 | unsigned 64-bit int | unsigned long long |
Note that String in Julia has no exact C equivalence. SeisIO writes each String
in two parts: an Int64 (String length in bytes) followed by the String contents
(as bytes, equivalent to UInt8). Unlike C/Fortran, there are no issues with
strings that contain the null character (0x00 or '\x0'
).
SeisIO File¶
Var | Meaning | T | N |
---|---|---|---|
“SEISIO” | UInt8 | 6 | |
V |
SeisIO file format version | Float32 | 1 |
J |
# of SeisIO objects in file | UInt32 | 1 |
C |
SeisIO object codes for each object | UInt32 | J |
B |
Byte indices for each object | UInt64 | J |
{ | for i = 1:J | ||
(Objects) | variable | J | |
} | |||
ID |
ID hashes | UInt64 | variable |
TS |
Start times | Int64 | variable |
TE |
End times | Int64 | variable |
P |
Parent object index in C and B | variable | |
bID |
Byte offset of ID array |
Int64 | 1 |
bTS |
Byte offset of TS array |
Int64 | 1 |
bTE |
Byte offset of TE array |
Int64 | 1 |
bP |
Byte offset of P array |
Int64 | 1 |
ID, TS, and TE are the ID, data start time, and data end time of each channel in each object. P is the index of the parent object in C and B. TS and TE are measured from Unix epoch time (1970-01-01T00:00:00Z) in integer microseconds.
Intent: when seeking data from channel i
between times s
and t
,
if hash(i)
matches ID[j]
and the time windows overlap, retrieve index
k = P[j]
from NP, seek to byte offset B[k]
, and read an object of
type C[k]
from file.
If an archive contains no data objects, ID, TS, TE, and P are empty;
equivalently, bID == bTS
.
Simple Object Types¶
Fields of these objects are written in one of three ways: as “plain data” types, such as UInt8 or Float64; as arrays; or as strings.
In a simple object, each array is stored as follows: 1. Int64 number of dimensions (e.g. 2) 2. Int64 array of dimensions themselves (e.g. 2, 2) 3. Array values (e.g. 0.08250153, 0.023121119, 0.6299772, 0.79595184)
EQLoc¶
Var | Type | N | Meaning |
---|---|---|---|
lat | Float64 | 1 | latitude |
lon | Float64 | 1 | longitude |
dep | Float64 | 1 | depth |
dx | Float64 | 1 | x-error |
dy | Float64 | 1 | y-error |
dz | Float64 | 1 | z-error |
dt | Float64 | 1 | t-error (error in origin time) |
se | Float64 | 1 | standard error |
rms | Float64 | 1 | rms pick error |
gap | Float64 | 1 | azimuthal gap |
dmin | Float64 | 1 | minimum source-receiver distance in location |
dmax | Float64 | 1 | maximum source-receiver distance in location |
nst | Int64 | 1 | number of stations used to locate earthquake |
flags | UInt8 | 1 | one-bit flags for special location properties |
Ld | Int64 | 1 | length of “datum” string in bytes |
datum | UInt8 | Ld | Datum string |
Lt | Int64 | 1 | length of “typ” (event type) string in bytes |
typ | UInt8 | Lt | earthquake type string |
Li | Int64 | 1 | length of “sig” (error significance) string in bytes |
sig | UInt8 | Li | earthquake location error significance string |
Lr | Int64 | 1 | length of “src” (data source) string in bytes |
src | UInt8 | Lr | data source string |
>>(<<(flags, n-1), 7)
.EQMag¶
Var | Type | N | Meaning |
---|---|---|---|
val | Float32 | 1 | magnitude value |
gap | Float64 | 1 | largest azimuthal gap between stations in magnitude |
nst | Int64 | 1 | number of stations used in magnitude computation |
Lsc | Int64 | 1 | length of magnitude scale string |
msc | UInt8 | Lsc | magnitude scale string |
Lr | Int64 | 1 | length of data source string |
src | UInt8 | Lr | data source string |
SeisPha¶
Var | Type | N | Meaning |
---|---|---|---|
F | Float64 | 8 | amplitude, distance, incidence angle, residual, |
ray parameter, takeoff angle, travel time, uncertainty | |||
C | Char | 2 | polarity, quality |
SourceTime¶
Var | Type | N | Meaning |
---|---|---|---|
Ld | Int64 | 1 | size of descriptive string in bytes |
desc | UInt8 | 1 | descriptive string |
F | Float64 | 3 | duration, rise time, decay time |
StringVec¶
A vector of variable-length strings; its exact Type in Julia is Array{String,1}.
Var | Type | N | Meaning |
---|---|---|---|
ee | UInt8 | 1 | is this string vector empty? [9] |
L | Int64 | 1 | number of strings to read |
{ | i = 1:L | ||
nb | Int64 | 1 | length of string in bytes |
str | UInt8 | nb | string |
} |
[9] | If ee == 0x00 , then no values are stored for L, nb, or str. |
Location Types¶
GenLoc¶
Var | Type | N | Meaning |
---|---|---|---|
Ld | Int64 | 1 | length of datum string in bytes |
datum | UInt8 | Ld | datum string |
Ll | Int64 | 1 | length of location vector in bytes |
loc | Float64 | Ll | location vector |
GeoLoc¶
Var | Type | N | Meaning |
---|---|---|---|
Ld | Int64 | 1 | length of datum string in bytes |
datum | UInt8 | Ld | datum string |
F | Float64 | 6 | latitude, longitude, elevation, |
depth, azimuth, incidence |
UTMLoc¶
Var | Type | N | Meaning |
---|---|---|---|
Ld | Int64 | 1 | length of datum string in bytes |
datum | UInt8 | N | datum string |
zone | Int8 | 1 | UTM zone number |
hemi | Char | 1 | hemisphere |
E | UInt64 | 1 | Easting |
N | UInt64 | 1 | Northing |
F | Float64 | 4 | elevation, depth, azimuth, incidence |
XYLoc¶
Var | Type | N | Meaning |
---|---|---|---|
Ld | Int64 | 1 | Length of datum string in bytes |
datum | UInt8 | Ld | datum string |
F | Float64 | 8 | x, y, z, azimuth, incidence, origin x, origin y, origin z |
Response Types¶
GenResp¶
Var | Type | N | Meaning |
---|---|---|---|
Ld | Int64 | 1 | length of descriptive string in bytes |
desc | UInt8 | Ld | descriptive string |
nr | Int64 | 1 | Number of rows in complex response matrix |
nc | Int64 | 1 | Number of columns in complex response matrix |
resp | Complex{Float64,2} | nr*nc | complex response matrix |
PZResp¶
Var | Type | N | Meaning |
---|---|---|---|
c | Float32 | 1 | damping constant |
np | Int64 | 1 | number of complex poles |
p | Complex{Float32,1} | np | complex poles vector |
nz | Int64 | 1 | number of complex zeros |
z | Complex{Float32,1} | nz | complex zeros vector |
PZResp64 is identical to PZResp with Float64 values for c, p, z, rather than Float32.
The Misc Dictionary¶
Most compound objects below contain a dictionary (Dict{String,Any}) for
non-essential information in a field named :misc
. The tables below describe
how this field is written to disk.
Misc¶
Var | Type | N | Meaning |
---|---|---|---|
N | Int64 | 1 | number of items in dictionary [10] |
K | (StringVec) | 1 | dictionary keys |
{ | for i = 1:N | ||
c | UInt8 | 1 | Type code of object i |
o | variable | 1 | object i |
} |
[10] | If N == 0 , then N is the only value present. |
Dictionary Contents¶
These subtables describe how to read the possible data types in a Misc dictionary.
String Array (c == 0x81)¶
Var | Type | N | Meaning |
---|---|---|---|
A | (StringVec) | 1 | string vector |
Other Array (c == 0x80 or c > 0x81)¶
Var | Type | N | Meaning |
---|---|---|---|
nd | Int64 | 1 | number of dimensions in array |
dims | Int64 | nd | array dimensions |
arr | varies | prod(nd) | array |
String (c == 0x01)¶
Var | Type | N | Meaning |
---|---|---|---|
L | Int64 | 1 | size of string in bytes |
str | UInt8 | 1 | string |
Compound Object Types¶
Each of these objects contains at least one of the above simple object types.
PhaseCat¶
Var | Type | N | Meaning |
---|---|---|---|
N | Int64 | 1 | number of SeisPha objects to read [11] |
K | (StringVec) | 1 | dictionary keys |
pha | (SeisPha) | N | seismic phases |
[11] | If N == 0 , then N is the only value present. |
EventChannel¶
A single channel of data related to a seismic event
Var | Type | N | Meaning |
---|---|---|---|
Ni | Int64 | 1 | size of id string in bytes |
id | UInt8 | Ni | id string |
Nn | Int64 | 1 | size of name string in bytes |
name | UInt8 | Nn | name string |
Lt | UInt8 | 1 | location Type code |
loc | (Loc Type) | 1 | instrument position |
fs | Float64 | 1 | sampling frequency in Hz |
gain | Float64 | 1 | scalar gain |
Rt | UInt8 | 1 | response Type code |
resp | (Resp Type) | 1 | instrument response |
Nu | Int64 | 1 | size of units string in bytes |
units | UInt8 | Nu | units string |
az | Float64 | 1 | azimuth |
baz | Float64 | 1 | backazimuth |
dist | Float64 | 1 | source-receiver distance |
pha | (PhaseCat) | 1 | phase catalog |
Nr | Int64 | 1 | size of data source string in bytes |
src | UInt8 | Nr | data source string |
misc | (Misc) | 1 | dictionary for non-essential information |
notes | (StringVec) | 1 | notes and automated logging |
Nt | Int64 | 1 | length of time gaps matrix |
T | Int64 | 2Nt | time gaps matrix |
Xc | UInt8 | 1 | Type code of data vector |
Nx | Int64 | 1 | number of samples in data vector |
X | variable | NX | data vector |
SeisChannel¶
A single channel of univariate geophysical data
Var Type N Meaning Ni Int64 1 size of id string in bytes id UInt8 Ni id string Nn Int64 1 size of name string in bytes name UInt8 Nn name string Lt UInt8 1 location Type code loc (Loc Type) 1 instrument position fs Float64 1 sampling frequency in Hz gain Float64 1 scalar gain Rt UInt8 1 response Type code resp (Resp Type) 1 instrument response Nu Int64 1 size of units string in bytes units UInt8 Nu units string Nr Int64 1 size of data source string in bytes src UInt8 Nr data source string misc (Misc) 1 dictionary for non-essential information notes (StringVec) 1 notes and automated logging Nt Int64 1 length of time gaps matrix T Int64 2Nt time gaps matrix Xc UInt8 1 Type code of data vector Nx Int64 1 number of samples in data vector X variable NX data vector
EventTraceData¶
A multichannel record of time-series data related to a seismic event.
Var Type N Meaning N Int64 1 number of data channels Lc UInt8 N location Type codes for each data channel Rc UInt8 N response Type codes for each data channel Xc UInt8 N data Type codes for each data channel cmp UInt8 1 are data compressed? (0x01 = yes) Nt Int64 N number of rows in time gaps matrix for each channel Nx Int64 N length of data vector for each channel [12] id (StringVec) 1 channel ids name (StringVec) 1 channel names loc (Loc Type) N instrument positions fs Float64 N sampling frequencies of each channel in Hz gain Float64 N scalar gains of each channel resp (Resp Type) N instrument responses units (StringVec) 1 units of each channel’s data az Float64 N event azimuth baz Float64 N backazimuths to event dist Float64 N source-receiver distances pha (PhaseCat) N phase catalogs for each channel src (StringVec) 1 data source strings for each channel misc (Misc) N dictionaries of non-essential information for each channel notes (StringVec) N notes and automated logging for each channel { for i = 1:N T Int64 2Nt[i] Matrix of time gaps for channel i } { for i = 1:N X Xc[i] Nx[i] Data vector i [13] }
[12] | If cmp == 0x01, each value in Nx is the number of bytes of compressed data to read; otherwise, this is the number of samples in each channel. |
[13] | If cmp == 0x01, read Nx[i] samples of type UInt8 and pass through lz4 decompression to generate data vector i; else read Nx[i] samples of the type corresponding to code Xc[i]. |
SeisData¶
A record containing multiple channels of univariate geophysical data.
Var Type N Meaning N Int64 1 number of data channels Lc UInt8 N location Type codes for each data channel Rc UInt8 N response Type codes for each data channel Xc UInt8 N data Type codes for each data channel cmp UInt8 1 are data compressed? (0x01 = yes) Nt Int64 N number of rows in time gaps matrix for each channel Nx Int64 N length of data vector for each channel [14] id (StringVec) 1 channel ids name (StringVec) 1 channel names loc (Loc Type) N instrument positions fs Float64 N sampling frequencies of each channel in Hz gain Float64 N scalar gains of each channel resp (Resp Type) N instrument responses units (StringVec) 1 units of each channel’s data src (StringVec) 1 data source strings for each channel misc (Misc) N dictionaries of non-essential information for each channel notes (StringVec) N notes and automated logging for each channel { for i = 1:N T Int64 2Nt[i] Matrix of time gaps for channel i } { for i = 1:N X Xc[i] Nx[i] Data vector i [15] }
[14] | If cmp == 0x01, each value in Nx is the number of bytes of compressed data to read; otherwise, this is the number of samples in each channel. |
[15] | If cmp == 0x01, read Nx[i] samples of type UInt8 and pass through lz4 decompression to generate data vector i; else read Nx[i] samples of the type corresponding to code Xc[i]. |
SeisHdr¶
Var | Type | N | Meaning |
---|---|---|---|
Li | Int64 | 1 | length of event ID string |
id | UInt8 | Li | event ID string |
iv | UInt8 | 1 | intensity value |
Ls | Int64 | 1 | length of intensity scale string |
isc | UInt8 | Ls | intensity scale string |
loc | (EQLoc) | 1 | earthquake location |
mag | (EQMag) | 1 | earthquake magnitude |
misc | (Misc) | 1 | dictionary containing non-essential information |
notes | (StringVec) | 1 | notes and automated logging |
ot | Int64 | 1 | origin time [16] |
Lr | Int64 | 1 | length of data source string |
src | UInt8 | Lr | data source string |
Lt | Int64 | 1 | length of event type string |
typ | UInt8 | Lt | event type string |
[16] | Measured from Unix epoch time (1970-01-01T00:00:00Z) in integer microseconds |
SeisSrc¶
Var | Type | N | Meaning |
---|---|---|---|
Li | Int64 | 1 | length of source id string |
id | UInt8 | Li | id string |
Le | Int64 | 1 | length of event id string |
eid | UInt8 | Le | event id string |
m0 | Float64 | 1 | scalar moment |
Lm | Int64 | 1 | length of moment tensor vector |
mt | Float64 | Lm | moment tensor vector |
Ld | Int64 | 1 | length of moment tensor misfit vector |
dm | Float64 | Ld | moment tensor misfit vector |
np | Int64 | 1 | number of polarities |
gap | Float64 | 1 | max. azimuthal gap |
pad | Int64 | 2 | dimensions of principal axes matrix |
pax | Float64 | pad[1]*pad[2] | principal axes matrix |
pld | Int64 | 2 | dimensions of nodal planes matrix |
planes | Float64 | pld[1]*pld[2] | nodal planes matrix |
Lr | Int64 | 1 | length of data source string |
src | UInt8 | 1 | data source string |
st | (SourceTime) | 1 | source-time description |
misc | (Misc) | 1 | Dictionary containing non-essential information |
notes | (StringVec) | 1 | Notes and automated logging |
SeisEvent¶
Var | Type | N | Meaning |
---|---|---|---|
hdr | (SeisHdr) | 1 | event header |
source | (SeisSrc) | 1 | event source process |
data | (EventTraceData) | 1 | event trace data |
Data Type Codes¶
Each Type code is written to disk as a UInt8, with the important exception of SeisIO custom object Type codes (which use UInt32).
Loc Type Codes¶
UInt8 | Type |
---|---|
0x00 | GenLoc |
0x01 | GeoLoc |
0x02 | UTMLoc |
0x03 | XYLoc |
Resp Type Codes¶
UInt8 | Type |
---|---|
0x00 | GenResp |
0x01 | PZResp |
0x02 | PZResp64 |
Other Type Codes¶
Only the Types below are faithfully preserved in write/read of a :misc field
dictionary; other Types are not written to file and can cause wseis
to
throw errors.
Type | UInt8 | Type | UInt8 |
---|---|---|---|
Char | 0x00 | Array{Char,N} | 0x80 |
String | 0x01 | Array{String,N} | 0x81 |
UInt8 | 0x10 | Array{UInt8,N} | 0x90 |
UInt16 | 0x11 | Array{UInt16,N} | 0x91 |
UInt32 | 0x12 | Array{UInt32,N} | 0x92 |
UInt64 | 0x13 | Array{UInt64,N} | 0x93 |
UInt128 | 0x14 | Array{UInt128,N} | 0x94 |
Int8 | 0x20 | Array{Int8,N} | 0xa0 |
Int16 | 0x21 | Array{Int16,N} | 0xa1 |
Int32 | 0x22 | Array{Int32,N} | 0xa2 |
Int64 | 0x23 | Array{Int64,N} | 0xa3 |
Int128 | 0x24 | Array{Int128,N} | 0xa4 |
Float16 | 0x30 | Array{Float16,N} | 0xb0 |
Float32 | 0x31 | Array{Float32,N} | 0xb1 |
Float64 | 0x32 | Array{Float64,N} | 0xb2 |
Complex{UInt8} | 0x50 | Array{Complex{UInt8},N} | 0xd0 |
Complex{UInt16} | 0x51 | Array{Complex{UInt16},N} | 0xd1 |
Complex{UInt32} | 0x52 | Array{Complex{UInt32},N} | 0xd2 |
Complex{UInt64} | 0x53 | Array{Complex{UInt64},N} | 0xd3 |
Complex{UInt128} | 0x54 | Array{Complex{UInt128},N} | 0xd4 |
Complex{Int8} | 0x60 | Array{Complex{Int8},N} | 0xe0 |
Complex{Int16} | 0x61 | Array{Complex{Int16},N} | 0xe1 |
Complex{Int32} | 0x62 | Array{Complex{Int32},N} | 0xe2 |
Complex{Int64} | 0x63 | Array{Complex{Int64},N} | 0xe3 |
Complex{Int128} | 0x64 | Array{Complex{Int128},N} | 0xe4 |
Complex{Float16} | 0x70 | Array{Complex{Float16},N} | 0xf0 |
Complex{Float32} | 0x71 | Array{Complex{Float32},N} | 0xf1 |
Complex{Float64} | 0x72 | Array{Complex{Float64},N} | 0xf2 |
SeisIO Object Type codes¶
UInt32 Code | Object Type |
---|---|
0x20474330 | EventChannel |
0x20474331 | SeisChannel |
0x20474430 | EventTraceData |
0x20474431 | SeisData |
0x20495030 | GenLoc |
0x20495031 | GeoLoc |
0x20495032 | UTMLoc |
0x20495033 | XYLoc |
0x20495230 | GenResp |
0x20495231 | PZResp64 |
0x20495232 | PZResp |
0x20504330 | PhaseCat |
0x20534530 | SeisEvent |
0x20534830 | SeisHdr |
0x20535030 | SeisPha |
0x20535330 | SeisSrc |
0x20535430 | SourceTime |
0x45514c30 | EQLoc |
0x45514d30 | EQMag |
File Format Version History¶
File format versions <0.50 are no longer supported; please email us if you need to read in very old data.
Version | Date | Change |
---|---|---|
0.53 | 2019-09-11 | removed :i, :o from CoeffResp |
added :i, :o to MultiStageResp | ||
0.52 | 2019-09-03 | added CoeffResp, MultiStageResp |
0.51 | 2019-08-01 | added :f0 to PZResp, PZResp64 |
0.50 | 2019-06-05 | all custom Types can now use write() directly |
rewrote how :misc is stored | ||
Type codes for :misc changed | ||
deprecated BigFloat/BigInt support in :misc | ||
:n is no longer stored as a UInt32 | ||
:x compression no longer automatic | ||
:x compression changed from Blosc to lz4 |