SeisIO

SeisIO is a collection of utilities for reading and downloading geophysical timeseries data.

Introduction

SeisIO is a framework for working with univariate geophysical data on 64-bit systems. SeisIO is designed around three basic principles:

  • Ease of use: one shouldn’t need a PhD to understand command syntax.
  • Fluidity: working with data shouldn’t feel clumsy.
  • Performance: speed and efficient memory usage matter.

The project is home to an expanding set of web clients, file format readers, and analysis utilities.

Overview

SeisIO stores data in minimalist containers that track the bare necessities for analysis. New data are easily added with basic operators like +. Unwanted channels can be removed just as easily. Data can be written to a number of file formats.

Installation

From the Julia prompt: press ] to enter the Pkg environment, then type

add SeisIO; build; precompile

Dependencies should install automatically. To verify that everything works correctly, type

test SeisIO

and allow 10-20 minutes for tests to complete. Exit the Pkg environment by pressing Backspace or Control + C.

Getting Started

At the Julia prompt, type

using SeisIO

You’ll need to repeat this step whenever you restart Julia, as with any command-line interpreter (CLI) language.

Learning SeisIO

An interactive tutorial using Jupyter notebooks in a web browser can be accessed from the Julia prompt with these commands:

p = pathof(SeisIO)
d = dirname(realpath(p))
cd(d)
include("../tutorial/install.jl")

SeisIO also has an online tutorial guide, intended as a gentle introduction for people less familiar with the Julia language. The two are intentionally redundant; Jupyter isn’t compatible with all systems and browsers.

For a faster start, skip to any of these topics:

Updating

From the Julia prompt: press ] to enter the Pkg environment, then type update. Once package updates finish, restart Julia to use them.

First Steps

SeisIO is designed around easy, fluid, and fast data access. At the most basic level, SeisIO uses an array-like structure called a SeisChannel for single-channel data, and a multichannel structure named SeisData.

Start Here

Create a new, empty SeisChannel object with

Ch = SeisChannel()

The meanings of the field names are explained here; you can also type ?SeisChannel at the Julia prompt. You can edit field values manually, e.g., returning to the code block above,

Ch = SeisChannel()
Ch.loc = GeoLoc(lat=-90.0, lon=0.0, el=2835.0, az=0.0, inc=0.0)
Ch.name = "South pole"

or you can set them with keywords at creation:

Ch = SeisChannel(name="Templo de San José de La Floresta, Ajijic", fs=40.0)

Note that Strings in field names support full Unicode, so even graffiti can be saved and read back in. This is useful for data containing non-English characters.

SeisData structures are collections of channel data. They can be created with the SeisData() command, which can optionally create any number of empty channels at a time, e.g.,

S = SeisData()      # empty structure, no channels
S1 = SeisData(12)   # empty 12-channel structure

They can be explored similarly:

S = SeisData(1)
S.name[1] = "South pole"
S.loc[1] = GeoLoc(lat=-90.0, lon=0.0, el=2835.0, az=0.0, inc=0.0)

A collection of channels becomes a SeisData structure; for example,

L = GeoLoc(lat=46.1967, lon=-122.1875, el=1440.0, az=0.0, inc=0.0)
S = SeisData(SeisChannel(), SeisChannel())
S = SeisData(randSeisData(5), SeisChannel(),
      SeisChannel(id="UW.SEP..EHZ", name="Darth Exploded", loc=L))

You can push channels onto existing SeisData structures, like adding one key to a dictionary. For example,

push!(S, Ch)

copies Ch to a new channel in S. Note that the new S[3] is not a view into Ch. This is deliberate, as otherwise the workspace quickly becomes a mess of redundant channels. Clean up with Ch = [] to free memory before moving on.

Operations on SeisData structures

We’re now ready for a short tutorial of what we can do with data structures. In the commands below, as in most of this documentation, Ch is a SeisChannel object and S is a SeisData object.

Adding channels to a SeisData structure

You’ve already seen one way to add a channel to SeisData: push!(S, SeisChannel()) adds an empty channel. Here are others:

S = randSeisData(4)
n = 3
append!(S, SeisData(n))

This initializes an empty n-channel SeisData structure and appends it to the end of S, similar to appending one array to another. S will have 7 channels, but the last three are empty.

The addition operator, +, calls push! and add!, with one key difference: to ensure reflexivity (i.e., S1 + S2 == S2 + S1), the + operator sorts the output (command sort!) and prunes empty channels (command prune!). Thus,

S = SeisData(2)
S1 = randSeisData(3)
S += S1

outputs only the three channels of random data initialized in the second line of the code block; in addition, they’re sorted by ID, so it’s likely that S != S1.

Search, Sort, and Prune

The easiest way to find channels of interest in a data structure is to use findid or findchan. findid(id, S) returns the numeric index i of the first channel in S where S.id[i] == id. findchan(cha, S) returns an array of numeric indices in S to all channels whose IDs satisfy occursin(cha, S.id[i]).

For example:

L = GeoLoc(lat=46.1967, lon=-122.1875, el=1440.0, az=0.0, inc=0.0)
S = SeisData(randSeisData(5), SeisChannel(id="YY.STA1..EHZ"),
      SeisChannel(id="UW.SEP..EHZ", name="Darth Exploded", loc=L))
findid("UW.SEP..EHZ", S)    # 7
findchan("EHZ", S)          # [6, 7], maybe others (depending on randSeisData)

You can sort channels in a structure by channel ID with the sort! command.

Several functions exist to prune empty and unwanted channels from SeisData structures. Revisiting the previous code block, for example, try these:

deleteat!(S, 1:2)   # Delete first two channels of S
S -= 3              # Delete third channel of S

# Extract S[1] as a SeisChannel, removing it from S
C = pull(S, 1)

# Delete channels containing ".SEP."
delete!(S, ".SEP.", exact=false)

# Delete all channels whose S.x is empty
prune!(S)
S

S should have one channel left.

In the delete! command, specifying exact=false means that any channel whose ID partly matches the string “.SEP.” gets deleted; by default, delete!(S, str) only matches channels where str is the exact ID. This is an efficient way to remove unresponsive subnets and unwanted channel types, but beware of accidental over-matching.

Next Steps

Because tracking arbitrary operations can be difficult, several functions have been written to keep track of data and operations in a semi-automated way. See the next section, working with data, for detailed discussion of managing data from the Julia command prompt.

Working with Data

This section is a tutorial for tracking and managing SeisIO data.

Creating Data Containers

Create a new, empty object using any of the following commands:

Object Purpose
SeisChannel() A single channel of univariate (usually time-series) data
SeisData() Multichannel univariate (usually time-series) data
SeisHdr() Header structure for discrete seismic events
SeisEvent() Discrete seismic events; includes SeisHdr and SeisData objects

Acquiring Data

  • Read files with read_data
  • Make web requets with get_data
  • Initiate real-time streaming sessions to SeisData objects with seedlink

Keeping Track

A number of auxiliary functions exist to keep track of channels:

findchan(id::String, S::SeisData)
findchan(S::SeisData, id::String)

Get all channel indices i in S with id \(\in\) S.id[i]. Can do partial id matches, e.g. findchan(S, “UW.”) returns indices to all channels whose IDs begin with “UW.”.

findid(S::SeisData, id)

Return the index of the first channel in S where id = id. Requires an exact string match; faster than Julia findfirst on small structures.

findid(S::SeisData, Ch::SeisChannel)

Equivalent to findfirst(S.id.==Ch.id).

namestrip!(S[, convention])

Remove bad characters from the :name fields of S. Specify convention as a string (default is “File”):

Convention Characters Removed:sup:(a)
“File” "$*/:<>?@\^|~DEL
“HTML” "&';<>©DEL
“Julia” $\DEL
“Markdown” !#()*+-.[\]_`{}
“SEED” .DEL
“Strict” !"#$%&'()*+,-./:;<=>?@[\]^`{|}~DEL

(a) DEL here is \x7f (ASCII/Unicode U+007f).

timestamp()

Return current UTC time formatted yyyy-mm-ddTHH:MM:SS.

track_off!(S)

Turn off tracking in S and return a boolean vector of which channels were added or changed.

track_on!(S)

Begin tracking changes in S. Tracks changes to :id, channel additions, and changes to data vector sizes in S.x.

Does not track data processing operations on any channel i unless length(S.x[i]) changes for channel i (e.g. filtering is not tracked).

Warning: If you have or suspect gapped data in any channel, calling ungap! while tracking is active will always flag a channel as changed.

Source Logging

The :src field records the last data source used to populate each channel; usually a file name or URL.

When a data source is added to a channel, including the first time data are added, it’s also recorded in the :notes field. Use show_src(S, i) to print all data sources for channel S[i] to stdout (see below for details).

Channel Maintenance

A few functions exist specifically to simplify data maintenance:

prune!(S::SeisData)

Delete all channels from S that have no data (i.e. S.x is empty or non-existent).

C = pull(S::SeisData, id::String)

Extract the first channel with id=id from S and return it as a new SeisChannel structure. The corresponding channel in S is deleted.

C = pull(S::SeisData, i::integer)

Extract channel i from S as a new SeisChannel object C, and delete the corresponding channel from S.

Taking Notes

Functions that add and process data note these operations in the :notes field of each object affected. One can also add custom notes with the note! command:

note!(S, i, str)

Append str with a timestamp to the :notes field of channel number i of S.

note!(S, id, str)

As above for the first channel in S whose id is an exact match to id.

note!(S, str)

if str* mentions a channel name or ID, only the corresponding channel(s) in **S is annotated; otherwise, all channels are annotated.

Clear all notes from channel i of S.

clear_notes!(S, id)

Clear all notes from the first channel in S whose id field exactly matches id.

clear_notes!(S)

Clear all notes from every channel in S.

Checking Your Work

If you need to check what’s been done to a channel, or the sources present in the channel data, these commands are helpful:

show_processing(S::SeisData)
show_processing(S::SeisData, i::Int)
show_processing(C::SeisChannel)

Tabulate and print all processing steps in :notes to stdout in human-readable format.

show_src(S::SeisData)
show_src(S::SeisData, i::Int)
show_src(C::SeisChannel)

Tabulate and print all data sources in :notes to stdout.

show_writes(S::SeisData)
show_writes(S::SeisData, i::Int)
show_writes(C::SeisChannel)

Tabulate and print all write operations in :notes to stdout in human-readable format.

Getting Help

In addition to the Juypter notebooks and online tutorial guide, other sources of help are available:

Examples

Several worked examples exist throughout these documents, in addition to examples.jl and the interactive tutorial.

Invoke the command-prompt examples with the following command sequence:

p = pathof(SeisIO)
d = dirname(realpath(p))
cd(d)
include("../test/examples.jl")

Tests

The commands in tests/ can be used as templates; to install test data and run all tests, execute these commands:

using Pkg
Pkg.test("SeisIO")      # lunch break recommended. Tests can take 20 minutes.
                        # 99.5% code coverage wasn't an accident...
p = pathof(SeisIO)
cd(realpath(dirname(p) * "/../test/"))

Command-Line Help

A great deal of additional help functions are available at the Julia command prompt. All SeisIO functions and structures have their own docstrings. For example, typing ?SeisData at the Julia prompt produces the following:

SeisData

A custom structure designed to contain the minimum necessary information
for processing univariate geophysical data.

SeisChannel

A single channel designed to contain the minimum necessary information
for processing univariate geophysical data.

Fields
========

Field  Description
–––––– ––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
:n     Number of channels [^1]
:c     TCP connections feeding data to this object [^1]
:id    Channel id. Uses NET.STA.LOC.CHAN format when possible
:name  Freeform channel name
:loc   Location (position) vector; any subtype of InstrumentPosition
:fs    Sampling frequency in Hz; fs=0.0 for irregularly-sampled data.
:gain  Scalar gain
:resp  Instrument response; any subtype of InstrumentResponse
:units String describing data units. UCUM standards are assumed.
:src   Freeform string describing data source.
:misc  Dictionary for non-critical information.
:notes Timestamped notes; includes automatically-logged info.
:t     Matrix of time gaps in integer μs, formatted [Sample# Length]
:x     Time-series data

Dedicated Help Functions

These functions take no arguments and dump information to stdout.

Submodule SEED
dataless_support()

Output lists of supported blockettes in dataless SEED to stdout.

mseed_support()

Output lists of supported blockettes in mini-SEED to stdout.

resp_wont_read()

Output “hall of shame” of known examples of broken RESP to stdout.

seed_support()

Output full information on SEED support to stdout.

Submodule SUDS
suds_support()

Dump info to STDOUT on SUDS support for each numbered SUDS structure.

  • Green structures are fully supported and read into memory.
  • Yellow structures can be dumped to stdout with read_data(“suds”, …, v=2).
  • Red structures are skipped and don’t exist in our test data.

Formats Guide

formats is a constant static dictionary with descriptive entries of each data format. Access the list of formats with sort(keys(formats)). Then try a command like formats[“slist”] for detailed info. on the slist format.

Help-Only Functions

These functions contain help docstrings but execute nothing. They exist to answer common questions.

?web_chanspec

Answers: how do I specify channels in a web request? Outputs channel id syntax to stdout.

?seis_www

Answers: which servers are available for FDSN queries? Outputs the FDSN server list to stdout.

?TimeSpec

All About Keywords

Invoke keywords help with ?SeisIO.KW for complete information on SeisIO shared keywords and meanings.

Time-Series Files

read_data!(S, fmt::String, filepat [, KWs])
S = read_data(fmt::String, filepat [, KWs])
Read data from a supported file format into memory.

fmt
Case-sensitive string describing the file format. See below.

filepat
Read files with names matching pattern filepat. Supports wildcards.

KWs
Keyword arguments; see also SeisIO standard KWs or type ?SeisIO.KW. See table below for the list.

Supported File Formats

File Format String Strict Match
AH-1 ah1 id, fs, gain, loc, resp, units
AH-2 ah2 id, fs, gain, loc, resp
Bottle (UNAVCO) bottle id, fs, gain
GeoCSV, time-sample pair geocsv id
GeoCSV, sample list geocsv.slist id
Lennartz ASCII lenartz id, fs
Mini-SEED mseed id, fs
PASSCAL SEG Y passcal id, fs, gain, loc
SAC sac id, fs, gain
SEG Y (rev 0 or rev 1) segy id, fs, gain, loc
SEISIO seisio id, fs, gain, loc, resp, units
SLIST (ASCII sample list) slist id, fs
SUDS suds id
UW data file uw id, fs, gain, units
Win32 win32 id, fs, gain, loc, resp, units

Strings are case-sensitive to prevent any performance impact from using matches and/or lowercase().

Note that read_data with file format “seisio” largely exists as a convenience wrapper; it reads only the first SeisIO object from each file that can be converted to a SeisData structure. For more complicated read operations, rseis should be used.

Warning: GeoCSV files must be Unix text files; DOS text files, whose lines end in “\r\n”, will not read properly. Convert with dos2unix or equivalent Windows Powershell commands.

Supported Keywords

Keyword Used By Type Default Meaning
cf win32 String “” win32 channel info filestr
full [1] Bool false read full header into :misc?
ll segy UInt8 0x00 set loc in :id? (see below)
memmap * Bool false use Mmap.mmap to buffer file?
nx_add [2] Int64 360000 minimum size increase of x
nx_new [3] Int64 86400000 length(x) for new channels
jst win32 Bool true are sample times JST (UTC+9)?
swap [4] Bool true byte swap?
strict * Bool true use strict match?
v * Integer 0 verbosity
vl * Bool 0 verbose source logging? [5]

Table Footnotes

[1]used by ah1, ah2, sac, segy, suds, uw; information read into :misc varies by file format.
[2]see table below.
[3]used by bottle, mseed, suds, win32
[4]used by bottle, mseed, suds, win32
[5]used by mseed, passcal, segy; swap is automatic for sac.
[6]adds one line to :notes per file read. It is not guaranteed that files listed in S.notes[i] contain data for channel i; rather, all files listed are from the read operation(s) that populated i.

Performance Tips

  1. mmap=true improves read speed for some formats, particularly ASCII readers, but requires caution. In our benchmarks, the following significant (>3%) speed changes are observed:
  • Significant speedup: ASCII formats, including metadata formats
  • Slight speedup: mini-SEED
  • Significant slowdown: SAC

2. With mseed or win32 data, adjust nx_new and nx_add based on the sizes of the data vectors that you expect to read. If the largest has Nmax samples, and the smallest has Nmin, we recommend nx_new=Nmin and nx_add=Nmax-Nmin.

Default values can be changed in SeisIO keywords, e.g.,

SeisIO.KW.nx_new = 60000
SeisIO.KW.nx_add = 360000

The system-wide defaults are nx_new=86400000 and nx_add=360000. Using these values with very small jobs will greatly decrease performance.

3. strict=true may slow read_data based on the fields matched as part of the file format. In general, any file format that can match on more than id and fs will read slightly slower with this option.

Channel Matching

By default, read_data continues a channel if data read from file matches the channel id (field :id). In some cases this is not enough to guarantee a good match. With strict=true, read_data matches against fields :id, :fs, :gain, :loc, :resp, and :units. However, not all of these fields are stored natively in all file formats. Column “Strict Match” in the first table lists which fields are stored (and can be logically matched) in each format with strict=true.

Examples

  1. S = read_data("uw", "99011116541W", full=true)
    • Read UW-format data file 99011116541W
    • Store full header information in :misc
  2. read_data!(S, "sac", "MSH80*.SAC")
    • Read SAC-format files matching string pattern MSH80*.SAC
    • Read into existing SeisData object S
  3. S = read_data("win32", "20140927*.cnt", cf="20140927*ch", nx_new=360000)
    • Read win32-format data files with names matching pattern 2014092709*.cnt
    • Use ASCII channel information filenames that match pattern 20140927*ch
    • Assign new channels an initial size of nx_new samples

Memory Mapping

memmap=true is considered unsafe because Julia language handling of SIGBUS/SIGSEGV and associated risks is undocumented as of SeisIO v1.0.0. Thus, for example, we don’t know what a connection failure during memory-mapped file I/O does. In some languages, this situation without additional signal handling was notorious for corrupting files.

Under no circumstances should mmap=true be used to read files directly from a drive whose host device power management is independent of the destination computer’s. This includes all work flows that involve reading files directly into memory from a connected data logger. It is not a sufficient workaround to set a data logger to “always on”.

Format Descriptions and Notes

Additional format information can be accessed from the command line by typing SeisIO.formats("FMT") where FMT is the format name; keys(SeisIO.formats) for a list.

  • AH (Ad-Hoc) was developed as a machine-independent seismic data format based on External Data Representation (XDR).
  • Bottle is a single-channel format maintained by UNAVCO (USA).
  • GeoCSV: an extension of “human-readable”, tabular file format Comma-Separated Values (CSV).
  • Lennartz: a variant of sample list (SLIST) used by Lennartz portable digitizers.
  • PASSCAL: A single- channel variant of SEG Y with no file header, developed by PASSCAL/New Mexico Tech and used with PASSCAL field equipment.
  • SAC: the Seismic Analysis Code data format, originally developed by LLNL for the eponymous command-line interpreter.
  • SEED: adopted by the International Federation of Digital Seismograph Networks (FDSN) as an omnibus seismic data standard. mini-SEED is a data-only variant that uses only data blockettes.
  • SEG Y: Society of Exploration Geophysicists data format. Common in the energy industry. Developed and maintained by SEG.
  • SLIST: An ASCII file with a one-line header and data written to file in ASCII string format.
  • SUDS: A similar format to SEED, developed by the US Geological Survey (USGS) in the late 1980s.
  • UW: created in the 1970s by the Pacific Northwest Seismic Network (PNSN), USA, for event archival; used until the early 2000s.
  • Win32: maintained by the National Research Institute for Earth Science and Disaster Prevention (NIED), Japan. Continuous data are divided into files that contain a minute of data from multiple channels stored in one-second segments.

Format-Specific Information

SEG Y

Only SEG Y rev 0 and rev 1 with standard headers are supported. The following are known support limitations:

  1. A few SEG Y headers are partially implemented or unused. These will be refined as we obtain more test data with standardized SEG Y headers and known results.
  2. Not all SEG Y files use the gain formula in the SEG Y rev 1 manual. Users are urged to consult equipment manufacturers and/or coders whose software converts proprietary data formats to SEG Y.
  3. SeisIO does not use the Textual File Header (file bytes 1-3600) or Extended Textual File Header records, as these were never standardized. Specify full=true to read the raw bytes into vectors in :misc. These byte vectors can be parsed manually by the user after file read.
Setting the Location Subfield

The location subfield within :id (“LL” in NN.SSSS.LL.CC) is normally blank, but can be set from an arbitrary Int32 quantity in SEG Y. The reason for this behavior is that SEG Y has at least six “recommended” quantities that can indicate a unique channel. Use one by passing the corresponding value from the table below to keyword “ll”:

Code U Bytes :misc Usual trace header quantity
0x00       None (Default); don’t set LL
0x01 Y 001-004 trace_seq_line Trace sequence number within line
0x02 Y 005-008 trace_seq_file Trace sequence number within SEG Y file
0x03   009-012 rec_no Original field record number
0x04 Y 013-016 channel_no Trace number within original field record
0x05   017-020 energy_src_pt Energy source point number
0x06   021-024 cdp Ensemble number
0x07 ? 025-028 trace_in_ensemble Trace number within the ensemble
0x08   037-040 src-rec_dist Distance from center of source point
0x09   041-044 rec_ele Receiver group elevation
0x0a   045-048 src_ele Surface elevation at source
0x0b   049-052 src_dep Source depth below surface (positive)
0x0c   053-056 rec_datum_ele Datum elevation at receiver group
0x0d   057-060 src_datum_ele Datum elevation at source
0x0e   061-064 src_water_dep Water depth at source
0x0f   065-068 rec_water_dep Water depth at group
0x10   073-076 src_x Source coordinate - X
0x11   077-080 src_y Source coordinate - Y
0x12   081-084 rec_x Group coordinate - X
0x13   085-088 rec_y Group coordinate - Y
0x14   181-184 cdp_x X coordinate of ensemble (CDP) position
0x15   185-188 cdp_y Y coordinate of ensemble (CDP) position
0x16   189-192 inline_3d For 3-D poststack data, in-line number
0x17   193-196 crossline_3d For 3-D poststack data, cross-line number
0x18   197-200 shot_point Shotpoint number (2-D post-stack data)
0x19   205-208 trans_mant Transduction Constant (mantissa)
0x1a ? 233-236 unassigned_1 Unassigned — For optional information
0x1b ? 237-240 unassigned_2 Unassigned — For optional information

A SEG Y file usually increments one (or more) of 0x01, 0x02, or 0x04 for each trace. Unfortunately, we can’t imagine any way to use all three, or even two, in a SEGY-compliant channel ID.

Warning: for any quantity above,

  1. Numeric values >1296 lead to nonstandard characters in the LL subfield
  2. Numeric values >7200 lead to non-unique :id fields, with undefined results
  3. Numeric values >9216 cause read_data to throw an InexactError

UW

Only UW v2 (UW-2) data files are supported. We have no reason to believe that any UW-1 data files are in circulation, and external converters to UW-2 exist.

Win32

Use older channel files with caution. They were not controlled by any central authority until the late 2010s. Inconsistencies between different versions of the same channel file were found by SeisIO developers as recently as 2015.

Other File I/O Functions

rseis(fname)

Read SeisIO native format data into an array of SeisIO structures.

sachdr(fname)

Print headers from SAC file to stdout.

segyhdr(fname[, PASSCAL=true::Bool])

Print headers from SEG Y file to stdout. Specify passcal=true for PASSCAL SEG Y.

Metadata Files

read_meta!(S, fmt::String, filepat [, KWs])
S = read_meta(fmt::String, filepat [, KWs])
Read metadata in file format fmt matching file pattern filestr into S.

fmt
Lowercase string describing the file format. See below.

filepat
Read files with names matching pattern filepat. Supports wildcards.

KWs
Keyword arguments; see also SeisIO standard KWs or type ?SeisIO.KW.

Supported File Formats

File Format String
Dataless SEED dataless
FDSN Station XML sxml
SACPZ sacpz
SEED RESP resp

Warning: Dataless SEED, SACPZ, and RESP files must be Unix text files; DOS text files, whose lines end in “\r\n”, will not read properly. Convert with dos2unix or equivalent Windows Powershell commands.

Supported Keywords

KW Used By Type Default Meaning
memmap all Bool false use Mmap.mmap to buffer file?
msr sxml Bool false read full MultiStageResp?
s all TimeSpec   Start time
t all TimeSpec   Termination (end) time
units resp Bool false fill in MultiStageResp units?
  dataless      
v all Integer 0 verbosity

Note: mmap=true improves read speed for ASCII formats but requires caution. Julia language handling of SIGBUS/SIGSEGV and associated risks is unknown and undocumented.

HDF5 Files

Of the increasingly popular HDF5-based formats for geophysical data, only ASDF is supported at present. Support for other (sub)formats is planned.

S = read_hdf5(fname::String, s::TimeSpec, t::TimeSpec, [, KWs])
read_hdf5!(S::GphysData, fname::String, s::TimeSpec, t::TimeSpec, [, KWs])
Read data in seismic HDF5 file format from file fname into S.

KWs
Keyword arguments; see also SeisIO standard KWs or type ?SeisIO.KW.

This has one fundamental design difference from read_data: HDF5 archives are assumed to be large files with data from multiple channels; they are scanned selectively for data of interest to read, rather than read into memory in their entirety.

Supported Keywords

KW Type Default Meaning
id String ...*” id pattern, formated nn.sss.ll.ccc
      (net.sta.loc.cha); FDSN-style wildcards (a)
msr Bool true read full (MultiStageResp) instrument resp?
v Integer 0 verbosity

(a) A question mark (‘?’) is a wildcard for a single character (exactly one); an asterisk (‘*’) is a wildcard for zero or more characters.

Writing to HDF5 volumes is supported through write_hdf5, described in Writing to File.

XML Meta-Data

SeisIO can read and write the following XML metadata formats:

  • QuakeML Version 1.2
  • StationXML Version 1.1

StationXML

read_sxml(fpat[, KWs])

Read FDSN StationXML files matching string pattern fpat into a new SeisData object.

Keywords:
s: start time. Format “YYYY-MM-DDThh:mm:ss”, e.g., “0001-01-01T00:00:00”.
t: termination (end) time. Format “YYYY-MM-DDThh:mm:ss”.
msr: (Bool) read instrument response info as MultiStageResp?

msr=true processes XML files to give full response information
at every documented stage of the acquisition process: sampling, digitization,
FIR filtering, decimation, etc.

How often is MultiStageResp needed?
Almost never. By default, the :resp field of each channel contains a
simple instrument response with poles, zeros, sensitivity (:a0), and
sensitivity frequency (:f0). Very few use cases require more detail.

QuakeML

read_qml(fpat)

Read QuakeML files matching string pattern fpat. Returns a tuple containing an array of SeisHdr objects H and an array of SeisSrc objects R. Each pair (H[i], R[i]) describes the preferred location (origin, SeisHdr) and event source (focal mechanism or moment tensor, SeisSrc) of event i.

If multiple focal mechanisms, locations, or magnitudes are present in a single Event element of the XML file(s), the following rules are used to select one of each per event:

FocalMechanism
1. preferredFocalMechanismID if present
2. Solution with best-fitting moment tensor
3. First FocalMechanism element

Magnitude
1. preferredMagnitudeID if present
2. Magnitude whose ID matches MomentTensor/derivedOriginID
3. Last moment magnitude (lowercase scale name begins with “mw”)
4. First Magnitude element

Origin
1. preferredOriginID if present
2. derivedOriginID from the chosen MomentTensor element
3. First Origin element

Non-essential QuakeML data are saved to misc in each SeisHdr or SeisSrc object as appropriate.

Web Services

Data requests use get_data! for FDSN, IRISWS, and IRIS PH5WS data services; for (near) real-time streaming, see SeedLink.

Time-Series Data

get_data!(S, method, channels; KWs)
S = get_data(method, channels; KWs)
Retrieve time-series data from a web archive to SeisData structure S.

method
“FDSN”: :FDSNWS dataselect. Change FDSN servers with keyword
src using the server list (see ?seis_www).
“IRIS”: IRISWS timeseries.
“PH5”: PH5WS timeseries.

channels
Channels to retrieve – string, string array, or parameter file.
Type ?chanspec at the Julia prompt for more info.

Keywords

Shared Keywords

fmt, nd, opts, rad, reg, si, to, v, w, y

Seismic Processing Keywords
  • unscale: divide gain from data after download
  • demean: demean data after download
  • detrend: detrend data after download
  • taper: taper data after download
  • ungap: remove gaps in data after download
  • rr: remove seismic instrument response after download
Other Keywords
  • autoname: determine file names from channel ID?
  • msr: get instrument responses as MultiStageResonse? (“FDSN” only)
  • s: start time
  • t: termination (end) time
  • xf: XML file name for output station XML
Special Behavior
  1. autoname=true attempts to emulate IRISWS channel file naming conventions. For this to work, however, each request must return exactly one channel. A wildcard (“*” or “?”) in a channel string deactivates autoname=true.
  2. Seismic processing keywords follow an order of operations that matches the ordering of the above list.
  3. IRISWS requests always remove the stage zero gain on the server side, because the service doesn’t include the gain constant in the request. This ensures that :gain is accurate in SeisIO.
  4. IRISWS requests don’t fill :loc or :resp fields in mini-SEED and don’t fill the :resp field in SAC. For cross-format consistency, the stage-zero (scalar) gain is removed from any request to IRISWS and the :gain field in such channels is 1.0.
Data Formats

SeisIO supports the following data format strings in timeseries web requests, subject to the limitations of the web service:

  • “miniseed” or “mseed” for mini-SEED
  • “sac” or “sacbl” for binary little-endian SAC
  • “geocsv” for two-column (tspair) GeoCSV

Station Metadata

FDSNsta!(S, chans, KW)
S = FDSNsta(chans, KW)

Fill channels chans of SeisData structure S with information retrieved from remote station XML files by web query.

Shared Keywords

src, to, v

Other Keywords

  • msr: get instrument responses as MultiStageResonse?
  • s: start time
  • t: termination (end) time
  • xf: XML file name for output station XML

Examples

Note that the “src” keyword is used by FDSNWS dataselect queries, but not by IRISWS or PH5WS timeseries queries.

1. Download 10 minutes of data from four stations at Mt. St. Helens (WA, USA), delete the low-gain channels, and save as SAC files in the current directory.

S = get_data("FDSN", "CC.VALT, UW.SEP, UW.SHW, UW.HSR", src="IRIS", t=-600)
S -= "UW.SHW..ELZ"
S -= "UW.HSR..ELZ"
writesac(S)

2. Get 5 stations, 2 networks, all channels, last 600 seconds of data at IRIS

CHA = "CC.PALM, UW.HOOD, UW.TIMB, CC.HIYU, UW.TDH"
TS = u2d(time())
TT = -600
S = get_data("FDSN", CHA, src="IRIS", s=TS, t=TT)

3. A request to FDSN Potsdam, time-synchronized, with some verbosity

ts = "2011-03-11T06:00:00"
te = "2011-03-11T06:05:00"
R = get_data("FDSN", "GE.BKB..BH?", src="GFZ", s=ts, t=te, v=1, y=true)

4. Get channel information for strain and seismic channels at station PB.B001:

S = FDSNsta("CC.VALT..,PB.B001..BS?,PB.B001..E??")
  1. Get trace data from IRISws from TS to TT at channels CHA
S = SeisData()
CHA = "UW.TDH..EHZ, UW.VLL..EHZ, CC.VALT..BHZ"
TS = u2d(time()-86400)
TT = 600
get_data!(S, "IRIS", CHA, s=TS, t=TT)

6. Get synchronized trace data from IRISws with a 55-second timeout on HTTP requests, written directly to disk.

CHA = "UW.TDH..EHZ, UW.VLL..EHZ, CC.VALT..BHZ"
TS = u2d(time())
TT = -600
S = get_data("IRIS", CHA, s=TS, t=TT, y=true, to=55, w=true)

7. Request 10 minutes of continuous vertical-component data from a small May 2016 earthquake swarm at Mt. Hood, OR, USA, and cosine taper after download:

STA = "UW.HOOD.--.BHZ,CC.TIMB.--.EHZ"
TS = "2016-05-16T14:50:00"; TE = 600
S = get_data("IRIS", STA, s=TS, t=TE)

8. Grab data from a predetermined time window in two different formats

ts = "2016-03-23T23:10:00"
te = "2016-03-23T23:17:00"
S = get_data("IRIS", "CC.JRO..BHZ", s=ts, t=te, fmt="sacbl")
T = get_data("IRIS", "CC.JRO..BHZ", s=ts, t=te, fmt="miniseed")

Bad Requests

Failed data requests are saved to special channels whose IDs begin with “XX.FAIL”. The HTTP response message is stored as a String in :misc["msg"]; display to STDOUT with println(stdout, S.misc[i]["msg"]).

Unparseable data requests are saved to special channels whose IDs begin with “XX.FMT”. The raw response bytes are stored as an Array{UInt8,1} in :misc["raw"] and can be dumped to file or parsed with external programs as needed.

One special channel is created per bad request.

Write Suppport

The table below sumamrizes the current write options for SeisIO. Each function is described in detail in this chapter.

Structure/Description Output Format Function Submodule
GphysChannel ASDF write_hdf5  
GphysChannel SAC timeseries writesac  
GphysChannel channel metadata StationXML write_sxml  
GphysChannel instrument response SAC polezero writesacpz  
GphysData ASDF write_hdf5  
GphysData SAC timeseries writesac  
GphysData channel metadata StationXML write_sxml  
GphysData instrument response SAC polezero writesacpz  
SeisEvent ASDF write_hdf5  
SeisEvent header and source info ASDF QuakeML asdf_wqml  
SeisEvent header and source info QuakeML write_qml  
SeisEvent trace data only SAC timeseries writesac  
Array{SeisEvent, 1} ASDF QuakeML asdf_wqml  
Array{SeisHdr, 1} QuakeML write_qml  
Array{SeisHdr, 1}, Array{SeisSrc, 1} ASDF QuakeML asdf_wqml  
Array{SeisHdr, 1}, Array{SeisSrc, 1} QuakeML write_qml  
SeisHdr QuakeML write_qml  
SeisHdr, SeisSrc ASDF QuakeML asdf_wqml  
SeisHdr, SeisSrc QuakeML wqml  
any SeisIO structure SeisIO file wseis  
primitive data type or array ASDF AuxiliaryData asdf_waux  

Methods for SeisEvent, SeisHdr, or SeisSrc are part of submodule SeisIO.Quake. asdf_waux and asdf_wqml are part of SeisIO.SeisHDF..


Write Functions

Functions are organized by file format.

HDF5/ASDF

write_hdf5(fname, S)

Write data from S to file fname in a seismic HDF5 format. The default file format is ASDF.

With ASDF files, if typeof(S) == SeisEvent, S.hdr and S.source are written (appended) to the “QuakeML ” element.

Supported Keywords
KW Type Default Meaning
add Bool false Add new traces to file as needed?
chans ChanSpec 1:S.n Channels to write to file
len Period Day(1) Length of new traces added to file
ovr Bool false Overwrite data in existing traces?
v Integer 0 verbosity
Write Method: Add (add=true)

Initializes new traces (filled with NaNs) of length = len as needed, and overwrite with data in appropriate places.

add=true follows these steps in this order: 1. Determine times of all data in S[chans] and all traces in “Waveforms/”. 2. For all data in S[chans] that cannot be written to an existing trace, a new trace of length = len sampled at S.fs[i] is initialized (filled with NaNs). 3. If a segment in S[chans] overlaps a trace in “Waveforms/” (including newly- created traces): + Merge the header data in S[chans] into the relevant station XML. + Overwrite the relevant segment(s) of the trace.

Unless len exactly matches the time boundaries of each segment in S, new traces will contain more data than S, with the extra samples initialized to NaNs. Presumably these will be replaced with real data in subsequent overwrites.

Write Method: Overwrite (ovr = true)

If ovr=true is specified, but add=false, write_hdf5 only overwrites existing data in hdf_out. * No new trace data objects are created in hdf_out. * No new file is created. If hdf_out doesn’t exist, nothing happens. * If no traces in hdf_out overlap segments in S, hdf_out isn’t modified. * Station XML is merged in channels that are partly overwritten.


QuakeML

write_qml(fname, Ev::SeisEvent; v::Integer=0)

Write event metadata from SeisEvent Ev to file fname.

write_qml(fname, SHDR::SeisHdr; v::Integer=0)
write_qml(fname, SHDR::Array{SeisHdr, 1}; v::Integer=0)

Write QML to file fname from SHDR.

If fname exists, and is QuakeML, SeisIO appends the existing XML. If the file exists, but is NOT QuakeML, an error is thrown; the file isn’t overwritten.

write_qml(fname, SHDR::SeisHdr, SSRC::SeisSrc; v::Integer=0) write_qml(fname, SHDR::Array{SeisHdr,1}, SSRC::Array{SeisSrc,1}; v::Integer=0)

Write QML to file fname from SHDR and SSRC.

Warning: To write data from SeisSrc structure R in array SSRC, it must be true that R.eid == H.id for some H in array SHDR.


SAC

writesac(S::GphysData, chans=CC, v=V)
writesac(C::GphysChannel; chans=CC, fname=FF, v=V)

Write SAC data to SAC files with auto-generated names. With any GphysChannel subtype, specifying fname=FF sets the filename to FF.

writesacpz(pzf, S[, chans=CC])

Write fields from SeisIO structure S to SACPZ file pzf. Specify which channels to write in a GphysDaya structure with chans=CC.

SeisIO Native

wseis(fname, S)
wseis(fname, S, T, U...)

Write SeisIO data to file fname. Multiple objects can be written at once.

Station XML

write_sxml(fname, S[, chans=CC])

Write station XML from the fields of S to file fname. Specify channel numbers to write in a GphysData object with chans=CC.

Use keyword chans=Cha to restrict station XML write to Cha. This keyword can accept an Integer, UnitRange, or Array{Int64,1} argument.

Data Processing

Supported processing operations are described below.

In most cases, a “safe” version of each function can be invoked to create a new object with the processed output.

Any function that can logically operate on a single-channel object will do so. Any function that operates on a SeisData object can be applied to the :data field of a SeisEvent object.

Basic Operations

These functions have no keywords that fundamentally change their behavior.

Remove the mean from all channels i with S.fs[i] > 0.0. Specify irr=true to also remove the mean from irregularly sampled channels. Specify chans=CC to restrict to channel number(s) CC. Ignores NaNs.

Remove the polynomial trend of degree n from every regularly-sampled channel in S using a least-squares polynomial fit. Specify chans=CC to restrict to channel number(s) CC. Ignores NaNs.

Compute the envelope of channel data in S. Only affects regularly-sampled channels.

nanfill!(S)

For each channel i in S, replace all NaNs in S.x[i] with the mean of non-NaN values.

resample!(S::GphysData [, chans=CC, fs=FS])
resample!(C::SeisChannel, fs::Float64)

Resample data in S to FS Hz. If keyword fs is not specified, data are resampled to the lowest non-zero value in S.fs[CC]. Note that a poor choice of fs can lead to upsampling and other bad behaviors.

Use keyword chans=CC to only resample channel numbers CC. By default, all channels i with S.fs[i] > 0.0 are resampled.

unscale!(S[, chans=CC, irr=false])

Divide the gains from all channels i with S.fs[i] > 0.0. Specify chans=CC to restrict to channel number(s) CC. Specify irr=true to also remove gains of irregularly-sampled channels.

Customizable Operations

Convert Seismograms

Seismograms can be converted to or from displacement, velocity, or acceleration using convert_seis:

convert_seis!(S[, chans=CC, units_out=UU, v=V])
convert_seis!(C[, units_out=UU, v=V])

Converts all seismic data channels in S to velocity seismograms, differentiating or integrating as needed.

Keywords
  • units_out=UU: specify output units: “m”, “m/s” (default), or “m/s²”
  • chans=CC: restrict seismogram conversion to seismic data channels in CC
  • v=V: verbosity
Behavior and Usage Warnings

Long Seismograms: convert_seis becomes less reversible as seismograms lengthen, especially at Float32 precision, due to loss of significance. At single (Float32) precision, seismograms with N ~ 10^6 samples are reconstructable after one conversion (e.g. “m” ==> “m/s” can be reversed, with output approximately equal to the original data). After multiple conversions (i.e., “m” ==> “m/s²” or “m/s²” ==> “m”), Float32 data cannot be perfectly reconstructed in this way, though reconstruction errors are typically small.

Rectangular Integration: integration is always rectangular; irregularly-spaced seismic data are not processed by convert_seis. Summation uses an in-place variant of Kahan-Babuška-Neumaier summation.


Fill Gaps

ungap!(S[, chans=CC, m=true, tap=false])
ungap!(C[, m=true, tap=false])

Fill time gaps in each channel with the mean of the channel data.

Keywords
  • chans=CC: only ungap channels CC.
  • m=false: this flag fills gaps with NaNs instead of the mean.
  • tap=true: taper data before filling gaps.

Merge

merge!(S::GphysData, U::GphysData)

Merge two GphysData structures. For timeseries data, a single-pass merge-and-prune operation is applied to value pairs whose sample times are separated by less than half the sampling interval.

merge!(S::GphysData)

“Flatten” a GphysData structure by merging data from identical channels.

Merge Behavior
Which channels merge?
  • Channels merge if they have identical values for :id, :fs, :loc, :resp, and :units.
  • An unset :loc, :resp, or :units field matches any set value in the corresponding field of another channel.
What happens to merged fields?
  • The essential properties above are preserved.
  • Other fields are combined.
  • Merged channels with different :name values use the name of the channel with the latest data before the merge; other names are logged to :notes.
What does merge! resolve?
Issue Resolution
Empty channels Delete
Duplicated channels Delete duplicate channels
Duplicated windows in channel(s) Delete duplicate windows
Multiple channels, same properties(a) Merge to a single channel
Channel with out-of-order time windows Sort in chronological order
Overlapping windows, identical data, time-aligned Windows merged
Overlapping windows, identical data, small time offset(a) Time offset corrected, windows merged
Overlapping windows, non-identical data Samples averaged, windows merged

(a) “Properties” here are :id, :fs, :loc, :resp, and :units.
(b) Data offset >4 sample intervals are treated as overlapping and non-identical.

When SeisIO Won’t Merge

SeisIO does not combine data channels if any of the five fields above are non-empty and different. For example, if a GphysData object S contains two channels, each with id “XX.FOO..BHZ”, but one has fs=100 Hz and the other fs=50 Hz, merge! does nothing.

It’s best to merge only unprocessed data. Data segments that were processed independently (e.g. detrended) will be averaged pointwise when merged, which can easily leave data in an unusuable state.

mseis!(S::GphysData, U::GphysData, ...)

Merge multiple GphysData structures into S.


Seismic Instrument Response

translate_resp!(S, resp_new[, chans=CC, wl=g])
translate_resp!(Ch, resp_new[, wl=g])

Translate the instrument response of seismic data channels to resp_new. Replaces field :resp with resp_new for all affected channels.

remove_resp!(S, chans=CC, wl=g])
remove_resp!(Ch, wl=g])

Remove (flatten to DC) the instrument response of Ch, or of seismic data channels CC in S. Replaces :resp with the appropriate (all-pass) response.

Keywords
  • C=cha restricts response translation for GphysData object S to channel(s) cha. Accepts an Integer, UnitRange, or Array{Int64,1} argument; does not accept string IDs. By default, all seismic data channels in S have their responses translated to resp_new.
  • wl=g sets the waterlevel to g (default: g = eps(Float32) ~ 1.1f-7). The waterlevel is the minimum magnitude (absolute value) of the normalized old frequency response; in other words, if the old frequency response has a maximum magnitude of 1.0, then no response coefficient can be lower than g. This is useful to prevent “divide by zero” errors, but setting it too high will cause errors.
Precision and Memory Optimization

To optimize speed and memory use, instrument response translation maps data to Complex{Float32} before translation; thus, with Float64 data, there can be minor rounding errors.

Instrument responses are also memory-intensive. The minimum memory consumption to translate the response of a gapless Float32 SeisChannel object is ~7x the size of the object itself.

More precisely, for an object S (of Type <: GphysData or GphysChannel), translation requires memory ~ 2 kB + the greater of (7x the size of the longest Float32 segment, or 3.5x the size of the longest Float64 segment). Translation uses four vectors – three complex and one real – that are updated and dynamically resized as the algorithm loops over each segment:

  • Old response container: Array{Complex{Float32,1}}(undef, Nx)
  • New response container: Array{Complex{Float32,1}}(undef, Nx)
  • Complex data container: Array{Complex{Float32,1}}(undef, Nx)
  • Real frequencies for FFT: Array{Float32,1}(undef, Nx)

…where Nx is the number of samples in the longest segment in S.

Causality

Response translation adds no additional processing to guarantee causality. At a minimum, most users will want to call detrend! and taper! before translating instrument responses.


Synchronize

sync!(S::GphysData[, s=ST, t=EN, v=VV])

Synchronize all data in S to start at ST and terminate at EN with verbosity level VV.

For regularly-sampled channels, gaps between the specified and true times are filled with the mean; this isn’t possible with irregularly-sampled data.

Specifying start time (s)
  • s=”last”: (Default) sync to the last start time of any channel in S.
  • s=”first”: sync to the first start time of any channel in S.
  • A numeric value is treated as an epoch time (?time for details).
  • A DateTime is treated as a DateTime. (see Dates.DateTime for details.)
  • Any string other than “last” or “first” is parsed as a DateTime.
Specifying end time (t)
  • t=”none”: (Default) end times are not synchronized.
  • t=”last”: synchronize all channels to end at the last end time in S.
  • t=”first” synchronize to the first end time in S.
  • numeric, datetime, and non-reserved strings are treated as for -s.

Taper

taper!(S[, keywords])

Cosine taper each channel in S around time gaps. Specify chans=CC to restrict to channel number(s) CC. Does not modify irregularly-sampled data channels.

taper!(C[, keywords])

Cosine taper each segment of time-series data in GphysChannel object C that contains at least N_min total samples. Returns if C is irregularly sampled.

Keywords
  • chans: Only taper the specified channels.
  • N_min: Data segments with N < N_min total samples are not tapered.
  • t_max: Maximum taper edge in seconds.
  • α: Taper edge area; as for a Tukey window, the first and last 100*:math:alpha`% of samples in each window are tapered, up to `t_max seconds of data.

Zero-Phase Filter

filtfilt!(S::GphysData[; KWs])

Apply a zero-phase filter to regularly-sampled data in S. Irregularly-sampled data are never processed by filtfilt!.

filtfilt!(C::SeisChannel[; KWs])

Apply zero-phase filter to C.x. Filtering is applied to each contiguous data segment in C separately.

Keywords
KW Default Type Description
chans [] (a) channel numbers to filter
fl 1.0 Float64 lower corner frequency [Hz] (b)
fh 15.0 Float64 upper corner frequency [Hz] (b)
np 4 Int64 number of poles
rp 10 Int64 pass-band ripple (dB)
rs 30 Int64 stop-band ripple (dB)
rt “Bandpass” String response type (type of filter)
dm “Butterworth” String design mode (name of filter)
(a) Allowed types are Integer, UnitRange, and Array{Int64, 1}.
(b) By convention, the lower corner frequency (fl) is used in a
Highpass filter, and fh is used in a Lowpass filter.

Default filtering KW values can be changed by adjusting the shared keywords, e.g., SeisIO.KW.Filt.np = 2 changes the default number of poles to 2.

Troubleshooting NaNs in Output

NaNs in the output of filtering operations (e.g., filtfilt!, translate_resp!) are nearly always the result of zeros in the denominator of the filter transfer function.

This is not a bug in SeisIO.

In particular, the increased speed of data processing at 32-bit precision comes with an increased risk of NaN output. The reason is that 32-bit machine epsilon (eps(Float32) in Julia) is ~1.0e-7; by comparison, 64-bit machine epsilon is ~1.0e-16.

Please check for common signal processing issues before reporting NaNs to SeisIO maintainers. For example:

Filtering
  • Is the pass band too narrow?
  • Is the lower corner frequency too close to DC?
  • Is the filter order (or number of poles) too high?
Instrument Responses
  • Are the roll-off frequencies of the old and new responses too far apart?
  • Is the water level appropriate for the data scaling?
Suggested References
  1. Oppenheim, A.V., Buck, J.R. and Schafer, R.W., 2009. Discrete-time signal processing (3rd edition). Upper Saddle River, NJ, USA: Prentice Hall.
  2. Orfanidis, S.J., 1995. Introduction to signal processing. Upper Saddle River, NJ, USA: Prentice Hall.

Submodules

Name Purpose
ASCII ASCII file formats (includes GeoCSV, SLIST, and variants)
FastIO Replacement low-level I/O functions to avoid thread locking
Quake Earthquake seismology
RandSeis Generate SeisIO structures with quasi-random entries
SEED Standard for the Exchange of Earthquake Data (SEED) file format
SUDS Seismic Unified Data System (SUDS) file format
SeisHDF Dedicated support for seismic HDF5 subformats
UW University of Washington data format

Using Submodules

At the Julia prompt, type using SeisIO.NNNN where NNNN is the submodule name; for example, using SeisIO.Quake loads the Quake submodule.

RandSeis

This submodule is used to quickly generate SeisIO objects with quasi-random field contents. Access it with “using SeisIO.RandSeis”

The following are true of all random data objects generated by the RandSeis module:

  • Channels have SEED-compliant IDs, sampling frequencies, and data types.
  • Random junk fills :notes and :misc.
  • Sampling frequency (:fs) is chosen from a set of common values.
  • Channel data are randomly generated.
  • Time gaps are automatically inserted into regularly-sampled data.
randPhaseCat()

Generate a random seismic phase catalog suitable for testing EventChannel, EventTraceData, and SeisEvent objects.

randSeisChannel([c=false, s=false])

Generate a SeisChannel of random data. Specify c=true for campaign-style (irregularly-sampled) data (fs = 0.0); specify s=true to guarantee seismic data. s=true overrides c=true.

randSeisData([c=0.2, s=0.6])
randSeisData(N[, c=0.2, s=0.6])

Generate 8 to 24 channels of random seismic data as a SeisData object. Specify N for a fixed number of channels.

  • 100*c% of channels will have irregularly-sampled data (fs = 0.0)
  • 100*s% of channels are guaranteed to have seismic data.


randSeisEvent([c=0.2, s=0.6])
randSeisEvent(N[, c=0.2, s=0.6])

Generate a SeisEvent structure filled with random values. Specify N for a fixed number of channels.

  • 100*c% of channels will have irregularly-sampled data (fs = 0.0)
  • 100*s% of channels are guaranteed to have seismic data.


randSeisHdr()

Generate a SeisHdr structure filled with random values.

randSeisSrc()

Generate a SeisSrc structure filled with random values.

SEED

Submodule for the Standard for the Exchange of Earthquake Data (SEED) file format; includes additional functionality.

dataless_support()

Dump status of dataless SEED blockette support to stdout.

mseed_support()

Dump status of mini-SEED blockette support to stdout.

seed_support()

Dump status of SEED blockette support to stdout.

Scanning SEED Volumes

scan_seed(fname)

Scan a single SEED file and report on the contents. Much faster than read_data as no samples are decoded/read and most blockettes are skipped. Control output behavior with keywords.

Supported Keywords
KW Type Default Meaning
memmap Bool false memory-map file?
npts Bool true output samples per channel?
ngaps Bool true output time gaps per channel?
nfs Bool true output number of fs vals per channel?
quiet Bool false true prints nothing to stdout
seg_times Bool false output exact gap times?
fs_times Bool false output exact times of fs changes?
v Integer 0 verbosity

Note that seg_times and fs_times dump verbose per-channel tabulation to stdout.

Users are encourage to submit feature request Issues if there’s a need to scan for other changes within a SEED volume.

Interaction with Online Requests

scan_seed cannot interact directly with online SEED requests. As a workaround, do get_data(..., w=true) to dump the raw request directly to disk, then scan the file(s) created.

UW

The UW submodule extends functionality for the University of Washington (UW) file format(s).

The UW data format was created in the 1970s by the Pacific Northwest Seismic Network (PNSN), USA, for event archival. It remained in use through the 1990s. A UW event is described by a pickfile and a corresponding data file, whose filenames were identical except for the last character. The data file is self-contained; the pick file is not required to read raw trace data. However, station locations were stored in an external text file.

Only UW-2 data files are supported by SeisIO. We have only seen UW-1 data files in Exabyte tapes from the 1980s.

uwpf(pf[, v])

Read UW-format seismic pick file pf. Returns a tuple of (SeisHdr, SeisSrc).

uwpf!(W, pf[, v::Int64=KW.v])

Read UW-format seismic pick info from pickfile f into SeisEvent object W. Overwrites W.source and W.hdr with pickfile information. Keyword v controls verbosity.

Nodal

The Nodal submodule is intended to handle data from nodal arrays. Nodal arrays differ from standard seismic data in that the start and end times of data segments are usually synchronized.

Reading Nodal Data Files

S = read_nodal(fmt, fname [, KWs])

Read data in file format fmt from file fname into memory. Returns a NodalData object.

Supported Keywords

KW Type Default Meaning
chans ChanSpec Int64[] channels to read
nn String N0 network subfield in :id
s TimeSpec 0001-01-01T00:00:00 start time
t TimeSpec 9999-12-31T12:59:59 end time
v Integer 0 verbosity
Non-Standard Behavior

Real values supplied to keywords s= and t= are treated as seconds relative to file begin time. Most SeisIO functions that accept TimeSpec arguments treat Real values as seconds relative to now().

Supported File Formats

File Format String Notes
Silixa TDMS silixa Limited support; see below
SEG Y segy Field values are different from read_data output
Silixa TDMS Support Status
  • Currently only reads file header and samples from first block
  • Not yet supported (test files needed):
    • first block additional samples
    • second block
    • second block additional samples
  • Awaiting manufacturer clarification:
    • parameters in :info
    • position along cable; currently loc.(x,y,z) = 0.0 for all channels
    • frequency response; currently :resp is an all-pass placeholder
Nodal SEG Y Support Status

See SEG Y Support.

Working with NodalData objects

NodalData objects have one major structural difference from SeisData objects: the usual data field :x is a set of views to an Array{Float32, 2} (equivalent to a Matrix{Float32}) stored in field :data. This allows the user to apply two-dimensional data processing operations directly to the data matrix.

NodalData Assumptions

  • S.t[i] is the same for all i.
  • S.fs[i] is constant for all i.
  • length(S.x[i]) is constant for all i.

Other Differences from SeisData objects

  • Operations like push! and append! must regenerate :data using hcat(), and therefore consume a lot of memory.
  • Attempting to push! or append! channels of unequal length throws an error.
  • Attempting to push! or append! same-length channels with different :t or :fs won’t synchronize them! You will instead have columns in :data that aren’t time-aligned.
  • Irregularly-sampled data (:fs = 0.0) are not supported.

Types

See docstrings for field names and descriptions.

NodalLoc()

Nodal location. Currently only stores position along optical cable.

NodalData()

Structure to hold nodal array data. Similar to a SeisData object.

NodalChannel()

A single channel of data from a nodal array. Similar to a SeisChannel object.

Quake

The Quake submodule was introduced in SeisIO v0.3.0 to isolate handling of discrete earthquake events from handling of continuous geophysical data. While the channel data are similar, fully describing an earthquake event requires many additional Types (objects) and more information (fields) in channel descriptors.

Types

See Type help text for field descriptions and SeisIO behavior.

EQMag()

Earthquake magnitude object.

EQLoc()

Structure to hold computed earthquake location data.

EventChannel()

A single channel of trace data (digital seismograms) associated with a discrete event (earthquake).

EventTraceData()

A custom structure designed to describe trace data (digital seismograms) associated with a discrete event (earthquake).

PhaseCat()

A seismic phase catalog is a dictionary with phase names for keys (e.g. “pP”, “PKP”) and SeisPha objects for values.

SeisEvent()

A compound Type comprising a SeisHdr (event header), SeisSrc (source process), and EventTraceData (digital seismograms.)

SeisHdr()

Earthquake event header object.

SeisPha()

A description of a seismic phase measured on a data channel.

SeisSrc()

Seismic source process description.

SourceTime()

QuakeML-compliant seismic source-time parameterization.

Web Queries

Keyword descriptions for web queries appear at the end of this section.

FDSNevq(ot)

Event header query. Multi-server query for the event(s) with origin time(s) closest to ot. Returns a tuple consisting of an Array{SeisHdr,1} and an Array{SeisSrc,1}, so that the i`th entry of each array describes the header and source process of event `i.

Keywords: evw, mag, nev, rad, reg, src, to, v

Notes

  • Specify ot as a string formatted YYYY-MM-DDThh:mm:ss in UTC (e.g. “2001-02-08T18:54:32”).
  • Incomplete string queries are read to the nearest fully-specified time constraint; thus, FDSNevq(“2001-02-08”) returns the nearest event to 2001-02-08T00:00:00.
  • If no event is found in the specified search window, FDSNevq exits with an error.
  • For FDSNevq, keyword src can be a comma-delineated list of sources, provided each has a value in ?seis_www; for example, src="IRIS, INGV, NCEDC" is valid.
FDSNevt(ot::String, chans::String)

Get header and trace data for the event closest to origin time ot on channels chans. Returns a SeisEvent structure.

Keywords: evw, fmt, len, mag, model, nd, opts, pha, rad, reg, src, to, v, w

Notes

  • Specify ot as a string formatted YYYY-MM-DDThh:mm:ss in UTC (e.g. “2001-02-08T18:54:32”).
  • Incomplete string queries are read to the nearest fully-specified time constraint; thus, FDSNevq(“2001-02-08”) returns the nearest event to 2001-02-08T00:00:00.
  • If no event is found in the specified search window, FDSNevt exits with an error.
  • Unlike FDSNevq, number of events cannot be specified and src must be a single source String in ?seis_www.


get_pha!(S::Data[, keywords])

Command-line interface to IRIS online travel time calculator, which calls TauP. Returns a matrix of strings.

Keywords: pha, model, to, v

References

  1. TauP manual: http://www.seis.sc.edu/downloads/TauP/taup.pdf
  2. Crotwell, H. P., Owens, T. J., & Ritsema, J. (1999). The TauP Toolkit: Flexible seismic travel-time and ray-path utilities, SRL 70(2), 154-160.

Web Query Keywords

KW Default T [1] Meaning
evw [600.0, 600.0] A{F,1} search window in seconds [2]
fmt “miniseed” S request data format
len 120.0 I desired trace length [s]
mag [6.0, 9.9] A{F,1} magnitude range for queries
model “iasp91” S Earth velocity model for phase times
nd 1 I number of days per subrequest
nev 0 I number of events returned per query [3]
opts “” S user-specified options [4]
pha “P” S phases to get [5]
rad [] A{F,1} radial search region [6]
reg [] A{F,1} rectangular search region [7]
src “IRIS” S data source; type ?seis_www for list
to 30 I read timeout for web requests [s]
v 0 I verbosity
w false B write requests to disk? [8]

Table Footnotes

[1]Types: A = Array, B = Boolean, C = Char, DT = DateTime, F = Float, I = Integer, S = String, U8 = Unsigned 8-bit integer (UInt8)
[2]search range is always ot-|evw[1]| t ot+|evw[2]|
[3]nev=0 returns all events in the query
[4]String is passed as-is, e.g. “szsrecs=true&repo=realtime” for FDSN. String should not begin with an ampersand.
[5]Comma-separated String, like “P, pP”; use “ttall” for all phases
[6]Specify region [center_lat, center_lon, min_radius, max_radius, dep_min, dep_max], with lat, lon, and radius in decimal degrees (°) and depth in km with + = down. Depths are only used for earthquake searches.
[7]Specify region [lat_min, lat_max, lon_min, lon_max, dep_min, dep_max], with lat, lon in decimal degrees (°) and depth in km with + = down. Depths are only used for earthquake searches.
[8]If w=true, a file name is automatically generated from the request parameters, in addition to parsing data to a SeisData structure. Files are created from the raw download even if data processing fails, in contrast to get_data(… wsac=true).

Example

Get seismic and strainmeter records for the P-wave of the Tohoku-Oki great earthquake on two borehole stations and write to native SeisData format:

S = FDSNevt("201103110547", "PB.B004..EH?,PB.B004..BS?,PB.B001..BS?,PB.B001..EH?")
wseis("201103110547_evt.seis", S)

Utility Functions

distaz!(Ev::SeisEvent)

Compute distance, azimuth, and backazimuth by the Haversine formula. Overwrites Ev.data.dist, Ev.data.az, and Ev.data.baz.

gcdist([lat_src, lon_src, ]rec)

Compute great circle distance, azimuth, and backazimuth from a single source with coordinates [s_lat, s_lon] to receivers rec with coordinates [r_lat r_lon] in each row.

show_phases(P::PhaseCat)

Formatted display of seismic phases in dictionary P.

fill_sac_evh!(Ev::SeisEvent, fname[; k=N])

Fill (overwrite) values in Ev.hdr with data from SAC file fname. Keyword k=i specifies the reference channel i from which the absolute origin time Ev.hdr.ot is set. Potentially affects header fields :id, :loc (subfields .lat, .lon, .dep only), and :ot.

Reading Earthquake Data Files

S = read_quake(fmt::String, filename [, KWs])
Read data in file fmt from file filename into memory.

fmt
Case-sensitive string describing the file format. See below.

KWs
Keyword arguments; see also SeisIO standard KWs or type ?SeisIO.KW.
Standard keywords: full, nx_add, nx_new, v
Other keywords: See below.

Supported File Formats

File Format String Notes
PC-SUDS suds  
QuakeML qml, quakeml only reads first event from file
UW uw  

Supported Keywords

KW Used By Type Default Meaning
full suds, uw Bool false read full header into :misc?
v all Integer 0 verbosity

QuakeML

read_qml(fpat::String)

Read QuakeML files matching string pattern fpat. Returns a tuple containing an array of SeisHdr objects H and an array of SeisSrc objects R. Each pair (H[i], R[i]) describes the preferred location (origin, SeisHdr) and event source (focal mechanism or moment tensor, SeisSrc) of event i.

If multiple focal mechanisms, locations, or magnitudes are present in a single Event element of the XML file(s), the following rules are used to select one of each per event:

FocalMechanism
1. preferredFocalMechanismID if present
2. Solution with best-fitting moment tensor
3. First FocalMechanism element

Magnitude
1. preferredMagnitudeID if present
2. Magnitude whose ID matches MomentTensor/derivedOriginID
3. Last moment magnitude (lowercase scale name begins with “mw”)
4. First Magnitude element

Origin
1. preferredOriginID if present
2. derivedOriginID from the chosen MomentTensor element
3. First Origin element

Non-essential QuakeML data are saved to misc in each SeisHdr or SeisSrc object as appropriate.

write_qml(fname, Ev::SeisEvent; v::Integer=0)

See writing.

SeisHDF

This submodule contains dedicated support for seismic subformats of the HDF5 file format.

Additional Functions

asdf_waux(fname, path, X)

Write X to AuxiliaryData/path in file fname. If an object already exists at AuxiliaryData/path, it will be deleted and overwritten with X.

asdf_rqml(fpat)

Read QuakeXML (qml) from ASDF file(s) matching file string pattern fpat. Returns:

  • H, Array{SeisHdr,1}
  • R, Array{SeisSrc,1}
asdf_wqml(fname, H, R[, keywords])
asdf_wqml(fname, EV[, keywords])

Write to ASDF “QuakeML ” group in file fname. In the above function calls, H can be a SeisHdr or Array{SeisHdr, 1}; R can be a SeisSource or Array{SeisSource, 1}; EV can be a SeisEvent or Array{SeisEvent, 1}.

Supported Keywords

KW Type Default Meaning
ovr Bool false Overwrite data in existing traces?
v Integer 0 verbosity
read_asdf_evt(filestr, event_id[, keywords])
read_asdf_evt(filestr[, keywords])

Read data in seismic HDF5 format with ids matching event_id from files matching pattern filestr. Returns an array of SeisEvent structures. With only one input argument, all event IDs are read.

Keywords:

  • msr: (Bool) read full (MultiStageResp) instrument response?
  • v: (Integer) verbosity level
scan_hdf5(h5file)
scan_hdf5(hdf, level="trace")

Scan HDF5 archive h5file and return station names with waveform data contained therein as a list of strings formatted “nn.sssss” (network.station).

Set level=”trace” to return channel names with waveform data as a list of strings formatted “nn.sssss.ll.ccc” (network.station.location.channel).

Appendices

Time Syntax

Functions that allow time specification use two reserved keywords or arguments to track time:

  • s: Start (begin) time
  • t: Termination (end) time

Specify each as a DateTime, Real, or String.

  • Real numbers are interpreted as seconds. Special behavior is invoked when both s and t are of Type Real.
  • DateTime values should follow Julia documentation
  • Strings have the expected format spec YYYY-MM-DDThh:mm:ss.ssssss
    • Fractional second is optional and accepts up to 6 decimal places (μs)
    • Incomplete time Strings treat missing fields as 0.
    • Example: s=”2016-03-23T11:17:00.333”

It isn’t necessary to choose values so that st. The two values are always sorted, so that t < s doesn’t error.

Time Types and Behavior

typeof(s) typeof(t) Behavior
DateTime DateTime convert to String, then sort
DateTime Real add t seconds to s, convert to String, then sort
DateTime String convert s to String, then sort
Real DateTime add s seconds to t, convert to String, then sort
Real Real treat as relative (see below), convert to String, sort
Real String add s seconds to t, convert to String, then sort
String DateTime convert t to String, then sort
String Real add t seconds to s, convert to String, then sort
String String sort
Special Behavior with two Real arguments

If s and t are both Real numbers, they’re treated as seconds measured relative to the start of the current minute. This convention may seem unusual, but it greatly simplifies web requests; for example, specifying s=-1200.0, t=0.0 is a convenient shorthand for “the last 20 minutes of data”.

Data Requests Syntax

Channel ID Syntax

NN.SSSSS.LL.CC (net.sta.loc.cha, separated by periods) is the expected syntax for all web functions. The maximum field width in characters corresponds to the length of each field (e.g. 2 for network). Fields can’t contain whitespace.

NN.SSSSS.LL.CC.T (net.sta.loc.cha.tflag) is allowed in SeedLink. T is a single-character data type flag and must be one of DECOTL: Data, Event, Calibration, blOckette, Timing, or Logs. Calibration, timing, and logs are not in the scope of SeisIO and may crash SeedLink sessions.

The table below specifies valid types and expected syntax for channel lists.

Type Description Example
String Comma-delineated list of IDs “PB.B004.01.BS1, PB.B002.01.BS1”
Array{String,1} String array, one ID string per entry [“PB.B004.01.BS1”, “PB.B002.01.BS1”]
Array{String,2} String array, one set of IDs per row [“PB” “B004” “01” “BS1”;
    “PB” “B002” “01” “BS1”]

The expected component order is always network, station, location, channel; thus, “UW.TDH..EHZ” is OK, but “UW.TDH.EHZ” fails.

chanspec()

Type ?chanspec in Julia to print the above info. to stdout.

Wildcards and Blanks

Allowed wildcards are client-specific.

  • The LOC field can be left blank in any client: "UW.ELK..EHZ" and ["UW" "ELK" "" "EHZ"] are all valid. Blank LOC fields are set to -- in IRIS, * in FDSN, and ?? in SeedLink.
  • ? acts as a single-character wildcard in FDSN & SeedLink. Thus, CC.VALT..??? is valid.
  • * acts as a multi-character wildcard in FDSN. Thus, CC.VALT..* and CC.VALT..??? behave identically in FDSN.
  • Partial specifiers are OK, but a network and station are always required: "UW.EL?" is OK, ".ELK.." fails.
Channel Configuration Files

One entry per line, ASCII text, format NN.SSSSS.LL.CCC.D. Due to client-specific wildcard rules, the most versatile configuration files are those that specify each channel most completely:

# This only works with SeedLink
GE.ISP..BH?.D
NL.HGN
MN.AQU..BH?
MN.AQU..HH?
UW.KMO
CC.VALT..BH?.D

# This works with FDSN and SeedLink, but not IRIS
GE.ISP..BH?
NL.HGN
MN.AQU..BH?
MN.AQU..HH?
UW.KMO
CC.VALT..BH?

# This works with all three:
GE.ISP..BHZ
GE.ISP..BHN
GE.ISP..BHE
MN.AQU..BHZ
MN.AQU..BHN
MN.AQU..BHE
MN.AQU..HHZ
MN.AQU..HHN
MN.AQU..HHE
UW.KMO..EHZ
CC.VALT..BHZ
CC.VALT..BHN
CC.VALT..BHE

SeisIO Standard Keywords

SeisIO.KW is a memory-resident structure of default values for common keywords used by package functions. KW has one substructure, SL, with keywords specific to SeedLink. These defaults can be modified, e.g., SeisIO.KW.nev=2 changes the default for nev to 2.

KW Default T [1] Meaning
comp 0x00 U8 compress data on write? [2]
fmt “miniseed” S request data format [3]
mag [6.0, 9.9] A{F,1} magnitude range for queries
n_zip 100000 I compress if length(:x) > n_zip
nd 1 I number of days per subrequest
nev 1 I number of events returned per query
nx_add 360000 I length increase of undersized data array
nx_new 8640000 I number of samples for a new channel
opts “” S user-specified options [4]
prune true B call prune! after get_data?
rad [] A{F,1} radial search region [5]
reg [] A{F,1} rectangular search region [6]
si true B autofill station info on data req? [7]
src “IRIS” S data source; type ?seis_www for list
to 30 I read timeout for web requests (s)
v 0 I verbosity
w false B write requests to disk? [8]
y false B sync data after web request?

Table Footnotes

[1]Types: A = Array, B = Boolean, C = Char, DT = DateTime, F = Float, I = Integer, S = String, U8 = Unsigned 8-bit integer (UInt8)
[2]If KW.comp == 0x00, never compress data; if KW.comp == 0x01, only compress channel i if length(S.x[i]) > KW.n_zip; if comp == 0x02, always compress data.
[3]Strings have the same names and spellings as file formats in read_data. Note that “sac” in a web request is aliased to “sacbl”, i.e., binary little-endian SAC, to match the native endianness of the Julia language.
[4]String is passed as-is, e.g. “szsrecs=true&repo=realtime” for FDSN. String should not begin with an ampersand.
[5]Specify region [center_lat, center_lon, min_radius, max_radius, dep_min, dep_max], with lat, lon, and radius in decimal degrees (°) and depth in km with + = down. Depths are only used for earthquake searches.
[6]Specify region [lat_min, lat_max, lon_min, lon_max, dep_min, dep_max], with lat, lon in decimal degrees (°) and depth in km with + = down. Depths are only used for earthquake searches.
[7]FDSNWS timeseries only.
[8]If w=true, a file name is automatically generated from the request parameters, in addition to parsing data to a SeisData structure. Files are created from the raw download even if data processing fails, in contrast to get_data(… wsac=true).

Utility Functions

This appendix covers utility functions that belong in no other category.

d2u(DT::DateTime)

Aliased to Dates.datetime2unix.

fctoresp(fc, c)

Generate a generic PZResp object for a geophone with critical frequency fc and damping constant c. If no damping constant is specified, assumes c = 1/sqrt(2).

find_regex(path::String, r::Regex)

OS-agnostic equivalent to Linux find. First argument is a path string, second is a Regex. File strings are postprocessed using Julia’s native PCRE Regex engine. By design, find_regex only returns file names.

getbandcode(fs, fc=FC)

Get SEED-compliant one-character band code corresponding to instrument sample rate fs and corner frequency FC. If unset, FC is assumed to be 1 Hz.

get_file_ver(fname::String)

Get the version of a SeisIO native format file.

get_seis_channels(S::GphysData)

Get numeric indices of channels in S whose instrument codes indicate seismic data.

guess(fname::String)

Attempt to guess data file format and endianness using known binary file markers.

inst_code(C::GphysChannel)
inst_code(S::GphysData, i::Int64)
inst_code(S::GphysData)

Get instrument codes.

ls(s::String)

Similar functionality to Bash ls with OS-agnostic output. Accepts wildcards in paths and file names. * Always returns the full path and file name. * Partial file name wildcards (e.g. “ls(data/2006*.sac)) invoke glob. * Path wildcards (e.g. ls(/data/*/*.sac)) invoke find_regex to circumvent glob limitations. * Passing ony “*” as a filename (e.g. “ls(/home/*)) invokes find_regex to recursively search subdirectories, as in the Bash shell.

ls()

Return full path and file name of files in current working directory.

j2md(y, j)

Convert Julian day j of year y to month, day.

md2j(y, m, d)

Convert month m, day d of year y to Julian day j.

Remove unwanted characters from S.

parsetimewin(s, t)

Convert times s and t to strings \(\alpha, \omega\) sorted \(\alpha < \omega\). s and t can be real numbers, DateTime objects, or ASCII strings. Expected string format is “yyyy-mm-ddTHH:MM:SS.nnn”, e.g. 2016-03-23T11:17:00.333.

resp_a0!(R::InstrumentResponse)
resp_a0!(S::GphysData)

Update sensitivity :a0 of PZResp/PZResp64 responses.

resptofc(R)

Attempt to guess critical frequency from poles and zeros of a PZResp/PZResp64.

set_file_ver(fname::String)

Sets the SeisIO file version of file fname.

u2d(x)

Alias to Dates.unix2datetime.

validate_units(S::GphysData)

Validate strings in :units field to ensure UCUM compliance.

vucum(str::String)

Check whether str contains valid UCUM units. .. _seisio_file_format:

SeisIO Native Format

Invoking the command wseis writes SeisIO structures to a native data format in little-endian byte order. This page documents the low-level file format. Abbreviations used:

Type Meaning C Fortran 77
Char Unicode character wchar CHARACTER*4
Float32 32-bit float float REAL
Float64 64-bit float double REAL*8
Int8 signed 8-bit int short INTEGER
Int16 signed 16-bit int int INTEGER*2
Int32 signed 32-bit int long INTEGER*4
Int64 signed 64-bit integer long long INTEGER*8
UInt8 unsigned 8-bit int unsigned short CHARACTER
UInt16 unsigned 16-bit int unsigned  
UInt32 unsigned 32-bit int unsigned long  
UInt64 unsigned 64-bit int unsigned long long  
Special instructions:

Parentheses, “()”, denote a custom object Type.
“{ (condition)” denotes the start of a loop; (condition) is the control flow.
“}” denotes the end of a loop.

Note that String in Julia has no exact C equivalence. SeisIO writes each String in two parts: an Int64 (String length in bytes) followed by the String contents (as bytes, equivalent to UInt8). Unlike C/Fortran, there are no issues with strings that contain the null character (0x00 or '\x0').

SeisIO File

Var Meaning T N
  “SEISIO” UInt8 6
V SeisIO file format version Float32 1
J # of SeisIO objects in file UInt32 1
C SeisIO object codes for each object UInt32 J
B Byte indices for each object UInt64 J
{     for i = 1:J
  (Objects) variable J
}      
ID ID hashes UInt64 variable
TS Start times Int64 variable
TE End times Int64 variable
P Parent object index in C and B variable  
bID Byte offset of ID array Int64 1
bTS Byte offset of TS array Int64 1
bTE Byte offset of TE array Int64 1
bP Byte offset of P array Int64 1

ID, TS, and TE are the ID, data start time, and data end time of each channel in each object. P is the index of the parent object in C and B. TS and TE are measured from Unix epoch time (1970-01-01T00:00:00Z) in integer microseconds.

Intent: when seeking data from channel i between times s and t, if hash(i) matches ID[j] and the time windows overlap, retrieve index k = P[j] from NP, seek to byte offset B[k], and read an object of type C[k] from file.

If an archive contains no data objects, ID, TS, TE, and P are empty; equivalently, bID == bTS.

Simple Object Types

Fields of these objects are written in one of three ways: as “plain data” types, such as UInt8 or Float64; as arrays; or as strings.

In a simple object, each array is stored as follows: 1. Int64 number of dimensions (e.g. 2) 2. Int64 array of dimensions themselves (e.g. 2, 2) 3. Array values (e.g. 0.08250153, 0.023121119, 0.6299772, 0.79595184)

EQLoc
Var Type N Meaning
lat Float64 1 latitude
lon Float64 1 longitude
dep Float64 1 depth
dx Float64 1 x-error
dy Float64 1 y-error
dz Float64 1 z-error
dt Float64 1 t-error (error in origin time)
se Float64 1 standard error
rms Float64 1 rms pick error
gap Float64 1 azimuthal gap
dmin Float64 1 minimum source-receiver distance in location
dmax Float64 1 maximum source-receiver distance in location
nst Int64 1 number of stations used to locate earthquake
flags UInt8 1 one-bit flags for special location properties
Ld Int64 1 length of “datum” string in bytes
datum UInt8 Ld Datum string
Lt Int64 1 length of “typ” (event type) string in bytes
typ UInt8 Lt earthquake type string
Li Int64 1 length of “sig” (error significance) string in bytes
sig UInt8 Li earthquake location error significance string
Lr Int64 1 length of “src” (data source) string in bytes
src UInt8 Lr data source string
flag meanings: (0x01 = true, 0x00 = false)
1. x fixed?
2. y fixed?
3. z fixed?
4. t fixed?
In Julia, get the value of flag[n] with >>(<<(flags, n-1), 7).
EQMag
Var Type N Meaning
val Float32 1 magnitude value
gap Float64 1 largest azimuthal gap between stations in magnitude
nst Int64 1 number of stations used in magnitude computation
Lsc Int64 1 length of magnitude scale string
msc UInt8 Lsc magnitude scale string
Lr Int64 1 length of data source string
src UInt8 Lr data source string
SeisPha
Var Type N Meaning
F Float64 8 amplitude, distance, incidence angle, residual,
      ray parameter, takeoff angle, travel time, uncertainty
C Char 2 polarity, quality
SourceTime
Var Type N Meaning
Ld Int64 1 size of descriptive string in bytes
desc UInt8 1 descriptive string
F Float64 3 duration, rise time, decay time
StringVec

A vector of variable-length strings; its exact Type in Julia is Array{String,1}.

StringVec
Var Type N Meaning
ee UInt8 1 is this string vector empty? [9]
L Int64 1 number of strings to read
{     i = 1:L
nb Int64 1 length of string in bytes
str UInt8 nb string
}      
[9]If ee == 0x00, then no values are stored for L, nb, or str.

Location Types

GenLoc
Var Type N Meaning
Ld Int64 1 length of datum string in bytes
datum UInt8 Ld datum string
Ll Int64 1 length of location vector in bytes
loc Float64 Ll location vector
GeoLoc
Var Type N Meaning
Ld Int64 1 length of datum string in bytes
datum UInt8 Ld datum string
F Float64 6 latitude, longitude, elevation,
      depth, azimuth, incidence
UTMLoc
Var Type N Meaning
Ld Int64 1 length of datum string in bytes
datum UInt8 N datum string
zone Int8 1 UTM zone number
hemi Char 1 hemisphere
E UInt64 1 Easting
N UInt64 1 Northing
F Float64 4 elevation, depth, azimuth, incidence
XYLoc
Var Type N Meaning
Ld Int64 1 Length of datum string in bytes
datum UInt8 Ld datum string
F Float64 8 x, y, z, azimuth, incidence, origin x, origin y, origin z

Response Types

GenResp
Var Type N Meaning
Ld Int64 1 length of descriptive string in bytes
desc UInt8 Ld descriptive string
nr Int64 1 Number of rows in complex response matrix
nc Int64 1 Number of columns in complex response matrix
resp Complex{Float64,2} nr*nc complex response matrix
PZResp
Var Type N Meaning
c Float32 1 damping constant
np Int64 1 number of complex poles
p Complex{Float32,1} np complex poles vector
nz Int64 1 number of complex zeros
z Complex{Float32,1} nz complex zeros vector

PZResp64 is identical to PZResp with Float64 values for c, p, z, rather than Float32.

The Misc Dictionary

Most compound objects below contain a dictionary (Dict{String,Any}) for non-essential information in a field named :misc. The tables below describe how this field is written to disk.

Misc
Var Type N Meaning
N Int64 1 number of items in dictionary [10]
K (StringVec) 1 dictionary keys
{     for i = 1:N
c UInt8 1 Type code of object i
o variable 1 object i
}      
[10]If N == 0, then N is the only value present.
Dictionary Contents

These subtables describe how to read the possible data types in a Misc dictionary.

String Array (c == 0x81)
Var Type N Meaning
A (StringVec) 1 string vector
Other Array (c == 0x80 or c > 0x81)
Var Type N Meaning
nd Int64 1 number of dimensions in array
dims Int64 nd array dimensions
arr varies prod(nd) array
String (c == 0x01)
Var Type N Meaning
L Int64 1 size of string in bytes
str UInt8 1 string
Bits Type (c == 0x00 or 0x01 < c < 0x7f)

Read a single value whose Type corresponds to the UInt8 Type code.

Compound Object Types

Each of these objects contains at least one of the above simple object types.

PhaseCat
Var Type N Meaning
N Int64 1 number of SeisPha objects to read [11]
K (StringVec) 1 dictionary keys
pha (SeisPha) N seismic phases
[11]If N == 0, then N is the only value present.
EventChannel

A single channel of data related to a seismic event

Var Type N Meaning
Ni Int64 1 size of id string in bytes
id UInt8 Ni id string
Nn Int64 1 size of name string in bytes
name UInt8 Nn name string
Lt UInt8 1 location Type code
loc (Loc Type) 1 instrument position
fs Float64 1 sampling frequency in Hz
gain Float64 1 scalar gain
Rt UInt8 1 response Type code
resp (Resp Type) 1 instrument response
Nu Int64 1 size of units string in bytes
units UInt8 Nu units string
az Float64 1 azimuth
baz Float64 1 backazimuth
dist Float64 1 source-receiver distance
pha (PhaseCat) 1 phase catalog
Nr Int64 1 size of data source string in bytes
src UInt8 Nr data source string
misc (Misc) 1 dictionary for non-essential information
notes (StringVec) 1 notes and automated logging
Nt Int64 1 length of time gaps matrix
T Int64 2Nt time gaps matrix
Xc UInt8 1 Type code of data vector
Nx Int64 1 number of samples in data vector
X variable NX data vector
SeisChannel

A single channel of univariate geophysical data

Var Type N Meaning
Ni Int64 1 size of id string in bytes
id UInt8 Ni id string
Nn Int64 1 size of name string in bytes
name UInt8 Nn name string
Lt UInt8 1 location Type code
loc (Loc Type) 1 instrument position
fs Float64 1 sampling frequency in Hz
gain Float64 1 scalar gain
Rt UInt8 1 response Type code
resp (Resp Type) 1 instrument response
Nu Int64 1 size of units string in bytes
units UInt8 Nu units string
Nr Int64 1 size of data source string in bytes
src UInt8 Nr data source string
misc (Misc) 1 dictionary for non-essential information
notes (StringVec) 1 notes and automated logging
Nt Int64 1 length of time gaps matrix
T Int64 2Nt time gaps matrix
Xc UInt8 1 Type code of data vector
Nx Int64 1 number of samples in data vector
X variable NX data vector
EventTraceData

A multichannel record of time-series data related to a seismic event.

Var Type N Meaning
N Int64 1 number of data channels
Lc UInt8 N location Type codes for each data channel
Rc UInt8 N response Type codes for each data channel
Xc UInt8 N data Type codes for each data channel
cmp UInt8 1 are data compressed? (0x01 = yes)
Nt Int64 N number of rows in time gaps matrix for each channel
Nx Int64 N length of data vector for each channel [12]
id (StringVec) 1 channel ids
name (StringVec) 1 channel names
loc (Loc Type) N instrument positions
fs Float64 N sampling frequencies of each channel in Hz
gain Float64 N scalar gains of each channel
resp (Resp Type) N instrument responses
units (StringVec) 1 units of each channel’s data
az Float64 N event azimuth
baz Float64 N backazimuths to event
dist Float64 N source-receiver distances
pha (PhaseCat) N phase catalogs for each channel
src (StringVec) 1 data source strings for each channel
misc (Misc) N dictionaries of non-essential information for each channel
notes (StringVec) N notes and automated logging for each channel
{     for i = 1:N
T Int64 2Nt[i] Matrix of time gaps for channel i
}      
{     for i = 1:N
X Xc[i] Nx[i] Data vector i [13]
}      
[12]If cmp == 0x01, each value in Nx is the number of bytes of compressed data to read; otherwise, this is the number of samples in each channel.
[13]If cmp == 0x01, read Nx[i] samples of type UInt8 and pass through lz4 decompression to generate data vector i; else read Nx[i] samples of the type corresponding to code Xc[i].
SeisData

A record containing multiple channels of univariate geophysical data.

Var Type N Meaning
N Int64 1 number of data channels
Lc UInt8 N location Type codes for each data channel
Rc UInt8 N response Type codes for each data channel
Xc UInt8 N data Type codes for each data channel
cmp UInt8 1 are data compressed? (0x01 = yes)
Nt Int64 N number of rows in time gaps matrix for each channel
Nx Int64 N length of data vector for each channel [14]
id (StringVec) 1 channel ids
name (StringVec) 1 channel names
loc (Loc Type) N instrument positions
fs Float64 N sampling frequencies of each channel in Hz
gain Float64 N scalar gains of each channel
resp (Resp Type) N instrument responses
units (StringVec) 1 units of each channel’s data
src (StringVec) 1 data source strings for each channel
misc (Misc) N dictionaries of non-essential information for each channel
notes (StringVec) N notes and automated logging for each channel
{     for i = 1:N
T Int64 2Nt[i] Matrix of time gaps for channel i
}      
{     for i = 1:N
X Xc[i] Nx[i] Data vector i [15]
}      
[14]If cmp == 0x01, each value in Nx is the number of bytes of compressed data to read; otherwise, this is the number of samples in each channel.
[15]If cmp == 0x01, read Nx[i] samples of type UInt8 and pass through lz4 decompression to generate data vector i; else read Nx[i] samples of the type corresponding to code Xc[i].
SeisHdr
Var Type N Meaning
Li Int64 1 length of event ID string
id UInt8 Li event ID string
iv UInt8 1 intensity value
Ls Int64 1 length of intensity scale string
isc UInt8 Ls intensity scale string
loc (EQLoc) 1 earthquake location
mag (EQMag) 1 earthquake magnitude
misc (Misc) 1 dictionary containing non-essential information
notes (StringVec) 1 notes and automated logging
ot Int64 1 origin time [16]
Lr Int64 1 length of data source string
src UInt8 Lr data source string
Lt Int64 1 length of event type string
typ UInt8 Lt event type string
[16]Measured from Unix epoch time (1970-01-01T00:00:00Z) in integer microseconds
SeisSrc
Var Type N Meaning
Li Int64 1 length of source id string
id UInt8 Li id string
Le Int64 1 length of event id string
eid UInt8 Le event id string
m0 Float64 1 scalar moment
Lm Int64 1 length of moment tensor vector
mt Float64 Lm moment tensor vector
Ld Int64 1 length of moment tensor misfit vector
dm Float64 Ld moment tensor misfit vector
np Int64 1 number of polarities
gap Float64 1 max. azimuthal gap
pad Int64 2 dimensions of principal axes matrix
pax Float64 pad[1]*pad[2] principal axes matrix
pld Int64 2 dimensions of nodal planes matrix
planes Float64 pld[1]*pld[2] nodal planes matrix
Lr Int64 1 length of data source string
src UInt8 1 data source string
st (SourceTime) 1 source-time description
misc (Misc) 1 Dictionary containing non-essential information
notes (StringVec) 1 Notes and automated logging
SeisEvent
Var Type N Meaning
hdr (SeisHdr) 1 event header
source (SeisSrc) 1 event source process
data (EventTraceData) 1 event trace data

Data Type Codes

Each Type code is written to disk as a UInt8, with the important exception of SeisIO custom object Type codes (which use UInt32).

Loc Type Codes
UInt8 Type
0x00 GenLoc
0x01 GeoLoc
0x02 UTMLoc
0x03 XYLoc
Resp Type Codes
UInt8 Type
0x00 GenResp
0x01 PZResp
0x02 PZResp64
Other Type Codes

Only the Types below are faithfully preserved in write/read of a :misc field dictionary; other Types are not written to file and can cause wseis to throw errors.

Type UInt8 Type UInt8
Char 0x00 Array{Char,N} 0x80
String 0x01 Array{String,N} 0x81
UInt8 0x10 Array{UInt8,N} 0x90
UInt16 0x11 Array{UInt16,N} 0x91
UInt32 0x12 Array{UInt32,N} 0x92
UInt64 0x13 Array{UInt64,N} 0x93
UInt128 0x14 Array{UInt128,N} 0x94
Int8 0x20 Array{Int8,N} 0xa0
Int16 0x21 Array{Int16,N} 0xa1
Int32 0x22 Array{Int32,N} 0xa2
Int64 0x23 Array{Int64,N} 0xa3
Int128 0x24 Array{Int128,N} 0xa4
Float16 0x30 Array{Float16,N} 0xb0
Float32 0x31 Array{Float32,N} 0xb1
Float64 0x32 Array{Float64,N} 0xb2
Complex{UInt8} 0x50 Array{Complex{UInt8},N} 0xd0
Complex{UInt16} 0x51 Array{Complex{UInt16},N} 0xd1
Complex{UInt32} 0x52 Array{Complex{UInt32},N} 0xd2
Complex{UInt64} 0x53 Array{Complex{UInt64},N} 0xd3
Complex{UInt128} 0x54 Array{Complex{UInt128},N} 0xd4
Complex{Int8} 0x60 Array{Complex{Int8},N} 0xe0
Complex{Int16} 0x61 Array{Complex{Int16},N} 0xe1
Complex{Int32} 0x62 Array{Complex{Int32},N} 0xe2
Complex{Int64} 0x63 Array{Complex{Int64},N} 0xe3
Complex{Int128} 0x64 Array{Complex{Int128},N} 0xe4
Complex{Float16} 0x70 Array{Complex{Float16},N} 0xf0
Complex{Float32} 0x71 Array{Complex{Float32},N} 0xf1
Complex{Float64} 0x72 Array{Complex{Float64},N} 0xf2
SeisIO Object Type codes
UInt32 Code Object Type
0x20474330 EventChannel
0x20474331 SeisChannel
0x20474430 EventTraceData
0x20474431 SeisData
0x20495030 GenLoc
0x20495031 GeoLoc
0x20495032 UTMLoc
0x20495033 XYLoc
0x20495230 GenResp
0x20495231 PZResp64
0x20495232 PZResp
0x20504330 PhaseCat
0x20534530 SeisEvent
0x20534830 SeisHdr
0x20535030 SeisPha
0x20535330 SeisSrc
0x20535430 SourceTime
0x45514c30 EQLoc
0x45514d30 EQMag

File Format Version History

File format versions <0.50 are no longer supported; please email us if you need to read in very old data.
Version Date Change
0.53 2019-09-11 removed :i, :o from CoeffResp
    added :i, :o to MultiStageResp
0.52 2019-09-03 added CoeffResp, MultiStageResp
0.51 2019-08-01 added :f0 to PZResp, PZResp64
0.50 2019-06-05 all custom Types can now use write() directly
    rewrote how :misc is stored
    Type codes for :misc changed
    deprecated BigFloat/BigInt support in :misc
    :n is no longer stored as a UInt32
    :x compression no longer automatic
    :x compression changed from Blosc to lz4