B. The SEISAN waveform file format

The file is written from Fortran as an unformatted file. This means that the file contains additional bytes (not described below, see the end of this Appendix) between blocks, which must be taken into account if the file is read as a binary file. If read as Fortran unformatted, the content appears as described below. However, the internal structure differs between Sun, Linux and PC; SEISAN automatically corrects for these differences. The SEISAN ASCII format has headers identical to those of the binary files, but the samples are written as formatted integers, one channel at a time, just as in the binary format.


 EVENT FILE HEADER
                   
 CONTAINS MINIMUM 12 ASCII STRINGS OF 80 BYTES.
 ALL FORMATS I OR A UNLESS OTHERWISE SPECIFIED.
 
line 1
   1     1: FREE
   2    30: NETWORK NAME,
            COULD E.G. BE WESTERN NORWAY NETWORK
  31    33: NUMBER OF CHANNELS, MAX 999
  34    36: YEAR-1900, e.g. 101 for 2001 (I3)
  37
  38    40: DOY
  41
  42    43: MONTH
  44
  45    46: DAY
  47
  48    49: HR
  50
  51    52: MIN
  53
  54    59: SEC, FORMAT F6.3
  60
  61    69: TOTAL TIME WINDOW (SECS), FORMAT F9.3
  70    80: FREE

 line 2
   1    80: FREE

 line 3
   1
   2     5: STATION CODE (A4), first 4 characters             
   6     7: FIRST TWO COMPONENT CODES (A2), SEED style
   8      : NOT USED
   9      : LAST COMPONENT CODE (A1), SEED style
  10      : STATION CODE (A1), LAST CHARACTER IF 5 CHARACTER STATION CODE
  11    17: START TIME RELATIVE TO EVENT FILE TIME (SECS) F7.2  
  18      : BLANK        
  19    26: STATION DATA INTERVAL LENGTH (SECS)  F8.2         
  27    52: SECOND CHANNEL                                   
  53    78: THIRD  CHANNEL                               
  79    80: BLANK

 line 4-XX, where XX depends on number of channels, however, XX 
            is at least 12 so there might be some blank lines.
   1    80: THREE MORE CHANNELS (SAME FORMAT AS line 3)
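
As a rough illustration of the column layout above, the following Python sketch slices line 1 and the channel slots of lines 3 onward. The function names (parse_main_line1, parse_channel_slots), the dictionary keys and the sample lines are made up for this example and are not part of SEISAN.

```python
def parse_main_line1(line):
    """Slice main header line 1 by column (columns in the text are 1-based)."""
    line = line.ljust(80)
    return {
        "network": line[1:30].strip(),     # columns 2-30
        "nchan": int(line[30:33]),         # columns 31-33
        "year": 1900 + int(line[33:36]),   # columns 34-36, stored as year-1900
        "doy": int(line[37:40]),           # columns 38-40
        "month": int(line[41:43]),         # columns 42-43
        "day": int(line[44:46]),           # columns 45-46
        "hour": int(line[47:49]),          # columns 48-49
        "minute": int(line[50:52]),        # columns 51-52
        "second": float(line[53:59]),      # columns 54-59, F6.3
        "window": float(line[60:69]),      # columns 61-69, F9.3
    }


def parse_channel_slots(line):
    """Return the (up to three) channel entries in one of lines 3..XX."""
    line = line.ljust(80)
    entries = []
    for start in (0, 26, 52):              # slots at columns 1-26, 27-52, 53-78
        slot = line[start:start + 26]
        if not slot.strip():
            continue                       # unused slot
        entries.append({
            "station": (slot[1:5] + slot[9]).strip(),  # columns 2-5 plus 10
            "component": slot[5:7] + slot[8],          # columns 6-7 and 9
            "start": float(slot[10:17]),               # columns 11-17, F7.2
            "length": float(slot[18:26]),              # columns 19-26, F8.2
        })
    return entries


# Made-up sample lines, assembled field by field so the columns line up.
line1 = (" " + "TEST NETWORK".ljust(29) + "%3d%3d" % (3, 101)
         + " %3d %2d %2d %2d %2d %6.3f %9.3f" % (166, 6, 15, 12, 30, 5.25, 120.0))
slot = " " + "BER " + "BH" + " " + "Z" + " " + "%7.2f" % 0.0 + " " + "%8.2f" % 120.0
line3 = slot + slot.replace("Z", "N")

header = parse_main_line1(line1)
channels = parse_channel_slots(line3)
print(header["nchan"], header["year"], header["doy"], channels[1]["component"])
```

Slicing by fixed position rather than splitting on blanks matters here, because adjacent fields such as the number of channels and the year are written without a separating blank.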

 EVENT FILE CHANNEL HEADER
                          
 HEADER IS 1040 BYTES LONG, WRITTEN AS ONE VARIABLE DEFINED AS
 CHARACTER*1040
 THE PARAMETERS ARE WRITTEN FORMATTED WITH INTERNAL WRITE INTO
 1040 BYTE TEXT STRING.
 FORMAT IS ALWAYS I FORMAT UNLESS OTHERWISE SPECIFIED

   1     5: STATION CODE (A5)
   6     7: FIRST TWO COMPONENT CODES (A2), SEED style  
   8      : FIRST LOCATION CODE (A1), SEED style
   9      : LAST COMPONENT CODE (A1), SEED style

  10    12: YEAR - 1900, e.g. 101 for 2001, (I3)
  13      : SECOND LOCATION CODE (A1), SEED style
  14    16: DOY
  17      : FIRST NETWORK CODE (A1), SEED style
  18    19: MONTH
  20      : SECOND NETWORK CODE (A1), SEED style
  21    22: DAY
  23
  24    25: HR
  26
  27    28: MIN
  29      : TIMING INDICATOR, BLANK: TIME IS OK, E: UNCERTAIN TIME
  30    35: SECOND (F6.3)
  36
  37    43: SAMPLE RATE  (F7.2 or any f-format)
  44    50: NUMBER OF SAMPLES (I7)
  51
  52    59: LATITUDE (F8.4), optional
  60
  61    69: LONGITUDE (F9.4), optional
  70
  71    75: ELEVATION (METERS), optional
  76      : Indicate gain factor: Blank: No gain factor, G: Gain factor in 
          column 148 to 159
  77      : 2 OR 4 FOR 2 OR 4 BYTE INTEGER, BLANK IS 2 BYTE
  78      : P: Poles and zeros used for response info, blank: Seismometer
               period etc used for response info. See below for details.
            T: Use up to 30 tabulated values irrespective of what is given
               below. If less than 30, blank characters must be given.
  79      : C: a combination of table, poles and zeros or instrument
               constants have been used, for information only. Value in 78
               must then be T.
            F: Force use of header response, e.g. generated by MULPLT. Only
               gain at 1 hz is correct and 78 must be set to T.
  80      : FREE
  81 - 160: COMMENT LINE DESCRIBING THE SYSTEM RESPONSE (A80)
  148- 159: Normally comment; if 76 is set to G, this is a gain factor,
            format G12.7. All samples read from the channel are multiplied
            by this factor when read by routine seisinc. Used when data
            are stored in units of e.g. nm, where values can be less than 1.
            Currently generated by MULPLT when option OUT is used to
            extract part of a waveform file. Program WAVETOOL will also
            generate these files (used with the Out option). Some conversion
            programs may also write this.
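
The fixed part of the channel header can be decoded by plain column slicing. The Python sketch below is only an illustration under that reading of the table; parse_channel_header and its dictionary keys are hypothetical names, and the sample string reuses the first line of the example header shown later in this appendix.

```python
def parse_channel_header(chead):
    """Decode the first 160 characters of a 1040 character channel header."""
    h = chead.ljust(1040)

    def optional(text):
        # Latitude, longitude and elevation are optional and may be blank.
        text = text.strip()
        return float(text) if text else None

    return {
        "station": h[0:5].strip(),               # columns 1-5
        "component": h[5:7] + h[8],              # columns 6-7 and 9
        "location": (h[7] + h[12]).strip(),      # columns 8 and 13
        "network": (h[16] + h[19]).strip(),      # columns 17 and 20
        "year": 1900 + int(h[9:12]),             # columns 10-12
        "doy": int(h[13:16]),                    # columns 14-16
        "month": int(h[17:19]),                  # columns 18-19
        "day": int(h[20:22]),                    # columns 21-22
        "hour": int(h[23:25]),                   # columns 24-25
        "minute": int(h[26:28]),                 # columns 27-28
        "uncertain_time": h[28] == "E",          # column 29
        "second": float(h[29:35]),               # columns 30-35
        "rate": float(h[36:43]),                 # columns 37-43
        "nsamp": int(h[43:50]),                  # columns 44-50
        "latitude": optional(h[51:59]),          # columns 52-59
        "longitude": optional(h[60:69]),         # columns 61-69
        "elevation": optional(h[70:75]),         # columns 71-75
        "bytes_per_sample": 4 if h[76] == "4" else 2,   # column 77
        "response_flag": h[77],                  # column 78: blank, P or T
        # Column 76 = G means columns 148-159 hold a gain factor.
        "gain": float(h[147:159]) if h[75] == "G" else 1.0,
    }


# First 80 characters of the example header in this appendix; ljust(76)
# guarantees that "4" and "P" land in columns 77 and 78.
sample = "SLR  L  E 86 199  7 18 15  6 35.960   1.000   1320".ljust(76) + "4P"
info = parse_channel_header(sample)
print(info["station"], info["year"], info["nsamp"], info["response_flag"])
```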
 
 If character 78 is blank, option 1:

 161 - 240: (10G8.3) 1. SEISMOMETER PERIOD
                     2. FRACTION OF CRITICAL DAMPING
                     3. SEISMOMETER GENERATOR CONSTANT (V/m/s) or
                        ACCELEROMETER SENSITIVITY (V/G)
                     4. AMPLIFIER GAIN
                     5. RECORDING MEDIA GAIN (I.E. 2048 COUNTS/VOLT)
                     6. GAIN AT 1.0 HZ, UNITS: COUNTS/METER
                     7. CUTOFF FREQUENCY FOR FILTER1 (HZ)
                     8. # OF POLES FOR FILTER1 (NEGATIVE FOR HIGHPASS)
                     9. CUTOFF FREQUENCY FOR FILTER2 (HZ)
                    10. # OF POLES FOR FILTER2 (NEGATIVE FOR HIGHPASS)
 241 - 320: (10G8.3) FREQUENCIES AND #'S OF POLES FOR FIVE MORE FILTERS
 321 -1040: RESPONSE CURVES (9(10G8.3)): FREQ., AMPL. (REL. 1.0 HZ) AND PHASE,
            WRITTEN IN GROUPS OF 10 FREQUENCIES, 10 AMPLITUDES AND 10 PHASES
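
A minimal Python sketch of reading option 1 follows. It assumes that each group in columns 321 to 1040 consists of one 80 character line of 10 frequencies, one of 10 amplitudes and one of 10 phases, which is one reading of the description above. The names g8_fields and parse_option1, the key names and the sample values are all made up for illustration.

```python
def g8_fields(text):
    """Split text into 8-character fields; blank fields are returned as None."""
    fields = []
    for i in range(0, len(text), 8):
        field = text[i:i + 8].strip()
        fields.append(float(field) if field else None)
    return fields


def parse_option1(chead):
    """Return (constants, extra filter values, response table) for option 1."""
    h = chead.ljust(1040)
    names = ("period", "damping", "generator_constant", "amplifier_gain",
             "recording_gain", "gain_1hz", "filter1_freq", "filter1_poles",
             "filter2_freq", "filter2_poles")
    constants = dict(zip(names, g8_fields(h[160:240])))    # columns 161-240
    more_filters = g8_fields(h[240:320])                    # columns 241-320
    freq, amp, phase = [], [], []
    for base in (320, 560, 800):          # assumed: three groups of three lines
        freq += g8_fields(h[base:base + 80])
        amp += g8_fields(h[base + 80:base + 160])
        phase += g8_fields(h[base + 160:base + 240])
    # Keep only the tabulated points that are actually filled in.
    table = [row for row in zip(freq, amp, phase) if row[0] is not None]
    return constants, more_filters, table


def f8(values):
    # Helper for the made-up sample: write values as 8-character fields.
    return "".join("%8.3f" % v for v in values)


sample = (" " * 160
          + f8([1.0, 0.7, 200.0, 10.0, 2048.0, 1.0, 25.0, 4.0, 0.0, 0.0])
          + " " * 80
          + f8([0.1, 1.0]).ljust(80)      # frequencies
          + f8([0.5, 1.0]).ljust(80)      # amplitudes
          + f8([90.0, 0.0]).ljust(80))    # phases
constants, more_filters, table = parse_option1(sample)
print(constants["period"], constants["recording_gain"], table)
```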

  If character 78 is P, option 2:

  161 - 182 (1X,2I5,G11.4) 1. NUMBER OF POLES
                           2. NUMBER OF ZEROS
                           3. NORMALIZATION CONSTANT, COUNTS/M
  183 - 240 (5G11.4)       2 Poles in pairs of real and imaginary parts
  241 -1040 (G11.4)        Remaining poles and zeros. 7 values are written
                           and then 3 spaces are left blank, see example 
                           below.

For each pole or zero, two real numbers are written, the real and the imaginary part, so the number of poles and zeros is half the number of values written. All the poles are written first, in pairs of real and imaginary parts, followed by the zeros. There is room for a total of 37 poles and zeros (74 values). The values are written in a simulated line mode to make them easier to read, hence the 3 blanks after every 7 values. The response is assumed to be in displacement with units of counts/m.


SLR  L  E 86 199  7 18 15  6 35.960   1.000   1320                          4P  
                                                                                
    11    5  .2760E+11  .3770      .1830      .3770      .1830      .6540       
  .0000      .2320      .0000      .2320      .0000      .2320      .0000       
  .3280      .0000      .3280      .0000      .3280      .0000      .2140E 01   
  .0000      .2140E 01  .0000      .0000      .0000      .0000      .0000       
  .0000      .0000      .0000      .0000      .0000      .0000                  
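
The option 2 layout can be read back as sketched below in Python. This is an illustration only: fnum and parse_poles_zeros are hypothetical names, and dropping blanks inside a field (so that a value printed as ".2140E 01" is read as 2.14) is an assumption meant to mimic how Fortran formatted input usually treats blanks.

```python
def fnum(field):
    """Convert one G11.4 field; embedded blanks are dropped, a blank field is 0."""
    field = field.replace(" ", "")
    return float(field) if field else 0.0


def parse_poles_zeros(chead):
    """Return (normalization constant, poles, zeros) from an option 2 header."""
    h = chead.ljust(1040)
    first = h[160:240]                        # columns 161-240
    npoles = int(first[1:6])                  # 1X,2I5
    nzeros = int(first[6:11])
    norm = fnum(first[11:22])                 # G11.4
    fields = [first[22 + 11 * i:33 + 11 * i] for i in range(5)]   # 5G11.4
    for start in range(240, 1040, 80):        # ten simulated 80 character lines
        line = h[start:start + 80]
        fields += [line[11 * i:11 * i + 11] for i in range(7)]    # 7 values, 3 blanks
    needed = 2 * (npoles + nzeros)            # real and imaginary part each
    if needed > len(fields):
        raise ValueError("header has no room for that many poles and zeros")
    values = [fnum(f) for f in fields[:needed]]
    pairs = [complex(values[i], values[i + 1]) for i in range(0, needed, 2)]
    return norm, pairs[:npoles], pairs[npoles:]


def g11(value):
    # Helper for the made-up sample: one 11-character field.
    return "%11.4E" % value


# Made-up header with 2 poles and 1 zero; the sixth value wraps to the next line.
first = (" %5d%5d" % (2, 1) + g11(2.76e10)
         + "".join(g11(v) for v in (-0.377, 0.183, -0.377, -0.183, 0.0)))
second = g11(0.0)
sample = " " * 160 + first.ljust(80) + second.ljust(80)
norm, poles, zeros = parse_poles_zeros(sample)
print(norm, poles, zeros)
```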
 

NOTE: The component information in character 6 IS VERY IMPORTANT. It MUST be A if an accelerometer is used; any other character is taken to mean a velocity transducer. However, this is only relevant if option 1 is used with the response values calculated from the free period etc. If option 1 with discrete values, or poles and zeros, is used, the first component character can be anything.


 
                   -------------------                                     
                   | EVENT  FILE     | at least 12* 80 BYTES
                   |    HEADER       |
                   -------------------                                     
                            |
                   -------------------                                     
                   | EVENT  FILE     |
                   | FIRST  CHANNEL  | 1040 BYTES
                   |    HEADER       |
                   -------------------                                     
                            |
                   -------------------                                     
                   |      DATA       |
                   | FIRST  CHANNEL  |
                   -------------------                                     
                            |
                   -------------------                                     
                   | EVENT  FILE     |
                   | NEXT   CHANNEL  | 1040 BYTES
                   |    HEADER       |
                   -------------------                                     
                            |
                   -------------------                                     
                   |      DATA       |
                   | NEXT   CHANNEL  |
                   -------------------                                     
                            |
                            |
                            |
                            |
                   -------------------                                     
                   | EVENT  FILE     |
                   | LAST   CHANNEL  | 1040 BYTES
                   |    HEADER       |
                   -------------------                                     
                            |
                   -------------------                                     
                   |      DATA       |
                   | LAST   CHANNEL  |
                   -------------------                                     
 

To write a SEISAN file: if the main headers are called mhead, the channel header chead and the data data (integer), and there are nchan channels each with nsamp samples, then the file is written as

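! Main header: 12 records of 80 characters each
Do i=1,12
  Write(1) mhead(i)
Enddo
! Per channel: one 1040 character header record, then all samples in one record
Do k=1,nchan
  Write(1) chead
  Write(1) (data(i),i=1,nsamp)
Enddo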

This example only works for up to 30 channels, since the 12 main header lines leave room for 10 lines of 3 channels each. For more channels, see e.g. program SEISEI for how to do it.
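
For readers working outside Fortran, the same write sequence can be imitated by adding the record markers by hand. The Python sketch below assumes the layout with 4 byte little-endian markers described under "Details of binary file structure" and 4 byte samples; write_record and write_seisan are hypothetical names, and the headers are blank placeholders rather than valid SEISAN headers.

```python
import io
import struct


def write_record(f, payload, marker="<i"):
    """Write one Fortran-style record: byte count, payload, byte count."""
    count = struct.pack(marker, len(payload))
    f.write(count + payload + count)


def write_seisan(f, mhead, channels, marker="<i"):
    """mhead: list of header lines; channels: list of (chead, samples)."""
    for line in mhead:
        # Each main header line is one 80 byte record.
        write_record(f, line.ljust(80)[:80].encode("ascii"), marker)
    for chead, samples in channels:
        # One 1040 byte channel header record ...
        write_record(f, chead.ljust(1040)[:1040].encode("ascii"), marker)
        # ... followed by all samples of that channel in a single record.
        data = struct.pack("%s%di" % (marker[0], len(samples)), *samples)
        write_record(f, data, marker)


# Placeholder content: 12 blank main header lines, one channel, three samples.
buf = io.BytesIO()
write_seisan(buf, [" " * 80] * 12, [(" " * 1040, [10, -20, 30])])
raw = buf.getvalue()
print(len(raw), raw[:4])
```

Passing ">i" as marker would give the byte-swapped variant, and "<q" the 8 byte markers mentioned for 64 bit systems.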

Details of binary file structure

When Fortran writes a file opened with "form=unformatted", additional data is added to the file to serve as record separators, which have to be taken into account if the file is read from a C program or read as binary from a Fortran program. Unfortunately, the number and meaning of these additional bytes are compiler dependent.

On Sun, Linux, MacOSX and PC from version 7.0 (using Digital Fortran), every write is preceded and terminated by 4 additional bytes giving the number of bytes in the write.

On the PC, SEISAN version 6.0 and earlier using Microsoft Fortran, the first 2 bytes in the file are the ASCII characters "KP". Every write is preceded and terminated by one byte giving the number of bytes in the write. If the write contains more than 128 bytes, it is blocked in records of 128 bytes, each with a start and an end byte, which in this case is the number 128. Each such record is thus 130 bytes long.

All of these additional bytes are transparent to the user if the file is read as an unformatted file. However, since the structure is different on Sun, Linux, MacOSX and PC, a file written as unformatted on Sun, Linux or MacOSX cannot be read as unformatted on PC or vice versa. The files are very easy to write and read on the same computer but difficult to read if written on a different computer. To further complicate matters, the byte order is different on Sun and PC. With 64 bit systems, 8 bytes are used to give the number of bytes written. This type of file can also be read with SEISAN, but so far only data written on Linux have been tested for reading on all systems. This means that version 7.0 can read all earlier waveform files on all platforms from all platforms. However, files written on version 7.0 PC cannot be read by any earlier version of SEISAN without modifying the earlier version. In SEISAN, all files are written as unformatted files.

In order to read the files independently of where they were written, the reading routine (buf_read in seisinc, in LIB) reads the file from Fortran as a direct access file with a record length of 2048 bytes. The additional bytes are discarded, and the relevant bytes are extracted and, if the file was written on a different computer than the one where it is read, swapped. Since no information giving the byte address of each channel is stored in the header of the file, the routine must read the first file header, calculate how many bytes there are down to where the next channel starts, jump down and repeat the process until the desired channel is reached (this is also how SUDS files are read). However, compared to reading the file as unformatted, only a fraction of the file is read to extract a particular channel. Once the channel header has been read, the start address is stored in the subroutine, so any subsequent access to that channel is very fast. Overall, random access to SEISAN waveform files is much faster with the binary read than with the previous (version 5.0 and earlier) unformatted read. Only when the whole file is read is the unformatted read faster.
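
The jump-from-channel-to-channel idea can be sketched in a few lines of Python. This is not the buf_read routine: it uses ordinary seek calls instead of 2048 byte direct access reads, handles only the 4 or 8 byte marker layout, and assumes that the main header has max(12, 2 + ceil(nchan/3)) lines. All function names and the in-memory test file are made up for illustration.

```python
import io
import struct


def read_record(f, marker="<i"):
    """Read one record and check that both byte counts agree."""
    size = struct.calcsize(marker)
    (n,) = struct.unpack(marker, f.read(size))
    payload = f.read(n)
    (tail,) = struct.unpack(marker, f.read(size))
    if tail != n:
        raise ValueError("record markers disagree")
    return payload


def skip_record(f, marker="<i"):
    """Jump over one record without reading its payload."""
    size = struct.calcsize(marker)
    (n,) = struct.unpack(marker, f.read(size))
    f.seek(n + size, io.SEEK_CUR)


def read_channel(f, index, marker="<i"):
    """Return (channel header, samples) for the channel at 0-based index."""
    first = read_record(f, marker).decode("ascii")
    nchan = int(first[30:33])                    # columns 31-33 of line 1
    nlines = max(12, 2 + (nchan + 2) // 3)       # assumed main header length
    for _ in range(nlines - 1):
        skip_record(f, marker)
    for k in range(nchan):
        chead = read_record(f, marker).decode("ascii")
        if k != index:
            skip_record(f, marker)               # jump over unwanted samples
            continue
        data = read_record(f, marker)
        # Column 77 of the channel header: 4 means 4 byte integers, else 2.
        width, code = (4, "i") if chead[76] == "4" else (2, "h")
        fmt = "%s%d%s" % (marker[0], len(data) // width, code)
        return chead, list(struct.unpack(fmt, data))
    raise IndexError("channel index out of range")


def record(payload):
    # Helper for the made-up test file: wrap payload in 4 byte markers.
    n = struct.pack("<i", len(payload))
    return n + payload + n


main = [(" " + "TEST".ljust(29) + "%3d" % 2).ljust(80)] + [" " * 80] * 11
chead = (" " * 76 + "4").ljust(1040)
raw = b"".join(record(line.encode("ascii")) for line in main)
for values in ([1, 2, 3], [-5, 7]):
    raw += record(chead.encode("ascii"))
    raw += record(struct.pack("<%di" % len(values), *values))

header, samples = read_channel(io.BytesIO(raw), 1)
print(samples)
```

Because the sample record of every unwanted channel is skipped with a seek, only the headers and the requested samples are actually read.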



         PC file structure                       Sun and Linux file structure
     Up to and including version 6.0             PC structure from version 7.0

----------------------------------------         -------------------------------
one byte: K indicates start of file              4 bytes: # of bytes following
----------------------------------------         -------------------------------
one byte: # of bytes following                   one block of data
----------------------------------------         -------------------------------
128 bytes or less of data                        4 bytes: # bytes in prev. write
----------------------------------------         -------------------------------
one byte: # of bytes in previous record          4 bytes: # of bytes following
----------------------------------------         -------------------------------
one byte: # of bytes in following record         one block of data
----------------------------------------         -------------------------------
128 bytes or less of data                        ........
.....                                            ........
.....                                            ........

For 64 bit systems, the above 4 byte numbers are 8 byte numbers.

From version 7.0, the Linux and PC file structures are exactly the same. On Sun the structure is the same except that the bytes are swapped. SEISAN uses this to find out where the file was written. Since there are always 80 characters in the first write, character one in the Linux and PC file will be the character P (which has ASCII code 80), while on Sun character 4 is P.
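
The check described above can be written as a small Python sketch. The function name detect_structure and the returned codes are hypothetical; the 8 byte case assumes little-endian markers, since only Linux-written files of that type are mentioned as tested.

```python
def detect_structure(start):
    """Guess the record marker layout from the first 8 bytes of a file.

    Returns "K" for the old PC structure, a struct format such as "<i"
    for the marker type otherwise, or None if nothing matches.
    """
    if start[:1] == b"K":
        return "K"      # PC, version 6.0 and earlier (file starts with KP)
    if start[:8] == b"P" + b"\x00" * 7:
        return "<q"     # 8 byte markers (64 bit systems), assumed little-endian
    if start[:4] == b"P\x00\x00\x00":
        return "<i"     # Linux, and PC from version 7.0: character 1 is P
    if start[:4] == b"\x00\x00\x00P":
        return ">i"     # Sun: bytes swapped, character 4 is P
    return None


print(detect_structure(b"P\x00\x00\x00 TES"))
```

The 8 byte test comes first because a file with 4 byte markers continues with header text, not zero bytes, after the first marker.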

Peter Voss : Tue Jun 8 13:38:42 UTC 2021