[teqc] helpful tip of week 1898

Lou Estey lou at unavco.org
Wed May 25 09:36:08 MDT 2016

This week's tip: understanding teqc's auto-identification when reading a format.

No matter what you are doing with teqc (translating, editing, quality checking),
teqc has to know what it is reading in order to do anything with it.  (Remember:
no matter what you think the input is, it's just a binary pile of '1's and '0's.)
The first thing you need to keep in mind is that teqc is nominally a one-pass filter
or, in other words, it reads the majority of the input -- this pile of '1's and '0's
you are giving it -- only once.

If the input is a file (as opposed to stdin), then teqc will read a few bytes of
the beginning of the file to try to determine what it is (and then rewind the file
to the beginning).  Surprisingly, if the data starts off in a "normal" way, it
either takes very few bytes to determine what the format is (true for the majority
of the formats that teqc can read) or there's essentially no practical way for teqc
to figure out what the input is and it must be explicitly told (true for just a
few of the formats that teqc can read).
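
To make the "few bytes" idea concrete, here is a minimal shell sketch of sniffing
a format from the start of a file.  It is not teqc's actual logic, and the function
name sniff_format is made up for illustration.  It relies only on two facts about
the formats themselves: SBF blocks begin with the sync characters "$@", and the
first line of a RINEX header carries the label "RINEX VERSION / TYPE" in
columns 61-80.

```shell
# Hypothetical helper, NOT teqc's real detection code: guess a format
# from the leading bytes of a file.
sniff_format() {
    file=$1
    # SBF blocks begin with the two sync characters "$@".
    # Keep only '$' and '@' from the first two bytes and compare.
    sync=$(head -c 2 "$file" | LC_ALL=C tr -cd '$@')
    if [ "$sync" = '$@' ]; then
        echo "SBF"
        return 0
    fi
    # The first RINEX header line has this label in columns 61-80.
    # NUL bytes are dropped so binary input does not upset the shell.
    label=$(head -c 80 "$file" | LC_ALL=C tr -d '\000' | LC_ALL=C cut -c 61-80)
    if [ "$label" = "RINEX VERSION / TYPE" ]; then
        echo "RINEX"
        return 0
    fi
    echo "unknown"
    return 1
}
```

Only the first 80 bytes are ever read, which is the point: for formats with a
distinctive start, very little input is needed to make a decision.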

When using stdin, you must tell teqc what the input format is.  The reason?  Stdin
is not a file type that can be rewound upon opening.  (If you don't know what
"standard input" is, see http://postal.unavco.org/pipermail/teqc/2016/002072.html )

How do you know if teqc can correctly determine the format of the input?  One
way is to gamble and just hope it does.  This is usually fine if you are doing
something non-critical yourself on the command line.  If things go astray, you
can always force teqc to read the input as a specific format, e.g.

-binex   == read the input as BINEX format
-jav jps == read the input as Javad JPS format
-lei mdb == read the input as Leica MDB format
-sep sbf == read the input as Septentrio Binary Format (SBF)
-top tps == read the input as Topcon TPS format
-tr d    == read the input as Trimble .dat/.tgd format
... and so on.  Note that there is no equivalent flag for RINEX: either teqc will
recognize RINEX input as RINEX or, if it does not and the file is supposed to be
RINEX, then that file is so messed up that it is not really RINEX anyway.
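
If you script this, a small lookup can keep the flags in one place.  The sketch
below is only an illustration: the function name teqc_format_flag and the short
names (binex, jps, ...) are made up here; only the flags themselves come from the
list above.

```shell
# Map a short, made-up format name to the teqc format flag listed above.
# The short names are this sketch's own convention, not teqc's.
teqc_format_flag() {
    case "$1" in
        binex) echo "-binex" ;;
        jps)   echo "-jav jps" ;;
        mdb)   echo "-lei mdb" ;;
        sbf)   echo "-sep sbf" ;;
        tps)   echo "-top tps" ;;
        tgd)   echo "-tr d" ;;
        *)     echo "no known flag for '$1'" >&2; return 1 ;;
    esac
}

# Possible use (left unquoted on purpose so "-sep sbf" splits into two words):
#   teqc $(teqc_format_flag sbf) prx51440.16_ > prx51440.16o
```

The output file name in the comment is just an example.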

To determine if teqc's idea of what the format is matches your idea of what the format is,
use the '+mdf' option (= "output the metadata format").  For example (a file in
a directory where I happen to be working at the moment):

[298] teqc +mdf prx51440.16_
probable format of 'prx51440.16_': Septentrio SBF

If using '+mdf' returns with "probable format", like above, odds are very high
this is correct.

However, if it returns with "possible format", odds are very low that teqc knows what
the format is, and in that case you will need to include a format flag to tell teqc
what the format is supposed to be.
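
In a script, the "probable" vs. "possible" distinction can be checked
mechanically.  This is a minimal sketch with a made-up function name
(mdf_confidence).  It assumes a "possible format" line starts the same way as
the "probable format" line shown above, and that the line arrives on stdout;
if your teqc writes it to stderr, add 2>&1 before the pipe.

```shell
# Read one line of '+mdf' output on stdin and print how sure teqc is.
# Assumes the line starts with "probable format" or "possible format".
mdf_confidence() {
    IFS= read -r line || true
    case "$line" in
        "probable format"*) echo "probable" ;;
        "possible format"*) echo "possible" ;;
        *)                  echo "unknown"  ;;
    esac
}

# Possible use:
#   teqc +mdf prx51440.16_ 2>&1 | mdf_confidence
```

Anything other than "probable" is a signal to add an explicit format flag.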

Note 1: Don't try to use '+mdf' with stdin; this functionality is not defined (yet).
Note 2: Unless specified otherwise, the format of stdin is assumed to be RINEX
(which could be RINEX observations, a RINEX navigation file, or a RINEX met file).

The formats where you absolutely _must_ include the flag specifying the format are
(fortunately not commonly used by most users):

-ash u  == read the input as Ashtech U-file format
-rtigs  == read the input as IGS Real-time format
-soc    == read the input as JPL Soc format

Of course, there is also the issue, as stated above, that the success of teqc
being able to auto-identify the input format depends on the bytes starting off
in a "normal" way.  This is a little hard to define in general, but it assumes
a normal file or stream that starts at the beginning of a record or message
normally encountered in that format, and that initial record or message
must also not be corrupted (or at least not in the first "few" bytes).
Typically for downloaded files, this is exactly what you should have.

Caveat when reading multiple file input: If the format is not specified in the
command, teqc opens only the first file listed to try to determine the format
and all the following files are assumed to be of the same format type.
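
Since only the first file is examined, it can be worth checking a batch yourself
before handing it to teqc.  The sketch below uses a made-up function name
(same_format) and assumes each '+mdf' line ends with ": <format name>", as in the
single example above.  It compares only the format names, not the "probable" or
"possible" wording.

```shell
# Read '+mdf' lines on stdin; succeed only if every line names the same format.
same_format() {
    # Keep the text after the last ": " on each line, then count distinct values.
    n=$(sed 's/.*: //' | sort -u | wc -l | tr -d ' ')
    [ "$n" -eq 1 ]
}

# Possible use (one '+mdf' call per file, since teqc only inspects the first):
#   for f in prx5*.16_; do teqc +mdf "$f"; done 2>&1 | same_format \
#       || echo "mixed formats: run teqc separately per format"
```

The file pattern in the comment is only an example.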

Happy teqc-ing!


Louis H. Estey, Ph.D.              office:  [+001] 303-381-7456
UNAVCO, 6350 Nautilus Drive           FAX:  [+001] 303-381-7451
Boulder, CO  80301-5554            e-mail:  lou at unavco.org
      WWW:  http://www.unavco.org   http://jules.unavco.org

"If the universe is the answer, what is the question?"
                                                -- Leon Lederman

Past helpful tips:

week 1894: using teqc config files - http://postal.unavco.org/pipermail/teqc/2016/002067.html
week 1895: qc of high-rate data - http://postal.unavco.org/pipermail/teqc/2016/002071.html
week 1896: UNIX/Linux shells for Windows - http://postal.unavco.org/pipermail/teqc/2016/002072.html
week 1897: '-' vs. '+' teqc options - http://postal.unavco.org/pipermail/teqc/2016/002076.html
