[teqc] another request for teqc +qc

Lou Estey lou at unavco.org
Thu Jul 31 14:07:35 MDT 2008


All,

If you go back to the "alternate summary line" thread in March 2007:
http://ls.unavco.org/pipermail/teqc/2007/000459.html
to
http://ls.unavco.org/pipermail/teqc/2007/000472.html
(last email included at the end of this one), I now have something that
is dangerously close -- in fact, it appears to be done except for cleaning
up a few details.  Here's where it's currently at:

* the alternate summary line kicks in automatically if all of these hold:
    1) the user specifies a window that starts earlier and/or ends later than
       the actual data epochs being qc-ed,
    2) navigation messages are being read (e.g. RINEX nav file(s)),
       i.e. this is a "qc full" run, and
    3) teqc finds the antenna position successfully

* because the alternate summary line is primarily dependent on windowing,
it's being called "SWN" (unless someone can argue successfully for a
better name)

* the original SUM line will also be present with only one change:
the "hrs" value (currently) indicates the total time span of the
data from the first detected epoch to the last detected epoch
(within any specified time window, if used)

Example:
I have an input file with nominally 3 hours of data at 15-sec sampling:

[1132] teqc +mds part.obs
2008-01-28 10:00:00  2008-01-28 12:59:45    600197  part.obs

If I do just an ordinary qc:

[1133] teqc +qcq part.obs

the normal summary (SUM) line is produced as part of the output:

       first epoch    last epoch    hrs   dt  #expt  #have   %   mp1   mp2 o/slps
SUM 08  1 28 10:00 08  1 28 12:59 2.996  15   5986   5683  95  0.77  0.71    406

Now specifying a nominal 24-hour window in which this data occurs --
_and_ using a navigation file which includes the possible SVs that
should have been tracked during that 24-hour window (this last point
cannot be stressed enough!):

[1134] teqc -st 00:00:00 -e 23:59:45 +qcq part.obs

the summary lines are now:

       first epoch    last epoch    hrs   dt  #expt  #have   %   mp1   mp2 o/slps
SUM 08  1 28 10:00 08  1 28 12:59 2.996  15   5986   5683  95  0.77  0.71    406
SWN 08  1 28 00:00 08  1 28 23:59 24.00  15  48583   5683  12  0.77  0.71    406

You'll notice that the SUM lines are identical between the two runs -- as
I would argue they should be.  The "epoch" bounds on the new SWN line are
the first and last theoretical epochs that could have been tracked at that
site (starting with the actual data start and end times and the sampling
interval).
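
For scripts that read these lines, here is a minimal Python sketch of a
parser.  It is not part of teqc: the names QcSummary, parse_summary and
percent_have are hypothetical, and it assumes the fields stay
whitespace-separated in the column order of the header shown above.  It
also checks that the "%" column agrees with #have/#expt.

```python
from dataclasses import dataclass


@dataclass
class QcSummary:
    tag: str          # "SUM" or "SWN"
    first_epoch: str  # e.g. "08 1 28 10:00" (yy m d hh:mm)
    last_epoch: str
    hrs: float
    dt: float         # sampling interval in seconds
    expt: int         # expected observations
    have: int         # observations actually present
    pct: int          # the printed "%" column
    mp1: float
    mp2: float
    o_slps: int


def parse_summary(line: str) -> QcSummary:
    """Split one SUM/SWN line on whitespace (assumed layout, see lead-in)."""
    tok = line.split()
    if len(tok) != 17 or tok[0] not in ("SUM", "SWN"):
        raise ValueError(f"not a recognised summary line: {line!r}")
    return QcSummary(
        tag=tok[0],
        first_epoch=" ".join(tok[1:5]),
        last_epoch=" ".join(tok[5:9]),
        hrs=float(tok[9]),
        dt=float(tok[10]),
        expt=int(tok[11]),
        have=int(tok[12]),
        pct=int(tok[13]),
        mp1=float(tok[14]),
        mp2=float(tok[15]),
        o_slps=int(tok[16]),
    )


def percent_have(s: QcSummary) -> int:
    """Recompute the "%" column as #have / #expt, rounded to an integer."""
    return round(100.0 * s.have / s.expt)


sum_line = "SUM 08  1 28 10:00 08  1 28 12:59 2.996  15   5986   5683  95  0.77  0.71    406"
swn_line = "SWN 08  1 28 00:00 08  1 28 23:59 24.00  15  48583   5683  12  0.77  0.71    406"

for text in (sum_line, swn_line):
    s = parse_summary(text)
    print(s.tag, s.hrs, s.expt, s.have, s.pct, percent_have(s))
```

Both lines share #have; only #expt (and therefore "%") changes with the
normalization, which is the point of having the two lines.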

You'll also notice that some of the stats (i.e. dt, #have, mp[12], o/slps) are
duplicated, as per John Beavan's suggestion (below).

Doing the same thing with previous versions of teqc, you would have gotten:

no window, e.g. old `teqc +qcq part.obs`:
SUM 08  1 28 10:00 08  1 28 12:59 2.996  15   5977   5674  95  0.77  0.71    405

24-hour window, e.g. old `teqc -st 00:00:00 -e 23:59:45 +qcq part.obs`:
SUM 08  1 28 10:00 08  1 28 12:59 24.00  15   5977   5674  95  0.77  0.71    405

Note: the slight difference in the stats other than the "hours" value
is due to recent bug fixes in the code so that it should now always
start the qc statistics on the first epoch, whereas in earlier versions
this was usually not the case.

The main difference, though, is in the "hours" value.  On the old SUM
line it showed the time span being qc-ed ("2.996" vs. "24.00"): by default
this was set by the actual data epochs, but with a specified time window
it indicated the length of that window instead.  With the two new summary
lines, these two values are now distinctly separated -- which I think is
the better solution.
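
The arithmetic behind the two "hrs" values can be checked with a short
Python sketch.  This is only an illustration, not teqc's code: the helper
span_hours is hypothetical, and the print precision is chosen simply to
match the digits shown in the example output above.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def span_hours(start: str, end: str) -> float:
    """Hours between two timestamps given as 'YYYY-MM-DD HH:MM:SS'."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 3600.0


# First and last data epochs, as reported by `teqc +mds part.obs`
data_hrs = span_hours("2008-01-28 10:00:00", "2008-01-28 12:59:45")

# The window requested with `-st 00:00:00 -e 23:59:45`
window_hrs = span_hours("2008-01-28 00:00:00", "2008-01-28 23:59:45")

print(f"{data_hrs:.3f}")    # 2.996  (the SUM line)
print(f"{window_hrs:.2f}")  # 24.00  (the SWN line)
```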

Bug #1:
To address Herb Dragert's first question (below), I still have to fiddle
a bit so that the hrs on the new SUM line will give the expected result
if there are one or more gaps of epochs in between the first and last
data epochs in the window.  (At the moment, the number being shown is
just the difference between the first and last data epochs, with no
adjustment for detected gaps.)
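
To make the distinction concrete, here is a hedged Python sketch that
compares the plain first-to-last span with the "number of epochs * nominal
sample interval" candidate from the 2007 thread quoted at the end of this
message.  It is not how teqc is implemented, and which formula the final
version uses is still open.  The function names and the 1.5 * dt gap
threshold are assumptions made for the illustration; epochs are given as
seconds of day.

```python
def span_hours(epochs):
    """Last epoch minus first epoch, in hours (no gap adjustment)."""
    return (epochs[-1] - epochs[0]) / 3600.0


def occupied_hours(epochs, dt):
    """Number of epochs with data times the nominal interval, in hours."""
    return len(epochs) * dt / 3600.0


def find_gaps(epochs, dt):
    """(last epoch before, first epoch after) for each jump larger than 1.5*dt."""
    return [(a, b) for a, b in zip(epochs, epochs[1:]) if b - a > 1.5 * dt]


# Herb Dragert's example: two 5-hour chunks, 01:00-06:00 and 12:00-17:00,
# here with an assumed 30-sec sampling interval.
dt = 30
epochs = list(range(1 * 3600, 6 * 3600 + 1, dt)) + \
         list(range(12 * 3600, 17 * 3600 + 1, dt))

print(span_hours(epochs))                    # 16.0
print(round(occupied_hours(epochs, dt), 2))  # 10.02, i.e. ~10 hours
print(find_gaps(epochs, dt))                 # [(21600, 43200)]
```

For a gap-free day of 30-sec data (epochs 00:00:00 to 23:59:30, 2880
epochs) occupied_hours gives 24.0, while the first-to-last span is
(2880 - 1) * 30 / 3600, or about 23.99.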

Bug #2:
If I do a qc on the full 24-hours of data at this site:

[1142] teqc +qcq full.obs 2> /dev/null | grep ^SUM
SUM 08  1 28 00:00 08  1 28 23:59 24.00  15  48599  43273  89  0.90  0.79    231

you'll notice that the expected number of observations is 48599, rather
than the 48583 observations determined above (on the SWN line) for the
windowed 3-hour file.  (I suspect I'm not accounting for one or two epochs
somewhere, but this is still somewhat of a mystery.)

Bug #3:
The SV count on the "+<cutoff>" line is wrong in the windowed case, e.g.

[1144] teqc -st 00:00:00 -e 23:59:45 +qcq part.obs 2> /dev/null | grep ^+10
+10|                              999999888888888888888888888888888888888888|+10

compared to:
[1145] teqc +qcq full.obs 2> /dev/null | grep ^+10
+10|aa99bba9999999988778999988789999999988898888867788888aa99bbbcca988aabbba|+10

Primary user error:
using navigation messages that do not cover enough SVs!
In the windowed example above, I was using a RINEX nav file covering all
SVs that could be tracked at the site during those 24-hours, and:

[1151] teqc +mds part.gps 2> /dev/null
2008-01-28 00:00:00  2008-01-28 23:59:44     96314  part.gps

But let me window that nav file:

[1152] teqc -notice -st 10:00:00 -e 12:59:59 part.gps > tmp; mv tmp part.gps
[1153] teqc +mds part.gps 2> /dev/null
2008-01-28 10:00:00  2008-01-28 12:00:00     12704  part.gps

and now:

[1154] teqc -st 00:00:00 -e 23:59:45 +qcq part.obs 2> /dev/null | grep ^S
SUM 08  1 28 10:00 08  1 28 12:59 2.996  15   5874   5580  95  0.77  0.71    465
SWN 08  1 28 00:00 08  1 28 23:59 24.00  15  19252   5580  29  0.77  0.71    465

Compare with above (repeated here):
SUM 08  1 28 10:00 08  1 28 12:59 2.996  15   5986   5683  95  0.77  0.71    406
SWN 08  1 28 00:00 08  1 28 23:59 24.00  15  48583   5683  12  0.77  0.71    406

This is not a teqc bug!  This is _operator error_!  The lower #expt
values on both lines and the higher "%" value in the SWN line are a
direct consequence of my using an incomplete set of navigation
messages for the SVs that could have been tracked from this site.
(The lower #have values are related: the new part.gps file is probably
missing navigation messages for an SV that is just rising or setting --
so the observations for that SV are not counted.  This is normal qc
behaviour in teqc.)

This was long, I know, but hopefully some of you can get through it all
and send some feedback.

The next teqc version will have this functionality, come unresolved code bugs,
hell and high water.

cheers,
--lou

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Louis H. Estey, Ph.D.              office:  [+001] 303-381-7456
UNAVCO, 6350 Nautilus Drive           FAX:  [+001] 303-381-7451
Boulder, CO  80301-5554            e-mail:  lou  unavco.org
    WWW:  http://www.unavco.org   http://jules.unavco.org

"If the universe is the answer, what is the question?"
                                                -- Leon Lederman
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> Giovanni Sella wrote:
> 
>> The main issue I see with the current system is how to catch if there 
>> is a gap in data between the start time and end time. This problem is 
>> valid for both episodic and continuous sites.
>> Currently the hrs in the SUM line is just a subtraction of start and 
>> end time.
> 
> ... of the qc window, yes.  (i.e. not subtraction of first and last epoch
> times of data within the qc window)
> 
>> For simplicity let's use the case where an hour is missing and start and 
>> end times are 00:00 and 04:00: then you still get 04 hrs, whereas 
>> in fact you only have 3 hrs.  It is true that if you run teqc +qc with 
>> a +nav you will get a lower percentage of expected observations, but 
>> what happens if this is due to a receiver malfunction or because the 
>> receiver is operating at a higher cutoff angle than the teqc default 
>> (10)? I like running teqc +qc with a cutoff angle of 5; if you have a 
>> complete data set you get about 85-87% expected observables.  If 
>> there is a gap it drops a bit and the math starts getting complicated. If 
>> I could check the total hrs knowing that this represented the actual 
>> hours then I can quickly check if there is a gap, by comparing my 
>> expected start and end times with this SAC hrs item.
> 
> ----------------
> 
> John Beavan wrote:
> 
>  > I agree with this suggestion.  You definitely need to keep the
>  > SUM line as there are many historical scripts that use it.
>  >
>  > But it would be nice if teqc also gave the info in the "ZZZ"
>  > line, so that new users don't have to write extra code to
>  > implement it - as we, and presumably many others, have
>  > done already.
>  >
>  > I would be inclined to repeat the dt, mp1, etc., in the new
>  > line.
> 
> ---------------
> 
> Herb Dragert wrote:
> 
>  > What [would happen in] two separate cases where I have 1 hr of continuous
>  > data in a 24hr period compared to the case where I have 1 hr of data but
>  > these data are spread out over the entire day in 2.5 minute chunks so that
>  > I still have a total of only one hour of data for the entire day?  What is
>  > the value for your variable "span" for these two cases?
> 
> ~1 hr for both cases, even though they are two very different cases.
> 
>  > The first case is straightforward and your "span" would report 0.9958
>  > in your ZZZ line under "hrs" as per your example. What would it report
>  > if the 60 min of data are spread out in isolated chunks throughout the day?
> 
> ~1 hr is the current thinking.
> 
>  > I realize that this scenario is highly artificial, so perhaps something
>  > more realistic is the presence of say two chunks of data, each 5 hours
>  > in duration (e.g. one chunk from 01:00 to 06:00; the other chunk from
>  > 12:00 to 17:00) In such a case, what would the "span" variable report?
> 
> Span on the new "hrs" sum line would be ~10 hours in this case.
> 
> ----------------
> 
> I'm now inclined to have this new "ZZZ" line (whatever we end up calling
> it) be a standard part of any qc -- no option or explicit windowing needed
> to invoke it.  This plus the original SUM stats are just two different
> normalizations; take your pick -- or use both (or neither).
> 
> During some off-line discussion with Herb on how to come up with
> this number, I'm now thinking that it should be a simple:
> 
> number of epochs with observations within the qc window * nominal sample interval
> 
> It seems like this addresses the original request; e.g. a "24-hr" session
> with no gaps of 30-sec sampling with epochs from 00:00:00 to 23:59:30 would
> have 2880 epochs, though this results in a "24.00" value on the "ZZZ" line
> and "23.99" on the current SUM line.  (I suppose it could be:
> 
> (number of epochs with observations within the qc window - 1) * nominal sample interval
> 
> to bring the two into alignment, though this is a bit arbitrary for data
> with gaps.)
> 
> Final thoughts/comments?
> 
> --lou