When I SSELECT a particular file I get the message "forward link zero". What causes this error?
(Thread taken directly from CDP, with original formatting preserved.)


========
Newsgroups: comp.databases.pick
Subject: Who's right?
From: Nuedit@cris.com (David Smith)
Date: 24 Feb 1996 19:27:47 GMT

Mani viswanathan  asks:

When I sselect a particular file I get this message
forward link zeo; reg=15 abort @ es.gosortx:16D
!

When I just select the same file it selects okay with no gfe.
Has anyone ever had this problem and resolved it?
What causes this error?
******************************************************************************

Brig Campbell  wrote:
>A forward link zero in the english sort processor may be a bug.
>The select has probably already selected all the item-ids
>in the file and is now attempting to sort them in workspace.
>
>If this is R83, it had a fixed workspace and this can happen.
>
>A GFE is a data structure problem which would also show
>with just a straight select as well as a sselect.
>
>Transient GFE's occur when items in a group are updated or
>deleted and data shifts.  Some implementations don't set
>a group lock when using english/access/retrieve etc... and
>the english processor gets "lost".
>
>-brig
***************************************************************************
Paul Roberts  responded:
Have had this happen with a large select list that did not match item-ids
in the following select, list, sort statement (too many not on files).

******************************************************************************
David Smith suggested:

         The TCL commands on my Ultimate machine are:

         1. >CREATE-FILE TEMP 1,1 301,1      [or appropriate size]

         2. >COPY filename * (I              [to suppress ItemID's]
             to:(TEMP

         3. >CLEAR-FILE DATA filename

         4. >COPY TEMP *
             to:(filename

         5. >DELETE-FILE TEMP

-- Hope this helps. Dave.
*****************************************************************************

Question: Who's right?



========
Newsgroups: comp.databases.pick
Subject: Re: Who's right?
From: taj@news (Terry A. Johnson)
Date: 26 Feb 1996 18:27:13 GMT

Regarding the abort:

: When I sselect a particular file I get this message
: forward link zeo; reg=15 abort @ es.gosortx:16D
: !

several possible causes were suggested, among them:

: I run a Pick/Ultimate machine and had the same problem once.
:
: The root cause is that there are two items with the same ID in the file.
: ItemID's are supposed to be unique, but, erroneously, the system created
: two items with the same name; possibly this 'name' is non-printable and so
: you don't see it when you do a list. Are there any blank lines listed?

Then David Smith asked:

: Question: Who's right?

While duplicate item-ids might be the problem, no one ever had this abort
on an Ultimate machine.  The "es.gosortx" part of the abort message is a
way of identifying the ABS frame, and that style of naming was never used
by any version of the Ultimate system.

Putting on my "GDH" hat, I want to make a comment on interpreting abort
messages (I know many of the readers already know this, but this thread
shows that many people don't):

The abort message identifies two things:
        1. The abort condition (forward link zero, etc).  There are about
           20 such conditions, depending on the implementation.  Each one
           is a condition that the assembly language is incapable of
           dealing with.
        2. The program address where the abort occurred.  The first part
           of the address is the ABS frame and the second part is the
           location in the frame.

Each abort is unique.  A forward link zero abort in one frame has nothing
to do (usually) with a forward link zero abort in another frame.  Think
about the Basic message, [B10] VARIABLE HAS NOT BEEN ASSIGNED A VALUE;
ZERO USED!  On line 10 of program X it might mean that an item that the
program expected to be in a file wasn't there, while on line 20 of
program Y it might mean that a total field wasn't initialized to zero.

Likewise different addresses mean entirely different things.  An error
at line 100 of a Basic program is, in general, unrelated to an error at
line 200.  Same with the assembly language.  There's nothing mystical
about it.

So, unless someone is looking at the Pick source code or has seen the
EXACT SAME message before, anything they say is simply a guess.

Since everyone else is guessing what is wrong, I'll jump in with one, too.
SELECT did not abort but SSELECT did.  Look for system delimiters in
the attributes (including item-id) that are being sorted by.  Can you
SELECT the file, then SORT it?  If so, this is probably a bug in the
retrieval language.  If the abort is repeatable (it must be or you
wouldn't have posted the question, right?), transient GFEs are not the
problem.  (OK, that was 3 guesses, not one.)
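To make the delimiter guess checkable, here is a rough sketch in Python; it
assumes you have somehow pulled the item-ids and sort-key values out as raw
byte strings (the extraction itself is not shown), and it uses the
conventional Pick mark characters:

    # Conventional Pick system delimiters: segment, attribute, value, subvalue marks.
    MARKS = {0xFF: "SM", 0xFE: "AM", 0xFD: "VM", 0xFC: "SVM"}

    def embedded_delimiters(keys):
        """Report any key (item-id or sort-key value) with a system delimiter inside it."""
        for key in keys:
            found = [MARKS[b] for b in key if b in MARKS]
            if found:
                print(repr(key), "contains", ", ".join(found))

    # Hypothetical data: the second id has an attribute mark stuck in the middle of it.
    embedded_delimiters([b"INV1001", b"INV10\xfe02", b"CUST*42"])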

Good luck.

Terry Johnson


========
Newsgroups: comp.databases.pick
Subject: Re: Who's right?
From: heggers@netcom.com (Henry Eggers)
Date: Mon, 26 Feb 1996 19:41:40 GMT

David Smith (Nuedit@cris.com) wrote:
: Mani viswanathan  asks:

: When I sselect a particular file I get this message
: forward link zeo; reg=15 abort @ es.gosortx:16D
: !

First, the people with the source code should answer this.  The id of
the release of {R84 | Open Architecture | Advanced Pick | Really Advanced
Pick} would help.  I believe the modularization of the system, with
the retention of the historical entrypoint names, to be unique to OA
and its descendants.

: When I just select the same file it selects okay with no gfe.
: Has anyone ever had this problem and resolved it.
: What causes this error?

: **************************************************************************

: Brig Campbell  wrote:
: >A forward link zero in the english sort processor may be a bug.
: >The select has probably already selected all the item-ids
: >in the file and is now attempting to sort them in workspace.

No.  Gosortx is after the sort has completed.  It is the phase which
is extracting each item-id from the sorted list, and retrieving
the item, by means of a call to RETIX, as necessary.
: >
: >If this is R83, it had a fixed workspace and this can happen.

I would question this, because the register is (probably) pointed
into either temporary space or the small workspaces buffer, neither
of which has anything to do with 'workspace' repair.  Since my memory
is not good enough to remember all the uses of R15 in Gosortx (frm118?),
heh, and noting that R15 is the all-purpose scratch register, I can't be
sure.  If the problem is repeatable, and the machine is busy enough
with other people doing other things to make overflow use random,
it isn't likely to be a 'workspace' problem:  if it were an overflow
problem, one would be inclined to suspect double released frames;
but in that case, the sort process itself would (probably) fail.

: >
: >A GFE is a data structure problem which would also show
: >with just a straight select as well as a sselect.

Maaaaaybe.  The retrieval processes of GNSEQI and GNTBLI are just a
bit different.  But if it were retrieval per se, it would fail in
RETIX.
: >
: >Transient GFE's occur when items in a group are updated or
: >deleted and data shifts.  Some implementations don't set
: >a group lock when using english/access/retrieve etc... and
: >the english processor gets "lost".

That would be 'read lock'.  Does _any_ implementation set a
traditional _group_ lock, making only one read to a group possible?
In any case, transients would not be repeatable, in general, and
would appear in RETIX, rather than in gosortx.
: >
: >-brig
: ***************************************************************************
: Paul Roberts  responded:

: Have had this happen with a large select list that did not match item-ids
: in the following select, list, sort statement (too many not on files).

That is the old 706(?) error message, which ran off the end of the HS
workspace (history string) in frame 10:  upd.history or do.history,
or ts25?  Not related to gosortx.


: **************************************************************************
: David Smith suggested:
:
:          The TCL commands on my Ultimate machine are:
:
:          1. >CREATE-FILE TEMP 1,1 301,1      [or appropriate size]
:
:          2. >COPY filename * (I              [to suppress ItemID's]
:              to:(TEMP

:          3. >CLEAR-FILE DATA filename

:          4. >COPY TEMP *
:              to:(filename

:          5. >DELETE-FILE TEMP

: -- Hope this helps. Dave.
: *************************************************************************

: Question: Who's right?
:


IMHO, neither.  Nor is it obvious what's happening.  Without the code,
I note that gosortx obtains the next item id from the sort key list,
and retrieves the item, unless it is an SSELECT without output
attributes.

The mechanism of this retrieval depends on the format of the sort key
list:  highest key <> next highest key <>.. lowest key <> item-id {]vmc} SB.
R15 may be being used as the 'leading scan pointer', the one which is supposed
to find the SM, leaving the pointer at the last 'high key mark', traditionally
FA (having left a trail from 7F to DF to E8 over the years).  If so, having the
sorted list link into an unrelated frame, like a print file, would cause
this, because the sort terminator is SM.  There are very few SMs in
print files....  This suggests doubly-released overflow.
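Purely as a toy model of that scan (Python; none of Pick's real frame layout
is represented), the abort condition falls out naturally: a chained key list
whose terminating SM is missing, because the chain wanders into unrelated
data, is only stopped by a zero forward link:

    SM = 0xFF   # segment mark that should terminate the sorted key list

    # Each 'frame' is just (forward link, data).  Frames 3 and 4 stand in for a
    # print file that the key list has wrongly been linked into; no SM in them.
    frames = {
        1: (2, b"SMITH\xfeJONES\xfe"),
        2: (3, b"ADAMS\xfe"),                 # should end with an SM, but links onward instead
        3: (4, b"PAGE 1   ACME LEDGER    "),
        4: (0, b"TOTAL 1,234.56          "),  # forward link zero: end of the chain
    }

    def scan_for_sm(fid):
        """Follow forward links until the terminating SM is found."""
        while fid != 0:
            link, data = frames[fid]
            if SM in data:
                return fid, data.index(SM)
            fid = link
        raise RuntimeError("forward link zero before any SM was found")

    try:
        scan_for_sm(1)
    except RuntimeError as e:
        print(e)    # the toy analogue of the abort in gosortx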

The other alternative is that R15 is the target in the copy from the
sort key list to BMSBEG, which would require that the item id be larger
than a frame, or that BMSBEGDISP is non-zero.  The latter is relatively
unlikely.  For the item id to be larger than a frame suggests that
there is an 'item' in the group, with a valid count field -- or at
least a good supply of AMs in the target area -- with a suspiciously
large item id.  Anyone want to try a LIST FN with L0 > {50 or 100, as
applicable}?  Sort by-dsnd FN... doesn't seem to be the best
approach.  :-)

So, we don't know which is right.  It would really help to know what
the instruction at gosortx:16D is, or even how many single-steps
into gosortx it is...

Regards, hve.

========
Newsgroups: comp.databases.pick
Subject: Re: Who's right?
From: dauphin@aztec.co.za (Bob Dubery)
Date: Wed, 28 Feb 1996 21:06:39 GMT

heggers@netcom.com (Henry Eggers) wrote:

>Meanwhile, I don't know that anyone ever figured out what the mechanism of
>the creation of two items of the same item-id was;  but I've not seen
>it in a long time...
Off the top of my head, after a tough day, what if a record from a
group spanning multiple frames is read? If this record was in the
first frame and the record was updated then it would be appended to
the end of the last frame in the group. Now the machine crashes before
all the frames have been flushed to disk - two frames in the same
group could contain records with the same key.

>:          The TCL commands on my Ultimate machine are:
>:
>:          1. >CREATE-FILE TEMP 1,1 301,1      [or appropriate size]
>:
>:          2. >COPY filename * (I              [to suppress ItemID's]
>:              to:(TEMP

>:          3. >CLEAR-FILE DATA filename

>:          4. >COPY TEMP *
>:              to:(filename

>:          5. >DELETE-FILE TEMP
As Henry pointed out, if a situation of duplicate item-ids (in itself
a type of GFE) occurs then succesive updates of that record will in
fact alternately update the two records with the duplicated ID as the
record may be read from anywhere in the group but will always be
written to the end of the group. You may need to isloate the records
with the duplicated key first and reconcile the data in those two so
as to incorporate all the updates in to one record. Merely copying
backwards and forwards WILL remove one of the records but may also
remove some data that you'd rather retain.

Bob Dubery
Johannesburg
South Africa


========
Newsgroups: comp.databases.pick
Subject: Re: Who's right?
From: heggers@netcom.com (Henry Eggers)
Date: Sun, 3 Mar 1996 17:54:22 GMT

Bob Dubery (dauphin@aztec.co.za) wrote:
: heggers@netcom.com (Henry Eggers) wrote:

: >Meanwhile, I don't know that anyone ever figured out what the mechanism of
: >the creation of two items of the same item-id was;  but I've not seen
: >it in a long time...

: Off the top of my head, after a tough day, what if a record from a
: group spanning multiple frames is read? If this record was in the
: first frame and the record was updated then it would be appended to
: the end of the last frame in the group. Now the machine crashes before
: all the frames have been flushed to disk - two frames in the same
: group could contain records with the same key.

Unlikely, but...

The 'conventional' method of 'copying' an item onto the end is to start
at the 'existing' copy in the group and copy the succeeding item 'to the
left' over it, and its successor, and its successor, u mak galue (and
so on), until the end is reached, and then to copy the new version
onto the end, with the wite wequied (ww) flag transferred to the
FID table page control byte when the destination register leaves each
frame.  ['Write required' was apparently passed through an Elmer Fudd
filter at some point in the distant past.]  In this case, it would be
difficult to have the first-updated frames remain in memory, and the
new instance of the item be successfully written to disk.

Particularly because at least the frame preceding the new instance,
or containing the beginning of the new instance, would have to have
been written to disk, because at least the EOG AM or SM, depending
on vintage, would have to have been overwritten by the first byte
of the 'count field' or 'item control field' (block, if one prefers,
or I suppose 'bloc' if one is political) in order to have the read
scan pass to the new instance.  But that requires that the previous
item's count field point there.  But, by definition, the previous
item has not been copied to the left, so that the count field must point
to the right of the beginning of the 'new instance', by induction.
Therefore, this method might work if the previous two items were the
same size, so that the prior penultimate item's count field points
at the right end of the ultimate item.  This would at least cause
the previously last updated item in the group to disappear.  Looking
for things which aren't there is harder.
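To keep the shape of the 'conventional' update described above in view, here
is a toy Python version, with the group reduced to a list of (id, body) pairs
and none of the frame, count-field or write-required machinery modelled:

    def update_item(group, item_id, new_body):
        """Compact the old copy out of the group, then append the new version."""
        # Phase 1: successors are copied 'to the left' over the existing copy.
        compacted = [(i, b) for (i, b) in group if i != item_id]
        # Phase 2: the updated item goes onto the end of the group.
        compacted.append((item_id, new_body))
        return compacted

    group = [("A", "old-A"), ("B", "old-B"), ("C", "old-C")]
    print(update_item(group, "A", "new-A"))
    # [('B', 'old-B'), ('C', 'old-C'), ('A', 'new-A')] -- no duplicate in the normal case.
    # The corruption under discussion corresponds to phase 2 reaching disk while the
    # frames rewritten in phase 1 do not, leaving both old-A and new-A in the group.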

Now, in the old LRU algorithm, I think that it is canonically impossible
for a later-updated frame to be written earlier than a previously updated frame;
but in the clock algorithm and its variants, which arise with uProcessor
implementations, I expect that a scenario can be educed which flushes the
latter two updated frames, and none of the at least one more prior
frames.  On the one hand, I wouldn't want to assign a probability to
the event, but on the other, I suspect that it isn't _that_ low.

Other update strategies are more likely to result in this case,
particularly writing on the end and then pulling everything back,
although that idea fails on grounds both of performance and uniqueness.
That doesn't mean that someone didn't do it as a twisted doppelganger
of the approach of putting _all_ items on to the disk as 'indirect'
items, assuring the completion of the disk write of all the frames
in the group, updating the group with the new indirect address,
assuring that the group was written and then deleting the storage
used by the old copy of the item and returning it to the overflow
pool.  Not tough to do.  Just tough to live with, because this means
at least two synchronous disk writes, never mind the overflow table,
so that perceived individual update time would go from 1ms to 60ms.

Well, lest our ACID RDB friends become piqued by this, I observe that
the first thing which one does is to pass this update process off
to a 'trusted process', in order to mask the real time from the
user.  Of course, what happens when the 'trusted process' has a
backlog of 10**5 updates pending and something bad happens?  It is
possible to generate non-trivial updates on a Pick machine which
exceed the ability of any backing store to handle as above.  This
brings me back to an observation which I make from time to time,
that Pick machines consciously accept a level of inaccuracy, in
order to function cost-effectively, and that customers who have
been comfortable with pick machines have intuitively understood this
and dealt with it.  It is one of the things which makes pick 'hard
to explain' to computer scientists.  It speaks to the discomfort of
even tangential contact with the vagueness of the real world and
the absolute rightness of a mathematical world  (albeit I suspect
not the world of working mathematicians).

My suspicion is that the update routine has 'failed to recognize' the
first instance of the item in the group due to the failure of a
BSTE, probably going over a frame boundary, and possibly associated
with a particular batch of 74174 chips...  But, since I expect that
all those machines have long since slunk off to the green recycler,
or have migrated to walls, in frames, as trophies, I suspect that
that's not the only possible mechanism.

In other words, I think that the double instance of the item
probably got in the group as the result of an error of instruction
execution, or possibly a very rare path in Upditm.  RETIX has been
cast in concrete for twenty-odd years.


Regards, hve


"Underneath the placid, moderately boring surface of the Internet is
a world of immense technical complexity inhabited by strange people."
--  David Gelernter, reviewing three views of Mt Shinomitnik, NYT,
27 February, 1996, at page B4.


========
Newsgroups: comp.databases.pick
Subject: Re: Who's right?
From: Martin Taylor 
Date: Mon, 26 Feb 96 20:58:28 GMT

In article <4gnorj$8fm@spectator.cris.com> Nuedit@cris.com "David Smith" writes:

> Mani viswanathan  asks:
>
> When I sselect a particular file I get this message
> forward link zeo; reg=15 abort @ es.gosortx:16D
> !
>
[snip]
> The root cause is that there are two items with the same ID in the file.
> ItemID's are supposed to be unique, but, erroneously, the system created
> two items with the same name; possibly this 'name' is non-printable and so
> you don't see it when you do a list. Are there any blank lines listed?
>
> The SELECT processes the names sequentially, and doesn't recognize any
> problem, that is why that works. The SSELECT gives the Forward Link error
> when it encounters the non-unique ItemID's, and aborts.
>
> To resolve the situation, you just have to COPY the file to a temporary
> file that you've created; CLEAR the original file, and then COPY the file
> back to the original file. The reason this works is that COPY is a
> sequential process like SELECT, and thus will not give an abort.
>
> Since you probably want to identify the duplicated item for possible
> fixing, you should inhibit ItemID's from displaying, and do the COPY without
> overlay. That way, the only screen display would be the message 'Item xxxx
> already exists on file'. Otherwise, the ItemID's will scroll off the screen,
> as will the error message, and you won't know what item to fix thru editor.
>
[snip]

From the solution I recognise the problem.

There are two or more items in the file with the same id. Should be
impossible. Another manifestation is an item which can be seen when you
LIST, but not when you SORT. The distinction between sequential access
(i.e. LIST) and non-sequential access which follows the hashing algorithm
(anything resulting from a SELECT, for instance a SORT or a BASIC program
following a SELECT) is the key to understanding this.

In principle it is *not* possible to have two items in a Pick file with the
same id. In practice I have seen it happen. Usually it is manifested as a
corrupted item-id which now hashes into the wrong group. When a sequential
process goes through the file, both items are hit. If this is a SELECT process
the subsequent command then hits the same item twice. This leads to the usual
way of detecting the problem which is a user asking "why is this record on the
report twice?". You cannot use any hashing process (such as the editor) to
fix the problem, since you can only get to the "good" record. You have to
use a process which reads sequentially and writes hashed. Hence the solution
using copy is a correct one. You could also use T-DUMP and T-LOAD.
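The sequential-versus-hashed distinction is easy to model. In this rough
Python sketch the modulo hash merely stands in for the real hashing
algorithm; the corrupted duplicate sits in the wrong group, so walking every
group finds both copies while a hashed lookup only ever reaches one:

    MODULO = 7    # stand-in for the file's modulo

    def group_of(item_id):
        """Toy hash: the group this item-id should live in."""
        return sum(item_id.encode()) % MODULO

    groups = {g: [] for g in range(MODULO)}
    groups[group_of("ORDER100")].append(("ORDER100", "the good record"))
    # The corruption: a second item with the same id lands in a different group.
    groups[(group_of("ORDER100") + 3) % MODULO].append(("ORDER100", "the ghost record"))

    # Sequential access (LIST, SELECT) walks every group in order: sees both.
    sequential = [item for g in range(MODULO) for item in groups[g]]
    print(sum(1 for i, _ in sequential if i == "ORDER100"))                     # 2

    # Hashed access (the editor, COUNT, a BASIC READ) goes straight to one group: sees one.
    print(sum(1 for i, _ in groups[group_of("ORDER100")] if i == "ORDER100"))   # 1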

It is a form of a gfe which *never* reports as a gfe. In fact, most often
it doesn't report as anything at all. I presume this type of file corruption
could remain hidden for years if the data is otherwise well behaved.

Here's a good way of finding if a file has one of these ghostly gfes:

SELECT FILE

nnn items selected

COUNT FILE
nnn-1 items counted
Item xxx not on file.


--
Martin Taylor (Author of "Pick for Users" and half of "Unix and Unidata")
Datamatters Ltd         http://www.dmatters.co.uk
13 Market Place, Heywood, Lancashire, UK
+44 (0) 1706 625478 fax +44 (0) 1706 625740  martin@dmatters.co.uk

========
Newsgroups: comp.databases.pick
Subject: Re: Who's right?
From: dauphin@aztec.co.za (Bob Dubery)
Date: Wed, 28 Feb 1996 21:06:41 GMT

Martin Taylor  wrote:

>report twice?". You cannot use any hashing process (such as the editor) to
>fix the problem, since you can only get to the "good" record.


Could you not just use

SELECT  = "DUPLICATED ID"?

PICK will traverse the whole file and check for records meeting the
selection criteria.

Bob Dubery
Johannesburg
South Africa


========
Newsgroups: comp.databases.pick
Subject: Re: Who's right?
From: jwjanner@tamu.edu (Jeffrey W. Janner)
Date: Wed, 28 Feb 1996 20:35:39 -0600

In article <4h2gar$38r@aztec.co.za>, dauphin@aztec.co.za (Bob Dubery) wrote:

} Martin Taylor  wrote:
} 
} >report twice?". You cannot use any hashing process (such as the editor) to
} >fix the problem, since you can only get to the "good" record.
} 
}
} Could you not just use
}
} SELECT  = "DUPLICATED ID"?
}
} PICK will traverse the whole file and check for records meeting the
} selection criteria.
}
}

That'll get you the IDs, but you'll only be able to get to one of the
actual records, i.e. the one that the hash finds (if any).

--
Jeff Janner  |  jwjanner@tamu.edu
"Read a Book!" -- Handy

========
Newsgroups: comp.databases.pick
Subject: Re: Who's right?
From: heggers@netcom.com (Henry Eggers)
Date: Sun, 3 Mar 1996 18:10:55 GMT

Jeffrey W. Janner (jwjanner@tamu.edu) wrote:
: In article <4h2gar$38r@aztec.co.za>, dauphin@aztec.co.za (Bob Dubery) wrote:

: } Martin Taylor  wrote:
: } 
: } >report twice?". You cannot use any hashing process (such as the editor)
: } >to fix the problem, since you can only get to the "good" record.
: } 
: }
: ...

: That'll get you the IDs, but you'll only be able to get to one of the
: actual records, i.e. the one that the hash finds (if any).

This brings up the possibility of the failure of the 'hash' routine, which
is again probably a processor/ucode error.  Hash has been cast in concrete
longer than Retix.  Or, it's the result of a modification to the
field containing the base address of the file.  This had better be
pretty infrequent.  Since that would probably be programmatic, it would
require a path.  Doesn't seem likely.


Regards, hve.


"Underneath the placid, moderately boring surface of the Internet is
a world of immense technical complexity inhabited by strange people."
--  David Gelernter, reviewing three views of Mt Shinomitnik, NYT,
27 February, 1996, at page B4.

========
Newsgroups: comp.databases.pick
Subject: Re: Who's right?
From: heggers@netcom.com (Henry Eggers)
Date: Sun, 3 Mar 1996 18:07:02 GMT

Bob Dubery (dauphin@aztec.co.za) wrote:
: Martin Taylor  wrote:
: 
: >report twice?". You cannot use any hashing process (such as the editor) to
: >fix the problem, since you can only get to the "good" record.
: 

: Could you not just use

: SELECT  = "DUPLICATED ID"?

: PICK will traverse the whole file and check for records meeting the
: selection criteria.

The problem with this is that the english processor only has knowledge
of the item at which it is currently looking.  At most, during
break processing, it is juggling the break data, seen as a virtual
item (you have no idea how hinkey this is -- more flying bumble bee
material), the prior item, and the next item.  But that doesn't
help.

In other words, selection can only take place based on the contents of
the attributes (or values).  It has no knowledge of the 'file', or
the context of the items.  To wit, the request "I want to select
all the items which participate in accounts payable which exceed
$100,000" can't be done directly.  (It should be doable.  It is more
general than the 'having' capability of SQL.)  It hasn't been done,
because of the perception that it requires two sorts, which seems
like 'too much work' and because english isn't recursible.

What is possible (on 7.3) is to use sort-list (nu, if I remember the
options; if I do, (nu means 'not unique', so that the list will
return those members of the list which occur more than once.  Or,
maybe it was (m, for multiple.  This was all part of an extension
of sort-list to and-list, or-list, sub-list which allowed the
combination of k lists into one, with options to bring back all of
the 'keys', or just one copy of each multiple key, or just those
keys which had k instances, and a few other things, which are in the
MAN file, if anyone has one of those around.  :-)
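Stripped of the sort-list syntax, the 'not unique' case is just a count over
the saved list. A minimal sketch in Python, assuming the select list has been
got out of the system as a plain sequence of ids:

    from collections import Counter

    def not_unique(select_list):
        """Keys that occur more than once in a saved select list."""
        return [key for key, n in Counter(select_list).items() if n > 1]

    # Hypothetical saved list in which ORDER100 was picked up twice.
    print(not_unique(["ORDER099", "ORDER100", "ORDER101", "ORDER100"]))   # ['ORDER100']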

In any case, this is necessarily a meta operation on the file.  As such,
it is not clear how english ought to view it, what the class of
meta operations includes, and how many of these, or which kind,
ought to be implemented in english, if any.  There is a need for
a taxonomy of meta operations on pick files.

Regards, hve.


   