Current group: comp.lang.forth

2nd RfD: extension queries

From: Anton Ertl
Subject: 2nd RfD: extension queries
Date: Mon, 17 Jan 2005 22:26:17 GMT

CHANGE HISTORY

2nd RfD 2005-01-17
No change to the normative section. New, non-trivial reference
implementation. More discussion of "Why not let ENVIRONMENT?
return a flag and true, like for wordset queries?" and "Why the
"X:" prefix?". New sections "Won't there be too many extension
names?", "How about defining a way to query for implementor
extensions?", "Why not just ask for word names with [UNDEFINED]?"


PROBLEM

How does a program know whether the system it runs on supports one of
the extensions that ran through the RfD/CfV process, so that the
program can implement the extension itself or work around its absence?


PROPOSAL

If the string passed to ENVIRONMENT? starts with "X:", ENVIRONMENT?
returns false if the system does not implement the extension indicated
by the query string in full, or if there is no such extension that has
gone to a CfV.

For an extension from the list of CfVs, take the linked-to filename,
delete the ".html", and prepend "X:" to construct a query string for
the extension.

If the system implements the extension, ENVIRONMENT? may return true
(without additional values) or false.


TYPICAL USE

S" X:deferred" ENVIRONMENT? 0= [IF]
.... \ reference implementation of the deferred words proposal
[THEN]


REMARKS

Why allow returning false when the system supports the extension?

Returning false when the system supports the extension will usually be
safer than returning true when the system does not support the
extension; in the former case the program will be slower, or have
degraded features; in the latter case the program will usually fail in
unpredictable ways.

Therefore, systems must not return true for extensions that have not
yet gone to a CfV (the proposal for the extension could still change).

So, if a system happens to already support the extension, it will have
to report false on queries for the extension at least from the time
when the proposal goes to a CfV until the time that an update of the
system with updated extension queries is released.

Moreover (and possibly more importantly), this feature means that
systems whose implementors have never heard of (or ignore) RfDs and
CfVs will work correctly for extension queries (as long as they don't
support any queries starting with "X:" on their own), so a program
written to cope with this specification will usually work correctly
even on such systems.


Why not let ENVIRONMENT? return a flag and true, like for wordset queries?

This proposal is easier to use. What is the point of returning an
extra flag? "Yes, we have heard of that extension, but no, we have not
implemented it"? That's not useful information to have; what should
a program do with that information?

Mitch Bradley and Guido Draheim would prefer a wordset-query-like
behaviour, i.e., an additional flag if ENVIRONMENT? returns
true. Richard Borrell would prefer a single flag (i.e., the proposed
behaviour).

With the wordset-query-like behaviour, the typical use above would
look like:

S" X:deferred" ENVIRONMENT? dup [IF]
drop
[THEN]
0= [IF]
.... \ reference implementation of the deferred words proposal
[THEN]

Here are some statistics about the use of ENVIRONMENT? in general and
wordset queries in particular in ANS Forth programs:

Program       Author           ENVIRONMENT?  wordset queries
brainless     David Kuehling   yes           no
brew-0_03z9   Robert Epprecht  yes           no
brew_..._38   Robert Epprecht  yes           floating-ext
CD16v11       Brad Eckert      no            no
anagrams      Wil Baden        no            no
pentomino     Bruce Hoyt       no            no
tscp          Ian Osgood       no            no
Gray4         Anton Ertl       yes           no
garbage-coll  Anton Ertl       yes           no

I believe that one of the reasons that wordset queries are used so
rarely is that they are so cumbersome to use (that's certainly one of
the reasons that kept me from using them).


Why the "X:" prefix?

This will hopefully ensure that there is no naming conflict with any
existing environmental query of any system; it also reserves a part of
the environmental query name space (by requiring a false result for
anything that has not gone to a CfV), without consuming all of it.

If you know of any name conflict of the "X:" prefix with an existing
system and have a better suggestion for a prefix, let me know.

Bruce McFarling has suggested changing the prefix such that the query
string can be used as a filename on DOS/Windows. However the prefix
can be cut away easily, leading to a filename compatible with
DOS/Windows (if the extension name is compatible), as the reference
implementation demonstrates.

Guido Draheim suggested using a suffix, as it has advantages with
input completion. A prefix can also profit from input completion, and
overall this issue does not seem very important. Prefixes are more
traditional for tagging names in programs (while suffixes are used
for file names).


What about extension proposals that have not (yet) gone to a CfV?

If you want to introduce queries for them, do it with a different
prefix.


Why not include extension proposals that have not (yet) gone to a CfV?

They may still change before they go to a CfV, so it would not be
clear if the system and the querying program refer to the same
proposal.


Won't there be too many extension names?

Guido Draheim thinks that we will see many backwards-compatible
revisions of proposals, so the number of extension names that would
have to be kept around would be a problem (since the nth revision
would satisfy the requirements of revisions 1 to n, and thus n names
would have to be kept around).

One way to deal with this would be to use a consistent naming scheme
for backwards-compatible extensions: deferred, deferred-2, deferred-3
etc. Then the system just needs to store the base name and the current
revision number, and can check with a little code whether the
queried-for extension is supported.
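
As a rough illustration of that "little code" (a sketch only, not part
of the proposal): all word names below are hypothetical, the table is
reduced to a single hard-coded entry (base name "deferred", current
revision 3), a name without a numeric suffix is taken to be revision
1, BASE is assumed to be decimal, and the "X:" prefix is assumed to
have been removed already.

```forth
: digit? ( char -- flag )  [char] 0 - 10 u< ;

: trailing-digits ( c-addr u -- k )
   \ k = number of decimal digits at the end of the string
   0 >r
   begin
      dup r@ > if  2dup r@ - 1- chars + c@ digit?  else  false  then
   while
      r> 1+ >r
   repeat
   2drop r> ;

: split-revision ( c-addr u -- c-addr u' n )
   \ "name-N" -> "name" N ; anything else -> unchanged string, 1
   2dup trailing-digits >r           ( c-addr u   r: k )
   r@ 0>  over r@ 1+ >  and          \ some digits, and room for "x-"
   if  2dup r@ - 1- chars + c@ [char] - =  else  false  then
   if
      2dup r@ - chars +  r@          ( c-addr u digits-addr k )
      0 0 2swap >number 2drop drop   ( c-addr u n )
      swap r> - 1- swap              ( c-addr u' n )
   else
      r> drop 1
   then ;

: revision-ok? ( n current-rev -- flag )
   \ true if 1 <= n <= current-rev
   1+ 1 swap within ;

: extension-known? ( c-addr u -- flag )
   \ assumed single-entry table: "deferred", current revision 3
   split-revision >r
   s" deferred" compare 0=
   r> 3 revision-ok? and ;
```

With this entry, queries for "deferred", "deferred-2" and
"deferred-3" would succeed, while "deferred-4" would be answered with
false. A real system would look the base name up in a table instead
of comparing against one literal string.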

Keeping this naming scheme would be the responsibility of the person
who maintains the list of CfVs. (It's not the responsibility of the
system implementor and therefore not in the normative part of this
proposal).


How about defining a way to query for implementor extensions?

Guido Draheim suggested this. With this one system implementor could
formally define an extension, and another system could adopt that
extension, and programs could query for this extension.

Doing this would require reserving some part of the ENVIRONMENT? name
space for implementor-extension queries, with each query having an
implementor part and an extension name. A naming authority would
assign implementor part names to implementors, and the implementor
would assign extension part names. (Actually, the implementor could
use it for arbitrary queries, not just extension queries.)

I am not convinced that there is enough demand for that. In any case,
I would like to see it handled in a separate RfD/CfV.


Why not just ask for word names with [UNDEFINED]?

Hans Bezemer and Albert van der Horst favour this approach.

That only tells whether a word exists in the system, but not how it
behaves.

It would also require asking separately for each word. And for
on-demand loading, it would require organizing each word
separately. Lots of effort for the programmer and the system
implementor.


Implementation and Tests

* Reference implementation, appended below; to make it useful, you
  should also download and unpack the archive containing a directory
  of reference implementations of voted-on extensions (if available).
* Tests

EXPERIENCE

All ANS Forth systems I know implement this proposal in a minimal way
(answer all queries with false). None implement it in a non-minimal
way. No programs have used the proposal yet.

- anton

\ reference implementation for extension queries

\ public domain

\ This implementation can be used both by systems and programs to
\ implement extension queries; it loads the implementation of the
\ extension from an appropriately named file, if the system does not
\ have the extension already.

\ This implementation can be stacked: the system uses it to load its
\ implementation of the extension (if the extension is not already
\ present); and the program uses another instance of this
\ implementation to check what the system has, and load the reference
\ implementation (where available) if necessary.

\ the main downside of this implementation is the location of the
\ extension-dir: If you use an absolute directory name, you have to
\ set it when you install the directory on each system; if you use a
\ relative name, you can only run the program from a specific working
\ directory.

\ here are parameters that you may have to change:

: extension-dir ( -- c-addr u )
\ directory containing the files implementing the extension;
\ this directory must only contain files for official extensions,
\ otherwise this will not be a correct implementation of the proposal
\ you can find a directory with reference implementations at
\ http://www.complang.tuwien.ac.at/forth/ansforth/extensions/
s" extensions/" ;

: extension-prefix ( -- c-addr u )
\ string at the start of every extension query
s" X:" ;

: extension-file-extension ( -- c-addr u )
\ the file extension of the implementation files
s" .fs" ;

\ end of parameters

31 constant max-word-length

create eq-path
max-word-length extension-dir nip + extension-file-extension nip + chars allot

extension-dir eq-path swap chars move

eq-path extension-dir nip chars + constant eq-filename

: make-eq-filename ( c-addr1 u1 -- c-addr2 u2 )
dup >r
eq-filename swap chars move
extension-file-extension eq-filename r@ chars + swap chars move
eq-path r> extension-dir nip + extension-file-extension nip + ;

: string-prefix? ( c-addr1 u1 c-addr2 u2 -- f ) \ gforth
tuck 2>r min 2r> compare 0= ;

variable extension-list 0 extension-list !
\ linked list of extension names; each node: link, then counted string

\ I would have liked to do "['] defnoop execute-parsing" here, but
\ that's not standard (yet:-).

: insert-extension ( addr u -- )
dup cell+ char+ allocate throw
extension-list @ over ! dup extension-list !
cell+ 2dup c! \ store count
char+ swap chars move ;

: lookup-extension ( c-addr u -- f )
\ true if extension is in list
2>r extension-list @ begin ( list r: c-addr u )
dup while
dup cell+ count 2r@ compare 0= if
drop 2r> 2drop true exit then
@ repeat
2r> 2drop ;

s" extension-query" insert-extension \ this is already loading

: environment? ( c-addr u -- false | ... true )
2dup 2>r environment? dup if ( f ) \ system answers query in another way
2r> 2drop exit then
2r@ extension-prefix string-prefix? 0= if \ not an extension query
2r> 2drop exit then
drop 2r> extension-prefix nip /string ( c-addr1 u1 )
dup max-word-length u> if \ the name is too long
2drop false exit then
2dup lookup-extension if \ the extension is already (being) loaded
2drop true exit then
2dup make-eq-filename r/o open-file if ( c-addr1 u1 fid ) \ doesn't exist
drop 2drop false exit then
>r insert-extension ( r:fid ) \ break cycle before including
r> include-file
true ;



\ tests for this implementation

0 [if]
require test/tester.fs

\ { s" core" environment? -> true true } \ !! could also be false
{ s" Y:blabla" environment? -> false }
{ s" X:abcdefghijabcdefghijabcdefghijabcdefghij" environment? -> false }
{ s" X:extension-query" environment? -> true }
{ s" X:non-existant file" environment? -> false }
{ s" X:deferred" environment? -> true }
{ s" X:deferred" environment? -> true } \ try again
{ s" X:extension-query" environment? -> true } \ test lookup again
[then]

--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.complang.tuwien.ac.at/forth/ansforth/forth200x.html

From: Bruce McFarling
Subject: Re: 2nd RfD: extension queries
Date: 18 Jan 2005 20:27:51 -0800

Anton Ertl wrote:
> Why not let ENVIRONMENT? return a flag and true, like for wordset
> queries?

> This proposal is easier to use. What is the point of returning an
> extra flag? "Yes, we have heard of that extension, but no, we have
> not implemented it"? That's not useful information to have; what
> should a program do with that information?

The most useful role it could play is,

TRUE="Yes, we implement it"
Flag="Here is how".

However, that is only truly useful if the Flag is standardised,
rather than system-defined, as the ideal would be to have at
most one ifdef per means of access plus one if the implementation
is not supported, rather than one ifdef per system variant.

I note that your implementation includes a "test" subdirectory
in the test. Will subdirectories be in the scope of the
RfD/CfV process, or will that remain in the relied on but
unstandardised zone?

From: Anton Ertl
Subject: Re: 2nd RfD: extension queries
Date: Wed, 19 Jan 2005 17:55:25 GMT

"Bruce McFarling" writes:
>
>Anton Ertl wrote:
>> Why not let ENVIRONMENT? return a flag and true, like for wordset
>> queries?
>
>> This proposal is easier to use. What is the point of returning an
>> extra flag? "Yes, we have heard of that extension, but no, we have
>> not implemented it"? That's not useful information to have; what
>> should a program do with that information?
>
>The most useful role it could play is,
>
>TRUE="Yes, we implement it"
>Flag="Here is how".

I don't know what a flag would indicate about the "how", and why that
would be interesting.

But in any case, that's not how wordset queries work. There the role
they play is:

true="This query string is known."
flag="true if the wordset is present"

>Will subdirectories be in the scope of the
>RfD/CfV process, or will that remain in the relied on but
>unstandardised zone?

Sure, they are in scope (anything is in scope), but somebody needs to
make an RfD about them. I have that on my to-do list, but I guess that
this will be a tough one.

- anton

From: Bruce McFarling
Subject: Re: 2nd RfD: extension queries
Date: 19 Jan 2005 17:30:11 -0800

Anton Ertl wrote:
> I don't know what a flag would indicate about the "how", and why that
> would be interesting.

> But in any case, that's not how wordset queries work. There the role
> they play is:

> true="This query string is known."
> flag="true if the wordset is present"

At the moment, since the flag for "some but not
all the words in the wordset are present" is by
implication FALSE, the combination would tell
you nothing more than that IF you should find one
of the words in the wordset, it is safe to leave
it and just define the missing words. Otherwise,
since an implementation that does not know the
query string may not even KNOW the definitions as
contained in the wordset, then to be on the safe
side you have to assume that all the wordset words
are missing, even if words of the same name
appear in the dictionary.

However, since TRUE is all bits set, that means
that there are 2^16-2 standard values that are
neither true nor false. If "is present" is
meant literally, then a wordset that is available
but not present is in a middle ground state of
being "potentially present".

So, a standard set of values would be defined,
and returning that value as the flag would inform
the user that the wordset is not present, but
available.

0 - Not available
1 - Only available through user intervention
2 - Available by loading a BLOCK file
3 - Available by loading a file
4 - Available by bringing a wordlist into the search order
5 - Available by executing the query string as a word

I'm not sure if there are any other automated options.

It's great when query strings trigger a process that
makes a wordset available. But simple solutions
for simple systems. Indeed, just in writing it I
am thinking even the above is too much. A long
established common practice to make BLOCKs accessible
by name is to define a constant with the block number,
and with the standardisation of VALUEs that is now
mutable in a standard way. So that could be:

2 - Available through a BLOCK constant with that name

Queries and files are both strings, but files are
strings that may have more limitations. A file
providing facilities may require prepending with
device or subdirectory information, it may require
appending file type information, and it may require
a conversion to fit within limitations on width
or character set -- especially if X: is used. So
it seems like a word to translate from an
environmental query to a complete filename for the
system is required, maybe:

ENVIRONMENT-FILE ( ca u -- ca' u' )

Obviously a system that never returns "3" never needs
ENVIRONMENT-FILE.

Of course another way of bringing up a facility that
comes to my mind is by executing the facility name.
So maybe the list stops with:

4 Available by executing the query string as a word.

Obviously if some systems are already using flags
in a system-specific version of this, it may be
necessary to count down from -2 rather than
count up from 1.

From: Anton Ertl
Subject: Re: 2nd RfD: extension queries
Date: Fri, 21 Jan 2005 13:04:22 GMT

"Bruce McFarling" writes:
>
>Anton Ertl wrote:
>> I don't know what a flag would indicate about the "how", and why that
>> would be interesting.
>
>> But in any case, that's not how wordset queries work. There the role
>> they play is:
>
>> true="This query string is known."
>> flag="true if the wordset is present"
>
>At the moment, since the flag for "some but not
>all the words in the wordset are present" is by
>implication FALSE, the combination would tell
>you nothing more than that IF you should find one
>of the words in the wordset, it is safe to leave
>it and just define the missing words. Otherwise,
>since an implementation that does not know the
>query string may not even KNOW the definitions as
>contained in the wordset, then to be on the safe
>side you have to assume that all the wordset words
>are missing, even if words of the same name
>appear in the dictionary.

If a system claims to be ANS Forth, and it provides one of the names
defined in ANS Forth, the name must behave as described in the
standard:

|3. Usage requirements
|
|A system shall provide all of the words defined in 6.1 Core Words. It
|may also provide any words defined in the optional word sets and
|extensions word sets. No standard word provided by a system shall
|alter the system state in a way that changes the effect of execution
|of any other standard word except as provided in this Standard. A
|system may contain non-standard extensions, provided that they are
|consistent with the requirements of this Standard.

However, even if a system were allowed to give non-standard meanings
to standard names, I don't see anything in the definition of
ENVIRONMENT? or the wordset queries that would change that. So, in
either case, providing an additional flag does not provide additional
information.

For the extension queries, one might propose such a meaning of the
FALSE TRUE result, but that would deviate from the wordset-query usage
(which is the main point of adding the flag).

>2 - Available by loading a BLOCK file
>3 - Available by loading a file

Not useful, since no file name and/or block number is provided.

>4 - Available by bringing a wordlist into the search order

Not useful, since the WID is not provided. Moreover, if the system
provides the search order wordset, all standard names should be in the
FORTH-WORDLIST; and if the system does not provide the search order
wordset, the user has no standard way to get the wordlist in the
search order (and typically, there is no search order that the user
could bring the wordlist into).

>5 - Available by executing the query string as a word

That would be moderately useful (see below).

However, ANS Forth does not give this meaning to the result (and it
actually allows only flags), so I still don't see that the flag
produced by the standard wordset queries is useful.

If a program asks whether a wordset is available, it typically wants
to use that wordset (or work around its absence); so why not make it
available upon the query and return true instead of having an extra
word there that does this and return 5?

Ok, there may be some programs that are just doing reporting on the
completeness of the system and check the wordsets without using them;
but these programs typically require few other resources, so making
the wordsets available for those programs should not be a problem,
either.

>It's great when query strings trigger a process that
>makes a wordset available. But simple solutions
>for simple systems. Indeed, just in writing it I
>am thinking even the above is too much. A long
>established common practice to make BLOCKs accessible
>by name is to define a constant with the block number,

How many lines of code or bytes do you think that will save? I think
it will cost both bytes and lines of code, because with that approach
you typically need the name twice: once for ENVIRONMENT?, and once in
forth-wordlist. With ENVIRONMENT? using its own wordlist, instead of
having a constant in the environment wordlist, and another in the
forth-wordlist, you would just define

: wordset-query ( n "name" -- )
create ,
does> ( -- true )
@ load true ;

25 wordset-query float-ext ( -- true )

(in both approaches, provisions against loading twice should be
added).
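
One possible provision against loading twice, as a sketch only: the
extra "loaded" cell after the block number is an assumption of this
example, not something the proposal or any system prescribes.

```forth
: wordset-query ( n "name" -- )
   create , false ,                  \ block number, then a "loaded" flag
 does> ( -- true )
   dup cell+ @ 0= if                 \ block not loaded yet?
      true over cell+ !              \ mark first, so nested queries do not reload
      dup @ load
   then
   drop true ;
```

Setting the flag before LOAD also breaks cycles if the loaded block
performs the same query again.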

And this saves the trouble of programs having to write, for each
wordset, things like

s" " environment? [if]
dup true = [if]
drop \ everything ok
[else] dup 2 = [if]
drop load
[else] dup 3 = [if]
drop included
[else] dup 5 = [if]
drop
[else]
drop s" " included
[then] [then] [then] [then]
[else]
s" " included
[then]

Yes, you might factor that into a definition, but that needs even more
space (the system does not have to consider all the cases that you
have invented, but the program would have to).

>Queries and files are both strings, but files are
>strings that may have more limitations. A file
>providing facilities may require prepending with
>device or subdirectory information, it may require
>appending file type information, and it may require
>a conversion to fit within limitations on width
>or character set -- especially if X: is used.

Which brings me to the question: Have you looked at the new reference
implementation of extension queries? Any comments about the filename
handling there? Does it address your concerns?

- anton

From: Bruce McFarling
Subject: Re: 2nd RfD: extension queries
Date: 23 Jan 2005 00:24:35 -0800

Anton Ertl wrote:
> If a system claims to be ANS Forth, and it provides one of the names
> defined in ANS Forth, the name must behave as described in the
> standard:

Yes, but if it claims to be ANS Forth 94, it will not necessarily
know anything about any newly standardised words (similarly wrt
ANS 200x regarding anything standardised in ANS 201x).

From: Bruce McFarling
Subject: Re: 2nd RfD: extension queries
Date: 23 Jan 2005 00:53:50 -0800

Anton Ertl wrote:
> >2 - Available by loading a BLOCK file
> >3 - Available by loading a file

> Not useful, since no file name and/or block number is provided.

As noted in the following section, for the first, executing the
query string as a BLOCK constant would give the Block Number,
and for the second, you would need a standard word to translate
from query string to filename, but at least it could in the
simpler cases be nothing but a string manipulation.

> >4 - Available by bringing a wordlist into the search order

> Not useful, since the WID is not provided. Moreover, if the system
> provides the search order wordset, all standard names should be in
> the FORTH-WORDLIST;

Aha, so that would have to merge with 5, as it would be some
extra-standard manipulation that brings the wordlist into the
Forth wordlist, and that may as well be the query string executed
as a word.

> >5 - Available by executing the query string as a word

> However, ANS Forth does not give this meaning to the result (and it
> actually allows only flags), so I still don't see that the flag
> produced by the standard wordset queries is useful.

Certainly ANS Forth 94 does not give this meaning to the result,
but ANS Forth 94 is still crippled wrt social programming. The
goal should be a compatible extension. The flag produced by
the standard wordset queries as a boolean is not very useful,
but it allows a compatible extension with useful additional
extensions.

> If a program asks whether a wordset is available, it typically wants
> to use that wordset (or work around its absence); so why not make it
> available upon the query and return true instead of having an extra
> word there that does this and return 5?

You got it in one. If a program asks whether a wordset is
available, it TYPICALLY wants to use that wordset or work
around its absence. What about something that wants to
check whether the wordset is present, and if not installs
itself to make the wordset available? It can in fact make
the wordset available in a standard way even if that is
not the way that the system it is loaded into normally does
things.

Implementations will provide their own tools for handling these
things, and they may or may not be integrated into the ENVIRONMENT
query system. On the implementation side, it's a question of
whether there is a straightforward mapping of how the
implementation does it.

> And this saves the trouble of programs having to write, for each
> wordset, things like

> s" " environment? [if]
> dup true = [if]
> drop \ everything ok
> [else] dup 2 = [if]
> drop load
> [else] dup 3 = [if]
> drop included
> [else] dup 5 = [if]
> drop
> [else]
> drop s" " included
> [then] [then] [then] [then]
> [else]
> s" " included
> [then]

> Yes, you might factor that into a definition, but that needs even
> more space (the system does not have to consider all the cases that
> you have invented, but the program would have to).

This is as opposed to the current state of the art, which is:

[IF]
4thXYZ action
[ELSE] 4THQRP action
[ELSE]
etcetera
....
[THEN]
[THEN]

A single word that is handed the name and works with the
standard eqr (environmental query result index)
to load it is a massive step in the right direction.

Sure, it would be loverly if everyone would simply settle
on loading the thing when the query is made, and returning
TRUE, but I do not believe for one second it will happen.
So out in the real world, the best I will hope for is a
standard way for the implementation to express how the
implementation does things.

Indeed, that is why the second set had everything floating
around the string of the implementation query, so that you
could easily define it once if it was not already present
and then just hand it a string. I could see not even
loading that word until an environmental query returned
an eqr between TRUE and FALSE.

> >Queries and files are both strings, but files are
> >strings that may have more limitations. A file
> >providing facilities may require prepending with
> >device or subdirectory information, it may require
> >appending file type information, and it may require
> >a conversion to fit within limitations on width
> >or character set -- especially if X: is used.

> Which brings me to the question: Have you looked at the new reference
> implementation of extension queries? Any comments about the filename
> handling there? Does it address your concerns?

I think it addresses the X: concern, or maybe shows
why the X: issue is a minor inconvenience rather than
any serious problem. There is the other question of
how to map from that to DOS/Rock Ridge 8+3 filenames,
but if it is not going to be possible for a script to
find out how the system loads these kinds of things,
then the answer to that will obviously be RTFM, if
you are lucky enough to have a manual.
   
