From: Anton Ertl
Subject: 2nd RfD: extension queries
Date: Mon, 17 Jan 2005 22:26:17 GMT

CHANGE HISTORY
2nd RfD 2005-01-17 No change to the normative section. New, non-trivial reference implementation. More discussion of "Why not let ENVIRONMENT? return a flag and true, like for wordset queries?" and "Why the "X:" prefix?". New sections "Won't there be too many extension names?", "How about defining a way to query for implementor extensions?", "Why not just ask for word names with [UNDEFINED]?"
PROBLEM
How does a program know whether the system it runs on supports one of the extensions that ran through the RfD/CfV process, so that the program can implement the extension itself or work around its absence?
PROPOSAL
If the string passed to ENVIRONMENT? starts with "X:", ENVIRONMENT? returns false if the system does not implement the extension indicated by the query string in full, or if there is no such extension that has gone to a CfV.
For an extension from the list of CfVs, take the linked-to filename, delete the ".html", and prepend "X:" to construct a query string for the extension.
If the system implements the extension, ENVIRONMENT? may return true (without additional values) or false.
TYPICAL USE
S" X:deferred" ENVIRONMENT? 0= [IF]
    .... \ reference implementation of the deferred words proposal
[THEN]
REMARKS
Why allow returning false when the system supports the extension?
Returning false when the system supports the extension will usually be safer than returning true when the system does not support the extension; in the former case the program will be slower, or have degraded features; in the latter case the program will usually fail in unpredictable ways.
Therefore, systems must not return true for extensions that have not yet gone to a CfV (the proposal for the extension could still change).
So, if a system happens to already support the extension, it will have to report false on queries for the extension at least from the time when the proposal goes to a CfV until the time that an update of the system with updated extension queries is released.
Moreover (and possibly more importantly), this feature means that systems whose implementors have never heard of (or ignore) RfDs and CfVs will work correctly for extension queries (as long as they don't support any queries starting with "X:" on their own), so a program written to cope with this specification will usually work correctly even on such systems.
Why not let ENVIRONMENT? return a flag and true, like for wordset queries?
This proposal is easier to use. What is the point of returning an extra flag? "Yes, we have heard of that extension, but no, we have not implemented it"? That's not useful information to have; what should a program do with that information?
Mitch Bradley and Guido Draheim would prefer a wordset-query-like behaviour, i.e., an additional flag if ENVIRONMENT? returns true. Richard Borrell would prefer a single flag (i.e., the proposed behaviour).
With the wordset-query-like behaviour, the typical use above would look like:
S" X:deferred" ENVIRONMENT? dup [IF] drop [THEN] 0= [IF]
    .... \ reference implementation of the deferred words proposal
[THEN]
Here are some statistics about the use of ENVIRONMENT? in general and wordset queries in particular in ANS Forth programs:
Program        Author            ENVIRONMENT?  wordset queries
brainless      David Kuehling    yes           no
brew-0_03z9    Robert Epprecht   yes           no
brew_..._38    Robert Epprecht   yes           floating-ext
CD16v11        Brad Eckert       no            no
anagrams       Wil Baden         no            no
pentomino      Bruce Hoyt        no            no
tscp           Ian Osgood        no            no
Gray4          Anton Ertl        yes           no
garbage-coll   Anton Ertl        yes           no
I believe that one of the reasons that wordset queries are used so rarely is that they are so cumbersome to use (that's certainly one of the reasons that kept me from using them).
Why the "X:" prefix?
This will hopefully ensure that there is no naming conflict with any existing environmental query of any system; it also reserves a part of the environmental query name space (by requiring a false result for anything that has not gone to a CfV), without consuming all of it.
If you know of any name conflict of the "X:" prefix with an existing system and have a better suggestion for a prefix, let me know.
Bruce McFarling has suggested changing the prefix such that the query string can be used as a filename on DOS/Windows. However the prefix can be cut away easily, leading to a filename compatible with DOS/Windows (if the extension name is compatible), as the reference implementation demonstrates.
Guido Draheim suggested using a suffix, as it has advantages with input completion. A prefix can also profit from input completion, and overall this issue does not seem very important. Prefixes are more traditional for tagging names in programs (while suffixes are used for file names).
What about extension proposals that have not (yet) gone to a CfV?
If you want to introduce queries for them, do it with a different prefix.
Why not include extension proposals that have not (yet) gone to a CfV?
They may still change before they go to a CfV, so it would not be clear if the system and the querying program refer to the same proposal.
Won't there be too many extension names?
Guido Draheim thinks that we will see many backwards-compatible revisions of proposals, so the number of extension names that a system would have to keep around would be a problem (since the nth revision would satisfy the requirements of revisions 1 to n, and thus n names would have to be kept around).
One way to deal with this would be to use a consistent naming scheme for backwards-compatible extensions: deferred, deferred-2, deferred-3 etc. Then the system just needs to store the base name and the current revision number, and can check with a little code whether the queried-for extension is supported.
Keeping this naming scheme would be the responsibility of the person who maintains the list of CfVs. (It's not the responsibility of the system implementor and therefore not in the normative part of this proposal).
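As a rough illustration of the "little code" mentioned above, here is a minimal sketch. It assumes a hypothetical system that supports the extension "deferred" up to revision 3, that the "X:" prefix has already been removed from the query string, and that BASE is decimal. The names supported-base, supported-revision, revision-ok? and extension-supported? are made up for this example and are not part of the proposal; string-prefix? is the same definition as in the reference implementation.

```forth
\ Sketch only: supported-base, supported-revision, revision-ok? and
\ extension-supported? are hypothetical names, not part of the proposal.

: string-prefix? ( c-addr1 u1 c-addr2 u2 -- f )
    \ true if string 2 is a prefix of string 1
    tuck 2>r min 2r> compare 0= ;

: supported-base ( -- c-addr u )  s" deferred" ;
3 constant supported-revision  \ assumed: deferred, deferred-2, deferred-3

: revision-ok? ( c-addr u -- f )
    \ c-addr u is what follows the base name:
    \ empty (revision 1), or "-N" with 2 <= N <= supported-revision
    dup 0= if 2drop true exit then          \ plain base name
    over c@ [char] - <> if 2drop false exit then
    1 /string                               \ skip the "-"
    dup 0= if 2drop false exit then         \ "-" without a number
    0 0 2swap >number ( ud c-addr2 u2 )     \ assumes BASE is decimal
    nip if 2drop false exit then            \ trailing non-digit characters
    if drop false exit then                 \ number does not fit in one cell
    2 - supported-revision 1- u< ;          \ 2 <= N <= supported-revision

: extension-supported? ( c-addr u -- f )
    \ c-addr u is the query string with the "X:" prefix already removed
    2dup supported-base string-prefix? 0= if 2drop false exit then
    supported-base nip /string revision-ok? ;
```

With these definitions, "deferred", "deferred-2" and "deferred-3" would be reported as supported, while "deferred-4" and unrelated names would not. A real system would presumably keep a table of base names and revision numbers instead of one hard-coded pair.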
How about defining a way to query for implementor extensions?
Guido Draheim suggested this. With this one system implementor could formally define an extension, and another system could adopt that extension, and programs could query for this extension.
Doing this would require reserving some part of the ENVIRONMENT? name space for implementor-extension queries, with each query having an implementor part and an extension name. A naming authority would assign implementor part names to implementors, and the implementor would assign extension part names. (Actually, the implementor could use it for arbitrary queries, not just extension queries).
I am not convinced that there is enough demand for that. In any case, I would like to see it handled in a separate RfD/CfV.
Why not just ask for word names with [UNDEFINED]?
Hans Bezemer and Albert van der Horst favour this approach.
That only tells whether a word exists in the system, but not how it behaves.
It would also require asking separately for each word. And for on-demand loading, it would require organizing each word separately. Lots of effort for the programmer and the system implementor.
Implementation and Tests
* Reference implementation, appended below; to make it useful, you should also download and unpack the archive containing a directory of reference implementations of voted-on extensions (if available).
* Tests
EXPERIENCE
All ANS Forth systems I know implement this proposal in a minimal way (answer all queries with false). None implement it in a non-minimal way. No programs have used the proposal yet.
- anton
\ reference implementation for extension queries
\ public domain
\ This implementation can be used both by systems and programs to
\ implement extension queries; it loads the implementation of the
\ extension from an appropriately named file, if the system does not
\ have the extension already.

\ This implementation can be stacked: the system uses it to load its
\ implementation of the extension (if the extension is not already
\ present); and the program uses another instance of this
\ implementation to check what the system has, and load the reference
\ implementation (where available) if necessary.

\ the main downside of this implementation is the location of the
\ extension-dir: If you use an absolute directory name, you have to
\ set it when you install the directory on each system; if you use a
\ relative name, you can only run the program from a specific working
\ directory.

\ here are parameters that you may have to change:

: extension-dir ( -- c-addr u )
    \ directory containing the files implementing the extension;
    \ this directory must only contain files for official extensions,
    \ otherwise this will not be a correct implementation of the proposal
    \ you can find a directory with reference implementations at
    \ http://www.complang.tuwien.ac.at/forth/ansforth/extensions/
    s" extensions/" ;

: extension-prefix ( -- c-addr u )
    \ string at the start of every extension query
    s" X:" ;

: extension-file-extension ( -- c-addr u )
    \ the file extension of the implementation files
    s" .fs" ;

\ end of parameters
31 constant max-word-length

create eq-path
max-word-length extension-dir nip + extension-file-extension nip + chars allot

extension-dir eq-path swap chars move

eq-path extension-dir nip chars + constant eq-filename

: make-eq-filename ( c-addr1 u1 -- c-addr2 u2 )
    dup >r eq-filename swap chars move
    extension-file-extension eq-filename r@ chars + swap chars move
    eq-path r> extension-dir nip + extension-file-extension nip + ;

: string-prefix? ( c-addr1 u1 c-addr2 u2 -- f ) \ gforth
    tuck 2>r min 2r> compare 0= ;

variable extension-list 0 extension-list !
\ linked list of extension names; each node: link, then counted string

\ I would have liked to do "['] defnoop execute-parsing" here, but
\ that's not standard (yet:-).

: insert-extension ( addr u -- )
    dup cell+ char+ allocate throw
    extension-list @ over !
    dup extension-list !
    cell+ 2dup c! \ store count
    char+ swap chars move ;

: lookup-extension ( c-addr u -- f )
    \ true if extension is in list
    2>r extension-list @ begin ( list r: c-addr u )
        dup while
            dup cell+ count 2r@ compare 0= if
                drop 2r> 2drop true exit
            then
            @
    repeat
    2r> 2drop ;

s" extension-query" insert-extension \ this is already loading

: environment? ( c-addr u -- false | ... true )
    2dup 2>r environment? dup if ( f ) \ system answers query in another way
        2r> 2drop exit
    then
    2r@ extension-prefix string-prefix? 0= if \ not an extension query
        2r> 2drop exit
    then
    drop 2r> extension-prefix nip /string ( c-addr1 u1 )
    dup max-word-length u> if \ the name is too long
        2drop false exit
    then
    2dup lookup-extension if \ the extension is already (being) loaded
        2drop true exit
    then
    2dup make-eq-filename r/o open-file if ( c-addr1 u1 fid ) \ doesn't exist
        drop 2drop false exit
    then
    >r insert-extension ( r:fid ) \ break cycle before including
    r> include-file true ;

\ tests for this implementation
0 [if]
require test/tester.fs

\ { s" core" environment? -> true true } \ !! could also be false
{ s" Y:blabla" environment? -> false }
{ s" X:abcdefghijabcdefghijabcdefghijabcdefghij" environment? -> false }
{ s" X:extension-query" environment? -> true }
{ s" X:non-existant file" environment? -> false }
{ s" X:deferred" environment? -> true }
{ s" X:deferred" environment? -> true } \ try again
{ s" X:extension-query" environment? -> true } \ test lookup again
[then]

--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.complang.tuwien.ac.at/forth/ansforth/forth200x.html

From: Bruce McFarling
Subject: Re: 2nd RfD: extension queries
Date: 18 Jan 2005 20:27:51 -0800

Anton Ertl wrote:
> Why not let ENVIRONMENT? return a flag and true, like for wordset queries?

> This proposal is easier to use. What is the point of returning an
> extra flag? "Yes, we have heard of that extension, but no, we have not
> implemented it"? That's not a useful information to have; what should
> a program do with that information?

The most useful role it could play is,
TRUE="Yes, we implement it"
Flag="Here is how".
However, that is only truly useful if the Flag is standardised, rather than system-defined, as the ideal would be to have at most one ifdef per means of access plus one if the implementation is not supported, rather than one ifdef per system variant.
I note that your implementation includes a "test" subdirectory in the test. Will subdirectories be in the scope of the RfD/CfV process, or will that remain in the relied on but unstandardised zone?

From: Anton Ertl
Subject: Re: 2nd RfD: extension queries
Date: Wed, 19 Jan 2005 17:55:25 GMT

"Bruce McFarling" writes:
>
>Anton Ertl wrote:
>> Why not let ENVIRONMENT? return a flag and true, like for wordset
>> queries?
>
>> This proposal is easier to use. What is the point of returning an
>> extra flag? "Yes, we have heard of that extension, but no, we have
>> not implemented it"? That's not a useful information to have; what should
>> a program do with that information?
>
>The most useful role it could play is,
>
>TRUE="Yes, we implement it"
>Flag="Here is how".

I don't know what a flag would indicate about the "how", and why that would be interesting.
But in any case, that's not how wordset queries work. There the role they play is:
true="This query string is known."
flag="true if the wordset is present"

>Will subdirectories be in the scope of the
>RfD/CfV process, or will that remain in the relied on but
>unstandardised zone?

Sure, they are in scope (anything is in scope), but somebody needs to make an RfD about them. I have that on my to-do list, but I guess that this will be a tough one.
- anton

From: Bruce McFarling
Subject: Re: 2nd RfD: extension queries
Date: 19 Jan 2005 17:30:11 -0800

Anton Ertl wrote:
> I don't know what a flag would indicate about the "how", and why that
> would be interesting.

> But in any case, that's not how wordset queries work. There the role
> they play is:

> true="This query string is known."
> flag="true if the wordset is present"

At the moment, since the flag for "some but not all the words in the wordset are present" is by implication FALSE, the combination would tell you nothing more than that IF you should find one of the words in the wordset, it is safe to leave it and just define the missing words. Otherwise, since an implementation that does not know the query string may not even KNOW the definitions as contained in the wordset, then to be on the safe side you have to assume that all the wordset words are missing, even if words of the same name appear in the dictionary.
However, since TRUE is all bits set, that means that there are 2^16-2 standard values that are neither true nor false. If "is present" is meant literally, then a wordset that is available but not present is in a middle ground state of being "potentially present".
So, a standard set of values would be defined, and returning that value as the flag would inform the user that the wordset is not present, but available.
0 - Not available
1 - Only available through user intervention
2 - Available by loading a BLOCK file
3 - Available by loading a file
4 - Available by bringing a wordlist into the search order
5 - Available by executing the query string as a word
I'm not sure if there are any other automated options.
It's great when query strings trigger a process that makes a wordset available. But simple solutions for simple systems. Indeed, just in writing it I am thinking even the above is too much. A long established common practice to make BLOCKs accessible by name is to define a constant with the block number, and with the standardisation of VALUEs that is now mutable in a standard way. So that could be:
2 - Available through a BLOCK constant with that name
Queries and files are both strings, but files are strings that may have more limitations. A file providing facilities may require prepending with device or subdirectory information, it may require appending file type information, and it may require a conversion to fit within limitations on width or character set -- especially if X: is used. So it seems like a word to translate from an environmental query to a complete filename for the system is required, maybe:
ENVIRONMENT-FILE ( ca u -- ca' u' )
Obviously a system that never returns "3" never needs ENVIRONMENT-FILE.
Of course another way of bringing up a facility that comes to my mind is by executing the facility name. So maybe the list stops with:
4 Available by executing the query string as a word.
Obviously if some systems are already using flags in a system-specific version of this, it may be necessary to count down from -2 rather than count up from 1.

From: Anton Ertl
Subject: Re: 2nd RfD: extension queries
Date: Fri, 21 Jan 2005 13:04:22 GMT

"Bruce McFarling" writes:
>
>Anton Ertl wrote:
>> I don't know what a flag would indicate about the "how", and why that
>> would be interesting.
>
>> But in any case, that's not how wordset queries work. There the role
>> they play is:
>
>> true="This query string is known."
>> flag="true if the wordset is present"
>
>At the moment, since the flag for "some but not
>all the words in the wordset are present" is by
>implication FALSE, the combination would tell
>you nothing more than that IF you should find one
>of the words in the wordset, it is safe to leave
>it and just define the missing words. Otherwise,
>since an implementation that does not know the
>query string may not even KNOW the definitions as
>contained in the wordset, then to be on the safe
>side you have to assume that all the wordset words
>are missing, even if words of the same name
>appear in the dictionary.

If a system claims to be ANS Forth, and it provides one of the names defined in ANS Forth, the name must behave as described in the standard:
|3. Usage requirements
|
|A system shall provide all of the words defined in 6.1 Core Words. It
|may also provide any words defined in the optional word sets and
|extensions word sets. No standard word provided by a system shall
|alter the system state in a way that changes the effect of execution
|of any other standard word except as provided in this Standard. A
|system may contain non-standard extensions, provided that they are
|consistent with the requirements of this Standard.
However, even if a system were allowed to give non-standard meanings to standard names, I don't see anything in the definition of ENVIRONMENT? or the wordset queries that would change that. So, in either case, providing an additional flag does not provide additional information.
For the extension queries, one might propose such a meaning of the FALSE TRUE result, but that would deviate from the wordset-query usage (which is the main point of adding the flag).
>2 - Available by loading a BLOCK file
>3 - Available by loading a file
Not useful, since no file name and/or block number is provided.
>4 - Available by bringing a wordlist into the search order
Not useful, since the WID is not provided. Moreover, if the system provides the search order wordset, all standard names should be in the FORTH-WORDLIST; and if the system does not provide the search order wordset, the user has no standard way to get the wordlist in the search order (and typically, there is no search order that the user could bring the wordlist into).
>5 - Available by executing the query string as a word
That would be moderately useful (see below).
However, ANS Forth does not give this meaning to the result (and it actually allows only flags), so I still don't see that the flag produced by the standard wordset queries is useful.
If a program asks whether a wordset is available, it typically wants to use that wordset (or work around its absence); so why not make it available upon the query and return true instead of having an extra word there that does this and return 5?
Ok, there may be some programs that are just doing reporting on the completeness of the system and check the wordsets without using them; but these programs typically require few other resources, so making the wordsets available for those programs should not be a problem, either.
>Its great when query strings trigger a process that
>makes a wordset available. But simple solutions
>for simple systems. Indeed, just in writing it I
>am thinking even the above is too much. A long
>established common practice to make BLOCKs accessible
>by name is to define a constant with the block number,
How many lines of code or bytes do you think that will save? I think it will cost both bytes and lines of code, because with that approach you typically need the name twice: once for ENVIRONMENT?, and once in forth-wordlist. With ENVIRONMENT? using its own wordlist, instead of having a constant in the environment wordlist, and another in the forth-wordlist, you would just define

: wordset-query ( n "name" -- )
    create ,
  does> ( -- true )
    @ load true ;

25 wordset-query float-ext ( -- true )

(in both approaches, provisions against loading twice should be added).
And this saves the trouble of programs having to write, for each wordset, things like
s" " environment? [if]
    dup true = [if]
        drop \ everything ok
    [else] dup 2 = [if]
        drop load
    [else] dup 3 = [if]
        drop included
    [else] dup 5 = [if]
        drop
    [else]
        drop s" " included
    [then] [then] [then] [then]
[else]
    s" " included
[then]
Yes, you might factor that into a definition, but that needs even more space (the system does not have to consider all the cases that you have invented, but the program would have to).
>Queries and files are both strings, but files are
>strings that may have more limitations. A file
>providing facilities may require prepending with
>device or subdirectory information, it may require
>appending file type information, and it may require
>a conversion to fit within limitations on width
>or character set -- especially if X: is used.
Which brings me to the question: Have you looked at the new reference implementation of extension queries? Any comments about the filename handling there? Does it address your concerns?
- anton

From: Bruce McFarling
Subject: Re: 2nd RfD: extension queries
Date: 23 Jan 2005 00:24:35 -0800

Anton Ertl wrote:
> If a system claims to be ANS Forth, and it provides one of the names
> defined in ANS Forth, the name must behave as described in the
> standard:

Yes, but if it claims to be ANS Forth 94, it will not necessarily know anything about any newly standardised words (similarly wrt ANS 200x regarding anything standardised in ANS 201x).

From: Bruce McFarling
Subject: Re: 2nd RfD: extension queries
Date: 23 Jan 2005 00:53:50 -0800

Anton Ertl wrote:
> >2 - Available by loading a BLOCK file
> >3 - Available by loading a file

> Not useful, since no file name and/or block number is provided.
As noted in the following section, for the first, executing the query string as a BLOCK constant would give the Block Number, and for the second, you would need a standard word to translate from query string to filename, but at least it could in the simpler cases be nothing but a string manipulation.
> >4 - Available by bringing a wordlist into the search order
> Not useful, since the WID is not provided. Moreover, if the system
> provides the search order wordset, all standard names should be in the
> FORTH-WORDLIST;
Aha, so that would have to merge with 5, as it would be some extra-standard manipulation that brings the wordlist into the Forth wordlist, and that may as well be the query string executed as a word.
> >5 - Available by executing the query string as a word
> However, ANS Forth does not give this meaning to the result (and it
> actually allows only flags), so I still don't see that the flag
> produced by the standard wordset queries is useful.
Certainly ANS Forth 94 does not give this meaning to the result, but ANS Forth 94 is still crippled wrt social programming. The goal should be a compatible extension. The flag produced by the standard wordset queries as a boolean is not very useful, but it allows a compatible extension with useful additional extensions.
> If a program asks whether a wordset is available, it typically wants
> to use that wordset (or work around its absence); so why not make it
> available upon the query and return true instead of having an extra
> word there that does this and return 5?

You got it in one. If a program asks whether a wordset is available, it TYPICALLY wants to use that wordset or work around its absence. What about something that wants to check whether the wordset is present, and if not installs itself to make the wordset available? It can in fact make the wordset available in a standard way even if that is not the way that the system it is loaded into normally does things.

Implementations will provide their own tools for handling these things, and they may or may not be integrated into the ENVIRONMENT query system. On the implementation side, it's a question of whether there is a straightforward mapping of how the implementation does it.

> And this saves the trouble of programs having to write, for each
> wordset, things like

> s" " environment? [if]
>     dup true = [if]
>         drop \ everything ok
>     [else] dup 2 = [if]
>         drop load
>     [else] dup 3 = [if]
>         drop included
>     [else] dup 5 = [if]
>         drop
>     [else]
>         drop s" " included
>     [then] [then] [then] [then]
> [else]
>     s" " included
> [then]

> Yes, you might factor that into a definition, but that needs even more
> space (the system does not have to consider all the cases that you
> have invented, but the program would have to).

This is as opposed to the current state of the art, which is:
[IF] 4thXYZ action [ELSE] 4THQRP action [ELSE] etcetera .... [THEN] [THEN]
A single word that is handed the name and works with the standard eqr (environmental query result index) to load it is a massive step in the right direction.
Sure, it would be loverly if everyone would simply settle on loading the thing when the query is made, and returning TRUE, but I do not believe for one second it will happen. So out in the real world, the best I will hope for is a standard way for the implementation to express how the implementation does things.
Indeed, that is why the second set had everything floating around the string of the implementation query, so that you could easily define it once if it was not already present and then just hand it a string. I could see not even loading that word until an environmental query returned an eqr between TRUE and FALSE.
> >Queries and files are both strings, but files are
> >strings that may have more limitations. A file
> >providing facilities may require prepending with
> >device or subdirectory information, it may require
> >appending file type information, and it may require
> >a conversion to fit within limitations on width
> >or character set -- especially if X: is used.

> Which brings me to the question: Have you looked at the new reference
> implementation of extension queries? Any comments about the filename
> handling there? Does it address your concerns.

I think it addresses the X: concern, or maybe shows why the X: issue is a minor inconvenience rather than any serious problem. There is the other question of how to map from that to DOS/Rock Ridge 8+3 filenames, but if it is not going to be possible for a script to find out how the system loads these kinds of things, then the answer to that will obviously be RTFM, if you are lucky enough to have a manual.