Browse #6

Open
wants to merge 3,265 commits into
base: master
Conversation

@gchiu gchiu commented Jun 10, 2017

Creates a stub for BROWSE in mezz-help.r and hijacks it with the new BROWSE defined in the extension.

hostilefork and others added 30 commits March 31, 2017 05:00
R3-Alpha created two notoriously unique "contexts" for the root and
task variables.  These non-user-exposed contexts did not have keylists,
because they were produced before the object-generation machinery
was initialized.  Yet the contexts were not intended to be user
exposed to select values out of.

Much simpler would be just to have them be arrays and not contexts, as
the means of selecting out of them is done by C code.  This does that
change, and also attempts to do some layering so that it's clear what
is "program global" and "task local".  However that is very difficult,
as tasks were never clearly designed in terms of how independent
threads could safely be operating on common memory pools...so this
exposes the dodgy character of how un-separated the task and program
systems actually are.

Also, the values are initialized to unreadable blanks by default, and
only those values that are used by the C get initialized.  This helps
to notice when a root or task variable is no longer used, because the
failure to fill it in will give an assert in ASSERT_ARRAY.
When creating processes, the `errno` variable was passed by address
to write().  This is not legal on Android, as errno is volatile.
This copies errno to a temporary and writes that instead.
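
A minimal sketch of the errno fix described above. The helper name `write_errno_copy` is hypothetical and not taken from the Rebol sources; it only illustrates copying errno into an ordinary int before its address is handed to write().

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: report the current errno value through `fd`.
 * errno is typically a macro rather than a plain variable, so snapshot
 * it into an ordinary int before taking an address. */
static ssize_t write_errno_copy(int fd)
{
    int err = errno;                      /* copy first, before anything else */
    assert(fd >= 0);
    return write(fd, &err, sizeof(err));  /* write the copy, not &errno */
}
```

The reader on the other end of the descriptor receives `sizeof(int)` bytes holding the saved error code.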

Also, the Android NDK defines char as unsigned by default:

http://stackoverflow.com/questions/7414355/

While most of Rebol uses REBYTE and only uses char for unencoded
string literals, an example in the host was using a < 0 test on char
to tell if it was a UTF8 multibyte character.  This changes that host
code to use REBYTE and check for > 127.
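
The signedness problem can be sketched as follows. The `REBYTE` typedef here is an assumption mirroring Rebol's byte type, and `count_non_ascii` is a hypothetical function, not the host-readline code.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char REBYTE;  /* assumption: stands in for Rebol's byte type */

/* Count the bytes that belong to multibyte UTF-8 sequences.
 * A `c < 0` test on plain char never fires where char is unsigned;
 * comparing an unsigned byte against 127 works on either kind of platform. */
static size_t count_non_ascii(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        REBYTE b = (REBYTE)*s;
        if (b > 127)
            ++n;
    }
    return n;
}
```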
The C and C++ standards do not dictate if `char` is signed or unsigned.
Lest anyone think all environments have settled on them being signed,
they're not... Android NDK uses unsigned:

http://stackoverflow.com/questions/7414355/

In order to give the option some exercise, make the C++11 builds and
above use unsigned chars.

This expands and corrects the attempt to change to REBYTE in
%host-readline.c from the previous commit.
Corrects typo where definitional_return_dsp was used where
definitional_leave_dsp was intended.  Also, prevents a leave: or a
return: from causing problems when one kind of function or procedure
is nested inside another.  (e.g. PROCEDURE is not supposed to have a
definitional RETURN: but was picking them up from a function definition
inside them, could also have occurred with a local named RETURN:)

Includes a fix for where an UNUSED() macro was trashing data that was
used in posix signal builds.

Includes a few small comment updates.
This change radically simplifies HIJACK, in the process making a
hijacked function dispatch with the same performance as the function
which is doing the hijacking.

It also provides a meaningful semantic for COPY of FUNCTION!, which
had historical problems in R3-Alpha.  The new semantic is to do the
proxying which was previously performed with `hijack :some-func _`,
where the idea is that a copy of a function will behave like the
original but with the difference that it has a new identity.  This
new identity would have to be hijacked distinctly from the original.
Further experimenting with HIJACK revealed that direct modification of
the paramlist was not a good idea, because other locations
(specializations, adaptations, MAKE FRAME!s, running function calls...)
might depend on the shape of that paramlist not changing.

This takes back the idea of doing a replacement of the function body
only if the underlying paramlists are compatible...which will be true
for many adaptations, chains, or specializations from copies.  It is
also true for taking the function back from its original--which solves
the issue of un-hijacking (one of the draws of the approach in the
previous commit, but this way works).

However, a non-compatible underlying function means that the frames
will need to be proxied--effectively making a hijack cost similar to
an extra layer of function call.

At an intermediate point in trying to solve this, it was noticed that
f->func is a somewhat misleading name for the frame's current function,
because it changes.  This changes it to f->phase and preserves the
original function in f->original.

Adds a non-trivial specialization test to hijack tests.
The ECHO feature was based on some hackish C code inside the device
model.  Given the general desire to simplify and replace the current
workings of that model, this moves the feature into userspace and
bases it on HIJACK of WRITE-STDOUT and INPUT.

As a demonstration of the improved power, this gives ECHO some slightly
interesting features.

The words supported are ON, OFF, and RESET.  You can also specify a
filename or a filename template as a block, where * indicates a place
to substitute in a string.  Supplying a string instead of a filename
then will offer a new substitution.

Sample transcript:

    >> echo on

    >> 1 + 2
    == 3

    >> 3 + 4
    == 7

    >> echo off

    >> read/lines %echo.txt
    == [
        ""
        ">> 1 + 2"
        "== 3"
        ""
        ">> 3 + 4"
        "== 7"
        ""
        ">> echo off"
    ]

    >> echo "cat"

    >> 'meow 'meow
    == meow

    >> echo off

    >> read/lines %echo-cat.txt
    == [
        ""
        ">> 'meow 'meow"
        "== meow"
        ""
        ">> echo off"
    ]

    >> echo [%custom * %-format.log]

    >> echo on

    >> "use blocks for templates"
    == "use blocks for templates"

    >> echo off

    >> read/lines %custom-cat-format.log
    == [
        ""
        ">> echo on"
        ""
        {>> "use blocks for templates"}
        {== "use blocks for templates"}
        ""
        ">> echo off"
    ]

    >> echo reset

    >> read/lines %custom-cat-format.log
    == []

Also puts static into the linker flags, not the compile flags.
Basically every user of VAL_HANDLE_POINTER has to cast to a type
other than void.  So there's not any seemingly great reason not to
just fold the cast itself in with the macro (like SER_AT() does).
This lets the callsites look a little more readable.
imports libraries using tags to look them up.
Create a location to store the location of user libraries
use the rebol/locale/library setting
use locale/library as source
When porting the original and tough-to-follow HTTPS extension code,
there was a switch() statement that had two cases it checked, and
skipped all else:

https://github.com/zsx/r3/blob/ee00c3dd2295d42e4fec31b389ed841ea21172f0/src/os/host-core.c#L570

In the interest of tightening things up, an error was added to see if
it ever triggered.  It turns out that on certain configurations, it
can trigger because there is a "G" and "PUB-KEY" field in the object.
One instance of causing this was reading https://matrix.org

This relaxes the error to account for those 2 fields, but still gives
an error if unexpected fields are present in the object, which helps
make it more clear what the code is doing.
Inspired by the C++ attribute [[maybe_unused]]:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0212r1.pdf

The UNUSED() can be interpreted as coexisting with a debug build.
The only requirement imposed by Ren-C is that the UNUSED() appear
*after* any asserts, because of the debug corruption of the values
parameterized in UNUSED().
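
The ordering requirement can be illustrated with a sketch. This `UNUSED()` definition is an assumption made for illustration (the byte pattern 0xBD is arbitrary) and is not Ren-C's actual macro.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical UNUSED(): release builds only silence the warning; debug
 * builds also overwrite the variable so accidental later use stands out. */
#ifdef NDEBUG
    #define UNUSED(x) ((void)(x))
#else
    #define UNUSED(x) ((void)memset(&(x), 0xBD, sizeof(x)))
#endif

static int twice(int value, int debug_only)
{
    assert(debug_only >= 0);  /* asserts come first: they still read the value */
    UNUSED(debug_only);       /* after this point the value may be corrupted */
    return value * 2;
}
```

If the two statements were swapped, the assert in a debug build would examine the corrupted value rather than the caller's argument.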
Change clip-str logic to clip only if longer than 48 characters
Allow @ in user names so can decode urls like pop3://carls@rebol.com:pass@rebol.com

    >> decode-url pop3://carls@rebol.com:pass@rebol.com
    == [scheme: 'pop3 pass: "pass" user: "carls@rebol.com" host: "rebol.com"]
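
One way to get that result is to split the authority part at the last `@` rather than the first. The C sketch below is a hypothetical helper written for illustration; it is not how DECODE-URL itself is implemented, and it assumes a `user:pass@host` shape.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: split "user:pass@host" at the LAST '@', so that
 * '@' may appear inside the user name. Returns 1 on success, 0 if the
 * input does not have the expected shape or does not fit in `cap` bytes. */
static int split_authority(const char *auth,
                           char *user, char *pass, char *host, size_t cap)
{
    const char *at = strrchr(auth, '@');   /* last '@' ends the userinfo */
    const char *colon;
    size_t ulen, plen;

    assert(cap > 0);
    if (at == NULL)
        return 0;
    colon = memchr(auth, ':', (size_t)(at - auth));  /* first ':' in userinfo */
    if (colon == NULL)
        return 0;
    ulen = (size_t)(colon - auth);
    plen = (size_t)(at - colon - 1);
    if (ulen >= cap || plen >= cap || strlen(at + 1) >= cap)
        return 0;
    memcpy(user, auth, ulen);
    user[ulen] = '\0';
    memcpy(pass, colon + 1, plen);
    pass[plen] = '\0';
    strcpy(host, at + 1);
    return 1;
}
```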
Mechanically, the specialization dispatcher was determining what the
function it was specializing was based on the keylist of the exemplar
frame (which is a function paramlist).  However, this keylist is for
the *underlying* function, which was not necessarily the one that
was passed to SPECIALIZE.

This is related to a known problem that since a frame can be used for
many different "phases" of a function, each REBVAL of a frame must
indicate which phase it is for--in order to show which subset of
the keylist should be visible to that frame.  By making the phase of
the exemplar frame the function that was specialized, the memory is
retained...then that is used by the dispatcher.
Ren-C introduced the idea that SELECT, in order to be "truthful" about
the full range of values that can appear in a block, would use an
out-of-band signal that cannot appear in a block (void) to denote
that nothing was found. Since BLANK!s can appear in the block, it
wouldn't give you the full information to return blank, since these
two would be indistinguishable:

    >> select [a b c _] 'c
    == _

    >> select [a b] 'c
    == _

Empirical experience of Rebol users in history, however, is that while
returning blank is "incomplete", the handling of voids is awkward.
Frequently it is known that a block does not contain literal falses or
blanks, and one might casually write:

    >> if x: select [...] y [code]

Instead of needing to say:

    >> if value? x: select [...] y [code]

(Here, VALUE? is being used as it is intended to someday be retaken,
to mean NOT VOID?)

With other changes to the mechanics of things like if true [] => _
and if* true [] => void, the idea of getting convenient effects
by default suggested that a similar tactic might serve well in
SELECT. Those who know they need the discernment could use SELECT*
and preserve the void/blank distinction, while casual users could
get by with blanks.

This changes the fundamental operations to be the voiding versions,
since they give the most information.  Then it builds convenience
wrappers on top of those, using CHAIN.  The wrapper for TAKE, rather
than converting to a blank when there's nothing to take, gives an
error...because operations that have side effects should generally not
quietly fail.
On Windows, system/options/boot had backslashes in it, because a
TO FILE! was done on the argv[0] passed in to the command line
processing.  The correct operation to use, since this is a platform
specific string, is TO-REBOL-FILE.
The concept in Rebol is that errors are structural entities, which
hold the relevant parameters for the error, that the user can then
TRAP and extract and react to.  This means that they are defined as
templates in %errors.r, and historically it has required special
syntax from the C code to trigger them.  This also would theoretically
help make it easier to provide internationalization of the strings.

Because the process of adding an error to that table was somewhat of
an effort, especially for an error in transitional code, a "catch-all"
error called RE_MISC was used.  While completely un-informative, it
would serve as a placeholder to come back and replace with a better
error later.

Yet the uninformativeness of an RE_MISC error usually necessitated a
comment anyway.  Much more useful would be to just allow fail() to
accept a C character string and write the comment in that space.  It
can still serve as a sign to come back and fix later with a real
message just as well--just look for `fail ("`.

So this commit eliminates RE_MISC, and takes advantage of the ability
to distinguish UTF-8 strings from REBSER* to allow character strings to
be passed to fail.  As an additional convenience, it takes the #1
most meaningless error about a value (Error_Invalid_Arg) and lets that
lazy option be said even more lazily by passing a REBVAL* to fail():

    fail (Error_Invalid_Arg(ARG(foo))) => fail (ARG(foo))

While perhaps encouraging more laziness, it also makes it easier to
spot that no "real error" is being given in these cases.  A C++
template trick is used to check at compile-time that the void *
parameter isn't actually a RELVAL* or other type Fail_Core() can't
actually handle.
In R3-Alpha, the way that it was determined if a series was an array
or not was if the width was the size of a REBVAL.  That didn't allow
other REBSER which had contents the size of a REBVAL, which might be
useful.  It was changed to a series header bit.  This is now more or
less set in stone, and unlikely to change.

There are a few good reasons to get rid of the abstraction
Is_Array_Series(), which is just a call to GET_SER_FLAG for
SERIES_FLAG_ARRAY.  If people writing routines realize that it is
simply a test of a header flag, then it can make conditional code which
tests other header flags be more obviously finessed by testing multiple
flags at once, or otherwise using the knowledge.

Also, there is a NOT_SER_FLAG variation, which can be more obviously
negated than !Is_Array_Series().

In any case, Is_Array_Series() is not the kind of thing that would
have been added today vs. just testing the series flag.  So that's a
good argument for taking it out.
The catalog arrays didn't contain anything that had to worry about
GC, but an upcoming commit does have some things that do.  They must
be protected from GC one way or another, so they can be gathered and
then put in the system object later.

Includes misc. comment cleanups, and a fix of an `if` statement that
didn't have parentheses, but got away with it because it used a macro
that would generate the parentheses.
There were two versions of LOAD written in Rebol code.  One was the
"early boot version" and the other was the "late boot version".  This
has led to mistakes and misunderstandings, not realizing there are
two LOADs...and as it turns out, the early boot version is never
used.

Removing it for now, and if it needs to be brought back it needs
another name, and an explanation of what it's for.
The Make_Series() function takes some flags to dictate whether the
resulting series should be an array, or allocated to a power of 2, etc.

This change moves the SERIES_FLAG_XXX that would be needed to replace
the MKS flags into the same group.  This way, the header could be
assigned directly from the flags, and the series creation behaviors
could then follow from that.

(Rearrangement of the flags is broken out and tested in a separate
commit in order to help isolate problems in the rest of the change.)

Includes a C++11 type checker for the AS_FUNC() cast.
This commit streamlines the creation of series slightly, by making the
flags which cue a series creation behavior the exact same flags that
are set in the series header.

Hence, rather than being in a situation like:

    s->header.bits = NODE_FLAG_VALID;
    if (flags & MKS_ARRAY) {
        s->header.bits |= SERIES_FLAG_ARRAY;
        ...
    }

The code can just be like:

     s->header.bits = NODE_FLAG_VALID | flags;
     if (flags & SERIES_FLAG_ARRAY) {
         ...
     }

This also means that many flags that were set after creation can be
passed in as well, saving extra steps.

Default versions of Make_Series() and Make_Array() that do not take
flags are provided as well.

Includes some of the very basic and minimal groundwork for getting file
and line information out of source ANY-ARRAYs.
hostilefork and others added 28 commits June 4, 2017 18:05
The division of labor in R3-Alpha for code that required a different
implementation on various OSes was to call OS_XXX functions, that were
linked separately as part of the host's implementation.  These
functions were intended to be C code that did not rely on Rebol types
(e.g. any string parameters to these services would have to be
transformed into the OS-native representation) and memory had to
be explicitly managed by contract, vs leveraging Rebol series.

The consequence of this division was that there would be a native
for a function like CALL, which would need to translate from Rebol
types into non-Rebol structures and flags, which generated overhead...
and then the routine receiving the unpacked data would have to give
back its results to be turned back into Rebol again.

This begins a transition to where CALL itself lives in an extension
which is itself able to use the internal Rebol API.  Not only does this
permit builds to omit CALL (e.g. emscripten would not have had any
underlying OS_XXX function for process spawning), but it means the
code can be much more straightforward.

This only does the conversion to an extension--it does not yet make any
significant changes to the code.  Being done as a separate commit to
see if Travis can build it.
Previously, the CALL mechanics were reused by way of a shared C routine
in the BROWSE native.  This commit alters the form of reuse to actually
use the CALL native directly, putting BROWSE into userspace.

Because some C code may be necessary to detect what browser(s) are on
the system (e.g. by way of registry key access), this moves just that
part into a native called GET-OS-BROWSERS.

Includes a fix for a bug in Linux builds that was introduced while
extracting to a module.
SLEEP was added as a temporary hack to work around potential issues
with WAIT.  However, this created an unwanted OS dependency in the
core itself.  Moving the function to an extension which already has
the necessary OS dependencies for Windows and POSIX is at least a
small step forward.
Instead of erroring when an explicit handler is not given for an
OS build when doing GET-OS-BROWSERS, this falls back to /usr/bin/open,
which should make it work again on OS X or other POSIX-like systems
that don't explicitly define TO_POSIX (but can share code).
This starts taking steps to remove the redundancy between the code
implementing CALL on Posix and on Windows.  The various flags or
parameters for translating input types are turned into just basic
checks of VAL_TYPE(ARG(...)) or of the refinement flags.

As an interim trick, the native's frame_ pointer is passed in to give
the formerly independent OS_Create_Process() functions access to the
parameters, instead of having to find ways to coerce them into non
Rebol types.

Once the OS_Create_Process() functions are able to operate on actual
Rebol binaries and strings directly, they'll be able to progressively
append content to the output--rather than needing to do a large
non-Rebol allocation that is returned all at once.
The routines OS_ALLOC() and OS_FREE() were originally standardized so
that when Rebol core wanted to exchange information (such as a string
or buffer) with the host--and the buffer would have to be freed by
the core or by the host--that an assumption wasn't made about what
kind of allocator would be used.

With CALL moved out of the core and into an extension, there is no
longer any communication of this nature.  CALL is a self-contained
native and so it can use whatever allocator the implementation wants.
Switching to use plain malloc() and free() corrects a current problem,
but it's likely that these allocations won't ultimately be needed
because a small fixed-size chunk buffer could be used to add data
directly to the BINARY! or STRING! outputs of CALL.

This also makes the flow of data clearer (and double checked) by
passing NULL for the pointer to the error and output buffers when
they will not be allocated by the OS_Create_Process API.  Previously
it could be difficult to tell when parameters were inputs vs. outputs.
The intention behind the array canons was to save their canonical form
and make comparison easier. However, Intern_UTF8_Managed might not
return the canonical form of a string (when its canonical form already
exists and differs from the word in question), which makes direct
comparison fail.
Because of this, system/locale/language, locale will be set when the
extension is loaded.
On Windows, REBCNT/REBINT is defined as "[unsigned] long", and
REB_U/I32_ADD_OF takes (i32*). Even though their sizes are the same on
this platform, (i32*) and (long*) are still incompatible data types.
A potential disruption of www.rebolsource.net would break the Travis
build, so save these files locally.
To avoid name leaking.
GCC 7 will have -Wimplicit-fallthrough turned on by -Wextra. There are a
number of ways to inform the compiler that falling through is intended.
This commit just uses the comment to do that, which is the most
conservative (least destructive) way: if the compiler doesn't recognize
it, nothing's gonna be broken.
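
A small illustration of the comment-based marker; the function `steps_from` is made up for this example. A compiler that does not recognize the comment simply treats it as an ordinary comment.

```c
#include <assert.h>

/* Illustrative only: count how many steps run when starting at `level`.
 * Each case intentionally continues into the next one. */
static int steps_from(int level)
{
    int steps = 0;
    assert(level >= 0);
    switch (level) {
    case 0:
        ++steps;
        /* fall through */
    case 1:
        ++steps;
        /* fall through */
    case 2:
        ++steps;
        break;
    default:
        break;
    }
    return steps;
}
```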
Only new compilers (like GCC 7) support this option, so add
-Wno-unknown-warning in the hope that the unrecognized option will be
silently ignored; otherwise -Werror will turn the warning into an error.
Plan to move more process related functions to this extension
access-os is used to access various os specific info, like process id,
user id, sending signals etc. Now each of these features is implemented
in the extension as a native. i.e. "access-os 'pid" is replaced by "get-pid"
When the ALLOC_N macros were changed to plain malloc(), there were a
couple of bugs introduced in the Linux CALL implementation.  This
addresses those issues.  Some light formatting done while tracing the
code as well.

Also, this makes CALL test any output series for writability before
starting the process.
changed the URL for the forum and select most recent post
@hostilefork hostilefork deleted the browse branch July 26, 2017 02:22
jjsullivan5196 pushed a commit to jjsullivan5196/r3 that referenced this pull request May 17, 2024
ATRONIX/FEAT: Add a native 'access-os'
7 participants