Change EOL convention to LF? #2
Comments
Should we make sure the files are all LF before we push the Interlisp source code into the GitHub repo? |
Windows can mostly cope with LF-terminated text files nowadays. For a long time the last holdout was Windows Notepad, but that's finally been fixed; in addition, there are many third-party plain text editors that have no problem with LF. |
more @rmkaplan notes: I did a lot of the character reading stuff in the early days, and apparently I also implemented the notion of an external format in the mid 90’s (which I don’t remember at all). But that’s the interface that makes it easy to add the UTF8 stuff.
The issue has to do with the low-level character reading macros. They all go through a macro \inchar that basically wraps a call to \nsin, which then does xccs/ns stuff inline but otherwise calls out to the external format character function. But \inchar wraps the \nsin call inside another macro \checkeolc, which triggers on CR and LF, and does coercion to internal EOL (which happens to be CR) when the byte sequence matches the EOLCONVENTION of the file.
If it sees a CR and the EOLCONVENTION is CRLF, it peeks at the next byte and reads it if it sees LF. If the macros are told to decrement the byte count, then the CRLF causes the necessary extra decrement. If it doesn’t see the LF, then it returns CR by itself, which is OK because that’s the internal EOL. But if the convention is CR (which it defaults to now) and the file has CRLF, the LF is left in the file, and that screws things up. And if it sees a naked LF, it doesn’t get converted to EOL.
In aboriginal times (before text files moved back and forth between operating system environments with different conventions) there was a lot of fussiness about properly interpreting the EOL. This is still the correct thing to do for output files, so that files will look good in their home environment. But at one point it became apparent that this is a mistake for input files. Since you don’t know the provenance of a file, if you are operating on a file, or a region of a file, with text or character input functions, then any of the 3 EOL indicators that happen to appear in the file should be mapped to the internal EOL.
In fact, that is what PFCOPYBYTES is doing—it calls \NSIN directly instead of \INCHAR because it doesn’t want the accidental EOL convention of the file to get in the way. I thought I had cleaned this up a long time ago, but apparently not. I’m tempted to take another crack at it: to change the \CHECKEOLC macro to scoop up all the options, and then to recompile the relatively few functions that contain the macros (which may run into the problem in LLREAD). |
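The two input behaviors described above can be modeled in a few lines. This is a Python sketch, not the Interlisp macros: `read_with_convention` and `read_any_eol` are hypothetical names, the XCCS/external-format decoding is left out, and byte-count bookkeeping is ignored. It only illustrates why a CRLF file read under the CR convention leaves a stray LF, and what mapping all three indicators to the internal EOL would look like.

```python
CR, LF = 13, 10
INTERNAL_EOL = CR  # per the thread, the internal EOL happens to be CR


def read_with_convention(data, convention):
    """Convention-bound coercion: only the byte sequence that matches the
    file's EOL convention ("CR", "LF" or "CRLF") becomes the internal EOL."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if convention == "CRLF" and b == CR:
            # Peek at the next byte and consume it if it is LF.
            if i + 1 < len(data) and data[i + 1] == LF:
                i += 1
            out.append(INTERNAL_EOL)  # a lone CR still comes back as CR
        elif convention == "LF" and b == LF:
            out.append(INTERNAL_EOL)
        else:
            out.append(b)  # under "CR", a CR is already the internal EOL
        i += 1
    return out


def read_any_eol(data):
    """Proposed input behavior: CR, LF and CRLF all map to the internal EOL."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if b == CR:
            if i + 1 < len(data) and data[i + 1] == LF:
                i += 1  # swallow the LF of a CRLF pair
            out.append(INTERNAL_EOL)
        elif b == LF:
            out.append(INTERNAL_EOL)
        else:
            out.append(b)
        i += 1
    return out
```

Calling `read_with_convention(b"a\r\nb", "CR")` keeps the LF byte in the result, which is the failure mode described; `read_any_eol` on the same bytes yields a single EOL.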
Have a look at the source for PFCOPYBYTES in PRINTFN |
In fact, could you just add CHARSET of xccs vs. Unicode to the READTABLE too? |
looks like at a minimum PFCOPYBYTES is misnamed or the patch was put into the wrong place. |
What patch?
I wonder if there is a simpler way of doing this, for the common case where input and output file have the same character encoding, just do the right number of bin/bouts, don’t get into character interpretation (except for ^F and the EOL coercion).
If the external formats differ and you do need to do the character interpretation, well, that’s where you get the multiple-value stack-smashing.
(In reply to Larry Masinter, Jan 4, 2021, 5:26 PM.)
|
i'm confused. This is the thread to talk about "Change EOL convention to LF". |
It is PFCOPYBYTES (or SEE) that provokes the stack crash. If PFCOPYBYTES didn't do what it is doing, then it wouldn't be the provoker...but the bug would still be there |
CR by itself was last used on Mac Classic, and although Windows still likes to write CRLFs, as I said above it can handle either LF or CRLF formats: Cygwin writes LF by default. As for those Unicode line and paragraph terminators, absolutely nobody uses them: they were a fine example of "There are so many standards to choose from, and if you don't like any of them, wait till next year." So switching from CR-only to LF-only would be the Right Thing for usability's sake. |
@rmkaplan I meant READFILE. Use READFILE to tell you which CRs you can turn into LF. |
I’m not sure I understand the strategy. It seems you are changing CR’s to LF’s in the source file as you read it, i.e. not making a new version with the changes?
So you end up with the original file except that it has CR’s replaced by LF by reading with your modified READ(WRITE)FILE subroutines?
Is that it?
Why does that tell you what is safe?
Suppose the file was a CR file but had one LF in there in a string that was intended to stay LF when the file is actually loaded. If you have converted everything to LF, then for that file to be loaded eventually, it would be loaded as a file with EOL convention LF, and all of the LF’s on the file would show up internally as the internal EOL (= 13). And the distinction coded by that one original LF would have been lost. Code that may have been testing for charcode 10 will misbehave.
The more conservative approach is first to scan for all LF characters that might be significant (presumably very few, if any), and then figure out whether or how the behavior they are implementing should be reexpressed.
Am I missing something?
(In reply to Larry Masinter, Apr 3, 2021, 4:12 PM, with attachment eolhack.zip <https://github.com/Interlisp/medley/files/6253719/eolhack.zip>.)
|
I was looking for some method to apply that didn't depend on examining the sources and understanding them. Which CRs can be turned into LF's without changing the meaning of the code? What is the invariant? I took 'READFILE' as a good invariant because that's what COMPARESOURCES compares. If you change a CR to a LF and it doesn't change the value returned by READFILE, it's OK. This will leave alone CRs that appear between string quotes, which seems like the most common exception in these files. |
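The invariant can be illustrated with a toy reader. This Python sketch is not the EOLHACK code and not Interlisp's READFILE: `toy_read` is a hypothetical stand-in that only knows separators, parentheses, `%` escapes and double-quoted strings. A CR is changed to LF only if the toy reader returns the same value afterwards.

```python
SEPRS = " \t\r\n"  # assumed separator set for the toy reader


def toy_read(text):
    """Hypothetical stand-in for READFILE: returns a flat token list."""
    tokens, i, n = [], 0, len(text)
    while i < n:
        c = text[i]
        if c in SEPRS:
            i += 1
        elif c in "()":
            tokens.append(c)
            i += 1
        elif c == '"':
            j, chars = i + 1, []
            while j < n and text[j] != '"':
                if text[j] == "%" and j + 1 < n:
                    j += 1  # % escapes the next character
                chars.append(text[j])
                j += 1
            tokens.append(("string", "".join(chars)))
            i = j + 1
        else:
            j, chars = i, []
            while j < n and text[j] not in SEPRS + '()"':
                if text[j] == "%" and j + 1 < n:
                    j += 1
                chars.append(text[j])
                j += 1
            tokens.append(("atom", "".join(chars)))
            i = j
    return tokens


def convert_safe_crs(text):
    """Turn a CR into LF only when the value read back does not change."""
    baseline = toy_read(text)
    chars = list(text)
    for k, c in enumerate(chars):
        if c == "\r":
            chars[k] = "\n"
            if toy_read("".join(chars)) != baseline:
                chars[k] = "\r"  # this CR is significant; leave it alone
    return "".join(chars)
```

Re-reading the whole text for every CR is quadratic, which is acceptable for a one-time conversion sketch but would want a single-pass reader on large files.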
With respect to strings, it ought to be sufficient just to scan for CR or LF preceded by %? Presumably all the others are just variant representations of white space (SEPRS).
(Except in a tiny little test it seems that LF in a string is escaped, but CR in a string is not. Go figure.)
(In reply to Larry Masinter, Apr 4, 2021, 12:21 PM.)
|
\PRINSTRING has a special test for escaping LF, the only character besides % and “ that it escapes with %.
Why should that be, and should it still do that if we change to LF files? (Or escape CR instead?)
(In reply to Ron Kaplan, Apr 4, 2021, 2:23 PM.)
|
Based on the discussion at our meeting today...
For Lisp source files, the simple way of doing the CR → LF conversion is just to TR. Characters in whitespace contexts are completely safe. Because LF's are escaped in strings, the string reader can recognize unescaped LF's and turn them back into the internal EOL (= CR). But LF and CR in atom names are not discriminated, nor are LF's that were originally intended to be read as CR's (e.g. a user call to PRINTCCODE). So it is not entirely safe to do a brute force translation, unless we know that there are no CR's on the file that are intended to remain as CR's.
Better to LOAD PROP the file with a CR EOLCONVENTION and MAKEFILE NEW with LF EOLCONVENTION. This can be done safely (I think) with a few code adjustments: The string and atom printers should trap the CR's and print them as CR's, not letting them go through to the EOLCONVENTION coercion. Same for (PRINTCCODE (CHARCODE EOL)) (which will now be semantically different than TERPRI). Thus, the file will have LF's everywhere, and CR's only where semantically required. The readers should not need to be modified.
This is about the READ/PRIN2 sequence, where presumably we want bidirectional equivalence. I'm not sure about PRIN1, which will print things that presumably aren't going to be read by READ, but might actually be read and parsed by the user's own code. I suppose that the best approximation would also be to do the same thing as PRIN2, preserve the CR's instead of letting them coerce to LF. If that doesn't format properly when viewed from the outside, then the user can fix their code (convert the LF's in their strings). But then we are breaking the intuitive connection between internal EOL and newline on the outside.
(Which raises the question: What if we changed the internal EOL to LF? How many places in the code do we see (CHARCODE EOL) or (CHARCODE CR)?)
In sum, it should be (pretty much) safe if we:
Legacy files that are not converted, one way or the other, should still load properly with an LF EOLCONVENTION, as long as they don't contain atoms with LF's. The low-level character-reader will convert those to EOL. |
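The point about escaped LF's in strings can be sketched as a round trip. This is a Python model with hypothetical names (`print_string`, `write_lf`, `read_string`), not the \PRINSTRING or string-reader code. It assumes what the thread reports: the printer escapes `%`, `"` and LF with `%`, and output coercion on an LF file turns the internal EOL (CR) into LF after escaping.

```python
CR, LF = "\r", "\n"


def print_string(s):
    """Escape %, double quote and LF with a leading %, then add quotes."""
    out = ['"']
    for ch in s:
        if ch in ("%", '"', LF):
            out.append("%")
        out.append(ch)
    out.append('"')
    return "".join(out)


def write_lf(s):
    """Model output to an LF-convention file: the internal EOL (CR) is
    coerced to LF at the stream level, after the printer has escaped."""
    return print_string(s).replace(CR, LF)


def read_string(text):
    """Read one printed string back from LF-convention text: an unescaped
    LF is an EOL and becomes CR; a %-escaped character stays literal."""
    assert text[0] == '"'
    out, i = [], 1
    while text[i] != '"':
        if text[i] == "%":
            i += 1
            out.append(text[i])
        elif text[i] == LF:
            out.append(CR)
        else:
            out.append(text[i])
        i += 1
    return "".join(out)
```

Under these assumptions a string holding both an EOL and a literal LF survives the round trip, because only the literal LF carries the `%` escape on the file. Atom names get no such protection in this model, which matches the caveat above.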
On Apr 5, 2021, at 4:08 PM, rmkaplan ***@***.***> wrote:
How many places in the code do we see (CHARCODE EOL) or (CHARCODE CR)?)
At least 76 files (including a couple of TEDIT files where it's probably documentation) have "(CHARCODE CR)" while 63 files have "(CHARCODE EOL)".
Additionally, 30 files have (CHARCODE LF).
|
Also -- what about files with EOLCONVENTION CRLF ? (which Lisp will happily generate) |
If a CRLF file is converted with tr, presumably all the CRLF’s should go to LF, just as if they were CR’s.
If you load a CRLF file from an EOLCONVENTION LF, well, the white space occurrences will be safe. The string occurrences that should be converted to EOL would show up in strings as CRLF and the LF gets dropped. Etc.
If the rest of this is good enough, that shouldn’t be a problem.
Seems like there are lots of (CHARCODE EOL), that’s the main one we would have to worry about if we were to make the change. Presumably the CR’s are intended to be CR’s and not necessarily the same as EOL. We would introduce a confusion between the current EOL’s and the current LF’s, if they were intended to be distinct.
Changing the internal EOL seems riskier than fixing a few of the read/print functions, and praying.
(In reply to Nick Briggs, Apr 5, 2021, 4:49 PM.)
|
If a CRLF file were converted with tr for CR=>LF I would presume the result would be a "LFLF" file being read with LF convention. I agree that in all these cases the whitespace isn't going to be a problem. I haven't looked at exactly what we print for the case of a string containing CRLF when we print it with each of the three EOLCONVENTIONs. |
I think at least it would have to be converted CRLF -> LF. EOLTYPE in COMPAREDIRECTORIES will say what the time is, if it’s unambiguous.
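The difference between a per-byte `tr` and a pair-aware conversion is easy to show. A minimal Python sketch with hypothetical function names, operating on raw bytes:

```python
def tr_cr_to_lf(data):
    """Per-byte translation, like tr '\\r' '\\n': every CR becomes LF."""
    return data.translate(bytes.maketrans(b"\r", b"\n"))


def normalize_to_lf(data):
    """Pair-aware conversion: CRLF collapses to one LF, then lone CRs."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

On a CRLF file the first function doubles every line ending, while the second yields one LF per line. Neither looks at context, so the caveats about strings and atom names still apply.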
(In reply to Nick Briggs, Apr 5, 2021, 5:23 PM.)
|
In the last comment, I meant "EOL type", not "time". |
I'm sorry I wasn't clear. My "EOLHACK" was meant to be loaded in and run ONCE, on all the Interlisp source files in the repository. Then LOGOUT(T). commit -a all of the files that will appear changed in Git. |
I think that a safe way of identifying files that can be cavalierly tr(anslated) to LF is something like the following:
Create a filedevice DSKLF by copying DSK with the change of making its EOLCONVENTION be LF instead of CR.
For each file F, use tr to produce a new version F.LF that has every CR converted to LF.
(SETQ FLF (PACKFILENAME 'HOST 'DSKLF 'EXTENSION 'LF 'VERSION NIL 'BODY F))
Then I think we should use a modification of Larry’s technique to make sure we are taking into account the fact that EOL conversion happens at the lowest character level.
(IF (EQUAL (READFILE F) (READFILE FLF))
    THEN ; the LF and CR versions are equivalent when one is read with a CR EOLCONVENTION and the other
         ; read with an LF EOLCONVENTION, so the brute-force translation is safe.
         ; Use RENAMEFILE to give the LF translation a bumped version number
         (RENAMEFILE FLF (PACKFILENAME 'VERSION NIL 'BODY F))
  ELSE ; the LF translation wasn’t safe, have to look at the file.
  )
(It may be desirable also to compare the F and F.LF with the CR (DSK) filedevice to confirm that it doesn’t matter how it is read: we want converted files to be properly readable in old sysouts, and we want to be safe while we are in the middle of this (reading these new LF files with CR, but being careful not to write them).)
If we get the same result when we read both the LF and CR versions with CR or LF EOLCONVENTIONS, then presumably the compiled files are OK too, we don’t have to recompile.
If we do this after making the reading code changes that I outlined, presumably there would be fewer files that fail the consistency tests and that need to be examined.
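The consistency test can be modeled at the character level. This Python sketch compares decoded character streams rather than READFILE values, which is a simplification; `decode` and `brute_force_is_safe` are hypothetical names. The original bytes are read under the CR convention and the translated bytes under the LF convention, and the translation counts as safe only if both yield the same internal characters.

```python
CR, LF = 13, 10


def decode(data, convention):
    """Map the convention's EOL byte ("CR" or "LF") to the internal EOL,
    which is CR; every other byte passes through unchanged."""
    eol_byte = {"CR": CR, "LF": LF}[convention]
    return [CR if b == eol_byte else b for b in data]


def brute_force_is_safe(original):
    """True when tr-style CR -> LF translation is invisible to a reader
    that uses the LF convention on the translated file."""
    translated = original.replace(b"\r", b"\n")
    return decode(original, "CR") == decode(translated, "LF")
```

In this model a CR inside a string is harmless, since it is the internal EOL on both sides. What fails the test is a file that already contained an LF, because that LF reads as 10 before translation and as 13 after it.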
… On Apr 7, 2021, at 10:38 AM, Larry Masinter ***@***.***> wrote:
I'm sorry I wasn't clear. My "EOLHACK" was meant to be loaded in and run ONCE, on all the Interlisp source files in the repository.
It wasn't meant as a permanent patch to READFILE or anything else.
Then LOGOUT(T). commit -a all of the files that will appear changed in Git.
I looked at the results and I think there might be some additional sepr CR.
|
i did the equivalent of that by just creating another directory parallel to the medley directory. A lot of the files differed because the (* ;;; "Copyright ..." ) strings have an embedded CR in them. So maybe not flag those as important? With Git backing up everything, there's no need to increase the version number? |
I ran across a function that is reading text as bytes and is therefore not synchronized with the EOL LF conventions or potential changes to external character encodings (like Unicode). This is the function FASL:READ-TEXT, used in the reading of DFASL files. |
Continuing my last comment... This shows up immediately in the fact that the LF's that are now printed for EOL's in the header to DFASL files are copied to the screen as LFs, not as EOL's. An obvious fix would be to change it so that it does READCCODE instead of BIN. But I presume that this is used to read characters in the compiled code as well, which raises a caution flag. But if those are printed with \PRINDATUM, as I believe they are, then the reading should mirror that. Is there anything that should be done before proceeding with this change? |
I'm confused; how can you print an EOL to the header of a DFASL file when that is "copied to the screen"? |
The symptom is the few header lines that you see when you load a DFASL that was created with the LF EOL convention. The string on the file has LF’s as EOL’s, but it is read from the file as bytes, with internal linefeeds, then oddly printed to the terminal.
But the code that does that (print and read), I suspect, must also be used for strings internal to the file’s content (although I really have no idea about that).
If a filename contained XCCS characters that would have been written in the XCCS runcoding format, they will not be read correctly. If they are actually written as bytes, then presumably something would be truncated somewhere.
… On Jun 10, 2021, at 2:25 PM, Larry Masinter ***@***.***> wrote:
I'm confused; how can you print an EOL to the header of a DFASL file when that is "copied to the screen".
This is for what? The end-of-line inside a quoted string?
|
could you give just a bit of context? like what file did you make in what context, and what happened when you loaded it? Under what circumstance would you find a file NAME that includes (non-ASCII) XCCS characters? |
(DEFINEQ (FOO NIL (CONS A B)))
(SETQ FOOCOMS '((FNS FOO)))
(MAKEFILE 'FOO)
(CL:COMPILE-FILE 'FOO)
(LOAD 'FOO.DFASL)
… On Jun 10, 2021, at 2:54 PM, Larry Masinter ***@***.***> wrote:
could you give just a bit of context? like what file did you make in what context, and what happened when you loaded it?
Under what circumstance would you find a file NAME that includes (non-ASCII) XCCS characters?
I think we don't have support for Unicode filenames so I'm not sure how that's relevant to the EOL issue.
|
my inclination is to change the display device to treat CR, LF, and CRLF as equivalent. |
I don't think that would be a good idea -- I'm pretty sure there's code that formats output to the "terminal" using LF to advance to the next line in the same column and CR to reposition to column 1 without advancing to the next line.
(In reply to Larry Masinter, Jun 10, 2021, 4:00 PM.)
|
I also don’t think this would be a good idea. CRLF would be a little dodgy too. And these strings may or may not be read in different contexts, and not just copied to the display.
I put in an immediate fix, which is to do the EOL conversion as FASL::READ-TEXT is reading the string. It should actually be using READCCODE instead of BIN, but the string is actually not printed as it should be, with surrounding quotes (although the printer will do the right thing with non-ascii characters according to the external format). Instead there is a built-in assumption that this is a byte-sequence ending in byte 255, which is the XCCS character-select code, and that confuses things.
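The immediate fix can be pictured as follows. This is a Python model, not the FASL::READ-TEXT source: `read_text` here is a hypothetical function that reads bytes up to the 255 end marker mentioned above and maps LF to the internal EOL (CR) on the way. It does no XCCS or external-format decoding, which is the limitation the comment points out.

```python
import io

CR, LF = 13, 10
END_MARK = 255  # the byte the thread says terminates the text


def read_text(stream):
    """Read bytes up to (and consuming) the end marker, converting LF to
    the internal EOL. Bytes are treated as characters one-for-one."""
    chars = []
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("text ended without the 255 end marker")
        b = chunk[0]
        if b == END_MARK:
            break
        if b == LF:
            b = CR  # EOL conversion done while reading
        chars.append(chr(b))
    return "".join(chars)


header = io.BytesIO(b"compiled on\nsome date\xffREST")
text = read_text(header)
remainder = header.read()  # bytes after the marker are left for the caller
```

Because 255 doubles as the XCCS character-set-select code, a byte-level terminator like this cannot coexist with run-coded text, which is why reading by character code would need the writer to change as well.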
(In reply to Nick Briggs, Jun 10, 2021, 5:04 PM.)
|
The same problem appears in FASL:SKIP-TEXT, and of course there's a lot of stuff in the FASDUMP which uses BOUT and BOUT16. I'm not sure that it's actually writing strings and not dumping a counted set of bytes/words.
(In reply to rmkaplan, Jun 10, 2021, 6:39 PM.)
|
Yes, with that endmarker again. At least there it doesn’t matter whether they are read as bytes or chars, since they are being thrown away. The READ-TEXT is explicitly converting the bytes to characters.
Other occurrences of bytes and words are OK, it’s the character interpretation (in or out) that we have to watch out for.
(In reply to Nick Briggs, Jun 10, 2021, 6:51 PM.)
|
should we delete the EOLCONV scripts ? |
Hang on in case more source code pops up from someone. Disk is cheap.
(In reply to Larry Masinter, Fri, Jul 2, 2021, 4:52 PM.)
|
Medley source files can be effectively converted by LOAD-PROP and MAKEFILE-NEW, so it isn't strictly necessary. But is there any harm in keeping it? |
I think people should de-install the script in current repos since the tr isn't needed anymore and would possibly hinder us in future (e.g. it was incompatible with lfs for github large files); my note was cryptic. |
There might be some issues left which should be opened as new issues |
@rmkaplan reported:
Windows may still be the outlier, Mac OSX and Unix/Linux are LF.
It’s really a question of what the default EOL convention should be when output streams are created. It shouldn’t matter for input streams, at least for the operations that read characters and not bytes. The character reading functions should all recognize any of the EOL conventions on files and map them into the internal CR (the value of (CHARCODE EOL)). (I remember setting it up that way; silly to have to know where or how a file was created before you could read it.)
I thought that Unicode might have something to say about this, but they aren’t very helpful. They point out (in the Unicode 3.0 book that I have) that the CR/LF/CRLF conventions are confused…and then they add to the confusion. They define a new code U+2028 as the unambiguous “line separator” (also an unambiguous “paragraph separator" U+2029).
My Xerox XCCS book doesn’t say anything about this, so I’m not sure what the representation is in XCCS-compliant files (which would have run-codes for mixed character-set strings, but control characters are unique in any run).
My temptation is to change the default, so that we are more compatible with Unix/Mac files. We were never compatible with Windows/CRLF. If not for all files, then at least for UTF8 and UTF16 files.
In prowling around, I have also discovered that some of the low-level files got corrupted by the Japanese. A substantial part of the LLREAD file, for example, is filled with conversion tables for various Japanese coding systems, and this stuff is mixed in in a number of other places. Should have been in separate and later files—hard to imagine that these would be needed in the INIT.SYSOUT.