This repository has been archived by the owner on Feb 22, 2022. It is now read-only.

Modify heap variable and extension syntax for better backward compatibility #40

Open
DaveBenham opened this issue Jan 5, 2020 · 14 comments
Labels
cancelled The idea of the issue was cancelled enhancement New feature or request open to ideas

Comments

@DaveBenham
Collaborator

This issue was split out from #39 to make it easier to track.

EB uses $-prefixed names for heap variables, and @-prefixed names for extensions. I've seen many scripts that already use variable names that begin with those characters. Perhaps switching from environment variable to heap variable might not break existing code, but the inability to access variables that begin with @ will definitely break existing code. At DosTips we developed a convention of naming batch macros with an @ prefix. For example, EB breaks my batch port of Colossal Cave Adventure.

In addition to the prefix problem, use of ; will break any code that happens to use ; in an environment variable name. I don't think this is a common practice, but I would still like to ensure that EB never breaks variable expansion for existing code.

I believe all the issues can be solved by using =, since batch does not have the ability to define a variable with = in the name. There are the undocumented predefined variables that begin with =, but they are few and far between.

My proposed new syntax is as follows:

  • Heap variable names would be prefixed by =$
  • Settable/gettable extension names would be prefixed by =@
  • Callable extensions would continue to be prefixed simply by @
  • Filters would be introduced by = after the variable name instead of ;. The ; could still be used to chain multiple filters together.

So the following code using old syntax !$var;lower;10! would now be written as !=$var=lower;10! instead.
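
As a rough illustration of the proposed syntax, here is a minimal Python sketch (not EB's actual parser, which is not shown in this thread) that splits the text between the ! or % delimiters into a kind, a name, and a filter list. The function name parse_reference and the kind labels are hypothetical, and passing unprefixed names through untouched for cmd's own rules is an assumption.

```python
def parse_reference(text):
    """Classify the text found between the expansion delimiters.

    Returns (kind, name, filters). Hypothetical helper, for illustration only.
    """
    if text.startswith("=$"):
        kind, rest = "heap", text[2:]
    elif text.startswith("=@"):
        kind, rest = "extension", text[2:]
    else:
        # Assumption: anything without the new prefixes is left to cmd.exe,
        # so existing names such as @macro or $var keep their native meaning.
        return ("environment", text, [])

    # The first "=" after the name introduces filters; ";" chains them.
    name, sep, filter_text = rest.partition("=")
    filters = filter_text.split(";") if sep else []
    return (kind, name, filters)


print(parse_reference("=$var=lower;10"))  # ('heap', 'var', ['lower', '10'])
```

Because a native variable name cannot contain =, the prefixed forms in this sketch can never collide with a name that an existing script already defines.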

I know the syntax is not nearly as elegant, but I think backward compatibility is more important than syntax elegance. I'm confident that the expansion code could be modified quite easily to satisfy my proposed syntax. But I am concerned about the code to SET the values. Your patch point would undoubtedly change. Native cmd.exe raises a syntax error when it sees an attempt to define a variable that begins with =. I'm hoping you can intercept that error and conditionally handle the SET if the name begins with =@ or =$.

If this change can be made, then it would be nearly impossible for EB to break existing code related to variable expansion. The only exception is that SUBST allows definition of logical drives named @: and $:. There are the undocumented =@: and =$: variables that would hold the current directory associated with those volumes, and EB would block access to those values. I am not concerned about that potential because:

  • The documentation claims the drive letter must be a letter - very few people know you can use a symbol instead.
  • Very few people know about the =DriveLetter: variables - they are rarely used.
@DaveBenham
Collaborator Author

DaveBenham commented Jan 5, 2020

This is actually the original response from @adoxa back when this was part of #39.

Heap variable names would be prefixed by =$

I think equals itself would qualify as the prefix, so no real need for $, too. Then again, the extra distinction between name and value is useful. In any event, using = makes creating array variables extremely awkward: set =$array=$index=value. Curly expansion makes it look better - set =$array{=$index}=value - but the underlying problem remains. Using another two-character prefix would be far easier, maybe ~$ and ~@?

Filters would be introduced by = after the variable name instead of ;.

I chose ; to match :, thinking it would be rare enough in variable names to get away with it. I could consider =, but it really doesn't look right.

I know the syntax is not nearly as elegant, but I think backward compatibility is more important than syntax elegance.

Backward compatibility is a double-edged sword and is partly the reason cmd.exe is in the mess it's in. I think anyone prepared to use EB should be prepared to modify their existing scripts to work with it.

@carlos-montiers
Owner

carlos-montiers commented Jan 14, 2020

I think the @ prefix for extensions should not be changed: first, because I think it is easier to remember, and second, because it relies on the fact that in cmd you cannot use call @ https://www.dostips.com/forum/viewtopic.php?f=3&t=9230&p=60092#p60092
Thus using it for a callable extension does not break existing code, because you cannot use it that way in normal cmd.
Using = as the prefix instead of $ for heap variables: hmm, maybe, but I prefer the $. The only thing that might cause a small issue (not tested) is the FOR feature that searches directories, %~$PATH:I.
For simplicity I think it is better to use only one convention: not =@ together with @, only @.
I agree with @adoxa:

anyone prepared to use EB should be prepared to modify their existing scripts to work with it.

@carlos-montiers
Owner

@adoxa , @DaveBenham
I have thought about this for a long time and decided that it is not good to use the = character in variable names, because it is used as an operator and will lead to confusion. And using any character that is valid in a variable name as a prefix can break any code that has a variable beginning with that character.

In this sense, the use of $ and @ as variable prefixes must be considered a change that can break code using variables with those prefixes.
In that case, that code needs to be modified.

but inability to access variables that begin with @ will definitely break existing code.

EB does not prevent access to variables prefixed with @. It only looks the name up in the list of extensions first:

>set @hello=Hi
>echo %@hello%
Hi
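
The lookup order described here can be sketched in Python as follows. This is a toy model, not EB's code: the expand function, the @DEMO extension, and the upper-casing of names (to mimic cmd's case-insensitive variable names) are all assumptions made for illustration.

```python
def expand(name, extensions, environment):
    """Resolve a name: extensions are tried first, then ordinary variables."""
    key = name.upper()  # cmd variable names are case-insensitive
    if key in extensions:
        return extensions[key]()  # extension values are computed on demand
    return environment.get(key)   # None when the variable is undefined


# Hypothetical extension table; @DEMO is not a real EB extension.
extensions = {"@DEMO": lambda: "from extension"}
environment = {"@HELLO": "Hi"}

print(expand("@hello", extensions, environment))  # Hi
print(expand("@demo", extensions, environment))   # from extension
```

Under this model an existing @-prefixed variable only becomes unreachable when its name happens to match an extension name.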

In summary: EB introduces small breaking changes in the use of the $ and @ prefixes, and it also introduces other small breaking changes, such as the += operator in the set command.

One option for reducing the impact of heap variables (always prefixed with $) is an idea from @adoxa: use a new extension, in this case @SET, which would be the only way to set heap variables.
Thus:

Set $a=b // environment
Call @Set $a=b // heap

This would preserve old code that uses $ as a variable prefix.

carlos-montiers added the cancelled (The idea of the issue was cancelled) and open to ideas labels on Apr 13, 2020
@carlos-montiers
Owner

Hello @DaveBenham, @adoxa. I'm thinking of removing the '$' prefix. Also, heap variables should not be listed in the output of the classic 'set' command. A new command, 'XSET', should be created. It would be analogous to set in that it lists variables: in this case the heap variables, and it could also list the environment variables. In the output a letter would identify the storage location: H means heap and E means environment.
'XSET' would also allow creating heap variables and, in the future, the variable types mentioned in issue #52.

@adoxa
Collaborator

adoxa commented May 22, 2020

I would rather go the other way: SET always uses heap variables by default, with an option/command to use the environment. Adventure.bat is much quicker with EB without even using anything EB-specific (such as the length modifier). I haven't actually tested, but I think it's because of moving all its $ variables out of the environment and into the heap. Moving all the variables would probably have even more benefit. SET has to list heap variables in order for Adventure.bat to work.

ATM, for existing scripts to make use of heap variables requires changing variables, adding the $ prefix; your proposal requires changing the set command (it wouldn't be xset, it would be call @set); the above requires only changing variables that must be in the environment, which would be far fewer, if any.

@carlos-montiers
Copy link
Owner

@adoxa I think having all variables use the heap is really good (removing the $ prefix). Will all variables in the heap support the local scope provided by setlocal?

@adoxa
Collaborator

adoxa commented May 22, 2020

Ah, good point, I'll have to get setlocal working with heap variables before putting all variables in the heap.

@carlos-montiers
Owner

When EB is loaded, should it copy everything in the environment to the heap?
I do not know how cmd implements the local contexts.
Maybe it keeps variables per context, so that reading a variable looks in the current context first and otherwise searches the previous context?

@adoxa
Collaborator

adoxa commented May 22, 2020

No, existing environment variables will remain. Heap will override, though, as we've discussed...somewhere.
SetLocal creates a copy of the environment, EndLocal replaces the environment with the copy.
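
A toy Python model of the two statements above: the heap overrides the environment on lookup, and SetLocal/EndLocal snapshot and restore only the environment. The Session class and its method names are hypothetical, and leaving the heap outside the snapshot is an assumption drawn from the earlier remark that setlocal does not yet work with heap variables.

```python
class Session:
    """Toy model: heap values shadow environment values on lookup."""

    def __init__(self, environment):
        self.environment = dict(environment)
        self.heap = {}
        self._saved = []  # stack of environment snapshots

    def get(self, name):
        if name in self.heap:  # heap overrides the environment
            return self.heap[name]
        return self.environment.get(name)

    def setlocal(self):
        # SetLocal creates a copy of the environment.
        self._saved.append(dict(self.environment))

    def endlocal(self):
        # EndLocal replaces the environment with the saved copy.
        if self._saved:
            self.environment = self._saved.pop()


session = Session({"A": "1"})
session.setlocal()
session.environment["A"] = "2"
session.heap["$x"] = "h"
session.endlocal()
print(session.get("A"))   # 1  (environment change rolled back)
print(session.get("$x"))  # h  (heap value is not rolled back in this model)
```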

@carlos-montiers
Owner

Mmm, but if we set a variable in the heap that also exists in the environment block, and afterwards delete it, it will need to be deleted from both the heap and the environment. Won't always checking the environment variables be an overhead?

@adoxa
Collaborator

adoxa commented May 23, 2020

Well, that still happens now, not that you'd notice, since variables don't typically start with $ or @. Maybe it's not such a good idea, after all.

@carlos-montiers
Owner

How hard would it be to copy the whole environment block to the heap when EB is loaded and, after that, create a heap context with each setlocal?

@adoxa
Collaborator

adoxa commented May 24, 2020

Not sure what you're asking, there. If EB uses heap for all variables it would still try the environment if it doesn't exist, so there's no need for a copy. But that raises the problem of deleting a variable: does it delete the environment, too, or not? Is that where your copy comes in? We copy the initial environment to the heap (not a problem) then solely use the heap, so deleting a variable stays deleted, but an environment variable will be restored when EB exits (since it was never actually deleted). That might be a sensible approach. Local would be done via separate contexts, so that would be trivial.
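
The approach outlined in this comment (copy the initial environment to the heap, then use only the heap, with one context per setlocal) might look like the following Python sketch. HeapStore and its methods are hypothetical names, and copying the whole current context on setlocal is just one simple way to realise "separate contexts".

```python
class HeapStore:
    """Toy model: all variables live in heap contexts after an initial copy."""

    def __init__(self, environment):
        self.original = environment            # real environment, never modified
        self.contexts = [dict(environment)]    # context 0 is the initial copy

    def get(self, name):
        return self.contexts[-1].get(name)

    def set(self, name, value):
        self.contexts[-1][name] = value

    def delete(self, name):
        # A deleted variable stays deleted: there is no fallback lookup.
        self.contexts[-1].pop(name, None)

    def setlocal(self):
        self.contexts.append(dict(self.contexts[-1]))

    def endlocal(self):
        if len(self.contexts) > 1:
            self.contexts.pop()


env = {"PATH": "C:\\bin"}
store = HeapStore(env)
store.delete("PATH")
print(store.get("PATH"))  # None
print(env["PATH"])        # C:\bin  (still present once EB is gone)
```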

@carlos-montiers
Owner

I think if we implement typed variables with an address property, then when deleting a variable we set its address to null; on unload we iterate over all the variables with a null address and remove them from the environment.
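
One possible reading of this idea, sketched in Python: a deleted variable keeps its record but has its address cleared, and unloading removes every such variable from the real environment. The Variable record, the integer addresses, and the delete/unload helpers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Variable:
    value: str
    address: Optional[int]  # None marks a deleted variable


def delete(variables: Dict[str, Variable], name: str) -> None:
    """Mark a variable as deleted without dropping its record."""
    if name in variables:
        variables[name].address = None


def unload(variables: Dict[str, Variable], environment: Dict[str, str]) -> None:
    """On unload, remove every deleted variable from the environment."""
    for name, var in variables.items():
        if var.address is None:
            environment.pop(name, None)


variables = {"A": Variable("1", 100), "B": Variable("2", 200)}
environment = {"A": "1", "B": "2"}
delete(variables, "A")
unload(variables, environment)
print(environment)  # {'B': '2'}
```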


3 participants