Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unlock shebang++ #560

Merged
merged 8 commits into from
Apr 22, 2024
Merged

Unlock shebang++ #560

merged 8 commits into from
Apr 22, 2024

Conversation

sourishkrout
Copy link
Member

Multiple things:

  1. Allow cells to be store in ENV as long as their name conforms to an opinionated layout (we can give visual feedback in the notebook UI).
  2. If a language is not executable and no interpreter/program is set, run cat to stash the input cell in the output.

This will enable use cases where e.g. you have a SQL query you're live-editing in one cell and running it with $ bq query (BigQuery CLI) in another cell. Enables better interplay between cells than just running them back-to-back ($__).

@adambabik wondering if we should deny overwriting variables already contained in the ENV? However, any export statement will let you do that 🤷. Also, I don't want to overload the Runner with too much "cell/block"-level terminology so I introduced known_id & known_name which are as below. The idea being that if a name or id is known, clients can specify them here. Otherwise they are "unknown" since they are not required to run programs. Open to suggestion to model this differently.

` ` `sql {"id":"01HT37XG04CWS4CQS2WS7P1MX2","name":"experiments_sql"}

PS: Also renamed StoreLastOutput to StoreEnvVars only in runnerv2 as a cleanup.

@sourishkrout sourishkrout requested a review from adambabik April 19, 2024 16:23
@@ -52,6 +52,14 @@ func pathNormalizer(cfg *Config) (*Config, func() error, error) {
}
}

// default to "cat"
cat, err := exec.LookPath("cat")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adambabik should this be using the new sys abstraction?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can leave it as is. I need to port the system pkg usage to beta commands and runnerv2.

@sourishkrout
Copy link
Member Author

Also just realized that runnerv2's command packages does not seem to have any TEMP_FILE test cases, no @adambabik?

bool store_last_stdout = 23;
// store_env_vars, if true, will store the stdout under well known name
// and the last ran block in the environment variable `__`.
bool store_env_vars = 23;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool store_env_vars = 23;
bool store_stdout_in_env = 23;

@@ -52,6 +52,14 @@ func pathNormalizer(cfg *Config) (*Config, func() error, error) {
}
}

// default to "cat"
cat, err := exec.LookPath("cat")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can leave it as is. I need to port the system pkg usage to beta commands and runnerv2.

@@ -52,6 +52,14 @@ func pathNormalizer(cfg *Config) (*Config, func() error, error) {
}
}

// default to "cat"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// default to "cat"
// Default to "cat" when no program path is found.
// The idea is to return the body of the cell as its output
// so that it can be used as input in other cells.

if knownName != "" && runnerConformsOpinionatedEnvVarNaming(knownName) {
err = sess.SetEnv(knownName, string(stdoutMem))
if err != nil {
logger.Sugar().Errorf("%v", err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
logger.Sugar().Errorf("%v", err)
logger.Warn("failed to set env", zap.Error(err))

In order to stay consistent.

@@ -609,6 +618,12 @@ func runnerWinsizeToPty(winsize *runnerv1.Winsize) *pty.Winsize {
}
}

func runnerConformsOpinionatedEnvVarNaming(knownName string) bool {
// only allow uppercase letters, digits and underscores, min three chars
re := regexp.MustCompile(`^[A-Z_][A-Z0-9_]{1}[A-Z0-9_]*[A-Z][A-Z0-9_]*$`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: you can move this line outside of the func body to compile it only once.

session *command.Session
storeLastStdout bool
session *command.Session
storeEnvVars bool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
storeEnvVars bool
storeStdoutInEnv bool

@@ -59,7 +61,7 @@ func newExecution(
id string,
cfg *command.Config,
session *command.Session,
storeLastStdout bool,
storeEnvVars bool,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
storeEnvVars bool,
storeStdoutInEnv bool,

@@ -277,7 +283,7 @@ func (e *execution) closeIO() {
}

func (e *execution) storeLastOutput(r io.Reader) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (e *execution) storeLastOutput(r io.Reader) {
func (e *execution) storeOutputInEnv(r io.Reader) {

@@ -177,6 +177,12 @@ message ExecuteRequest {

// file extension associated with script
string file_extension = 26;

// optional well known id for cell/block
string known_id = 27;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that it is not used. We can keep it for the completeness reason.

The most important for me is to provide more detailed description. What does "known" mean? What guarantees does it give? Maybe given an example how client would use this.

Finally, if you take a look at config/v1alpha1/config.proto, we have a way to validate field values in the proto definition. This is a bit tricky and not used anywhere else, so it might be a better idea for a separate PR.

Copy link
Member Author

@sourishkrout sourishkrout Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add an example. The reason why I added id (unused) was because I have near-term projects in mind where I need it. I'll be sure to make the description better. Going to skip validation in this PR for now.

@adambabik
Copy link
Collaborator

Also just realized that runnerv2's command packages does not seem to have any TEMP_FILE test cases, no @adambabik?

This is implemented in internal/command which is used in runnerv2. Check out command_args_normalizer.go which handles runnerv2alpha1.CommandMode_COMMAND_MODE_FILE. It is also tested (I checked code coverage) but implicitly.

@adambabik
Copy link
Collaborator

Also, I don't want to overload the Runner with too much "cell/block"-level terminology so I introduced known_id & known_name which are as below. The idea being that if a name or id is known, clients can specify them here. Otherwise they are "unknown" since they are not required to run programs. Open to suggestion to model this differently.

I think it's ok until we figure out a better approach. I would only ask for a more detailed description in the proto files. I added a comment in the source code.

Copy link

sonarcloud bot commented Apr 22, 2024

@sourishkrout sourishkrout merged commit be6e4ce into main Apr 22, 2024
6 checks passed
@sourishkrout sourishkrout deleted the seb-sbpp branch April 22, 2024 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants