Skip to content
This repository has been archived by the owner on Oct 25, 2023. It is now read-only.

Add endpoint that receives a webdriver script for batch command execution #328

Merged
merged 11 commits into from
Jun 25, 2019

Conversation

jlipps
Copy link
Member

@jlipps jlipps commented Jun 4, 2019

The batch command proposal has been under discussion for some time. To summarize, the original proposal was for a new API that would essentially allow encoding of any Appium command in JSON format, including the use of magic values like {{<element}} to refer to an element found in a previous command. This original proposal had several drawbacks:

  1. Quite a lot of code would need to be written on both client and server to support the feature. Most of the code would just be encoding/decoding commands from a JSON format.
  2. This method interacts poorly with proxying, which is quite common in Appium tests. Say that a batched command set came in involving a context switch, such that subsequent commands in the batch are supposed to target chromedriver. The problem is that the batch command itself would be written on the driver, which can only awkwardly proxy it on.
  3. Use of backreferences is just one example of all kinds of logic that users might wish to encode in batch commands, and be unable to. How would we handle conditionals, loops, and the like? We'd need to draw a line somewhere or invent a Turing complete mini language to support all the use cases!

This PR takes a much more radical approach. Basically, it asks that the user send in not a formatted list of commands, but rather a script (written in a supported library, in this case webdriverio). This script is run in a NodeJS VM with access to a driver object representing the current session. This proposal has a number of advantages:

  1. Access to a full webdriver client
  2. Access to a full programming language (JS) for use of logic
  3. Very little code needing to be written on the client side (the Appium Java client would just assume someone will write a webdriverio script as a string. Or, the Java client could go crazy and implement a code transformation feature that outputs valid webdriverio code. Either way).
  4. Very little code needing to be written on the server side. I think this PR is all that's necessary!

The main disadvantages of this new proposal are:

  1. Theoretical security concerns running untrusted code. However, the code is sandboxed in a VM, and has no access to e.g. the require function. I'm having trouble thinking how a malicious user could abuse this feature, but it's possible I suppose.
  2. We bring in a webdriver client (webdriverio) as a production dep. Thankfully it's not too big. In the future, this feature and its deps could be developed as a plugin/extension rather than part of Appium core.

I think this is pretty sweet, what do you all think?

@xcolwell
Copy link

xcolwell commented Jun 5, 2019

This is very nice direction for remote device testing and distributed device testing. Having a way to optimize performance critical sections like "find and act" while running a full script on the client has not been possible with Appium.

A couple questions:

  • how are remote exceptions returned to the client?
  • how are stdout/stderr returned to the client?
  • can the script return a json to the client?
  • is there a watchdog on the uploaded script for example max cpu time per script, max elapsed time per script?
  • do you see a standard set of available modules in the node vm? In other words is there a standard "header" file that gets appended to each batch script?
  • probably this would be enabled with a special flag? I think the semantics of --relaxed-security make sense here, but it would be most useful for us to enable just this batch feature without the other features guarded by this flag.

Also I think it would be useful to pass the script language with the script, so that cloud providers can choose to optionally support transpilation of other languages to javascript.

@KazuCocoa
Copy link
Member

a note after reading https://github.com/appium/appium-base-driver/pull/328/files#diff-aa816d90f6389bb3ff7a3fc331643232R287 ...

an idea of how to pass valid JS commands to the server. (To reduce mistake/work load in client side)

like Selenium, we'll prepare some JS commands as atom commands. Each client gets, reads and sends it to server like https://github.com/SeleniumHQ/selenium/blob/2abd80f236d1a7459ef638e96af8c4efd86b4abd/rb/lib/selenium/webdriver/atoms.rb#L30 (we can fix the JS file to fix some issues instead o updating all clients. This is a behefit of shared code.)

If sending command is complicated, this method does not work. Users should write JS (or will we implement each languages->JS converter?).

@jlipps
Copy link
Member Author

jlipps commented Jun 6, 2019

@KazuCocoa I'm not totally sure I'm following you. Are you saying that it's a bad idea to give clients the ability to write full-on javascript? And instead we implement client methods that simply generate correct javascript?

I think that's a good idea, but I don't see them as mutually exclusive. You could either write something like this:

await driver.executeDriverScript(`
    const el = await driver.$("~foo");
    await el.click();
`);

or, you could write this (if the client supports it, let's say we're using python:

with driver.batch() as batch:
    el = batch.find_element_by_accessibility_id("foo")
    el.click()

And then the python client would, as part of interpreting the context manager there, turn that into valid webdriverio code underneath the hood, and call driver.execute_script on its own?

Is that what you're proposing? If so I agree that'd be awesome, and then it's just up to the clients to implement this "safe" way of using executeScript. In practice though, I'm not sure how easy it would be to do this translation. What about this script?

with driver.batch() as batch:
    els = batch.find_elements_by_accessibility_id("foo")
    for el in els:
        el.click()

How do you translate the for loop into JS? You'd need to do the translation based on a Python AST or something, which is getting kinda crazy at this point.

@jlipps
Copy link
Member Author

jlipps commented Jun 6, 2019

thanks for the questions, @xcolwell!

how are remote exceptions returned to the client?

Same as with any other error

how are stdout/stderr returned to the client?

Right now they are not (and running inside a pure JS VM, there's no such thing as stdout/stderr per se), but we could make some log object available to the script, which could then be used to put the output in the appium log as well as return it as part of the return value of the function.

can the script return a json to the client?

Yes, the script can return any simple javascript object at any level of complexity, and JSON is the transport method.

is there a watchdog on the uploaded script for example max cpu time per script, max elapsed time per script?

No, I'm not sure if it's possible to attach a watchdog to a VM, but this could presumably be done of Appium as a whole. I'm assuming actions in the VM get counted on a system resource level are counted as happening within the appium process

do you see a standard set of available modules in the node vm? In other words is there a standard "header" file that gets appended to each batch script?

There could be! We'd just need to decide what we want to make available to the user. In this PR it's just the driver object. But we mentioned above maybe a log object also makes sense. What else makes sense? Within the constraints of security.

probably this would be enabled with a special flag? I think the semantics of --relaxed-security make sense here, but it would be most useful for us to enable just this batch feature without the other features guarded by this flag.

I agree.

@jlipps
Copy link
Member Author

jlipps commented Jun 10, 2019

@imurchie any objections to this approach?

@jlipps
Copy link
Member Author

jlipps commented Jun 19, 2019

ok @mykola-mokhnach isFeatureEnabled has been updated, and unit tests added. i think this code is now all good to go. i'll be happy to find uses of relaxedSecurityEnabled in the codebase and update to use this new method.

@jlipps
Copy link
Member Author

jlipps commented Jun 20, 2019

Alright, I've made all the downstream changes so once we merge all of these, the whole codebase should be using the new security model.

Copy link
Contributor

@mykola-mokhnach mykola-mokhnach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This features needs a separate documentation page

Copy link
Member

@KazuCocoa KazuCocoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 promising

@jlipps
Copy link
Member Author

jlipps commented Jun 20, 2019

will work on docs in another PR in the main repo

@jlipps
Copy link
Member Author

jlipps commented Jun 21, 2019

OK, docs added in appium/appium#12807

const W3C_ELEMENT_KEY = 'element-6066-11e4-a52e-4f735466cecf';
const MJSONWP_ELEMENT_KEY = 'ELEMENT';

async function runScript (driverOpts, script, timeout) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should timeout have a default value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does, in the important place where the request from the user is handled (in execute-child.js) but i'll add a check to make sure it's included

@jlipps
Copy link
Member Author

jlipps commented Jun 21, 2019

Also just realized that to support this feature, /execute_driver has to be added to NO_PROXY for specific drivers.

@jlipps
Copy link
Member Author

jlipps commented Jun 21, 2019

I've run some functional tests with @KazuCocoa 's python client changes, and they work (though I ran into a bug with webdriverio sendkeys, seems like these lines are causing the problem--cc @christian-bromann )

@christian-bromann
Copy link
Member

though I ran into a bug with webdriverio sendkeys

I will look into it.

@jlipps
Copy link
Member Author

jlipps commented Jun 24, 2019

will wait a bit to see what @christian-bromann says before merging, in case a code change is required on this end. otherwise i think this is good to go, and will merge/publish all the dependent PRs once this one is merged.

@jlipps
Copy link
Member Author

jlipps commented Jun 25, 2019

ok, with the new version of webdriverio, i've confirmed this works, so will merge now!

@jlipps jlipps merged commit 23d6ecf into master Jun 25, 2019
@jlipps jlipps deleted the batch-commands branch June 25, 2019 19:12
@jlipps
Copy link
Member Author

jlipps commented Jun 25, 2019

published in 3.17.0

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants