Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Add util class for parsing ExtensionAction bytes #607

Closed
dbwiddis opened this issue Mar 25, 2023 · 12 comments
Closed

[FEATURE] Add util class for parsing ExtensionAction bytes #607

dbwiddis opened this issue Mar 25, 2023 · 12 comments
Assignees
Labels
CCI Part of the College Contributor Initiative enhancement New feature or request good first issue Good for newcomers more-challenging Good issues for new contributors that present a small challenge

Comments

@dbwiddis
Copy link
Member

dbwiddis commented Mar 25, 2023

Is your feature request related to a problem?

Don't Repeat Yourself

While reviewing JS #349 I noticed a lot of repetition of some of the same code as in #588.

This is all general code for the common pattern we've chosen of appending a request class name as a string to its parameters in bytes.

What solution would you like?

Copy these repeated features to a class in the same package as the ExtensionAction / ExtensionTransportAction / ExtensionActionRequest and related classes. They might fit well in an existing class (ExtensionAction possibly) or in a new class (ExtensionActionUtil). These include:

  • the UNIT_SEPARATOR byte constant
  • a method for finding the position of that separator in the byte array
  • methods for splitting the class string and param bytes (and the opposite for gluing them together)
  • perhaps even a method to instantiate the class with the params

Basically, all the functions needed to do this:

byte[] requestClassBytes = request.getRequestClass().getBytes(StandardCharsets.UTF_8);
byte[] proxyRequestBytes = ByteBuffer.allocate(requestClassBytes.length + 1 + request.getRequestBytes().length)
.put(requestClassBytes)
.put(RemoteExtensionActionRequest.UNIT_SEPARATOR)
.put(request.getRequestBytes())
.array();

and this:

int nullPos = indexOf(request.getRequestBytes(), RemoteExtensionActionRequest.UNIT_SEPARATOR);
String requestClassName = new String(Arrays.copyOfRange(request.getRequestBytes(), 0, nullPos + 1), StandardCharsets.UTF_8)
.stripTrailing();
ActionRequest actionRequest = null;
try {
Class<?> clazz = Class.forName(requestClassName);
Constructor<?> constructor = clazz.getConstructor(StreamInput.class);
StreamInput requestByteStream = StreamInput.wrap(
Arrays.copyOfRange(request.getRequestBytes(), nullPos + 1, request.getRequestBytes().length)
);
actionRequest = (ActionRequest) constructor.newInstance(requestByteStream);

Plus there's one more method I see in JS #349 that converts a Writeable to bytes, and that's also useful.

What alternatives have you considered?

Repeating ourselves.

Do you have any additional context?

This somewhat doubles down on using Writeable and if we're considering switching to protobuf, then we may consider a different approach.

@dbwiddis dbwiddis added enhancement New feature or request good first issue Good for newcomers untriaged CCI Part of the College Contributor Initiative more-challenging Good issues for new contributors that present a small challenge and removed untriaged labels Mar 25, 2023
@Kuanysh-kst
Copy link
Contributor

Hello @dbwiddis, I would like to take this issue

@dbwiddis
Copy link
Member Author

dbwiddis commented Mar 25, 2023

Assigning it to you, @Kuanysh-kst but let's make sure we prioritize finishing up your currently open parsing validator PR.

@Kuanysh-kst
Copy link
Contributor

@dbwiddis can I redo this code like this :

    public static ExtensionActionRequest createExtensionActionRequest(ActionRequest actionRequest) {
        try {
            return (ExtensionActionRequest) actionRequest;
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
        return null;
    }

@dbwiddis
Copy link
Member Author

@Kuanysh-kst I'm missing what comment this is in a reply to, but generally speaking:

  • this is a whole method to just do a class casting, which can happen on one line. I'm guessing there's more to it than this.
  • We don't print to stdout, if necessary we log something.
  • We would have to handle the null wherever we call this so it's not a good return value for utility functions unless well documented.

I suspect there is a bigger question here that I'm missing because this comment seems disconnected.

@dbwiddis
Copy link
Member Author

@Kuanysh-kst I think I got an email about a comment here but it may have been deleted. Let me try to clarify all the places where I think a util is useful. I'm going to use the Hello World RestRemoteHelloAction sample here.

Creating the request: this is already handled with the constructor syntax, no util needed.

RemoteExtensionActionRequest proxyActionRequest = new RemoteExtensionActionRequest(SampleAction.INSTANCE, sampleRequest);

This is passed to the SDKTransportService which also collects the response

RemoteExtensionActionResponse response = sdkTransportService.sendRemoteExtensionActionRequest(request);

The SDKTransportService (1st code snippet above) serializes the request class name and its input bytes, suitable for reconstructing it later. So this is a good util function, RestExtensionActionRequest to bytes to use as a param for TransportActionRequestFromExtension.

Upon receipt at the target extension, we have to deserialize, that is handled in ExtensionActionRequestHandler (second code snippet above). This produces the desired final class from the name name (in this case, an instance of SampleRequest) but to make the variable generic enough, we have its superclass, ActionRequest. This second snippet can be broken down into two functionalities:

  • Separating the byte stream into the class name and bytes
  • Instantiating the class

They aren't exactly in sequence in the above snippet but the pieces are there. Separating the stream and bytes is

int nullPos = indexOf(request.getRequestBytes(), RemoteExtensionActionRequest.UNIT_SEPARATOR);
String requestClassName = new String(Arrays.copyOfRange(request.getRequestBytes(), 0, nullPos + 1),
    StandardCharsets.UTF_8).stripTrailing();
StreamInput requestByteStream = StreamInput.wrap(
    Arrays.copyOfRange(request.getRequestBytes(), nullPos + 1, request.getRequestBytes().length)
);

(Note the indexOf can/should be its own util method).

And then instantiating the class is:

Class<?> clazz = Class.forName(requestClassName);
Constructor<?> constructor = clazz.getConstructor(StreamInput.class);
actionRequest = (ActionRequest) constructor.newInstance(requestByteStream);

That's all we need, since the next thing we do is client.execute() which only requires that ActionRequest class methods, so the fact it's a subclass doesn't matter.

SO in this sequence there are 3 distinct util methods:

  1. Request name+bytes = combined bytes (the first snippet)
  2. An indexOf method that you can pass the separator to
  3. (Part 1) A method to separate the bytes (using the indexof) into the class name and streaminput
  4. (Part 2) Using the above two results to instantiate an ActionRequest

Ideally the two methods in part 3 would be separate methods, but since we are returning two bits of information that complicates things, so a single method returning an ActionRequest works.

@dbwiddis
Copy link
Member Author

dbwiddis commented Apr 11, 2023

In addition to the above 3 methods which work for Extension-to-Extension, we have the Plugin-to-Extension version.

I noted that opensearch-project/job-scheduler#349 contained many of the same methods:

  • it used indexOf() which is defined above

It has a new method to convert a Writeable to its byte stream:

private static <T extends Writeable> byte[] convertParamsToBytes(T actionParams)

This isn't obvious but in the Extension case (Hello World) we implemented similar code in the constructor taking the sample request. It could be made shorter using that convertParamsToBytes which does the same thing.

    public RemoteExtensionActionRequest(ActionType<? extends ActionResponse> instance, ActionRequest request) {
        this.action = instance.getClass().getName();
        this.requestClass = request.getClass().getName();
        byte[] bytes = new byte[0];
        try (BytesStreamOutput out = new BytesStreamOutput()) {
            request.writeTo(out);
            bytes = BytesReference.toBytes(out.bytes());
        } catch (IOException e) {
            throw new IllegalStateException("Writing an OutputStream to memory should never result in an IOException.");
        }
        this.requestBytes = bytes;
    }

So this conversion of Writeable (Really an ActionRequest but we only need the Writeable interface for the method to work) to its bytes is a fourth method that should be added to the util class.

@dbwiddis
Copy link
Member Author

So to sum up the above 2 comments, the 4 methods needed are:

  1. ActionRequest --> bytes from its params
  2. ActionRequest name + paramBytes = bytes to send
  3. Reverse of both 2 and 1 to reconstruct an ActionRequest (ideally we'd reverse 2 and reverse 1 separately but that's harder to do)
  4. an indexOf util method that all of the above can use

@dbwiddis
Copy link
Member Author

dbwiddis commented Apr 26, 2023

The Util PR is getting a bit confusing with the naming. Let's be clear on the pieces of information we have here.

Ultimately we want to enable calling client.execute(action, request, listener) so we are sending the action (an instance of ActionType) and request (which itself has a class name, and parameters) over the transport.

Step 1: turn action (extends ActionType) into a class name (this is easily done, no util needed)
Step 2: turn request (extends ActionRequest ) into a class name + separator + params-as-bytes = request-as-bytes
Step 3: create a RemoteExtensionActionRequest using action and request-as-bytes (you don't need util for this)
Step 4: on the remote extension, turn the RemoteExtensionActionRequest into an action class and a request class so that you can eventually call client.execute.

@Kuanysh-kst
Copy link
Contributor

@dbwiddis can I open a new PR again , there are a lot of problems in the old one

@dbwiddis
Copy link
Member Author

@dbwiddis can I open a new PR again , there are a lot of problems in the old one

You can, but it's generally a better idea to try to fix the old one. Usually a rebase commit and resolving conflicts fixes a lot of the issues.

git checkout main
git pull upstream main
git checkout yourprbranch
git rebase main

This will show any conflicting files. Resolve conflicts, then git add and git rebase --continue.

Once you're done, force push the result.

Then fix any compile errors and push the fix.

@Kuanysh-kst
Copy link
Contributor

@dbwiddis can you check the new commit

@minalsha
Copy link
Collaborator

@dbwiddis can we close this issue since the PR associated with this issue: opensearch-project/OpenSearch#6969 is merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CCI Part of the College Contributor Initiative enhancement New feature or request good first issue Good for newcomers more-challenging Good issues for new contributors that present a small challenge
Projects
None yet
Development

No branches or pull requests

3 participants