Add proxy, token refresh, and multi_endpoint features to generators #878

au70ma70n · 2024-09-04T06:02:13Z

add proxy support to rest generator
add token refresh support to rest generator
add multi_endpoint_rest to generators
add proxy support to multi_endpoint_rest generator
add token refresh support to multi_endpoint_rest generator


github-actions · 2024-09-04T06:02:29Z

DCO Assistant Lite bot All contributors have signed the DCO ✍️ ✅

jmartin-tech

Thanks for the progress. This PR tries to accomplish a number of goals at once.

A number of items to consider refactoring:

- The new class seems to fit in the existing rest module. The rest.MultiEndpointGenerator class can then be specified on the command line while leaving RestGenerator as the default.
- Configuration specific to a single plugin should not be added as top-level options.
  - Since the current implementation of proxy support is rest-specific, the config can be consolidated as part of the generator_options_file. The same applies to verify_ssl.
  - Configuration of token refresh as a separate JSON config file creates complexity for a feature that is not globally applicable.
- Please add an example of how the configuration for all the new functionality should be formatted. This could be something like an added JSON segment in the class description for when a proxy or token_refresh is to be utilized.
- Since the parameter names for each stage can be identified by prefix, using the existing rest.Generator as a web request client could greatly reduce the code required. An example is offered in the inline comments for nesting the config for each endpoint, which avoids the need for prefixes on the keys and might defer more of the code to the existing generator.
- As implemented, verify_ssl only applies to token refresh. If a proxies value is passed, SSL control likely needs to either be on/off for all requests or be configurable for each request made.
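As a rough illustration of the last two points, the sketch below validates a proxies mapping and builds one set of keyword arguments so that `proxies` and `verify` apply to every request rather than only to token refresh. The function names and the `request_timeout` default of 20 seconds are assumptions made for this example, not part of the PR or of garak.

```python
from urllib.parse import urlparse


def validate_proxies(proxies):
    """Check that proxies maps a protocol name to a well-formed proxy URL."""
    if proxies is None:
        return None
    if not isinstance(proxies, dict):
        raise ValueError("proxies must be a dict mapping protocol to proxy URL")
    for protocol, url in proxies.items():
        parsed = urlparse(url) if isinstance(url, str) else None
        # A usable proxy URL needs both a scheme and a host part.
        if parsed is None or not parsed.scheme or not parsed.netloc:
            raise ValueError(f"proxy for {protocol!r} is not a valid URL: {url!r}")
    return proxies


def build_request_kwargs(options):
    """Build keyword arguments shared by every request a generator makes."""
    return {
        "timeout": options.get("request_timeout", 20),  # assumed default
        "proxies": validate_proxies(options.get("proxies")),
        "verify": options.get("verify_ssl", True),
    }
```

The resulting dictionary could be passed as `**kwargs` to any `requests` call, since `timeout`, `proxies`, and `verify` are all standard keyword arguments there.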

.devcontainer/devcontainer.json

.gitignore

jmartin-tech · 2024-09-04T14:03:46Z

garak/cli.py

I don't think the arguments here make sense as top-level CLI entries. If exposed at this level, users may expect them to apply to all plugins. Prefer that options which can only be utilized by a specific plugin object be configured for that object specifically.

The arguments were added as top-level options to provide the opportunity to integrate the functionality into other parts of the project. However, if you'd rather they be moved into the class configuration file, that's simple enough.

garak/configurable.py

jmartin-tech · 2024-09-04T14:16:38Z

garak/generators/multi_endpoint_rest.py

+
+        # Load the token refresh configuration
+        if hasattr(config_root.transient.cli_args, "token_refresh_config"):
+            self.token_refresh_path = config_root.transient.cli_args.token_refresh_config


__init__ should not access the global _config directly. This ties in with my comment about the CLI options not being top-level items. The concept of token refresh is specific to the rest generator, and the level of complexity in requiring a separate configuration is not desired.

All configuration for a specific plugin class should be provided via one configuration object applied via the Configurable pattern. Validating the values set on the object after load is fine.
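A minimal sketch of that idea follows, assuming a plain dictionary stands in for the plugin's configuration object. The class name `SketchGenerator` and its option names are hypothetical; this is not garak's actual Configurable implementation. It only shows defaults being overlaid by one config and validated after load.

```python
class SketchGenerator:
    """Hypothetical plugin that takes all settings from one config mapping."""

    DEFAULT_PARAMS = {
        "uri": None,
        "proxies": None,
        "verify_ssl": True,
        "token_refresh": None,
    }

    def __init__(self, config=None):
        # Apply defaults first, then overlay whatever the single config supplies.
        merged = dict(self.DEFAULT_PARAMS)
        merged.update(config or {})
        unknown = set(merged) - set(self.DEFAULT_PARAMS)
        if unknown:
            raise ValueError(f"unknown options: {sorted(unknown)}")
        for key, value in merged.items():
            setattr(self, key, value)
        self._validate()

    def _validate(self):
        # Post-load validation of values already set on the object.
        if not isinstance(self.uri, str) or not self.uri:
            raise ValueError("uri must be a non-empty string")
        if self.token_refresh is not None and "uri" not in self.token_refresh:
            raise ValueError("token_refresh requires a uri")
```

With this shape, `__init__` never reads CLI arguments or opens a second file; token refresh settings arrive as a nested entry in the same config.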

jmartin-tech · 2024-09-04T14:27:59Z

garak/generators/rest.py

+        # Load the token refresh configuration
+        if hasattr(config_root.transient.cli_args, "token_refresh_config"):
+            self.token_refresh_path = config_root.transient.cli_args.token_refresh_config
+            with open(self.token_refresh_path, "r") as f:
+                self.token_refresh_config = json.load(f)
+            if not isinstance(self.token_refresh_config["method"], str):
+                raise ValueError("token_refresh_config set but does not contain method")
+            if not isinstance(self.token_refresh_config["required_secrets"], list):
+                raise ValueError("token_refresh_config set but does not contain required_secrets list")
+
+            if len(self.token_refresh_config["required_secrets"]) == 0:
+                raise ValueError("token_refresh_config required_secrets list is empty")
+
+            self.token_refresh_http_function = getattr(requests, self.token_refresh_config["method"].lower())
+
+            secrets = {}
+            for secret in self.token_refresh_config["required_secrets"]:
+                if secret in os.environ:
+                    secrets[secret] = os.environ[secret]
+                else:
+                    raise ValueError(f"token_refresh_config required secret: {secret} not found in environment")
+            self.token_refresh_config["secrets"] = secrets
+
+
+        if (hasattr(self, "req_template_json_object") and self.req_template_json_object is not None):


Same issue here: __init__ should not need to load the file and should not access transient. Configuration for the class should be fully contained in the Configurable pattern from the initial config file.

jmartin-tech · 2024-09-04T14:35:10Z

garak/generators/multi_endpoint_rest.py

+        response_fields = []
+        for required_return in self.first_stage_required_returns:
+            field_path_expr = jsonpath_ng.parse(required_return["json_field"])
+            tmp_output_var = field_path_expr.find(first_stage_response_object)
+            if len(tmp_output_var) == 1:
+                tmp = {
+                    "name": required_return["name"],
+                    "value": tmp_output_var[0].value
+                }
+                response_fields.append(tmp)
+            else:
+                logging.error(
+                    "RestGenerator JSONPath in first_stage_required_returns yielded nothing. Response content: %s"
+                    % repr(first_stage_response_object)
+                )
+
+        # Populate second stage request body with the first stage response fields
+        self._populate_second_stage(response_fields)
+        second_stage_request_data = self._populate_template(self.second_stage_req_template, prompt)
+        # Populate second stage request headers
+        second_stage_request_headers = dict(self.second_stage_headers)
+        # Populate placeholders in the second stage headers
+        for k, v in self.second_stage_headers.items():
+            second_stage_request_headers[k] = self._populate_template(
+                v, prompt)
+
+        second_stage_req_kArgs = {
+            second_stage_data_kw: second_stage_request_data,
+            "headers": second_stage_request_headers,
+            "timeout": self.request_timeout,
+            "proxies": self.proxies,
+            "verify": self.verify_ssl,
+        }
+
+        second_stage_resp = self.second_stage_http_function(self.second_stage_uri, **second_stage_req_kArgs)
+
+
+        if second_stage_resp.status_code in self.ratelimit_codes:
+            raise RateLimitHit(
+                f"Rate limited: {second_stage_resp.status_code} - {second_stage_resp.reason}")
+
+        elif str(second_stage_resp.status_code)[0] == "3":
+            raise NotImplementedError(
+                f"REST URI redirection: {second_stage_resp.status_code} - {second_stage_resp.reason}"
+            )
+
+        elif str(second_stage_resp.status_code)[0] == "4":
+            # Token is expired, refresh it
+            if first_stage_resp.status_code == 401:
+                self.need_token_refresh = True
+                raise RateLimitHit(
+                    f"Rate limited: {first_stage_resp.status_code} - {first_stage_resp.reason}")
+            else:
+                raise ConnectionError(
+                f"REST URI client error: {first_stage_resp.status_code} - {first_stage_resp.reason}"
+            )
+
+        elif str(second_stage_resp.status_code)[0] == "5":
+            error_msg = f"REST URI server error: {second_stage_resp.status_code} - {second_stage_resp.reason}"
+            if self.retry_5xx:
+                raise IOError(error_msg)
+            else:
+                raise ConnectionError(error_msg)
+
+        second_stage_response_object = json.loads(second_stage_resp.content)
+
+
+        # if response_json_field starts with a $, treat is as a JSONPath
+        assert (self.second_stage_response_json), "second_stage_response_json must be True at this point; if False, we should have returned already"
+        assert isinstance(self.second_stage_response_json_field,str), "second_stage_response_json_field must be a string"
+        assert (len(self.second_stage_response_json_field) >0), "second_stage_response_json_field needs to be complete if second_stage_response_json is true; ValueError should have been raised in constructor"
+        if self.second_stage_response_json_field[0] != "$":
+            second_stage_json_extraction_result = [
+                second_stage_response_object[self.second_stage_response_json_field]]
+        else:
+            field_path_expr = jsonpath_ng.parse(self.second_stage_response_json_field)
+            second_stage_json_extraction_results = field_path_expr.find(second_stage_response_object)
+            if len(second_stage_json_extraction_results) == 1:
+                response_value = second_stage_json_extraction_results[0].value
+                if isinstance(response_value, str):
+                    second_stage_json_extraction_result = [response_value]
+                elif isinstance(response_value, list):
+                    second_stage_json_extraction_result = response_value
+            elif len(second_stage_json_extraction_results) > 1:
+                second_stage_json_extraction_result = [
+                    r.value for r in second_stage_json_extraction_results]
+            else:
+                logging.error(
+                    "MultiEndpointGenerator JSONPath in response_json_field yielded nothing. Response content: %s"
+                    % repr(second_stage_response_object)
+                )
+                return [None]
+
+        return second_stage_json_extraction_result
+
+        ################################################################################
+
+        job_id = self.prompt_sender._call_model(prompt, generations_this_call)
+        self.response_retriever.uri = f"{self.get_uri}/{job_id}"
+        return self.response_retriever._call_model(job_id, generations_this_call)


Most of this code could be deferred to the existing RestGenerator object. If it cannot reuse RestGenerator, at a minimum it should be refactored into methods reflecting the common error handling patterns here.
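One way the repeated status checks might be factored out is sketched below. It is an assumption about how such a helper could look, not code from the PR: `check_response_status` and `AuthRefreshNeeded` are hypothetical names, and `RateLimitHit` is redefined locally only so the example is self-contained. Because the helper takes the status of whichever response it is given, the second-stage branch cannot accidentally inspect the first-stage response.

```python
class RateLimitHit(Exception):
    """Stand-in for the rate limit exception raised in the diff above."""


class AuthRefreshNeeded(Exception):
    """Hypothetical signal that a 401 should trigger a token refresh."""


def check_response_status(status_code, reason, ratelimit_codes=(429,), retry_5xx=True):
    """Raise the appropriate exception for a non-success HTTP status."""
    detail = f"{status_code} - {reason}"
    if status_code in ratelimit_codes:
        raise RateLimitHit(f"Rate limited: {detail}")
    if 300 <= status_code < 400:
        raise NotImplementedError(f"REST URI redirection: {detail}")
    if status_code == 401:
        raise AuthRefreshNeeded(f"REST URI authentication expired: {detail}")
    if 400 <= status_code < 500:
        raise ConnectionError(f"REST URI client error: {detail}")
    if 500 <= status_code < 600:
        message = f"REST URI server error: {detail}"
        # Mirrors the diff: IOError when retry_5xx is set, else ConnectionError.
        raise IOError(message) if retry_5xx else ConnectionError(message)
```

Each stage would then call the helper once with its own response, for example `check_response_status(resp.status_code, resp.reason)`.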

jmartin-tech · 2024-09-04T16:54:38Z

garak/generators/rest.py

+            secrets = {}
+            for secret in self.token_refresh_config["required_secrets"]:
+                if secret in os.environ:
+                    secrets[secret] = os.environ[secret]
+                else:
+                    raise ValueError(f"token_refresh_config required secret: {secret} not found in environment")


Values extracted from os.environ are handled in _validate_env_var(). This consolidates the mapping of values from environment variables in a common location and allows the configurable process to have already set values or to defer them to os.environ. This is another reason a plugin should have a single consolidated configuration.
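The general pattern described here, preferring values already set by configuration and falling back to the environment, can be sketched as follows. This is not the body of `_validate_env_var()`; `resolve_secrets` and its parameters are hypothetical names used only to illustrate the lookup order.

```python
import os


def resolve_secrets(required, configured=None, environ=None):
    """Return a value for each required secret, preferring configured values."""
    configured = configured or {}
    environ = os.environ if environ is None else environ
    resolved, missing = {}, []
    for name in required:
        if name in configured:
            resolved[name] = configured[name]  # already set by configuration
        elif name in environ:
            resolved[name] = environ[name]  # deferred to the environment
        else:
            missing.append(name)
    if missing:
        raise ValueError(f"required secrets not found: {', '.join(missing)}")
    return resolved
```

Collecting all missing names before raising reports every gap at once instead of stopping at the first one.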

jmartin-tech · 2024-09-04T16:57:51Z

garak/generators/rest.py

+                "verify": self.verify_ssl,
+            }
+
+            # TODO: add error handling


Consider extracting token refresh as a method call that can then be tagged for backoff and provide consolidated error handling that can offer a clear indicator that authentication refresh caused the failure.

This might even make sense to be a class that is mixed in to RestGenerator to provide periodic authentication capabilities.
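A mixin along those lines might look like the sketch below. It is illustrative only: `TokenRefreshMixin`, `TokenRefreshError`, and `_fetch_token` are hypothetical names, and a plain retry loop with doubling delay stands in for whatever backoff decorator the project would actually apply.

```python
import time


class TokenRefreshError(Exception):
    """Raised when authentication refresh fails, distinct from request errors."""


class TokenRefreshMixin:
    """Hypothetical mixin; expects the host class to define _fetch_token()."""

    token = None
    token_refresh_attempts = 3
    token_refresh_delay = 0.5  # seconds, doubled after each failed attempt

    def refresh_token(self):
        delay = self.token_refresh_delay
        last_error = None
        for attempt in range(self.token_refresh_attempts):
            try:
                self.token = self._fetch_token()
                return self.token
            except Exception as exc:  # sketch only: narrow this in real code
                last_error = exc
                if attempt < self.token_refresh_attempts - 1:
                    time.sleep(delay)
                    delay *= 2
        # A dedicated exception makes it clear that refresh caused the failure.
        raise TokenRefreshError(
            f"token refresh failed after {self.token_refresh_attempts} attempts"
        ) from last_error
```

A generator class would list the mixin among its bases and implement `_fetch_token()` to perform the actual refresh request.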

jmartin-tech · 2024-09-04T17:03:48Z

garak/generators/multi_endpoint_rest.py

+        "multi_endpoint_rest": {
+            "MultiEndpointGenerator": {
+                "name": "example service",
+                "post_uri": "https://example.ai/llm",
+                "post_headers": {
+                    "X-Authorization": "$KEY"
+                },
+                "post_req_template_json_object": {
+                    "text": "$INPUT"
+                },
+                "post_response_json": true,
+                "post_response_json_field": "job_id",
+                "get_uri": "https://example.ai/llm",
+                "get_headers": {
+                    "X-Authorization": "$KEY"
+                },
+                "get_req_template_json_object": {
+                    "text": "$INPUT"
+                },
+                "post_response_json": true,
+                "post_response_json_field": "text"
+            }
+        }
+    }


This example config does not match the class expectations.

{
  "multi_endpoint_rest": {
    "MultiEndpointGenerator": {
      "name": "example service",
      "first_stage_uri": "https://example.ai/llm",
      "first_stage_headers": {
        "X-Authorization": "$KEY"
      },
      "first_stage_req_template_json_object": {
        "text": "$INPUT"
      },
      "first_stage_response_json": true,
      "first_stage_response_json_field": "job_id",
      "second_stage_uri": "https://example.ai/llm",
      "second_stage_headers": {
        "X-Authorization": "$KEY"
      },
      "second_stage_req_template_json_object": {
        "text": "$INPUT"
      },
      "second_stage_response_json": true,
      "second_stage_response_json_field": "text"
    }
  }
}

The first_stage_ and second_stage_ prefixes are very generic. Would there be value in a more specific, network-concept-focused prefix like send_ and recv_?

{
  "multi_endpoint_rest": {
    "MultiEndpointGenerator": {
      "name": "example service",
      "send_uri": "https://example.ai/llm",
      "send_headers": {
        "X-Authorization": "$KEY"
      },

Another option might be to organize the config in line with reuse of the existing rest generator.

{
  "multi_endpoint_rest": {
    "MultiEndpointGenerator": {
      "first_stage": {
        "name": "request service",
        "uri": "https://example.ai/llm",
        "method": "post",
        "headers": {
          "X-Authorization": "$KEY"
        },
        "req_template_json_object": {
          "text": "$INPUT"
        },
        "response_json": true,
        "response_json_field": "text"
      },
      "second_stage": {
        "name": "response service",
        "uri": "https://example.ai/llm",
        "method": "post",
        "headers": {
          "X-Authorization": "$KEY"
        },
        "req_template_json_object": {
          "text": "$INPUT"
        },
        "response_json": true,
        "response_json_field": "text"
      }
    }
  }
}
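To show how the flat, prefixed layout relates to the nested one, here is a small sketch that groups prefixed keys under per-stage dictionaries. The helper `nest_by_prefix` is hypothetical and exists only for this illustration; each resulting per-stage dictionary has the same key names a single rest generator config would use.

```python
def nest_by_prefix(flat_config, prefixes=("first_stage", "second_stage")):
    """Group 'first_stage_uri'-style keys under per-stage dictionaries."""
    nested = {prefix: {} for prefix in prefixes}
    shared = {}
    for key, value in flat_config.items():
        for prefix in prefixes:
            marker = prefix + "_"
            if key.startswith(marker):
                # Strip the prefix so the key matches the single-endpoint name.
                nested[prefix][key[len(marker):]] = value
                break
        else:
            # Keys without a stage prefix apply to the generator as a whole.
            shared[key] = value
    return shared, nested
```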

The first and second stage variable names were intentionally generic for flexibility, as it can perform any two calls to any two systems. A post and delete for example, or two puts (one to make a change and the other to revert it).

leondz · 2024-09-05T08:27:50Z

garak/generators/multi_endpoint_rest.py

What's this for? What's the use-case? Where can it be tested? Why should we integrate this and maintain it? Where are the docs (should be in .rst format under docs/)?

-- Not saying we shouldn't include this, but without understanding what's gained by this, it's hard to triage the value of this PR. Can we get some clarity around what this brings to the project and why we should add it to the set of things we're maintaining?

What's this for?

This is for supporting LLM application flows that require multiple endpoint interactions.

What's the use case?

The use case is left intentionally generic to support as many use cases as possible, however some example use cases would be the following:

An LLM application requires a POST request to generate a response, and a GET request to fetch it

Testing against an LLM application without permissions to create new responses, and using two PUT requests, one to change an initial conversation, the other to change it back

Testing against an LLM application in production, and using an additional DELETE request to clean up old conversation histories.

Where can it be tested?

This could be tested against anything that you are able to interact with via a rest API. I've tested with ChatGPT, however I'm sure there are many other suitable candidates.

Where are the docs?

Docs have yet to be added as I suspected there would be many refactoring changes, as @jmartin-tech has pointed out.

Can we get some clarity around what this brings to the project and why we should add it to the set of things we're maintaining?

In my opinion the main thing this PR brings to the project is flexibility. This functionality will be required for the testing of many internally developed LLM applications, which I suspect the project values. I know of several LLM applications across various companies that were developed internally and require multiple api interactions across various endpoints to use. This PR is primarily designed to address those applications.

This functionality will be required for the testing of many internally developed LLM applications, which I suspect the project values. I know of several LLM applications across various companies that were developed internally and require multiple api interactions across various endpoints to use.

I think the primary question here is whether, and how, the project should support something that does not have a public example. Anything accepted into a release has to be maintained to some extent. If there is no public reference implementation, it can get very difficult to support consumers, as reproducing issues becomes a high-effort activity.

The project claims to be "nmap for LLMs." If that's the case, then yes the project should support features required by industry professionals.

The question posed is reasonable, how can the project establish the value of supporting this access pattern? Is there a reference example where this is actually encountered in a professional setting that the project can use for testing?

The answers can be "I don't know" or "Nothing that can be publicly shared"; simply having information about patterns seen in the wild can influence value.

If there is an available tool or application that follows this pattern for access to inference, this generator has more value and more reason to be maintained and supported.

Supporting the three use cases you provided may meet the criteria. The second and third especially sound interesting, and examples of applications that have these patterns may even help this project and its community further research and build guidance on using this type of access pattern.

Note that the third use case is not yet supported. As implemented, the second stage response needs to contain the result of generation. I can see it as viable that a DELETE request may not return the data deleted, which leads me to suggest that another configuration option may be needed: a mechanism to return a value from the first stage as the result of the attempt to be evaluated by detectors, and to obtain another value from the response to use as the identifier posted in the second stage to remove the history. I am not suggesting this would be required to land this PR, only that this factors into the equation.

Is there a reference example where this is actually encountered in a professional setting that the project can use for testing?

The answer is yes, I developed this PR specifically for the deployment of garak in an internal, professional setting. Unfortunately the details of the application cannot be publicly shared. Also yes, the third use case is not currently supported. In the future I'd like to see this functionality become more modular, supporting n number of requests and containing configuration options for how to get and return the response data.

Huh, OK. Something we have on the roadmap is for custom plugins to go in user-local custom dirs instead of in the garak package directory. This might be a sensible route for developing this.

As maintainers, when we merge a PR, we agree to take on the technical debt for the concepts and features in that PR for the lifetime of the project. As humans, our time and bandwidth is finite, so each new external-origin feature means a permanent degradation in our ability to steer and progress the project, and a permanent reduction in the velocity we have on our roadmap. I hope you can appreciate, therefore, how seriously we take this sort of thing, especially when it's an agreement to support a target that is invisible to us (and all the other garak users).

For example, because we cannot test things we can't observe, architectural changes are prone to breaking functions like this, even despite our best efforts: we won't have "real" tests, so we'll be unable to perfectly detect when things break. Given that, what kind of SLA would you expect if an architectural change silently broke this plugin? What would be fair? What would be the incentive from either side to engage in that feature/project at all?

Architectural changes within garak or externally? I wouldn't expect external architectural changes to affect the garak project, however things like changing how responses are processed for example would. I think a reasonable SLA would be maintaining the data flow into and out of the generator, and I would expect the community to patch the generator to fit architectural changes to their target applications.

au70ma70n · 2024-09-05T17:02:16Z

I have read the DCO Document and I hereby sign the DCO

I find additional documentation to rarely be a bad thing, however if you disagree then we can remove it Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: au70ma70n <168219140+au70ma70n@users.noreply.github.com>

leondz · 2024-09-06T17:49:02Z

I mean changes in garak architecture that cause alterations in how high-level components operate, pretty day-to-day stuff in a software project.


jmartin-tech · 2024-10-10T16:07:28Z

Circling back here: given the number of changes requested, could this be separated to isolate the various changes?

I can see the following possible independent PRs:

- Enable RestGenerator to accept verify_ssl
  - Allow suppression of SSL certificate validation when executing HTTPS-based requests
  - Default verify_ssl: True
- Enable RestGenerator HTTP proxy support
  - Allow configuration of proxies
  - Include validation that a dictionary is provided and that its values are valid URI entries, following the "Dictionary mapping protocol to the URL of the proxy" format, for example: { "https": "https://10.10.1.1:8443" }
- Enable RestGenerator token refresh
  - Consider supporting a generic endpoint value request
  - (optional) Consider support for template and value extraction
  - (optional) Support a specific token generation pattern such as OAuth2
- Enable multi_endpoint generation
  - Accept a series of RestGenerator configuration stages
  - Support selection of the stage to use for the generator result
  - Support usage of the response from a previous stage as the $INPUT value of a stage
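The multi_endpoint item could be sketched roughly as below, assuming each stage is a dictionary with a request template and that a caller-supplied `send` function performs the actual request. `run_stages`, the `template` key, and `result_stage` are hypothetical names for illustration; nothing here is taken from the PR's implementation.

```python
def run_stages(stages, prompt, send, result_stage=-1):
    """Run each stage in order, feeding each output into the next stage's $INPUT."""
    if not stages:
        raise ValueError("at least one stage is required")
    outputs = []
    current_input = prompt
    for stage in stages:
        # Substitute the prompt (or the previous stage's output) into the template.
        body = stage["template"].replace("$INPUT", current_input)
        output = send(stage, body)
        outputs.append(output)
        current_input = output
    # result_stage picks which stage's output is returned as the generation.
    return outputs[result_stage]
```

Selecting `result_stage=0` would cover a flow where a later cleanup request, such as a DELETE, does not return the generated text.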

- add proxy support to rest generator

b386445

- add token refresh support to rest generator - add multi_endpoint_rest to generators - add proxy support to multi_endpoint_rest generator - add token refresh support to multi_endpoint_rest generator

automatic garak/resources/plugin_cache.json update

dab5e13

jmartin-tech self-assigned this Sep 4, 2024

jmartin-tech requested changes Sep 4, 2024

leondz reviewed Sep 5, 2024

Update garak/configurable.py

f630b0e

github-actions bot added a commit that referenced this pull request Sep 5, 2024

@au70ma70n has signed the CLA in #878

b226c7e
