
Add proxy, token refresh, and multi_endpoint features to generators #878

Open · wants to merge 3 commits into base: main

Conversation

au70ma70n

  • add proxy support to rest generator
  • add token refresh support to rest generator
  • add multi_endpoint_rest to generators
  • add proxy support to multi_endpoint_rest generator
  • add token refresh support to multi_endpoint_rest generator

Contributor

github-actions bot commented Sep 4, 2024

DCO Assistant Lite bot All contributors have signed the DCO ✍️ ✅

@jmartin-tech jmartin-tech self-assigned this Sep 4, 2024
Collaborator

@jmartin-tech jmartin-tech left a comment


Thanks for the progress, this PR tries to accomplish a number of goals at once.

A number of items to consider refactoring:

  • The new class seems to fit in the existing rest module; the rest.MultiEndpointGenerator class can be specified on the command line while leaving RestGenerator as the default.
  • Configuration specific to a single plugin should not be added as top-level options.
    • Since the current implementation of proxy support is rest-specific, the config can be consolidated as part of the generator_options_file. The same applies to verify_ssl.
    • Configuring token refresh as a separate JSON config file creates complexity for a feature that is not globally applicable.
  • Please add an example of how all the new configuration should be formatted; this could be something like an added JSON segment in the class description for when a proxy or token_refresh is to be utilized.
  • Since the parameter names for each stage can be identified by prefix, using the existing rest.Generator as a web request client could greatly reduce the code required. An example is offered of nesting the config for each endpoint to avoid prefixed keys, which might defer even more of the code required.
  • As implemented, verify_ssl only applies to token refresh; if a proxies value is passed, SSL control likely needs either to be on/off for all requests or to have configurable values for each request made.
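
To make that consolidation concrete, a generator_options_file could carry all of these settings under the plugin entry. This is a hedged sketch only; the proxies, verify_ssl, and token_refresh key names are assumptions for illustration, not the current RestGenerator schema:

```json
{
    "rest": {
        "RestGenerator": {
            "name": "example service",
            "uri": "https://example.ai/llm",
            "method": "post",
            "headers": {
                "X-Authorization": "$KEY"
            },
            "proxies": {
                "https": "https://10.10.1.1:8443"
            },
            "verify_ssl": false,
            "token_refresh": {
                "method": "post",
                "required_secrets": ["REFRESH_KEY"]
            }
        }
    }
}
```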

.devcontainer/devcontainer.json (outdated; resolved)
.gitignore (resolved)
garak/cli.py (outdated)
Collaborator


I don't think the arguments here make sense as top-level CLI entries. If exposed at this level, users may expect them to apply to all plugins. Prefer that options which can only be utilized by a specific plugin object be configured for that object specifically.

Author


The arguments were added as top-level options to provide the opportunity to integrate the functionality into other parts of the project; however, if you'd rather they be moved into the class configuration file, that's simple enough.

garak/configurable.py (outdated; resolved)

# Load the token refresh configuration
if hasattr(config_root.transient.cli_args, "token_refresh_config"):
    self.token_refresh_path = config_root.transient.cli_args.token_refresh_config
Collaborator


__init__ should not access the global _config directly; this ties in with my comment about the CLI options not being top-level items. The concept of token refresh is rest-generator specific, and the level of complexity here in requiring a separate configuration is not desired.

Consider that all configuration for a specific plugin class should be provided via one configuration object applied via the Configurable pattern; validation of the values set on the object post-load is valid.
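
To make the suggestion concrete, here is a minimal sketch of that pattern. The class name, DEFAULT_PARAMS contents, and validation logic are illustrative assumptions, not the actual garak Configurable implementation:

```python
# Hypothetical sketch: a plugin receives one consolidated config object,
# overlays it on class defaults, and validates after load -- no access to
# global transient state in __init__.
class ConfigurablePlugin:
    DEFAULT_PARAMS = {
        "proxies": None,
        "verify_ssl": True,
        "token_refresh": None,  # e.g. {"method": "post", "required_secrets": [...]}
    }

    def __init__(self, config=None):
        # Apply defaults, then overlay the plugin-specific config block
        params = dict(self.DEFAULT_PARAMS)
        params.update(config or {})
        for key, value in params.items():
            setattr(self, key, value)
        self._validate()

    def _validate(self):
        # Post-load validation of the values set on the object
        if self.token_refresh is not None:
            if not isinstance(self.token_refresh.get("method"), str):
                raise ValueError("token_refresh set but does not contain method")
            if not self.token_refresh.get("required_secrets"):
                raise ValueError("token_refresh requires a non-empty required_secrets list")
```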

Comment on lines +153 to +177
# Load the token refresh configuration
if hasattr(config_root.transient.cli_args, "token_refresh_config"):
    self.token_refresh_path = config_root.transient.cli_args.token_refresh_config
    with open(self.token_refresh_path, "r") as f:
        self.token_refresh_config = json.load(f)
    if not isinstance(self.token_refresh_config["method"], str):
        raise ValueError("token_refresh_config set but does not contain method")
    if not isinstance(self.token_refresh_config["required_secrets"], list):
        raise ValueError("token_refresh_config set but does not contain required_secrets list")

    if len(self.token_refresh_config["required_secrets"]) == 0:
        raise ValueError("token_refresh_config required_secrets list is empty")

    self.token_refresh_http_function = getattr(requests, self.token_refresh_config["method"].lower())

    secrets = {}
    for secret in self.token_refresh_config["required_secrets"]:
        if secret in os.environ:
            secrets[secret] = os.environ[secret]
        else:
            raise ValueError(f"token_refresh_config required secret: {secret} not found in environment")
    self.token_refresh_config["secrets"] = secrets


if hasattr(self, "req_template_json_object") and self.req_template_json_object is not None:
Collaborator


Same issue here: __init__ should not need to load the file and should not access transient. Configuration for the class should be fully contained in the Configurable pattern from the initial config file.

Comment on lines +271 to +455
response_fields = []
for required_return in self.first_stage_required_returns:
    field_path_expr = jsonpath_ng.parse(required_return["json_field"])
    tmp_output_var = field_path_expr.find(first_stage_response_object)
    if len(tmp_output_var) == 1:
        response_fields.append(
            {"name": required_return["name"], "value": tmp_output_var[0].value}
        )
    else:
        logging.error(
            "RestGenerator JSONPath in first_stage_required_returns yielded nothing. Response content: %s"
            % repr(first_stage_response_object)
        )

# Populate second stage request body with the first stage response fields
self._populate_second_stage(response_fields)
second_stage_request_data = self._populate_template(self.second_stage_req_template, prompt)
# Populate placeholders in the second stage headers
second_stage_request_headers = {
    k: self._populate_template(v, prompt) for k, v in self.second_stage_headers.items()
}

second_stage_req_kwargs = {
    second_stage_data_kw: second_stage_request_data,
    "headers": second_stage_request_headers,
    "timeout": self.request_timeout,
    "proxies": self.proxies,
    "verify": self.verify_ssl,
}

second_stage_resp = self.second_stage_http_function(self.second_stage_uri, **second_stage_req_kwargs)

if second_stage_resp.status_code in self.ratelimit_codes:
    raise RateLimitHit(
        f"Rate limited: {second_stage_resp.status_code} - {second_stage_resp.reason}"
    )

elif str(second_stage_resp.status_code)[0] == "3":
    raise NotImplementedError(
        f"REST URI redirection: {second_stage_resp.status_code} - {second_stage_resp.reason}"
    )

elif str(second_stage_resp.status_code)[0] == "4":
    if second_stage_resp.status_code == 401:
        # Token is expired, refresh it
        self.need_token_refresh = True
        raise RateLimitHit(
            f"Rate limited: {second_stage_resp.status_code} - {second_stage_resp.reason}"
        )
    else:
        raise ConnectionError(
            f"REST URI client error: {second_stage_resp.status_code} - {second_stage_resp.reason}"
        )

elif str(second_stage_resp.status_code)[0] == "5":
    error_msg = f"REST URI server error: {second_stage_resp.status_code} - {second_stage_resp.reason}"
    if self.retry_5xx:
        raise IOError(error_msg)
    else:
        raise ConnectionError(error_msg)

second_stage_response_object = json.loads(second_stage_resp.content)

# if response_json_field starts with a $, treat it as a JSONPath
assert (
    self.second_stage_response_json
), "second_stage_response_json must be True at this point; if False, we should have returned already"
assert isinstance(
    self.second_stage_response_json_field, str
), "second_stage_response_json_field must be a string"
assert (
    len(self.second_stage_response_json_field) > 0
), "second_stage_response_json_field needs to be complete if second_stage_response_json is true; ValueError should have been raised in constructor"
if self.second_stage_response_json_field[0] != "$":
    second_stage_json_extraction_result = [
        second_stage_response_object[self.second_stage_response_json_field]
    ]
else:
    field_path_expr = jsonpath_ng.parse(self.second_stage_response_json_field)
    second_stage_json_extraction_results = field_path_expr.find(second_stage_response_object)
    if len(second_stage_json_extraction_results) == 1:
        response_value = second_stage_json_extraction_results[0].value
        if isinstance(response_value, str):
            second_stage_json_extraction_result = [response_value]
        elif isinstance(response_value, list):
            second_stage_json_extraction_result = response_value
    elif len(second_stage_json_extraction_results) > 1:
        second_stage_json_extraction_result = [
            r.value for r in second_stage_json_extraction_results
        ]
    else:
        logging.error(
            "MultiEndpointGenerator JSONPath in response_json_field yielded nothing. Response content: %s"
            % repr(second_stage_response_object)
        )
        return [None]

return second_stage_json_extraction_result

################################################################################

job_id = self.prompt_sender._call_model(prompt, generations_this_call)
self.response_retriever.uri = f"{self.get_uri}/{job_id}"
return self.response_retriever._call_model(job_id, generations_this_call)
Collaborator


Most of this code could be deferred to the existing RestGenerator object; if it cannot reuse RestGenerator, at a minimum it should be refactored into methods reflecting the common error-handling patterns here.

Comment on lines +168 to +173
secrets = {}
for secret in self.token_refresh_config["required_secrets"]:
    if secret in os.environ:
        secrets[secret] = os.environ[secret]
    else:
        raise ValueError(f"token_refresh_config required secret: {secret} not found in environment")
Collaborator


Values extracted from os.environ are handled in _validate_env_var(); this consolidates the mapping of environment-variable values in a common location and allows the configurable process to use already-set values or defer to os.environ. This is another reason a plugin should have a single consolidated configuration.

"verify": self.verify_ssl,
}

# TODO: add error handling
Collaborator


Consider extracting token refresh as a method call that can then be tagged for backoff and that provides consolidated error handling, offering a clear indicator that authentication refresh caused the failure.

This might even make sense to be a class that is mixed in to RestGenerator to provide periodic authentication capabilities.
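
As a rough illustration of that mixin idea, the sketch below shows the shape such a class could take. The class name, token_ttl default, and stubbed refresh request are all assumptions for illustration, not existing garak code:

```python
# Hypothetical TokenRefreshMixin sketch: periodic bearer-token refresh
# behind a single property, giving one choke point where backoff and
# auth-failure reporting could be attached.
import time


class TokenRefreshMixin:
    """Adds lazy, periodic token refresh to a REST-style generator."""

    token_ttl = 300  # seconds; assumed default

    def __init__(self):
        self._token = None
        self._token_acquired = 0.0

    def _refresh_token(self):
        # In a real generator this would issue the configured refresh
        # request; stubbed here so the control flow is self-contained.
        self._token = f"token-{int(time.time())}"
        self._token_acquired = time.time()

    @property
    def token(self):
        # Refresh lazily when missing or expired, so every caller goes
        # through the same refresh path.
        if self._token is None or time.time() - self._token_acquired > self.token_ttl:
            self._refresh_token()
        return self._token
```

A generator class could then mix this in and read self.token when building headers, with backoff decorating _refresh_token.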

Comment on lines +21 to +44
"multi_endpoint_rest": {
"MultiEndpointGenerator": {
"name": "example service",
"post_uri": "https://example.ai/llm",
"post_headers": {
"X-Authorization": "$KEY"
},
"post_req_template_json_object": {
"text": "$INPUT"
},
"post_response_json": true,
"post_response_json_field": "job_id",
"get_uri": "https://example.ai/llm",
"get_headers": {
"X-Authorization": "$KEY"
},
"get_req_template_json_object": {
"text": "$INPUT"
},
"post_response_json": true,
"post_response_json_field": "text"
}
}
}
Collaborator


This example config does not match the class expectations.

    {
        "multi_endpoint_rest": {
            "MultiEndpointGenerator": {
                "name": "example service",
                "first_stage_uri": "https://example.ai/llm",
                "first_stage_headers": {
                    "X-Authorization": "$KEY"
                },
                "first_stage_req_template_json_object": {
                    "text": "$INPUT"
                },
                "first_stage_response_json": true,
                "first_stage_response_json_field": "job_id",
                "second_stage_uri": "https://example.ai/llm",
                "second_stage_headers": {
                    "X-Authorization": "$KEY"
                },
                "second_stage_req_template_json_object": {
                    "text": "$INPUT"
                },
                "second_stage_response_json": true,
                "second_stage_response_json_field": "text"
            }
        }
    }

The first_stage_ & second_stage_ prefixes are very generic; would there be value in a more specific, network-concepts-focused prefix like send_ & recv_?

    {
        "multi_endpoint_rest": {
            "MultiEndpointGenerator": {
                "name": "example service",
                "send_uri": "https://example.ai/llm",
                "send_headers": {
                    "X-Authorization": "$KEY"
                },

Another option might be to organize the config in line with reuse of the existing rest generator.

    {
        "multi_endpoint_rest": {
            "MultiEndpointGenerator": {
                "first_stage": {                    
                    "name": "request service",
                    "uri": "https://example.ai/llm",
                    "method": "post",
                    "headers": {
                        "X-Authorization": "$KEY",
                    },
                    "req_template_json_object": {
                        "text": "$INPUT"
                    },
                    "response_json": true,
                    "response_json_field": "text"
                },
                "second_stage": {
                    "name": "response service",
                    "uri": "https://example.ai/llm",
                    "method": "post",
                    "headers": {
                        "X-Authorization": "$KEY",
                    },
                    "req_template_json_object": {
                        "text": "$INPUT"
                    },
                    "response_json": true,
                    "response_json_field": "text"
                }
            }
        }
    }

Author


The first- and second-stage variable names were intentionally generic for flexibility, as the generator can perform any two calls to any two systems: a POST and a DELETE, for example, or two PUTs (one to make a change and the other to revert it).

leondz

This comment was marked as off-topic.

Owner


What's this for? What's the use-case? Where can it be tested? Why should we integrate this and maintain it? Where are the docs (should be in .rst format under docs/)?

-- Not saying we shouldn't include this, but without understanding what's gained by this, it's hard to triage the value of this PR. Can we get some clarity around what this brings to the project and why we should add it to the set of things we're maintaining?

Author

@au70ma70n au70ma70n Sep 5, 2024


What's this for?

This is for supporting LLM application flows that require multiple endpoint interactions.

What's the use case?

The use case is left intentionally generic to support as many use cases as possible, however some example use cases would be the following:

  • An LLM application requires a POST request to generate a response, and a GET request to fetch it
  • Testing against an LLM application without permissions to create new responses, and using two PUT requests, one to change an initial conversation, the other to change it back
  • Testing against an LLM application in production, and using an additional DELETE request to clean up old conversation histories.

Where can it be tested?

This could be tested against anything that you are able to interact with via a REST API. I've tested with ChatGPT, but I'm sure there are many other suitable candidates.

Where are the docs?

Docs have yet to be added as I suspected there would be many refactoring changes, as @jmartin-tech has pointed out.

Can we get some clarity around what this brings to the project and why we should add it to the set of things we're maintaining?

In my opinion the main thing this PR brings to the project is flexibility. This functionality will be required for the testing of many internally developed LLM applications, which I suspect the project values. I know of several LLM applications across various companies that were developed internally and require multiple api interactions across various endpoints to use. This PR is primarily designed to address those applications.

Collaborator

@jmartin-tech jmartin-tech Sep 5, 2024


This functionality will be required for the testing of many internally developed LLM applications, which I suspect the project values. I know of several LLM applications across various companies that were developed internally and require multiple api interactions across various endpoints to use.

I think the primary question here is how, and whether, the project should support something that does not have a public example. Anything accepted into a release has to be maintained to some extent. If there is no public reference implementation, it can get very difficult to support consumers, as reproducing issues becomes a high-effort activity.

Author


The project claims to be "nmap for LLMs." If that's the case, then yes the project should support features required by industry professionals.

Collaborator


The question posed is reasonable, how can the project establish the value of supporting this access pattern? Is there a reference example where this is actually encountered in a professional setting that the project can use for testing?

The answers can be I don't know or Nothing that can be publicly shared, simply having information about patterns seen in the wild can influence value.

If there is an available tool or application that follows this pattern for access to inference, this generator has more value and more reason to be maintained and supported.

Supporting the three use cases you provided may meet the criteria; the second and third especially sound interesting, and examples of applications that have these patterns may even help this project and its community further research and build guidance on using this type of access pattern.

Note the third use case is not yet supported. As implemented, the second-stage response needs to contain the result of generation. I can see it as viable that a DELETE request may not return the data deleted, which suggests another configuration option may be needed: a mechanism to return a value from the first stage as the result of the attempt to be evaluated by detectors, and to obtain another value from the response to use as the identifier posted in the second stage to remove the history. I am not suggesting this would be required to land this PR, only that this factors into the equation.

Author

@au70ma70n au70ma70n Sep 6, 2024


Is there a reference example where this is actually encountered in a professional setting that the project can use for testing?

The answer is yes, I developed this PR specifically for the deployment of garak in an internal, professional setting. Unfortunately the details of the application cannot be publicly shared. Also yes, the third use case is not currently supported. In the future I'd like to see this functionality become more modular, supporting n number of requests and containing configuration options for how to get and return the response data.

Owner

@leondz leondz Sep 6, 2024


Huh, OK. Something we have on the roadmap is for custom plugins to go in user-local custom dirs instead of in the garak package directory. This might be a sensible route for developing this.

As maintainers, when we merge a PR, we agree to take on the technical debt for the concepts and features in that PR for the lifetime of the project. As humans, our time and bandwidth are finite, so each new external-origin feature means a permanent degradation in our ability to steer and progress the project, and a permanent reduction in the velocity we have on our roadmap. I hope you can appreciate, therefore, how seriously we take this sort of thing, especially when it's an agreement to support a target that is invisible to us (and all the other garak users).

For example, because we cannot test things we can't observe, architectural changes are prone to breaking functions like this, even despite our best efforts: we won't have "real" tests, so we'll be unable to perfectly detect when things break. Given that, what kind of SLA would you expect if an architectural change silently broke this plugin? What would be fair? What would be the incentive from either side to engage in that feature/project at all?

Author


Architectural changes within garak or externally? I wouldn't expect external architectural changes to affect the garak project, however things like changing how responses are processed for example would. I think a reasonable SLA would be maintaining the data flow into and out of the generator, and I would expect the community to patch the generator to fit architectural changes to their target applications.

@au70ma70n
Author

I have read the DCO Document and I hereby sign the DCO

I find additional documentation is rarely a bad thing; however, if you disagree then we can remove it.

Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev>
Signed-off-by: au70ma70n <168219140+au70ma70n@users.noreply.github.com>
github-actions bot added a commit that referenced this pull request Sep 5, 2024
@leondz
Owner

leondz commented Sep 6, 2024 via email

@jmartin-tech
Collaborator

Circling back here: given the number of changes requested, could this be separated to isolate the various changes?

I can see the following possible independent PRs:

  • enable restGenerator to accept verify_ssl
    • Allow suppression of SSL certificate validation when executing https-based requests
    • Default verify_ssl: True
  • enable restGenerator http proxy support
    • Allow configuration of proxies
    • Include validation that a dictionary is provided and values are valid URI entries, in the "dictionary mapping protocol to the URL of the proxy" format, for example: { "https": "https://10.10.1.1:8443" }
  • enable restGenerator token refresh
    • consider supporting a generic endpoint value request
    • (optional) consider support for template and value extraction
    • (optional) support a specific token generation pattern such as OAuth2
  • enable multi_endpoint generation
    • accept a series of restGenerator configuration stages
    • support selection of the stage to use for the generator result
    • support usage of the response from a previous stage as the $INPUT value of a stage
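
The proxies validation bullet above could be sketched roughly as follows. The helper name and error messages are illustrative assumptions, not existing garak code:

```python
# Hypothetical sketch: validate that a proxies value is a dictionary
# mapping protocol to the URL of the proxy, with each URL a parseable URI.
from urllib.parse import urlparse


def validate_proxies(proxies):
    """Raise ValueError unless proxies is a dict of protocol -> proxy URL."""
    if not isinstance(proxies, dict):
        raise ValueError("proxies must be a dictionary mapping protocol to the URL of the proxy")
    for protocol, url in proxies.items():
        parsed = urlparse(url)
        # A usable proxy URI needs both a scheme and a host component
        if not (parsed.scheme and parsed.netloc):
            raise ValueError(f"proxies[{protocol!r}] is not a valid URI: {url!r}")
    return proxies
```

Note this catches the easy-to-make mistake of omitting the `//`, e.g. "https:10.10.1.1:8443", which parses with an empty host.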
