| @@ -0,0 +1,3 @@ | |||
| SOURCE: gmail | |||
| TARGET_HOST: gmail.googleapis.com | |||
| OAUTH_SCOPES: https://www.googleapis.com/auth/gmail.metadata | |||
These files populate environment variables for the cloud function; in essence, they are a config.
README.md
Outdated
| ## Development | ||
|
|
||
| Can run locally via IntelliJ + maven, using run config: | ||
| - `psoxy - run gmail` (located in `.idea/runConfigurations`) |
aperez-worklytics
left a comment
No blockers, just some comments. Impressive work, congrats!
| public class PrebuiltSanitizerOptions { | ||
|
|
||
| static final Sanitizer.Options GMAIL_V1 = Sanitizer.Options.builder() | ||
| .pseudonymization(Pair.of("\\/gmail\\/v1\\/users\\/.*?\\/messages\\/.*", |
As we discussed, and maybe not for the PoC, but this would be better if parametrized in files.
| */ | ||
| @Builder | ||
| @Value | ||
| public class Pseudonym { |
Again, not for now, but we could create a shared package for the pseudonym logic so we can reuse the hash-creation code between psoxy and Worklytics.
Yes, probably sooner rather than later.
| import java.util.stream.Collectors; | ||
|
|
||
| @Log | ||
| public class Route implements HttpFunction { |
Well, the function is not actually routing, since we are transforming the results; I'd rename it to Sanitize.
🤔 Well, it does something beyond sanitizing:
- routes the request to the source API (removes the host + cloud function name; adds the source API host)
- adds authentication for the source, as needed
I think Route is OK; other names could be Transponder (transmits and responds, more or less) or ApiCallTranslator (probably more accurate than router).
| com.google.api.client.http.HttpRequest sourceApiRequest = | ||
| requestFactory.buildGetRequest(targetUrl); | ||
|
|
||
| //TODO: what headers to forward??? |
Maybe that could be included as part of the config.
| .setConnectTimeout(SOURCE_API_REQUEST_CONNECT_TIMEOUT) | ||
| .setReadTimeout(SOURCE_API_REQUEST_READ_TIMEOUT); | ||
|
|
||
| //q: add exception handlers for IOExceptions / HTTP error responses, so those retries |
Actually, it should happen in both, but for different purposes: the proxy should check its request to the source, and Worklytics should check its connection to the proxy as it would for any other external endpoint.
The proxy should return the status of the source request, as is done in L132; if the source request still fails after the proxy has used up its retries, I'd return a 500 or 502 plus a header. In Worklytics, that status code together with the header would tell us not to retry the connection, as something went wrong.
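To make the suggestion above concrete, here is a minimal sketch of "retry, then return 502 plus a header". Everything in it is hypothetical: the `SourceCall` interface, the `ProxyResult` holder, and the header name `X-Psoxy-Source-Retries-Exhausted` are illustrative and not taken from this PR. It also retries immediately; a real version would probably add backoff.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the PR.
final class SourceCallWithRetries {

    /** Minimal stand-in for a call to the source API that yields an HTTP status code. */
    interface SourceCall {
        int execute() throws IOException;
    }

    /** Status plus headers that the proxy would send back to its caller. */
    static final class ProxyResult {
        final int statusCode;
        final Map<String, String> headers = new LinkedHashMap<>();

        ProxyResult(int statusCode) {
            this.statusCode = statusCode;
        }
    }

    // Assumed header name, chosen for illustration only.
    static final String RETRIES_EXHAUSTED_HEADER = "X-Psoxy-Source-Retries-Exhausted";

    static ProxyResult callSource(SourceCall call, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                int status = call.execute();
                if (status < 500) {
                    // Success or client error: pass the source's status through unchanged.
                    return new ProxyResult(status);
                }
                // 5xx from the source: fall through and try again.
            } catch (IOException e) {
                // Connection-level failure: fall through and try again.
            }
        }
        // Every attempt failed: report a gateway error and flag that retries were already spent.
        ProxyResult result = new ProxyResult(502);
        result.headers.put(RETRIES_EXHAUSTED_HEADER, "true");
        return result;
    }
}
```

With something like this, the Worklytics side could treat a 502 carrying the header as "do not retry", and still retry on plain connection failures to the proxy itself.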
|
|
||
| Configuration jsonConfiguration; | ||
|
|
||
| public void initConfiguration() { |
Hmm, it would be better if this were the class constructor rather than an init method that someone has to remember to call.
| // return response | ||
| response.setStatusCode(sourceApiResponse.getStatusCode()); | ||
|
|
||
| if (sourceApiResponse.getStatusCode() == HttpStatusCodes.STATUS_CODE_OK) { |
Maybe better with Response.Status.Family.familyOf(sourceApiResponse.getStatusCode()).equals(Response.Status.Family.SUCCESSFUL), as it will cover all 2xx statuses.
Which package is this from? Base Java, or do we need to add a dependency?
This? https://docs.oracle.com/javaee/7/api/javax/ws/rs/core/Response.html
That requires us to bring in the JAX-RS API. Not sure that's better than just checking the 200-299 range.
I do question whether it's what we want: 204 is conventionally 'No Content', so will our attempt to parse the content crash?
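For reference, the dependency-free alternative discussed here could look roughly like the sketch below. The class and method names (`StatusChecks`, `isSuccessful`, `mayHaveBody`) are made up for illustration; the idea is to treat "is 2xx" and "has a body worth parsing" as two separate checks, which sidesteps the 204 concern.

```java
// Hypothetical helper; names are illustrative, not from the PR.
final class StatusChecks {

    private StatusChecks() {
    }

    /** True for any 2xx status, with no JAX-RS dependency needed. */
    static boolean isSuccessful(int statusCode) {
        return statusCode >= 200 && statusCode < 300;
    }

    /**
     * True when a successful response can be expected to carry content.
     * 204 (No Content) and 205 (Reset Content) are successful but have no body,
     * so the sanitizer should not try to parse them.
     */
    static boolean mayHaveBody(int statusCode) {
        return isSuccessful(statusCode) && statusCode != 204 && statusCode != 205;
    }
}
```

The handler could then sanitize only when `mayHaveBody(...)` holds, and simply pass the status through for an empty 2xx response.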
| } else { | ||
| //write error, which shouldn't contain PII, directly | ||
| //TODO: could run this through DLP to be extra safe | ||
| sourceApiResponse.getContent().transferTo(response.getOutputStream()); |
Add a log here to capture why the request was not successful; the customer could then check the function logs for more details.
|
|
||
|
|
||
| GenericUrl buildTarget(HttpRequest request) { | ||
| String targetUri = "https://" |
As Jose mentioned yesterday, this would be better with a uriBuilder.
OK, I'll improve it a bit by using URL, but URLBuilder (java) would require us to parse the request's query string and then add it back via addParam, which seems pointless.
URIBuilder is from Apache http-client, which isn't currently a dependency of this repo.
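One middle ground that needs no new dependency is to assemble the string and let `java.net.URI` validate it, keeping the raw query untouched. The sketch below is an assumption about how that might look, not the code in this PR: `TargetUrls`, its parameters, and the example function name `psoxy-gmail` are hypothetical, and unlike the current `replace(...)` call it strips the function name only when it is a leading path prefix.

```java
import java.net.URI;
import java.util.Optional;

// Hypothetical sketch; class, method, and parameter names are illustrative.
final class TargetUrls {

    private TargetUrls() {
    }

    static URI buildTarget(String targetHost,
                           String functionName,
                           String requestPath,
                           Optional<String> rawQuery) {
        // Strip "/<functionName>" only when it is the leading path segment.
        String prefix = "/" + functionName;
        String path = requestPath.startsWith(prefix + "/")
                ? requestPath.substring(prefix.length())
                : requestPath;

        // Reattach the raw query as-is, so existing percent-encoding is preserved.
        String query = rawQuery.map(q -> "?" + q).orElse("");

        // URI.create throws IllegalArgumentException if the result is not a valid URI.
        return URI.create("https://" + targetHost + path + query);
    }
}
```

The multi-argument `URI` constructors are avoided on purpose, since they always quote `%` and would double-encode a query string that is already encoded.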
|
|
||
| Pseudonym.PseudonymBuilder builder = Pseudonym.builder(); | ||
| String canonicalValue; | ||
| //q: this auto-detect a good idea? Or invert control and let caller specify with a header |
At the moment it is, I think. Let's see how this evolves; as you say, we could probably let the caller specify what they want in a header and, if nothing is specified, fall back to the default, which is what is implemented here.
|
|
||
| project_id = var.project_id | ||
| connector_service_account_id = "psoxy-gmail-dwd" | ||
| display_name = "Psoxy Connector - GMail Dev Erik" |
Use the developer name as a variable? Replace it wherever "Erik" appears (maybe in another PR).
|
|
||
| # todo needed bc as of Sept 2021, no way to expose secret via Cloud Function Maven plugin or | ||
| # terraform | ||
| resource "local_file" "todo" { |
| "per connection" basis (connection in this context is defined as the abstract concept of ongoing | ||
| data import from a data source to a Worklytics account). |
Maybe worth adding a glossary markdown file?
| <gcpProjectId>psoxy-dev-erik</gcpProjectId> | ||
| <sourceApi>gmail</sourceApi> | ||
| <!-- serviceAccount that function will run as. For Google Workspace use cases, should be the | ||
| one configured as OAuth Client and authorized for your instance. --> | ||
| <serviceAccount>psoxy-gmail-dwd@psoxy-dev-erik.iam.gserviceaccount.com</serviceAccount> |
All the "erik"-related values are OK for ease of development, but they should be removed and replaced with parameters in the future.
| final int SOURCE_API_REQUEST_CONNECT_TIMEOUT = 30_000; | ||
| final int SOURCE_API_REQUEST_READ_TIMEOUT = 300_000; |
Given the magnitude, I assume these are millis, but it would be good to document that or add it as a suffix to the names.
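Two ways this could be made explicit are sketched below: a `_MILLIS` suffix on the existing constants, or `java.time.Duration` constants converted at the call site. The class name `SourceApiTimeouts` and the `toMillis` helper are hypothetical; the values are the ones from the diff, on the assumption that they are indeed milliseconds.

```java
import java.time.Duration;

// Hypothetical sketch; the class and helper names are illustrative.
final class SourceApiTimeouts {

    private SourceApiTimeouts() {
    }

    // Option 1: keep int constants, but put the unit in the name.
    static final int SOURCE_API_REQUEST_CONNECT_TIMEOUT_MILLIS = 30_000;
    static final int SOURCE_API_REQUEST_READ_TIMEOUT_MILLIS = 300_000;

    // Option 2: carry the unit in the type and convert where an int is required.
    static final Duration CONNECT_TIMEOUT = Duration.ofSeconds(30);
    static final Duration READ_TIMEOUT = Duration.ofMinutes(5);

    /** Converts to an int of milliseconds; throws ArithmeticException on overflow. */
    static int toMillis(Duration timeout) {
        return Math.toIntExact(timeout.toMillis());
    }
}
```

Option 2 would be used as `.setConnectTimeout(SourceApiTimeouts.toMillis(SourceApiTimeouts.CONNECT_TIMEOUT))`, since the setters in the diff take an int.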
| void initSanitizer() { | ||
| //TODO: pull salt from Secret Manager | ||
| sanitizer = new SanitizerImpl( | ||
| PrebuiltSanitizerOptions.MAP.get(getRequiredConfigProperty(ConfigProperty.SOURCE)) | ||
| .withPseudonymizationSalt("salt") | ||
| ); | ||
| } |
Probably done already, but remember the static initializer to provide global context
| String path = | ||
| request.getPath() | ||
| .replace(System.getenv(RuntimeEnvironmentVariables.K_SERVICE.name()) + "/", "") | ||
| + request.getQuery().map(s -> "?" + s).orElse(""); |
jlorper
left a comment
Some comments, sorry for the late review
Initial push of working version into psoxy GitHub repo. Includes: