-
Notifications
You must be signed in to change notification settings - Fork 153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add atlassian.py for Jira Data Center DLS #2161
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, left some comments about importing / keeping it DRY.
connectors/sources/atlassian.py
Outdated
@@ -17,6 +17,11 @@ | |||
RETRIES = 3 | |||
RETRY_INTERVAL = 2 | |||
USER_BATCH = 50 | |||
DATA_CENTER_USER_BATCH = 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does DATA_CENTER_USER_BATCH
specify exactly? Is this the max amount of users we can fetch in one request? Is it the same value as in jira.py?
connectors/connectors/sources/jira.py
Line 43 in 1d4fb78
MAX_USER_FETCH_LIMIT = 1000 |
If they're the same value, I suggest having it defined in this file (atlassian.py) and importing it into any sources that use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also confused about DATA_CENTER_USER_BATCH
, specifically because below it seems to be not for DataCenter, but for anything non-cloud:
f"{url}?startAt={start_at}"
if self.source.configuration["data_source"]
in [JIRA_CLOUD, CONFLUENCE_CLOUD]
else url.format(start_at=start_at, max_results=DATA_CENTER_USER_BATCH)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they're the same value, I suggest having it defined in this file (atlassian.py) and importing it into any sources that use it.
The value is definitely same but the purpose is different at
connectors/connectors/sources/jira.py
Line 43 in 1d4fb78
MAX_USER_FETCH_LIMIT = 1000 |
atlassian.py
, the value is used for fetching just non-cloud users.
I'm also confused about DATA_CENTER_USER_BATCH, specifically because below it seems to be not for DataCenter, but for anything non-cloud:
renaming the constant to get more clear idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they're the same value, I suggest having it defined in this file (atlassian.py) and importing it into any sources that use it.
The value is definitely same but the purpose is different at
connectors/connectors/sources/jira.py
Line 43 in 1d4fb78
MAX_USER_FETCH_LIMIT = 1000 |
atlassian.py
, the value is used for fetching just non-cloud users.
I'm also confused about DATA_CENTER_USER_BATCH, specifically because below it seems to be not for DataCenter, but for anything non-cloud:
renaming the constant to get more clear idea.
@@ -135,7 +152,7 @@ async def user_access_control_doc(self, user): | |||
ACCESS_CONTROL: [<prefixed_account_id>, <prefixed_group_ids>, <prefixed_role_keys>] | |||
} | |||
""" | |||
account_id = user.get("accountId") | |||
account_id = user.get("accountId") or user.get("name") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is user.get("name")
guaranteed to be unique? Looks like account_id
ends up being used as the access control document's _id
.
Is this added because accountId
is sometimes missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is user.get("name") guaranteed to be unique?
Yes, it is unique for all users.
Is this added because accountId is sometimes missing?
Exactly, accountId
is missing while fetching users for non-cloud.
if ( | ||
self.source.configuration["data_source"] in [JIRA_CLOUD, CONFLUENCE_CLOUD] | ||
and user_info.get("accountType") != "atlassian" | ||
): | ||
self.source._logger.debug( | ||
f"Skipping {user_name} because the account type is {user_info.get('accountType')}. Only 'atlassian' account type is supported." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this log message get updated too? Since now the account type could be atlassian
, but the data source config is non-cloud.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is designed to be logged for just cloud data source according to this condition
self.source.configuration["data_source"] in [JIRA_CLOUD, CONFLUENCE_CLOUD]
Previously, DLS was designed to just cloud platform so did not have this condition but now we've explicitly added it just to exclude non-cloud data source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks 🚀
Part Of #1957
Related PR #2108
This PR contain the changes in atlassian.py file for Jira Data Center DLS
Checklists
Pre-Review Checklist
v7.13.2
,v7.14.0
,v8.0.0
)