Create plan for Infrastructure Improvements Step 1: Create Roadmap #378

Open
4 tasks done
jkarpen opened this issue Sep 17, 2024 · 13 comments
Comments

@jkarpen
Collaborator

jkarpen commented Sep 17, 2024

This can wait until Ian returns, since he can help guide what to prioritize here. The main pain points have been cost management and the SCIM rollout (see Okta-related issues), so possibly focus there first, but confirm with Ian. The goals are:

  • Identify any issues that are outdated/no longer needed and can be closed
  • Create issues for topics that should be addressed and do not have one already
  • Identify which issues to tackle first in the near term
  • Create a roadmap for when to tackle remaining issues
@ram-kishore-odi
Contributor

Hello Everyone,

I reviewed the backlog items under each section and classified them into groups so that we can discuss them as a team and plan the next steps.

#Infrastructure

Future work items

  1. Configure and set up fivetran and dbt
  2. Set up Zoom connector on Fivetran to RAW_PRD with schema odi_zoom
  3. draft future security policy for dbt, snowflake, fivetran
  4. Investigate options for Azure for future projects - Caltrans in particular
  5. Review documentation for new project setup
  6. Document new project setup for Fivetran
  7. Add Project to Sprint Summary table
  8. Capture dbt audit logs in IT-Ops audit platform

Already WIP

  1. Test Asana for Project Management
  2. Investigate notifications to a group when nightly job fails
  3. Investigate options for dbt failure notifications
  4. added Snowflake OAuth instructions and fixed many case, spelling, and grammars errors

#Infrastructure - Okta Rollout

Future work items

  1. Document Okta-related processes
  2. Update dbt set up docs to include Snowflake OAuth
  3. Walk through onboarding/offboarding Okta process with Kevin (an initial discussion has taken place)

Need to confirm if these are still relevant

  1. Investigate using SSO for AWS CLI and SDK access
  2. Consider using IAM roles for user access
  3. Troubleshoot python errors encountered passing Okta url to snowflake connector function

Already WIP

  1. Consider best practices for Snowflake "break glass" account
  2. Implement Okta SCIM provisioning for our Snowflake accounts

#Infrastructure - Cost Management

Future work items

  1. Dashboard Snowflake Query Costs

Need to confirm if these are still relevant

  1. Set up Fivetran Platform Connector & Dashboard

#Infrastructure - Develop Platform Management Processes

Future work items

  1. Document approach to service accounts in Snowflake
  2. Obtain/review existing documentation template

Need to confirm if these are still relevant

  1. Create incident management system in airtable

#Infrastructure - Orchestration

Need to confirm if these are still relevant

  1. Investigate orchestration options. (This also may require redefining current objectives)

#Infrastructure - Project Templates

Future work items

  1. Validate dbt Cloud CI with Azure DevOps
  2. Create ODI Azure DevOps org
  3. Migrate usage of snowflake_user to snowflake_service_user and snowflake_legacy_service_user

Need to confirm if these are still relevant

  1. Create milestone/issue template

Already WIP

  1. Evaluate options for static docs within Azure DevOps
  2. Connect Azure DevOps MDSA project repo to dbt Cloud
  3. Convert pre-commit GitHub action to Azure Pipelines
  4. Create narrative description of handoff steps in project docs

Completed

  1. Implement workaround for sharing views in terraform configuration
  2. Create more warehouse size options in "ELT" terraform module

#Infrastructure - AWS

Future work items

  1. Improve AWS logging practices
  2. Set up production AWS account

Need to confirm if these are still relevant

  1. Astronomer-on-AWS proof-of-concept

#Infrastructure - Pain Points

Future work items

  1. Improve Batch CI/CD
  2. Figure out how to test template actions
  3. Add ability to customize "ELT" terraform module

Need to confirm if these are still relevant

  1. Should we create a terraform package for our Snowflake ELT setup?
  2. Should we rename our GCP project?

@ram-kishore-odi
Contributor

In order to complete the individual tasks in this story, a follow-up discussion with Ian and the team will be necessary.

@ram-kishore-odi
Contributor

Hi @ian-r-rose,

Can you please look at the classification of the stories when you get a chance (especially the ones under the "Need to confirm if these are still relevant" sections)? With your feedback, it will be easier to clean up the backlog and prioritize the remaining stories for the near-term roadmap.

It is fine if you want to discuss these in our sprint planning meeting tomorrow. I hope there will be time to discuss these.

@ian-r-rose
Member

Let's discuss in more detail during sprint planning! Some initial thoughts below:

#Infrastructure - Okta Rollout

Need to confirm if these are still relevant

1. [Investigate using SSO for AWS CLI and SDK access](https://github.com/cagov/data-infrastructure/issues/118)

2. [Consider using IAM roles for user access](https://github.com/cagov/data-infrastructure/issues/128)

3. [Troubleshoot python errors encountered passing Okta url to snowflake connector function](https://github.com/cagov/data-infrastructure/issues/289)

Yes, I think these are still relevant. I think Kevin is actually interested in working on Okta+AWS in the new year, you might ask what his plans are there.

#Infrastructure - Cost Management

Need to confirm if these are still relevant

1. [Set up Fivetran Platform Connector & Dashboard](https://github.com/cagov/data-infrastructure/issues/167)

I think this one is superseded by #430 and other issues in the cost tracking milestone, and can be closed.

#Infrastructure - Develop Platform Management Processes

Future work items

1. [Document approach to service accounts in Snowflake](https://github.com/cagov/data-infrastructure/issues/201)

2. [Obtain/review existing documentation template](https://github.com/cagov/data-infrastructure/issues/281)

Need to confirm if these are still relevant

1. [Create incident management system in airtable](https://github.com/cagov/data-infrastructure/issues/26)

I think we can close this as not planned right now. We may revisit incident response at a later date, but for now we are not the long-term holder of critical infrastructure.

#Infrastructure - Orchestration

Need to confirm if these are still relevant

1. [Investigate orchestration options](https://github.com/cagov/data-infrastructure/issues/4). (This also may require redefining current objectives)

Let's close this as complete for now. We may revisit orchestration options at some point, but would probably start with a new set of tasks.

#Infrastructure - Project Templates

Need to confirm if these are still relevant

1. [Create milestone/issue template](https://github.com/cagov/data-infrastructure/issues/240)

We can keep this in the backlog for now.

#Infrastructure - AWS

Need to confirm if these are still relevant

1. [Astronomer-on-AWS proof-of-concept](https://github.com/cagov/data-infrastructure/issues/51)

I think we can close this as not planned for the time being.

#Infrastructure - Pain Points

Need to confirm if these are still relevant

1. [Should we create a terraform package for our Snowflake ELT setup?](https://github.com/cagov/data-infrastructure/issues/109)

Curious what you think of this one @ram-kishore-odi. We've been using our module's URL in github (plus a commit hash) as an ersatz package for a bit. Do you feel that's working well enough? Or is it worth the effort to publish a more "official" package?

2. [Should we rename our GCP project?](https://github.com/cagov/data-infrastructure/issues/74)

Let's close this one as not planned. We don't have much in GCP anymore.

@ram-kishore-odi
Contributor

Thank you so much for your feedback, @ian-r-rose! I will act on the tickets as suggested.

Hi @jkarpen, we may not need the separate meeting I requested in our sprint planning meeting to discuss these. I now have sufficient information to proceed.

Hi @ian-r-rose, with respect to "Should we create a terraform package for our Snowflake ELT setup?", I think using the module's URL in GitHub (plus a commit hash) is working well enough for now. We can certainly think about creating an official package after the terraform snowflake provider becomes more stable, or after its next major release, such as 1.0 or later.

@ram-kishore-odi
Contributor

Closed the task - [ Create issues for topics that should be addressed and do not have one already ] as new stories have been added to the backlog.

@ram-kishore-odi
Contributor

Hi @ian-r-rose / @jkarpen - Here is the first version of the tentative roadmap. It can be finalized and updated after review.

Thank you for your suggestion, @ian-r-rose! I have also included tentative timelines at the end of the roadmap.

As I mentioned in stand-up today (1/3/25), I tried to add this information to Coda here - https://coda.io/d/_dY5Oul8-jdi/Supporting-Data_suUkMYYs

Once the review is complete, can you please create a placeholder page for this information?

Since I started, a few of the stories have moved to in progress or have been completed. Once this content is in Coda, it can be updated continuously to stay current.

Please let me know if you have any questions.

@ram-kishore-odi
Contributor

Included the tentative roadmap and closed the 4th task - [ Create a roadmap for when to tackle remaining issues ]

@jkarpen
Collaborator Author

jkarpen commented Jan 3, 2025

Hi @ram-kishore-odi, it looks like I don't currently have access to the roadmap doc.

Also I thought the documentation you want to add to Coda is in a different issue? Or you want to add this roadmap to Coda?

I'm going to send you an invite for a short working session on Monday, I can create the Coda pages you need.

@ram-kishore-odi
Contributor

ram-kishore-odi commented Jan 3, 2025 via email

@ian-r-rose
Member

> Also I thought the documentation you want to add to Coda is in a different issue? Or you want to add this roadmap to Coda?

I also thought the documentation you wanted to add was for Okta SCIM/SSO, rather than the roadmap here. I think we could do most of the milestone planning here using GitHub milestones, issues, and boards.

@ram-kishore-odi
Contributor

Initially I wanted to add both the high-level plan and the Okta documentation to Coda. Even though we use GitHub, I felt it would be good to have a high-level living document in Coda that can serve as a reference for planned infrastructure improvements and the MDSA stack in the near future. I'm not sure it would be easy to show a top-level view of this information in GitHub. But if the preference is to do it in GitHub, I have no concerns.

Also, Josh set up a brief meeting today to discuss this topic and to create a placeholder document for the Okta SCIM/SSO documentation. I will update the document and send it for review shortly.

@ian-r-rose
Member

> Initially I wanted to add both the high-level plan and the Okta documentation to Coda. Even though we use GitHub, I felt it would be good to have a high-level living document in Coda that can serve as a reference for planned infrastructure improvements and the MDSA stack in the near future. I'm not sure it would be easy to show a top-level view of this information in GitHub. But if the preference is to do it in GitHub, I have no concerns.

I have no objection to having a high-level roadmap in Coda, I think it sounds like a good idea. The only thing that would make me nervous is if we start making lists of GitHub issues, which to me sounds like a lot of work to keep up-to-date, and is better left for a GitHub milestone. But as long as it's more narrative it sounds like a useful page.
