Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate publishing builds to Azure #117891

Closed
1 of 2 tasks
joaomoreno opened this issue Mar 1, 2021 · 3 comments · Fixed by #122886
Closed
1 of 2 tasks

Consolidate publishing builds to Azure #117891

joaomoreno opened this issue Mar 1, 2021 · 3 comments · Fixed by #122886
Assignees
Labels
engineering VS Code - Build / issue tracking / etc. insiders-released Patch has been released in VS Code Insiders
Milestone

Comments

@joaomoreno
Copy link
Member

joaomoreno commented Mar 1, 2021

Until very recently, each build job was directly publishing builds to Azure Storage and CosmosDB. Recently, we made build jobs publish builds as build artifacts, such that it's straightforward for further jobs to download them and use them (ie test build jobs).

We could further:

  • Move all publishing to the Release job. It can simply download the builds as artifacts and upload them to Azure Storage and Mooncake and update CosmosDB. With this we can shave some seconds off each platform agent, since they wouldn't reach out to Azure Storage or CosmosDB.
  • Remove all CosmosDB stored procedures in use. They were introduced due to issues surrounding concurrent writes to CosmosDB from multiple clients. Since we'd have a single client, this would be straightforward.

cc @lszomoru

@joaomoreno joaomoreno added the engineering VS Code - Build / issue tracking / etc. label Mar 1, 2021
@joaomoreno joaomoreno changed the title Consolidate publishing builds to CosmosDB Consolidate publishing builds to Azure Mar 1, 2021
@lszomoru
Copy link
Member

lszomoru commented Mar 3, 2021

As part of this work, we should also improve the error message in case a build already exists. There have been multiple cases over the last several months in which a stable/insiders build fails due to the fact that a build for the commit in question has already been published.

@TylerLeonhardt TylerLeonhardt added this to the April 2021 milestone Apr 1, 2021
@TylerLeonhardt
Copy link
Member

So I ran into a bit of a roadblock with this one... the current createAsset.ts takes in quite a few parameters that are really quite specific to where they were run... for example in linux:

node build/azure-pipelines/common/createAsset.js "server-$PLATFORM_LINUX-web" archive-unsigned "$SERVER_TARBALL_FILENAME" "$SERVER_TARBALL_PATH"

node build/azure-pipelines/common/createAsset.js "$PLATFORM_DEB" package "$DEB_FILENAME" "$DEB_PATH"

while in macOS:

node build/azure-pipelines/common/createAsset.js \
        "$ASSET_ID" \
        archive \
        "VSCode-$ASSET_ID.zip" \
        ../VSCode-darwin-$(VSCODE_ARCH).zip

while in Windows:

exec { node build/azure-pipelines/common/createAsset.js "$AssetPlatform" setup "VSCodeSetup-$Arch-$Version.exe" $SystemExe }

The type is something that requires a lot of context that isn't currently available outside of the stage that createAsset.js was called. As such, if a separate stage were to handle this, and having only the Artifact name as means of communicating parameters to the stage, that would mean that the artifact name needs to be changed to something like:

vscode-(client|server)-(platform)-(type)

and then the actual file inside the artifact (for example: "vscode-server-win32-arm64.zip") would be the file name.

Is that... a good thing? I'm not so sure. Maybe it's fine.

This task is stretching what Azure Pipelines was made to do... the logic would basically be:

      $set = [System.Collections.Generic.HashSet[string]]::new()
      $currentCount = 0

      while ($NUMBER_OF_EXPECTED_ARTIFACTS -gt $currentCount) {
        try {
          # Query the API for the artifacts for the current build
          $res = Invoke-RestMethod "https://dev.azure.com/monacotools/Monaco/_apis/build/builds/$(Build.BuildId)/artifacts?api-version=6.0" -Headers @{
            Authorization = "Bearer $env:SYSTEM_ACCESSTOKEN"
          } -RetryIntervalSec 5 -MaximumRetryCount 5
        } catch {
          Write-Warning $_
          $res = $null
        }

        if ($res -and $res.count) {
          $currentCount = $res.count
          $res.value | ForEach-Object {
            $name = $_.name
            if($name -match 'vscode-*' -and $set.Add($name)) {
              # Download the artifact as a zip
              Invoke-RestMethod $_.resource.downloadUrl -OutFile "$(Agent.TempDirectory)/$name.zip"

              # Get parameters from artifact name
              $vscode, $clientOrServer, $platform, $type = $name.Split('-')

              # Get file name
              Expand-Archive "$(Agent.TempDirectory)/$name.zip" -DestinationPath "$(Agent.TempDirectory)/$name"
              $fileName = Get-ChildItem "$(Agent.TempDirectory)/$name"

              $env:VSCODE_MIXIN_PASSWORD = "$(github-distro-mixin-password)"
              $env:AZURE_DOCUMENTDB_MASTERKEY = "$(builds-docdb-key-readwrite)"
              $env:AZURE_STORAGE_ACCESS_KEY = "$(ticino-storage-key)"
              $env:AZURE_STORAGE_ACCESS_KEY_2 = "$(vscode-storage-key)"
              node build/azure-pipelines/common/createAsset.js `
                $clientOrServer `
                $platform `
                $type `
                $fileName.Name `
                $fileName.FullName
            }
          }
        }

        # wait 10s before polling again
        Start-Sleep -Seconds 10
      }

The unfortunate part is that this would:

  1. have to download the script via PowerShell or wget instead of using Azure Pipelines download concepts
  2. it's a little convoluted IMO
  3. Each stage still has to publish the artifacts with the correct name otherwise things go sidewise
  4. When do we stop polling? We'd have to figure out how many artifacts we expect at the very end

We could still go this route, as it would allow us to move off of storedprocs and only the Release stage will be running these JS files, but ... I do wonder if the extra complexity is worth it.

@joaomoreno
Copy link
Member Author

joaomoreno commented Apr 6, 2021

Is that... a good thing? I'm not so sure. Maybe it's fine.

Yeah I was thinking we could do just that. Either encode the info in the asset filename, or include some data (eg JSON) along with the actual file inside the artifact zip, which would tell createAsset (or its caller) exactly what to do.

have to download the script via PowerShell or wget instead of using Azure Pipelines download concepts

This is OK, we have a PS champ in the team now. 🏆

it's a little convoluted IMO

You actually made it look concise and readable, I really like it.

Each stage still has to publish the artifacts with the correct name otherwise things go sidewise

I think it's equivalent to today's "each call to createAsset.js needs to use the correct name otherwise things to sidewise"

When do we stop polling? We'd have to figure out how many artifacts we expect at the very end

Right, this is the interesting part. Also how do we make it work when reruning failed jobs? Maybe we can have it poll for how many jobs are still running? When that reaches 1 (the current) then we assume there's nothing else to do.

We could still go this route, as it would allow us to move off of storedprocs and only the Release stage will be running these JS files, but ... I do wonder if the extra complexity is worth it.

I really like the fact that (1) all publishing logic happens in one place, (2) every build output is forced to exist as a TFS asset, (3) publishing can be parallelized with other tasks and (4) we can potentially include in here other publishing tasks like Mooncake. The extra complexity is fine to have, as long as we keep it well oiled.


On the actual script, I would actually make createAsset.js run in the background, ie the PowerShell script wouldn't wait for it to finish before retuning to the while loop. Though the PowerShell script would ultimately have to wait for all background invocations to end, after exiting that while loop.

@joaomoreno joaomoreno modified the milestones: April 2021, May 2021 Apr 26, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Jun 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
engineering VS Code - Build / issue tracking / etc. insiders-released Patch has been released in VS Code Insiders
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants
@joaomoreno @TylerLeonhardt @lszomoru and others