DEADLINE_EXCEEDED makes the application stop receiving messages #770

Closed
mahaben opened this issue Oct 4, 2019 · 99 comments · Fixed by #772

Labels: api: pubsub · needs more info · priority: p1 · type: bug

Comments

mahaben commented Oct 4, 2019

Environment details

Node.js version: v12.7.0
npm version: 6.10.0
@google-cloud/pubsub version: "^1.0.0",

Error:

insertId: "gnr3q1fz7eerd"
jsonPayload: {
  level: "error"
  message: "unhandledRejection"
  originalError: {
    ackIds: [1]
    code: 4
    details: "Deadline exceeded"
  }
}

After receiving this error, the app does not receive messages anymore, and we have to exit the application to recreate the Kubernetes pod.

Any help would be appreciated!

@bcoe bcoe added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p2 Moderately-important priority. Fix may not be included in next release. needs more info This issue needs more information from the customer to proceed. labels Oct 4, 2019
bcoe (Contributor) commented Oct 4, 2019

Hey @mahaben, did this issue start happening recently?

mahaben (Author) commented Oct 5, 2019

Hey @bcoe, it happened for the first time after upgrading @google-cloud/pubsub to ^1.0.0. Any workaround to recreate the subscription after this error?
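
For illustration, a minimal sketch of one possible workaround pattern (not an official fix): listen for the subscription's 'error' event, close the broken stream, and re-create the listener. The subscription name and handler below are placeholders.

// Sketch only: re-create the subscription listener after a fatal error.
// `subscriptionName` and `handleMessage` are placeholders for your own values.
const {PubSub} = require('@google-cloud/pubsub');

const pubsub = new PubSub();

function listen(subscriptionName, handleMessage) {
  const subscription = pubsub.subscription(subscriptionName);
  subscription.on('message', handleMessage);
  subscription.on('error', err => {
    console.error('Subscription error, re-creating listener:', err);
    // Drop the broken streaming pull and start a fresh one after a short delay.
    subscription.removeAllListeners();
    subscription.close().catch(() => {});
    setTimeout(() => listen(subscriptionName, handleMessage), 5000);
  });
}

listen('my-subscription', message => {
  console.log('Received message', message.id);
  message.ack();
});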

@hx-markterry

We used to see those error messages; now we see the following error in all of our projects that use the library:

Error: Failed to connect to channel. Reason: Failed to connect before the deadline
    at MessageStream._waitForClientReady (/src/node_modules/@google-cloud/pubsub/build/src/message-stream.js:318:19)

WaldoJeffers commented Oct 7, 2019

I can confirm this: after upgrading to PubSub ^1.0.0, all our services stop publishing messages once the error occurs.

The full stack trace is:

Retry total timeout exceeded before any response was received Error: Retry total timeout exceeded before any response was received
    at repeat (/app/node_modules/@google-cloud/pubsub/node_modules/google-gax/build/src/normalCalls/retries.js:80:31)
    at Timeout.setTimeout [as _onTimeout] (/app/node_modules/@google-cloud/pubsub/node_modules/google-gax/build/src/normalCalls/retries.js:113:25)
    at ontimeout (timers.js:436:11)
    at tryOnTimeout (timers.js:300:5)
    at listOnTimeout (timers.js:263:5)
    at Timer.processTimers (timers.js:223:10) 

Can I suggest raising the priority on this issue?

maxmoeschinger commented Oct 7, 2019

None of our services using PubSub are working anymore either. We are using version 1.1.0 and getting this:

Error: Retry total timeout exceeded before any response was received
    at repeat (/var/www/app/node_modules/google-gax/build/src/normalCalls/retries.js:80:31)
    at Timeout.setTimeout [as _onTimeout] (/var/www/app/node_modules/google-gax/build/src/normalCalls/retries.js:113:25)
    at ontimeout (timers.js:436:11)
    at tryOnTimeout (timers.js:300:5)
    at listOnTimeout (timers.js:263:5)
    at Timer.processTimers (timers.js:223:10)

And this:

Error: Failed to connect to channel. Reason: Failed to connect before the deadline
  File "/var/www/app/node_modules/@google-cloud/pubsub/build/src/message-stream.js", line 318, col 19, in MessageStream._waitForClientReady
    throw new ChannelError(e);

We have to restart our services every 10 minutes because of that.

It also seems like it is storing more and more to disk, as disk usage goes up over time.

@bcoe bcoe added priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. and removed priority: p2 Moderately-important priority. Fix may not be included in next release. labels Oct 7, 2019
ddehghan commented Oct 7, 2019

We are also hitting this. It happens after an hour or two, and our publishing stops completely.

My only suspicion was that since we created and cached the topic object in our constructor, the topic was timing out. We changed our implementation to call publish like this:

pubsub.topic('xx').publish()

Now I am running some tests to see if that was it or not. If not, I am out of ideas, since our code matches the examples in this repo.

Our platform is node 12 on alpine on GKE.
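
For context, a minimal sketch of the per-publish pattern described above (getting a fresh topic handle on each publish instead of caching one in a constructor). The topic name and payload are placeholders.

// Sketch only: obtain the topic handle per publish instead of caching it.
// pubsub.topic() is a cheap local call and does not hit the API by itself.
const {PubSub} = require('@google-cloud/pubsub');

const pubsub = new PubSub();

async function publishEvent(topicName, payload) {
  const data = Buffer.from(JSON.stringify(payload));
  const messageId = await pubsub.topic(topicName).publish(data);
  return messageId;
}

publishEvent('xx', {hello: 'world'}).catch(console.error);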

pworkpop commented Oct 7, 2019

Seeing the same error (Error: Failed to connect to channel. Reason: Failed to connect before the deadline at MessageStream._waitForClientReady) with @google-cloud/pubsub 0.31.1, with the same outcome: the application cannot receive messages. Does subscription.close().then(() => subscription.open()); help in this case?

Tolgor commented Oct 7, 2019

Same error here with "@google-cloud/storage": "^3.3.1".

Having

const {Storage} = require('@google-cloud/storage');

const storage = new Storage({
  projectId: config.googleCloud.projectId
});
const bucket = storage.bucket(config.googleCloud.storage.publicBucketName);

and leaving the Node.js process running raises the error from time to time.

{ Error: Retry total timeout exceeded before any response was received
    at repeat (/home/deploy/app/node_modules/google-gax/build/src/normalCalls/retries.js:80:31)
    at Timeout.setTimeout [as _onTimeout] (/home/deploy/app/node_modules/google-gax/build/src/normalCalls/retries.js:113:25)
    at ontimeout (timers.js:436:11)
    at tryOnTimeout (timers.js:300:5)
    at listOnTimeout (timers.js:263:5)
    at Timer.processTimers (timers.js:223:10) code: 4 }

ddehghan commented Oct 8, 2019

Nope, that didn't work. ;-( Still getting this:

GoogleError: Retry total timeout exceeded before any response was received
    at repeat (/var/www/app/node_modules/google-gax/src/normalCalls/retries.ts:98:23)
    at Timeout._onTimeout (/var/www/app/node_modules/google-gax/src/normalCalls/retries.ts:140:13)
    at listOnTimeout (internal/timers.js:531:17)
    at processTimers (internal/timers.js:475:7) {
  code: 4
}

pworkpop commented Oct 8, 2019

Getting similar "total timeout exceeded before any response was received" errors with subscription.close().then(() => subscription.get());
What is the best approach: should we retry the operation ourselves until it goes through, or is it better to tweak the default GAX retry options? (https://googleapis.github.io/gax-nodejs/interfaces/BackoffSettings.html)

To me it seems the Google Pub/Sub servers have a bug, or have degraded to the point that they no longer respond within the expected deadlines.
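
To illustrate the first option (retrying the operation in application code), here is a rough sketch; the attempt count, delays, and the wrapped call are assumptions rather than recommended settings, and `subscription` is assumed to be an existing Subscription instance.

// Sketch only: retry a flaky call with exponential backoff in application code,
// as an alternative to tweaking the default GAX retry options.
async function withRetries(operation, attempts = 5, baseDelayMs = 1000) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      const delay = baseDelayMs * 2 ** i;
      console.warn(`Attempt ${i + 1} failed (${err.message}), retrying in ${delay} ms`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Example usage with the call mentioned above.
withRetries(() => subscription.close().then(() => subscription.get()))
  .catch(err => console.error('Giving up:', err));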

@maxmoeschinger

I have now downgraded to @google-cloud/pubsub version 0.31.0 and added this to my package.json:

"resolutions": {
        "google-gax": "1.3.0"
}

Seems like things are working for longer than 10 minutes now.

@FabianHutin

Hello,
We have been hitting the same problem here since 10/02.
We tried upgrading to 0.32.1, and even to 1.1.0.
Neither solved a thing.
We are running on App Engine, so when one of the instances starts hitting the error, it snowballs and errors flow like crazy until the instance gets killed and another one starts. Then the errors stop flowing for a bit.

Tolgor commented Oct 8, 2019

Following grpc/grpc-node#1064 (comment), using

Using

"resolutions": {
    "@grpc/grpc-js": "^0.6.6"
}

as a temporary fix works for me.

@callmehiphop (Contributor)

I'm putting this issue at the top of my list. Would anyone be able to re-test with the latest version of gRPC? A release (v0.6.6) went out yesterday and it may or may not have a fix for this. All that should be needed is to delete any lock files you might have and re-install the PubSub client with the same version you currently have pinned.

gae123 commented Oct 8, 2019

I believe we are hitting this one as well. After the application runs fine for several hours, we get the following logged from the subscription error handler. New messages that usually arrive once a minute have stopped arriving 5 minutes earlier.

[screenshot of the logged error]

Wondering if this issue is related to this one; @bcoe, are you thinking the same?

Here are some environment details:

GKE: 1.14.3-gke.11
nodejs: FROM node:10.14-alpine

# yarn list | grep google
├─ @axelspringer/graphql-google-pubsub@1.2.1
│  ├─ @google-cloud/projectify@0.3.3
│  ├─ @google-cloud/pubsub@^0.28.1
│  ├─ @google-cloud/pubsub@0.28.1
│  │  ├─ @google-cloud/paginator@^0.2.0
│  │  ├─ @google-cloud/precise-date@^0.1.0
│  │  ├─ @google-cloud/projectify@^0.3.0
│  │  ├─ @google-cloud/promisify@^0.4.0
│  │  ├─ google-auth-library@^3.0.0
│  │  ├─ google-gax@^0.25.0
│  ├─ google-auth-library@3.1.2
│  ├─ google-gax@0.25.6
│  │  ├─ google-auth-library@^3.0.0
│  │  ├─ google-proto-files@^0.20.0
│  ├─ google-proto-files@0.20.0
│  │  ├─ @google-cloud/promisify@^0.4.0
├─ @google-cloud/common-grpc@1.0.5
│  ├─ @google-cloud/common@^2.0.0
│  ├─ @google-cloud/common@2.2.2
│  │  ├─ @google-cloud/projectify@^1.0.0
│  │  ├─ @google-cloud/promisify@^1.0.0
│  │  ├─ google-auth-library@^5.0.0
│  ├─ @google-cloud/projectify@^1.0.0
│  ├─ @google-cloud/promisify@^1.0.0
│  ├─ @google-cloud/promisify@1.0.2
├─ @google-cloud/common@0.32.1
│  ├─ @google-cloud/projectify@^0.3.3
│  ├─ @google-cloud/projectify@0.3.3
│  ├─ @google-cloud/promisify@^0.4.0
│  ├─ google-auth-library@^3.1.1
│  ├─ google-auth-library@3.1.2
├─ @google-cloud/iot@1.1.3
│  └─ google-gax@^1.0.0
├─ @google-cloud/kms@0.1.0
│  ├─ google-auth-library@1.6.1
│  ├─ google-gax@^0.17.1
│  ├─ google-gax@0.17.1
│  │  ├─ google-auth-library@^1.6.1
│  │  ├─ google-proto-files@^0.16.0
├─ @google-cloud/logging-winston@2.1.0
│  ├─ @google-cloud/logging@^5.3.1
│  ├─ google-auth-library@^5.2.2
├─ @google-cloud/logging@5.3.1
│  ├─ @google-cloud/common-grpc@^1.0.5
│  ├─ @google-cloud/paginator@^2.0.0
│  ├─ @google-cloud/paginator@2.0.1
│  ├─ @google-cloud/projectify@^1.0.0
│  ├─ @google-cloud/promisify@^1.0.0
│  ├─ @google-cloud/promisify@1.0.2
│  ├─ google-auth-library@^5.2.2
│  ├─ google-gax@^1.0.0
├─ @google-cloud/paginator@0.2.0
├─ @google-cloud/precise-date@0.1.0
├─ @google-cloud/projectify@1.0.1
├─ @google-cloud/promisify@0.4.0
├─ @google-cloud/pubsub@0.31.0
│  ├─ @google-cloud/paginator@^2.0.0
│  ├─ @google-cloud/paginator@2.0.1
│  ├─ @google-cloud/precise-date@^1.0.0
│  ├─ @google-cloud/precise-date@1.0.1
│  ├─ @google-cloud/projectify@^1.0.0
│  ├─ @google-cloud/promisify@^1.0.0
│  ├─ @google-cloud/promisify@1.0.2
│  ├─ google-auth-library@^5.0.0
│  ├─ google-gax@^1.0.0
├─ @google-cloud/storage@2.5.0
│  ├─ @google-cloud/common@^0.32.0
│  ├─ @google-cloud/paginator@^0.2.0
│  ├─ @google-cloud/promisify@^0.4.0
├─ @google/maps@0.5.5
│  ├─ @google-cloud/logging-winston@2.1.0
│  ├─ @google-cloud/logging@5.3.1
│  ├─ @google-cloud/iot@1.1.3
│  ├─ @google-cloud/kms@0.1.0
│  ├─ @google-cloud/pubsub@0.31.0
│  ├─ @google-cloud/storage@2.5.0
│  ├─ @google/maps@0.5.5
│  ├─ @types/google__maps@0.5.2
├─ @types/google__maps@0.5.2
│  ├─ google-libphonenumber@^3.1.6
│  ├─ google-auth-library@^3.0.0
│  ├─ google-auth-library@3.1.2
├─ google-auth-library@5.3.0
│  ├─ google-p12-pem@2.0.2
│  │  ├─ google-p12-pem@^2.0.0
├─ google-gax@1.6.4
│  ├─ google-auth-library@^5.0.0
├─ google-libphonenumber@3.2.5
├─ google-p12-pem@1.0.4
├─ google-proto-files@0.16.1
│  ├─ google-p12-pem@^1.0.0
├─ passport-google-oauth@1.0.0
│  ├─ passport-google-oauth1@1.x.x
│  └─ passport-google-oauth20@1.x.x
├─ passport-google-oauth1@1.0.0
├─ passport-google-oauth20@1.0.0

bcoe (Contributor) commented Oct 8, 2019

@gae123 mind adding grpc to that grep? The specific dependency having issues is a sub-dependency of pubsub.

One thing that jumps out at me immediately, though, is that you're not running pubsub@1.0.0. So it would appear you're actually having issues with the < 1.0.0 version of PubSub?

bcoe (Contributor) commented Oct 8, 2019

@gae123 I would, if possible, suggest trying out PubSub@1.0.0 as (outside of the rough week of hot fixes we've had) we've been working hard to improve stability.

@mahaben are you able to try out 0.6.6 of grpc-js as well? It sounds like this fix might be on the right track.

@bcoe bcoe closed this as completed in #772 Oct 8, 2019
gae123 commented Oct 8, 2019

@gae123 mind adding grpc to that grep? The specific dependency having issues is a sub-dependency of pubsub.

@bcoe I have modified the original post to add the information you asked for.

bcoe (Contributor) commented Oct 8, 2019

@mahaben closing this for now, as we believe it is fixed with the latest version of PubSub we've released.

@gae123 could I bother you to open a new issue? The dependency graph you're using pulls in PubSub in a variety of places as a deep dependency, but none of the versions linked are up to date. I believe you are running into different issues related to older versions of the grpc library.

mahaben (Author) commented Oct 9, 2019

@bcoe @callmehiphop I don't think this issue should be closed. It still doesn't work after upgrading to "@google-cloud/pubsub": "^1.1.1"

@callmehiphop callmehiphop reopened this Oct 9, 2019
@MatthieuLemoine

@bcoe Why not release a new version using the grpc workaround? Every developer upgrading @google-cloud/pubsub will encounter this issue.

This issue is hard to catch in dev environments, as you have to wait an hour for the error to be triggered. Therefore we can assume that at least some developers will end up pushing broken code to production.

bcoe (Contributor) commented Oct 15, 2019

@MatthieuLemoine @MichaelMarkieta I have been running a PubSub consumer for 4d15h now, on Kubernetes Engine, without a single issue.

NAME                          READY   STATUS    RESTARTS   AGE
good-reader-8f5fbb755-jbf28   1/1     Running   0          4d15h

This is an issue hitting a percentage of our users, but is not hitting 100% of library users, and we are continuing to attempt to find the actual root cause.

This is why we haven't opted to switch the grpc dependency.

ericpearson commented Oct 16, 2019

This is an issue hitting a percentage of our users, but is not hitting 100% of library users

This comment makes no sense from an optics standpoint, and I totally agree with @MatthieuLemoine: release a proper fix for this, or a release with this 'workaround' built in. Asking customers to change their production code with some speculative 'fix' is irresponsible.

What happens when this is actually fixed and the workaround ends up causing more problems later?

The updated documentation does not even mention what kinds of workloads would be better off using the native library or not, which makes the mere suggestion of using it even more confusing and potentially bug-prone.

gae123 commented Oct 16, 2019

This has been a workaround for us too; no issues in the last 24 hours. I'll keep monitoring...

const {PubSub} = require('@google-cloud/pubsub');
const grpc = require('grpc');
// Pass the native grpc implementation instead of the default @grpc/grpc-js.
const pubsub = new PubSub({grpc});

bcoe (Contributor) commented Oct 16, 2019

Asking customers to change their production code with some speculative 'fix' is irresponsible

We're very much trying to avoid this; I realize how frustrating the string of patches to @grpc/grpc-js was last week. That is why we've taken a step back and asked folks to opt for the grpc library instead, since we've seen this consistently address the issues that people are experiencing.

Our libraries were designed to allow grpc as an alternative to @grpc/grpc-js, specifically in case a situation like this arose (where we saw inconsistencies between the two libraries impacting users).

Now, even though we are advising that folks running into immediate issues switch to grpc, we are continuing to try to find a consistent reproduction of the issue affecting users. What has made this difficult is that we see the default configuration (with @grpc/grpc-js) running effectively for many people -- I've had a cluster running for 5 days now without the reported behavior, and we haven't been able to recreate the issue consistently with various attempts at stress testing in various environments.

Rather than continuing to float patches to @grpc/grpc-js speculatively (which I agree is irresponsible), we are taking the following approach:

  • we have been asking people for detailed information about their runtime environments, and are looking for patterns.
  • we have also been asking people to run with the environment variables GRPC_TRACE=all and GRPC_VERBOSITY=DEBUG, which is allowing us to deliver to the gRPC team the info they need to figure out what the heck is happening to some people.
  • If we're not happy that we've reached a reasonable resolution soon, we will consider switching the default library back to grpc.

gberth commented Oct 16, 2019

Have been running with grpc since 10:30 AM today. Approx. 400' msgs read. No errors, no hangups. For information, messages like the one below (we had approx. 200 of them during 46 hours) have also disappeared:

(node:16) Error: Failed to add metadata entry A...: Mon, 14 Oct 2019 11:40:19 GMT. Metadata key "a..." contains illegal characters

@alexander-fenster (Contributor)

@gberth Thanks for letting us know! Just to give you some details: the metadata warnings are caused not by @grpc/grpc-js but by some (unknown) bug in the Node.js http2 implementation. The bug is tracked at nodejs/node#28632, but no useful debug info is there yet.

grpc uses its own http2 stack, so the Node.js http2 module bug does not affect it - that's why those messages disappeared.

yanzit commented Oct 16, 2019

Just adding some observations in case it helps. We have two services, both running pubsub 1.0.0. One is older and has no issues; one is newer and has the issue. I have not tried to roll back the new service; we are using the workaround for now.

Here are the differences when running "npm ls @grpc/grpc-js"

old:
├─┬ @google-cloud/logging@4.0.1
│ └─┬ google-gax@0.20.0
│   └── @grpc/grpc-js@0.2.0
├─┬ @google-cloud/pubsub@1.0.0
│ ├── @grpc/grpc-js@0.5.4 deduped
│ └─┬ google-gax@1.6.0
│   └── @grpc/grpc-js@0.5.4 deduped
└── @grpc/grpc-js@0.5.4

new:
└─┬ @google-cloud/pubsub@1.0.0
  ├── @grpc/grpc-js@0.5.4
  └─┬ google-gax@1.6.4
    └── @grpc/grpc-js@0.6.6

pworkpop commented Oct 16, 2019

@bcoe
We run on GKE using credentials passed to PubSub and GoogleAuth (not the GKE service account). Most services run 0.30.1 with google-gax: 1.1.4 & @grpc/grpc-js: 0.4.3 without issues since 0.30.1 was released in June. One service was updated to google-gax: 1.6.4 & @grpc/grpc-js: 0.6.6 and started observing the issues last week.
We also have this solution for token auto-refresh:

this._client = new PubSub({ ...queueConfig, auth: new GoogleAuth(queueConfig) });

The issue doesn't start exactly one hour after startup for us, so I suspect it is related to connectivity to the Pub/Sub servers and how those are rolled in and out of service for updates.
Re-creating the PubSub client (and thus the underlying channel) definitely solves this, so I suggest adding this fix to pubsub itself.
We occasionally see deadline exceeded messages in the old services running @grpc/grpc-js: 0.4.3, but those seem to recover all right.
For people who see this happen every hour, it may be related to token expiry (being unable to refresh the token once it expires); we used to get unauthorized errors without passing that auth parameter.
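
For readers unfamiliar with that one-liner, a minimal sketch of the same construction with its imports; `queueConfig` and its values are placeholders for the application's own configuration.

// Sketch only: pass an explicit GoogleAuth instance to the PubSub client,
// mirroring the snippet quoted above. Values are placeholders.
const {PubSub} = require('@google-cloud/pubsub');
const {GoogleAuth} = require('google-auth-library');

const queueConfig = {projectId: 'my-project', keyFilename: '/secrets/key.json'};
const client = new PubSub({...queueConfig, auth: new GoogleAuth(queueConfig)});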

ringzhz commented Oct 17, 2019

I'd like to 👍 this going back to a P1. Providing a workaround isn't an acceptable response. Indeed, we have implemented the workaround and rolled it out to prod, only to see that while it successfully mitigated the lost pub/sub connection, it also introduces a memory leak that requires periodic restarts of our k8s pods regardless.

@mrmodelo

As mentioned on grpc/grpc-node#1064, this issue is also surfacing when using scheduled Firebase Functions, where you do not have access to the grpc config value. When the connection bridge drops, the entire suite of hosted Firebase Functions also begins to fail. The only workaround is to not have the scheduled function deployed, which is not an acceptable solution.

Error: No connection established
    at Http2CallStream.call.on (/srv/node_modules/@grpc/grpc-js/build/src/call.js:68:41)
    at emitOne (events.js:121:20)
    at Http2CallStream.emit (events.js:211:7)
    at process.nextTick (/srv/node_modules/@grpc/grpc-js/build/src/call-stream.js:75:22)
    at _combinedTickCallback (internal/process/next_tick.js:132:7)
    at process._tickDomainCallback (internal/process/next_tick.js:219:9)

@npomfret

Has there been any progress? If not, and the workaround is the official way forward, have all the docs been updated?

@hx-markterry

Every time there has been a new version of @google-cloud/pubsub we've updated, but since the switch to grpc-js we've had increased latency in sending and/or receiving (dependent on the version used) or memory leaks. In the past few weeks we've been pinning versions to work around various bugs.

We've made the decision to go back to "@google-cloud/pubsub": "0.29.1", as this fixes the problem for us with no code changes.

Could we have a newer pubsub release which makes grpc-js optional? It seems to be the most unstable part of this ecosystem.

bcoe (Contributor) commented Oct 18, 2019

I just wanted to give an update before the weekend: we do have a version of @grpc/grpc-js (0.6.9) that all signs indicate is stable:

  • the timeout issue, which we had managed to reproduce on one system, is no longer occurring.
  • folks running this version of the API have, so far, indicated that they're not seeing any issues.
  • my colleague @murgatroid99 is feeling confident that he addressed the issues that were leading to the known behavior in this thread.

The reason I was holding off on this update was that we were doing more stress testing on the system on which we had managed to reproduce this issue.


If anyone is still bumping into issues on 0.6.9, please:

  1. open a new issue on PubSub, so that we can debug your issue in isolation (just in case there's more than one thing being debugged in this thread).

  2. run your environment with the following environment variables set:

GRPC_TRACE=all
GRPC_VERBOSITY=DEBUG
  3. provide your logs to us with the gRPC output immediately before an error occurred

So far, with debug information, @murgatroid99 has been able to address issues almost immediately.

If you do not want to share your logs publicly (understandably), you can open an issue through our issue tracker, and also email me (bencoe [at] google.com) so that I can make sure it's escalated immediately:

https://issuetracker.google.com/savedsearches/559741


Now, if folks start using @grpc/grpc-js@0.6.9, and it becomes apparent that it is not in fact stable, I will take steps to move us back to grpc immediately in @google-cloud/pubsub (until such time that we are confident).

@bcoe bcoe added priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. and removed priority: p2 Moderately-important priority. Fix may not be included in next release. labels Oct 18, 2019
@npomfret

@bcoe I don't fully understand the fix.

In order to get my system working, a rollback of pubsub to 0.29.1 wasn't enough. I had to roll back a bunch of the other Google packages I'm using (logging, storage, kms, etc.) as well. I don't know what combination of rollbacks fixed the problem, so I'm worried about rolling forward now.

What exactly is the proposed fix please?

Which projects do I need to apply the fix to?

@pmcnr-hx

I've done a fresh install of 1.1.2 (which now is pulling @grpc/grpc-js@0.6.9) in a project and can confirm it looks stable. I will keep it under observation for the next 24 hours and upgrade the remaining projects based on the results.

pmcnr-hx commented Oct 21, 2019

After a few more hours of testing, the behaviour has improved but I'm still seeing the odd message being lost and nacked back into the queue, which results in latency spikes. This does not happen with 0.29.1.

This means 0.29.1 is still the latest version that gives us consistently predictable latency, and 1.1.2 + @grpc/grpc-js@0.6.9 is still not performing as well as 0.29.1.

bcoe (Contributor) commented Oct 21, 2019

👋 As mentioned on Friday, we're testing early this week with @grpc/grpc-js@0.6.9 as a stable release candidate for @grpc/grpc-js.

We ask that folks upgrade to @grpc/grpc-js@0.6.9, and let us know if you have any trouble doing so.

If you continue to run into issues with this new version of the dependency, I ask that we:

1. create a new issue on this repo, which I will prioritize as P1

@pmcnr-hx, I have already done so for the memory issue you've raised.

2. run your system with debugging enabled, so that we can ship logs to the gRPC folks

GRPC_TRACE=all
GRPC_VERBOSITY=DEBUG

3. share the logs with the engineers debugging this issue

You can open an issue on the issue tracker here to deliver the logs, if there's anything you wish to keep private.

https://issuetracker.google.com/savedsearches/559741

You can also send an email to bencoe [at] google.com, so that I can make sure things get escalated appropriately.


If it becomes obvious that things are not stable, we will start working on a more significant rollback early this week.

@googleapis googleapis locked as resolved and limited conversation to collaborators Oct 21, 2019
@google-cloud-label-sync google-cloud-label-sync bot added the api: pubsub Issues related to the googleapis/nodejs-pubsub API. label Jan 31, 2020