Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid bundle-split with old load-report #826

Merged
merged 2 commits into from
Oct 17, 2017

Conversation

rdhabalia
Copy link
Contributor

Motivation

As mentioned at #821

This issue happens due to stale data into LoadBalancer and it tries to split already split bundle. So, it doesn't have any functional issue and it would be resolved after a minute as soon as load-manager will get latest data. So, to fix it: load-balancer should get latest data after bundle-split. After reproducing, I am seeing below logs

// (1) Load-balancer selected bundle to split
20:26:57.438 [pulsar-load-manager-11-1] INFO  o.a.p.b.l.impl.SimpleLoadManagerImpl - Will split hot namespace bundle cms-java-systest/perf/part-bundle2/0x78000000_0x80000000, topics 2, producers+consumers 2, msgRate in+out 3753.588550254447, bandwidth in+out 3967543.0976189505
// (2) Split is successful
20:26:57.525 [pulsar-web-72-11] INFO  o.a.pulsar.broker.admin.Namespaces   -  Split namespace bundle cms-java-systest/perf/part-bundle2/0x78000000_0x80000000
20:26:57.545 [pulsar-load-manager-11-1] INFO  o.a.p.b.l.impl.SimpleLoadManagerImpl - Successfully split namespace bundle cms-java-systest/perf/part-bundle2/0x78000000_0x80000000

// (3) Again Load-balancer selected same bundle to split with same old data. and it fails while splitiing 
20:27:02.396 [pulsar-load-manager-11-1] INFO  o.a.p.b.l.impl.SimpleLoadManagerImpl - Will split hot namespace bundle cms-java-systest/perf/part-bundle2/0x78000000_0x80000000, topics 2, producers+consumers 2, msgRate in+out 3753.588550254447, bandwidth in+out 3967543.0976189505
20:27:02.396 [pulsar-simple-load-manager-70-1] INFO  o.a.p.b.l.impl.SimpleLoadManagerImpl - doLoadRanking - load balancing strategy: weightedRandomSelection
20:27:02.468 [pulsar-web-72-15] INFO  o.a.pulsar.broker.admin.Namespaces   - Split namespace bundle cms-java-systest/perf/part-bundle2/0x78000000_0x80000000
20:27:02.469 [pulsar-web-72-15] INFO  o.a.p.broker.web.PulsarWebResource   - Successfully validated clusters on property [cms-java-systest]
20:27:02.470 [pulsar-web-72-15] ERROR o.a.p.broker.web.PulsarWebResource   - Failed to validate namespace bundle cms-java-systest/perf-gq1/part-bundle2/0x78000000_0x80000000
java.lang.IllegalArgumentException: Invalid upper boundary for bundle
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:93) ~[guava-15.0.jar:na]
        at org.apache.pulsar.common.naming.NamespaceBundles.validateBundle(NamespaceBundles.java:110) ~[pulsar-broker-1.20.0-incubating-yahoo.jar:1.20.0-incubating-yahoo]
        at org.apache.pulsar.broker.web.PulsarWebResource.validateNamespaceBundleRange(PulsarWebResource.java:369) [pulsar-broker-1.20.0-incubating-yahoo.jar:1.20.0-incubating-yahoo]
        at org.apache.pulsar.broker.web.PulsarWebResource.validateNamespaceBundleOwnership(PulsarWebResource.java:395) [pulsar-broker-1.20.0-incubating-yahoo.jar:1.20.0-incubating-yahoo]
        at org.apache.pulsar.broker.admin.Namespaces.splitNamespaceBundle(Namespaces.java:854) [pulsar-broker-1.20.0-incubating-yahoo.jar:1.20.0-incubating-yahoo]
        at sun.reflect.GeneratedMethodAccessor115.invoke(Unknown Source) ~[na:na]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_131]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_131]

Modifications

Generate load-report forcefully if load-manager already derived generation is needed.

Result

Load-manager will not try to split bundle based on old load-report which will prevent split-bundle failure.

@rdhabalia rdhabalia added the type/bug The PR fixed a bug or issue reported a bug label Oct 13, 2017
@rdhabalia rdhabalia added this to the 1.21.0-incubating milestone Oct 13, 2017
@rdhabalia rdhabalia self-assigned this Oct 13, 2017
@rdhabalia rdhabalia closed this Oct 13, 2017
@merlimat
Copy link
Contributor

@rdhabalia was it closed on purpose? Are you working on a different fix?

@rdhabalia rdhabalia reopened this Oct 14, 2017
@rdhabalia
Copy link
Contributor Author

I closed it to make sure we won't merge it before I finish the testing. I missed to add one more change which I figured out while testing it. I have tested it and we don't see this issue so, reopening it.

Copy link
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@merlimat merlimat merged commit b7bbe90 into apache:master Oct 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants