[Bug] The timeout for the UpdateMetadata request is too short and causes job failover #311

### Search before asking

- [X] I searched in the [issues](https://github.com/alibaba/fluss/issues) and found nothing similar.


### Fluss version

main

### Minimal reproduce step

Currently, `sendMetadataRequestAndRebuildCluster` has a timeout of 3s. If no response arrives within 3s, the future is completed without updating the metadata.

### What doesn't meet your expectations?

In my case, with a partitioned table with 512 buckets and a Flink sink with a parallelism of 512, the request times out easily and causes the sink job to fail.

First, when writing to a partition, the client tries to fetch the partition's metadata in the method `checkAndUpdatePartitionMetadata`. If the request times out, the metadata is not updated in the client, and the client then throws a PartitionNotExists exception although the partition does exist.

In my case, a timeout of 60s works.
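
To illustrate the failure mode described above, here is a minimal sketch. It is not the actual Fluss code: `SwallowedTimeoutSketch`, `updateMetadata`, and `checkPartition` are hypothetical stand-ins, and the exception type and message are placeholders. It only shows how a timeout that is logged and swallowed leaves the cache stale, so that the caller later reports a missing partition.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for the client's metadata cache; not the actual Fluss classes. */
public class SwallowedTimeoutSketch {
    private final Set<String> knownPartitions = ConcurrentHashMap.newKeySet();

    /** Waits for the response, but only logs on timeout, which leaves the cache stale. */
    void updateMetadata(CompletableFuture<Set<String>> response, long timeoutMs) {
        try {
            knownPartitions.addAll(response.get(timeoutMs, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            System.err.println("Metadata update timed out, continuing with stale cache");
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** The caller only sees the stale cache, so it reports a missing partition. */
    void checkPartition(String partition) {
        if (!knownPartitions.contains(partition)) {
            throw new IllegalStateException("Partition not exists: " + partition);
        }
    }

    public static void main(String[] args) {
        SwallowedTimeoutSketch client = new SwallowedTimeoutSketch();
        // A response that never completes simulates a server that is too slow to answer in time.
        client.updateMetadata(new CompletableFuture<>(), 50);
        try {
            client.checkPartition("p1");
        } catch (IllegalStateException e) {
            // The real cause (a timeout) is no longer visible at this point.
            System.out.println(e.getMessage());
        }
    }
}
```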



### Anything else?

I understand that we need a request timeout mechanism to prevent a request from hanging forever. But the update-metadata request should throw a timeout exception instead of just logging it, so that the caller can decide whether to retry or fail directly.

For example, when creating a `FlussTable`, it tries to fetch the metadata of the table in [metadataUpdater.checkAndUpdateTableMetadata(Collections.singleton(tablePath))](https://github.com/alibaba/fluss/blob/92c6d23bd25b9de61e05859e18c5b267c81f80db/fluss-client/src/main/java/com/alibaba/fluss/client/table/FlussTable.java#L145). If the metadata request times out, the metadata can't be updated, which causes a `table not found in cluster` exception although the table does exist. At the very least, it should throw a timeout exception instead of the `table not found in cluster` exception, which is really confusing.
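
One possible shape of the behavior suggested above, as a rough sketch only: the wait surfaces a `TimeoutException` to the caller, and the caller decides whether to retry. The class `MetadataTimeoutSketch` and the methods `awaitMetadata` and `fetchWithRetry` are hypothetical names for illustration, not existing Fluss APIs, and the retry policy is an assumption.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch; the names here are not part of the Fluss client. */
public class MetadataTimeoutSketch {

    /** Waits for a metadata response and surfaces a timeout instead of only logging it. */
    static <T> T awaitMetadata(CompletableFuture<T> response, long timeoutMs)
            throws TimeoutException {
        try {
            return response.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for metadata", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Metadata request failed", e.getCause());
        }
    }

    /** Caller-side policy: retry on timeout, then rethrow the last timeout. */
    static <T> T fetchWithRetry(
            Supplier<CompletableFuture<T>> request, long timeoutMs, int maxAttempts)
            throws TimeoutException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return awaitMetadata(request.get(), timeoutMs);
            } catch (TimeoutException e) {
                last = e; // remember the timeout and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // The first request never completes (simulated slow server); later ones succeed.
        Supplier<CompletableFuture<String>> request = () ->
                calls.incrementAndGet() == 1
                        ? new CompletableFuture<String>()
                        : CompletableFuture.completedFuture("table-metadata");
        String result = fetchWithRetry(request, 50, 3);
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

With this shape, a caller that cannot retry still sees a timeout rather than a misleading "not found" error.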


### Are you willing to submit a PR?

- [ ] I'm willing to submit a PR!
