[Bug] [oracle] maximum number of processes exceeded #1902

Closed
2 tasks done
Cheers0606 opened this issue Feb 8, 2023 · 15 comments
Labels
bug Something isn't working

Comments

@Cheers0606
Contributor

Search before asking

  • I searched in the issues and found nothing similar.

Flink version

1.15.2

Flink CDC version

2.3

Database and its version

Oracle 11.2.0.4.0

Minimal reproduce step

  1. Create a table in Oracle and load a large amount of data into it.
  2. Set a small split size so that the table is divided into a large number of chunks, for example 10,000.
  3. Start a job that runs the full snapshot followed by incremental synchronization.
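
The arithmetic behind these steps can be sketched as follows. This is only an illustration: the row count, split size, and session budget are made-up numbers, and the class `ChunkEstimate` and its methods are hypothetical helpers, not part of Flink CDC. It also assumes, for the sake of the example, that each chunk read opens one database session and never releases it.

```java
/** Back-of-the-envelope arithmetic for the reproduce steps (all numbers are illustrative). */
final class ChunkEstimate {

    /** Number of snapshot chunks for a table, rounding the last partial chunk up. */
    static long chunkCount(long rowCount, long chunkSize) {
        if (rowCount < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("rowCount must be >= 0 and chunkSize > 0");
        }
        return (rowCount + chunkSize - 1) / chunkSize;
    }

    /**
     * Assumption: each chunk read opens one database session and never closes it.
     * Returns how many chunks can be read before the session budget is used up,
     * or -1 if the budget is never exceeded.
     */
    static long chunksUntilExhausted(long chunkCount, long sessionBudget) {
        return chunkCount > sessionBudget ? sessionBudget : -1;
    }

    public static void main(String[] args) {
        // Hypothetical table: 10 million rows split into chunks of 1,000 rows.
        long chunks = chunkCount(10_000_000L, 1_000L);
        System.out.println("chunks = " + chunks);
        // Hypothetical session budget of 150 on the database side.
        System.out.println("budget used up after = " + chunksUntilExhausted(chunks, 150L));
    }
}
```

Under that assumption, the number of chunks only needs to exceed the database's session budget for the error to appear, which is why a small split size on a large table makes it easy to trigger.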

What did you expect to see?

The job runs normally and all the data is synchronized.

What did you see instead?

After running for a while, the job fails with the Oracle error: maximum number of processes exceeded.

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@Cheers0606 Cheers0606 added the bug Something isn't working label Feb 8, 2023
@Cheers0606
Contributor Author

I submitted PR #1903.

@Soulmate76

Hello, are you sure the code path goes through this method? I added some log statements to this method, but none of them were printed when the program ran. I also changed the existing log messages, yet the program still printed the original messages rather than my modified ones.

@lujiujiuyj

mark!

@Cheers0606
Contributor Author

> Hello, are you sure the code path goes through this method? I added some log statements to this method, but none of them were printed when the program ran. I also changed the existing log messages, yet the program still printed the original messages rather than my modified ones.

I'm sure it goes through this method. Maybe you can set StartupMode.INITIAL when starting the job.
I have also updated the code in my PR. Could you take a look and help test it? Thanks.

@niaoshuai

The database connection pool will not be closed.

@Cheers0606
Contributor Author

> The database connection pool will not be closed.

There are two problems. First, the connection pool is not used at all. Second, once the connection pool is used, connections are not closed after use, so other threads cannot obtain a connection.
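
The second problem can be illustrated with a small, self-contained sketch. `ToyPool` is a hypothetical stand-in for a bounded connection pool and is not the connector's actual pool or the code in the PR; the timeout message is only modelled on the one reported in this thread. The sketch compares borrowing a connection per chunk without closing it against borrowing inside try-with-resources.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy stand-in for a bounded connection pool; not the connector's real pool. */
final class ToyPool {
    private final Semaphore permits;

    ToyPool(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    /** A borrowed connection; closing it hands the slot back to the pool. */
    final class Lease implements AutoCloseable {
        private boolean closed; // single-threaded sketch, so no synchronization

        @Override
        public void close() {
            if (!closed) { // make close() idempotent
                closed = true;
                permits.release();
            }
        }
    }

    /** Borrows a slot, or throws if none frees up within the timeout. */
    Lease borrow(long timeoutMillis) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            acquired = false;
        }
        if (!acquired) {
            throw new IllegalStateException(
                    "Connection is not available, request timed out after " + timeoutMillis + "ms.");
        }
        return new Lease();
    }

    /** Reads up to {@code chunks} chunks; returns how many succeeded before a timeout. */
    static int readChunks(ToyPool pool, int chunks, boolean closeAfterUse) {
        int done = 0;
        for (int i = 0; i < chunks; i++) {
            try {
                if (closeAfterUse) {
                    try (Lease lease = pool.borrow(10)) {
                        done++; // the chunk query would run here
                    }
                } else {
                    pool.borrow(10); // leaked: the lease is never closed
                    done++;
                }
            } catch (IllegalStateException timedOut) {
                break; // pool exhausted, no slot became free in time
            }
        }
        return done;
    }

    public static void main(String[] args) {
        System.out.println("leaky reads:   " + readChunks(new ToyPool(3), 10, false));
        System.out.println("closing reads: " + readChunks(new ToyPool(3), 10, true));
    }
}
```

With a pool of three slots, the leaky variant stops after three chunks because every later borrow times out, while the closing variant gets through all ten. Whatever the real fix looks like, the design point is the same: every borrowed connection needs a guaranteed release path, and try-with-resources is the usual way to get one in Java.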

@niaoshuai

Last week our production server crashed outright. @Cheers0606, can this PR solve the problem?

@Cheers0606
Contributor Author

> Last week our production server crashed outright. @Cheers0606, can this PR solve the problem?

You can try it. I have tested it and successfully synchronized a table with 100 million rows.

@niaoshuai

Caused by: java.sql.SQLTransientConnectionException: connection-pool-192.168.1.192:1521 - Connection is not available, request timed out after 30000ms.

@niaoshuai

2023-02-20 18:11:39,159 WARN  org.apache.flink.runtime.taskmanager.Task                    [] - Source: no_primary_1_source[1] -> ConstraintEnforcer[2] (1/1)#8 (8e29a0661c3ee0b7fdbd27166cffd3c7_cbc357ccb763df2852fee8c4fc7d55f2_0_8) switched from RUNNING to FAILED with failure cause: java.lang.RuntimeException: One or more fetchers have encountered exception
	at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.checkErrors(SplitFetcherManager.java:225)
	at org.apache.flink.connector.base.source.reader.SourceReaderBase.getNextFetch(SourceReaderBase.java:169)
	at org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:130)
	at org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:385)
	at org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68)
	at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:542)
	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:831)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:780)
	at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
	at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:914)
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.lang.RuntimeException: SplitFetcher thread 0 received unexpected exception while polling the records
	at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:150)
	at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:105)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	... 1 more
Caused by: org.apache.flink.util.FlinkRuntimeException: Read split SnapshotSplit{tableId=pdb.C##FLINKUSER.NO_PRIMARY_1, splitId='pdb.C##FLINKUSER.NO_PRIMARY_1:1', splitKeyType=[`ROWID` STRING], splitStart=[AAAR2iAAOAAAACgAB6], splitEnd=[AAAR2iAAOAAAAC3AD7], highWatermark=null} error due to java.lang.NullPointerException.
	at com.ververica.cdc.connectors.base.source.reader.external.IncrementalSourceScanFetcher.checkReadException(IncrementalSourceScanFetcher.java:181)
	at com.ververica.cdc.connectors.base.source.reader.external.IncrementalSourceScanFetcher.pollSplitRecords(IncrementalSourceScanFetcher.java:128)
	at com.ververica.cdc.connectors.base.source.reader.IncrementalSourceSplitReader.fetch(IncrementalSourceSplitReader.java:73)
	at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58)
	at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:142)
	... 6 more
Caused by: io.debezium.DebeziumException: java.lang.NullPointerException
	at com.ververica.cdc.connectors.oracle.source.reader.fetch.OracleScanFetchTask$OracleSnapshotSplitReadTask.execute(OracleScanFetchTask.java:251)
	at com.ververica.cdc.connectors.oracle.source.reader.fetch.OracleScanFetchTask.execute(OracleScanFetchTask.java:110)
	at com.ververica.cdc.connectors.base.source.reader.external.IncrementalSourceScanFetcher.lambda$submitTask$0(IncrementalSourceScanFetcher.java:94)
	... 5 more
Caused by: java.lang.NullPointerException
	at com.ververica.cdc.connectors.oracle.source.reader.fetch.OracleScanFetchTask$OracleSnapshotSplitReadTask.createDataEventsForTable(OracleScanFetchTask.java:332)
	at com.ververica.cdc.connectors.oracle.source.reader.fetch.OracleScanFetchTask$OracleSnapshotSplitReadTask.createDataEvents(OracleScanFetchTask.java:316)
	at com.ververica.cdc.connectors.oracle.source.reader.fetch.OracleScanFetchTask$OracleSnapshotSplitReadTask.doExecute(OracleScanFetchTask.java:276)
	at com.ververica.cdc.connectors.oracle.source.reader.fetch.OracleScanFetchTask$OracleSnapshotSplitReadTask.execute(OracleScanFetchTask.java:246)
	... 7 more

@niaoshuai

Hello, is Oracle 19c supported? @Cheers0606

@Cheers0606
Contributor Author

> Hello, is Oracle 19c supported? @Cheers0606

I have only tested with Oracle 11g and Flink 1.15. I'll try 19c later.

@Cheers0606
Contributor Author

> Caused by: java.sql.SQLTransientConnectionException: connection-pool-192.168.1.192:1521 - Connection is not available, request timed out after 30000ms.

I can't reproduce the problem in my tests. Can you describe your steps to reproduce it?

@niaoshuai

  1. Stop the job.
  2. Start the job.
  3. Repeat at least twice.

This is in a Flink session.

@GOODBOY008
Member

GOODBOY008 commented Feb 1, 2024

This has been fixed in #2254.

5 participants