
[Fix][Plugin] Add remote plugin discovery for Flink yarn application deploy mode #7649

Open · wants to merge 12 commits into base: dev
Conversation

@hanhanzhang (Author)

Solves issue #7321.

@corgy-w (Contributor) commented Sep 13, 2024

Hi, please follow this guide to enable CI on your fork repository on GitHub.

@Hisoka-X (Member) left a comment

Thanks @hanhanzhang! Could you add a test case for this?

@Hisoka-X (Member)

cc @TyrantLucifer

@@ -130,7 +153,6 @@ public List<DataStreamTableInfo> execute(List<DataStreamTableInfo> upstreamDataS
         Optional<SaveModeHandler> saveModeHandler = saveModeSink.getSaveModeHandler();
         if (saveModeHandler.isPresent()) {
             try (SaveModeHandler handler = saveModeHandler.get()) {
-                handler.open();
Member

Restore it

@TyrantLucifer (Member)

[screenshot]

In my opinion, we natively support application mode, but we need to change the packaging mode.

@hanhanzhang (Author)

> [screenshot] In my opinion, we natively support application mode, but we need to change the packaging mode.

Yes, we can pass the connector plugins via the yarn.ship-files parameter. During Flink deployment the connector plugins are then uploaded to an external storage system; if there are many plugins, task deployment slows down.
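
A minimal sketch of that approach, assuming the standard Flink YARN options; the local paths are hypothetical. Every shipped jar is re-uploaded on each submission, which is the slowdown described above:

# Ship the whole connectors directory with the job; YARN uploads it all again every time.
${FLINK_HOME}/bin/flink run-application \
  --target yarn-application \
  -Dyarn.ship-files='./connectors' \
  -c org.apache.seatunnel.core.starter.flink.SeaTunnelFlink \
  ./starter/seatunnel-flink-15-starter.jar \
  --config ./config/v2.streaming.conf.template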

@hanhanzhang (Author)

The remote plugin discovery mechanism allows users to upload connector plugins to an external storage system in advance. In application deployment mode, dependent plugins can then be loaded from the external storage system on demand.
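
A minimal sketch of the pre-upload step, assuming HDFS as the external storage system; the target path is hypothetical, chosen to match the command shown later in this thread:

# Upload the connector plugins once, ahead of any job submission.
hdfs dfs -mkdir -p /seatunnel/2.3.7/connectors
hdfs dfs -put ./connectors/*.jar /seatunnel/2.3.7/connectors/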

@Carl-Zhou-CN (Member) commented Sep 14, 2024

${FLINK_HOME}/bin/flink run-application -Dyarn.ship-files='./config/v2.streaming.conf.template' -Dyarn.classpath.include-user-jar=DISABLED -Dyarn.provided.usrlib.dir='hdfs://HDFS4001185/user/flink/usrlib/' --target yarn-application -c org.apache.seatunnel.core.starter.flink.SeaTunnelFlink /usr/local/service/seatunnel/starter/seatunnel-flink-15-starter.jar --config v2.streaming.conf.template --name SeaTunnel

Using the combination of yarn.provided.usrlib.dir along with yarn.ship-files seems to be a good approach.
[screenshots]

@Carl-Zhou-CN (Member)

[screenshot]

@hanhanzhang (Author)

> ${FLINK_HOME}/bin/flink run-application -Dyarn.ship-files='./config/v2.streaming.conf.template' -Dyarn.classpath.include-user-jar=DISABLED -Dyarn.provided.usrlib.dir='hdfs://HDFS4001185/user/flink/usrlib/' --target yarn-application -c org.apache.seatunnel.core.starter.flink.SeaTunnelFlink /usr/local/service/seatunnel/starter/seatunnel-flink-15-starter.jar --config v2.streaming.conf.template --name SeaTunnel
>
> Using the combination of yarn.provided.usrlib.dir along with yarn.ship-files seems to be a good approach. [screenshots]

Yes, I've thought about it this way. I wouldn't recommend putting the SeaTunnel lib in yarn.provided.lib.dirs, since that would require the SeaTunnel and Flink dependency packages to be placed together. I think treating a SeaTunnel connector like a SQL UDF might be a good approach.

In addition, yarn.provided.usrlib.dir puts all of its jars on the classpath, whereas the original plugin design was to put only the plugins a task needs on the classpath. What do you think?
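
A minimal sketch of the usrlib approach under discussion, assuming Flink 1.16+ and a pre-populated HDFS directory; the paths and jar locations are hypothetical. Note that everything under the usrlib directory lands on the job classpath, needed or not:

# All jars under usrlib/ go on the classpath, regardless of what the job uses.
${FLINK_HOME}/bin/flink run-application \
  --target yarn-application \
  -Dyarn.classpath.include-user-jar=DISABLED \
  -Dyarn.provided.usrlib.dir='hdfs:///user/flink/usrlib/' \
  -c org.apache.seatunnel.core.starter.flink.SeaTunnelFlink \
  ./starter/seatunnel-flink-15-starter.jar \
  --config ./config/v2.streaming.conf.template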

@Carl-Zhou-CN (Member) commented Sep 18, 2024

> yarn.provided.usrlib.dir

I am very sorry for my oversight; -Dyarn.provided.usrlib.dir is only supported starting from Flink 1.16.

@Hisoka-X (Member)

Could you share some details about how to use this new feature?

@Carl-Zhou-CN (Member)

Command:

${SEATUNNEL_HOME}/bin/start-seatunnel-flink-15-connector-v2.sh --config /usr/seatunnel/temp/test.conf --deploy-mode run-application --connectors "hdfs://localhost:8020/seatunnel/2.3.7/connectors" --target yarn-application --yarnqueue default --yarnjobManagerMemory 1024 --yarntaskManagerMemory 1024

@TyrantLucifer (Member) commented Sep 24, 2024

bin/start-seatunnel-flink-15-connector-v2.sh --config config/v2.batch.config.template --target yarn-application -e run-application --name st-test -Dyarn.application.name=st-test -Dyarn.ship-files="config/v2.batch.config.template\;connectors"

This command can support application-mode submission. The act of directly pulling jar packages from HDFS should be performed by Flink itself, and I think encapsulating it in the SeaTunnel client is a bit too heavy. Is there any other solution that can directly take advantage of Flink's capabilities?

@hanhanzhang (Author)

> bin/start-seatunnel-flink-15-connector-v2.sh --config config/v2.batch.config.template --target yarn-application -e run-application --name st-test -Dyarn.application.name=st-test -Dyarn.ship-files="config/v2.batch.config.template\;connectors"
>
> This command can support application-mode submission. The act of directly pulling jar packages from HDFS should be performed by Flink itself, and I think encapsulating it in the SeaTunnel client is a bit too heavy. Is there any other solution that can directly take advantage of Flink's capabilities?

The design allows users to pre-upload plugins to external storage systems such as HDFS, S3, etc. The Flink application deployment then only needs to point plugin discovery at the external storage address.
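
A minimal end-to-end sketch under this design, reusing the --connectors option from the command earlier in the thread; the HDFS address and paths are hypothetical, and an S3 address should work analogously where the deployment has S3 filesystem support:

# 1. Upload the connector plugins to external storage once, in advance.
hdfs dfs -put ./connectors/*.jar hdfs://localhost:8020/seatunnel/2.3.7/connectors/

# 2. Submit in application mode; dependent plugins are discovered remotely as required.
${SEATUNNEL_HOME}/bin/start-seatunnel-flink-15-connector-v2.sh \
  --config ./config/v2.streaming.conf.template \
  --deploy-mode run-application \
  --target yarn-application \
  --connectors "hdfs://localhost:8020/seatunnel/2.3.7/connectors"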
