Skip to content
This repository has been archived by the owner on Sep 2, 2022. It is now read-only.

feat:nativeFlinkSessionCluster #235

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

kinderyj
Copy link
Contributor

Title:
This new feature is to support flink native k8s cluster.
Prerequisites:
Version >= 1.10 of Apache Flink
Why this feature:
Currently, there are jobManager, taskManager and job in the Spec of FlinkCluster and can create session cluster and per job cluster.
But it does't support the flink's native setup on k8s feature(in Flink 1.10).
Solution:
In the Spec of FlinkCluster, we introduce a new property, which called "nativeSessionClusterJob". With this property, only a job will be created. This job is used to create a native session cluster, the whole process to create the native session cluster is done by
the Flink kernel itself.

@functicons
Copy link
Collaborator

/gcbrun

@functicons
Copy link
Collaborator

Thanks for the contribution! Will review and test.

@functicons functicons self-requested a review April 25, 2020 17:31
Copy link
Collaborator

@functicons functicons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few high level comments first:

  1. I find the new field name confusing, why don't we call it nativeSessionClusterSpec, since it is a session cluster, there is no job.
  2. could you add a sample CR YAML file?
  3. could you add the usage of NativeFlinkSessionCluster in the user doc?

@kinderyj
Copy link
Contributor Author

kinderyj commented Apr 26, 2020

A few high level comments first:

  1. I find the new field name confusing, why don't we call it nativeSessionClusterSpec, since it is a session cluster, there is no job.
  2. could you add a sample CR YAML file?
  3. could you add the usage of NativeFlinkSessionCluster in the user doc?

Thanks for your response! Here is my answer.

quesiton1:
The main flow is as below:

  1. The operator will create a k8s job
  2. This job will create a new pod automatically.
  3. In this new created pod, the entrypoint is /opt/flink/bin/kubernetes-session.sh.
  4. The kubernetes-session.sh is supported by the Flink 1.10, which is to create a new session cluster. That is, the session cluster is created by the flink kernel finally.

question2:
Yes, the sample is already in the config/sampless/flinkoperator_v1beta1_flinknativesessioncluster.yaml

cat flinkoperator_v1beta1_flinknativesessioncluster.yaml
apiVersion: flinkoperator.k8s.io/v1beta1
kind: FlinkCluster
metadata:
name: native-flinksessioncluster-sample
spec:
image:
name: ccr.ccs.tencentyun.com/kinderyj/flink-test:1.10
pullPolicy: Always
nativeSessionClusterJob:
flinkClusterID: native-flinksessioncluster-sample

question3:
Do you mean the user_guide.md? Yes I will do it.
The usage is mainly as below:
How to create a NativeFlinkSessionCluster:
kubectl create -f flinkoperator_v1beta1_flinknativesessioncluster.yaml
How to submit a job:
method 1(have been verified):
/opt/flink/bin/flink run -d -e kubernetes-session -Dkubernetes.cluster-id=$cluster-name examples/streaming/WordCount.jar

method 2:
cat <<EOF | kubectl apply --filename -
apiVersion: batch/v1
kind: Job
metadata:
name: my-job-submitter
spec:
template:
spec:
containers:
- name: wordcount
image: flink:1.10
args:
- /opt/flink/bin/flink
- run
- -d
- -e
- kubernetes-session
- -Dkubernetes.cluster-id=$cluster-name
- /opt/flink/examples/streaming/WordCount.jar
restartPolicy: Never
EOF

The $cluster-name is the MetaData name in the flinkoperator_v1beta1_flinknativesessioncluster.yaml

method3:
And, we can also submit job from the jobmanager's WebUI.

How to delete a NativeFlinkSessionCluster:
kubectl delete -f flinkoperator_v1beta1_flinknativesessioncluster.yaml

@functicons
Copy link
Collaborator

functicons commented Apr 27, 2020

Thanks for the clarification! IIUC, NativeSessionClusterJobSpec is a wrapper of Flink's /opt/flink/bin/kubernetes-session.sh which is a CLI for creating a native Flink session cluster. I think this might not be what we want with this operator, and it is not inline with the operator pattern in general. In operators, the CR should be the owner of the underlying resources, in this case, we probably want a NativeSessionClusterSpec which directly manages the underlying resources, i.e. bypass the /opt/flink/bin/kubernetes-session.sh. Otherwise, we don't see much benefit over just using /opt/flink/bin/kubernetes-session.sh.

To be honest, when I first saw /opt/flink/bin/kubernetes-session.sh, I felt it is not the k8s way. Ideally, we want to use kubectl as the only CLI to interact with k8s, everything domain-specific should be expressed as YAML.

@kinderyj
Copy link
Contributor Author

Thanks for the clarification! IIUC, NativeSessionClusterJobSpec is a wrapper of Flink's /opt/flink/bin/kubernetes-session.sh which is a CLI for creating a native Flink session cluster. I think this might not be what we want with this operator, and it is not inline with the operator pattern in general. In operators, the CR should be the owner of the underlying resources, in this case, we probably want a NativeSessionClusterSpec which directly manages the underlying JM, i.e. bypass the /opt/flink/bin/kubernetes-session.sh. Otherwise, we don't see much benefit over just using /opt/flink/bin/kubernetes-session.sh.

To be honest, when I first saw /opt/flink/bin/kubernetes-session.sh, I felt it is not the k8s way. Ideally, we want to use kubectl as the only CLI to interact with k8s, everything domain-specific should be expressed as YAML.

Thanks for your response. Actually, my first solution is what you suggestd. That is , fink operaor handle the resurce creation. The svc, deployment, configmap which created by /opt/flink/bin/kubernetes-session.sh , will be created and haneled by flink operator. This olution is more complicated,but as you say, it's more inline with the operator pattern. So maybe I can share me my another soultion later as a PR. This solution's code is in a demo status.

@kinderyj
Copy link
Contributor Author

kinderyj commented Apr 27, 2020

Thanks for the clarification! IIUC, NativeSessionClusterJobSpec is a wrapper of Flink's /opt/flink/bin/kubernetes-session.sh which is a CLI for creating a native Flink session cluster. I think this might not be what we want with this operator, and it is not inline with the operator pattern in general. In operators, the CR should be the owner of the underlying resources, in this case, we probably want a NativeSessionClusterSpec which directly manages the underlying JM, i.e. bypass the /opt/flink/bin/kubernetes-session.sh. Otherwise, we don't see much benefit over just using /opt/flink/bin/kubernetes-session.sh.
To be honest, when I first saw /opt/flink/bin/kubernetes-session.sh, I felt it is not the k8s way. Ideally, we want to use kubectl as the only CLI to interact with k8s, everything domain-specific should be expressed as YAML.

Thanks for your response. Actually, my first solution is what you suggested and I have done some days ago. That is , fink operaor handle the resurce creation. The svc, deployment, configmap which created by /opt/flink/bin/kubernetes-session.sh , will be created and haneled by flink operator. This olution is more complicated,and did the work what the flink kernel did. If the flink kernel change the process, the flink operator will have to change. But as you say, it's more inline with the operator pattern.
By the way, as I know, the spark operater also use the wrapper solution, such as /opt/flink/bin/kubernetes-session.sh.

@functicons
Copy link
Collaborator

functicons commented Apr 27, 2020

Thanks for your further clarification! Since the feature is still in Beta, it is subject to changes in future Flink versions, let's hold this PR off and think it through. Anyway, thanks for the initiative!

@kinderyj
Copy link
Contributor Author

Thanks for your further clarification! Since the feature is still in Beta, it is subject to changes in future Flink versions, let's hold this PR off and think it through. Anyway, thanks for the initiative!

Thanks. And In Flink 1.11, the feature native per job session will also be released. Currently, we are designing the solution about the native per job session for flink operator and will make a another PR for this feature, and very glad to discuss about the solution for flink native k8s setup feature. I think our final goal is to make a best solustion.

@kinderyj kinderyj mentioned this pull request Apr 29, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants