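If you need to run jobs on elastic container instances (VCI), you can enable this by turning on the corresponding switch in the console.
After you turn on the Spark service switch, the Driver and Executor of any job submitted as a SparkApplication run on elastic container instances (VCI). EMR creates the resources using the default instance family provided by VCI. If you need a different resource specification (for example, GPU instance types), you can customize it by following the configuration below.
Before you begin, make sure that you have a VKE cluster, have created an EMR On VKE Spark cluster, and have turned on the VCI scheduling switch.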
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-wordcount
spec:
  type: Scala
  sparkVersion: 3.2.1
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "xxx/spark-examples_2.12-3.3.3.jar"
  arguments:
    - "1000"
  driver:
    annotations:
      vci.vke.volcengine.com/preferred-instance-family: vci.n3i
    nodeSelector: {}
    cores: 1
    coreLimit: 1000m
    memory: 2g
  executor:
    annotations:
      vci.vke.volcengine.com/preferred-instance-family: vci.n3i
    nodeSelector: {}
    cores: 1
    coreLimit: 1000m
    memory: 2g
    memoryOverhead: 1g
    instances: 1
For more information about instance families, see: Pod Annotation Description (Container Service, Volcengine).
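When many SparkApplication manifests need the same instance family, the annotation can also be set programmatically before submission. The following is a minimal Python sketch, not an official EMR tool: the function name `with_instance_family` is hypothetical, the manifest is assumed to be already loaded as a dictionary, and the annotation key and the `vci.n3i` value are the ones used in the example manifest above. Any other instance family name should be taken from the Pod Annotation documentation.

```python
import copy
import json

# Annotation key as it appears in the example SparkApplication manifest.
VCI_FAMILY_ANNOTATION = "vci.vke.volcengine.com/preferred-instance-family"


def with_instance_family(spark_app: dict, family: str) -> dict:
    """Return a copy of a SparkApplication manifest whose driver and executor
    both carry the VCI preferred-instance-family annotation.

    Hypothetical helper for illustration; the input manifest is not modified.
    """
    patched = copy.deepcopy(spark_app)
    spec = patched.setdefault("spec", {})
    for role in ("driver", "executor"):
        role_spec = spec.setdefault(role, {})
        # Treat a missing or null annotations field as an empty mapping.
        annotations = role_spec.get("annotations") or {}
        annotations[VCI_FAMILY_ANNOTATION] = family
        role_spec["annotations"] = annotations
    return patched


if __name__ == "__main__":
    app = {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": "spark-wordcount"},
        "spec": {
            "driver": {"cores": 1, "memory": "2g"},
            "executor": {"cores": 1, "memory": "2g", "instances": 1},
        },
    }
    # JSON output is accepted by kubectl apply in the same way as YAML.
    print(json.dumps(with_instance_family(app, "vci.n3i"), indent=2))
```

The helper works on a deep copy so that the same base manifest can be reused for several instance families.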
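After you turn on the Ray switch, the Head and Worker of any job submitted as a RayCluster or RayJob run on elastic container instances (VCI). EMR creates the resources using the default instance family provided by VCI. If you need a different resource specification (for example, GPU instance types), you can customize it by following the configuration below.
Before you begin, make sure that you have a VKE cluster, have created an EMR On VKE Ray cluster, and have turned on the VCI scheduling switch.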
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$1
  labels:
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kuberay
    helm.sh/chart: ray-cluster-1.0.0
  name: raycluster
spec:
  enableInTreeAutoscaling: false
  headGroupSpec:
    rayStartParams:
      num-cpus: "0"
      dashboard-host: 0.0.0.0
    serviceType: ClusterIP
    template:
      metadata:
        annotations:
          vci.vke.volcengine.com/preferred-instance-family: vci.n3i
          prometheus.io/path: /metrics
          prometheus.io/port: "8080"
          prometheus.io/scrape: "true"
        labels:
          app.kubernetes.io/managed-by: Helm
          app.kubernetes.io/name: kuberay
          helm.sh/chart: ray-cluster-1.0.0
      spec:
        terminationGracePeriodSeconds: 600
        affinity: {}
        containers:
          - env:
              - name: LOG_UPDATE_INTERVAL_S
                value: '5'
              - name: VOLC_REGION
                value: cn-beijing
              - name: EMR_TOS_BUCKET_TAG_ENABLED
                value: "true"
            image: emr-vke-qa-cn-beijing.cr.volces.com/emr/ray:2.36.0-py3.11-ubuntu20.04-278
            imagePullPolicy: IfNotPresent
            name: ray-head
            resources:
              limits:
                cpu: "1"
                memory: 2Gi
              requests:
                cpu: "1"
                memory: 2Gi
            securityContext:
              capabilities:
                add:
                  - SYS_PTRACE
            volumeMounts:
              - mountPath: /opt/hadoop/etc/hadoop
                name: core-site-volume
        imagePullSecrets:
          - name: emr-image-regsecret
        tolerations: []
        volumes:
          - configMap:
              defaultMode: 420
              name: ray-cluster-core-site
            name: core-site-volume
  workerGroupSpecs:
    - groupName: workergroup
      maxReplicas: 2147483647
      minReplicas: 0
      rayStartParams: {}
      replicas: 1
      template:
        metadata:
          annotations:
            vci.vke.volcengine.com/preferred-instance-family: vci.n3i
            prometheus.io/path: /metrics
            prometheus.io/port: "8080"
            prometheus.io/scrape: "true"
          labels:
            app.kubernetes.io/managed-by: Helm
            app.kubernetes.io/name: kuberay
            helm.sh/chart: ray-cluster-1.0.0
        spec:
          affinity: {}
          containers:
            - env:
                - name: LOG_UPDATE_INTERVAL_S
                  value: '5'
                - name: VOLC_REGION
                  value: cn-beijing
                - name: EMR_TOS_BUCKET_TAG_ENABLED
                  value: "true"
              image: emr-vke-qa-cn-beijing.cr.volces.com/emr/ray:2.36.0-py3.11-ubuntu20.04-278
              imagePullPolicy: IfNotPresent
              name: ray-worker
              resources:
                limits:
                  cpu: "1"
                  memory: 1Gi
                requests:
                  cpu: "1"
                  memory: 1Gi
              securityContext:
                capabilities:
                  add:
                    - SYS_PTRACE
              volumeMounts:
                - mountPath: /opt/hadoop/etc/hadoop
                  name: core-site-volume
          imagePullSecrets:
            - name: emr-image-regsecret
          tolerations: []
          volumes:
            - configMap:
                defaultMode: 420
                name: ray-cluster-core-site
              name: core-site-volume
---
apiVersion: v1
data:
  core-site.xml: |
    <configuration>
      <property>
        <name>fs.AbstractFileSystem.tos.impl</name>
        <value>io.proton.tos.TOS</value>
      </property>
      <property>
        <name>fs.tos.impl</name>
        <value>io.proton.fs.RawFileSystem</value>
      </property>
      <property>
        <name>fs.tos.endpoint</name>
        <value>tos-cn-beijing.ivolces.com</value>
      </property>
      <property>
        <name>mapreduce.outputcommitter.factory.scheme.tos</name>
        <value>io.proton.commit.CommitterFactory</value>
      </property>
      <property>
        <name>fs.tos.credentials.provider</name>
        <value>io.proton.common.object.tos.auth.DefaultCredentialsProviderChain</value>
      </property>
    </configuration>
kind: ConfigMap
metadata:
  name: ray-cluster-core-site
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    meta.helm.sh/release-name: ingress-release-name
    meta.helm.sh/release-namespace: ingress-namespace
    nginx.ingress.kubernetes.io/rewrite-target: /$1
  labels:
    app.kubernetes.io/instance: ingress-release-name
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kuberay
    helm.sh/chart: ray-cluster-1.0.0
  name: ingress-release-name
  namespace: ingress-namespace
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - backend:
              service:
                name: ingress-release-name-head-svc
                port:
                  number: 8265
            path: /ingress-namespace/ingress-release-name/(.*)
            pathType: Exact
          - backend:
              service:
                name: ingress-release-name-head-svc
                port:
                  number: 8080
            path: /ingress-namespace/ingress-release-name-metrics/(.*)
            pathType: Exact
For more information about instance families, see: Pod Annotation Description (Container Service, Volcengine).
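In a RayCluster the annotation lives on the pod template of the head group and of every worker group, so it is easy to miss one when editing by hand. The following Python sketch shows one way to set it consistently. It is an illustration under stated assumptions: `set_ray_instance_family` is a hypothetical helper name, the manifest is assumed to be already loaded as a dictionary with the `headGroupSpec` and `workerGroupSpecs` layout used in the example above, and `vci.n3i` is the value from that example.

```python
import copy
import json

# Annotation key as it appears in the example RayCluster manifest.
VCI_FAMILY_ANNOTATION = "vci.vke.volcengine.com/preferred-instance-family"


def set_ray_instance_family(ray_cluster: dict, family: str) -> dict:
    """Return a copy of a RayCluster manifest with the VCI
    preferred-instance-family annotation set on the head pod template and on
    every worker group pod template.

    Hypothetical helper for illustration; existing annotations are preserved.
    """
    patched = copy.deepcopy(ray_cluster)
    spec = patched.get("spec") or {}

    # Collect the head group (if present) and all worker groups.
    groups = []
    if spec.get("headGroupSpec"):
        groups.append(spec["headGroupSpec"])
    groups.extend(spec.get("workerGroupSpecs") or [])

    for group in groups:
        metadata = group.setdefault("template", {}).setdefault("metadata", {})
        # Treat a missing or null annotations field as an empty mapping.
        annotations = metadata.get("annotations") or {}
        annotations[VCI_FAMILY_ANNOTATION] = family
        metadata["annotations"] = annotations
    return patched


if __name__ == "__main__":
    cluster = {
        "apiVersion": "ray.io/v1",
        "kind": "RayCluster",
        "metadata": {"name": "raycluster"},
        "spec": {
            "headGroupSpec": {
                "template": {
                    "metadata": {"annotations": {"prometheus.io/scrape": "true"}}
                }
            },
            "workerGroupSpecs": [
                {"groupName": "workergroup", "template": {"metadata": {}}}
            ],
        },
    }
    print(json.dumps(set_ray_instance_family(cluster, "vci.n3i"), indent=2))
```

Only the pod template annotations are touched; the ConfigMap and Ingress documents that accompany the RayCluster do not need this annotation.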