Deploying spark-operator in VCI
Last updated: 2024.01.09 10:16:06. First published: 2024.01.09 10:16:06.

This article describes how to deploy spark-operator in VCI.

Preface

Use the Helm CLI to deploy spark-operator in VCI, then run a job.

About this lab

Estimated time: 30 minutes
Level: Beginner
Related product: VKE
Audience: General

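Lab notes

If you do not have a Volcengine account yet, click this [link] to register one.
If you do not have a VCI cluster yet, refer to this link to quickly create one.
Prepare a Volcengine Basic Edition container registry; refer to this link.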

Step 1: Environment

Kubernetes version: v1.20.15-vke.5
1. Install the Helm CLI
Refer to the official Helm website.
2. Add the official spark-operator repo

$ helm repo add spark-operator https://googlecloudplatform.github.io/spark-on-k8s-operator

3. Download the chart package and extract it

$ helm pull spark-operator/spark-operator
$ tar xf spark-operator-1.1.26.tgz
$ ls -l
total 36
drwxr-xr-x 4 root root  4096 Nov  9 10:53 spark-operator
-rw-r--r-- 1 root root 28828 Nov  9 10:52 spark-operator-1.1.26.tgz

4. Pull the spark-operator image and push it to the container registry

$ docker pull ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.8-3.1.1
$ docker tag ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.8-3.1.1 cr-cn-beijing.volces.com/spark-operator/spark-operator:v1beta2-1.3.8-3.1.1
$ docker login --username=dongxiaogang@2100175284 cr-cn-beijing.volces.com
$ docker push cr-cn-beijing.volces.com/spark-operator/spark-operator:v1beta2-1.3.8-3.1.1
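
The image reference is typed several times in the commands above, so one small way to avoid a mismatched tag is to derive the target name from the source name. The following is only a sketch: `retarget_image`, `SRC` and `DST` are hypothetical names introduced here, not part of the original procedure.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original steps): map a source image
# reference onto a target registry/namespace, keeping the final "name:tag".
retarget_image() {
  src_image="$1"
  target_prefix="$2"
  # ${src_image##*/} strips everything up to the last "/", leaving "name:tag"
  printf '%s/%s\n' "$target_prefix" "${src_image##*/}"
}

SRC=ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.8-3.1.1
DST=$(retarget_image "$SRC" cr-cn-beijing.volces.com/spark-operator)
echo "$DST"

# The docker steps could then reuse the two variables:
#   docker pull "$SRC" && docker tag "$SRC" "$DST" && docker push "$DST"
```

With the values used in this lab, `DST` resolves to the same registry path that is pushed in the last command of this step.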

5. Modify the spark-operator values.yaml file

$ cat values.yaml |grep repository
  # -- Image repository
  # repository: ghcr.io/googlecloudplatform/spark-operator
  # Changed to the Volcengine container registry address
  repository: cr-cn-beijing.volces.com/spark-operator/spark-operator
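
If the edit is to be scripted rather than done in an editor, a `sed` one-liner is one option. This is a minimal sketch: the function name `set_image_repository` is hypothetical, and it assumes values.yaml contains exactly one uncommented `repository:` key, as the grep output above suggests.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the uncommented "repository:" line in a
# values.yaml, preserving its indentation. A backup is kept as <file>.bak.
set_image_repository() {
  values_file="$1"
  new_repo="$2"
  # "#" is used as the sed delimiter because the repository path contains "/"
  sed -i.bak -E "s#^([[:space:]]*)repository:.*#\1repository: ${new_repo}#" "$values_file"
}

# Usage, run from the unpacked chart directory:
#   set_image_repository values.yaml cr-cn-beijing.volces.com/spark-operator/spark-operator
```

Commented-out lines such as `# repository: ...` are left untouched because the pattern only allows whitespace before the key.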

6. Modify the spark-operator deployment.yaml file

$ cat templates/deployment.yaml |grep -A5 -B2 vci
    {{- if or .Values.podAnnotations .Values.metrics.enable }}
      annotations:
        vke.volcengine.com/burst-to-vci: enforce   # force the pod to run on VCI
      {{- if .Values.metrics.enable }}
        prometheus.io/scrape: "true"
        prometheus.io/port: "{{ .Values.metrics.port }}"
        prometheus.io/path: {{ .Values.metrics.endpoint }}
      {{- end }}
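
Because the annotation sits inside the `if or .Values.podAnnotations .Values.metrics.enable` block, it is only rendered when that condition holds. One way to check this before installing is to render the chart locally and search the output. The helper name `has_vci_annotation` below is hypothetical; it only inspects text on stdin.

```shell
#!/bin/sh
# Hypothetical helper: read rendered manifests on stdin and succeed only
# if the VCI annotation with the value "enforce" is present.
has_vci_annotation() {
  grep -q 'vke\.volcengine\.com/burst-to-vci:[[:space:]]*enforce'
}

# Usage, run from the chart directory (requires helm):
#   helm template spark-operator . -n spark-operator | has_vci_annotation \
#     && echo "annotation rendered" || echo "annotation missing"
```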

7. Deploy spark-operator

$ kubectl create ns spark-operator
$ pwd
/root/spark-operator
$ helm install -n spark-operator spark-operator .
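
As an alternative to editing values.yaml and templates/deployment.yaml by hand (steps 5 and 6), the same two settings could be kept in a separate override file. This is a sketch under assumptions: the file name `vci-values.yaml` is hypothetical, and it assumes the chart reads `image.repository` and renders `podAnnotations` onto the operator pod, which the template excerpt in step 6 suggests but does not fully show.

```shell
#!/bin/sh
# Write a values override file (hypothetical name) carrying the two
# changes from steps 5 and 6. The quoted 'EOF' prevents shell expansion.
cat > vci-values.yaml <<'EOF'
image:
  # Volcengine container registry address from step 4
  repository: cr-cn-beijing.volces.com/spark-operator/spark-operator
podAnnotations:
  # Assumption: the chart copies podAnnotations onto the operator pod
  vke.volcengine.com/burst-to-vci: enforce
EOF
```

The file would then be passed at install time with `helm install -n spark-operator spark-operator . -f vci-values.yaml`, which leaves the unpacked chart unmodified.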

8. Verify that the deployment succeeded

$ helm list -n spark-operator
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION        
spark-operator  spark-operator  1               2022-11-09 11:29:47.44509618 +0800 CST  deployed        spark-operator-1.1.26   v1beta2-1.3.8-3.1.1
$ kubectl  get pod -n spark-operator
NAME                             READY   STATUS    RESTARTS   AGE
spark-operator-d47b949c9-2dqzd   1/1     Running   0          2m7s
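
The check above is done by eye. For a scripted check, the table printed by `kubectl get pod` can be parsed. The helper name `pods_all_ready` is hypothetical, and the sketch assumes the default column layout shown above (NAME, READY, STATUS, ...).

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pod` table output on stdin and
# succeed only if at least one pod is listed and every pod is Running
# with all of its containers ready (READY column of the form n/n).
pods_all_ready() {
  awk 'NR > 1 {
         split($2, ready, "/")
         if ($3 != "Running" || ready[1] != ready[2]) bad = 1
         seen = 1
       }
       END { if (seen && !bad) exit 0; exit 1 }'
}

# Usage:
#   kubectl get pod -n spark-operator | pods_all_ready && echo "operator is ready"
```

`kubectl rollout status deployment/spark-operator -n spark-operator` is another option, assuming the Deployment is named `spark-operator` as the pod name prefix above suggests.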

9. Run a test program

$ cat test-spark-operaotr.yaml 
# Copyright 2017 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: spark-operator
spec:
  type: Scala
  mode: cluster
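  #image: "cr-share-cn-shanghai.cr.volces.com/spark/spark-operator:v3.1.1" # official address replaced with a Volcengine container registry address
  image: "cr-cn-beijing.volces.com/spark-operator/spark-operator:v1beta2-1.3.8-3.1.1" # official address replaced with the Volcengine container registry address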
  imagePullPolicy: Always
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar"
  sparkVersion: "3.1.1"
  restartPolicy:
    type: Never
  volumes:
    - name: "test-volume"
      emptyDir: {}
  driver:
    annotations:
      vke.volcengine.com/burst-to-vci: enforce  # force the pod to run on VCI
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    labels:
      version: 3.1.1
    serviceAccount: spark-operator
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"
  executor:
    annotations:
      vke.volcengine.com/burst-to-vci: enforce  # force the pod to run on VCI
    cores: 1
    instances: 1
    memory: "512m"
    labels:
      version: 3.1.1
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"

$ kubectl apply -f test-spark-operaotr.yaml
$ kubectl get pod -n spark-operator 
NAME                             READY   STATUS    RESTARTS   AGE
spark-operator-d47b949c9-2dqzd   1/1     Running   0          65m
spark-pi-driver                  1/1     Running   0          63s
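
A running driver pod shows that the job started, not that it finished correctly. The SparkPi example prints a line beginning with "Pi is roughly" to the driver log, so the result can be pulled out of the log once the job completes. The helper name `extract_pi` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read driver logs on stdin and print the value that
# SparkPi reports on its "Pi is roughly <value>" line.
extract_pi() {
  grep -o 'Pi is roughly [0-9.]*' | awk '{ print $4 }'
}

# Usage, after the driver has finished:
#   kubectl logs spark-pi-driver -n spark-operator | extract_pi
# The application state can also be inspected through the custom resource:
#   kubectl get sparkapplication spark-pi -n spark-operator
```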

References

[1] https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/tree/master/charts/spark-operator-chart