How To Use Kubernetes Client (kubectl) inside SAP Data Intelligence / Data Hub

There is a way to use the Kubernetes client (kubectl) from within a vflow graph in SAP Data Intelligence (known as Data Hub before version 3.0) to perform and schedule more sophisticated Kubernetes actions.

1. Generate the kubeconfig file

A couple of prerequisite steps, which require full Kubernetes admin rights (kubectl), are needed up front.

1.1 Service account, role and role binding

It’s good practice to generate a kubeconfig file with restricted authorisation; otherwise a mistake in a vflow graph could tear the whole cluster down. Let’s create a separate service account for this purpose.

Create a file vflow-kubectl-sa.yaml for the service account with the following content:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: vflow-kubectl
  namespace: $NAMESPACE

Replace $NAMESPACE with the actual namespace in which your DI instance is running.

Create a file vflow-kubectl-role.yaml for the role with the following content:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: vflow-kubectl-role
  namespace: $NAMESPACE
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods/log
  verbs:
  - get

Replace $NAMESPACE with the actual namespace in which your DI instance is running.

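This role allows kubectl to get, list, watch and describe pods (describe relies on the get and list verbs) and to read their logs, for all pods in the specified $NAMESPACE only. The metrics.k8s.io rule additionally allows reading pod metrics.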
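
To make the scope of the role more tangible, here is a minimal Python sketch of how such rules are matched against a request. It is a deliberate simplification: real RBAC evaluation also handles wildcards, resourceNames and non-resource URLs, which are left out. The names ROLE_RULES and is_allowed are made up for this illustration. The verb list omits describe, because describe is a kubectl subcommand built on get and list rather than an RBAC verb.

```python
# Simplified model of how a namespaced Role's rules are matched.
# ROLE_RULES mirrors the rules of vflow-kubectl-role; wildcards,
# resourceNames and non-resource URLs are intentionally not modelled.
ROLE_RULES = [
    {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]},
    {"apiGroups": ["metrics.k8s.io"], "resources": ["pods"], "verbs": ["get"]},
    {"apiGroups": [""], "resources": ["pods/log"], "verbs": ["get"]},
]


def is_allowed(rules, api_group, resource, verb):
    """Return True if any rule covers the (group, resource, verb) triple."""
    return any(
        api_group in rule["apiGroups"]
        and resource in rule["resources"]
        and verb in rule["verbs"]
        for rule in rules
    )


if __name__ == "__main__":
    # Listing pods is covered by the first rule, deleting them by none.
    print(is_allowed(ROLE_RULES, "", "pods", "list"))
    print(is_allowed(ROLE_RULES, "", "pods", "delete"))
```

Anything not matched by a rule is denied, which is why a graph using this service account cannot delete pods or read secrets.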

Create a file vflow-kubectl-rolebinding.yaml with the following content:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: vflow-kubectl
  namespace: $NAMESPACE
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: vflow-kubectl-role
subjects:
- kind: ServiceAccount
  name: vflow-kubectl
  namespace: $NAMESPACE

Replace $NAMESPACE with the actual namespace in which your DI instance is running.

This binds the role vflow-kubectl-role to the service account vflow-kubectl.
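
Instead of editing $NAMESPACE by hand in all three files, the placeholder could also be filled in programmatically. The following is only a sketch using Python's string.Template; the function name render_manifest and the namespace di-system are assumptions made for the example, not values from a real system.

```python
from string import Template

# The service account manifest from above, kept as a template string.
SERVICE_ACCOUNT_TEMPLATE = """\
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vflow-kubectl
  namespace: $NAMESPACE
"""


def render_manifest(template_text, namespace):
    """Replace the $NAMESPACE placeholder with the given namespace."""
    return Template(template_text).substitute(NAMESPACE=namespace)


if __name__ == "__main__":
    # "di-system" is a hypothetical namespace used only for illustration.
    print(render_manifest(SERVICE_ACCOUNT_TEMPLATE, "di-system"))
```

The same function works for the role and role binding files, since $NAMESPACE is the only placeholder they contain.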

Now create the objects with:

kubectl create -f vflow-kubectl-sa.yaml
kubectl create -f vflow-kubectl-role.yaml
kubectl create -f vflow-kubectl-rolebinding.yaml

 

1.2 Get the service account token and cluster data

Get the service account token:

kubectl describe serviceaccount vflow-kubectl -n $NAMESPACE | grep "Mountable secrets" | cut -d":" -f2 | xargs kubectl describe secret -n $NAMESPACE | grep "token:" | cut -d":" -f2 | tr -d " "
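
The last part of this pipeline (grep, cut, tr) simply picks the token: line out of the kubectl describe secret output and strips the whitespace. A rough Python equivalent is sketched below; extract_token is a hypothetical helper and the sample text is invented, not real cluster output. On clusters running Kubernetes 1.24 or later, token secrets are no longer created automatically for service accounts, so the pipeline above may return nothing there.

```python
def extract_token(describe_output):
    """Return the value of the 'token:' line with whitespace removed."""
    for line in describe_output.splitlines():
        if line.startswith("token:"):
            # Split only on the first colon and drop the padding spaces.
            return line.split(":", 1)[1].strip()
    raise ValueError("no 'token:' line found in the given text")


if __name__ == "__main__":
    # Invented sample in the general shape of `kubectl describe secret` output.
    sample = (
        "Name:  vflow-kubectl-token-abcde\n"
        "Type:  kubernetes.io/service-account-token\n"
        "\n"
        "Data\n"
        "====\n"
        "namespace:  7 bytes\n"
        "token:      eyJhbGciOi.example.payload\n"
    )
    print(extract_token(sample))
```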

 

Get the certificate authority data from the cluster:

kubectl config view --flatten --minify | grep certificate-authority-data | cut -d":" -f2 | tr -d " "

 

Get the API master server address (including the https:// prefix):

kubectl config view --flatten --minify | grep server | tr -d " " | cut -c 8-
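
As an alternative to grep and cut, both cluster values can be read from the JSON form of the same command (kubectl config view --flatten --minify -o json). The sketch below assumes that JSON text is already available as a string; extract_cluster_fields is a made-up helper name, and the sample values are placeholders.

```python
import json


def extract_cluster_fields(config_json):
    """Return (server, certificate-authority-data) of the first cluster entry."""
    config = json.loads(config_json)
    cluster = config["clusters"][0]["cluster"]
    return cluster["server"], cluster["certificate-authority-data"]


if __name__ == "__main__":
    # Placeholder document in the shape of a minified, flattened kubeconfig.
    sample = json.dumps({
        "clusters": [{
            "name": "example",
            "cluster": {
                "server": "https://example.invalid:6443",
                "certificate-authority-data": "BASE64DATA",
            },
        }]
    })
    server, ca_data = extract_cluster_fields(sample)
    print(server)
    print(ca_data)
```

Parsing the JSON avoids depending on the exact column position that cut -c 8- relies on.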

 

1.3 Create the config file

Create a file named config in your preferred text editor with the following content:

apiVersion: v1
kind: Config
users:
- name: vflow-kubectl
  user:
    token: <service account token>
clusters:
- cluster:
    certificate-authority-data: <certificate authority data>
    server: <api master server hostname>
  name: vflow-kubectl
contexts:
- context:
    cluster: vflow-kubectl
    user: vflow-kubectl
  name: vflow-kubectl
current-context: vflow-kubectl
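
The nesting matters in this file: the user, cluster and context entries have to be indented under their list items, otherwise kubectl cannot read the config. As a sketch, the file could also be generated from the three values collected in step 1.2. build_kubeconfig is a hypothetical helper, and the arguments in the example are placeholders.

```python
# Kubeconfig layout with the three values left as format fields.
KUBECONFIG_TEMPLATE = """\
apiVersion: v1
kind: Config
users:
- name: vflow-kubectl
  user:
    token: {token}
clusters:
- cluster:
    certificate-authority-data: {ca_data}
    server: {server}
  name: vflow-kubectl
contexts:
- context:
    cluster: vflow-kubectl
    user: vflow-kubectl
  name: vflow-kubectl
current-context: vflow-kubectl
"""


def build_kubeconfig(token, ca_data, server):
    """Fill the token, CA data and server address into the template."""
    return KUBECONFIG_TEMPLATE.format(token=token, ca_data=ca_data, server=server)


if __name__ == "__main__":
    # Placeholder values; use the output of the commands from step 1.2 instead.
    print(build_kubeconfig("TOKEN", "CADATA", "https://example.invalid:6443"))
```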

 

2. Upload the config file to your user workspace

Open the System Management app in your DI system, navigate to the Files tab and upload the config file to the /files folder.

3. Build the graph

Open the Modeler app, create a new graph, add the Py3 Data Generator operator to it and open its script section.

Delete all of the predefined text and paste the following:

# Import the python os module for connecting to the underlying operating system.
import os

# Download the latest Kubernetes client (kubectl), make it executable and prepare the .kube directory.
os.system("curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl && chmod +x kubectl && mkdir $HOME/.kube")

# Retrieve the token bearer using the DI vflow user's environment variables VSYSTEM_APP_ID and VSYSTEM_SECRET to read the config file for the kubectl and store it in the folder .kube.
# There has to be the config file in the user's workspace (path /files), otherwise the kubectl won't work!
os.system("TOKENB=$(curl -s --user $VSYSTEM_APP_ID:$VSYSTEM_SECRET -X GET http://vsystem-internal:8796/token/v2 | python -m json.tool | grep access_token | cut -d'\"' -f4) && curl -X GET -H \"Authorization:$TOKENB\" http://vsystem-internal:8796/repository/v2/files/user/files/config?op=read > $HOME/.kube/config")

With the above in place, the vflow container can use the most recent version of kubectl, with the config file stored in the vflow user’s home folder.
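
The second os.system call packs two HTTP requests into one shell line: it first exchanges VSYSTEM_APP_ID and VSYSTEM_SECRET for an access token and then uses that token to read the config file from the user workspace. The sketch below spells out the same two steps with the Python standard library. It reuses the host, port and paths from the script above, but the helper names are made up, and download_config has not been verified against a live DI system.

```python
import base64
import json
import os
import urllib.request

# Host, port and paths are taken from the curl commands in the script above.
VSYSTEM = "http://vsystem-internal:8796"


def basic_auth_header(app_id, secret):
    """Build the header value that `curl --user id:secret` would send."""
    raw = "{}:{}".format(app_id, secret).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def extract_access_token(body):
    """Return the access_token field of the JSON token response."""
    return json.loads(body)["access_token"]


def download_config(target_path):
    """Fetch the token, then store the workspace file 'config' at target_path."""
    auth = basic_auth_header(os.environ["VSYSTEM_APP_ID"], os.environ["VSYSTEM_SECRET"])
    token_req = urllib.request.Request(VSYSTEM + "/token/v2", headers={"Authorization": auth})
    with urllib.request.urlopen(token_req) as resp:
        token = extract_access_token(resp.read().decode("utf-8"))
    # As in the curl call, the token is passed unchanged as the Authorization value.
    file_req = urllib.request.Request(
        VSYSTEM + "/repository/v2/files/user/files/config?op=read",
        headers={"Authorization": token},
    )
    with urllib.request.urlopen(file_req) as resp, open(target_path, "wb") as out:
        out.write(resp.read())
```

Parsing the response with json.loads replaces the python -m json.tool, grep and cut chain, which depends on how the JSON happens to be formatted.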

You can then, for example, add the following two lines to retrieve all the pods from your $NAMESPACE:

os.system("./kubectl get pods -n $NAMESPACE > allpods.txt")
api.send("output", "echo && cat allpods.txt")
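
os.system only returns an exit status, which is why the example redirects the kubectl output to a file and lets the Command Executor print it. If you would rather capture the output and detect failures inside the Python operator, subprocess.run is one option. This is a generic sketch, not part of the tested graph; run_command is a hypothetical helper.

```python
import subprocess
import sys


def run_command(args):
    """Run a command without a shell and return its standard output as text.

    Raises subprocess.CalledProcessError if the command exits with a non-zero status.
    """
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Harmless demo command; inside the graph the argument list could be
    # ["./kubectl", "get", "pods", "-n", "<your namespace>"] instead.
    print(run_command([sys.executable, "-c", "print('hello')"]))
```

Passing the arguments as a list also avoids shell quoting problems when a namespace or pod name is inserted into the command.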

The output is sent to a Command Executor operator and from there to a Terminal operator.


See the whole JSON of the above graph here:

{
	"description": "Get All Pods",
	"processes": {
		"py3datagenerator1": {
			"component": "com.sap.util.datageneratorpy3",
			"metadata": {
				"label": "Py3 Data Generator",
				"x": 12,
				"y": 42,
				"height": 80,
				"width": 120,
				"extensible": true,
				"config": {
					"script": "# Import the python os module for connecting to the underlying operating system.\nimport os\n\n# Download the latest Kubernetes client (kubectl), make it executable and prepare the .kube directory.\nos.system(\"curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl && chmod +x kubectl && mkdir $HOME/.kube\")\n\n# Retrieve the token bearer using the DI vflow user's environment variables VSYSTEM_APP_ID and VSYSTEM_SECRET to read the config file for the kubectl and store it in the folder .kube.\n# There has to be the config file in the user's workspace (path /files), otherwise the kubectl won't work!\nos.system(\"TOKENB=$(curl -s --user $VSYSTEM_APP_ID:$VSYSTEM_SECRET -X GET http://vsystem-internal:8796/token/v2 | python -m json.tool | grep access_token | cut -d'\\\"' -f4) && curl -X GET -H \\\"Authorization:$TOKENB\\\" http://vsystem-internal:8796/repository/v2/files/user/files/config?op=read > $HOME/.kube/config\")\n\nos.system(\"./kubectl get pods -n $NAMESPACE > allpods.txt\")\napi.send(\"output\", \"echo && cat allpods.txt\")"
				}
			}
		},
		"commandexecutor1": {
			"component": "com.sap.system.commandExecutor",
			"metadata": {
				"label": "Command Executor",
				"x": 280,
				"y": 42,
				"height": 80,
				"width": 120,
				"config": {
					"cmdLine": "/bin/sh"
				}
			}
		},
		"toblobconverter1": {
			"component": "com.sap.util.toBlobConverter",
			"metadata": {
				"label": "ToBlob Converter",
				"x": 181,
				"y": 57,
				"height": 50,
				"width": 50,
				"config": {}
			}
		},
		"terminal1": {
			"component": "com.sap.util.terminal",
			"metadata": {
				"label": "Terminal",
				"x": 579.9999980926514,
				"y": 42,
				"height": 80,
				"width": 120,
				"ui": "dynpath",
				"config": {
					"maxSize": 100000
				}
			}
		},
		"tostringconverter1": {
			"component": "com.sap.util.toStringConverter",
			"metadata": {
				"label": "ToString Converter",
				"x": 464.9999990463257,
				"y": 12,
				"height": 50,
				"width": 50,
				"config": {}
			}
		},
		"tostringconverter2": {
			"component": "com.sap.util.toStringConverter",
			"metadata": {
				"label": "ToString Converter",
				"x": 464.9999990463257,
				"y": 102,
				"height": 50,
				"width": 50,
				"config": {}
			}
		}
	},
	"groups": [],
	"connections": [
		{
			"metadata": {
				"points": "136,82 176,82"
			},
			"src": {
				"port": "output",
				"process": "py3datagenerator1"
			},
			"tgt": {
				"port": "ininterface",
				"process": "toblobconverter1"
			}
		},
		{
			"metadata": {
				"points": "235,82 275,82"
			},
			"src": {
				"port": "outbytearray",
				"process": "toblobconverter1"
			},
			"tgt": {
				"port": "stdin",
				"process": "commandexecutor1"
			}
		},
		{
			"metadata": {
				"points": "518.9999990463257,37 546.9999985694885,37 546.9999985694885,82 574.9999980926514,82"
			},
			"src": {
				"port": "outstring",
				"process": "tostringconverter1"
			},
			"tgt": {
				"port": "in1",
				"process": "terminal1"
			}
		},
		{
			"metadata": {
				"points": "518.9999990463257,127 546.9999985694885,127 546.9999985694885,82 574.9999980926514,82"
			},
			"src": {
				"port": "outstring",
				"process": "tostringconverter2"
			},
			"tgt": {
				"port": "in1",
				"process": "terminal1"
			}
		},
		{
			"metadata": {
				"points": "404,73 432,73 432,28 459.9999990463257,28"
			},
			"src": {
				"port": "stdout",
				"process": "commandexecutor1"
			},
			"tgt": {
				"port": "ininterface",
				"process": "tostringconverter1"
			}
		},
		{
			"metadata": {
				"points": "404,91 432,91 432,118 459.9999990463257,118"
			},
			"src": {
				"port": "stderr",
				"process": "commandexecutor1"
			},
			"tgt": {
				"port": "ininterface",
				"process": "tostringconverter2"
			}
		}
	],
	"inports": {},
	"outports": {},
	"properties": {}
}
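
The connections array is the part of this JSON that wires the operators together. To check the wiring without opening the Modeler, a small sketch like the following could list each connection as source port to target port. list_connections is a hypothetical helper, and the example graph contains only the first connection from the JSON above.

```python
def list_connections(graph):
    """Return 'process.port -> process.port' strings for every connection."""
    return [
        "{}.{} -> {}.{}".format(
            conn["src"]["process"], conn["src"]["port"],
            conn["tgt"]["process"], conn["tgt"]["port"],
        )
        for conn in graph.get("connections", [])
    ]


if __name__ == "__main__":
    # Reduced example; load the full graph with json.load() to list everything.
    graph = {
        "connections": [
            {
                "src": {"port": "output", "process": "py3datagenerator1"},
                "tgt": {"port": "ininterface", "process": "toblobconverter1"},
            }
        ]
    }
    for line in list_connections(graph):
        print(line)
```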

Tested with the following versions:

SAP Data Hub 2.7 patch 3
SAP Data Intelligence 3.0 patch 1

Original Article:
https://blogs.sap.com/2020/07/23/how-to-use-kubernetes-client-kubectl-inside-sap-data-intelligence-data-hub/
