Tuesday, 21 July 2020

Kubernetes: Jobs

By default, a Pod is expected to run continuously. Its restartPolicy defaults to Always, so when a container exits, Kubernetes restarts it, and controllers such as Deployments replace a terminated Pod with a new instance built from the same image.


What if you want to run a Pod exactly once?

What if you want the work to stop once a specified number of Pods have exited successfully?

You can achieve this via Jobs.

 

In Kubernetes, a Job creates one or more Pods and ensures that a specified number of them successfully terminate.

 

A Job tracks the successful completions of its Pods. Once the specified number of successful completions is reached, the Job is complete.

 

When a Job is deleted, all the Pods created by this job also get deleted.

 

Let’s create a simple Job that makes sure one Pod completes successfully. If the Pod fails or is deleted before it completes successfully, the Job starts a new Pod.

 

simpleSleepJob.yml

apiVersion: batch/v1
kind: Job
metadata:
  name: sleep-job-1-pod
  labels:
    app: sleep-job-1
    author: krishna
    serviceType: terminal-app
spec:
  template:
    spec:
      containers:
        - name: sleep-job-container
          image: busybox
          command: ["/bin/sleep"]
          args: ["20"]
      restartPolicy: Never

Let’s create a Job with the above definition.

$kubectl create -f simpleSleepJob.yml 
job.batch/sleep-job-1-pod created

Let’s query for Jobs.

$kubectl get jobs
NAME              COMPLETIONS   DURATION   AGE
sleep-job-1-pod   0/1           6s         6s

In the output, 0/1 means that 0 of the 1 required completions has been reached, so the Job is not yet complete.

Query the Pods after some time.

$kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
sleep-job-1-pod-jtnfz   1/1     Running   0          11s

In the output, the Pod started 11 seconds ago and is still in the Running state. In the READY column, 1/1 means that 1 of the Pod’s 1 containers is ready.

 

Wait for some more time and query Pods again.

$kubectl get pods
NAME                    READY   STATUS      RESTARTS   AGE
sleep-job-1-pod-jtnfz   0/1     Completed   0          28s

In the output, the Pod has completed. READY is now 0/1 because the Pod’s only container has exited and is no longer ready.

 

Let’s query for jobs again.

$kubectl get jobs
NAME              COMPLETIONS   DURATION   AGE
sleep-job-1-pod   1/1           27s        38s

The Job is complete because its Pod exited successfully. A completed Pod is not restarted.

 

spec.template.spec.restartPolicy

For the Pod template of a Job, ‘restartPolicy’ has two allowed values: ‘Never’ and ‘OnFailure’. The default Pod value, ‘Always’, is not allowed.

If you set ‘restartPolicy’ to ‘OnFailure’, the failed container is restarted inside the same Pod.

If you set ‘restartPolicy’ to ‘Never’, the failed Pod is left as it is and the Job creates a new Pod to retry the work.
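As a rough illustration of the ‘OnFailure’ case, the following sketch defines a Job whose container always fails, so the restarts in the same Pod can be observed in the RESTARTS column. The Job name, the container name, and the failing command are assumptions made for this example. ‘backoffLimit’ is the Job field that limits the number of retries before the Job is marked as failed.

```yaml
# Sketch only: names and the failing command are illustrative assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: retry-job-sketch            # hypothetical Job name
spec:
  backoffLimit: 3                   # stop retrying after 3 retries (default is 6)
  template:
    spec:
      containers:
        - name: retry-job-container # hypothetical container name
          image: busybox
          # Exits with a non-zero code so that the retry behaviour is visible.
          command: ["/bin/sh", "-c", "echo 'simulated failure'; exit 1"]
      restartPolicy: OnFailure      # restart the container inside the same Pod
```

If you change ‘restartPolicy’ to ‘Never’ in this sketch, you should instead see a new Pod for each retry.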

 

Demo 2: Let’s delete the Pod and confirm that the Job starts a new Pod.

 

sleepJob2.yml

apiVersion: batch/v1
kind: Job
metadata:
  name: sleep-job-2-pod
  labels:
    app: sleep-job-2
    author: krishna
    serviceType: terminal-app
spec:
  template:
    spec:
      containers:
        - name: sleep-job-container
          image: busybox
          command: ["/bin/sleep"]
          args: ["20"]
      restartPolicy: Never

Create a Job using the above definition file.

$kubectl create -f sleepJob2.yml 
job.batch/sleep-job-2-pod created

Let’s query jobs.

$kubectl get jobs
NAME              COMPLETIONS   DURATION   AGE
sleep-job-1-pod   1/1           27s        41m
sleep-job-2-pod   0/1           8s         8s

Let’s query Pods.

$kubectl get pods
NAME                    READY   STATUS      RESTARTS   AGE
sleep-job-1-pod-jtnfz   0/1     Completed   0          41m
sleep-job-2-pod-4s4jw   1/1     Running     0          12s

Let me delete the Pod sleep-job-2-pod-4s4jw.

$kubectl delete pod sleep-job-2-pod-4s4jw
pod "sleep-job-2-pod-4s4jw" deleted

The Pod of the second Job is deleted. Let’s query the Pods and confirm that a new Pod has started.

$kubectl get pods
NAME                    READY   STATUS      RESTARTS   AGE
sleep-job-1-pod-jtnfz   0/1     Completed   0          42m
sleep-job-2-pod-db9fw   1/1     Running     0          16s

Wait for some time and query the Jobs. You can confirm that the Job ‘sleep-job-2-pod’ has completed.

$kubectl get jobs
NAME              COMPLETIONS   DURATION   AGE
sleep-job-1-pod   1/1           27s        43m
sleep-job-2-pod   1/1           28s        88s

Different Types Of Jobs

There are three types of tasks that are suitable to run as a Job.

 

a.   Non-Parallel Jobs: Only one Pod is started, unless the Pod fails. The Job is complete as soon as its Pod terminates successfully.

b.   Parallel Jobs with a fixed completion count: The Job is complete when there is one successful Pod for each value in the range 1 to .spec.completions. For a fixed completion count Job, set .spec.completions to the number of completions needed. You can set .spec.parallelism, or leave it unset, in which case it defaults to 1.

c.   Parallel Jobs with a work queue: Leave .spec.completions unset and set .spec.parallelism to the number of worker Pods. Multiple Pods are started, and they coordinate among themselves or through an external service to decide what each one works on. Once any Pod terminates successfully, no new Pods are created, and the Job is complete when at least one Pod has succeeded and all Pods have terminated.
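The work queue type is the only one of the three that is not demonstrated in this post, so here is a minimal sketch of what such a Job could look like. The Job name and the worker image are hypothetical. The image is assumed to contain a program that pulls items from some external queue and exits successfully when the queue is empty.

```yaml
# Sketch only: the worker image and all names are hypothetical.
apiVersion: batch/v1
kind: Job
metadata:
  name: queue-worker-job-sketch     # hypothetical Job name
spec:
  parallelism: 3                    # run three worker Pods at the same time
  # .spec.completions is intentionally left unset for a work queue Job.
  template:
    spec:
      containers:
        - name: queue-worker
          # Hypothetical image: assumed to read tasks from an external queue
          # and exit with code 0 once no work is left.
          image: example.com/queue-worker:latest
      restartPolicy: OnFailure
```

Because .spec.completions is unset, the workers themselves decide when the work is done.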

 

Let’s delete all the previous Jobs and Pods to start from a clean state.

$kubectl delete job sleep-job-1-pod
job.batch "sleep-job-1-pod" deleted
$
$kubectl delete job sleep-job-2-pod
job.batch "sleep-job-2-pod" deleted
$
$
$kubectl get pods
No resources found in default namespace.

Now the namespace is clean. Let’s create a Job with completions 10 and parallelism 2.

 

sleepParallelJob.yaml

apiVersion: batch/v1
kind: Job
metadata:
  name: sleep-parallel-job-pod
  labels:
    app: sleep-parallel-job
    author: krishna
    serviceType: terminal-app
spec:
  completions: 10
  parallelism: 2
  template:
    spec:
      containers:
        - name: sleep-parallel-job-container
          image: busybox
          command: ["/bin/sleep"]
          args: ["30"]
      restartPolicy: Never

Create a Job using the above definition file.

$kubectl create -f sleepParallelJob.yaml 
job.batch/sleep-parallel-job-pod created

Query for the Jobs.

$kubectl get jobs
NAME                     COMPLETIONS   DURATION   AGE
sleep-parallel-job-pod   0/10          4s         4s

You can confirm that 0 out of 10 completions are finished.

Since parallelism is set to 2, at most two Pods run at a time. You can confirm this by querying the Pods.

$kubectl get pods
NAME                           READY   STATUS              RESTARTS   AGE
sleep-parallel-job-pod-ppskr   0/1     ContainerCreating   0          8s
sleep-parallel-job-pod-xntp9   1/1     Running             0          8s

Wait for some time and query again. Once these two Pods finish, two more Pods are started.

$kubectl get pods
NAME                           READY   STATUS              RESTARTS   AGE
sleep-parallel-job-pod-96z6d   0/1     ContainerCreating   0          0s
sleep-parallel-job-pod-bcncd   0/1     ContainerCreating   0          7s
sleep-parallel-job-pod-ppskr   0/1     Completed           0          43s
sleep-parallel-job-pod-xntp9   0/1     Completed           0          43s

Once all 10 Pods have run successfully, the Job is complete.

$kubectl get pods
NAME                           READY   STATUS      RESTARTS   AGE
sleep-parallel-job-pod-24t8j   0/1     Completed   0          88s
sleep-parallel-job-pod-2vrjj   0/1     Completed   0          80s
sleep-parallel-job-pod-5dkdt   0/1     Completed   0          42s
sleep-parallel-job-pod-96z6d   0/1     Completed   0          2m35s
sleep-parallel-job-pod-bcncd   0/1     Completed   0          2m42s
sleep-parallel-job-pod-fh4j5   0/1     Completed   0          117s
sleep-parallel-job-pod-kz66s   0/1     Completed   0          2m4s
sleep-parallel-job-pod-ppskr   0/1     Completed   0          3m18s
sleep-parallel-job-pod-xntp9   0/1     Completed   0          3m18s
sleep-parallel-job-pod-zrdbb   0/1     Completed   0          50s
$
$kubectl get jobs
NAME                     COMPLETIONS   DURATION   AGE
sleep-parallel-job-pod   10/10         3m12s      3m21s

Once you delete the Job ‘sleep-parallel-job-pod’, all the Pods created by this Job are also deleted.

$kubectl delete job sleep-parallel-job-pod
job.batch "sleep-parallel-job-pod" deleted
$
$kubectl get jobs
No resources found in default namespace.
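Finished Jobs and their Pods stay around until you delete them, as shown above. As a possible alternative to manual deletion, the sketch below sets ‘ttlSecondsAfterFinished’ on the Job. This assumes a cluster recent enough to support the field (it became stable in Kubernetes 1.23). The Job name is hypothetical, and the rest of the definition mirrors the sleep Jobs used earlier.

```yaml
# Sketch only: assumes the cluster supports ttlSecondsAfterFinished.
apiVersion: batch/v1
kind: Job
metadata:
  name: sleep-job-ttl-sketch        # hypothetical Job name
spec:
  ttlSecondsAfterFinished: 120      # delete the Job and its Pods 120s after it finishes
  template:
    spec:
      containers:
        - name: sleep-job-container
          image: busybox
          command: ["/bin/sleep"]
          args: ["20"]
      restartPolicy: Never
```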
