Sometimes you want a Pod to execute some work and then stop. You could deploy a plain Pod spec, but that gives you limited retry support if the work fails; and you can't use a Deployment, because it will replace the Pod even when it exits successfully.
Jobs are for this use-case - they're a Pod controller which creates a Pod and ensures it runs to completion. If the Pod fails the Job will start a replacement, but when the Pod succeeds the Job is done.
Jobs can have their own controller too: a CronJob contains a Job spec and a schedule. On the schedule it creates a Job, which in turn creates and monitors a Pod.
The simplest Job spec just has metadata and a template with a standard Pod spec:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-job
spec:
  template:
    spec:
      containers:
        - # container spec
      restartPolicy: Never
```
- `template.spec` - a Pod spec, which can include volumes, configuration and everything else in a normal Pod
- `restartPolicy` - the default Pod restart policy is `Always`, which is not allowed for Jobs; you must specify `Never` or `OnFailure`
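Here's a complete, runnable illustration - a minimal sketch using the public busybox image, where the name and command are just examples (not this lab's Pi spec):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: sleep-job
spec:
  template:
    spec:
      containers:
        - name: sleep
          image: busybox
          # prints a line, waits, then exits with code 0 so the Job completes
          command: ["sh", "-c", "echo working && sleep 5 && echo done"]
      restartPolicy: Never
```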
CronJobs wrap the Job spec, adding a schedule in the form of a *nix cron expression:
```yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: db-backup
spec:
  schedule: "0 9 * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    # job spec
```
- `apiVersion` - Kubernetes uses beta versions to indicate the API isn't final; CronJobs will graduate to stable in Kubernetes 1.21
- `schedule` - a cron expression for when Jobs are to be created
- `concurrencyPolicy` - whether to `Allow` new Job(s) to be created while the previous scheduled Job is still running, `Forbid` that, or `Replace` the old Job with a new one

We have a website we can use to calculate Pi, but the app can also run a one-off calculation:
Create the Job:
```
kubectl apply -f labs/jobs/specs/pi/one

kubectl get jobs
```
Jobs apply a label to Pods they create (in addition to any labels in the template Pod spec).
📋 Use the Job's label to get the Pod details and show its logs.
```
kubectl get pods --show-labels

kubectl get pods -l job-name=pi-job-one

kubectl logs -l job-name=pi-job-one
```
You'll see Pi computed. That's the only output from this Pod.
When Jobs have completed they are not automatically cleaned up:
```
kubectl get jobs
```
The Job shows 1/1 completions - which means 1 Pod ran successfully
You can't update the Pod spec for an existing Job; Jobs don't manage Pod upgrades the way Deployments do.
Try to change the existing Job and you'll get an error:
```
kubectl apply -f labs/jobs/specs/pi/one/update
```
To change a Job you would first need to delete the old one
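The flow looks like this, using the Job name from this lab and the updated spec folder above:

```
# remove the old Job (this also removes its Pods):
kubectl delete job pi-job-one

# then the updated spec can be deployed as a new Job:
kubectl apply -f labs/jobs/specs/pi/one/update
```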
Jobs aren't just for a single task; in some scenarios you want the same Pod to run a fixed number of times. When you have a fixed set of work to process, you can use a Job to run all the pieces in parallel, as sketched below.
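The relevant fields are standard parts of the Job spec - the `parallelism` value here is an assumption, so check the lab's spec for its actual settings:

```yaml
spec:
  completions: 3    # the Job needs 3 Pods to run to successful completion
  parallelism: 3    # assumed value - up to 3 Pods may run at the same time
  template:
    # Pod spec as usual
```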
Run the random Pi Job:
```
kubectl apply -f labs/jobs/specs/pi/many

kubectl get jobs -l app=pi-many
```
You'll see one Job, with 3 expected completions
📋 Check the Pod status and logs for this Job.
```
kubectl get pods -l job-name=pi-job-many

kubectl logs -l job-name=pi-job-many
```
You'll get logs for all the Pods - pages of Pi :)
The Job has details of all the Pods it creates:
```
kubectl describe job pi-job-many
```
Shows Pod creation events and Pod statuses
Jobs are not automatically cleaned up, so you can work with the Pods and see the logs.
Periodically running a cleanup task is one scenario where you use a CronJob:
The CronJob is set to run every minute so you'll soon see it get to work.
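An every-minute schedule in a CronJob spec looks like this - just the relevant field, not the full lab spec:

```yaml
spec:
  schedule: "* * * * *"   # a cron expression meaning every minute
```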
📋 Deploy the CronJob and watch all Jobs to see them being removed.
```
kubectl apply -f labs/jobs/specs/cleanup

kubectl get cronjob

kubectl get jobs --watch
```
You'll see the cleanup Job get created, and then the list will be updated with a new cleanup Job every minute
Confirm that completed Pi Jobs and their Pods have been removed:
```
# Ctrl-C to exit the watch
kubectl get jobs

kubectl get pods -l job-name --show-labels
```
The most recent cleanup job is still there because CronJobs don't delete Jobs when they complete
You can check the logs to see what the cleanup script did:
```
kubectl logs -l app=job-cleanup
```
Real CronJobs don't run every minute - they're used for maintenance tasks and run much less often, like hourly, daily or weekly.
Often you want to run a one-off Job from a CronJob without waiting for the next one to be created on schedule.
The first task for this lab is to edit the job-cleanup CronJob and set it to be suspended, so it won't run any more Jobs and confuse you when you create your new Job. See if you can do this without using `kubectl apply`.
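One possible approach is to patch the live object, since `suspend` is a standard CronJob field - a sketch, assuming the CronJob is named job-cleanup (`kubectl edit` would work too):

```
kubectl patch cronjob job-cleanup -p '{"spec":{"suspend":true}}'
```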
Then deploy this new CronJob:
```
kubectl apply -f labs/jobs/specs/backup
```
And the next task is to run a Job from this CronJob's spec. See if you can also do this without using `kubectl apply`.
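As a hint, `kubectl create job` can build a one-off Job straight from a CronJob's template; the Job name here is just an example, and this assumes the CronJob is named db-backup:

```
kubectl create job db-backup-manual --from=cronjob/db-backup
```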
Background tasks in Jobs could run for a long time, and you need some control over how failures are handled. The first option is to allow Pod restarts, so if the container fails then a new container is started in the same Pod.
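In the Job's Pod template that behaviour comes from the restart policy - a sketch of the relevant field, assuming the lab's failing spec uses it:

```yaml
spec:
  template:
    spec:
      restartPolicy: OnFailure   # failed containers are restarted in the same Pod
```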
Try this Job:
```
kubectl apply -f labs/jobs/specs/pi/one-failing

kubectl get jobs pi-job-one-failing
```
The Pod is created but the container will immediately exit, causing the Pod to restart:
```
kubectl get pods -l job-name=pi-job-one-failing --watch
```
You'll see `RunContainerError` statuses and multiple restarts, until the Pod goes into `CrashLoopBackOff`
You may not want a failing Pod to restart; instead, the Job can be set to create replacement Pods. This is good if a failure was caused by a problem on one node, because the replacement Pod could run on a different node.
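That behaviour combines `restartPolicy: Never` with the Job's retry limit; `backoffLimit` is a standard Job field, but the value here is an assumption:

```yaml
spec:
  backoffLimit: 3              # assumed value - stop creating replacements after repeated failures
  template:
    spec:
      restartPolicy: Never     # failed Pods are replaced with new Pods, not restarted
```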
You can't update the existing Job, so you'll need to delete it first:
```
kubectl delete jobs pi-job-one-failing

kubectl apply -f labs/jobs/specs/pi/one-failing/update
```
Now when you watch the Pods you won't see the same Pod restarting, you'll see new Pods being created:
```
kubectl get pods -l job-name=pi-job-one-failing --watch
```
You'll see ContainerCannotRun status, 0 restarts & by the end a total of 4 Pods
A side-effect of a Pod hitting `ContainerCannotRun` status is that you won't see any logs; to find out why the container failed, you'll need to describe the Pod:
```
kubectl logs -l job-name=pi-job-one-failing

kubectl describe pods -l job-name=pi-job-one-failing
```
Just a typo in the command line...
To finish up, remove all the objects this lab created:

```
kubectl delete job,cronjob,cm,sa,clusterrole,clusterrolebinding -l kubernetes.courselabs.co=jobs
```