Velero is an open-source tool for backing up and restoring Kubernetes cluster resources and persistent volumes. This guide will walk you through configuring Velero on AWS to ensure your cluster data is securely backed up.
Introduction: Kubernetes has become the de facto standard for container orchestration, so clusters need reliable backup and recovery mechanisms. Velero is an open-source tool that handles the backup, restore, and migration of Kubernetes resources and persistent volumes. This guide covers what Velero does and how to use it to protect your Kubernetes data.
1. Understanding the Need for Kubernetes Backup Solutions:
- Discuss the importance of data protection in Kubernetes environments.
- Highlight common data loss scenarios and their impact on operations.
- Introduce the concept of disaster recovery and the role of backup solutions like Velero in mitigating risks.
2. Getting Started with Velero:
- Step-by-step guide to installing Velero on your Kubernetes cluster.
- Configuring Velero to work with your preferred cloud provider or storage backend.
- Exploring Velero's command-line interface (CLI) and the Kubernetes custom resources (such as Backup, Restore, and Schedule) it uses for managing backups and restores.
3. Key Features and Functionality:
- Deep dive into Velero's core features, including:
  - Customizable backup policies for selective resource inclusion/exclusion.
  - Support for both cluster-wide and individual persistent volume backups.
  - Incremental backups to optimize storage and reduce backup duration.
  - Extensibility through plugins for various cloud providers and storage solutions.
  - Cross-cluster migration capabilities for seamless workload portability.
- Real-world examples illustrating how these features address common use cases and challenges.
4. Best Practices for Velero Deployment:
- Recommendations for optimizing Velero's performance and reliability.
- Strategies for securely managing backup data, including encryption and access control.
- Backup lifecycle management tips to ensure efficient resource utilization and cost savings.
- Scalability considerations for large-scale Kubernetes deployments.
5. Integrating Velero into Your Workflow:
- Exploring Velero's integration with CI/CD pipelines for automated backup and recovery.
- Leveraging Velero's API for seamless integration with monitoring and alerting systems.
- Incorporating Velero into disaster recovery plans and testing procedures.
6. Community and Support Ecosystem:
- Overview of the Velero community, including contributors, forums, and resources.
- Where to find documentation, tutorials, and best practices for Velero deployment and usage.
- Available support channels and options for engaging with the Velero community for assistance and collaboration.
Setup
To set up Velero on AWS, you create an S3 bucket, set permissions for Velero, and then install and start Velero.
You can also use this plugin to migrate PVs across clusters or create an additional Backup Storage Location.
If you do not have the aws CLI locally installed, follow the user guide to set it up.
Create S3 bucket
Velero requires an object storage bucket to store backups in, preferably unique to a single Kubernetes cluster (see the FAQ for more details). Create an S3 bucket, replacing placeholders appropriately:
BUCKET=<YOUR_BUCKET>
REGION=<YOUR_REGION>
aws s3api create-bucket \
    --bucket $BUCKET \
    --region $REGION \
    --create-bucket-configuration LocationConstraint=$REGION
NOTE: us-east-1 does not support a LocationConstraint. If your region is us-east-1, omit the bucket configuration:
aws s3api create-bucket \
    --bucket $BUCKET \
    --region us-east-1
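The two cases above can be folded into one helper so the region check is not done by hand. This is only a sketch: the function name create_bucket and the DRY_RUN switch are hypothetical and not part of the AWS CLI; only the aws s3api create-bucket flags come from the commands above.

```shell
# Hypothetical helper (not part of the AWS CLI): create the bucket, adding the
# LocationConstraint only when the region is not us-east-1.
# With DRY_RUN=1 it prints the command instead of running it.
create_bucket() {
  cb_bucket="$1"
  cb_region="$2"
  # Build the argument list in the positional parameters.
  set -- aws s3api create-bucket --bucket "$cb_bucket" --region "$cb_region"
  if [ "$cb_region" != "us-east-1" ]; then
    set -- "$@" --create-bucket-configuration "LocationConstraint=$cb_region"
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Calling create_bucket "$BUCKET" "$REGION" with DRY_RUN=1 set prints the command that would run, which is a cheap way to review it before creating anything.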
Set permissions for Velero
For more information, see the AWS documentation on IAM users.
Create the IAM user:
aws iam create-user --user-name velero
If you'll be using Velero to back up multiple clusters with multiple S3 buckets, it may be desirable to create a unique username per cluster rather than the default velero.
Attach policies to give velero the necessary permissions:
cat > velero-policy.json <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVolumes",
                "ec2:DescribeSnapshots",
                "ec2:CreateTags",
                "ec2:CreateVolume",
                "ec2:CreateSnapshot",
                "ec2:DeleteSnapshot"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:AbortMultipartUpload",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": [
                "arn:aws:s3:::${BUCKET}/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::${BUCKET}"
            ]
        }
    ]
}
EOF
aws iam put-user-policy \
    --user-name velero \
    --policy-name velero \
    --policy-document file://velero-policy.json
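Because the heredoc delimiter EOF is unquoted, the shell substitutes ${BUCKET} while writing velero-policy.json. If BUCKET is unset in that shell, the policy silently ends up with arn:aws:s3:::/* as its resource. A minimal sketch of a guard follows; the helper name bucket_arns is hypothetical.

```shell
# Hypothetical helper: print the two S3 ARNs the policy needs, or abort with a
# message if BUCKET is unset or empty.
bucket_arns() {
  : "${BUCKET:?BUCKET must be set before generating velero-policy.json}"
  printf 'arn:aws:s3:::%s\n' "$BUCKET"      # bucket-level ARN (s3:ListBucket)
  printf 'arn:aws:s3:::%s/*\n' "$BUCKET"    # object-level ARN (Get/Put/Delete)
}
```

Running bucket_arns before the cat command, and comparing its output with the Resource entries in the generated file, catches an empty or mistyped bucket name before the policy is attached.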
Create an access key for the user:
aws iam create-access-key --user-name velero
The result should look like:
{
    "AccessKey": {
        "UserName": "velero",
        "Status": "Active",
        "CreateDate": "2017-07-31T22:24:41.576Z",
        "SecretAccessKey": <AWS_SECRET_ACCESS_KEY>,
        "AccessKeyId": <AWS_ACCESS_KEY_ID>
    }
}
Create a Velero-specific credentials file (credentials-velero) in your local directory:
[default]
aws_access_key_id=<AWS_ACCESS_KEY_ID>
aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
where the access key id and secret are the values returned from the create-access-key request.
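Since this file holds a long-lived secret, it can help to generate it with restrictive permissions rather than paste it into an editor. The sketch below assumes the two values have been placed in the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY shell variables; the function name write_velero_credentials is hypothetical.

```shell
# Hypothetical helper: write the Velero credentials file from the two values
# returned by `aws iam create-access-key`. Defaults to ./credentials-velero.
write_velero_credentials() {
  wvc_out="${1:-credentials-velero}"
  : "${AWS_ACCESS_KEY_ID:?set AWS_ACCESS_KEY_ID first}"
  : "${AWS_SECRET_ACCESS_KEY:?set AWS_SECRET_ACCESS_KEY first}"
  (
    # A newly created file gets mode 0600; the subshell keeps the umask
    # change from leaking into the calling shell.
    umask 077
    printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
      "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY" > "$wvc_out"
  )
}
```

Note that the umask only affects a file that does not exist yet; an existing credentials-velero keeps whatever permissions it already had.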
Install and start Velero
Download Velero
Install Velero, including all prerequisites, into the cluster and start the deployment. This will create a namespace called velero, and place a deployment named velero in it.
If using IAM user and access key:
velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.10.0 \
    --bucket $BUCKET \
    --backup-location-config region=$REGION \
    --snapshot-location-config region=$REGION \
    --secret-file ./credentials-velero
velero.exe install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.9.0 \
    --bucket velero-xfi-test \
    --backup-location-config region=eu-west-2 \
    --snapshot-location-config region=eu-west-2 \
    --secret-file ./credentials-velero
CustomResourceDefinition/backuprepositories.velero.io: attempting to create resource
CustomResourceDefinition/backuprepositories.velero.io: attempting to create resource client
CustomResourceDefinition/backuprepositories.velero.io: already exists, proceeding
CustomResourceDefinition/backuprepositories.velero.io: created
CustomResourceDefinition/backups.velero.io: attempting to create resource
CustomResourceDefinition/backups.velero.io: attempting to create resource client
CustomResourceDefinition/backups.velero.io: already exists, proceeding
CustomResourceDefinition/backups.velero.io: created
CustomResourceDefinition/backupstoragelocations.velero.io: attempting to create resource
CustomResourceDefinition/backupstoragelocations.velero.io: attempting to create resource client
CustomResourceDefinition/backupstoragelocations.velero.io: already exists, proceeding
CustomResourceDefinition/backupstoragelocations.velero.io: created
CustomResourceDefinition/deletebackuprequests.velero.io: attempting to create resource
CustomResourceDefinition/deletebackuprequests.velero.io: attempting to create resource client
CustomResourceDefinition/deletebackuprequests.velero.io: already exists, proceeding
CustomResourceDefinition/deletebackuprequests.velero.io: created
CustomResourceDefinition/downloadrequests.velero.io: attempting to create resource
CustomResourceDefinition/downloadrequests.velero.io: attempting to create resource client
CustomResourceDefinition/downloadrequests.velero.io: already exists, proceeding
CustomResourceDefinition/downloadrequests.velero.io: created
CustomResourceDefinition/podvolumebackups.velero.io: attempting to create resource
CustomResourceDefinition/podvolumebackups.velero.io: attempting to create resource client
CustomResourceDefinition/podvolumebackups.velero.io: already exists, proceeding
CustomResourceDefinition/podvolumebackups.velero.io: created
CustomResourceDefinition/podvolumerestores.velero.io: attempting to create resource
CustomResourceDefinition/podvolumerestores.velero.io: attempting to create resource client
CustomResourceDefinition/podvolumerestores.velero.io: already exists, proceeding
CustomResourceDefinition/podvolumerestores.velero.io: created
CustomResourceDefinition/restores.velero.io: attempting to create resource
CustomResourceDefinition/restores.velero.io: attempting to create resource client
CustomResourceDefinition/restores.velero.io: already exists, proceeding
CustomResourceDefinition/restores.velero.io: created
CustomResourceDefinition/schedules.velero.io: attempting to create resource
CustomResourceDefinition/schedules.velero.io: attempting to create resource client
CustomResourceDefinition/schedules.velero.io: already exists, proceeding
CustomResourceDefinition/schedules.velero.io: created
CustomResourceDefinition/serverstatusrequests.velero.io: attempting to create resource
CustomResourceDefinition/serverstatusrequests.velero.io: attempting to create resource client
CustomResourceDefinition/serverstatusrequests.velero.io: already exists, proceeding
CustomResourceDefinition/serverstatusrequests.velero.io: created
CustomResourceDefinition/volumesnapshotlocations.velero.io: attempting to create resource
CustomResourceDefinition/volumesnapshotlocations.velero.io: attempting to create resource client
CustomResourceDefinition/volumesnapshotlocations.velero.io: already exists, proceeding
CustomResourceDefinition/volumesnapshotlocations.velero.io: created
CustomResourceDefinition/datadownloads.velero.io: attempting to create resource
CustomResourceDefinition/datadownloads.velero.io: attempting to create resource client
CustomResourceDefinition/datadownloads.velero.io: already exists, proceeding
CustomResourceDefinition/datadownloads.velero.io: created
CustomResourceDefinition/datauploads.velero.io: attempting to create resource
CustomResourceDefinition/datauploads.velero.io: attempting to create resource client
CustomResourceDefinition/datauploads.velero.io: already exists, proceeding
CustomResourceDefinition/datauploads.velero.io: created
Waiting for resources to be ready in cluster...
Namespace/velero: attempting to create resource
Namespace/velero: attempting to create resource client
Namespace/velero: created
ClusterRoleBinding/velero: attempting to create resource
ClusterRoleBinding/velero: attempting to create resource client
ClusterRoleBinding/velero: created
ServiceAccount/velero: attempting to create resource
ServiceAccount/velero: attempting to create resource client
ServiceAccount/velero: created
Secret/cloud-credentials: attempting to create resource
Secret/cloud-credentials: attempting to create resource client
Secret/cloud-credentials: created
BackupStorageLocation/default: attempting to create resource
BackupStorageLocation/default: attempting to create resource client
BackupStorageLocation/default: created
VolumeSnapshotLocation/default: attempting to create resource
VolumeSnapshotLocation/default: attempting to create resource client
VolumeSnapshotLocation/default: created
Deployment/velero: attempting to create resource
Deployment/velero: attempting to create resource client
Deployment/velero: created
Velero is installed! ⛵ Use 'kubectl logs deployment/velero -n velero' to view the status.
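Beyond the log command the installer suggests, a few read-only checks can confirm that the deployment is up and the storage location is registered. This is a minimal sketch, assuming kubectl and the velero client are on the PATH (velero.exe on Windows). The run and verify_velero_install names are hypothetical, and the 120s timeout is an arbitrary choice.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical post-install check: stops at the first failing step.
verify_velero_install() {
  # Wait for the velero deployment created by `velero install` to roll out.
  run kubectl -n velero rollout status deployment/velero --timeout=120s &&
  # List the pods in the velero namespace.
  run kubectl -n velero get pods &&
  # List the backup storage locations Velero knows about.
  run velero backup-location get
}
```

If the last command does not list the default location pointing at your bucket, the credentials file or the bucket policy is the first place to look.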
Reference: https://github.com/vmware-tanzu/velero-plugin-for-aws
Important Commands:
velero.exe schedule create name --schedule="* 18 * * *" --include-namespaces vector
velero.exe backup get
velero.exe backup create new --include-namespaces vector
velero.exe backup logs 16may-my-immediate-backup
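The --schedule value is a standard five-field cron expression (minute, hour, day of month, month, day of week). An expression with * in the minute field, such as "* 18 * * *", matches every minute of the 18:00 hour rather than once a day, whereas "0 18 * * *" matches only 18:00. The sketch below flags that case before a schedule is created; the function name minute_is_wildcard is hypothetical.

```shell
# Hypothetical check: succeed (exit 0) when a five-field cron expression has a
# wildcard minute, i.e. it would fire every minute of each matching hour.
# Returns 2 if the expression does not have exactly five fields.
minute_is_wildcard() {
  set -f          # disable globbing so a bare * is not expanded to file names
  set -- $1       # split the expression into positional parameters
  set +f
  [ "$#" -eq 5 ] || return 2
  [ "$1" = "*" ]
}
```

For a single backup each day at 18:00, the schedule would be created with --schedule="0 18 * * *" instead.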
Using Helm to Install Velero
Alternatively, you can install Velero using Helm:
helm install my-velero vmware-tanzu/velero --version 6.2.0
helm upgrade --reset-values my-velero vmware-tanzu/velero -f values-velero.yaml --version 6.2.0
kubectl create secret generic velero-credentials --from-file=cloud=file.txt
helm upgrade --reset-values my-velero vmware-tanzu/velero -f values-velero.yaml --version 6.2.0 --set-file credentials.secretContents.cloud=file.txt
Add the Helm repository and install Velero:
helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
helm install my-velero vmware-tanzu/velero --version 6.2.0
Create the credentials secret:
kubectl create secret generic velero-credentials --from-file=cloud=credentials-velero
Upgrade Velero with the credentials:
helm upgrade --set-file credentials.secretContents.cloud=credentials-velero my-velero vmware-tanzu/velero --version 6.2.0
Reference: https://artifacthub.io/packages/helm/vmware-tanzu/velero
Take a Backup
$ velero.exe backup create full-cluster-backup
Backup request "full-cluster-backup" submitted successfully.
Run `velero backup describe full-cluster-backup` or `velero backup logs full-cluster-backup` for more details.
$ velero.exe get backup
NAME                        STATUS            ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
16may-my-immediate-backup   Failed            0        0          2024-05-16 17:34:02 +0530 IST   29d       default            <none>
full-cluster-backup         Completed         0        0          2024-05-16 19:16:34 +0530 IST   29d       default            <none>
my-immediate-backup         PartiallyFailed   6        0          2024-05-16 16:01:15 +0530 IST   29d       default            <none>
new                         Completed         0        0          2024-05-16 18:13:46 +0530 IST   29d       default            <none>
test-my-immediate-backup    Completed         0        0          2024-05-16 17:36:59 +0530 IST   29d       default            <none>
velero.exe backup create new --include-namespaces vector
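With more than a handful of backups, the ones that need attention are easy to miss in the listing. As a rough sketch, the table printed by the backup listing can be piped through awk to keep only rows whose STATUS is not Completed. The function name not_completed_backups is hypothetical, and the filter assumes the column layout shown above (NAME first, STATUS second).

```shell
# Hypothetical filter: read the backup listing on stdin and print the name and
# status of every backup whose STATUS column is not "Completed".
# The header row (line 1) is skipped.
not_completed_backups() {
  awk 'NR > 1 && $2 != "Completed" { print $1, $2 }'
}
```

Usage would look like velero.exe backup get | not_completed_backups. For the listing above it would report 16may-my-immediate-backup as Failed and my-immediate-backup as PartiallyFailed, which can then be investigated with velero backup describe or velero backup logs.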
Restore a Backup
To restore a backup, create a restore from it:
velero restore create --from-backup <backup-name> --include-namespaces <namespace>
Check the status of the restore:
velero restore describe <restore-name>
velero restore logs <restore-name>
Restore Demo:
For testing, I previously deleted the Grafana Deployment and PVC and restored them with the original data.
velero.exe restore create --from-backup new --include-namespaces vector
Restore request "new-20240516184747" submitted successfully.
Run `velero restore describe new-20240516184747` or `velero restore logs new-20240516184747` for more details.
$ kubectl get pvc -n vector
NAME                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
data-loki-backend-0   Bound    pvc-a9d7164c-1640-48af-9417-741fb29c8d1b   10Gi       RWO            gp2            <unset>                 16d
data-loki-backend-1   Bound    pvc-9185e1e5-8b3d-4089-8414-b7425dcd4680   10Gi       RWO            gp2            <unset>                 16d
data-loki-write-0     Bound    pvc-18cfeda2-a5c9-4faa-8cc9-7648217c0871   10Gi       RWO            gp2            <unset>                 57d
data-loki-write-1     Bound    pvc-76739fcf-f999-4a46-ada9-a5fbe71add07   10Gi       RWO            gp2            <unset>                 57d
data-loki-write-2     Bound    pvc-840d28a2-0d32-4122-8896-89e0a8c18443   10Gi       RWO            gp2            <unset>                 57d
$ kubectl get pods -n vector
NAME                                              READY   STATUS    RESTARTS      AGE
loki-backend-0                                    2/2     Running   0             16d
loki-backend-1                                    2/2     Running   0             16d
loki-canary-dd8zs                                 1/1     Running   0             16d
loki-canary-ddcm2                                 1/1     Running   0             16d
loki-canary-ntqkk                                 1/1     Running   0             16d
loki-canary-pbj9q                                 1/1     Running   0             16d
loki-gateway-5f5dfb657c-w5pcb                     1/1     Running   0             20h
loki-read-8f6b77b99-9zfwx                         1/1     Running   0             20h
loki-read-8f6b77b99-hzl4x                         1/1     Running   0             20h
loki-write-0                                      1/1     Running   0             16d
loki-write-1                                      1/1     Running   0             16d
my-loki-grafana-agent-operator-7f58c9bd66-56w6h   1/1     Running   0             20h
my-loki-logs-bhtn2                                2/2     Running   0             16d
my-loki-logs-dbf5g                                2/2     Running   0             16d
my-loki-logs-mpmgt                                2/2     Running   0             16d
my-loki-logs-sbqsw                                2/2     Running   0             16d
my-vector-0                                       1/1     Running   0             16d
my-velero-74ff4d56d-hbrdc                         1/1     Running   2 (11m ago)   11m
$ kubectl get pvc -n vector
NAME                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
data-loki-backend-0   Bound    pvc-a9d7164c-1640-48af-9417-741fb29c8d1b   10Gi       RWO            gp2            <unset>                 16d
data-loki-backend-1   Bound    pvc-9185e1e5-8b3d-4089-8414-b7425dcd4680   10Gi       RWO            gp2            <unset>                 16d
data-loki-write-0     Bound    pvc-18cfeda2-a5c9-4faa-8cc9-7648217c0871   10Gi       RWO            gp2            <unset>                 57d
data-loki-write-1     Bound    pvc-76739fcf-f999-4a46-ada9-a5fbe71add07   10Gi       RWO            gp2            <unset>                 57d
data-loki-write-2     Bound    pvc-840d28a2-0d32-4122-8896-89e0a8c18443   10Gi       RWO            gp2            <unset>                 57d
my-grafana            Bound    pvc-39f24317-e484-4e98-b4c9-625de1b9367b   10Gi       RWO            gp2            <unset>                 28s
$ kubectl get pods -n vector
NAME                                              READY   STATUS    RESTARTS   AGE
loki-backend-0                                    2/2     Running   0          16d
loki-backend-1                                    2/2     Running   0          16d
loki-canary-dd8zs                                 1/1     Running   0          16d
loki-canary-ddcm2                                 1/1     Running   0          16d
loki-canary-ntqkk                                 1/1     Running   0          16d
loki-canary-pbj9q                                 1/1     Running   0          16d
loki-gateway-5f5dfb657c-w5pcb                     1/1     Running   0          20h
loki-read-8f6b77b99-9zfwx                         1/1     Running   0          20h
loki-read-8f6b77b99-hzl4x                         1/1     Running   0          20h
loki-write-0                                      1/1     Running   0          16d
loki-write-1                                      1/1     Running   0          16d
my-grafana-69c9db76dc-cwhmm                       1/1     Running   0          54s
my-loki-grafana-agent-operator-7f58c9bd66-56w6h   1/1     Running   0          20h
my-loki-logs-bhtn2                                2/2     Running   0          16d
my-loki-logs-dbf5g                                2/2     Running   0          16d
my-loki-logs-mpmgt                                2/2     Running   0          16d
my-loki-logs-sbqsw                                2/2     Running   0          16d
my-vector-0                                       1/1     Running   0          16d
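Comparing two long listings by eye is error-prone. One way to confirm what a restore brought back is to save the resource names before and after it, for example with kubectl get pvc,pods -n vector -o name redirected to a file, and print only the lines that are new. A minimal sketch follows; the function name restored_resources and the file names are hypothetical.

```shell
# Hypothetical helper: print lines that appear in the "after" file ($2) but not
# in the "before" file ($1). Each file holds one resource name per line, such
# as persistentvolumeclaim/my-grafana.
restored_resources() {
  # -F fixed strings, -x whole-line match, -v invert, -f read patterns from $1
  grep -F -x -v -f "$1" "$2"
}
```

For the demo above, restored_resources before.txt after.txt would be expected to list the my-grafana PVC and the my-grafana pod, since those are the only entries missing from the earlier listing.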