Configuring Velero for Cluster Backup on AWS

Velero is an open-source tool for backing up and restoring Kubernetes cluster resources and persistent volumes. This guide will walk you through configuring Velero on AWS to ensure your cluster data is securely backed up.


Introduction: Ensuring the resilience and integrity of your data is critical in cloud-native infrastructure. With Kubernetes now the de facto standard for container orchestration, robust backup and recovery mechanisms are essential. Velero is an open-source tool designed to streamline the backup, restore, and migration of Kubernetes resources and persistent volumes. This guide covers what you need to know about Velero and how it can help safeguard your Kubernetes data.

1. Understanding the Need for Kubernetes Backup Solutions:

  • Discuss the importance of data protection in Kubernetes environments.
  • Highlight common data loss scenarios and their impact on operations.
  • Introduce the concept of disaster recovery and the role of backup solutions like Velero in mitigating risks.

2. Getting Started with Velero:

  • Step-by-step guide to installing Velero on your Kubernetes cluster.
  • Configuring Velero to work with your preferred cloud provider or storage backend.
  • Exploring Velero's command-line interface (CLI) and RESTful API for managing backups and restores.

3. Key Features and Functionality:

  • Deep dive into Velero's core features, including:
    • Customizable backup policies for selective resource inclusion/exclusion.
    • Support for both cluster-wide and individual persistent volume backups.
    • Incremental backups to optimize storage and reduce backup duration.
    • Extensibility through plugins for various cloud providers and storage solutions.
    • Cross-cluster migration capabilities for seamless workload portability.
  • Real-world examples illustrating how these features address common use cases and challenges.

4. Best Practices for Velero Deployment:

  • Recommendations for optimizing Velero's performance and reliability.
  • Strategies for securely managing backup data, including encryption and access control.
  • Backup lifecycle management tips to ensure efficient resource utilization and cost savings.
  • Scalability considerations for large-scale Kubernetes deployments.

5. Integrating Velero into Your Workflow:

  • Exploring Velero's integration with CI/CD pipelines for automated backup and recovery.
  • Leveraging Velero's API for seamless integration with monitoring and alerting systems.
  • Incorporating Velero into disaster recovery plans and testing procedures.

6. Community and Support Ecosystem:

  • Overview of the Velero community, including contributors, forums, and resources.
  • Where to find documentation, tutorials, and best practices for Velero deployment and usage.
  • Available support channels and options for engaging with the Velero community for assistance and collaboration.




Setup

To set up Velero on AWS, you create an S3 bucket for backups, create an IAM user with the required permissions, and install Velero into the cluster with the velero-plugin-for-aws plugin. The same plugin can also be used to migrate persistent volumes across clusters or to create an additional Backup Storage Location.

If you do not have the aws CLI installed locally, follow the AWS user guide to set it up.

 

Create S3 bucket

 

Velero requires an object storage bucket to store backups in, preferably unique to a single Kubernetes cluster (see the FAQ for more details). Create an S3 bucket, replacing placeholders appropriately:

BUCKET=<YOUR_BUCKET>
REGION=<YOUR_REGION>
aws s3api create-bucket \
    --bucket $BUCKET \
    --region $REGION \
    --create-bucket-configuration LocationConstraint=$REGION

NOTE: us-east-1 does not support a LocationConstraint. If your region is us-east-1, omit the bucket configuration:

aws s3api create-bucket \
    --bucket $BUCKET \
    --region us-east-1
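If you script bucket creation, the us-east-1 special case can be handled in one place. A sketch, with placeholder bucket and region values; the final command is echoed rather than executed, so it is safe to dry-run:

```shell
# Choose the correct create-bucket invocation for the region.
# BUCKET and REGION are placeholder values; the command is echoed, not run.
BUCKET=my-velero-bucket
REGION=us-east-1
if [ "$REGION" = "us-east-1" ]; then
  CFG=""   # us-east-1 rejects a LocationConstraint
else
  CFG="--create-bucket-configuration LocationConstraint=$REGION"
fi
CMD="aws s3api create-bucket --bucket $BUCKET --region $REGION $CFG"
echo "$CMD"
```

Drop the echo (and run the command directly) to actually create the bucket.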

Set permissions for Velero

For more information, see the AWS documentation on IAM users.

  1. Create the IAM user:

    aws iam create-user --user-name velero

    If you'll be using Velero to back up multiple clusters into multiple S3 buckets, it may be desirable to create a unique username per cluster rather than the default velero.

  2. Attach policies to give velero the necessary permissions:

    cat > velero-policy.json <<EOF
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeVolumes",
                    "ec2:DescribeSnapshots",
                    "ec2:CreateTags",
                    "ec2:CreateVolume",
                    "ec2:CreateSnapshot",
                    "ec2:DeleteSnapshot"
                ],
                "Resource": "*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:DeleteObject",
                    "s3:PutObject",
                    "s3:AbortMultipartUpload",
                    "s3:ListMultipartUploadParts"
                ],
                "Resource": [
                    "arn:aws:s3:::${BUCKET}/*"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket"
                ],
                "Resource": [
                    "arn:aws:s3:::${BUCKET}"
                ]
            }
        ]
    }
    EOF
    
    aws iam put-user-policy \
      --user-name velero \
      --policy-name velero \
      --policy-document file://velero-policy.json
  3. Create an access key for the user:

    aws iam create-access-key --user-name velero

    The result should look like:

    {
      "AccessKey": {
            "UserName": "velero",
            "Status": "Active",
            "CreateDate": "2017-07-31T22:24:41.576Z",
            "SecretAccessKey": <AWS_SECRET_ACCESS_KEY>,
            "AccessKeyId": <AWS_ACCESS_KEY_ID>
      }
    }
  4. Create a Velero-specific credentials file (credentials-velero) in your local directory:

    [default]
    aws_access_key_id=<AWS_ACCESS_KEY_ID>
    aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>

    where the access key id and secret are the values returned from the create-access-key request.
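Writing the credentials file by hand is error-prone; the two fields can be extracted from the create-access-key JSON instead. A sketch, assuming jq is installed; key.json below uses placeholder values standing in for the real output of `aws iam create-access-key --user-name velero > key.json`:

```shell
# Placeholder stand-in for the real `aws iam create-access-key` output:
cat > key.json <<'EOF'
{"AccessKey":{"UserName":"velero","AccessKeyId":"AKIAEXAMPLEID","SecretAccessKey":"wJalrEXAMPLESECRET"}}
EOF

# Extract the two fields into the INI format Velero expects:
cat > credentials-velero <<EOF
[default]
aws_access_key_id=$(jq -r .AccessKey.AccessKeyId key.json)
aws_secret_access_key=$(jq -r .AccessKey.SecretAccessKey key.json)
EOF

cat credentials-velero
```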

 

 

Install and start Velero

 

Download Velero

Download the velero CLI binary (for example, from the project's GitHub releases page) and place it on your PATH.

Install Velero, including all prerequisites, into the cluster and start the deployment. This creates a namespace called velero and places a deployment named velero in it.

If you are using an IAM user and access key:

velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.10.0 \
    --bucket $BUCKET \
    --backup-location-config region=$REGION \
    --snapshot-location-config region=$REGION \
    --secret-file ./credentials-velero

For example, on Windows with specific values filled in:

velero.exe install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.9.0 \
    --bucket velero-xfi-test \
    --backup-location-config region=eu-west-2 \
    --snapshot-location-config region=eu-west-2 \
    --secret-file ./credentials-velero

Example output:

 

 

CustomResourceDefinition/backuprepositories.velero.io: attempting to create resource

CustomResourceDefinition/backuprepositories.velero.io: attempting to create resource client

CustomResourceDefinition/backuprepositories.velero.io: already exists, proceeding

CustomResourceDefinition/backuprepositories.velero.io: created

CustomResourceDefinition/backups.velero.io: attempting to create resource

CustomResourceDefinition/backups.velero.io: attempting to create resource client

CustomResourceDefinition/backups.velero.io: already exists, proceeding

CustomResourceDefinition/backups.velero.io: created

CustomResourceDefinition/backupstoragelocations.velero.io: attempting to create resource

CustomResourceDefinition/backupstoragelocations.velero.io: attempting to create resource client

CustomResourceDefinition/backupstoragelocations.velero.io: already exists, proceeding

CustomResourceDefinition/backupstoragelocations.velero.io: created

CustomResourceDefinition/deletebackuprequests.velero.io: attempting to create resource

CustomResourceDefinition/deletebackuprequests.velero.io: attempting to create resource client

CustomResourceDefinition/deletebackuprequests.velero.io: already exists, proceeding

CustomResourceDefinition/deletebackuprequests.velero.io: created

CustomResourceDefinition/downloadrequests.velero.io: attempting to create resource

CustomResourceDefinition/downloadrequests.velero.io: attempting to create resource client

CustomResourceDefinition/downloadrequests.velero.io: already exists, proceeding

CustomResourceDefinition/downloadrequests.velero.io: created

CustomResourceDefinition/podvolumebackups.velero.io: attempting to create resource

CustomResourceDefinition/podvolumebackups.velero.io: attempting to create resource client

CustomResourceDefinition/podvolumebackups.velero.io: already exists, proceeding

CustomResourceDefinition/podvolumebackups.velero.io: created

CustomResourceDefinition/podvolumerestores.velero.io: attempting to create resource

CustomResourceDefinition/podvolumerestores.velero.io: attempting to create resource client

CustomResourceDefinition/podvolumerestores.velero.io: already exists, proceeding

CustomResourceDefinition/podvolumerestores.velero.io: created

CustomResourceDefinition/restores.velero.io: attempting to create resource

CustomResourceDefinition/restores.velero.io: attempting to create resource client

CustomResourceDefinition/restores.velero.io: already exists, proceeding

CustomResourceDefinition/restores.velero.io: created

CustomResourceDefinition/schedules.velero.io: attempting to create resource

CustomResourceDefinition/schedules.velero.io: attempting to create resource client

CustomResourceDefinition/schedules.velero.io: already exists, proceeding

CustomResourceDefinition/schedules.velero.io: created

CustomResourceDefinition/serverstatusrequests.velero.io: attempting to create resource

CustomResourceDefinition/serverstatusrequests.velero.io: attempting to create resource client

CustomResourceDefinition/serverstatusrequests.velero.io: already exists, proceeding

CustomResourceDefinition/serverstatusrequests.velero.io: created

CustomResourceDefinition/volumesnapshotlocations.velero.io: attempting to create resource

CustomResourceDefinition/volumesnapshotlocations.velero.io: attempting to create resource client

CustomResourceDefinition/volumesnapshotlocations.velero.io: already exists, proceeding

CustomResourceDefinition/volumesnapshotlocations.velero.io: created

CustomResourceDefinition/datadownloads.velero.io: attempting to create resource

CustomResourceDefinition/datadownloads.velero.io: attempting to create resource client

CustomResourceDefinition/datadownloads.velero.io: already exists, proceeding

CustomResourceDefinition/datadownloads.velero.io: created

CustomResourceDefinition/datauploads.velero.io: attempting to create resource

CustomResourceDefinition/datauploads.velero.io: attempting to create resource client

CustomResourceDefinition/datauploads.velero.io: already exists, proceeding

CustomResourceDefinition/datauploads.velero.io: created

Waiting for resources to be ready in cluster...

Namespace/velero: attempting to create resource

Namespace/velero: attempting to create resource client

Namespace/velero: created

ClusterRoleBinding/velero: attempting to create resource

ClusterRoleBinding/velero: attempting to create resource client

ClusterRoleBinding/velero: created

ServiceAccount/velero: attempting to create resource

ServiceAccount/velero: attempting to create resource client

ServiceAccount/velero: created

Secret/cloud-credentials: attempting to create resource

Secret/cloud-credentials: attempting to create resource client

Secret/cloud-credentials: created

BackupStorageLocation/default: attempting to create resource

BackupStorageLocation/default: attempting to create resource client

BackupStorageLocation/default: created

VolumeSnapshotLocation/default: attempting to create resource

VolumeSnapshotLocation/default: attempting to create resource client

VolumeSnapshotLocation/default: created

Deployment/velero: attempting to create resource

Deployment/velero: attempting to create resource client

Deployment/velero: created

Velero is installed! ⛵ Use 'kubectl logs deployment/velero -n velero' to view the status.

Reference: https://github.com/vmware-tanzu/velero-plugin-for-aws

 

Important Commands:

Create a daily schedule (the cron expression "0 18 * * *" runs once at 18:00; "* 18 * * *" would fire every minute of that hour):

velero.exe schedule create <schedule-name> --schedule="0 18 * * *" --include-namespaces vector

List backups:

velero.exe backup get

Create a backup of a namespace:

velero.exe backup create new --include-namespaces vector

View logs for a backup:

velero.exe backup logs 16may-my-immediate-backup

Using Helm to Install Velero

Alternatively, you can install Velero using Helm. The steps below add the chart repository, install the chart, create a credentials secret, and then point the release at that secret.

Add the Helm repository and install Velero:

helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
helm install my-velero vmware-tanzu/velero --version 6.2.0

Create the credentials secret:

kubectl create secret generic velero-credentials --from-file=cloud=credentials-velero

Upgrade Velero with the credentials:

helm upgrade --set-file credentials.secretContents.cloud=credentials-velero my-velero vmware-tanzu/velero --version 6.2.0
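Instead of (or in addition to) --set-file, the chart configuration can live in a values file passed with -f. A minimal sketch of what such a values-velero.yaml might contain, with placeholders for the bucket and region; field names follow the chart's 6.x values schema, so verify them against the chart's own values.yaml before use:

```yaml
# values-velero.yaml (sketch; verify keys against the chart's values.yaml)
initContainers:
  - name: velero-plugin-for-aws
    image: velero/velero-plugin-for-aws:v1.10.0
    volumeMounts:
      - mountPath: /target
        name: plugins
configuration:
  backupStorageLocation:
    - name: default
      provider: aws
      bucket: <YOUR_BUCKET>
      config:
        region: <YOUR_REGION>
  volumeSnapshotLocation:
    - name: default
      provider: aws
      config:
        region: <YOUR_REGION>
credentials:
  useSecret: true
  existingSecret: velero-credentials
```

Apply it with: helm upgrade my-velero vmware-tanzu/velero -f values-velero.yaml --version 6.2.0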

Reference:

https://artifacthub.io/packages/helm/vmware-tanzu/velero

 

Take a Backup

 

$ velero.exe backup create full-cluster-backup
Backup request "full-cluster-backup" submitted successfully.
Run `velero backup describe full-cluster-backup` or `velero backup logs full-cluster-backup` for more details.


$ velero.exe get backup
NAME                        STATUS            ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
16may-my-immediate-backup   Failed            0        0          2024-05-16 17:34:02 +0530 IST   29d       default            <none>
full-cluster-backup         Completed         0        0          2024-05-16 19:16:34 +0530 IST   29d       default            <none>
my-immediate-backup         PartiallyFailed   6        0          2024-05-16 16:01:15 +0530 IST   29d       default            <none>
new                         Completed         0        0          2024-05-16 18:13:46 +0530 IST   29d       default            <none>
test-my-immediate-backup    Completed         0        0          2024-05-16 17:36:59 +0530 IST   29d       default            <none>


 

 

Restore a Backup

 

To restore, create a restore from an existing backup:

velero restore create --from-backup <backup-name> --include-namespaces <namespace>

Check the status of the restore:

velero restore describe <restore-name>
velero restore logs <restore-name>
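Velero can also restore into a different namespace via the --namespace-mappings old:new flag, which is useful for verifying a restore side by side before touching the original namespace. A sketch with placeholder names; the command is echoed as a dry run rather than executed:

```shell
# Restore the backup's "vector" namespace into "vector-restored" (placeholder names).
# Echoed rather than executed so it can be reviewed first.
CMD='velero restore create --from-backup full-cluster-backup --namespace-mappings vector:vector-restored'
echo "$CMD"
```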
 
 

Restore Demo:

For testing, I deleted the Grafana Deployment and PVC beforehand, then restored them from the backup with their original data intact.

 

velero.exe restore create --from-backup new --include-namespaces vector
Restore request "new-20240516184747" submitted successfully.
Run `velero restore describe new-20240516184747` or `velero restore logs new-20240516184747` for more details.
$ kubectl get pvc -n vector
NAME                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
data-loki-backend-0   Bound    pvc-a9d7164c-1640-48af-9417-741fb29c8d1b   10Gi       RWO            gp2            <unset>                 16d
data-loki-backend-1   Bound    pvc-9185e1e5-8b3d-4089-8414-b7425dcd4680   10Gi       RWO            gp2            <unset>                 16d
data-loki-write-0     Bound    pvc-18cfeda2-a5c9-4faa-8cc9-7648217c0871   10Gi       RWO            gp2            <unset>                 57d
data-loki-write-1     Bound    pvc-76739fcf-f999-4a46-ada9-a5fbe71add07   10Gi       RWO            gp2            <unset>                 57d
data-loki-write-2     Bound    pvc-840d28a2-0d32-4122-8896-89e0a8c18443   10Gi       RWO            gp2            <unset>                 57d
$ kubectl get pods -n vector
NAME                                              READY   STATUS    RESTARTS      AGE
loki-backend-0                                    2/2     Running   0             16d
loki-backend-1                                    2/2     Running   0             16d
loki-canary-dd8zs                                 1/1     Running   0             16d
loki-canary-ddcm2                                 1/1     Running   0             16d
loki-canary-ntqkk                                 1/1     Running   0             16d
loki-canary-pbj9q                                 1/1     Running   0             16d
loki-gateway-5f5dfb657c-w5pcb                     1/1     Running   0             20h
loki-read-8f6b77b99-9zfwx                         1/1     Running   0             20h
loki-read-8f6b77b99-hzl4x                         1/1     Running   0             20h
loki-write-0                                      1/1     Running   0             16d
loki-write-1                                      1/1     Running   0             16d
my-loki-grafana-agent-operator-7f58c9bd66-56w6h   1/1     Running   0             20h
my-loki-logs-bhtn2                                2/2     Running   0             16d
my-loki-logs-dbf5g                                2/2     Running   0             16d
my-loki-logs-mpmgt                                2/2     Running   0             16d
my-loki-logs-sbqsw                                2/2     Running   0             16d
my-vector-0                                       1/1     Running   0             16d
my-velero-74ff4d56d-hbrdc                         1/1     Running   2 (11m ago)   11m
$ kubectl get pvc -n vector
NAME                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
data-loki-backend-0   Bound    pvc-a9d7164c-1640-48af-9417-741fb29c8d1b   10Gi       RWO            gp2            <unset>                 16d
data-loki-backend-1   Bound    pvc-9185e1e5-8b3d-4089-8414-b7425dcd4680   10Gi       RWO            gp2            <unset>                 16d
data-loki-write-0     Bound    pvc-18cfeda2-a5c9-4faa-8cc9-7648217c0871   10Gi       RWO            gp2            <unset>                 57d
data-loki-write-1     Bound    pvc-76739fcf-f999-4a46-ada9-a5fbe71add07   10Gi       RWO            gp2            <unset>                 57d
data-loki-write-2     Bound    pvc-840d28a2-0d32-4122-8896-89e0a8c18443   10Gi       RWO            gp2            <unset>                 57d
my-grafana            Bound    pvc-39f24317-e484-4e98-b4c9-625de1b9367b   10Gi       RWO            gp2            <unset>                 28s

$ kubectl get pods -n vector
NAME                                              READY   STATUS    RESTARTS   AGE
loki-backend-0                                    2/2     Running   0          16d
loki-backend-1                                    2/2     Running   0          16d
loki-canary-dd8zs                                 1/1     Running   0          16d
loki-canary-ddcm2                                 1/1     Running   0          16d
loki-canary-ntqkk                                 1/1     Running   0          16d
loki-canary-pbj9q                                 1/1     Running   0          16d
loki-gateway-5f5dfb657c-w5pcb                     1/1     Running   0          20h
loki-read-8f6b77b99-9zfwx                         1/1     Running   0          20h
loki-read-8f6b77b99-hzl4x                         1/1     Running   0          20h
loki-write-0                                      1/1     Running   0          16d
loki-write-1                                      1/1     Running   0          16d
my-grafana-69c9db76dc-cwhmm                       1/1     Running   0          54s
my-loki-grafana-agent-operator-7f58c9bd66-56w6h   1/1     Running   0          20h
my-loki-logs-bhtn2                                2/2     Running   0          16d
my-loki-logs-dbf5g                                2/2     Running   0          16d
my-loki-logs-mpmgt                                2/2     Running   0          16d
my-loki-logs-sbqsw                                2/2     Running   0          16d
my-vector-0                                       1/1     Running   0          16d



