How to Easily Migrate Data From AWS to GCP

Justin Niessner

There are a number of popular cloud providers, but migrating your data from one to another can be daunting.

At Adzerk, by default our Data Shipping feature (which gives customers full logs of all ad serving requests) stores info in an AWS S3 bucket. But sometimes our customers request GCP (Google Cloud Platform) instead.

For these clients I built a tool to accomplish this - and I thought I'd share publicly, as it's abstract and powerful enough to be used by anyone migrating data from S3 to GCP.

Weighing the options

I started by weighing three data shipping options:

  1. Lambda Function triggered by a CloudWatch Scheduled Event

    While simple, this option required re-syncing every time I needed near real-time data in GCP during the day - or every night, if I were holding analysis for the following day.

  2. Lambda Function triggered by an S3 event

    This option was slightly more complicated, as it required having a trigger on an existing S3 bucket. It would be called any time a file was put in the data shipping bucket.

  3. Lambda Function triggered by a CloudWatch Scheduled Event with SQS

    This was the most complex option. Data was slow to arrive in GCP, and I wasn’t sure how it would handle failure.
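To make the trade-off concrete: a scheduled job such as option 1 has to work out, on every run, which objects are still missing from GCP. A minimal sketch of that comparison, assuming both key listings have already been fetched. The function name and sample keys are hypothetical and only for illustration.

```python
from typing import Iterable, List


def keys_to_sync(source_keys: Iterable[str], destination_keys: Iterable[str]) -> List[str]:
    """Return the source keys that are not yet in the destination, sorted."""
    already_copied = set(destination_keys)
    return sorted(key for key in set(source_keys) if key not in already_copied)


if __name__ == "__main__":
    s3_keys = ["logs/a.json", "logs/b.json", "logs/c.json"]  # sample listing
    gcs_keys = ["logs/a.json"]                               # sample listing
    print(keys_to_sync(s3_keys, gcs_keys))
```

An S3-triggered function avoids this listing step, because each event already names the object that changed.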

Deciding on a solution

After weighing these options, I decided to proceed with an S3-triggered Lambda job. It offered the best balance between simplicity and timely delivery, the ability to handle scale at a reasonable price, and the ability to ‘upgrade’ if needed.
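As a rough illustration of the S3-triggered approach, here is a minimal Python sketch of a handler that reads the bucket and key out of each S3 event record. It is not the tool's actual code (which, as noted later, is JavaScript). The copy_object callback is a hypothetical stand-in for the real "download from S3, upload to GCP" step.

```python
from typing import Callable, Iterator, Tuple
from urllib.parse import unquote_plus


def iter_s3_objects(event: dict) -> Iterator[Tuple[str, str]]:
    """Yield (bucket, key) for every record in an S3 event notification."""
    for record in event.get("Records", []):
        s3_info = record["s3"]
        # Object keys arrive URL-encoded in S3 events, so decode them first.
        yield s3_info["bucket"]["name"], unquote_plus(s3_info["object"]["key"])


def make_handler(copy_object: Callable[[str, str], None]):
    """Build a Lambda-style handler around a caller-supplied copy function."""

    def handler(event: dict, context: object = None) -> dict:
        copied = []
        for bucket, key in iter_s3_objects(event):
            copy_object(bucket, key)  # hypothetical: fetch from S3, write to GCP
            copied.append(key)
        return {"copied": copied}

    return handler
```

Passing the copy step in as a function keeps the event parsing testable without touching either cloud.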

Authenticating in GCP

This step required a service account with limited permissions - similar to the IAM role for AWS. I had to embed JSON credentials in the Lambda package prior to upload.
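Embedding the credentials amounts to adding the key file to the zip archive that gets uploaded as the Lambda source bundle. A minimal sketch of that packaging step using only the Python standard library. The archive member name gcp-credentials.json and the sample file names are assumptions for illustration, not the names the tool uses.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import Iterable


def build_lambda_bundle(bundle_path: Path, source_files: Iterable[Path],
                        credentials_file: Path) -> Path:
    """Write the source files plus the service account key into one zip."""
    with zipfile.ZipFile(bundle_path, "w", zipfile.ZIP_DEFLATED) as bundle:
        for source in source_files:
            bundle.write(source, arcname=source.name)
        # Stored under a fixed (hypothetical) name so the function can find it.
        bundle.write(credentials_file, arcname="gcp-credentials.json")
    return bundle_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        (work / "index.js").write_text("// handler code goes here\n")
        (work / "my-key.json").write_text("{}\n")
        out = build_lambda_bundle(work / "bundle.zip", [work / "index.js"],
                                  work / "my-key.json")
        with zipfile.ZipFile(out) as bundle:
            print(sorted(bundle.namelist()))
```

Because the key travels inside the bundle, anyone who can read the Lambda source bucket can read the key, which is one more reason to keep the service account's permissions narrow.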

Weighing the deployment options

Next, I weighed the deployment options - CloudFormation or Serverless.

CloudFormation is a simple, built-in option. It features native AWS tooling but requires pre-packaged deployment assets, and it doesn’t have native support for S3 events on existing buckets.

Serverless is a cloud-agnostic framework for developing and deploying cloud applications. It also includes handy tools for managing deployments, including across different environments. It’s easy to use with an existing S3 bucket, but it requires a third-party application to deploy.

Deciding on a deployment solution (with a workaround)

Given these options - and the unique challenges of each - I chose CloudFormation, together with a workaround for S3 events on existing buckets. The result uses both JavaScript (Adzerk’s code) and Python (AWS’s workaround).
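A common form of this kind of workaround is a Lambda-backed custom resource: a small function that CloudFormation invokes during deployment and that attaches the notification to the existing bucket through S3's PutBucketNotificationConfiguration API. Below is a minimal sketch of the configuration such a function might send. The function name, ARN and prefix are hypothetical, and this is not the code from the Adzerk repository.

```python
from typing import Optional


def build_notification_config(lambda_arn: str, prefix: Optional[str] = None) -> dict:
    """Build an S3 notification configuration that invokes a Lambda function
    whenever an object is created, optionally limited to a key prefix."""
    lambda_config = {
        "LambdaFunctionArn": lambda_arn,
        "Events": ["s3:ObjectCreated:*"],
    }
    if prefix:
        lambda_config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"LambdaFunctionConfigurations": [lambda_config]}


if __name__ == "__main__":
    # Hypothetical ARN, for illustration only.
    arn = "arn:aws:lambda:us-east-1:123456789012:function:ship-to-gcp"
    print(build_notification_config(arn, prefix="logs/"))
```

Note that this API replaces the bucket's entire notification configuration, so any notifications already on the bucket need to be merged into the new configuration rather than overwritten.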

You, too, can flip your data shipping from AWS to GCP by following these 10 steps.

10 steps to replicating and shipping your user data from AWS to GCP:

AWS to GCP set-up: 6 steps

Step One: Ensure that you have the Make and Zip command line tools installed.

On macOS, you'll need to install the Xcode Command Line Tools using xcode-select --install, and Zip via Homebrew using brew update && brew install zip.

If you're using an Ubuntu-based Linux distribution, you can run sudo apt update && sudo apt install zip build-essential.

Step Two: Next you'll need the AWS CLI. You can find installation and configuration instructions here.

Once installed and configured, you can run aws s3 mb s3://your-globally-unique-lambda-source-bucket-name to create an S3 bucket to store your Lambda source bundles.

Step Three: Enable and configure data shipping for your network

Step Four: Now we need somewhere for the data to go. If you already have a GCP Storage bucket created, great, you can skip to creating the Service Account Key. If not, go ahead and create one in your account.

Step Five: Access or create your Service Account Key, which the Connector will use to write JSON files to your bucket (see note below)

Step Six: Save the key locally and set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the JSON file. (You’ll need to remove any spaces from your JSON token filename for the make tasks.)
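Before running the make tasks, it can help to sanity-check the key file. The sketch below assumes the standard GOOGLE_APPLICATION_CREDENTIALS variable and the usual fields of a service account key file. The function name and the exact set of checks are illustrative, not part of the tool.

```python
import json
import os
from pathlib import Path
from typing import List, Mapping, Optional

REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_credentials(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of problems with the configured key file (empty if none)."""
    env = os.environ if env is None else env
    raw_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not raw_path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    problems = []
    path = Path(raw_path)
    if " " in path.name:
        problems.append("key filename contains spaces")
    if not path.is_file():
        problems.append("key file does not exist")
        return problems
    try:
        key = json.loads(path.read_text())
    except ValueError:
        problems.append("key file is not valid JSON")
        return problems
    if not isinstance(key, dict) or key.get("type") != "service_account":
        problems.append("key file is not a service account key")
        return problems
    missing = [field for field in REQUIRED_FIELDS if field not in key]
    if missing:
        problems.append("key file is missing fields: " + ", ".join(missing))
    return problems


if __name__ == "__main__":
    for problem in check_credentials():
        print("Problem:", problem)
```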

NOTE: For Adzerk customers, data shipping may re-write files using the same filename. If the Connector only needs to create new files in Google Cloud, grant the Storage Object Creator role when generating your token. That role cannot replace existing files, so to allow the Connector to overwrite files, grant it the Storage Object Admin role instead.
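The difference between the two roles can be summarized in a few lines. This is a simplified model for illustration only, not a call to any Google API. The role IDs are the standard identifiers for these two roles, and the function names are hypothetical.

```python
OBJECT_CREATOR = "roles/storage.objectCreator"  # Storage Object Creator
OBJECT_ADMIN = "roles/storage.objectAdmin"      # Storage Object Admin


def write_allowed(role: str, object_exists: bool) -> bool:
    """Simplified model: can a writer with this role store the object?"""
    if role == OBJECT_ADMIN:
        return True  # may create new objects and replace existing ones
    if role == OBJECT_CREATOR:
        return not object_exists  # may create, but not replace
    return False


def role_for_connector(allow_overwrite: bool) -> str:
    """Pick the narrowest of the two roles that covers the desired behavior."""
    return OBJECT_ADMIN if allow_overwrite else OBJECT_CREATOR
```

With Storage Object Creator, a re-written file that reuses an existing filename is rejected rather than replaced.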

AWS to GCP installation: 4 steps

Step One: Gather the following values for deployment:

  • Stack name
  • Lambda source bucket (to create S3 events on existing S3 buckets)
  • Source bucket
  • Destination bucket
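A quick format check on these values can catch typos before CloudFormation rejects them. The sketch below uses simplified versions of the naming rules (stack names: a letter followed by letters, digits or hyphens, up to 128 characters; bucket names: 3 to 63 lowercase characters). The real rules have more edge cases, and the function and argument names are hypothetical.

```python
import re
from typing import Dict

STACK_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9-]{0,127}$")
S3_BUCKET = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")
# Google Cloud Storage bucket names additionally allow underscores.
GCS_BUCKET = re.compile(r"^[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]$")


def validate_deploy_values(stack_name: str, lambda_source_bucket: str,
                           source_bucket: str, destination_bucket: str) -> Dict[str, str]:
    """Return {value name: problem} for every value that looks malformed."""
    checks = [
        ("stack_name", stack_name, STACK_NAME, "invalid CloudFormation stack name"),
        ("lambda_source_bucket", lambda_source_bucket, S3_BUCKET, "invalid S3 bucket name"),
        ("source_bucket", source_bucket, S3_BUCKET, "invalid S3 bucket name"),
        ("destination_bucket", destination_bucket, GCS_BUCKET, "invalid GCS bucket name"),
    ]
    return {name: problem for name, value, pattern, problem in checks
            if not pattern.match(value)}


if __name__ == "__main__":
    print(validate_deploy_values("your-stack-name", "your-source-bucket",
                                 "your-adzerk-data-bucket", "your_gcp_bucket"))
```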

Step Two: Start your deployment

Step Three: Create the stack:

make create STACK_NAME=your-stack-name \
            LAMBDA_SOURCE_BUCKET=your-source-bucket \
            SOURCE_BUCKET=your-adzerk-data-bucket \

Step Four: Update the stack:

make update STACK_NAME=your-stack-name \
            LAMBDA_SOURCE_BUCKET=your-source-bucket \
            SOURCE_BUCKET=your-adzerk-data-bucket \

Results: New architecture and functions

[Figure: AWS to GCP migration process]

The CloudFormation template will create new Lambda functions and supporting infrastructure in AWS:

  • New S3 events on existing S3 buckets
  • New S3 events for your data shipping bucket
  • New functions and S3 events to ship new files from S3 to GCP
  • All necessary IAM permissions

Once your new Google Cloud Storage bucket is set up and receiving files, you can also set triggers to alert on new data and execute further processing logic.
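One way to set such a trigger is a Cloud Function that runs whenever an object is written to the bucket. The sketch below assumes the (event, context) signature of a Python background function, where the event carries the bucket and object name. The function name and the message format are hypothetical, and the processing step is left as a placeholder.

```python
def on_new_file(event: dict, context: object = None) -> str:
    """Handle a 'new object' notification from a Cloud Storage bucket."""
    bucket = event["bucket"]
    name = event["name"]
    message = f"New data shipping file: gs://{bucket}/{name}"
    # Placeholder: send an alert or start further processing here.
    print(message)
    return message


if __name__ == "__main__":
    on_new_file({"bucket": "your-gcp-bucket", "name": "logs/example.json"})
```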

Want to learn more?

View our GitHub guide, and share your data migration experiments and experiences in the comments below. We’d love to hear what’s working well for you - and what other topics you’d find helpful.

Questions or feedback about Adzerk data shipping? Your Account Manager is always glad to hear from you!
