Copy a public dataset to your own AWS account

Overview

You can use the AWS CLI to transfer datasets from Pennsieve Discover into your own AWS account for further analysis.

Prerequisites

Every dataset on Pennsieve Discover is stored in a publicly accessible Amazon S3 bucket, so you can access the data directly through AWS. You must provide your own AWS credentials because downloading data can incur costs, which are charged to your account.

Configuring your computer to download a dataset takes two steps:

  1. Create and configure an AWS account for getting data from Pennsieve Discover
  2. Install the AWS Command Line Interface (AWS CLI)
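
The two steps above can be checked from a terminal. The following is a minimal sketch, assuming a POSIX shell. The function name check_aws_cli is a hypothetical helper and not part of the AWS CLI; it only reports whether the aws executable is on your PATH. The commented commands are standard AWS CLI calls for storing credentials and confirming that they work.

```shell
#!/bin/sh
# check_aws_cli is a hypothetical helper name used only for this sketch.
# It prints the installed AWS CLI version, or returns 1 if `aws` is not on PATH.
check_aws_cli() {
    if command -v aws >/dev/null 2>&1; then
        aws --version
    else
        echo "AWS CLI not found; install it before continuing." >&2
        return 1
    fi
}

# Once the CLI is installed, run these interactively:
#   aws configure                  # prompts for access key, secret key, region, output format
#   aws sts get-caller-identity    # prints the account and identity the credentials belong to
```

If aws sts get-caller-identity returns your account details, the credentials are set up and the copy command in the next section should be able to authenticate.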

Copying a dataset to another AWS S3 Bucket

Once you have created an AWS account and installed the CLI, you can transfer data from Pennsieve Discover to your bucket by running the following command in your local terminal:

aws s3 sync s3://[discover-dataset-bucket] s3://[your-bucket] --request-payer requester

You can get the [discover-dataset-bucket] name by clicking the Get Dataset button on the dataset's page in Discover. [your-bucket] is the name of your bucket and can be obtained using the AWS Console.
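
Because the copy is billed to your account, it can help to preview it first. The sketch below assumes a POSIX shell; sync_command is a hypothetical helper, and the two bucket names are placeholders to replace with real values. It only prints the commands so you can review them before running anything. The --dryrun option of aws s3 sync lists the operations that would be performed without transferring any data.

```shell
#!/bin/sh
# Placeholder values for illustration; replace with the real bucket names.
SRC_BUCKET="discover-dataset-bucket"
DEST_BUCKET="your-bucket"

# sync_command is a hypothetical helper that prints the sync command
# for a source and destination bucket. An optional third argument
# (for example --dryrun) is appended to the end of the command.
sync_command() {
    printf 'aws s3 sync s3://%s s3://%s --request-payer requester%s\n' \
        "$1" "$2" "${3:+ $3}"
}

# Print the preview command first, then the command that performs the copy.
sync_command "$SRC_BUCKET" "$DEST_BUCKET" --dryrun
sync_command "$SRC_BUCKET" "$DEST_BUCKET"
```

Run the first printed command to see which objects would be copied, then run the second once the list looks right.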

To learn more about moving S3 objects between buckets, see
https://aws.amazon.com/premiumsupport/knowledge-center/move-objects-s3-bucket/.

Requester Pays

By including the --request-payer requester option, you acknowledge that any costs associated with downloading the data will be charged to your AWS account. For transfer pricing information, see the Amazon S3 pricing documentation. The relevant section is Data Transfer OUT From Amazon S3 To Internet.
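
To get a rough idea of the charge before copying, you can multiply the dataset size by the per-GB rate from the pricing page. The sketch below assumes a POSIX shell with awk available. estimate_cost is a hypothetical helper, and both numbers in the example are made up: the rate is not an AWS price and must be replaced with the current value from the pricing documentation.

```shell
#!/bin/sh
# estimate_cost is a hypothetical helper: size in GB multiplied by a per-GB rate.
# The rate is NOT taken from AWS; look up the current value on the pricing page.
estimate_cost() {
    LC_ALL=C awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", gb * rate }'
}

# To find the dataset size first (requires configured credentials):
#   aws s3 ls s3://[discover-dataset-bucket] --recursive --summarize --human-readable --request-payer requester

# Example with made-up numbers: 200 GB at a placeholder rate of 0.05 per GB.
estimate_cost 200 0.05   # prints 10.00
```

The --summarize option of aws s3 ls reports the total object count and size at the end of the listing, which gives the first argument for the estimate.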