Pennsieve Dataset Submission Requirements

Overview

The Pennsieve platform supports large and complex scientific datasets composed of organized files and graph-based metadata.

Data acceptance criteria

File size

Currently, the platform reliably supports files smaller than 150GB. Larger files are technically supported, but we see an increase in time-out errors when users upload them. Please contact our development team if you need to upload individual files larger than 150GB.
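
Before starting an upload, it can help to check a local dataset folder for files at or above the size threshold. The following is a minimal sketch, not part of any Pennsieve tooling. The function name find_oversized_files is hypothetical, and reading "150GB" as 150 * 10**9 bytes is an assumption, since the text does not say whether decimal or binary units are meant.

```python
import os

# Assumption: "150GB" is read as 150 * 10**9 bytes (decimal units).
SIZE_LIMIT_BYTES = 150 * 10**9


def find_oversized_files(root, limit_bytes=SIZE_LIMIT_BYTES):
    """Return sorted (path, size) pairs for files under root at or above limit_bytes."""
    oversized = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size >= limit_bytes:
                oversized.append((path, size))
    return sorted(oversized)
```

Any file this reports is a candidate for contacting the development team before uploading.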

File types

The Pennsieve platform imposes no limitations on the types of files that are accepted. The platform recognizes a large number of scientific file formats.

Metadata records

The Pennsieve platform imposes a limit of 1 million (1M) metadata records per dataset, counted across all defined models.
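
Because the limit applies to the sum over all models rather than to each model separately, a quick tally of planned records can show whether a dataset fits. The sketch below is illustrative only. The function name check_record_limit and the model names in the usage example are hypothetical, and treating exactly 1,000,000 records as allowed is an assumption.

```python
# Assumption: the limit is inclusive, so exactly 1,000,000 records is accepted.
MAX_RECORDS_PER_DATASET = 1_000_000


def check_record_limit(records_per_model, limit=MAX_RECORDS_PER_DATASET):
    """Return (total, within_limit) for a mapping of model name to record count."""
    total = sum(records_per_model.values())
    return total, total <= limit


# Hypothetical models: each is under 1M on its own, but the dataset total is not.
planned = {"patient": 600_000, "sample": 500_000}
total, ok = check_record_limit(planned)
```

In this example total is 1,100,000 and ok is False, so the dataset would need fewer records or would have to be split.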

Known Limitations:

Upload service fails for datasets with large numbers of files:
May 18, 2022: At this time, the upload mechanism for the Pennsieve platform has trouble handling single upload sessions for folders with very large numbers of files (more than 1000). If such a dataset is uploaded in a single upload session, the upload service can slow down significantly or fail. This behavior does not depend on file size.

Current Workaround:
If an investigator needs to submit a dataset that consists of a large number of files, they can either:
1. Upload the dataset in multiple steps (upload folders individually or in groups, instead of the entire dataset folder at once)
2. Contact our team ([email protected], [email protected]) for more information.
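
The first option can be planned ahead of time. The sketch below groups the top-level subfolders of a dataset into batches that each stay at or below a file-count cap, so each batch can be uploaded in its own session. It only plans the batches and does not call any Pennsieve API. The names count_files and plan_upload_batches are hypothetical, and the cap of 1000 files per session is an assumption taken from the figure given in the limitation above.

```python
import os

# Assumption: keep each upload session at or below 1000 files.
MAX_FILES_PER_SESSION = 1000


def count_files(folder):
    """Count all files below folder, including nested subfolders."""
    return sum(len(filenames) for _, _, filenames in os.walk(folder))


def plan_upload_batches(dataset_root, max_files=MAX_FILES_PER_SESSION):
    """Group top-level entries of dataset_root into batches of at most max_files files.

    An entry that alone exceeds max_files ends up in a batch by itself and
    still needs to be split further by hand.
    """
    batches, current, current_count = [], [], 0
    for entry in sorted(os.listdir(dataset_root)):
        path = os.path.join(dataset_root, entry)
        # A loose top-level file counts as one file.
        n = count_files(path) if os.path.isdir(path) else 1
        if current and current_count + n > max_files:
            batches.append(current)
            current, current_count = [], 0
        current.append(path)
        current_count += n
    if current:
        batches.append(current)
    return batches
```

Each returned batch is a list of paths to upload together. The grouping is greedy and follows alphabetical order, which keeps the plan easy to review but does not minimize the number of sessions.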

Planned improvements:
A new upload service is currently under development and is expected to be in place by the end of July 2022. This should remove any restrictions on the number of files and on supported file sizes.