Uploading Large Files to GenePattern


Posted on Friday, June 06, 2025 at 09:50AM by GenePattern Team


Many modern GenePattern analyses require multi-gigabyte input files. For example, AmpliconSuite runs might use FASTQ files that are 70GB or larger. Here are some tips for uploading and analyzing large files in GenePattern.

Methods for Uploading Large Files

There are three ways to upload large files to GenePattern:

  1. Upload in the Browser
  2. Transfer Using Globus
  3. Use S3 URIs

If you need more than the standard 30GB quota for data files, contact the GenePattern team to request a temporary quota increase. To start this process, post a message on the GenePattern help forum at https://groups.google.com/g/genepattern-help.

Uploads in the Browser

You can upload files of arbitrary size to the Files tab. For example, one AmpliconSuite user successfully uploaded two 60GB FASTQ files this way. Uploads to the Files tab bypass the GenePattern server and go directly to our Amazon Web Services (AWS) S3 bucket, leveraging AWS's high bandwidth and excellent connectivity.

[Image: highlights of where to upload in the browser]

Tips for Browser Uploads:

  • Avoid starting a large upload when you have jobs running on GenePattern, as the page may refresh periodically. If the page refreshes, the upload stops.
  • Prevent your computer from going to sleep during the upload process. On macOS, you can use a utility like Caffeine to keep your computer awake. Similar options are available for other operating systems.
  • Note that the File Upload button on the Run Job page is limited to files under 2GB. Use the Files tab for larger uploads.
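The size limits mentioned on this page can be turned into a quick check to run before you start an upload. The Python sketch below is illustrative only and is not part of GenePattern. The function and constant names are hypothetical, and it assumes that 1GB means 10^9 bytes, which may not match exactly how the server counts.

```python
import os
from typing import Iterable

# Assumption: decimal gigabytes. The server may count sizes differently.
GB = 10**9
RUN_JOB_UPLOAD_LIMIT = 2 * GB   # File Upload button on the Run Job page
STANDARD_QUOTA = 30 * GB        # standard quota for data files


def upload_advice(size_bytes: int) -> str:
    """Suggest where in the browser a file of this size can be uploaded."""
    if size_bytes < RUN_JOB_UPLOAD_LIMIT:
        return "Run Job page or Files tab"
    return "Files tab"


def exceeds_standard_quota(sizes_bytes: Iterable[int]) -> bool:
    """Return True if the combined size is larger than the standard quota.

    This only looks at the files you are about to upload; anything already
    stored in your account also counts toward the quota.
    """
    return sum(sizes_bytes) > STANDARD_QUOTA


def local_sizes(paths: Iterable[str]) -> list:
    """Read the sizes, in bytes, of files on the local disk."""
    return [os.path.getsize(p) for p in paths]
```

For example, `exceeds_standard_quota(local_sizes(["sample_R1.fastq.gz", "sample_R2.fastq.gz"]))` (hypothetical file names) tells you whether to request a larger quota before uploading the pair.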

Transfer Using Globus

Another option is to use Globus to transfer files to the GenePattern server. Detailed instructions for using Globus are available in the GenePattern User Guide.

[Image: highlights of where to upload using Globus]

Important Notes:

  • Initiate the Globus transfer from within GenePattern to ensure privacy. This lets us set up access control lists (ACLs) on directories to protect your data, and it lets GenePattern find the files and add them to your Files tab. Do not initiate transfers from the Globus Dashboard; files transferred that way will not become visible within GenePattern.
  • Transfers may fail if your files are coming from a Globus Connect Personal endpoint and the host computer goes to sleep or disconnects. This often happens when the endpoint is running on a laptop or desktop computer. Utilities like Caffeine can help prevent this issue.

Use S3 URIs

The third option is to use S3 URIs to reference files already stored in an S3 bucket.

[Image: highlights of where to enter S3 URIs]

Tips for Using S3 URIs:

  • For files in a public S3 bucket, simply enter the S3 URI in the "Add Path or URL" input for a file parameter.
  • For files in a private S3 bucket, create a pre-signed URL for the file. Refer to the AWS documentation for instructions on generating pre-signed URLs. Once created, use the pre-signed URL in the "Add Path or URL" input.
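As one possible way to create a pre-signed URL, the Python sketch below uses the AWS SDK for Python (boto3), a third-party package you would need to install separately. The helper names, the example bucket, and the one-hour default expiry are assumptions for illustration. The sketch also assumes your AWS credentials are already configured locally and allow reading the object. Pick an expiry long enough to cover the time until your job actually reads the file.

```python
from typing import Tuple


def parse_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an S3 URI such as s3://bucket/path/to/file into (bucket, key)."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI needs both a bucket and a key: {uri!r}")
    return bucket, key


def presign_s3_uri(uri: str, expires_in: int = 3600) -> str:
    """Return a pre-signed GET URL for the object named by an S3 URI.

    expires_in is the lifetime of the URL in seconds.
    """
    import boto3  # third-party; imported here so parse_s3_uri works without it

    bucket, key = parse_s3_uri(uri)
    client = boto3.client("s3")
    return client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

Calling `presign_s3_uri("s3://example-bucket/reads/sample_R1.fastq.gz", expires_in=86400)` (a hypothetical bucket and key) would return a URL valid for 24 hours that you can paste into the "Add Path or URL" input.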

If you have any questions about uploading large files to GenePattern, please reach out to us on the GenePattern help forum.
