Explore 6.1 PB of genomics data across 5M files
| Dataset | Size | Description |
| --- | --- | --- |
| Sample Genomics Data | 5.15 GB | Data from various genomics file formats (BAM, VCF, BED, etc), and sequencing technologies. |
| Human Pangenome Project | 3.75 PB | Sequencing data and analysis of 10 trios. First complete human genome assembly. |
| Genome in a Bottle | 124 TB | Reference data from several sequencing technologies. Used as ground truth for benchmarking. |
| 1000 Genomes Project | 766 TB | Sequencing data and analysis of >2,500 individuals from around the world. |
| Bio Data Zoo | 619 kB | Example genomics data for tool developers |
| DeepVariant Datasets | 5.4 TB | Sample data used for testing and benchmarking the DeepVariant variant caller. |
| Broad Public Datasets | 4.09 TB | Sample datasets from the Broad Institute for testing bioinformatics workflows. |
| Genome Ark | 1.13 PB | Data from the Vertebrate Genomes Project (VGP), featuring reference genomes for vertebrate species. |
| Human Microbiome Project | 5.86 TB | Microbiome data of 300 healthy adults, and several individuals with disease conditions. |
| Australasian Genomes | 8.38 TB | Sequencing datasets and reference genomes of several threatened Australasian species. |
| 3000 Rice Genomes | 248 TB | Sequencing data and analysis of >3,000 rice varieties from 89 countries. |
| GATK Test Data | 1.05 TB | Test datasets for the GATK variant caller, with data from WGS, WES, and RNA-seq. |
| Element Bio Data | 535 GB | Data from the Element Bio manuscript about the Avidity instrument. |
| ONT Data | 91.7 TB | Oxford Nanopore benchmarking datasets from various sequencing chemistries and samples. |
| Pediatric Brain Tumor Atlas | 3.03 TB | Analysis of pediatric brain tumors: gene expression, gene fusions, somatic mutations, CNVs, and SVs. |
| Genome in a Bottle (FTP) | | Reference data from several sequencing technologies. Used as ground truth for benchmarking. |