Hi all
I recently built a web-based RNA-Seq analysis platform, and it runs successfully. However, I have no transcriptome BAM files to test it with. Where can I find publicly released BAM files?
Thanks~~!!
If you want smaller BAM files for testing, here is an ENCODE collection whose files start at around 40 MB: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeUwRepliSeq
Here is an example (to display it, paste the link as it is into the "Custom track" text box in the UCSC genome browser): http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeUwRepliSeq/wgEncodeUwRepliSeqGm12878G1bAlnRep1.bam
Here you have FASTQ and BAM files from different human cell lines organized in a pretty nice interface: http://genome.crg.es/~jlagarde/encode_RNA_dashboard/
For a quick-gratification one-liner (instead of multi-step and multi-MB approaches), you could just use this command (after installing samtools):
samtools view -h -O bam ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/data/HG00154/alignment/HG00154.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam 17:7512445-7513455 > test.bam
I'll leave the choice of region to you, though. I am not sure which regions from 1000 Genomes would work best for your use case, and 17:7512445-7513455 might not be what you are looking for, but you get the idea.
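To make the one-liner above reusable, here is a minimal sketch of a shell helper that slices a region from a remote BAM, checks the result, and indexes it. The function name slice_bam and the DRY_RUN switch are made up for illustration. It assumes samtools 1.3 or later (for quickcheck) and that the remote BAM has its .bai index next to it.

```shell
#!/bin/sh
# Sketch: cut a small region out of a remote, indexed BAM for use as test data.
# slice_bam is a hypothetical helper name; set DRY_RUN=1 to print the
# samtools view command instead of running anything.
slice_bam() {
    url=$1      # remote BAM (assumed to have its .bai index alongside)
    region=$2   # e.g. 17:7512445-7513455
    out=$3      # output file name
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "samtools view -b -h -o $out $url $region"
        return 0
    fi
    # -b writes BAM, -h keeps the header, -o names the output file
    samtools view -b -h -o "$out" "$url" "$region" &&
        # quickcheck exits non-zero if the file looks truncated or malformed
        samtools quickcheck "$out" &&
        # index the slice so region queries and genome browsers can use it
        samtools index "$out"
}
```

Usage would look like: slice_bam <URL of the 1000 Genomes BAM above> 17:7512445-7513455 test.bam. Keep in mind that this particular file is low-coverage DNA sequencing, so for an RNA-Seq platform you would probably want to point the same helper at an RNA-Seq BAM instead.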
There are of course many many places to look, but one place to start is the ENCODE RNA-seq data hosted at UCSC, e.g.: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeSydhRnaSeq/ and the other RNA-seq tracks at http://genome.ucsc.edu/ENCODE/downloads.html.
The 1000 Genomes Project gives access to all of its collected data: http://www.1000genomes.org/data
Quick links:
The sequence and alignment data generated by the 1000genomes project is made available as quickly as possible via our mirrored ftp sites.
You can also use the cool Data Slicer app to retrieve a subset of the data from 1000 Genomes BAM files. Command-line utility documentation here and web-app here. Using Data Slicer, you can import BAM files on the fly into your web app.
Some here too: https://github.com/brainstorm/tiny-test-data
Do you maintain this separately from https://github.com/roryk/tiny-test-data ?
How about finding BAM files for viral genomes, such as HIV? None of the sites mentioned seems to offer high-coverage data of that kind. Any ideas? Thank you.