Dear all,
I have an already-aligned .bam file stored in a Google bucket. I'm trying to retrieve it and unmap it to uBAM so that I can run the whole analysis following the GATK best practices. When I use gsutil cp gs:url/to/file/in/google/bucket, the downloaded .bam file has an additional .gstmp extension.
samtools view tells me that the EOF marker is absent.
Is this a problem with downloading my .bam file from the Google bucket, or is the .bam file itself actually corrupted? How can I tell which is which? Any help will be much appreciated.
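For reference, the EOF marker that samtools looks for is the 28-byte empty BGZF block that the SAM/BAM specification places at the end of every complete BAM file. Below is a minimal Python sketch of that check; the function name has_bgzf_eof is made up for illustration. A missing marker only shows that the local copy is truncated. It does not say whether the truncation happened during the download or was already present in the bucket.

```python
import os

# Empty BGZF block that ends a complete BAM file (28 bytes, from the SAM/BAM spec).
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "4243" "0200"
    "1b00" "0300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as fh:
        # Jump to the last 28 bytes and compare them with the marker.
        fh.seek(-len(BGZF_EOF), os.SEEK_END)
        return fh.read() == BGZF_EOF
```

Running it on both the .gstmp file and a fresh download would show whether either copy is complete.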
Why do you need to unmap them?
@Pierre, data pre-processing for variant discovery according to the GATK best practices requires either FASTQ or uBAM format. In my case I only have aligned .bam files (I didn't do the alignment myself), hence the need to unmap the reads. https://software.broadinstitute.org/gatk/documentation/article?id=6484
It doesn't make sense to me: your reads are already mapped, and you'll unmap them and then remap them?
Try downloading the file again. Is there any md5 available on the Google bucket?
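If the bucket does expose an md5, it can be compared with the local copy. As far as I know, gsutil ls -L prints the object's "Hash (md5)" as a base64 string rather than the hex string that md5sum gives, so the local digest has to be encoded the same way. A minimal Python sketch, assuming the object has an md5 at all (composite objects may only carry a crc32c); the function name local_md5_base64 is made up for illustration:

```python
import base64
import hashlib


def local_md5_base64(path, chunk_size=1024 * 1024):
    """Return the MD5 of a local file, base64-encoded for comparison with the bucket's value."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so that large BAM files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")
```

If the two strings match, the download is intact and any problem is in the file as stored in the bucket. If they differ, the download is the likely culprit.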