Dear all,
I have already-aligned .bam files stored in a Google bucket. I'm trying to retrieve them and unmap them to uBAM so that I can run the whole analysis following the GATK Best Practices. When I use gsutil cp gs:url/to/file/in/google/bucket, the downloaded .bam files have an additional .gstmp extension.
samtools view tells me that the EOF marker is absent;
is this a problem with downloading my .bam files from the Google bucket, or is the .bam file itself actually corrupted? How can I tell which is which? Any help will be much appreciated.
Why do you need to unmap them?
@Pierre, data pre-processing for variant discovery according to the GATK Best Practices requires either FASTQ or uBAM format. In my case I have only aligned .bam files (I didn't do the alignment myself), hence the need to unmap the reads.
It doesn't make sense to me: your reads are already mapped, and you'll unmap them and then remap them?
Try downloading the file again. Is there an MD5 available on the Google bucket?
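To tell a bad download from a file that is corrupted in the bucket, two local checks may help; here is a rough Python sketch, not a definitive tool. As far as I know, gsutil writes to a temporary .gstmp file and renames it only when the transfer completes, so a leftover .gstmp usually points to an interrupted download. The first function looks for the 28-byte empty BGZF block that the SAM/BAM specification puts at the end of every complete BAM. The second computes the local MD5 in base64, which I believe is the encoding Cloud Storage uses when reporting an object's MD5. The function names and the command-line handling are my own, for illustration only.

```python
import base64
import hashlib
import sys
from pathlib import Path

# 28-byte empty BGZF block that terminates a complete BAM file (SAM/BAM spec).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 "
    "1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF EOF marker block."""
    if Path(path).stat().st_size < len(BGZF_EOF):
        return False
    with open(path, "rb") as fh:
        fh.seek(-len(BGZF_EOF), 2)  # 2 = seek relative to end of file
        return fh.read() == BGZF_EOF


def md5_base64(path, chunk_size=1 << 20):
    """Return the file's MD5 digest, base64-encoded, reading in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_bam.py downloaded.bam
    bam = sys.argv[1]
    print("EOF marker present:", has_bgzf_eof(bam))
    print("MD5 (base64):", md5_base64(bam))
```

Compare the printed MD5 with the "Hash (md5)" line from gsutil ls -L on the object. If they match and the EOF marker is still missing, the file was already truncated when it was uploaded; if they differ, the download is the problem. Note that objects uploaded as composite objects carry only a CRC32C and no MD5, so the comparison is not always possible. samtools quickcheck performs a similar EOF test from the command line.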