Hello everyone,
I am using BioStars to ask a question.
I would like to extract data from BAM files, and I have already succeeded with small BAM files (<1 GB). However, I could not extract the nucleotide data from relatively large BAM files (about 10 GB or more). My question is: what is the reason for this? Could it be a limitation of my PC's specifications? I am using Ubuntu installed on Windows 10. The extraction fails even when the only thing I change is the BAM file name. The BAM files were made by collaborators, and I believe they are not corrupted because I can visualize them in the IGV viewer, although I cannot exclude that possibility. The problem occurs with different software as well: some tools report an error, while others give no error description.
One of the error messages is as follows:
AssertionError: chromosome not in SAM references: chr1
Hi, it's not clear what you were trying to do and what exactly didn't work. Can you provide the command you ran and explain what happened? Did you get an error message, or did it just get stuck? Judging by the error you mention, there might be something wrong with the BAM header. Take a look using samtools view -h.
Apart from liorglic's comment (specifically about the question formatting), this sounds like you are requesting a chromosome/sequence name that is not present in the BAM file.
Are you sure you are asking for the correct ID? (You can check the IDs in the BAM file with the command line given by liorglic.) Keep in mind that chr1 is not the same as Chr1 or chr_1, and so on.
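To illustrate the naming point, here is a minimal sketch in plain Python. It assumes you have the header as text (for example saved with samtools view -H your.bam > header.txt) and looks for sequence names that differ from the requested one only by a "chr" prefix, case, or underscores. The function names and the example header are made up for illustration; this is not what your software does internally.

```python
def reference_names(header_text):
    """Collect the SN: values from the @SQ lines of a SAM/BAM header."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def normalize(name):
    """Lowercase, drop underscores, and strip a leading 'chr' (illustrative rule)."""
    key = name.lower().replace("_", "")
    return key[3:] if key.startswith("chr") else key


def suggest(requested, names):
    """Return the exact match if present, otherwise names that look equivalent."""
    if requested in names:
        return [requested]
    key = normalize(requested)
    return [n for n in names if normalize(n) == key]


# Hypothetical header whose chromosomes are named "1", "2" rather than "chr1", "chr2".
header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:1\tLN:248956422\n@SQ\tSN:2\tLN:242193529\n"
names = reference_names(header)
print(suggest("chr1", names))
```

If the list that comes back does not contain the exact name you asked for, the request has to use the name from the header (here "1" instead of "chr1"). An empty list means the sequence is probably not in the file at all.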