Hi all,
I am using Python and pysam to analyze my sequencing data. I aligned my paired-end reads from the Illumina MiSeq platform to my reference genome with bowtie2. The reference genome consists of 17 chromosomes. When I iterate over the records in my .sam file with pysam, most records have a 'reference_id' between 0 and 16, which corresponds to the chromosomes. However, some records have a 'reference_id' of -1.
I'm unsure what this indicates. Does it mean that the read could not be mapped to the reference genome? I would think so, but I can't find this documented anywhere. Any help would be much appreciated!
I ran a test with a random read that does not map to the reference. When I checked this unmapped read, its reference_id was -1.
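A minimal sketch of how this kind of check could be made explicit, using only the standard library on raw SAM text rather than the exact script used for the test above. It relies on two facts from the SAM format: an unmapped read carries the 0x4 bit in its FLAG field, and a read with no reference has '*' as RNAME, which is what shows up as a reference_id of -1. The helper names and the two example records are hypothetical. In pysam itself, the same flag should be available as `read.is_unmapped`. Testing the flag is likely the safer check, because an unmapped read whose mate is mapped is conventionally given the mate's RNAME, so its reference_id would not be -1.

```python
from collections import Counter

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def is_unmapped(flag: int) -> bool:
    """Return True if the SAM FLAG has the 'segment unmapped' bit (0x4) set."""
    return bool(flag & UNMAPPED_FLAG)


def summarize_sam_lines(lines):
    """Count records per RNAME and count records flagged as unmapped."""
    per_reference = Counter()
    unmapped = 0
    for line in lines:
        if line.startswith("@"):  # header lines start with '@'
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]  # columns 2 and 3 of a SAM record
        per_reference[rname] += 1
        if is_unmapped(flag):
            unmapped += 1
    return per_reference, unmapped


# Hypothetical example: one mapped read and one unmapped read.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
per_reference, unmapped = summarize_sam_lines(example)
print(per_reference, unmapped)
```

Comparing the count for '*' with the unmapped count would show whether any unmapped reads were assigned a chromosome because of their mapped mate.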