I aligned all my Nanopore Rna-Seq reads with Human Reference Genome and Transcript (Then I had 2 different bam files one of them is aligned with Reference Genome other is aligned with Transcript)
I download reference files from:https://www.ncbi.nlm.nih.gov/genome/guide/human/
I want to extract chromosome X from the aligned BAM file. I tried extract region with samtools view command but it did not work I taken the error specifies an invalid region or unknown reference. Continue anyway.'
I used to check chromosome list with samtools view -H input.bam
then the list for both bam file(transcript and genome)
@SQ SN:XR_928432.2 LN:810
Can you tell me what am I doing wrong ? and How to extract that region?
what was the command ???
your parameter order is wrong, samtools expects:
so you need:
Thank you for the answer but it is not working, I have taken the same error
check if chromosome X is labeled as X or chrX
I checked with
samtools view -H input.bam
with that command but I did not see any chromosome I downloaded ref gene from that link: https://www.ncbi.nlm.nih.gov/genome/guide/human/I saw only like
Check your reference that you aligned to for the chromosome names.
You can ussally use grep for it
Thank you for learned to me grep command, I check it, and I see all chromosomes in there. But I can not extract from bam file I am exactly sure I aligned with the correct file.
Ok, and when you do
Does the X in the above command match what the X chromosome is called in the output of the grep command?
You need to make sure the samtools X: is the same as what X is called in the grep command output.
In grep command its output like(genomic ref);
You are right, I think it's output different from
X:123456-654321
then and I didn't see any location as a number. In that case how to extract that region?I am confused should I align with the reference file from the reading, I did but I can not visualize to look chrX.
Sure, please try this:
It will not solve the problem because it has many different kinds of 'NT_ '
In your reference should be one NT_ sequence for the X chromosome, the other NT_ sequences are the other chromomosomes (1,2,3...22, Y, MT)
Does anyone have a different idea?
Please paste the entire header of the SAM/BAM file.