Hi everyone!
I've aligned short reads against a multi-FASTA reference sequence. I'm working on Windows. The multi-FASTA reference contains two sequences, NG_005905 and NG_012772. I put the two FASTA files into one (called multiref.fasta) because I wanted to align against both sequences at once.
Then I did the following steps:
- built a Bowtie 2 index from the multi-FASTA reference file (bowtie2-build)
- aligned the read files (unpaired) (bowtie2-align) and got the SAM file (the alignment rate was 99.33%)
- converted SAM to BAM (samtools)
- sorted the BAM (samtools), which gave a sorted BAM
- indexed the sorted BAM (samtools), which gave the .bai file
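For anyone following along, the steps above correspond roughly to the commands below. This is only a sketch, not the exact commands used in this post: the file names (reads.fastq, multiref.sam, and so on) and the helper `build_commands` are made up for illustration, and it assumes `bowtie2-build`, `bowtie2` and `samtools` are available on the PATH. The script only prints the commands; it does not run them.

```python
def build_commands(ref_fasta, reads, prefix):
    """Return argv lists for: index -> align -> SAM to BAM -> sort -> index.

    ref_fasta, reads and prefix are placeholder names, not taken from the post.
    """
    sam = f"{prefix}.sam"
    bam = f"{prefix}.bam"
    sorted_bam = f"{prefix}.sorted.bam"
    return [
        # Build the Bowtie 2 index; 'prefix' is used as the index basename.
        ["bowtie2-build", ref_fasta, prefix],
        # Align unpaired reads (-U takes a comma-separated list) and write SAM.
        ["bowtie2", "-x", prefix, "-U", ",".join(reads), "-S", sam],
        # Convert SAM to BAM.
        ["samtools", "view", "-b", "-o", bam, sam],
        # Sort by coordinate, which IGV needs together with the index.
        ["samtools", "sort", "-o", sorted_bam, bam],
        # Create the .bai index next to the sorted BAM.
        ["samtools", "index", sorted_bam],
    ]


commands = build_commands("multiref.fasta", ["reads.fastq"], "multiref")
for cmd in commands:
    print(" ".join(cmd))
```

To actually run them from Python, each list could be passed to `subprocess.run(cmd, check=True)`; typing the printed lines into a terminal works just as well.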
Then I tried to view it in IGV. First it said that the file "does not contain any sequence names which match the current genome". Then I tried to open it from 'Genomes' and it showed nothing.
So my questions are:
1. Was it wrong to put the two FASTA-formatted sequences into one FASTA file in order to align against both of them?
2. Is the header of the sequences in the FASTA file wrong, so that IGV doesn't recognize NG_...? The headers are:
>NG_005905.2 Homo sapiens BRCA1, DNA repair associated (BRCA1), RefSeqGene (LRG_292) on chromosome 17
>NG_012772.3 Homo sapiens BRCA2, DNA repair associated (BRCA2), RefSeqGene (LRG_293) on chromosome 13
3. The unpaired reads have the same names in both files, for example (first two lines of each file):
exampl1.fastaq
@Frag_1 chr17 (Strand + Offset 106709--107175) 467M 101M
GAAGCCTGAGAATAATGACATTTGAGCCAATCTGCAGAGGTAAGTGAGTCCATAAAAGAAACTGAGGCTGGGCCTAGTGGCTCACACCTGTAATCCTAGCA
exampl2.fastaq
@Frag_1 chr17 (Strand + Offset 106709--107175) 467M 101M
AGGCAGGTCTCAAACTCCTGACCTCAGGTGATCCACCCACCTCAAGCCTCCCAAAGTGCTGGGATTATAGGCATGAGCCACCATGTCCGGCAAGTTTCTTT
Thank you for your answers! Best regards,
Anna
First load your multiref.fasta as a genome in IGV (Genomes > Load Genome from File...), and only then load your BAM file.
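The order matters because IGV matches the sequence names in the BAM against the names in the currently loaded genome. As far as I know, aligners such as Bowtie 2 keep only the part of the FASTA header before the first whitespace as the reference name, so the BAM should refer to NG_005905.2 and NG_012772.3, and a built-in human genome has no sequences with those names. Here is a small sketch of that comparison. The helper functions are hypothetical, and the SAM header lines are illustrative (the LN values are placeholders, not the real sequence lengths); in practice you would read them from the top of your own SAM file.

```python
def fasta_names(lines):
    """Sequence names as an aligner sees them: header text up to the first whitespace."""
    return [
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    ]


def sam_reference_names(lines):
    """Reference names taken from the SN: field of the @SQ header lines."""
    names = []
    for line in lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


fasta_headers = [
    ">NG_005905.2 Homo sapiens BRCA1, DNA repair associated (BRCA1), RefSeqGene (LRG_292) on chromosome 17",
    ">NG_012772.3 Homo sapiens BRCA2, DNA repair associated (BRCA2), RefSeqGene (LRG_293) on chromosome 13",
]
# Illustrative header only; LN values are placeholders.
sam_header = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:NG_005905.2\tLN:1000",
    "@SQ\tSN:NG_012772.3\tLN:1000",
]

genome = set(fasta_names(fasta_headers))
missing = [name for name in sam_reference_names(sam_header) if name not in genome]
print("names missing from genome:", missing)
```

If the list of missing names is empty, the BAM and the genome use the same sequence names, and IGV should be able to display the alignments once multiref.fasta is loaded as the genome.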
Thank you, that helped too:)