4.5 years ago
roy.granit
I have 3 amplicons that I have sequenced and now I wish to align them to the sequences of these three genes so I can quantify how many reads mapped to each gene.
In order to use STAR, must I go through the genome indexing step, or is there a way to align straight against the FASTA file containing the 3 genes?
Thanks!
You could make (and index) a custom reference file with only those 3 genes (perhaps use the gene locus sequences rather than CDS or similar). Keep in mind there is a risk that you bias your analysis, since every read can then only align to one of those 3 sequences.
Personally I think STAR is fast enough to use the whole genome as reference and do some filtering after mapping.
I did not mention that some of these genes do not exist in the genome.
If you don't expect off-target amplification, why not use whole-genome indexes if they are available? That can also reveal off-target amplification (if there was any).
If not, you will need to index the reference sequences (which should be trivial for a small number).
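For STAR specifically, one detail matters when indexing a reference this small: the STAR manual recommends scaling `--genomeSAindexNbases` down to min(14, log2(GenomeLength)/2 - 1). Below is a minimal sketch that computes that value from a FASTA file. The helper name `sa_index_nbases`, the file name `your_multifasta.fa`, and the output directory `star_index` are placeholders for illustration, not anything from the thread.

```shell
# Hypothetical helper: suggest a --genomeSAindexNbases value for a small
# reference, following the STAR manual's min(14, log2(GenomeLength)/2 - 1).
sa_index_nbases() {
    awk '
        /^>/ { next }                                  # skip FASTA headers
        { gsub(/[ \t\r]/, ""); total += length($0) }   # sum sequence length
        END {
            if (total == 0) { print "no sequence found" > "/dev/stderr"; exit 1 }
            n = int(log(total) / log(2) / 2 - 1)       # log2(total)/2 - 1, truncated
            if (n > 14) n = 14                         # STAR default is the upper bound
            if (n < 1)  n = 1                          # keep tiny inputs positive
            print n
        }
    ' "$1"
}

# Example use (assumed file and directory names):
#   STAR --runMode genomeGenerate \
#        --genomeDir star_index \
#        --genomeFastaFiles your_multifasta.fa \
#        --genomeSAindexNbases "$(sa_index_nbases your_multifasta.fa)"
```

For three genes totalling a few kilobases this gives a value of around 4 or 5, and the index build should finish in seconds.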
BBMap allows you to align directly without creating indexes first:
bbmap.sh in=your.fq.gz ref=your_multifasta.fa out=aligned.bam
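Once the reads are aligned, the per-gene quantification asked about in the question can be done by tallying the reference name column of the alignments. Here is a rough sketch, assuming the output was written as SAM (for example `out=aligned.sam`, or converted with `samtools view -h`). The function name `count_reads_per_ref` is made up for illustration. It counts only primary, mapped alignments, so for paired-end data each mate is counted separately.

```shell
# Hypothetical helper: count primary, mapped alignments per reference
# sequence in a SAM file (column 2 = FLAG, column 3 = reference name).
count_reads_per_ref() {
    awk -F'\t' '
        /^@/ { next }                        # skip SAM header lines
        {
            flag = $2
            if (int(flag / 4) % 2)    next   # 0x4: read unmapped
            if (int(flag / 256) % 2)  next   # 0x100: secondary alignment
            if (int(flag / 2048) % 2) next   # 0x800: supplementary alignment
            counts[$3]++
        }
        END { for (ref in counts) print ref "\t" counts[ref] }
    ' "$1" | sort
}

# Example use (assumed file name):
#   count_reads_per_ref aligned.sam
```

The output is one tab-separated line per gene: the reference name followed by the number of reads assigned to it.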
Thanks, how reliable is BBMap? I've never heard of it.
You are missing some important knowledge in that case :-) BBMap goes toe to toe with any NGS aligner out there. I will leave this link here so you can see what the BBTools suite can do (besides alignments). A guide for BBMap the aligner is available here.