Hello all,
I am working on Homo sapiens ChIP-seq data. I built the index with bowtie2 from the top-level assembly on the Ensembl website: https://ftp.ensembl.org/pub/release-110/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_rm.toplevel.fa.gz
Now, when I run the alignment with bowtie2, I get a very low overall alignment rate (around 35% to 39%) for all the samples. Why is the alignment rate so low? Any suggestions on what I am doing wrong or what is causing this, and what I can do to get better results? I am new to bioinformatics, so please bear with me if this is a very basic question. Any advice or help is appreciated.
This is the command I used for indexing:
bowtie2-build input.fasta genome_index
and this is the one for alignment:
bowtie2 -p 12 -q -x genome_index -U my_inputfile -S my_output
Any help would be appreciated.
Thanks and regards,
Mehvi
Did you run FastQC to check the quality of your sequencing samples?
Otherwise, try taking some of the reads that failed to align and running them through NCBI BLAST to see where those reads might be coming from.
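In case it helps, here is a rough sketch of how the unaligned reads could be collected and turned into FASTA for pasting into BLAST. The function names, file names and the output prefix are made up for illustration. The alignment options mirror the command in the question, with bowtie2's --un option added to write unpaired reads that fail to align to their own file. The second function takes the first few reads rather than a random sample, purely for simplicity, and assumes standard 4-line FASTQ records.

```shell
#!/usr/bin/env bash
# Hypothetical helper names and file names; adjust to your own data.

# Re-run the alignment and additionally write reads that failed to align
# to a separate FASTQ file (bowtie2's --un option, for unpaired reads).
align_keep_unaligned() {
    local reads="$1" index="$2" prefix="$3"
    bowtie2 -p 12 -q -x "$index" -U "$reads" \
        -S "${prefix}.sam" --un "${prefix}.unaligned.fastq"
}

# Print the first N reads of a FASTQ file as FASTA (default N = 10).
# Assumes each record is exactly 4 lines: header, sequence, '+', qualities.
fastq_head_to_fasta() {
    local fastq="$1" n="${2:-10}"
    awk -v n="$n" '
        NR % 4 == 1 { if (++count > n) exit; print ">" substr($0, 2) }
        NR % 4 == 2 { print }
    ' "$fastq"
}
```

Something like align_keep_unaligned my_inputfile genome_index sample1 followed by fastq_head_to_fasta sample1.unaligned.fastq 5 would then print five sequences to try in BLAST.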
Yep, you can sample your reads with seqtk (its sample subcommand). Take a couple and run them through BLAST as dsull suggested, and once you have an idea where they are coming from you can use FastQ Screen to quantify what proportion of the reads come from one genome or another.
Thanks for your reply, I will try this. Yes, I had run FastQC and the results were fine. Let me know if there are any FastQC results I should focus on or check again.