Hello reader,
We just got some data from a macrophage infection assay with several fungal strains. RNA was extracted from these assays and sequenced as mRNA, 2x150 bp paired-end.
Human macrophages were used for this analysis, and my colleague extracted RNA from the whole well of the assay plate. This means the sample contains both macrophage and fungal mRNA, in unknown proportions.
Now I want to extract the fungal-specific reads to use in a separate genome annotation process.
The idea I have works like this:
- Map QC-cleaned reads to the human reference genome using Hisat2/STAR.
- Extract all reads that do not map to the human reference using the samtools view -f 4 flag.
These reads will theoretically be fungal mRNA reads.
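To make the flag step concrete, here is a minimal Python sketch (not samtools itself) of how a required-flag filter such as -f behaves: a record is kept only if every requested bit is set. The bit values come from the SAM specification; the example flag values and the helper name passes_required are made up for illustration. With paired-end data the 0x4 bit is set per read, so -f 4 can keep a read whose mate did map to human, whereas requiring both 0x4 and 0x8 (value 12) keeps only pairs where both mates are unmapped.

```python
# SAM flag bits, as defined in the SAM specification
READ_PAIRED = 0x1
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def passes_required(flag: int, required: int) -> bool:
    """Mimic a required-flag filter: keep the record only if ALL required bits are set."""
    return (flag & required) == required


# Hypothetical flag values for first-in-pair reads, for illustration only
examples = {
    "both mates unmapped": 77,           # 0x1 + 0x4 + 0x8 + 0x40
    "read unmapped, mate mapped": 69,    # 0x1 + 0x4 + 0x40
    "read mapped, mate unmapped": 73,    # 0x1 + 0x8 + 0x40
    "both mates mapped (proper pair)": 99,  # 0x1 + 0x2 + 0x20 + 0x40
}

for label, flag in examples.items():
    kept_f4 = passes_required(flag, READ_UNMAPPED)
    kept_f12 = passes_required(flag, READ_UNMAPPED | MATE_UNMAPPED)
    print(f"{label:32s} flag={flag:3d}  -f 4: {kept_f4}  -f 12: {kept_f12}")
```

Which of the two filters is appropriate depends on whether half-mapped pairs should count as human or as candidate fungal reads.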
Questions:
- Is this really the best approach to follow?
- Which human genome should I use for this analysis: hg38 or the T2T genome?
- Will samtools extract only reads that show no alignment at all, or should there be a threshold for extracting reads?
If you can suggest any other approach that is better than the one I have, it would be really helpful. This data will also be used for an RNA-seq study, but at a later stage.
Thank you
Unfortunately, there is no chromosome-level reference genome available. But there are some genome assemblies available at scaffold or contig level with annotations, though they might have some gaps too. Can those be used?
It is worth trying. You will know that reads that map to the fungal reference (even though it is fragmented) definitely belong there.
You could also use a transcriptome/cDNA collection instead (if available), since you are working with RNA-seq data.
I used the following command. Kindly let me know if I missed something or if it is now correct.
and these were the results
and stats
Thank you
Looks good to me. You are being strict with multi-mappers across genomes (i.e. throwing them away). Something to make a note of.
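As a toy illustration of what being strict with cross-genome multi-mappers means, the Python sketch below bins reads by comparing a best alignment score against each reference. It is not how bbsplit.sh works internally; the function names, the score values, and the keep_ambiguous switch are all hypothetical, and "ambiguous" is assumed here to mean an equally good best hit in both genomes.

```python
from typing import Dict, List, Optional, Tuple


def assign_read(human_score: Optional[int], fungal_score: Optional[int]) -> str:
    """Label a read by its best alignment score per reference (None = no alignment)."""
    if human_score is None and fungal_score is None:
        return "unmapped"
    if fungal_score is None:
        return "human"
    if human_score is None:
        return "fungal"
    if fungal_score > human_score:
        return "fungal"
    if human_score > fungal_score:
        return "human"
    return "ambiguous"  # tie: the read fits both genomes equally well


def fungal_reads(scores: Dict[str, Tuple[Optional[int], Optional[int]]],
                 keep_ambiguous: bool = False) -> List[str]:
    """Return read IDs binned as fungal; ambiguous reads are tossed unless requested."""
    kept = []
    for read_id, (human, fungal) in scores.items():
        label = assign_read(human, fungal)
        if label == "fungal" or (keep_ambiguous and label == "ambiguous"):
            kept.append(read_id)
    return kept


# Hypothetical (human_score, fungal_score) pairs, for illustration only
scores = {
    "read1": (None, 140),  # fungal only
    "read2": (150, None),  # human only
    "read3": (120, 120),   # conserved gene: ties in both genomes
    "read4": (90, 130),    # better fit to fungus
    "read5": (None, None), # maps nowhere
}

print(fungal_reads(scores))                       # strict: ties are discarded
print(fungal_reads(scores, keep_ambiguous=True))  # lenient: ties are kept
```

The strict setting loses reads from genes conserved between human and fungus, which matters if the fungal bin later feeds annotation or expression counts.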
Hey GenoMax, I have another query regarding bbsplit.sh. As I mentioned, the raw reads are RNA-seq. Is bbsplit.sh performing splice-aware alignments, or is it just treating the data as DNA sequence and performing alignments like BWA-MEM or other DNA alignment tools?