I'm trying to use xenome, a program that separates host from graft sequences in RNA-seq data. It requires one FASTA file for the mouse genome and one FASTA file for the human genome:
xenome index -T 8 -P idx -H mouse.fa -G human.fa
where -T is the number of threads, -P is the prefix for the output files, -H is the host FASTA file, and -G is the graft FASTA file.
I have a few questions.
1. Are the .fna and .fa file formats the same?
2. I can download a zipped file of all the separate chromosome FASTA files. Can I input that zipped file into xenome?
3. If I can't do #2, do I have to combine all of the FASTA files? If so, how do I do that?
4. Can I use the newest assemblies with xenome (e.g. mm11), but then use HISAT2 with the older assemblies (e.g. mm10) because they're pre-loaded?
5. Where can I learn more about this stuff?
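For the third question, combining per-chromosome FASTA files usually comes down to plain concatenation, since a multi-sequence FASTA file is just the records one after another. Here is a minimal sketch, assuming the download is a set of gzip-compressed per-chromosome files. The file names (chr1.fa.gz, chr2.fa.gz, mouse.fa) and the tiny sequences are made up for the demo, and a .zip or .tar archive would need to be unpacked first. It makes no assumption about whether xenome itself accepts compressed input.

```shell
# Hypothetical demo data: two tiny gzipped files stand in for a real per-chromosome download.
workdir=$(mktemp -d)
cd "$workdir"
printf '>chr1\nACGTACGT\n' | gzip > chr1.fa.gz
printf '>chr2\nTTGGCCAA\n' | gzip > chr2.fa.gz

# Decompress each file to stdout, in the order given, and write one combined FASTA file.
gzip -dc chr1.fa.gz chr2.fa.gz > mouse.fa

# Sanity check: count header lines (those starting with ">"), one per chromosome.
grep -c '^>' mouse.fa
```

With real data, the same `gzip -dc` line would take the actual chromosome files as arguments, and the combined file could then be passed to -H or -G.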
I'm new to bioinformatics and couldn't find straightforward answers to these questions on Google, so sorry if these are basic. Thank you!
Thanks so much! Actually got the indexing running finally. But ya Q5 -- realizing that learning on the job is the only way lol