Hi, I need to concatenate the chromosome files into a single FASTA file but I have a lot of files with strange names.
- The typical: chr*.fa
- Random: chr*_gl0000**_random.fa
- ChrUn: chrUn_gl0000**.fa
Note that * means any number.
What is the correct order to concatenate these files with cat? I am working on building the reference for mapping small RNA.
Thanks in advance
Those are sequences that cannot be assigned to a specific chromosome. See here.
I do not think the order of the sequences in the FASTA file matters (correct me if I am wrong).
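If a predictable order is still wanted, here is a minimal sketch of one way to do it. It assumes the files follow the UCSC-style names from the question, that the file names contain no whitespace, and that GNU sort is available for the -V (version sort) option. The function name concat_fasta and the output name genome.fa are placeholders, not anything from the posts above. It writes the main chromosomes first, then the _random files, then chrUn.

```shell
#!/usr/bin/env bash
# Sketch only: concatenate chromosome FASTA files in a predictable order.
# Assumes UCSC-style names such as chr1.fa, chr1_gl000191_random.fa, chrUn_gl000211.fa.
# concat_fasta is a hypothetical helper name; usage: concat_fasta <input_dir> <output_file>
concat_fasta() {
    local dir="$1" out="$2"
    (
        cd "$dir" || exit 1
        {
            # 1) main chromosomes: names without an underscore, sorted so chr2 precedes chr10
            ls chr*.fa | grep -v '_' | sort -V
            # 2) sequences named *_random
            ls chr*_random.fa 2>/dev/null | sort -V
            # 3) chrUn sequences
            ls chrUn_*.fa 2>/dev/null | sort -V
        } | xargs cat
    ) > "$out"
}
```

Example call: concat_fasta . genome.fa. The output name should not itself match chr*.fa, otherwise it would be picked up as an input.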
what I did:
And if this is small RNA, you will get a lot of multi-mapped reads if you add all the sequences. If you plan to remove those reads anyway, it is better to use only the known chromosomes and maybe the chrUn sequences. But the latter are normally ribosomal genes, which also have some copies elsewhere in the genome. So you need to decide on your mapping and annotation strategy beforehand.
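As an illustration of the "known chromosomes plus chrUn" option, here is a rough sketch that drops the _random records from an already concatenated FASTA file. It assumes the header lines carry the same names as the files (for example >chr1_gl000191_random), which may not hold for every download. The function name drop_random_records is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: print a FASTA file without the records whose header contains "_random".
# Assumes headers look like ">chr1_gl000191_random"; drop_random_records is a hypothetical name.
drop_random_records() {
    # On each header line, decide whether to keep the record; print lines while keep is true.
    awk '/^>/ { keep = ($0 !~ /_random/) } keep' "$1"
}
```

Example call: drop_random_records genome.fa > genome.no_random.fa. Whether to keep chrUn as well depends on how you want to treat the multi-mapped reads.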