7.1 years ago
Arindam Ghosh
I am working through the EBI Next Generation Sequencing Practical Course. The first step, aligning the RNA-Seq data to the genome with Bowtie2, requires a reference genome. The reference genome I found comes as a separate FASTA file for each chromosome. I guess we need a combined file; where do I get it? Or is there a way to concatenate the individual files?
Bowtie2 is compatible with a multiple reference fasta file. It should work just fine with the reference file you have downloaded.
Multiple reference fasta file? So do I need to list the names of all the files containing the individual chromosomes?
"Multiple reference fasta file" is the same as multiple sequences in one fasta file. Apologies for the poor choice of terminology.
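On the question of listing the files: bowtie2-build accepts a comma-separated list of FASTA files as its reference input, so the per-chromosome files can be passed without combining them first. A minimal sketch, assuming a hypothetical demo directory with two tiny made-up chromosome files and an index base name (genome_index) chosen only for illustration:

```shell
# Hypothetical demo directory with two tiny, made-up per-chromosome FASTA files.
workdir=$(mktemp -d)
printf '>chr1\nACGTACGT\n' > "$workdir/chr1.fa"
printf '>chr2\nTTGGCCAA\n' > "$workdir/chr2.fa"

# Join the file names with commas, e.g. /tmp/x/chr1.fa,/tmp/x/chr2.fa
fasta_list=$(ls "$workdir"/*.fa | paste -sd, -)
echo "reference list: $fasta_list"

# Build the index only if bowtie2-build is installed on this machine.
# "genome_index" is an assumed base name for the index files.
if command -v bowtie2-build >/dev/null 2>&1; then
    bowtie2-build "$fasta_list" "$workdir/genome_index" \
        || echo "index build failed on the toy data" >&2
fi
```

With real data you would point the glob at the downloaded chromosome files instead of the toy ones.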
You can concatenate the files together with a simple
cat *.fa > /path_to/new.fa
command (putting the combined file someplace else so the command would not try to include it in the original cat).
Well, I guess I found a way out. The database also contains something called the top-level sequence and the primary assembly, which is what is actually supposed to be used.
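For anyone who still wants to go the concatenation route suggested earlier in the thread, here is a rough sketch of it with a quick sanity check. The directory layout and file names (chroms/, combined/genome.fa) are made up for the demo; the point is only that the output lives outside the directory the glob matches, and that counting the ">" header lines confirms every chromosome made it into the combined file.

```shell
# Hypothetical layout: per-chromosome files in chroms/, output in combined/.
workdir=$(mktemp -d)
mkdir "$workdir/chroms" "$workdir/combined"
printf '>chr1\nACGT\n' > "$workdir/chroms/chr1.fa"
printf '>chr2\nTTGG\n' > "$workdir/chroms/chr2.fa"

# Write the output to a different directory so the *.fa glob
# cannot pick up the combined file itself.
cat "$workdir"/chroms/*.fa > "$workdir/combined/genome.fa"

# Each FASTA record starts with '>', so this counts the sequences.
n_seqs=$(grep -c '^>' "$workdir/combined/genome.fa")
echo "sequences in combined file: $n_seqs"
```

If the count does not match the number of chromosome files you started with, something went wrong in the concatenation.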