Question

New To DiscoSnp-RAD

7.4 years ago

Gio12 ▴ 220

Dear all,

I am relatively new to bioinformatics and coding. Looking at DiscoSnp's supplementary data, the commands seem pretty straightforward; however, I can't wrap my head around the command that creates fof.txt. For the experiment in the DiscoSnp++ paper, the command was:

for((j=0;j<i;j++)); do echo coli_muted_n_30_genome_${j}_reads.fasta > fof.txt; done

I am working with RAD-seq data and would like some advice on this. I appreciate any help and advice!

Sincerely, Giovanni Madrigal

SNP DiscoSnp • 1.5k views

updated 7.4 years ago by pierre.peterlongo ▴ 900 • written 7.4 years ago by Gio12 ▴ 220

score 2 · Answer 1 · 2018-03-07

Hi all,

This code comes from the supplementary material of the original paper, where we explained how the experiments were conducted.

discoSnp (discoSnpRad or discoSnp++) takes as input read files organized in a file of files (fof). The idea (described in the documentation) is that a fof contains a list of read files, one per line.

The command above is a simple bash for loop that automatically creates a fof file listing the .fasta read files of all n coli individuals.

More generally, fof files make it possible to handle paired-end files, to virtually concatenate read sets, ... Have a look at the documentation.
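For paired-end RAD-seq data, here is a minimal sketch of how such a fof could be built. It assumes the nested convention as I understand it from the documentation (the top-level fof lists one small fof per sample, and each small fof lists the two mate files of that sample), so please check the documentation of your version. The sample names and the `_R1`/`_R2` file naming are hypothetical.

```shell
# Hypothetical sample names and file naming; adjust to your own data.
samples="sampleA sampleB"

# Start from an empty top-level fof.
: > fof.txt

for s in $samples; do
    # One small fof per sample, listing both mates of the pair
    # (assumed convention: files in the same fof form one read set).
    printf '%s\n' "${s}_R1.fastq.gz" "${s}_R2.fastq.gz" > "fof_${s}.txt"
    # The top-level fof gets one line per sample: that sample's fof.
    echo "fof_${s}.txt" >> fof.txt
done
```

Each line of fof.txt then stands for one read set, whatever the number of files behind it.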

Have a nice day, Pierre

score 1 · Answer 2 · 2018-03-06

This is a bash for loop, contaminated with what I think is HTML code. It just creates a file called fof.txt containing:

coli_muted_n_30_genome_0_reads.fasta
coli_muted_n_30_genome_1_reads.fasta
coli_muted_n_30_genome_2_reads.fasta
...

There will be $i names; the last one is coli_muted_n_30_genome_(i-1)_reads.fasta.
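To make this concrete, here is a small sketch that produces exactly that listing. Two things are assumptions on my side: the value `i=3` (the command as posted never sets `i`, so it must be defined beforehand), and the placement of the redirection after `done`. With a single `>` inside the loop body, as in the posted command, each iteration would overwrite fof.txt and only the last name would remain. I wrote it as a plain `while` loop so it also runs under `sh`; it is equivalent to the C-style `for ((j=0; j<i; j++))` loop in bash.

```shell
# Number of individuals; 3 is an arbitrary value chosen for illustration.
i=3

# Count j from 0 to i-1 and print one file name per iteration.
# The redirection sits after `done`, so the whole loop output goes
# into fof.txt in one go instead of overwriting it at each step.
j=0
while [ "$j" -lt "$i" ]; do
    echo "coli_muted_n_30_genome_${j}_reads.fasta"
    j=$((j + 1))
done > fof.txt

# Show the result.
cat fof.txt
```

Using `>> fof.txt` inside the loop would also work, but it appends to whatever is already in the file from a previous run.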

Where did you get this code?