Automating Paired End mating
10.0 years ago
Furor ▴ 40

Hi all,

I know there have been multiple questions about demultiplexed FASTQ files for Illumina paired-end reads, but as far as I've seen, none deals specifically with the following issue. I have nearly 300 samples whose forward and reverse reads need to be mated. After that, I need to create a barcode label based on the filename (= sample) of each PEAR output file, for use with USEARCH and/or QIIME, or a names/groups file for use with mothur.

I use PEAR to mate my reads (I might check some alternatives later). Is there a way (a 'for loop', I presume) to take the 'sample+lib' part of the filename (e.g. NG-7611_SAMPLE1_lib53965_2904_1_1.fastq) of every file in a directory and use it as input for PEAR (or another tool)?

A basic PEAR command simply looks like this:

pear -f NG-7611_SAMPLE1_lib53965_2904_1_1.fastq -r NG-7611_SAMPLE1_lib53965_2904_1_2.fastq -o SAMPLE1.fastq

I guess I should set a variable which retrieves the 'sample+lib#' in each loop for the specific input files, and a second one with only the sample name for the output, and fit this in the pear command.

I guess I could then use this script as a template for adding the barcodelabel too:

sed -e 's/^@\(.*\)/@\1;barcodelabel=SAMPLE_1;/' < "$in/SAMPLE_1.assembled.fastq" > "$P/SAMPLE_1.fastq"
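
A minimal sketch of how that template could be looped over every PEAR output in a directory. It swaps sed for awk so that only the first line of each 4-line FASTQ record is edited (a quality line can also start with "@"). The function name add_barcodelabel and its two directory arguments are hypothetical, and it assumes the inputs are named <sample>.assembled.fastq.

```shell
# Sketch: append ";barcodelabel=<sample>;" to the header line of every record.
# Assumes 4-line FASTQ records and input files named <sample>.assembled.fastq.
add_barcodelabel() {
  local in_dir=$1 out_dir=$2 f sample
  mkdir -p "$out_dir"
  for f in "$in_dir"/*.assembled.fastq; do
    [ -e "$f" ] || continue              # the glob matched nothing
    sample=${f##*/}                      # strip the directory part
    sample=${sample%.assembled.fastq}    # strip the suffix, leaving the sample name
    # NR % 4 == 1 selects header lines only; quality lines are left untouched
    awk -v s="$sample" 'NR % 4 == 1 { $0 = $0 ";barcodelabel=" s ";" } { print }' \
      "$f" > "$out_dir/$sample.fastq"
  done
}
```

It would be called as, for example, add_barcodelabel "$in" "$P".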

I've just started learning Python, shell scripting, sed and grep to do (among other things) this kind of automation, but I guess it might take a couple of weeks until I'm able to do this myself. Would anyone be willing to help me out or set me on my way?

Thanks!

Illumina Paired-End automation Assembly • 3.6k views
10.0 years ago

Off the top of my head, the easiest (though not the most elegant) way in bash would be:

for a in sample1 sample2 ... etc
do
  for b in lib1 lib2 lib3 ... etc
  do
    pear -f NG-7611_${a}_${b}_1_1.fastq -r NG-7611_${a}_${b}_1_2.fastq -o $a.fastq
  done
done
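
The nested loop above needs every sample and library name typed out. Below is a minimal sketch of a variant that derives them from the filenames instead, using bash parameter expansion. It assumes every forward file ends in _1.fastq, its mate ends in _2.fastq, and the sample name is the second underscore-separated field (as in NG-7611_SAMPLE1_lib53965_2904_1_1.fastq), so it would need adjusting for sample names that contain underscores. The function name mate_pairs and the DRY_RUN switch are hypothetical helpers, not part of PEAR.

```shell
# Sketch: pair forward/reverse FASTQ files by filename and run PEAR on each pair.
# Set DRY_RUN=1 to print the commands instead of running them.
mate_pairs() {
  local dir=${1:-.} fwd rev base sample
  for fwd in "$dir"/*_1.fastq; do
    [ -e "$fwd" ] || continue            # the glob matched nothing
    rev=${fwd%_1.fastq}_2.fastq          # swap the trailing _1 for _2
    if [ ! -e "$rev" ]; then
      echo "skipping $fwd: no reverse file found" >&2
      continue
    fi
    base=${fwd##*/}                      # strip the directory part
    sample=${base#*_}                    # drop the leading "NG-7611_" field
    sample=${sample%%_*}                 # keep text up to the next underscore
    if [ -n "${DRY_RUN:-}" ]; then
      echo pear -f "$fwd" -r "$rev" -o "$sample"
    else
      pear -f "$fwd" -r "$rev" -o "$sample"
    fi
  done
}
```

Running DRY_RUN=1 mate_pairs /path/to/fastq_dir first shows which pairs and output names would be used. As far as I know, PEAR treats -o as a base name and writes the merged reads to <base>.assembled.fastq, which is why only the sample name is passed here.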

Thanks. It needs some preparation, but it does the job very well!
