Question

Bash script loop to join multiple paired-end amplicon sequencing reads on an HPC server using ea-utils fastq-join

5.9 years ago

Bioinfonext ▴ 470

Hi,

I want to use ea-utils fastq-join to join paired-end reads from multiple amplicon sequencing samples.

I am working on an HPC server, but I installed ea-utils myself because it is not available on the server.

The path to fastq-join is:

/users/3052/Amplicon_data/ITS_analysis/ITS/ExpressionAnalysis-ea-utils-bd148d4/clipper/fastq-join

The FASTQ files are named like this:

Soil-7_S32_L001_R1_001.fastq.gz   Soil-7_S32_L001_R2_001.fastq.gz
Soil-8_S32_L001_R1_001.fastq.gz  Soil-8_S32_L001_R2_001.fastq.gz
Soil-9_S42_L001_R1_001.fastq.gz  Soil-9_S42_L001_R2_001.fastq.gz

How should I write a bash loop to join all of these file pairs with a single command?

Thanks in advance.

Assembly • 1.9k views

ADD COMMENT • link updated 5.8 years ago by Ram 44k • written 5.9 years ago by Bioinfonext ▴ 470

What have you tried? Or do you just need someone to do all the analysis for you?

ADD REPLY • link 5.9 years ago by Benn 8.3k

I am trying to use the job script below, but I am not sure how to load a tool in a job script when it is not available on the server and is installed in my own directory. Should I use module load, or what exactly do I have to do?

#!/bin/bash
#$ -N fastq-join
#$ -o /users/30521/testdata/test-job
#$ -pe smp-verbose 20
#$ /users/30521/testdata/test-job

module load /users/3052/Amplicon_data/ITS_analysis/ITS/ExpressionAnalysis-ea-utils-bd148d4/clipper/fastq-join 

/mnt/scratch/users/staffnumber/ea-utils.1.1.2-537/fastq-join -m 200 -p1 Soil-7_S32_L001_R1_001.fastq Soil-7_S32_L001_R2_001.fastq -o Soil-7_S32_L001.join.fastq
/mnt/scratch/users/staffnumber/ea-utils.1.1.2-537/fastq-join -m 200 -p1 Soil-8_S32_L001_R1_001.fastq Soil-8_S32_L001_R2_001.fastq -o Soil-8_S32_L001_join.fastq
/mnt/scratch/users/staffnumber/ea-utils.1.1.2-537/fastq-join -m 200  -p1 Soil-9_S42_L001_R1_001.fastq Soil-9_S42_L001_R2_001.fastq -o Soil-9_S42_L001_join.fastq

Thanks

ADD REPLY • link updated 5.8 years ago by Ram 44k • written 5.9 years ago by Bioinfonext ▴ 470

No one here knows your exact system, but in general, using the full path to a software binary that is correctly installed and visible to all cluster nodes (i.e., on shared storage) should work.

Module load won't help if the software is not installed on your cluster.
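
As a rough sketch of how to confirm that a self-installed binary is actually usable from the node running the job, something like the following could go near the top of the job script. The helper name `check_binary` is made up for illustration and is not part of ea-utils or the scheduler.

```shell
#!/bin/bash
# Sketch only: check_binary is a hypothetical helper, not an ea-utils or scheduler command.
# It reports whether the given path is a regular file that the current user can execute
# on the node where the script is running.
check_binary() {
    local bin="$1"
    if [ ! -f "$bin" ] || [ ! -x "$bin" ]; then
        # uname -n prints the node name, which helps spot nodes that cannot see the path
        echo "Not found or not executable on $(uname -n): $bin" >&2
        return 1
    fi
    echo "OK: $bin"
}
```

In a job script this might be used as `check_binary /full/path/to/fastq-join || exit 1`, so the job stops early if the compute node cannot see the installation directory.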

Good luck!

ADD REPLY • link 5.9 years ago by colindaven 7.0k

If fastq-join is located at /users/3052/Amplicon_data/ITS_analysis/ITS/ExpressionAnalysis-ea-utils-bd148d4/clipper/fastq-join, use that location in place of /mnt/scratch/users/staffnumber/ea-utils.1.1.2-537/fastq-join in the last three lines.

If it is located at /mnt/scratch/users/staffnumber/ea-utils.1.1.2-537/fastq-join, remove the module load line.

The concept is well explained by colindaven. module load will only work for modulefiles located in $MODULEPATH, not for custom binaries.
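
For the loop asked about in the original question, a minimal sketch could look like the one below. It assumes uncompressed files named `*_R1_001.fastq` and `*_R2_001.fastq`, as in the job script above (the glob would need adjusting for `.fastq.gz` files), and it reuses the `-m 200` and `-p 1` settings from that script. The variable `FASTQ_JOIN`, the function name `join_all_pairs`, and the `_join.fastq` output naming are illustrative choices, not anything prescribed by ea-utils.

```shell
#!/bin/bash
# Sketch only: FASTQ_JOIN and join_all_pairs are illustrative names.
# Full path to the self-installed binary (path taken from the question);
# it can be overridden through the environment.
FASTQ_JOIN="${FASTQ_JOIN:-/users/3052/Amplicon_data/ITS_analysis/ITS/ExpressionAnalysis-ea-utils-bd148d4/clipper/fastq-join}"

# Join every R1/R2 pair found in the given directory (default: current directory).
join_all_pairs() {
    local dir="${1:-.}"
    local r1 r2 sample
    for r1 in "$dir"/*_R1_001.fastq; do
        [ -e "$r1" ] || continue              # the glob matched nothing
        sample="${r1%_R1_001.fastq}"          # e.g. Soil-7_S32_L001
        r2="${sample}_R2_001.fastq"
        if [ ! -e "$r2" ]; then
            echo "Skipping $r1: no matching R2 file" >&2
            continue
        fi
        # Same -m and -p settings as in the job script from the question
        "$FASTQ_JOIN" -m 200 -p 1 -o "${sample}_join.fastq" "$r1" "$r2"
    done
}

# Run on the directory given as the first argument, or the current directory.
join_all_pairs "${1:-.}"
```

Because the binary is called through its full path, no module load line is needed. An R1 file without a mate is reported and skipped rather than aborting the whole job.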

ADD REPLY • link 5.8 years ago by Ram 44k