Question

automating multiple fastq.gz -> SAM

0

Entering edit mode

9.8 years ago

s.vandenhurk ▴ 10

Hello Guys,

I am interested in automating the combining of multiple fastq.gz files. As you can see in the image below I have already indexed my reference genome with BWA and I have combined the fastq.gz files from my COL strain. Because this Arabidopsis is a test set, I could do it manually. However, in the near future I plan to do the same thing with over 500 strains. So I would like to automate the process.

I have used the linux command cat *_1.fastq.gz > Combined_1.fastq.gz to get it to work at the COL strain. Should I create some sort of bash loop for things like this?

I have already aligned these Combined_1 and Combined_2 to the reference genome and I have used these results with BWA sampe to get a SAM file. So far, results look promising. But, could this process also be automated?

Assembly sequencing • 2.6k views

ADD COMMENT • link updated 2.6 years ago by Ram 44k • written 9.8 years ago by s.vandenhurk ▴ 10

Ram · Answer 1 · 2015-03-11

1

Entering edit mode

9.8 years ago

Pierre Lindenbaum 164k

See my demo project https://github.com/lindenb/ngsxml to generate Makefile-based workflows using XSLT

wordlow

ADD COMMENT • link updated 2.6 years ago by Ram 44k • written 9.8 years ago by Pierre Lindenbaum 164k

1

Entering edit mode

I had considered posting a comment of the form, "Pierre's reply mentioning ngsxml in 3, 2, 1...". I guess I should have!

ADD REPLY • link updated 2.6 years ago by Ram 44k • written 9.8 years ago by Devon Ryan 105k

0

Entering edit mode

:-)

ADD REPLY • link updated 2.6 years ago by Ram 44k • written 9.8 years ago by Pierre Lindenbaum 164k

0

Entering edit mode

How did you make that graph?

ADD REPLY • link 7.6 years ago by jnowacki ▴ 110

0

Entering edit mode

https://linux.die.net/man/1/tree

ADD REPLY • link 7.6 years ago by Pierre Lindenbaum 164k