Question

Create txt file in a for loop

5.0 years ago

Ati ▴ 50

I have some paired-end sequencing data in a directory:

sam1_R1.fastq.gz
sam1_R2.fastq.gz
sam2_R1.fastq.gz
sam2_R2.fastq.gz

I want to generate a txt file for each sample (sam) separately, as follows:

$ less sam1.txt

(dir)/sam1_R1.fastq.gz
(dir)/sam1_R2.fastq.gz

$ less sam2.txt

(dir)/sam2_R1.fastq.gz
(dir)/sam2_R2.fastq.gz

bash loop linux • 1.6k views

ADD COMMENT • link updated 5.0 years ago by padwalmk ▴ 140 • written 5.0 years ago by Ati ▴ 50

What's the difference between the two dirs? Maybe you wanted sam2_...

ADD REPLY • link 5.0 years ago by Pierre Lindenbaum 164k

Sorry! It was a typo!

ADD REPLY • link 5.0 years ago by Ati ▴ 50

try this:

$ find ../ -type f -name "*R1.fastq.gz" | sed 's_../__'

This assumes that dir in the OP is the parent directory and that the files follow the same naming pattern as in the OP.

ADD REPLY • link 5.0 years ago by cpad0112 21k

What does this do, cpad? It seems to be just a cd .. && ls *R1.fastq.gz && cd dir. OP seems to want a list of each sample's fastq files, in a separate text file per sample.

ADD REPLY • link 5.0 years ago by Ram 44k

It would print <parent_directory>/filename (e.g. test/sam1_R1.fastq.gz, test/sam1_R2.fastq.gz, and so on, assuming that test is the parent directory). @RamRS

The output would look like this (on dummy .fq files in a directory named "test"):

$ cat sam1.txt 
test/test5_R1.fq
test/test7_R1.fq
test/test1_R1.fq
test/test2_R1.fq
test/test8_R1.fq
test/test10_R1.fq
test/test4_R1.fq
test/test6_R1.fq
test/test9_R1.fq
test/test3_R1.fq

ADD REPLY • link 5.0 years ago by cpad0112 21k

OK, but this is not what OP wants at all. It actually leads them down a slightly wrong path.

ADD REPLY • link 5.0 years ago by Ram 44k

Is this what OP wants?

$ ls *.gz
sam1_R1.fastq.gz  sam1_R2.fastq.gz  sam2_R1.fastq.gz  sam2_R2.fastq.gz

$ ls *.fastq.gz  | sed 's/\(.*\)_\(.*\)/\1\\\1_\2/' | awk -F '\' '{print > $1}'

or

$ find . -type f -name "*.fastq.gz" -exec basename {} \; | awk -F '_' '{print $1"\\"$1"_"$2 > $1}'

$ cat sam1
sam1\sam1_R1.fastq.gz
sam1\sam1_R2.fastq.gz

$ cat sam2
sam2\sam2_R1.fastq.gz
sam2\sam2_R2.fastq.gz

try this with awk:

$ awk 'FNR==1 {FN=FILENAME; gsub(/_R[0-9].*/,"",FN); print FN"\\"FILENAME > (FN".txt") }' *.fastq.gz

ADD REPLY • link 5.0 years ago by cpad0112 21k

I think that's close to what they want. I also think we should not be investing so much effort into a pure shell question that the OP has already solved.

ADD REPLY • link 5.0 years ago by Ram 44k

Ram · Answer 1 · 2019-12-30

5.0 years ago

Ati ▴ 50

I found the solution myself:

# dir is the data directory; samples holds the sample names, e.g. samples="sam1 sam2"
for sample in $samples; do
  ls "$dir"/*.fastq.gz | grep "$sample" > "$dir/$sample.txt"
done

ADD COMMENT • link updated 5.0 years ago by Ram 44k • written 5.0 years ago by Ati ▴ 50

That is an extremely roundabout way to go about this. Why do you need ls in a loop if you already have a samples variable with a list of samples?

You could do something like padwalmk suggests.
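As a rough sketch of what looping over the sample names directly could look like, without ls or grep: the names samples, dir, and workdir below are assumptions for illustration, and the loop only writes the expected paths without checking that the files exist.

```shell
#!/bin/sh
# Sketch only: samples, dir and workdir are hypothetical names.
set -eu

workdir=$(mktemp -d)      # scratch directory for the generated txt files
cd "$workdir"

samples="sam1 sam2"       # assumed: sample names are already known
dir="/path/to/fastq"      # assumed: directory that holds the fastq files

for sample in $samples; do
    # One path per line; the redirection creates (or overwrites) the txt file.
    printf '%s\n' "$dir/${sample}_R1.fastq.gz" "$dir/${sample}_R2.fastq.gz" > "${sample}.txt"
done

cat sam2.txt
```

Building the two paths from the sample name also avoids the case where grep sam1 would pick up files from a sample called sam10.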

ADD REPLY • link 5.0 years ago by Ram 44k

Ram · Answer 2 · 2019-12-30

5.0 years ago

padwalmk ▴ 140

Here you go .....

ls *.gz | cut -f 1 -d "_" | sort | uniq |  while read line
do 
    touch "$line".txt 
    echo "/dir/$line"_R1.fastq.gz > "$line".txt 
    echo "/dir/$line"_R2.fastq.gz >> "$line".txt
done

ADD COMMENT • link updated 5.0 years ago by Ram 44k • written 5.0 years ago by padwalmk ▴ 140

Why the touch?
Why not a simple echo -e "/dir/${line}_R1.fastq.gz\n/dir/${line}_R2.fastq.gz" > "${line}.txt"? That would be one line instead of 3.
Why sort | uniq and not sort -u?
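Putting those three suggestions together might look like the sketch below. It runs on dummy files in a scratch directory (workdir and dir are made-up names), and it uses printf in place of echo -e, since printf handles \n the same way across shells. Like the original, it assumes sample names contain no underscore.

```shell
#!/bin/sh
# Sketch only: workdir and dir are hypothetical names, not from the thread.
set -eu

workdir=$(mktemp -d)      # scratch directory with dummy input files
cd "$workdir"
for f in sam1_R1 sam1_R2 sam2_R1 sam2_R2; do
    : > "${f}.fastq.gz"   # empty placeholder files standing in for real data
done

dir="/dir"                # prefix written in front of each file name

# Sample name = text before the first underscore; sort -u removes duplicates.
ls *.gz | cut -f 1 -d "_" | sort -u | while read -r line
do
    # A single redirection both creates the file and writes the two lines.
    printf '%s/%s_R1.fastq.gz\n%s/%s_R2.fastq.gz\n' "$dir" "$line" "$dir" "$line" > "${line}.txt"
done

cat sam1.txt
```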

ADD REPLY • link 5.0 years ago by Ram 44k

Thank you for the guidance. I am a novice who has just started shell scripting, so I am still learning. I'm happy to have your advice and will try to implement it next time.

ADD REPLY • link 5.0 years ago by padwalmk ▴ 140

That's great! You will enjoy learning the various ways to get a job done on the shell. If, like me, you were taught to create files using the touch command, be aware that it's not necessary. Redirection operators (>, 2>, etc.), programs that write to files (such as tee), and editors (emacs, nano, vim, vi) can all create files directly.

echo has an option, -e, which enables us to use extended expressions such as \t and \n. sort -u achieves the same as sort | uniq, so it's better unless we need some uniq-specific features, such as uniq -c.
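A small sketch to try these points out, using a made-up file called reads.tsv in a scratch directory (both names are assumptions for the demo):

```shell
#!/bin/sh
# Sketch only: workdir and reads.tsv are hypothetical names for this demo.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# The > redirection creates reads.tsv by itself; no touch beforehand.
# printf interprets \t and \n in its format string.
printf 'sam1\tR1\nsam1\tR2\nsam2\tR1\n' > reads.tsv

# sort -u and sort | uniq produce the same de-duplicated list.
cut -f 1 reads.tsv | sort -u > a.txt
cut -f 1 reads.tsv | sort | uniq > b.txt
cmp a.txt b.txt && echo "identical"

# uniq -c also reports how often each line occurs, which sort -u cannot do.
cut -f 1 reads.tsv | sort | uniq -c
```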

ADD REPLY • link 5.0 years ago by Ram 44k

Thank you. Yeah, I am a hardcore biologist, but I love this, so I am exploring on my own, and suggestions are always welcome. Happy New Year...

ADD REPLY • link 5.0 years ago by padwalmk ▴ 140