How to rename multiple files?
2.8 years ago
Princy ▴ 60

Hello, how can I rename multiple FASTQ files? I have 34 FASTQ files in total. How can I do this with a for loop? Please let me know.

SRR6_Brain-IGA_in_Labeo_rohita_1.fastq.gz
SRR1_Brain-IGA_in_Labeo_rohita_2.fastq.gz
SRR2_Brain-PSR_from_Labeo_rohita_1.fastq.gz
SRR607_Brain-PSR_from_Labeo_rohita_2.fastq.gz
SRR60048_Pituitary-IGA_in_Labeo_rohita_1.fastq.gz
SRR63548_Pituitary-IGA_in_Labeo_rohita_2.fastq.gz
SRR60549_Pituitary-PSR_in_Labeo_rohita_1.fastq.gz
SRR60053_Liver-PSR_in_Labeo_rohita_1.fastq.gz
SRR60053_Liver-PSR_in_Labeo_rohita_2.fastq.gz
SRR69866_cSNPs_1.fastq.gz
SRR6986_cSNPs_2.fastq.gz
SRR6967_cSNPs_1.fastq.gz
SRR6067_cSNPs_2.fastq.gz
SRR69068_cSNPs_1.fastq.gz
SRR69068_cSNPs_2.fastq.gz
SRR70730_RNA-Seq_of_Selected_Rohu_1.fastq.gz
SRR70730_RNA-Seq_of_Selected_Rohu_2.fastq.gz
SRR70731_RNA-Seq_of_Control_Rohu_1.fastq.gz
SRR70731_RNA-Seq_of_Control_Rohu_2.fastq.gz
SRR7732_RNA-Seq_of_Unselected_Rohu_1.fastq.gz
SRR70732_RNA-Seq_of_Unselected_Rohu_2.fastq.gz
SRR8555_Liver_transcriptome_analysis_1.fastq.gz
SRR855_Liver_transcriptome_analysis_2.fastq.gz

I want my file names to look like this:

SRR60048_R1.fastq.gz
SRR63548_R2.fastq.gz 
sed linux fastq • 2.1k views
$ rename -n 's/_.*_/_R/' *.gz

Remove -n once the dry-run output looks right.

$ parallel --dry-run 'mv {} {=s/_.*_/_R/=}' ::: *.fastq.gz

Remove --dry-run once the printed commands look right.

In bash/zsh:

$ for i in *.fastq.gz; do echo mv "$i" "${i/_[A-Za-z]*_/_R}"; done

Remove echo once the printed commands look right.
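A slightly more defensive version of the loop, as a sketch only. It assumes every name starts with the SRR accession up to the first underscore and ends with the read number after the last underscore; short_name is a hypothetical helper name, not something from the answers here. It uses plain POSIX parameter expansion, so it should behave the same in sh, bash and zsh.

```shell
# short_name (hypothetical helper): build ACCESSION_R<read>.fastq.gz
#   ${1%%_*}  -> everything before the first underscore (the accession)
#   ${1##*_}  -> everything after the last underscore (e.g. 1.fastq.gz)
short_name() {
    printf '%s_R%s\n' "${1%%_*}" "${1##*_}"
}

for f in *_[12].fastq.gz; do
    [ -e "$f" ] || continue            # the glob matched nothing
    new=$(short_name "$f")
    if [ -e "$new" ]; then             # never overwrite an existing file
        echo "skip: $new already exists" >&2
        continue
    fi
    echo mv -- "$f" "$new"             # dry run: drop 'echo' to really rename
done
```

Because already-renamed files end in _R1 or _R2, they no longer match the *_[12].fastq.gz glob, so running the loop a second time leaves them alone.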

2.8 years ago

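Using the Perl-based rename command (on some systems it is installed as prename or file-rename; the util-linux rename uses a different syntax):

rename 's/_[\w-]*_/_/' *.fastq.gz
rename 's/_1\.fastq/_R1.fastq/' *.fastq.gz
rename 's/_2\.fastq/_R2.fastq/' *.fastq.gz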
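The three passes can also be written as one anchored substitution. The sketch below uses sed rather than rename, purely to preview the mapping without touching any file; to_short is a hypothetical helper name, and the pattern assumes names of the form SRR<digits>_..._<1 or 2>.fastq.gz. Names that do not fit that pattern pass through unchanged.

```shell
# to_short (hypothetical helper): read file names on stdin, print the short names.
to_short() {
    sed -E 's/^(SRR[0-9]+)_.*_([12])\.fastq\.gz$/\1_R\2.fastq.gz/'
}

# Preview "old -> new" pairs; nothing is renamed here.
for f in *.fastq.gz; do
    [ -e "$f" ] || continue            # the glob matched nothing
    printf '%s -> %s\n' "$f" "$(printf '%s\n' "$f" | to_short)"
done
```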
2.8 years ago
Malcolm.Cook ★ 1.5k

For such problems, I usually let GNU Parallel do the looping for me.

It can embed Perl expressions, including regular-expression substitutions, to transform the filenames.

If you have it installed...

and your files are one-per-line in some file_of_filenames.txt

then you can write something like this:

> parallel --dry-run 'mv {} {=s|^(SRR\d+).*([12])(\.fastq\.gz)$|$1_R$2$3|=}' < file_of_filenames.txt
mv SRR6_Brain-IGA_in_Labeo_rohita_1.fastq.gz SRR6_R1.fastq.gz
mv SRR1_Brain-IGA_in_Labeo_rohita_2.fastq.gz SRR1_R2.fastq.gz
mv SRR2_Brain-PSR_from_Labeo_rohita_1.fastq.gz SRR2_R1.fastq.gz
mv SRR607_Brain-PSR_from_Labeo_rohita_2.fastq.gz SRR607_R2.fastq.gz
mv SRR60048_Pituitary-IGA_in_Labeo_rohita_1.fastq.gz SRR60048_R1.fastq.gz
mv SRR63548_Pituitary-IGA_in_Labeo_rohita_2.fastq.gz SRR63548_R2.fastq.gz
mv SRR60549_Pituitary-PSR_in_Labeo_rohita_1.fastq.gz SRR60549_R1.fastq.gz
mv SRR60053_Liver-PSR_in_Labeo_rohita_1.fastq.gz SRR60053_R1.fastq.gz
mv SRR60053_Liver-PSR_in_Labeo_rohita_2.fastq.gz SRR60053_R2.fastq.gz
mv SRR69866_cSNPs_1.fastq.gz SRR69866_R1.fastq.gz
mv SRR6986_cSNPs_2.fastq.gz SRR6986_R2.fastq.gz
mv SRR6967_cSNPs_1.fastq.gz SRR6967_R1.fastq.gz
mv SRR6067_cSNPs_2.fastq.gz SRR6067_R2.fastq.gz
mv SRR69068_cSNPs_1.fastq.gz SRR69068_R1.fastq.gz
mv SRR69068_cSNPs_2.fastq.gz SRR69068_R2.fastq.gz
mv SRR70730_RNA-Seq_of_Selected_Rohu_1.fastq.gz SRR70730_R1.fastq.gz
mv SRR70730_RNA-Seq_of_Selected_Rohu_2.fastq.gz SRR70730_R2.fastq.gz
mv SRR70731_RNA-Seq_of_Control_Rohu_1.fastq.gz SRR70731_R1.fastq.gz
mv SRR70731_RNA-Seq_of_Control_Rohu_2.fastq.gz SRR70731_R2.fastq.gz
mv SRR7732_RNA-Seq_of_Unselected_Rohu_1.fastq.gz SRR7732_R1.fastq.gz
mv SRR70732_RNA-Seq_of_Unselected_Rohu_2.fastq.gz SRR70732_R2.fastq.gz
mv SRR8555_Liver_transcriptome_analysis_1.fastq.gz SRR8555_R1.fastq.gz
mv SRR855_Liver_transcriptome_analysis_2.fastq.gz SRR855_R2.fastq.gz

On the other hand, if you want to rename all .fastq.gz files in the current directory, try this:

parallel --dry-run 'mv {} {=s|^(SRR\d+).*([12])(\.fastq\.gz)$|$1_R$2$3|=}' ::: *.fastq.gz

In either case, if you like these results, remove the --dry-run and run it again.
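Before dropping --dry-run, it may be worth checking that no two files would end up with the same short name, since mv would silently overwrite one of them. A minimal sketch, assuming the accession is everything before the first underscore and the read number follows the last underscore; find_collisions is a hypothetical helper name.

```shell
# find_collisions (hypothetical helper): read original names on stdin and
# print every target name that would be produced more than once.
find_collisions() {
    while IFS= read -r f; do
        printf '%s_R%s\n' "${f%%_*}" "${f##*_}"
    done | sort | uniq -d
}
```

For example, `find_collisions < file_of_filenames.txt` or `printf '%s\n' *.fastq.gz | find_collisions`; empty output means every target name is unique.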

[p@login02 new]$ parallel --dry-run 'mv {} {=s|^(SRR\d+).*([1,2])(.fastq.gz)$|$1_R$2$3|=}' ::: *.fastq.gz
Academic tradition requires you to cite works you base your article on.
When using programs that use GNU Parallel to process data for publication
please cite:

  O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
  ;login: The USENIX Magazine, February 2011:42-47.

This helps funding further development; AND IT WON'T COST YOU A CENT.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

To silence the citation notice: run 'parallel --bibtex'.

mv SRR6003546_Brain-IGA_in_Labeo_rohita_1.fastq.gz SRR6003546_R1.fastq.gz
mv SRR6003546_Brain-IGA_in_Labeo_rohita_2.fastq.gz SRR6003546_R2.fastq.gz
mv SRR6003547_Brain-PSR_from_Labeo_rohita_1.fastq.gz SRR6003547_R1.fastq.gz
[p@login02 new]$ ls
SRR6003546_Brain-IGA_in_Labeo_rohita_1.fastq.gz  SRR6003547_Brain-PSR_from_Labeo_rohita_1.fastq.gz
SRR6003546_Brain-IGA_in_Labeo_rohita_2.fastq.gz

Hi, I used this command, but the file names did not change. Could you please tell me what I should do next? Thank you for your response.

Hi I used this command but the file name did not change

Did you miss the part where @Malcolm told you to remove --dry-run from the command to actually execute the change? With --dry-run, parallel only prints the mv commands it would run, so you can check the result before making the actual change.
