I have two samples, A44943_1 and A44944_1, that were sequenced as paired-end reads across three lanes.
I would like to write a for loop that merges the R1 files of sample A44943_1 from the three lanes, then does the same for the R2 files of that sample, and then repeats both steps for sample A44944_1. My fastq files look like:
Do not post on multiple forums - that is just bad etiquette. What's worse, you did not link between the two so no one knows that you're asking two sets of online volunteers to spend their time on your problem without telling them that you're also asking the other group.
You've done this multiple times. If you repeat this behavior, you risk your account being suspended.
If this comment is any indicator, this is how much you annoy people in both communities with this sort of behavior.
I crammed a function into my ~/.bashrc to do just this:
function concat_fastq {
    sample=$1
    if [[ -z "$sample" ]]; then
        echo "A sample name must be provided as an argument."
        return 1
    fi
    # Check if files for R2 exist
    if ls "${sample}"_L00*_R2_001.fastq.gz 1> /dev/null 2>&1; then
        # This is a paired-end sample
        echo "Processing paired-end sample: $sample"
        # Concatenate R1
        cat "${sample}"_L00*_R1_001.fastq.gz > "${sample}_merged_R1_001.fastq.gz"
        if [[ $? -ne 0 ]]; then
            echo "An error occurred while concatenating R1 files."
            return 1
        fi
        # Concatenate R2
        cat "${sample}"_L00*_R2_001.fastq.gz > "${sample}_merged_R2_001.fastq.gz"
        if [[ $? -ne 0 ]]; then
            echo "An error occurred while concatenating R2 files."
            return 1
        fi
    else
        # This is a single-end sample
        echo "Processing single-end sample: $sample"
        # Concatenate R1 only
        cat "${sample}"_L00*_R1_001.fastq.gz > "${sample}_merged_R1_001.fastq.gz"
        if [[ $? -ne 0 ]]; then
            echo "An error occurred while concatenating R1 files."
            return 1
        fi
    fi
    echo "Concatenation complete for sample: $sample"
}
To use:
for sample_name in $(ls *_L00*_R1_001.fastq.gz | rev | cut -d'_' -f4- | rev | sort | uniq); do
    concat_fastq "$sample_name"
done
Your file suffixes are a bit different, so the glob patterns will require some light edits.
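If parsing the output of ls feels fragile, the sample names can also be derived with a glob and bash parameter expansion. This is only a sketch: list_samples is a hypothetical helper name, and it assumes the same <sample>_L00<lane>_R1_001.fastq.gz naming as the function above, so the pattern would need the same light edits for other suffixes.

```shell
# Hypothetical helper: print the unique sample names found in a directory.
# Assumes files are named <sample>_L00<lane>_R1_001.fastq.gz.
list_samples() {
    local dir="${1:-.}" f name
    for f in "$dir"/*_L00*_R1_001.fastq.gz; do
        [[ -e "$f" ]] || continue    # the glob matched nothing
        name="${f##*/}"              # strip the directory part
        # Strip the lane and read suffix, leaving only the sample name
        printf '%s\n' "${name%_L00*_R1_001.fastq.gz}"
    done | sort -u
}
```

Piping its output into a loop such as: list_samples . | while IFS= read -r sample_name; do concat_fastq "$sample_name"; done would then take the place of the for loop above.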
See https://www.cyberciti.biz/faq/bash-for-loop/ for how bash for loops work. What have you tried so far?
Here is an easier-to-understand option: "Concatenating fastq.gz files across lanes"
There are other answers you can use in the thread as well.
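Whichever approach you use, it may be worth confirming that each merged file holds exactly as many lines as its per-lane inputs combined. The sketch below is one way to do that; count_lines and check_merge are hypothetical helper names, and it relies on the fact that concatenated gzip files decompress as a single stream.

```shell
# Hypothetical helper: count the lines in one or more gzipped files.
count_lines() {
    gzip -dc -- "$@" | wc -l | tr -d ' '
}

# Hypothetical helper: compare a merged file against its per-lane inputs.
# Usage: check_merge merged.fastq.gz lane1.fastq.gz lane2.fastq.gz ...
check_merge() {
    local merged="$1"; shift
    local merged_n lanes_n
    merged_n=$(count_lines "$merged")
    lanes_n=$(count_lines "$@")
    if [[ "$merged_n" -eq "$lanes_n" ]]; then
        echo "OK: $merged has $merged_n lines"
    else
        echo "MISMATCH: $merged has $merged_n lines, lanes have $lanes_n" >&2
        return 1
    fi
}
```

For example, check_merge A44943_1_merged_R1_001.fastq.gz A44943_1_L00*_R1_001.fastq.gz would check the R1 merge for the first sample, assuming that naming; dividing the line count by four gives the number of reads.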