Hi guys,
I'm totally new to ChIP-seq histone acetylation data analysis.
I have to analyse some ChIP-seq data. Following several tutorials and online documentation, I started by generating .fastq.gz files from the raw sequencing data, then trimmed them with Trimmomatic and aligned them with bwa aln.
After bwa aln I ended up with one .sai file per lane, i.e. one for each .fastq.gz file.
The situation is this:
X1_10_S8_L001_R1_001.aligned.sai
X1_10_S8_L001_R1_001.fastq.gz
X2_11_S8_L001_R1_001.aligned.sai
X2_11_S8_L001_R1_001.fastq.gz
X3_13_S8_L001_R1_001.aligned.sai
X3_13_S8_L001_R1_001.fastq.gz
I have thousands of files like this.
I would like to create .sam files to finally generate .bam files.
Is there a way to loop over all the "paired" files to generate the .sam files and then the bam files?
Thank you in advance for your help
B.
.sai are index files, if I recall correctly. You should also have the actual SAM files that these files are an index of. Could you edit your question and add the BWA command(s) that you used, please?
Looks like I was super mistaken (ref: Bwa What Is In .Sai File).
If you're just looking for a loop, you should be able to find primers by searching the site. What is your sai -> sam command for one set of input files? Once you define that, you should be able to frame a loop with a tiny bit of effort.
Dear Ram, I would just like to run this: bwa sampe <in.db.fasta> <in1.sai> <in1.fq> > <out.sam> for the full list of "paired" *.sai and *.fastq.gz files.
Do the following exercise:
This is a bash loop question, so resources such as https://wiki.bash-hackers.org/syntax/pe should be really helpful.
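For what it's worth, here is a minimal sketch of such a loop, not a definitive pipeline. It assumes the reads are single-end (the listing shows only R1 files, one .sai per FASTQ), in which case the matching subcommand is bwa samse rather than bwa sampe, which expects two .sai and two FASTQ files. The reference name genome.fa, the REF and DRY_RUN variables, the function names and the .sorted.bam suffix are all made up for illustration, and samtools 1.3 or later is assumed to be installed.

```shell
#!/usr/bin/env bash
# Sketch: turn every X.aligned.sai + X.fastq.gz pair in a directory into X.sorted.bam.
# Assumptions (not from the thread): single-end reads, a reference already
# indexed with `bwa index`, and samtools >= 1.3 on PATH.

REF="${REF:-genome.fa}"     # hypothetical reference FASTA
DRY_RUN="${DRY_RUN:-1}"     # 1 = only print the commands; 0 = actually run them

# Map X.aligned.sai to the FASTQ it was made from: X.fastq.gz
fastq_for_sai() {
    printf '%s.fastq.gz\n' "${1%.aligned.sai}"
}

sai_to_bam() {
    local dir="$1" sai fq base
    for sai in "$dir"/*.aligned.sai; do
        [ -e "$sai" ] || continue            # the glob matched nothing
        fq="$(fastq_for_sai "$sai")"
        if [ ! -f "$fq" ]; then
            echo "skipping $sai: $fq not found" >&2
            continue
        fi
        base="${sai%.aligned.sai}"
        if [ "$DRY_RUN" = 1 ]; then
            # Show what would be run without touching bwa or samtools
            printf 'bwa samse %s %s %s | samtools sort -o %s -\n' \
                "$REF" "$sai" "$fq" "$base.sorted.bam"
        else
            # Pipe the SAM straight into samtools so no intermediate .sam is kept
            bwa samse "$REF" "$sai" "$fq" | samtools sort -o "$base.sorted.bam" -
        fi
    done
}

sai_to_bam "${1:-.}"
```

Saved under a hypothetical name such as sai2bam.sh, it could be tried with `bash sai2bam.sh /path/to/files` to print the commands first, then rerun with `DRY_RUN=0 REF=/path/to/reference.fa` once the output looks right. With thousands of files, skipping the intermediate .sam saves a lot of disk space; if you do want the .sam files, redirect the bwa samse output to "$base.sam" instead of piping it.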