Dear all, I need to set up BWA for alignment. I followed the steps below:
1- Download the hg19 reference genome with 'wget' from the link below:
wget https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz
2- Then extract the downloaded file:
tar zvfx chromFa.tar.gz
3- Concatenate the .fa files into wg.fa with 'cat':
cat *.fa > wg.fa
4- Remove the individual chr*.fa files, which are no longer needed:
rm chr*.fa
5- Now I generate the index with the command below, using bwa version 0.7.17-r1188:
./bwa index -p hg19bwaidx -a bwtsw wg.fa
After that, I have these 5 files in the folder:
hg19bwaidx.amb, hg19bwaidx.ann, hg19bwaidx.bwt, hg19bwaidx.pac, hg19bwaidx.sa
6- Now I would like to generate a SAM file for paired-end reads with the mem algorithm, using the command below:
./bwa mem -T 19 ????? file_paired_1.fastq file_paired_2.fastq > aln.sam
My question is: which of the five files generated in the indexing step (step 5) should I use in place of ????? in step 6?
I would appreciate it if anybody could share a comment with me.
Best Regards,
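For step 6, bwa mem takes the index prefix as its first positional argument rather than any single one of the five files, so with the '-p hg19bwaidx' option used in step 5 the ????? would be hg19bwaidx. As a minimal sketch, the helper below only checks that all five index files exist for a given prefix before the alignment is started. The name check_bwa_index is hypothetical and not part of bwa.

```shell
# check_bwa_index is a hypothetical helper, not a bwa command.
# It returns 0 only if all five index files exist for the given prefix.
check_bwa_index() {
    prefix="$1"
    for ext in amb ann bwt pac sa; do
        if [ ! -f "${prefix}.${ext}" ]; then
            echo "missing index file: ${prefix}.${ext}" >&2
            return 1
        fi
    done
    return 0
}

# Usage, run in the folder that holds the index from step 5:
# check_bwa_index hg19bwaidx && \
#   ./bwa mem -T 19 hg19bwaidx file_paired_1.fastq file_paired_2.fastq > aln.sam
```

The prefix works because bwa looks up hg19bwaidx.amb, hg19bwaidx.ann and the other files itself, so they all need to stay together in the same folder.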
If you want to use GATK in your downstream analysis, you should pay attention to the order of the chromosomes. GATK raises an error if the chromosomes are ordered "chr1", "chr10", "chr11", ... instead of "chr1", "chr2", "chr3", ....
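If the chromosome order matters for your pipeline, one possible way to avoid the "chr1", "chr10", "chr11" ordering that a plain 'cat *.fa' produces is to sort the file names numerically before concatenating. This is only a sketch: concat_natural is a hypothetical helper name, it assumes GNU sort (for the -V option) and file names without spaces, and it does not decide where chrM, chrX, chrY or the random contigs should go, so check the resulting order against what your downstream tool expects.

```shell
# concat_natural is a hypothetical helper, not part of bwa or GATK.
# Usage: concat_natural OUTPUT.fa INPUT1.fa INPUT2.fa ...
# Assumes GNU sort (-V = version sort) and file names without whitespace.
concat_natural() {
    out="$1"
    shift
    # Print one file name per line, sort so that chr2 comes before chr10,
    # then concatenate the files in that order.
    printf '%s\n' "$@" | sort -V | xargs cat > "$out"
}

# Usage, as a replacement for 'cat *.fa > wg.fa' in step 3:
# concat_natural wg.fa chr*.fa
# grep '>' wg.fa    # lists the sequence names in their final order
```

The index has to be rebuilt with 'bwa index' after wg.fa is regenerated, since the index reflects the order of the sequences in the FASTA file.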
Thank you for your comment. Honestly, I need a BWA-generated SAM file as input for finding CircularRNA with the CircularRNA finder. Up to now, I couldn't find any CircularRNA in my samples. Could you please guide me on what I should do instead of using the command below: