Hi all
The data I am analysing come from the SOLiD platform and are paired-end. The problem is that the forward and reverse reads have different lengths. My procedure was to align the csfastq files with Bowtie against a color-space genome reference index, but I aligned the forward and reverse files separately. Is this approach correct?
The relevant text from the article describing this data:
"The samples were sequenced using the 50625 paired-end protocol, generating 75 nt+35 nt (Paired-End)+5 nt (Barcode) sequences. Quality data were measured using software SETS parameters (SOLiD Experimental Tracking System). For both reads, forwards and reverse, the seed was the first 25 nucleotides with a maximum of 2 mismatches."
I have a few questions, please help me.
1. Do I have to align only the first 25 nucleotides, as in the article?
2. The 5 barcode nucleotides: when I ran FastQC, I did not see them in the results. Do I still need to remove the first 5 nucleotides of each read?
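On question 1, the article's "seed" wording reads like an aligner seed setting rather than a hard trim. In Bowtie 1, `-l` sets the seed length and `-n` the maximum number of mismatches allowed in the seed, so the full read can be passed in while only the first 25 positions are held to the 2-mismatch limit. The following is a minimal sketch in Python that only builds such a command line. It assumes a Bowtie 1 release that still supports colorspace (`-C`; as far as I know, newer releases dropped it), and the index name, file names and the helper `bowtie_colorspace_cmd` are hypothetical, not taken from this thread or from the article.

```python
import shlex


def bowtie_colorspace_cmd(index, csfasta, qual, out_sam,
                          seed_len=25, seed_mms=2, threads=4):
    """Build (but do not run) a single-end Bowtie 1 colorspace command line."""
    return [
        "bowtie",
        "-C",                  # reads and index are in colorspace
        "-f",                  # FASTA-style input (.csfasta)
        "-Q", qual,            # separate quality file for the csfasta reads
        "-l", str(seed_len),   # seed length
        "-n", str(seed_mms),   # max mismatches allowed within the seed
        "-p", str(threads),    # alignment threads
        "-S",                  # write SAM output
        index,                 # basename of the colorspace index (hypothetical)
        csfasta,
        out_sam,
    ]


# Hypothetical file names, for illustration only.
cmd = bowtie_colorspace_cmd("hg19_c", "reads_F3.csfasta",
                            "reads_F3_QV.qual", "reads_F3.sam")
print(shlex.join(cmd))
```

Printing the command before running it makes it easy to paste the exact call into a forum post when asking for help.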
My workflow is as follows:
csfasta/.qual files ---> alignment with Bowtie (colorspace index, hg19; F/R files separately) ---> SAM to FASTQ with the 123fastq software ---> FastQC ---> Trimmomatic ---> alignment with HISAT2 (GRCh38; F and R files together) ---> htseq-count
Is this correct?
Thank you in advance for your help.
Save yourself the trouble and find an alternate dataset if you can. Colorspace data are going to be a hassle to deal with. Most current programs have stopped supporting them.
I have to do this analysis because there is no other dataset for the disease under study. I'm so confused and do not know what to do. Please guide me.
Please show all the commands you used... and be aware that if these are correct and the data still align poorly, you are not doing yourself a favor by starting a project based on very suboptimal (=crap) data.
Thanks for your attention
I read in an article that the alignment rate of SOLiD data is low (about 40%). Do you think this is true?
I used "abi-dump SRR1175538" to download the data from NCBI, which gave 4 files:
Then I aligned them with Bowtie against the colorspace hg19 index.
The result of the above command was a SAM file of 0 KB and a 0% alignment rate.
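One thing worth ruling out when a paired run produces nothing is whether the two mate files actually contain the same reads in the same order. The sketch below compares read IDs across two csfasta inputs. It assumes headers of the form `>1_10_20_F3` and `>1_10_20_F5-P2` (typical SOLiD tag suffixes, not confirmed for this dataset), and the function names are hypothetical.

```python
from itertools import zip_longest


def read_ids(lines, tag):
    """Yield read IDs from csfasta lines, with a trailing '_<tag>' removed."""
    suffix = "_" + tag
    for line in lines:
        line = line.strip()
        if not line.startswith(">"):
            continue  # skip '#' comment lines and color sequences
        name = line[1:]
        yield name[:-len(suffix)] if name.endswith(suffix) else name


def mates_in_sync(f3_lines, f5_lines, f3_tag="F3", f5_tag="F5-P2"):
    """Return (ok, pairs_checked, first_mismatch) for two mate files."""
    n = 0
    pairs = zip_longest(read_ids(f3_lines, f3_tag), read_ids(f5_lines, f5_tag))
    for a, b in pairs:
        if a != b:  # different ID, or one file ended early (None)
            return False, n, (a, b)
        n += 1
    return True, n, None


# Tiny made-up example; real input would be two open file handles.
f3 = ["# comment", ">1_10_20_F3", "T0120", ">1_10_35_F3", "T3321"]
f5 = [">1_10_20_F5-P2", "G0011", ">1_10_35_F5-P2", "G2203"]
print(mates_in_sync(f3, f5))  # (True, 2, None)
```

Because the functions only iterate over lines, they can be called as `mates_in_sync(open(f3_path), open(f5_path))` without loading either file into memory.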
So I then aligned the forward and reverse files separately:
Then I converted each SAM file separately to a FASTQ file with the 123fastq software,
ran quality control with FastQC,
and trimmed the FASTQ files together with Trimmomatic:
I did not continue the analysis because I was not sure of the correctness of the steps and the results obtained.
Please help me, friends