Dear all:
I have a few questions and would appreciate your help.
I have been reading about exome sequencing and found very few papers on cancer sequencing. What are your opinions on SOLiD sequencing: is it not good, or does it have problems?
Also, when using a chip to capture exome regions, what about putting 4 samples on 1 chip? Are there any problems with that?
Thanks.
Which paper did you read? What is your question? Could you please explain it more clearly? What do you mean by "when using chip to capture exome regions, how about 1 chip corresponds to 4 samples"?
Excuse my poor English. I just meant that some people, in order to save money, pool samples together, capture the exon regions, and then sequence.
From my own experience and empirical observation most (if not all) bioinformaticians seem to try to avoid working with the SOLiD sequencing platform because it produces data in color space format and that precludes them from using the majority of existing tools and techniques. This can be very frustrating.
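To illustrate why color space is awkward for base-space tools, here is a minimal Python sketch of the standard two-base encoding, in which each color is the XOR of the 2-bit codes of adjacent bases. The function names are my own and the read is a toy example, not real SOLiD output.

```python
# 2-bit codes for the bases; a color is the XOR of two adjacent base codes,
# so AA/CC/GG/TT -> 0, AC/CA/GT/TG -> 1, AG/GA/CT/TC -> 2, AT/TA/CG/GC -> 3.
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def encode_colorspace(seq, primer_base="T"):
    """Encode a base-space sequence as primer base + color digits."""
    prev = BASE_TO_CODE[primer_base]
    colors = []
    for base in seq:
        cur = BASE_TO_CODE[base]
        colors.append(str(prev ^ cur))
        prev = cur
    return primer_base + "".join(colors)


def decode_colorspace(read):
    """Naively decode a color-space read (primer base + digits) to bases."""
    prev = BASE_TO_CODE[read[0]]
    bases = []
    for color in read[1:]:
        prev ^= int(color)  # next base code = previous base code XOR color
        bases.append(CODE_TO_BASE[prev])
    return "".join(bases)


if __name__ == "__main__":
    read = encode_colorspace("ACGT")
    print(read, decode_colorspace(read))
    # One wrong color changes every base after it when decoded naively.
    print(decode_colorspace("T3031"))
```

Because a single miscalled color shifts every downstream base in a naive decode, reads are normally aligned in color space by tools that understand the encoding rather than converted up front.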
The second part of your question has to do with barcoding the samples. The only thing that needs to be determined is the coverage each barcoded sample receives. As long as you get sufficient coverage, you can add as many samples as the platform supports.
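A rough way to check the coverage per barcode is a back-of-the-envelope calculation like the sketch below. It assumes equal pooling, and the on-target and barcode-assignment fractions are placeholder values that you would replace with numbers from your own runs.

```python
def mean_target_coverage(total_reads, read_length, target_bp, n_samples,
                         on_target_fraction=0.6, assigned_fraction=0.95):
    """Expected mean depth over the capture target for one pooled sample.

    on_target_fraction and assigned_fraction are illustrative defaults,
    not measured values for any particular kit or platform.
    """
    reads_per_sample = total_reads * assigned_fraction / n_samples
    return reads_per_sample * on_target_fraction * read_length / target_bp


def max_samples(total_reads, read_length, target_bp, min_coverage,
                on_target_fraction=0.6, assigned_fraction=0.95):
    """Largest number of equally pooled samples that still reaches min_coverage."""
    single = mean_target_coverage(total_reads, read_length, target_bp, 1,
                                  on_target_fraction, assigned_fraction)
    return int(single // min_coverage)


if __name__ == "__main__":
    # Hypothetical run: 100M reads of 50 bp against a 50 Mb target, 4 samples.
    print(mean_target_coverage(100_000_000, 50, 50_000_000, 4))
    print(max_samples(100_000_000, 50, 50_000_000, min_coverage=10))
```

This only gives the mean depth. Capture is uneven, so the fraction of target bases above a given depth will be lower than the mean suggests.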
Having worked predominantly with SOLiD reads (both 4 and 5), I agree with Istvan that it's a bitch to work with. The only effective aligners you can use are LifeScope/BioScope or Bowtie. ABI's software is also difficult to access as their page is constantly moving or down. And don't even get me started on their XSQ format...
Mutations are called after the alignment is done. The number of mutations will depend on the quality and quantity of the data, the calling algorithm used, and the parameters for the algorithm (the more stringent, the fewer false positives, but also the more true positives missed). With similar calling algorithms and parameters, false positives will depend on the depth at the location of calling and on the sequencing error rate in the reads.
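As a rough illustration of how depth and error rate interact, the sketch below computes the chance that sequencing errors alone produce at least k reads supporting one specific wrong allele at a site. It assumes independent errors split evenly among the three alternative bases, which is a simplification and not how any particular caller models the data.

```python
from math import comb


def prob_at_least_k_errors(depth, k, error_rate):
    """P(at least k of `depth` reads show the same specific erroneous base).

    Simple binomial model: each read independently shows that base with
    probability error_rate / 3.
    """
    p = error_rate / 3.0
    return sum(comb(depth, i) * p**i * (1.0 - p)**(depth - i)
               for i in range(k, depth + 1))


if __name__ == "__main__":
    # Hypothetical 1% per-base error rate, requiring 3 supporting reads.
    for depth in (20, 100, 500):
        print(depth, prob_at_least_k_errors(depth, 3, 0.01))
```

With a fixed read-count threshold the false positive chance rises with depth, which is one reason callers usually work with allele fractions or likelihoods rather than a fixed number of supporting reads.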
For the second part of your question, it might be worth having some sense of the number of reads going to unassigned barcodes.
For example, if the runs often have 10,000s of reads (or more) going to unassigned barcodes, then I might be cautious about trying to sequence 1,000 reads per sample. I also sometimes get nervous when the number of observed reads varies a lot from the number of expected reads. Sometimes, you can figure out when a barcode has been mixed up (for example, if genomic coverage is very different, and one sample has >10M reads and the other has <100k reads), but that gets harder as the number of missing barcodes increases.
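A quick sanity check along these lines could look like the sketch below. The sample names, counts, and the 3-fold threshold are made up for illustration; the read counts would come from whatever demultiplexing report your pipeline produces.

```python
def barcode_report(observed, expected, unassigned, max_fold=3.0):
    """Return (unassigned fraction, samples whose read count is off by > max_fold).

    observed / expected: dicts of sample name -> read count.
    unassigned: number of reads that matched no barcode.
    """
    total = sum(observed.values()) + unassigned
    unassigned_fraction = unassigned / total if total else 0.0
    flagged = []
    for sample, exp in expected.items():
        obs = observed.get(sample, 0)
        # Fold difference in either direction; guard against division by zero.
        fold = max(obs, exp) / max(min(obs, exp), 1)
        if fold > max_fold:
            flagged.append(sample)
    return unassigned_fraction, sorted(flagged)


if __name__ == "__main__":
    observed = {"A": 10_000_000, "B": 90_000, "C": 4_800_000}
    expected = {"A": 5_000_000, "B": 5_000_000, "C": 5_000_000}
    fraction, flagged = barcode_report(observed, expected, unassigned=110_000)
    print(f"unassigned fraction: {fraction:.4f}; flagged samples: {flagged}")
```

A flagged sample is only a prompt to look more closely; a low count can come from library prep as well as from a barcode mix-up.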
I think there may also be some complications, depending upon the types of samples that you mix. However, I am still trying to understand those trends better, and I am hesitant to say you absolutely can't do something.
In other words, I have some notes on barcoding here:
However, you may want to run fewer samples per lane (or less diversity of library/barcode/adapter types per lane) than you may technically be allowed to.