9.2 years ago
pingEde
Good morning,
I am implementing a pipeline for Roche 454 Junior data and I have a few questions. I hope that someone can help me :)
I do not know the structure of the reads in the SFF file. I think that they are composed of MID - Primer 1 - Sequence - Primer 2 - MID; is that wrong? Are Primer 1 and Primer 2 different or the same?
Could you suggest a good tool to remove primers?
Thank you in advance for your attention!
Best regards
Thank you for your response. I have already converted the file to FASTA with a Galaxy tool and split the .fasta file by MID, and now I have to trim the primers. That is my problem, because I am not sure of the read structure, or of what happens if the amplicon is longer than the final reads, so that the 3' end of the read may be incomplete. Thank you for your help
Mothur can trim the primers as well. The primers are easy to see: they are at the beginning of each read.
Note that you don't always need to remove primers for analysis methods to work. And yes, the end of the amplicon may be missing; that's not necessarily a problem either. It all depends on the quality of the data and the statements that you wish to make from them.
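If you would rather trim the primers yourself, here is a minimal Python sketch of the idea, not a definitive implementation. It assumes the MID has already been removed, so the forward primer sits at the very start of the read, and that the reverse primer shows up as its reverse complement only when the read is long enough to reach it. The function names, the example sequences and the one-mismatch default are all hypothetical; substitute your own primers.

```python
# Hypothetical helpers for illustration; primer sequences must come from your own assay.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def trim_forward_primer(read, primer, max_mismatches=1):
    """Remove the primer from the 5' end of the read.

    Returns the trimmed read, or None if the read does not start with
    the primer within the allowed number of mismatches.
    """
    head = read[:len(primer)]
    if len(head) < len(primer):
        return None
    if hamming(head.upper(), primer.upper()) <= max_mismatches:
        return read[len(primer):]
    return None


def trim_reverse_primer(read, reverse_primer):
    """Cut the read where the reverse primer's reverse complement begins.

    Only exact matches are searched. If the site is absent (for example
    because the read ends before the 3' end of the amplicon), the read
    is returned unchanged.
    """
    site = reverse_complement(reverse_primer).upper()
    pos = read.upper().find(site)
    return read if pos == -1 else read[:pos]
```

For example, `trim_forward_primer("ACGTACGGGTTT", "ACGTAC")` gives `"GGGTTT"`, and a read that stops before the reverse primer simply passes through `trim_reverse_primer` untouched, which matches the incomplete 3' end case discussed above.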
Thank you very much for your response. Could you suggest where I can download the human reference genome (hg19) to use with a mapping tool?
Thank you in advance for your help!
see this https://www.biostars.org/local/search/page/?q=download+human+genome
Thank you for your response, I have verified that the MID is present in the read :)