7.6 years ago
clear.choi
I am new to analyzing whole-genome sequences. I have de novo assembly results in FASTA format, with around 4400 indexes.
The file is huge (around 3 GiB).
I tried to analyze this data against hg19, using BWA and BLASR for the alignment, but both failed: BWA crashed with a core dump and BLASR never finished.
So I'd like some tips on how to handle these whole-genome sequence results.
Is there a recommended way to do this alignment and make a BAM file to view in IGV? Any other suggestions would also be appreciated.
Thank you!
What sequencing platform is used? Illumina?
I would align the reads (fastq) from your sequencer directly to hg19 with bwa mem, instead of using a de novo assembly first.
It is the PacBio platform. I'd like to learn how de novo assembly results are analyzed using a visualization tool.
You could use mapPacBio.sh from the BBMap suite to do the mapping and create BAM files if you wish.
Assuming your assembly has been checked and is reasonably good (did the sequence provider do that for you?), you could try using one of the tools above to see what sort of contiguity you have in the assembly with the reference. I would suggest using GRCh38 at this time, since hg19 is getting a bit long in the tooth.
Thank you so much for the information! Yes, the sequence provider has checked the sequence quality. I am running it right now and will see how it looks. Could you also share how an informatics team normally analyzes de novo assembly results?
So this is a human genome sequence that has been de novo assembled? When you say there are 4400 indexes, does that mean there are that many contigs/sequences in your fasta file?
For long sequences like that, your best bet is to use BLAST+, blat or LASTZ for doing the alignments. I am not sure why you want to make BAM files at this point.
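A quick way to check how many contigs/sequences a FASTA file holds, and how long they are, is to count the header lines and sum the sequence lines under each one. The following is a minimal Python sketch of that idea, not a replacement for a proper assembly QC tool. The function names and the toy input are made up for illustration, and N50 is computed here as the length of the contig at which the running total of lengths (sorted longest first) reaches half of the assembly size.

```python
import io


def fasta_lengths(handle):
    """Yield (name, length) for each record in a FASTA text stream."""
    name, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name = parts[0] if parts else ""  # first word of the header
            length = 0
        else:
            length += len(line)
    if name is not None:
        yield name, length


def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    for n in sorted(lengths, reverse=True):
        running += n
        if running * 2 >= total:
            return n
    return 0


# Toy input standing in for a real assembly file (hypothetical data).
toy_fasta = io.StringIO(">contig1\nACGT\nACGT\n>contig2 some description\nAC\n")
records = list(fasta_lengths(toy_fasta))
print("contigs:", len(records))
print("total bases:", sum(length for _, length in records))
print("N50:", n50([length for _, length in records]))
```

For a real file, pass an open file handle (for example `open("assembly.fasta")`, where the file name is a placeholder) instead of the toy `StringIO`. The file is read line by line, so a 3 GiB assembly does not have to fit in memory.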
Yes, it is a human genome sequence that has been de novo assembled, and it looks like there are many contigs/sequences in the sample fasta file.
Thank you for suggesting the tools! I'd like to view the sequence in a visualization tool like IGV, to see how it looks, as with resequencing data.
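On the IGV side: IGV expects a coordinate-sorted BAM file with an index next to it. Assuming the aligner has written SAM output and samtools is installed, the conversion could look roughly like the Python sketch below. The file names and helper functions are hypothetical, and the sketch only builds and prints the commands; call `run_commands` to actually execute them.

```python
import subprocess


def igv_prep_commands(sam_path, bam_path):
    """Return samtools commands that turn a SAM file into a sorted, indexed BAM."""
    return [
        # Sort alignments by coordinate and write BAM output.
        ["samtools", "sort", "-O", "bam", "-o", bam_path, sam_path],
        # Create the index file that IGV looks for next to the BAM.
        ["samtools", "index", bam_path],
    ]


def run_commands(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Hypothetical file names, for illustration only.
commands = igv_prep_commands("assembly_vs_ref.sam", "assembly_vs_ref.sorted.bam")
for cmd in commands:
    print(" ".join(cmd))
```

In IGV, load the same reference genome the contigs were aligned to (hg19 or GRCh38) before opening the sorted BAM, otherwise the coordinates will not line up.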