Hi,
I have some samples, each with paired-end NGS reads (150 bp) in FASTQ format.
I have already aligned them to the refseq with BWA.
I want to take N bases and do some kind of manipulation on them.
My idea is to build a sequence from the BAM file that includes the mutations, and to fill the gaps (where I have no reads) with the refseq.
Do you know of any reference-guided assembly tool that creates a genome for each sample, so I can take N bases from wherever I want?
Thanks
What is the difference from:
gatk4 FastaAlternateReferenceMaker -R $REFERENCE -O $CONSENSUS_FASTA -V $VCF
https://medium.com/brown-compbiocore/building-a-consensus-sequence-with-vcf-files-db7407f3f86f
When you say "reference-guided assembly", a professional bioinformatician will think you are trying to do an assembly with tools such as Canu, SPAdes, etc., using a reference genome as a guide.
What you are trying to do is use an alignment or variant calls (BAM, VCF, etc.) and an existing FASTA to generate a consensus sequence. Note: this is _not_ assembly.
But yes, GATK or
samtools mutfa
will likely do a good job of that. Make sure insertions and deletions are applied correctly.
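
To illustrate why indels need care, here is a minimal Python sketch of what "applying a VCF to a FASTA" amounts to. It is not how GATK or any other tool implements it; the function name apply_variants and the toy reference and variants are made up for the example, and it assumes a single haploid sequence with non-overlapping variants given as (1-based position, REF, ALT), the way a VCF lists them.

```python
def apply_variants(reference, variants):
    """Return a consensus string with the given variants applied.

    variants: iterable of (pos, ref, alt), pos being 1-based as in VCF.
    Assumes variants do not overlap (hypothetical simplification).
    """
    pieces = []
    cursor = 0  # 0-based index of the next reference base not yet copied
    for pos, ref, alt in sorted(variants):
        start = pos - 1
        if start < cursor:
            raise ValueError(f"overlapping variant at position {pos}")
        if reference[start:start + len(ref)] != ref:
            raise ValueError(f"REF allele mismatch at position {pos}")
        pieces.append(reference[cursor:start])  # unchanged reference stretch
        pieces.append(alt)                      # substituted allele
        cursor = start + len(ref)               # skip the replaced REF bases
    pieces.append(reference[cursor:])           # tail after the last variant
    return "".join(pieces)


# Toy data, invented for illustration only.
reference = "ACGTACGTAC"
variants = [
    (2, "C", "T"),    # SNP
    (4, "TA", "T"),   # deletion of the A at position 5
    (8, "T", "TGG"),  # insertion of GG after position 8
]
consensus = apply_variants(reference, variants)
print(consensus)  # ATGTCGTGGAC
```

The consensus is one base longer than the reference (one base deleted, two inserted), so positions after the first indel no longer match reference coordinates. If you then pull N bases out of the consensus by position, keep in mind which coordinate system you are using.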