4.4 years ago
stan.aanhane
Hi everyone,
After performing a de novo assembly with the following command, I want to keep only the 50 largest contigs.
spades.py --untrusted-contigs lclav_genome.fa -1 randomnietnfectedFP.fastq.gz -2 randomnietinfectedRP.fastq.gz -t 2 -m 28 NINnovo --phred-offset 33
This creates a directory containing the contigs file. The file is sorted from largest to smallest contig, and we want just the top 50 contigs. I have tried something with awk, but it is not working the way I want. Can someone help me out?
Thank you!
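You could convert the multi-line FASTA to single-line FASTA (see Multiline Fasta To Single Line Fasta) and then extract the first 100 lines using head. Each record then takes exactly two lines (header plus sequence), so 100 lines gives you the top 50 contigs. If you need wrapped sequence lines afterwards, don't forget to convert the result back to multi-line FASTA.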
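As a rough sketch of the same idea in a single step, the awk function below joins wrapped sequence lines and stops after the first N headers. It relies on the contigs file already being sorted from largest to smallest, as described in the question. The function name top_contigs and the output file name are made up for illustration, and the input path assumes the contigs ended up in NINnovo/contigs.fasta.

```shell
# Print the first N records of a FASTA file, one sequence line per record.
# Usage: top_contigs N file.fasta   (top_contigs is a hypothetical helper name)
top_contigs() {
    awk -v n="$1" '
        /^>/ {
            if (has_seq) printf "\n"   # terminate the previous sequence line
            has_seq = 0
            if (++count > n) exit      # stop once N records have been printed
            print                      # header line
            next
        }
        count > 0 {
            printf "%s", $0            # join wrapped sequence lines
            has_seq = 1
        }
        END { if (has_seq) printf "\n" }
    ' "$2"
}

# Example call (paths are assumptions, adjust to your output directory):
# top_contigs 50 NINnovo/contigs.fasta > top50_contigs.fasta
```

The output is still valid FASTA with two lines per contig, so it should match what head -n 100 gives on an already linearized file.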
See past threads for inspiration:
How To Filter Multi Fasta By Length??
how to rearrange fasta file according to its length (add a filter for 50)