I have a genome of about 2 Gb made up of scaffolds, and I would like to randomly sample it.
I used reformat.sh, but the output was only a single scaffold. I need 1/3 of the total genome. Here are a few scaffolds as an example:
>LGKD01000001.1 Octopus bimaculoides isolate UCB-OBI-ISO-001 Scaffold4_contig_1, whole genome shotgun sequence
GAACAGCATGAATGTTAAAACtgaaatggatgatgatgatgatgatgatgatgatgatggcagcaacAGCCatgattatatttaatatgttgttagttataatcataataatgatgataatgttgataacaaTAATGGTTGCAATAATG
>KQ415657.1 Octopus bimaculoides isolate UCB-OBI-ISO-001 unplaced genomic scaffold Scaffold5, whole genome shotgun sequence
tatatatatatagtcaattcgagGATGTTAGATCGACAATGGGGATTATAGAATCCCACAAAAAATTCCACTGGT
>LGKD01000032.1 Octopus bimaculoides isolate UCB-OBI-ISO-001 Scaffold12_contig_1, whole genome shotgun sequence
GAAGTGGTAAAGAGTgcgatgcgctgaaaaaagagagaacagtacttgaaatGTGGTTTCATTCTagtagtaaat
>LGKD01000033.1 Octopus bimaculoides isolate UCB-OBI-ISO-001 Scaffold16_contig_1, whole genome shotgun sequence
ctgaTCAACAGAatagggccaatcattcttcatgacaatgctcgaccacacgttttaCTAATGA
>LGKD01000034.1 Octopus bimaculoides isolate UCB-OBI-ISO-001 Scaffold22_contig_1, whole genome shotgun sequence
TTATCTATATACGagaatattatctatatataaaggaataccaaaaaaacaagaacaacgggtcattcggaattttcttt
Is there a script that can do this?
If you truly need 1/3 of the total, you would need to collapse the entire genome sequence into one long FASTA record and then sample from that.
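If whole scaffolds are acceptable as the sampling unit, here is a rough sketch of an alternative that avoids concatenating everything: shuffle the scaffolds and keep picking until the selected bases reach about one third of the total. This is a different technique from collapsing into one sequence, and the result slightly overshoots the target by up to one scaffold. The function names and the file names in the usage note are hypothetical, and I am assuming the FASTA headers are unique.

```python
import random


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file, one record at a time."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def sample_scaffolds(lengths, fraction=1 / 3, seed=None):
    """Pick scaffold headers at random until their summed length reaches
    `fraction` of the total length. Assumes headers are unique."""
    rng = random.Random(seed)
    names = list(lengths)
    rng.shuffle(names)
    target = fraction * sum(lengths.values())
    picked, running = set(), 0
    for name in names:
        if running >= target:
            break
        picked.add(name)
        running += lengths[name]
    return picked


def write_sample(in_path, out_path, fraction=1 / 3, seed=None):
    """Two passes over the input: first collect lengths, then write the picks."""
    lengths = {h: len(s) for h, s in read_fasta(in_path)}
    picked = sample_scaffolds(lengths, fraction, seed)
    with open(out_path, "w") as out:
        for header, seq in read_fasta(in_path):
            if header in picked:
                out.write(f">{header}\n{seq}\n")
    return picked
```

Something like `write_sample("genome.fa", "sample.fa", seed=1)` would then write the sampled scaffolds; fixing the seed makes the sample reproducible. Note that with a few very large scaffolds the overshoot can be substantial, in which case sampling windows from a concatenated sequence is the closer fit.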
Edit: Can you provide the exact command line you used?