Hello, I have a couple of fastq files of approximately 20 million reads each (transcriptomes), and I want to extract 5-10 thousand reads so I can test-assemble them on my laptop. Is that possible?
Use reformat.sh from the BBMap suite. The options relevant to sampling are listed below the command.
reformat.sh -Xmx4g in=file.fq.gz out=sampled.fq.gz RELEVANT_OPTIONS_BELOW
reads=-1 Set to a positive number to only process this many INPUT reads (or pairs), then quit.
skipreads=-1 Skip (discard) this many INPUT reads before processing the rest.
samplerate=1 Randomly output only this fraction of reads; 1 means sampling is disabled.
sampleseed=-1 Set to a positive number to use that prng seed for sampling (allowing deterministic sampling).
samplereadstarget=0 (srt) Exact number of OUTPUT reads (or pairs) desired.
samplebasestarget=0 (sbt) Exact number of OUTPUT bases desired.
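For the question as asked, a minimal sketch could look like the following. It uses only the options listed above (samplereadstarget and sampleseed); the file names, the read count of 10000, and the seed value 42 are placeholders I picked for illustration, so change them to suit your data. The script prints the command and only runs it if reformat.sh is on your PATH and the input file exists.

```shell
#!/bin/sh
# Placeholder file names (assumptions); replace with your own files.
in_fq="file.fq.gz"
out_fq="sampled.fq.gz"

# Upper end of the 5-10 thousand reads asked for in the question.
n_reads=10000

# Any positive seed makes the random sample repeatable (value is arbitrary).
seed=42

# Build the command from the options listed above.
# Note: this simple string form assumes file names without spaces.
cmd="reformat.sh -Xmx4g in=${in_fq} out=${out_fq} samplereadstarget=${n_reads} sampleseed=${seed}"

# Show what will be run.
echo "$cmd"

# Run it only when BBMap is installed and the input file is present.
if command -v reformat.sh >/dev/null 2>&1 && [ -f "$in_fq" ]; then
    $cmd
fi
```

Since samplereadstarget counts reads "or pairs", paired data should stay in sync as long as both mates are sampled in a single run rather than one file at a time. Check the reformat.sh help text for how to pass the second input and output file.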
Please review all your previous questions and either add a comment or validate the answers! Warning: Mate records missing HTSEQ; Merge common elements in R; Kallisto abundance.tsv; Exception in thread "main" java.lang.RuntimeException: Sequence and quality length don't match; Trinity Insilico Normalization
We have already asked for this: Trinity Insilico Normalization