The error is shown below:
INFO @ Sun, 04 Aug 2024 06:38:54: Total: 61970171.
INFO @ Sun, 04 Aug 2024 06:38:54: Mapped: 38138765.
INFO @ Sun, 04 Aug 2024 06:38:54: Parsing FASTQ file rawdata/CRISPR_screen_1/MOI001-A549-cas9_R2.fq.gz...
INFO @ Sun, 04 Aug 2024 06:38:54: Determining the trim-5 length of FASTQ file rawdata/CRISPR_screen_1/MOI001-A549-cas9_R2.fq.gz...
INFO @ Sun, 04 Aug 2024 06:38:54: Possible gRNA lengths:20
INFO @ Sun, 04 Aug 2024 06:38:54: Processing 0M reads ...
INFO @ Sun, 04 Aug 2024 06:38:58: Read length:150
INFO @ Sun, 04 Aug 2024 06:38:58: Total tested reads: 100001, mapped: 50(0.0004999950000499995)
ERROR @ Sun, 04 Aug 2024 06:38:58: Cannot automatically determine the --trim-5 length. Only 0.049999500004999954 % of the reads can be identified.
Where is the gRNA supposed to be in your reads? In the first 50 bp? You may want to trim your original reads down to that size and try again. Are you providing a file with the expected gRNA sequences as the reference?
This is an example of my reference data:
I generated this data from the content file. This is an example of my fq data:
I'm not sure where the gRNA is supposed to be in my reads.
In CRISPR and shRNA screens you don't sequence the guide itself but the attached barcodes. Unless this is super custom, it's the first bases of every read, but exactly how many you need to determine by either checking the protocol or talking to the scientist who produced the library.
The easiest way to find where they are located in the reads is to take a few of those sequences and grep for them in your FASTQ file.
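If plain grep gets tedious, the same check can be sketched in Python. This is only a rough illustration, not part of MAGeCK: the function names `read_sequences` and `guide_offsets` are made up for this example, and it assumes a standard four-line FASTQ and a handful of guide sequences copied from your library file. It reports the 0-based offset at which each guide starts in a read, on both the forward strand and the reverse complement (worth checking for an R2 file).

```python
import gzip
from collections import Counter
from itertools import islice


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def read_sequences(fastq_path, max_reads=100000):
    """Yield the sequence line of up to max_reads FASTQ records.

    Hypothetical helper: assumes four lines per record, so the
    sequence is always the second line of each record.
    """
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for line in islice(handle, 1, max_reads * 4, 4):
            yield line.strip().upper()


def guide_offsets(reads, guides):
    """Count the start offsets at which any guide occurs in the reads.

    Returns two Counters (forward strand, reverse complement) that map
    a 0-based offset to the number of reads with a guide at that offset.
    """
    guides = [g.upper() for g in guides]
    rc_guides = [revcomp(g) for g in guides]
    forward, reverse = Counter(), Counter()
    for read in reads:
        for guide, rc in zip(guides, rc_guides):
            pos = read.find(guide)
            if pos != -1:
                forward[pos] += 1
            pos = read.find(rc)
            if pos != -1:
                reverse[pos] += 1
    return forward, reverse


# Example usage (paste a few guides from your own library file):
# guides = ["<guide 1>", "<guide 2>", "<guide 3>"]
# reads = read_sequences("rawdata/CRISPR_screen_1/MOI001-A549-cas9_R2.fq.gz")
# forward, reverse = guide_offsets(reads, guides)
# print(forward.most_common(5), reverse.most_common(5))
```

If nearly all forward hits share one offset, that offset is presumably the number of bases to trim from the 5' end, i.e. a candidate value for `--trim-5`. If the hits only show up in the reverse-complement counter, the guide is likely read on the opposite strand in that file. With only a few guides most reads will not match, so look at the distribution of offsets rather than the total number of hits.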