The answer to your question is to use awk.
To get the read lengths, use:
awk '{if(NR%4==2) print NR"\t"$0"\t"length($0)}' file.fastq > readLength
Output:
2 ATCTCATTTTTCACGTTTTTTAGTGATTTCATAATTTTTCAAGTCGTAAAGTGGATGTTTCTCATTTTTCATGATT 76
6 GTGTTAGGTGCCAAGATCCTTCTGTGTTTAAAGTGCCTTCAGGGGTGAGTGGTTTTAGTGCCATGTACTGTGGAAA 76
10 TGTATGGAGTGTTTAGATGCTCCCTTCTAGCTTCTTTTCAGACAATTTACTAGGATGATTGACAGAGAATTGCTTT 76
14 ACTGTAGCCCCGCCACACTGCCACTGGTATAGGACACCAAAAAGCACTCTCTAATAAATACAGAAAGATTAGCAGA 76
18 ACGCTGGAAACTTAGACTAGCAGTTAGAATGTAGTGGTTGATAGAGCATCCAGCAACCAGGGGTAAACTCGGCCTC 76
22 TTTTTAGTGATTTCGTCATTTTTCAAGTCGTCAAGTGGATGTTTCTCATTTTCCATGATTTTCAGTTTTCCTCGCC 76
26 CAGCAAGCGAAATAGTTATGGCCCATGTTTTCTGGATATCAATTACACAACGGGGAGATTTGGAAGGAGAACATGC 76
30 CATTTCACATGAGTGAGACTCAAGGCTTGGGATGTGTAATACAAGATCAACACTTGGGACAAATTCCAGCATGGGG 76
where the first column is the line number in the original FASTQ file, the second is the read itself, and the third is the length of that read.
For the lengths of the quality strings, use:
awk '{if(NR%4==0) print NR"\t"$0"\t"length($0)}' file.fastq > qualityLength
4 1==DDBEEHHHHDHIHIIIIIFHEHAGGHGIHIGHIIIIGGIIIGAFIIIGDGIHIIHCIIIHEHGEIII@HHGC? 76
8 BBBFFFFFFHHHHHIGIJJIJIJJJIJIJJIJIHHGHIIIIJJJJ?DFHDFHHIIJJJIIIIJJJJJJJHFHGFGC 76
12 ??@D?D?DFFFDFGGIGH@>IHG@FGAA+CFF>9CGH@9DDH@GGICH<B@FHIBEH<GG@BF<BB=CA;F)7@GC 76
16 @@@DDDDDF>FFHIIGIGEGG??BFEC?9CFFE9DDGAGDDDF<CBE@FEHHGGHEDGHGHBA>EEEDCCBCCCCA 76
20 CCCFFFFFGHHHHJJJJJJIHIJHHGJIIIJHIJFHGADHGHGHDGIIIIIIJJIJJJJJJJI@GIHHHHH=BE?@ 76
24 BCCFFDFDHHHHHJJJJJJJJJJGIIJJGHIIJIJ?FHCEHDGIJJIJJJJJIJJJJIJJJJIGJIIJJJIIGIGB 76
28 @CCDDFFDHHFHHJFGEGIIIIJAHGEEDGHGICHEEIIJ@HEEGIIJDCHHIIJ>EACEHFF?@DCEDCB>ACCD 76
32 @@@FFDDDHHDDDFHIJGIIGCGGIIIIIJJGHHCF1:FGFGIGGGHHIJJJDHGHIGHIIIGIJIIJIGHGHHHH 76
Finally, use awk again to compare the two files on column 3, the length:
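awk -F'\t' 'NR==FNR{len[FNR]=$3;next} len[FNR]!=$3' readLength qualityLength
This prints the lines of the second file (qualityLength) whose length differs from that of the corresponding read in readLength. The lengths are compared record by record (the array is indexed by FNR, the line number within each file), so a mismatch is caught even when the reads do not all have the same length.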
FastQC will give you the overall picture, but this will tell you exactly which records are affected.
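The three steps above can also be collapsed into a single pass over the FASTQ file, without the intermediate readLength and qualityLength files. The following is only a minimal sketch: it assumes strict four-line records (no wrapped sequences), and sample.fastq with its two made-up reads is a hypothetical file created purely for illustration.

```shell
# Build a tiny two-record FASTQ file for illustration (hypothetical data);
# the second record has a quality string one character too short.
printf '%s\n' \
  '@read1' 'ACGTACGT' '+' 'IIIIIIII' \
  '@read2' 'ACGTAC'   '+' 'IIIII' > sample.fastq

# On line 2 of each record remember the sequence length and its line number;
# on line 4 compare with the quality length and print only the mismatches.
# Output columns: sequence line, sequence length, quality line, quality length.
mismatches=$(awk '
    NR % 4 == 2 { seq_len = length($0); seq_line = NR }
    NR % 4 == 0 && length($0) != seq_len {
        print seq_line "\t" seq_len "\t" NR "\t" length($0)
    }
' sample.fastq)

printf '%s\n' "$mismatches"
```

On the sample file this reports only the second record (sequence on line 6, quality on line 8). An empty result means every quality string has the same length as its read.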
HTH
Edit: The code is better now, as there is no need to use diff or to calculate the line numbers back (as suggested by @Istvan).
The answer to a similar question, "Fastq With Format Errors", may be helpful if you do find formatting errors.