Hi Quade,
My first issue was that the failed files' details are displayed incorrectly in the failed files tab.
My second issue is the duplicate checking causing the failure. I agree with your reasoning and the 4 main causes,
but I usually suffer from a 5th failure cause, or a variation of case 1, where the file set includes an info file containing boiler-plate text.
This common info file has been downloaded hundreds of times and is already in the dupe database, so every time a file set
containing this standard info file is downloaded, the whole file set download gets failed - but not until some, or even most, of the files have been downloaded and the info file is hit.
That then brings my first issue into play, so I can't easily tell which files have been downloaded and which have failed.
I then have to either hunt around and work out which files are missing, or re-add the file set and ignore dups - but then it re-downloads all the stuff that already came down OK, which could be multiple GB!
The new optional dup detector might help with some of the causes, but I don't think it will help with what I have described above,
and I would hate to invalidate a dup database that has taken 20 years to build up.
I like the dup checker, as it is useful for individual file downloads like JPGs, EPUBs, PDFs, etc., but I also agree that it is less useful now than it used to be.
Would it be possible, perhaps, to have a user-maintained list of file types to be included in or excluded from the dup check?
Maybe defaulting to everything being included to start with, but allowing for fine-tuning?
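To show what I mean, here's a rough Python sketch of that include/exclude idea - all the names and structure are made up by me, not anything from the real program:

    # Rough sketch of a user-maintained dup-check filter (all names
    # invented - I have no idea how the real checker is structured).

    DEFAULT_EXCLUDED: set[str] = set()   # default: everything is dup-checked

    def is_dup_checked(filename: str, excluded: set[str] = DEFAULT_EXCLUDED) -> bool:
        """Return True if this file should go through the dup check."""
        ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
        return ext not in excluded

    # Fine-tuning: exclude the boiler-plate extras from the dup check.
    user_excluded = {"info", "nfo", "sfv", "txt"}
    print(is_dup_checked("release.info", user_excluded))      # False - skipped
    print(is_dup_checked("movie.part01.rar", user_excluded))  # True - checked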
Or, my preferred option: in a large file set with a mixture of large files, PARs, and other small files (.info, .sfv, .jpg, .png, etc.),
could there be an option so that only large files and PARs can be the reason for failing the whole file set download? (A rough sketch of what I mean follows below.)
As you said, you're going to download the whole of a small file anyway before you can know whether it is a dup or not,
and I think you can't easily stop the download from the server once a file has been identified as a dup.
So saving bandwidth isn't possible, and nowadays it is a lot less important for the smaller files anyway - as long as they don't amplify the failure rate
and increase the bandwidth usage out of proportion.
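Again, a made-up Python sketch of the decision I'm imagining, with an arbitrary size threshold - nothing here reflects the actual code:

    # Sketch of "only large files and PARs can fail the set" (threshold
    # and names are invented, purely to show the decision I'm picturing).

    SMALL_EXTS = {"info", "nfo", "sfv", "jpg", "png"}
    LARGE_FILE_MIN_BYTES = 50 * 1024 * 1024   # arbitrary 50 MB cut-off

    def dup_can_fail_set(filename: str, size_bytes: int) -> bool:
        """Should a dup hit on this file fail the whole file set?"""
        ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
        if ext == "par2":
            return True        # PAR files still count against the set
        if ext in SMALL_EXTS or size_bytes < LARGE_FILE_MIN_BYTES:
            return False       # small boiler-plate files never fail the set
        return True            # large payload files count

    print(dup_can_fail_set("release.info", 2_048))            # False
    print(dup_can_fail_set("movie.part01.rar", 400_000_000))  # True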
Anyway, thanks for reading. If you could please look at Issue 1, that would make it easier for me to deal with Issue 2, even if you can't fix them both.
Regards,
Stavros