SFV (Simple File Verification) files are used to make sure that files are:
- not corrupt
- not missing
For example, with RARs it's important that we have all the parts and that none of the RAR files is corrupt… If one is, RAR will tell us (if it's a new RAR, that is). However, to find out that RAR file 020 is corrupt we need the previous 19 files in order to test it… SFV allows us to test any file at any time without needing the other files.
Furthermore, with MP3s we might never notice a missing file (after all, the other songs will play all right), but thanks to SFV it's easy to spot which file is missing.
Point your SFV checker at the SFV file on which to base the check. Some SFV checkers will then ask which files you want to test. Usually we say all files, but sometimes there is a need to test just one (e.g. after testing all files we see that one file was screwed; after redownloading it we just want to check that single file for corruption, not the whole release).
Then somewhere you click a next, process or check button, which checks all the files for you and at the end reports:
- which file is ok
- which file needs to be redownloaded
- which file is missing
Based on that you know what to do next…
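What a checker does here can be sketched in a few lines of Python. This is only an illustration, not how any particular SFV checker is written: the function names `crc32_of_file` and `check_sfv` are made up for this example, and it assumes every non-comment line is a file name followed by its CRC32 in hex.

```python
import zlib
from pathlib import Path


def crc32_of_file(path, chunk_size=1 << 16):
    """Return the CRC32 of a file as an unsigned 32-bit integer."""
    crc = 0
    with open(path, "rb") as handle:
        # Read in chunks so a 15 MB part never sits in memory at once.
        while chunk := handle.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def check_sfv(sfv_path):
    """Map every file listed in the SFV to 'ok', 'bad' or 'missing'."""
    sfv_path = Path(sfv_path)
    report = {}
    for line in sfv_path.read_text(encoding="latin-1").splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # blank line or comment
        # The checksum is the last token, so names with spaces still parse.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue  # not a "name checksum" pair
        name, expected = parts
        target = sfv_path.parent / name
        if not target.is_file():
            report[name] = "missing"
        elif crc32_of_file(target) == int(expected, 16):
            report[name] = "ok"
        else:
            report[name] = "bad"  # this one needs to be redownloaded
    return report
```

Calling `check_sfv("release.sfv")` (a placeholder file name) gives a dictionary you can print or filter to see only the files that need attention.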
PS: Most people use SFV checking on their FTPs, so it may well be that you have downloaded hundreds of files without ever needing to run an SFV check yourself…
Many release packagers will create an SFV for you (e.g. Morgoth's MP3Releaser); however, most SFV checkers should be able to create SFV files as well (since it is the same algorithm).
Usually you head to the create/generate SFV section, mark the files which are to be included in the SFV (all the rars/nfo/mp3s) and click a button to generate.
The program may ask you for the file name of the SFV file…
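Since creating is the same algorithm as checking, a generator is short. A minimal sketch, assuming the plain "name checksum" layout; `make_sfv` and its default banner are hypothetical names for this example, and each file is read whole, which is fine for release-sized parts but not for huge files.

```python
import zlib
from pathlib import Path


def make_sfv(sfv_path, files, banner="Generated by a hypothetical sketch"):
    """Write an SFV listing the CRC32 of each file; return the lines."""
    lines = ["; " + banner]
    for path in map(Path, files):
        crc = zlib.crc32(path.read_bytes()) & 0xFFFFFFFF
        # File name, one space, eight upper-case hex digits.
        lines.append(f"{path.name} {crc:08X}")
    Path(sfv_path).write_text("\n".join(lines) + "\n", encoding="latin-1")
    return lines
```

Only the bare file name is written, so the SFV is expected to live in the same directory as the files it lists.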
CRC32 stands for an algorithm: Cyclic Redundancy Check, and the 32 is the number of bits in the final result. The algorithm is a sped-up way of finding the remainder of a division…
Say we have the number 123 and we divide it by 5… the answer is 24, remainder 3.
When we divide by 5, we know the remainder is between 0 and 4.
Say we have the number 12342878729874120987 and we divide it by 7; the remainder is 6.
Now let's assume the contents of a file (a file is basically bits and bytes, which are just numbers) are the number we want to divide… and we remember just the remainder.
The remainder will be your CRC number.
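The remainder idea can be tried out directly. The sketch below follows the analogy in the text and uses ordinary integer division; a real CRC divides polynomials with XOR arithmetic instead, so this is not a CRC, just the same "keep only the remainder" trick. `remainder_of_bytes` is a name made up for this example.

```python
def remainder_of_bytes(data: bytes, divisor: int) -> int:
    """Treat the bytes as one huge number and return number % divisor."""
    remainder = 0
    for byte in data:
        # Long division as done on paper: bring down the next "digit"
        # (a byte is a base-256 digit) and reduce straight away, so we
        # never have to hold the huge number itself.
        remainder = (remainder * 256 + byte) % divisor
    return remainder


print(divmod(123, 5))              # (24, 3): quotient 24, remainder 3
print(12342878729874120987 % 7)    # 6
print(remainder_of_bytes(b"any file contents", 7))
```

Note how the loop only ever keeps a small running remainder, which is why the size of the file does not make the job impossible.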
But you will say: "It's impossible to divide a number with 15,000,000 digits that fast, otherwise we would be cracking all those credit card numbers…" Well, we are not really dividing, we are just finding the remainder… and after all we don't call CRC a remainder algorithm but a Cyclic Redundancy Check… This has to do with how simply the CRC can be calculated (simple being a relative term): it's just a small set of XOR gates and left shift operations.
That's why the CRC check is so popular. It can easily be implemented in hardware…
That's why RAR, ZIP, ARJ and Ethernet all use it… ;D
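To show how little is involved, here is a bit-by-bit CRC32 in Python using nothing but XOR and shifts. It is a sketch of the common "reflected" variant (the one zlib and ZIP use), which shifts right rather than left; the name `crc32_bitwise` is made up for this example, and real implementations use lookup tables for speed.

```python
import zlib

# Reflected form of the standard CRC-32 polynomial 0x04C11DB7.
POLY = 0xEDB88320


def crc32_bitwise(data: bytes) -> int:
    """CRC32 computed one bit at a time with XOR and shift only."""
    crc = 0xFFFFFFFF                  # conventional starting value
    for byte in data:
        crc ^= byte                   # feed in the next 8 bits
        for _ in range(8):
            if crc & 1:               # low bit set: "subtract" the divisor
                crc = (crc >> 1) ^ POLY
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF           # conventional final inversion


data = b"123456789"
print(f"{crc32_bitwise(data):08X}")   # CBF43926
print(crc32_bitwise(data) == zlib.crc32(data))
```

The value CBF43926 for the string "123456789" is the usual check value for this CRC variant, and the last line compares the sketch against Python's built-in `zlib.crc32`.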
Ahh… the age-old question… Yes, there are better algorithms…
You see, CRC32 generates 32 bits… and 32 bits are not enough to give every possible 15 MB file its own unique code.
You see, 32 bits gives us 4,294,967,296 combinations, while a 15 MB file (15,000,000 bytes, i.e. 120,000,000 bits) has 2^120,000,000 of them (that's a number with roughly 36 million digits).
As you can see, there are far more different 15 MB files than there are CRC32 codes, so there is a possibility that CRC32 will generate the same code for two completely different files.
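The sizes involved are easy to check. A small sketch, taking 15 MB to mean 15,000,000 bytes (an assumption; with 15 x 1,048,576 bytes the count is a little larger still). The helper name `decimal_digits_of_power_of_two` is made up for this example; it counts digits with a logarithm instead of building the giant number.

```python
import math


def decimal_digits_of_power_of_two(exponent: int) -> int:
    """Number of decimal digits in 2**exponent, without computing it."""
    return math.floor(exponent * math.log10(2)) + 1


crc_codes = 2 ** 32                   # every possible CRC32 value
file_bits = 15_000_000 * 8            # bits in a 15,000,000-byte file

print(crc_codes)                                   # 4294967296
print(decimal_digits_of_power_of_two(32))          # 10
print(decimal_digits_of_power_of_two(file_bits))   # 36123600
```

So the number of possible files has about 36 million digits, against a 10-digit number of possible codes; on average an astronomical number of different files share each code.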
However, this is not really a problem. Usually when files get corrupted only a little changes (a few bits/bytes), which is not enough for the corrupted file to still produce the same code… Another thing that saves us is that we generate SFV checksums for relatively small files (15 MB), not for complete ISOs (700 MB)…
But as I said, better algorithms do exist… For example, large files (ironically, Linux ISOs) are verified with the MD5 algorithm rather than SFV… MD5 uses 128 bits and has better avalanching than CRC32 (this means even a single bit of difference has a significant impact on the output). Why is MD5 not used? SFV became popular first; MD5 is catching on, but is not yet as widely used as SFV.
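The avalanche effect can be observed by flipping one bit of the input and counting how many output bits change. A rough sketch using Python's standard `zlib` and `hashlib` modules; the helper names `differing_bits` and `compare`, and the sample data, are made up for this example.

```python
import hashlib
import zlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length values differ."""
    return bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")


def compare(original: bytes, damaged: bytes):
    """Return (changed CRC32 bits, changed MD5 bits) for the two inputs."""
    crc = [zlib.crc32(x).to_bytes(4, "big") for x in (original, damaged)]
    md5 = [hashlib.md5(x).digest() for x in (original, damaged)]
    return differing_bits(*crc), differing_bits(*md5)


original = b"pretend this is filename.rar"
damaged = bytes([original[0] ^ 0x01]) + original[1:]   # flip a single bit

crc_bits, md5_bits = compare(original, damaged)
print(f"CRC32: {crc_bits} of 32 bits changed")
print(f"MD5:   {md5_bits} of 128 bits changed")
```

Both codes change, so both catch the damage. The difference is that with CRC32 the set of output bits that flip depends only on where the damaged bit sits, not on the rest of the file, whereas with MD5 the changed bits look unrelated from one file to the next.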
Further reading recommended:
- RFC 1320 is about the MD4 Message-Digest Algorithm.
- RFC 1321 is about the MD5 one.
Also you might want to head to your local bookstore and pick up some compression books (CRC checks are used a lot in compression) and networking books (again, CRC is popular). Books on discrete math might also be a good help (data integrity issues, data mapping spread, error detection, automatic error correction). The local university/college is always a plus… ;D
Here's an example of an SFV that is BAD to use with win-sfv32.exe (it does work, however, on Microsoft and Linux server systems and with hoopy's pd-sfv (pdsfv.isonews.com); use gltpad hoopy iso toolkit or a comparable checker like PSi's asfv):
; Generated by HoopySFV
filename.r00 C46D96FF
filename.r01 1C03CA69
filename.rar 12550B24
Here's one that is GOOD to use with win-sfv32.exe
; Generated by SFV32 v1.0a
filename.r00 C46D96FF
filename.r01 1C03CA69
filename.rar 12550B24

; Generated by WIN-SFV32 v1.1a on 1999-09-27 at 04:55.56
filename.r00 C46D96FF
filename.r01 1C03CA69
filename.rar 12550B24
More info on this can be read at pdsfv.isonews.com. Basically, the WinSFV author deliberately made his SFV checker behave like this.
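If you are stuck with an SFV that win-sfv32.exe refuses, one workaround is to swap its first line for one that WinSFV is known to accept. The sketch below assumes, as the samples above suggest, that only the first line matters; the exact rule WinSFV applies is not documented here, so treat this as a guess. The header is copied verbatim from the GOOD sample, and `with_winsfv_header` is a name made up for this example.

```python
# First line taken verbatim from the sample that works with win-sfv32.exe.
GOOD_HEADER = "; Generated by WIN-SFV32 v1.1a on 1999-09-27 at 04:55.56"


def with_winsfv_header(sfv_text: str, header: str = GOOD_HEADER) -> str:
    """Return the SFV text with a WinSFV-style banner as its first line."""
    lines = sfv_text.splitlines()
    if lines and lines[0].startswith("; Generated by"):
        lines[0] = header          # replace the other tool's banner
    else:
        lines.insert(0, header)    # no banner at all: add one on top
    return "\n".join(lines) + "\n"


bad = "; Generated by HoopySFV\nfilename.r00 C46D96FF\n"
print(with_winsfv_header(bad))
```

The checksum lines are left untouched, so the result verifies exactly the same files as before.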
If you look into an SFV file (see the sample) you will notice that there are lines starting with a semicolon. The semicolon marks the line as a comment: it is NOT part of the checksum list.
Whatever is a comment can usually be deleted (but check the First Line problem first).
However, there are two camps on what the SFV comments should contain.
The first camp puts its own info there, making the SFV look like:
; Generated by WIN-SFV32 v1.1a on 1999-09-27 at 04:55.56
filename.r00 C46D96FF
filename.r01 1C03CA69
filename.rar 12550B24
; My Release of Filename..
; Disks: 3
;
; this filename release is what your mommy warned you not to get
; get it if you only have the nerve...
Some SFV checkers (usually ftpd-based) have options to remove such comments as well as to put our own comments in instead… ;D
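Removing and replacing comments is plain text filtering. A minimal sketch, not taken from any real checker: `replace_comments` is a hypothetical name, and by default it keeps the first line untouched because of the First Line problem.

```python
def replace_comments(sfv_text, new_comments=(), keep_first_line=True):
    """Drop all ';' comment lines and append our own comments instead."""
    lines = sfv_text.splitlines()
    kept = []
    if keep_first_line and lines and lines[0].startswith(";"):
        kept.append(lines[0])      # leave the generator banner alone
        lines = lines[1:]
    # Keep only the "name checksum" entries.
    entries = [line for line in lines
               if line.strip() and not line.lstrip().startswith(";")]
    comments = ["; " + text if text else ";" for text in new_comments]
    return "\n".join(kept + entries + comments) + "\n"


sample = (
    "; Generated by WIN-SFV32 v1.1a on 1999-09-27 at 04:55.56\n"
    "filename.r00 C46D96FF\n"
    "; My Release of Filename..\n"
    "; Disks: 3\n"
)
print(replace_comments(sample, ["checked on upload"]))
```

Since comments are never part of the checksum list, the rewritten SFV still verifies the same files.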
The other camp says we should include more information about the files just in case… so we end up with:
; Generated by WIN-SFV32 v1.1a on 1999-09-27 at 04:55.56
; 15000000 06:43.08 1911-11-02 filename.r00
; 5123401 06:44.24 1984-11-02 filename.r01
; 15000000 06:45.40 2001-11-02 filename.rar
filename.r00 C46D96FF
filename.r01 1C03CA69
filename.rar 12550B24
Some SFV checkers can use this additional information to speed up checking (i.e. what's the point of checksumming a 14 MB file when we need a 15 MB file?).
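A checker could use those size comments as a cheap first pass before doing any CRC work. The sketch below assumes the comment layout shown in the sample (size, time, date, file name); the function names `sizes_from_comments` and `size_mismatches` are made up for this example.

```python
import os


def sizes_from_comments(sfv_text):
    """Read '; size time date name' comment lines into {name: size}."""
    sizes = {}
    for line in sfv_text.splitlines():
        if not line.startswith(";"):
            continue
        # Split into at most four fields so names may contain spaces.
        fields = line[1:].split(None, 3)
        if len(fields) == 4 and fields[0].isdigit():
            size, _time, _date, name = fields
            sizes[name] = int(size)
    return sizes


def size_mismatches(sfv_text, directory):
    """List files that exist but have the wrong size (no CRC needed)."""
    wrong = []
    for name, size in sizes_from_comments(sfv_text).items():
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getsize(path) != size:
            wrong.append(name)
    return wrong
```

Files caught by `size_mismatches` are already known to be bad, so only the remaining ones need the slower CRC32 pass.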
Some ftpds (the better ones) support checking files against SFV files on upload. This way, as files are being uploaded, the ftpd checks for us whether the upload is complete. Once the upload is complete we might not want to allow any further uploads into that directory… When a file is corrupt we might NOT want to give credits to the user (assuming the user is on ratio), and we might want to notify the uploader to reupload…
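The bookkeeping behind such a feature might look roughly like this. It is purely illustrative and not the API of glftpd or any other ftpd: the class `ReleaseDir`, its methods and the returned status strings are all made up for this example.

```python
import zlib


class ReleaseDir:
    """Tracks one upload directory against the SFV uploaded into it."""

    def __init__(self, sfv_text: str):
        self.expected = {}             # file name -> expected CRC32
        for line in sfv_text.splitlines():
            line = line.strip()
            if not line or line.startswith(";"):
                continue               # blank line or comment
            parts = line.rsplit(None, 1)
            if len(parts) == 2:
                self.expected[parts[0]] = int(parts[1], 16)
        self.verified = set()          # names that arrived intact

    def on_upload(self, name: str, data: bytes) -> str:
        """Check one finished upload; return 'ok', 'corrupt' or 'not in sfv'."""
        if name not in self.expected:
            return "not in sfv"
        if (zlib.crc32(data) & 0xFFFFFFFF) != self.expected[name]:
            self.verified.discard(name)
            return "corrupt"           # no credits; ask for a reupload
        self.verified.add(name)
        return "ok"

    def is_complete(self) -> bool:
        """True once every file listed in the SFV has been verified."""
        return self.verified == set(self.expected)
```

A site script could give credits only on "ok" and close the directory for further uploads once `is_complete()` returns True.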
So which ftpds do it? As of this writing the most popular Windows-based ftpds are gene6 and warftpd 1.6. Both support CRC/SFV checking through addons, which can be found on the net with a bit of dedication.
glftpd has the best support for scene features (sfv/sitebot/stats/nuking/etc.); however, you need to run Linux in order to run glftpd.
For more on this please wait for the ftpds faq to be published.
Q: How do I uncompress a .sfv file?
A: You don't. SFV is merely used for verification of other files.
Q: WinSFV error: "Unable to identify the type of the input file"?
A: It's the famous First Line problem.
Q: Is WinSFV the best? What is better?
A: Not everything that has 'win' in front of its name is really good… Sure, WinAmp and WinCMD are good, but WinZip, for example, is not considered good by many. Similarly, WinSFV does not share the same glory as many other SFV checkers… There are checkers optimized for speed (QuickSFV and FastSFV), checkers for ftpds which generate the CRC32 code on upload (glftpd's flysfv), checkers that do single files only (glftpd's sfv_check), checkers designed to handle complete releases nicely (pdsfv, asfv), and many others… The only recommendation I can give is to get a few of them, try them out and see which one you like best. You will like some for speed, others for interface, others for features… PS: I must warn you that WinSFV is famous for the First Line problem.
Q: Are SFV files only a pirate's tool?
A: This is a common misconception. Many people assume that if something is used by pirates it has no other use… MP3s are for pirates; no one cares that it's a great compression algorithm with millions of other uses… Many small radio stations, because backups are expensive, MP3 their shows first for future reference… Similarly, many consider DivX an illegal format, even though many amateur artists use it because it's a perfectly viable tool…
Similarly, SFV files are just a handy way to ensure that a file transfer went ok. If you are an ISP wondering whether you should ban SFV files because they would promote piracy, you should think again… SFV files should be supported by ISPs to ensure images and HTML files are sent properly… How many times have I seen amateur artists upload 20 MB movies just to be greeted the next day by 200 emails complaining that the files are corrupt… /end rant
Q: But RARs implement CRC32 checks, why sfv files?
A: There are a few issues with RAR… First the old one:
- RAR has a nasty habit of not creating a checksum for each file of a multi-volume archive. This makes it impossible to find out which volume of a set is broken. SFV makes this possible by supplying the "missing" CRC value for each file: if a volume set is broken, simply check it against the SFV's checksum table to find out which files are broken, then obtain those files again to fix the error.
- SFV works faster than RAR and can therefore substitute for RAR when testing whether a complete set is ok.
- You can use SFV for any file, not just RAR… e.g. SFVs are popular in the mp3 scene as well.