I am doing a delta load in PostgreSQL 9.2 by inserting only new rows and updating only changed ones, using the created_at and updated_at fields. Somebody in my company told me that using MD5 or another hash is a faster way to detect updates on large tables. How would I use a hash for that?
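One common way to use a hash for this is to fingerprint each source row's columns and compare the fingerprint with the one stored for that row in the target, so only rows whose hash changed get updated. The sketch below is hedged and hypothetical: source_rows and target_hashes are stand-ins for the real tables, not anyone's actual schema.

    require 'digest'

    # Hypothetical stand-ins for the source table and the hashes already
    # stored alongside the target rows (nil => row not in the target yet).
    source_rows = [
      { id: 1, name: "alice", email: "alice@example.com" },
      { id: 2, name: "bob",   email: "bob@example.com"   }
    ]
    target_hashes = { 1 => "0" * 32, 2 => nil }

    source_rows.each do |row|
      # Fingerprint every column except the key, in a fixed order.
      fingerprint = Digest::MD5.hexdigest(row.reject { |k, _| k == :id }.values.join("|"))

      if target_hashes[row[:id]].nil?
        puts "INSERT row #{row[:id]}"
      elsif target_hashes[row[:id]] != fingerprint
        puts "UPDATE row #{row[:id]} (hash changed)"
      end
    end

Inside PostgreSQL itself the same fingerprint can be computed with the built-in md5() function over the concatenated column text and stored in an extra column on the target.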
I would like to use hash codes on my webpage that are more compact than MD5 and SHA-1, because I want to use them as keys in a JSON hash table.
Or equivalently: how can I convert a hexadecimal MD5 hash into a higher-base number system? The higher the better, as long as the resulting strings can still be used as keys in a JSON hash. For example, instead of the hexadecimal MD5 hash
"684591beaa2c8e2438be48524f555141" I would prefer "668e15r60463kya64xq7umloh", which is the same value written in base 36.
I made the calculation in Ruby, because it handles the big integer value of the hexadecimal MD5 hash (138600936100279876740703998180777611585) natively.
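The snippet itself isn't included above, but the conversion presumably comes down to Ruby's built-in integer parsing and re-encoding; a rough sketch:

    hex_md5 = "684591beaa2c8e2438be48524f555141"

    big = hex_md5.to_i(16)   # the big integer behind the hex digest
    puts big                 # the decimal value quoted above
    puts big.to_s(36)        # the shorter base-36 key, e.g. "668e15r60463kya64xq7umloh"

Integer#to_s accepts bases 2 through 36, so base 36 is as high as the built-in conversion goes; anything denser would need a custom alphabet.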
I've been looking for a way to create a .bat file that generates the MD5 checksum of a file.
I've tried fciv and a few others, but they all generate a file with additional information such as the path and file name. I just need the MD5, nothing else.
Can anyone point me in the right direction?
This command line
fciv new.xml -md5 -r -xml new.xml.md5
creates a file with the following contents:
<?xml version="1.0" encoding="utf-8"?><FCIV> <FILE_ENTRY><name>new.xml</name><MD5>OuX4jSQyl91+M1fUQZeGtw==</MD5></FILE_ENTRY></FCIV>
I just need the MD5 checksum.
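For what it's worth, the <MD5> value in that XML is the digest Base64-encoded rather than in hex. If calling a small script from the .bat file is acceptable, a minimal Ruby sketch (assuming Ruby is installed) that prints nothing but the hex digest:

    require 'digest'

    # Usage from a .bat file:  ruby md5only.rb new.xml > new.xml.md5
    # Prints only the hex MD5 digest of the file given on the command line.
    puts Digest::MD5.file(ARGV[0]).hexdigest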
For a single file (not a directory), the results of "md5sum filename" and "tar c filename | md5sum" are different. What's the reason? I know the second command is aimed at directories, but it still works for a single file.
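The short reason is that tar does not emit just the file's bytes: the stream also contains a header block (name, mode, timestamps, owner) plus zero padding, so md5sum is hashing a different byte sequence in the two cases. A small Ruby sketch that reproduces the observation, assuming a file named filename exists in the current directory:

    require 'digest'

    # Hash the raw file bytes (what "md5sum filename" sees).
    puts Digest::MD5.file("filename").hexdigest

    # Hash the tar stream (what "tar c filename | md5sum" sees); the archive
    # header and padding make this a different byte sequence.
    puts Digest::MD5.hexdigest(`tar cf - filename`)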
Worst case, I have 180 million values in a cache (15-minute window before they go stale), and MD5 has 2^128 possible values. What is my probability of a collision? Or better yet, is there a web page somewhere that answers that question, or gives a rough estimate? That would rock, so I know my chances.
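A back-of-the-envelope sketch using the standard birthday-problem approximation (for small probabilities, 1 - exp(-n(n-1)/(2d)) is effectively n(n-1)/(2d)):

    n = 180_000_000   # values held in the 15-minute cache window
    d = 2.0 ** 128    # number of possible MD5 values

    # Birthday-problem estimate of the chance of at least one collision.
    p_collision = n * (n - 1.0) / (2 * d)
    puts p_collision   # on the order of 5e-23

With these numbers the chance of any collision within one window is vanishingly small.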