Short hash explained
When SnowDDL compares desired config vs. actual metadata in Snowflake, it normally checks every property of every object.
But in some cases it is not practical due to:
lack of all object properties in output of
SHOWandDESCcommands;a large number of properties and frequent changes in format of such properties;
SnowDDL implements a simplified method to compare objects of the following types:
EXTERNAL_FUNCTIONEXTERNAL_TABLEFILE_FORMATFUNCTIONPIPEPROCEDURESTAGETASK
How does short hash work
When object is created for the first time, SnowDDL builds
CREATE OR REPLACE <object>SQL and executes it.Right after that SnowDDL calculates a short hash based on SQL text and stores it at the end of
commentof created object. The originalcommentis preserved, but it is now a bit longer.Later on, when existing object is "compared" by SnowDDL, it builds
CREATE <object>SQL again, calculates a new short hash and tries to match it with existing short hash stored in objectcommentearlier. Unlike other properties, allSHOWcommands always returncomment.If calculated short hash from SQL and existing short hash from
commentare the same, SnowDDL assumes that object remains the same and returnsNOCHANGEresult.If calculated short hash from SQL and existing short hash are different, SnowDDL assumes that at least one property was changed, and object is fully re-created.
How short hash is calculated
import base64
import hashlib
def _short_hash(self):
sha1_digest = sha1(str(self).encode('UTF-8')).digest()
return f"#{urlsafe_b64encode(sha1_digest[:12])}"Short hash is the first 12 bytes (out of 20) from SHA1, additionally encoded with url-safe base64 encode. It is a good balance between length and quality of hash.
The collisions are possible, but are extremely unlikely in real-world scenarios.
If Snowflake makes SHOW commands better in future, it will be possible to replace short hashes with full property checks, but it is not practical at this moment.
Last updated