Wednesday, June 18, 2008

Checksums in 3ds Max (part 1 of 2)

In my Calling Python from MaxScript post I mentioned the usefulness of checksums in Tech Art work. I was hoping to elaborate on that a bit.

In short, a checksum is a number computed from a larger piece of data. The checksum is (ideally) guaranteed to be unique for that data. Let's say that data is this string: "Tech Art is A-#1 Supar", and the checksum you've computed is "30532". If any character in that string changes, the computed checksum for it will be different, like "18835" or "1335"... basically anything other than "30532".

Checksums are most useful in cases where you need to compare two sets of data to see if they differ, but don't care where or how they differ. If you have a short number that uniquely identifies a huge piece of data, you can compare it to other data sets much faster/easier than comparing every element of the original data. If you're hip to how slow n-squared searches can be (especially in languages like MEL or MaxScript), this is a classic method for avoiding them.

Here's a MaxScript function that takes a string of any length and returns a checksum for it.

------------------------------------------------------------
-- (str)getChecksum (string)val (int)size:256
--
-- Description:
-- Calculates simple checksum value from supplied string (or
-- any value convertable to a string).  Default size is 256,
-- but can be changed with the optional "size" parameter.
------------------------------------------------------------
fn getChecksum val size:256 = (
   if (classof val != String) then (
      try (
         val = val as String
      ) catch (
         return false
      )
   )
   alphaKey = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890 !@#$%^&*()[]\\{};':\",./<>?"
   total = 0
   for i in 1 to val.count do (
      thisVal = findString alphaKey val[i]
      if (thisVal == undefined) then (
         thisVal = 0
      )
      -- Multiply the alphanumeric value by its position in
      -- the input string, add to running total
      total += (thisVal * i)
   )
      -- make sure divisor is smaller than dividend
   while (total < size) do (
      total = total * 2
   )

   -- Return final checksum value
   checksum = mod total size
   return (checksum as string)
)
We used the above function in several of the Saints Row tools, primarily to help remove identical materials in 3ds Max scenes. It's very unscientific, however, and can generate collisions in rare cases (two different input strings that generate the same output checksum) **.

If you don't mind a little more setup, I would recommend an alternate checksum method. Python natively offers more robust checksum tools, and can be set up to be called directly from MaxScript. The steps for doing this, and the actual MD5 checksum function, are covered in my earlier blog, Calling Python from MaxScript.

Start with the Python script from that blog that defines and registers the COM server. Then you're able to use a far-shorter MaxScript function to get checksums:

fn getChecksum val = (
   comObj = createOLEObject "PythonCom.Utilities" 
   checksum = comObj.checksumMD5 val
   return checksum
)
That's it. The checksums you get from this function will create fewer collisions than the pure-MaxScript one above, and can be made to use any alternative method available in Python.

Now you know more about checksums, and how to generate them in 3ds Max. Next time (in Part 2) I'll discuss how you can use them to save memory in both Max and your game engine.