Monday, May 12, 2014

How Masks Work

Cappuccino makes use of a masking system to make the objects easy to define and customize, but how does it work?  Understanding how a masking system operates helps us understand exactly what is happening as we tell Cappuccino what it is that we want.  Understanding this, makes the code less foreign and more memorable.

Masks are a programming concept that extends to any programming language.  Almost every programming language has two methods for comparing values with OR and AND.  One method is the "Logical" method, and the other is the "Bitwise" method.  Beginning programmers often use these methods interchangeably, and for the beginning programmer, this is acceptable, because the result is the same for the kinds of comparisons that that programmer is working with.  However, advanced programmers are able to use the important differences in how these operators function to store and extract information in bit-strings.  We will learn more about that here.

If you looked at the constants such as the Window Masks, you will notice that they are defined in the following manner:

   ConstVariable = 1 << Offset

The binary left shift operator is used to "flip" the 0 to a 1 at a specific location in a binary string.  Let's look at a few examples.

The Binary representation of 0 is 0, 00, 0000, 0000000 and so on.  We can add any number of 0's in front of a binary number without changing its value.

Similarly, the Binary representation of 1 is 1, 01, 001, 0001.  Because we can add any number of zero's in front of a binary number without changing its value, we will just assume they are there in this discussion and write all numbers with these leading zeros omitted.

This may seem like a silly thing to think about, but it is important, because the left shift operator moves values into these unseen spaces.  So binary 1, when left shifted one space, becomes 10.  That is "1 << 1" = "10" (binary one-zero, not decimal ten).  Also, this is important, because the bitwise-operators OR and AND don't compare values outside of their offset.  So bitwise-or of value 1 and value 2 results in a new value which is the "OR" result of the values at offset 0, the values at offset 1, the values at offset 2 and so on; as apposed to the logical-or, which looks at the value as a whole and results in true or binary 1 if any offset in either operator is set to 1.

By assigning different offsets to different meanings, we are able to create flags that indicate whether the value is activated (1) or not activated (0).

For example, if we wanted to say that offset 0 was "fat" and offset 1 was "tall" and offset 2 was "dark hair", then we could describe people as being:
  • "not fat, not tall, and having dark hair", or 100
  • "fat, not tall, and not having dark hair", or 1
  • "fat, tall, and having dark hair", or 111
  • or any combination of these values.
Using bitwise "OR", we keep all "active" bits and discard all inactive bits.  Bitwise-or combining fat (1) with dark-hair (100) creates a bit sequence of (101), which we would interpret to mean fat and dark hair.  

                               0       0        1
                       or     1       0        0
                       ------------------------
                               1       0        1
    (each bit offset is compared on its own, independent of the results from other bit offsets)

This is how combining masks work.  The "mask" is an active bit at a specific binary offset.  By using Bitwise-or, "|", we are able to join it with other masks to define that many different aspects are true within a single value.

For the programmer to extract information out of the resulting value, it is as simple as using the bitwise-and, "&", operator to find all locations where the bits are activated.  For example, let us take the "fat and dark hair" mask that we created above: 101.  If we use an "&" operator of this value with the dark hair value "100", then the resulting value looks at each offset independent of the others and produces "100", which we know is equal to the mask for "dark hair".  The 1 at offset 0 dissappears, because the bitwise-and compares that 1 bit to the 0 bit at offset 0 in the mask that it is being combined with: 1 and 0 is 0.  The 1 at offset 3 is kept, because it is compared with the 1 at offset 3 in the mask that it is being combined with: 1 and 1 is 1.

                                 1       0        1  (MysteryMask)
                       and     1       0        0  (KnownMask)
                       ------------------------
                                 1       0        0   (Result)
                    //this result indicates that KnownMask is contained within MysteryMask
    (each bit offset is compared on its own, independent of the results from other bit offsets)

                                 1       0        1  (MysteryMask)
                       and     0      1        0  (KnownMask)
                       ------------------------
                                 0       0        0   (Result)
                    //this result indicates that KnownMask is NOT contained within MysteryMask
    (each bit offset is compared on its own, independent of the results from other bit offsets)

The truth of the matter is that if the MysteryMask can be bitwise-&'ed to the KnownMask, and the result is the KnownMask, then we know that the KnownMask is contained within the MysteryMask. In practice, some programmers tend to be more liberal and simply checks that the result evaluates to a true statement, so it is not uncommon to see the following:

if( USERMASK & DARK_HAIR ){
 // actions to do if dark hair is found
}
if( USERMASK & FAT ){
 //actions to do if user mask indicates that the fat flag should be used
}
if( USERMASK & TALL){
 //actions to do if user mask indicates that tall flag should be used.
}

This simplified practice is fine, if all masks being tested are defined by a single bit, but if we expand the practice to test for flags that are combinations of other flags, we can run into problems by skipping the equality check.

Let us define FAT_N_TALL as FAT | TALL or 11.

If we have Bob who is Fat and Tall with Dark Hair (111), and Jim who is Fat and not Tall with Dark Hair (101), then we can test them for the quality "Fat and Tall" by the following:

BOB                   111
FAT_N_TALL  011
Bitwise-And       &&&
RESULT            011   //True AND FAT_N_TALL == RESULT

JIM                      101
FAT_N_TALL   011
Bitwise-And        &&&
RESULT             001  //True BUT FAT_N_TALL !== RESULT

In this case, the value 001 would evaluate as true in every programming language, but the test of whether the flag FAT_N_TALL is contained within the flags for JIM should actually evaluate to False.

What we have learned about masks is the following:
  • If properly defined, Masks can be combined into single values and make code very versatile and more communicative with very little additional processing.
  • If you are implementing Masks of your own definition, you should not use the shortcut: if(MASK & FLAG){ //action}. Instead use the much less bug prone: if(MASK & FLAG === FLAG){//action}
Not addressed, but very important to note, if you are implementing your own flag system in a program you are writing, make absolutely certain that the Flag offset is only ever used for one meaning.  As long as there is consistency in your system and the flag groups are not cross combined, you will have no problems.  When offset 23 indicates "red hair" in one masking system and offset 23 means "studies english" in a second masking system, then you may run into problems when a person attempts to use your masks to describe a person who has "red hair" and who also "studies english".  This often happens when the flags are part of masks that are never meant by the designer to be combined, but fails to consider the possibility that the user doesn't know that restriction.  In this case, the value 1<<23 would be the result of this combination, which is exactly the same as the original flags and says nothing new about the subject.

No comments:

Post a Comment