Editors Note, February 14th 2022: This project is now ABANDONED and no longer supported or updated.

I am proud to announce the addition of a set data type to the asherwunk/phabstractic project.

Set Data Type

A set data type can be summed up by Wikipedia:

In computer science, a set is an abstract data type that can store certain values, without any particular order, and no repeated values. It is a computer implementation of the mathematical concept of a finite set. Unlike most other collection types, rather than retrieving a specific element from a set, one typically tests a value for membership in a set.

In many respects, I have tried to implement the most useful parts of the Set data type description by Wikipedia.  Not all methods listed in the article are necessarily available in this implementation.

As well, it is possible to store multiple copies of the same value in the Set data type given the correct configuration (see below) making it a ‘multi-set’:

A generalization of the notion of a set is that of a multiset or bag, which is similar to a set but allows repeated (“equal”) values (duplicates). (Wikipedia)

Basic Sets (Variable Type)

The core of the Set data type is the data private property.  All the following methods act on this data.  This is implemented in PHP as a simple array.  However, rather than just add set elements using $this->data[] = ...;  the Set data type implements the Identity Trait Feature which assigns a unique handle to any added element.  This handle is then assigned as the key in the $data property array.  The handle is returned by the Phabstractic\Data\Type\Set::addReference()  method call, allowing direct reference from outside the class.  Although it is not encouraged to keep track of exact indices in a Set data type, this comes in handy when there are duplicate values and you want to remove only one of them.  Following are the abridged class definition, constructor, and adding methods:

<?php

namespace Phabstractic\Data\Types
{
        
    use Phabstractic\Features;
    use Phabstractic\Features\Resource as FeaturesResource;
    use Phabstractic\Data\Types\Exception as TypesException;
    use Phabstractic\Data\Types\Resource as TypesResource;
    use Phabstractic\Resource as PhabstracticResource;
    
    /**
     * @version 3.0.1
     * 
     */
    class Set implements
        TypesResource\SetInterface,
        FeaturesResource\ConfigurationInterface
    {
        use Features\ConfigurationTrait;
        use Features\IdentityTrait;
        
        protected $data = array();
        
        // ...
        
        /**
         * Options -
         * 
         *   'unique' => bool, do we check the incoming data for uniqueness?
         *                     If data is not unique an exception is raised.
         *   'strict' => bool, do we raise exceptions when the set is used
         *                     improperly, such as removing a value that doesn't
         *                     exist?
         *   'reference' => bool, do we add in constructor by reference?
         *                     defaults to true
         */
        public function __construct(
            array $values = array(),
            $options = array()
        ) {
            // version 2.1 of ConfigurationTrait handles configuraiton formatting
            $this->configure($options);
            if (!isset($this->conf->reference)) {
                $this->conf->reference = true;
            }
            
            if (!isset($this->conf->unique)) {
                $this->conf->unique = false;
            }
            
            if ($this->conf->unique) {
                $this->checkUniqueValues($values);
                // more object compatible as opposed to array_unique SORT_REGULAR
                PhabstracticResource\ArrayUtilities::returnUniqueByReference($values);
            }
            
            // version 3 uses psuedo-FQN
            $this->identityPrefix = 'Phabstractic\\Data\\Types\\Set::Element';
            
            // v3: we should generate identifiers for the elements even in construction
            foreach ($values as $key => &$value) {
                if ($this->conf->reference) {
                    $this->addReference($values[$key]);
                } else {
                    $this->add($values[$key]);
                }
            }
            
        }
        
        // ...

        private function checkUniqueValues(array $values = array())
        {
            $check = array();
            foreach ( $values as $value ) {
                if (in_array($value, $check, true)) {
                    /* it might seem prudent to use conf->strict, but we're
                       we might want to run this function if the conf->unique
                       option is set, but strict is not */
                    if ($this->conf->unique) {
                       throw new TypesException\RangeException(
                           'Phabstractic\\Data\\Types\\Set->checkUniqueValues: ' .
                           'Given Value Not Unique' );
                    } else {
                        return false;
                    }
                    
                } else {
                    $check[] = $value;
                }
                
            }
            
            return true;
        }
        
        public function add($value)
        {
            if ($this->conf->unique) {
                $this->checkUniqueValues(array_merge($this->data, array($value)));
            }
            
            $identity = $this->getNewIdentity();
            $this->data[$identity] = $value;
            
            if ($this->conf->unique) {
                PhabstracticResource\ArrayUtilities::returnUniqueByReference($this->data);
            }
            
            return $identity;
        }
        
        public function addReference(&$value)
        {
            if ($this->conf->unique) {
                $this->checkUniqueValues(array_merge($this->data, array($value)));
            }
            
            $identity = $this->getNewIdentity();
            $this->data[$identity] = &$value;
            
            if ($this->conf->unique) {
                PhabstracticResource\ArrayUtilities::returnUniqueByReference($this->data);
            }
            
            return $identity;
        }
        
        // ...

}

Notice in Phabstractic\Data\Types\Set::checkUniqueValues()  that we don’t use ArrayUtilities, but array_unique() .  Future versions may incorporate ArrayUtilities::elementComparison() .  We also only call the uniqueness check if the option ‘unique’ is set.  You won’t get a Phabstractic\Data\Types\Exception\RangeException  on a pre-existing element value unless ‘unique’ is set.

Mutable Capabilities of the Set

In addition to Phabstractic\Data\Types\Set::add()  and Phabstractic\Data\Types\Set::addReference(), which make the sets mutable to begin with, there are other methods for altering the set as it stands:

public function add($value)
{
    // ...  (see above)
    
    return $identity;
}

public function addReference(&$value)
{
    // ... (see above)
    
    return $identity;
}

public function remove($value)
{
    if ($this->conf->strict) {
        if (!in_array($value, $this->data)) {
            throw new TypesException\RangeException(
                'Phabstractic\\Data\\Types\\Set->remove: Value Not In Set');
        }
    }
    
    
    /* This closure is more compatible with set values as opposed to
       built in comparison */
    $this->data = array_udiff($this->data, array($value),
        array('Phabstractic\\Resource\\ArrayUtilities', 'elementComparison'));
}

public function removeByIdentifier($identity)
{
    if (array_key_exists($identity, $this->data)) {
        unset($this->data[$identity]);
        return true;
    }
    
    return false;
}

Note how ->add and ->addReference return an identifier?  See above:

However, rather than just add set elements using $this->data[] = ...;  the Set data type implements the Identity Trait Feature which assigns a unique handle to any added element.

This is very important when it comes to Phabstractic\Data\Types\Set::removeByIdentifier($identity) .  When you add a value to a set you get back a unique identifier that points to that element in that set object.  This is handy if say, you add a duplicate element and then want to get rid of that element.  That’s where Phabstractic\Data\Types\Set::removeByIdentifier($identity)  comes in.  As you can see in this Unit Test (github), here’s what it would look like in implementation:

/**
 * @depends testReturnPlainArray
 * 
 */
public function testRemoveValueByIdentifier() {
    $set = new Types\Set(array(1,2,3,4,5,6,7));
    
    $set->remove(4);
    
    $this->assertEquals(array(1,2,3,5,6,7), $set->getPlainArray());
    
    $set->add(9);
    $identifier = $set->add(9);
    
    // removes specific identifier element
    
    $set->removeByIdentifier($identifier);
    
    $this->assertEquals(array(1,2,3,5,6,7,9), $set->getPlainArray());
}

Set Operations

If you have a set, it’s likely that you’ll want to do some basic operations with it.  Basic operations for sets, as say in set theory, include such things as a union, an intersection, and a difference.  Other mathematical/algorithmic things you might want to do are map, filter, or reduce.  The Set abstract data type has support for all of these things.  You can find more about core Set operations on wikipedia.

Union, Intersection, and Difference

All operations for our sets are defined in static methods declared on the Set data type class.  Below is the code for the union operation, intersection operation, and difference operation. What’s nice about these operations is that they are variadic functions, and operate on a mixture of arrays, scalars, and Phabstractic\Data\Types\Resource\SetInterface ‘s.

/**
 * Set Union
 * 
 * "returns the union of sets S and T."
 * 
 * ...
 * 
 */
public static function union()
{
    $result = array();
    foreach(func_get_args() as $arg) {
        if ($arg instanceof TypesResource\SetInterface) {
            $result = array_merge($result, $arg->getPlainArray());
        } elseif (is_array($arg)) {
            $result = array_merge($result, $arg);
        } else{
            $result = array_merge($result, array($arg));
        }
        
    }
    
    // More object compatible rather than array_unique SORT_REGULAR
    return PhabstracticResource\ArrayUtilities::returnUnique($result);
}

/**
 * Set Intersection
 * 
 * "returns the intersection of sets S and T."
 * 
 * ...
 * 
 */
public static function intersection()
{
    // This closures more compatible with set values
    
    $result = null;
    foreach (func_get_args() as $arg) {    
        if ($arg instanceof TypesResource\SetInterface) {
            if ($result === null) {
                $result = $arg->getPlainArray();
            } else {
                $result = array_uintersect(
                              $result,
                              $arg->getPlainArray(),
                              array(
                    'Phabstractic\\Resource\\ArrayUtilities', 'elementComparison')
                );
            }
            
        } elseif (is_array($arg)) {
            if ($result === null) {
                $result = $arg;
            } else {
                $result = array_uintersect(
                              $result,
                              $arg,
                              array(
                    'Phabstractic\\Resource\\ArrayUtilities', 'elementComparison')
                );
            }
                
        } else {
            if ($result === null) {
                $result = array($arg);
            } else {
                $result = array_uintersect(
                              $result,
                              array($arg),
                              array(
                    'Phabstractic\\Resource\\ArrayUtilities', 'elementComparison')
                );
            }
            
        }
        
    }
    
    // More object compatible than array_unique SORT_REGULAR
    return PhabstracticResource\ArrayUtilities::returnUnique($result);
}

/**
 * Set Difference
 * 
 * "returns the difference of sets S and T."
 * 
 * ...
 * 
 */
public static function difference()
{
    // This closures are more compatible with set values
    
    $result = null;
    foreach (func_get_args() as $arg) {
        if ($arg instanceof TypesResource\SetInterface) {
            if ($result === null) {
                $result = $arg->getPlainArray();
            } else {
                $result = array_udiff(
                    $result,
                    $arg->getPlainArray(),
                    array('Phabstractic\\Resource\\ArrayUtilities', 'elementComparison')
                );
            }
            
        } elseif (is_array($arg)) {
            if ($result === null) {
                $result = $arg;
            } else {
                $result = array_udiff(
                    $result,
                    $arg,
                    array('Phabstractic\\Resource\\ArrayUtilities', 'elementComparison')
                );
            }
            
        } else {
            if ($result === null) {
                $result = array($arg);
            } else {
                $result = array_udiff(
                    $result,
                    array($arg),
                    array('Phabstractic\\Resource\\ArrayUtilities', 'elementComparison')
                );
            }
            
        }
    }
    
    // More object compatible than array_unique SORT_REGULAR
    return PhabstracticResource\ArrayUtilities::returnUnique($result);
}

You’ll notice of course that the standard measure of comparison between elements in the set has been standardized in the latest version of Set to use Phabstractic\Resource\ArrayUtilities::elementComparison(), a move I hope to continue when revising the other abstract data types as I move forward.  Refer to it’s own blog post, Array Utilities, for more information.

Map, Fold, Filter, and Walk

Sometimes you just want to do some functional programming on your Sets, you know what I mean?  Mapping will apply a function to each element, Folding will reduce a Sets elements, Filter will return only qualifying elements, and Walking is like Mapping and Folding at the same time.  For more information on these additional Set operators, take a gander at Wikipedia.  Well, we have you covered:

/**
 * Apply a function to each element of the set, reducing to a single function
 * 
 * "returns the value A|S| after applying Ai+1 := F(Ai, e) for each
 * element e of S."
 * 
 * ...
 * 
 */
public static function fold($F, TypesResource\SetInterface $S)
{
    return array_reduce($S->getPlainArray(), $F);
}

/**
 * Returns only elements that satisfy a 'predicate'
 * 
 * The predicate here is a function that returns true or false on any
 * given item "returns the subset containing all elements of S that
 * satisfy a given predicate P."
 * 
 * ...
 * 
 */
public static function filter($F, TypesResource\SetInterface $S)
{
    return array_filter($S->getPlainArray(), $F);
}

/**
 * Apply a function to each element of the set
 * 
 * "returns the set of distinct values resulting from applying function
 * F to each element of S."
 * 
 * ...
 * 
 */
public static function map($F, TypesResource\SetInterface $S)
{
    // More object compatible, rather than array_unique SORT_REGULAR
    return PhabstracticResource\ArrayUtilities::returnUnique(
        array_map($F, $S->getPlainArray()));
}

/**
 * Walk (non-recursively) the array
 * 
 * Applies the given function, plus any additional arguments
 * ($args) past the given function argument to each
 * element of the set.
 * 
 * Note: Unlike Map, this operates on a reference of the set
 * 
 * ...
 * 
 */
public static function walk(TypesResource\SetInterface &$S, $F, $D)
{
    return array_walk($S->getArrayReference(), $F, $D);
}

I know, I had to do a lot of heavy lifting there… right? Did you see?  Right.

Individual Set Operations and Comparisons (Subsets)

Sets themselves have operations that are specific to their own instances.  Here’s a rundown:

public function in($value)
{
    return in_array($value, $this->data);
}

public function isEmpty()
{
    return !$this->data;
}

public function size()
{
    return count($this->data);
}

public function iterate()
{
    return new \ArrayIterator($this->data);
}

public function enumerate()
{
    return $this->getPlainArray();
}

public function pop()
{
    return array_pop($this->data);
}

public function clear()
{
    $this->data = array();
}

public function hash()
{
    return md5(implode(',', $this->data));
}

There is one VERY handy static element that compares two given sets.  I didn’t include it in the core Set operations I outlined earlier because I tend to think of this type of operation as acting on specific individual sets.  Here is that code:

/**
 * Subset
 * 
 * "a predicate that tests whether the set S is a subset of set T."
 * 
 * This function tests whether all elements of S are in T
 * 
 * ...
 * 
 */
public static function subset(
    TypesResource\SetInterface $S,
    TypesResource\SetInterface $T
) {
    // This is more compatible value wise (objects)
    $sa = $T->getPlainArray();
    $ta = $S->getPlainArray();
    reset($sa);
    while (current($sa)) {
        foreach ($ta as $td) {
            $c = false;
            if (current($sa) === $td) {
                $c = true;
                next($sa);
                break;
            }
            
        }
        
        if ($c == false) {
            return false;
        }
        
    }
    
    return true;
}

NOTE: As of this writing subset doesn’t support or use Phabstractic\Resource\ArrayUtilities::elementComparison(), as the above operations do.  This may change in the future.

Static Instantiation

Want a set in a real jiffy?  Just build one!

public static function build(array $values = array())
{
    return new Set($values);
}

Build-A-Bear, Build-A-Set, a gift for your math lover.

Conclusion

Sets are very important mathematical constructions and enable the computation of many other abstract data types.  A set can have singularly exclusive values or can have multiples of the same value in them.  Sets can be compared using algebraic operators and can exist as subsets of other sets.  Set abstract data types can hold references as the data of an element, enabling some interesting behavior.

Thanks for reading!

View on GitHub

This is part of the Phabstractic Library.

photo credit: Television Set at European Parliament via photopin (license)

Asher Wolfstein

Metaverse Resident

About the Author

A metaverse resident, you can find me on Second Life (kadar.talbot) and other online platforms. I write about my digital life, my musings, and my projects as a programmer, webmaster, artist, and game designer. (exist (be wunk) (use rational imagination) (import artist coder maker furry) (conditional (if (eq you asshole) (me (block you))))

View Articles