The JavaTM Tutorial
Previous Page Lesson Contents Next Page Start of Tutorial > Start of Trail > Start of Lesson Search

Trail: Collections
Lesson: Interfaces

The Map Interface

A Map(in the API reference documentation)is an object that maps keys to values. A map cannot contain duplicate keys: Each key can map to at most one value. The Map interface is shown below:
public interface Map {
    // Basic Operations
    Object put(Object key, Object value);
    Object get(Object key);
    Object remove(Object key);
    boolean containsKey(Object key);
    boolean containsValue(Object value);
    int size();
    boolean isEmpty();

    // Bulk Operations
    void putAll(Map t);
    void clear();

    // Collection Views
    public Set keySet();
    public Collection values();
    public Set entrySet();

    // Interface for entrySet elements
    public interface Entry {
        Object getKey();
        Object getValue();
        Object setValue(Object value);
    }
}
The JDK contains two new general-purpose Map implementations. HashMap(in the API reference documentation), which stores its entries in a hash table, is the best-performing implementation. TreeMap(in the API reference documentation), which stores its entries in a red-black tree, guarantees the order of iteration. Also, Hashtable(in the API reference documentation)has been retrofitted to implement Map. For more information on implementations, see the Implementations lesson.

Comparison to Hashtable

If you've used Hashtable, you're already familiar with the general flavor of Map. (Of course Map is an interface, while Hashtable is a concrete implementation.) Here are the major differences:

Further, Map fixes a minor deficiency in the Hashtable interface. Hashtable has a method called contains, which returns true if the Hashtable contains a given value. Given its name, you'd expect this method to return true if the Hashtable contained a given key, as the key is the primary access mechanism for a Hashtable. The Map interface eliminates this source of confusion by renaming the method containsValue. Also, this improves the consistency of the interface: containsValue parallels containsKey nicely.

Basic Operations

The basic operations (put, get, remove, containsKey, containsValue, size, and isEmpty) behave exactly like their counterparts in Hashtable. Here's a simple program to generate a frequency table of the words found in its argument list. The frequency table maps each word to the number of times it occurs in the argument list.
import java.util.*;

public class Freq {
    private static final Integer ONE = new Integer(1);

    public static void main(String args[]) {
        Map m = new HashMap();

        // Initialize frequency table from command line
        for (int i=0; i<args.length; i++) {
            Integer freq = (Integer) m.get(args[i]);
            m.put(args[i], (freq==null ? ONE :
                            new Integer(freq.intValue() + 1)));
        }

        System.out.println(m.size()+" distinct words detected:");
        System.out.println(m);
    }
}
The only thing even slightly tricky about this program is the second argument of the put statement. It's a conditional expression that has the effect of setting the frequency to one if the word has never been seen before, or one more than its current value if the word has already been seen. Let's run the program:
% java Freq if it is to be it is up to me to delegate

8 distinct words detected:
{to=3, me=1, delegate=1, it=2, is=2, if=1, be=1, up=1}
Suppose you'd prefer to see the frequency table in alphabetical order. All you have to do is change the implementation type of the Map from HashMap to TreeMap. Making this four character change causes the program to generate the following output from the same command line:
8 distinct words detected:
{be=1, delegate=1, if=1, is=2, it=2, me=1, to=3, up=1}
Are interfaces cool, or what?

Like the Set(in the API reference documentation)and List(in the API reference documentation)interfaces, Map strengthens the requirements on the equals and hashCode methods so that two Map objects can be compared for logical equality without regard to their implementation types. Two Map objects are equal if they represent the same key-value mappings.

By convention, all Map implementations provide constructors that take a Map object and initialize the new Map to contain all of the key-value mappings in the specified Map. This standard Map constructor is entirely analogous to the standard collection constructor for Collection implementations. It allows the caller to create a Map of a desired implementation type that initially contains all of the mappings in another Map, regardless of the other Map's implementation type. For example, suppose you have a Map, named m. The following one-liner creates a new HashMap initially containing all of the same key-value mappings as m:

Map copy = new HashMap(m);

Bulk Operations

The clear operation does exactly what you think it does: it removes all of the mappings from the Map. The putAll operation is the Map analogue of the Collection interface's addAll operation. In addition to its obvious use of dumping one Map into another, it has a second, more subtle, use. Suppose a Map is used to represent a collection of attribute-value pairs; the putAll operation, in combination with the standard Map constructor, provides a neat way to implement attribute map creation with default values. Here's a static factory method demonstrating this technique:
static Map newAttributeMap(Map defaults, Map overrides) {
    Map result =  new HashMap(defaults);
    result.putAll(overrides);
    return result;
}

Collection Views

The Collection-view methods allow a Map to be viewed as a Collection in three ways: The Collection-views provide the only means to iterate over a Map. Here's an example illustrating the standard idiom for iterating over the keys in a Map:
for (Iterator i=m.keySet().iterator(); i.hasNext(); )
    System.out.println(i.next());
The idiom for iterating over values is analogous. Here's the idiom for iterating over key-value pairs:
for (Iterator i=m.entrySet().iterator(); i.hasNext(); ) {
    Map.Entry e = (Map.Entry) i.next();
    System.out.println(e.getKey() + ": " + e.getValue());
}

When first presented with these idioms, many people worry that they may be slow because the Map has to create a new Collection object each time a Collection-view operation is called. Rest easy: This is not the case. There's no reason that a Map can't always return the same object each time it is asked for a given Collection-view. This is precisely what all of the JDK's Map implementations do.

With all three Collection-views, calling an Iterator's remove operation removes the associated entry from the backing Map (assuming the Map supports element removal to begin with). With the entrySet view, it is also possible to change the value associated with a key, by calling a Map.Entry's setValue method during iteration (again, assuming the Map supports value modification to begin with). Note that these are the only safe ways to modify a Map during iteration; the behavior is unspecified if the underlying Map is modified in any other way while the iteration is in progress.

The Collection-views support element removal in all its many forms: the remove, removeAll, retainAll, and clear operations, as well as the Iterator.remove operation. (Yet again, this assumes that the backing Map supports element removal.)

The Collection-views do not support element addition under any circumstances. It would make no sense for the keySet and values views, and it's unnecessary for the entrySet view, as the backing Map's put and putAll provide the same functionality.

Fancy Uses of Collection-Views: Map Algebra

When applied to the Collection-views, the bulk operations (containsAll, removeAll and retainAll) are a surprisingly potent tool. For starters, suppose you want to know whether one Map is a submap of another, that is, whether the first Map contains all of the key-value mappings in the second. The following idiom does the trick:
if (m1.entrySet().containsAll(m2.entrySet())) {
    ...
}
Along similar lines, suppose that you want to know if two Map objects contain mappings for all the same keys:
if (m1.keySet().equals(m2.keySet())) {
    ...
}
Suppose you have a map that represents a collection of attribute-value pairs, and two sets, representing required attributes and permissible attributes. (The permissible attributes include the required attributes.) The following snippet determines whether the attribute map conforms to these constraints, and prints a detailed error message if it doesn't:
boolean valid = true;
Set attributes = attributeMap.keySet();
if(!attributes.containsAll(requiredAttributes)) {
    Set missing = new HashSet(requiredAttributes);
    missing.removeAll(attributes);
    System.out.println("Missing required attributes: "+missing);
    valid = false;
}

if (!permissibleAttributes.containsAll(attributes)) {
    Set illegal = new HashSet(attributes);
    illegal.removeAll(permissibleAttributes);
    System.out.println("Contains illegal attributes: "+illegal);
    valid = false;
}

if (valid)
    System.out.println("OK");
Suppose you want to know all the keys common to two Map objects:
Set commonKeys = new HashSet(a.keySet());
commonKeys.retainAll(b.keySet());
A similar idiom gets you the common values, and the common key-value pairs. Extra care is needed if you get the common key-value pairs, as the elements of the resulting Set, which are Map.Entry objects, may become invalid if the Map is modified.

All the idioms presented thus far have been "non-destructive": They don't modify the backing Map. Here are a few that do. Suppose you want to remove all the key-value pairs that one Map has in common with another:

m1.entrySet().removeAll(m2.entrySet());
Suppose you want to remove from one Map all the keys that have mappings in another:
m1.keySet().removeAll(m2.keySet());
And what happens when you start mixing keys and values in the same bulk operation? Suppose you have a Map called managers that maps each employee in a company to the employee's manager. We'll be deliberately vague about the types of the key and value objects. It doesn't matter, so long as they're the same. Now suppose you want to know who all the "individual contributors" are. (This is corporate-speak for employees who are not managers.) The following one-liner tells you exactly what you want to know:
Set individualContributors = new HashSet(managers.keySet());
individualContributors.removeAll(managers.values());
Suppose you want to fire all of the employees who report directly to some manager (we'll call him herbert):
Employee herbert = ... ;
managers.values().removeAll(Collections.singleton(herbert));
Note that this idiom makes use of Collections.singleton, a static factory method that returns an immutable Set with the single, specified element.

Once you've done this, you may have a bunch of employees whose managers no longer work for the company (if any of herbert's direct-reports were themselves managers). The following code tells you all of the employees whose manager no longer works for the company:

Map m = new HashMap(managers);
m.values().removeAll(managers.keySet());
Set slackers = m.keySet();
This example is a bit tricky. First it makes a temporary copy of the Map. Then it removes from the temporary copy all entries whose (manager) value is a key in the original Map. Recall that the original Map contains an entry for every employee. Thus, the remaining entries in the temporary Map comprise all the entries from the original Map whose (manager) values are no longer employees. The keys in the temporary copy, then, represent precisely the employees that we're looking for. If you stare at the example for a little while, this should all become clear. If it doesn't, now would be a good time to get a steaming hot mug of freshly brewed Java.

There are many, many more idioms like the ones contained in this section but it would be impractical and tedious to list them all. Once you get the hang of it, it's not that hard to come up with the right one when you need it.

Multimaps

A multimap is like a map, except it can map each key to multiple values. The Collections Framework doesn't include an interface for multimaps, because they aren't used all that commonly. It's a fairly simple matter to use a Map whose values are List objects as a multimap. This technique is demonstrated in the next code example, which reads a dictionary containing one word per line (all lowercase) and prints out all of the permutation groups in the dictionary that meet a size criterion. A permutation group is a bunch of words all of which contain exactly the same letters, but in a different order. The program takes two arguments on the command line: the name of the dictionary file, and the minimum size of permutation group to print out. Permutation groups containing fewer words than the specified minimum are not printed.

There is a standard trick for finding permutation groups: for each word in the dictionary, alphabetize the letters in the word (that is, reorder the word's letters into alphabetical order) and put an entry into a multimap, mapping the alphabetized word to the original word. For example, the word "bad" causes an entry mapping "abd" into "bad" to be put into the multimap. A moment's reflection will show that all of the words to which any given key maps form a permutation group. It's a simple matter to iterate over the keys in the multimap, printing out each permutation group that meets the size constraint.

The following program is a straightforward implementation of this technique. The only tricky part is the alphabetize method, which returns a string containing the same characters as its argument, in alphabetical order. This routine (which has nothing to do with the Collections Framework) implements a slick bucket sort. It assumes that the word being alphabetized consists entirely of lowercase alphabetic characters.

import java.util.*;
import java.io.*;

public class Perm {
    public static void main(String[] args) {
        int minGroupSize = Integer.parseInt(args[1]);
 
        // Read words from file and put into simulated multimap
        Map m = new HashMap();
        try {
            BufferedReader in =
                   new BufferedReader(new FileReader(args[0]));
            String word;
            while((word = in.readLine()) != null) {
                String alpha = alphabetize(word);
                List l = (List) m.get(alpha);
                if (l==null)
                    m.put(alpha, l=new ArrayList());
                l.add(word);
            }
        } catch(IOException e) {
            System.err.println(e);
            System.exit(1);
        }

        // Print all permutation groups above size threshold
        for (Iterator i = m.values().iterator(); i.hasNext(); ) {
            List l = (List) i.next();
            if (l.size() >= minGroupSize)
                System.out.println(l.size() + ": " + l);
        }
    }

    private static String alphabetize(String s) {
        int count[] = new int[256];
        int len = s.length();
        for (int i=0; i<len; i++)
            count[s.charAt(i)]++;
        StringBuffer result = new StringBuffer(len);
        for (char c='a'; c<='z'; c++)
            for (int i=0; i<count[c]; i++)
                result.append(c);
        return result.toString();
    }
}
Running the program on an 80,000 word dictionary takes about 16 seconds on an aging UltraSparc 1. With a minimum permutation group size of eight, it produces the following output:
% java Perm dictionary.txt 8

 9: [estrin, inerts, insert, inters, niters, nitres, sinter,
     triens, trines]
 8: [carets, cartes, caster, caters, crates, reacts, recast, traces]
 9: [capers, crapes, escarp, pacers, parsec, recaps, scrape, secpar,
     spacer]
 8: [ates, east, eats, etas, sate, seat, seta, teas]
12: [apers, apres, asper, pares, parse, pears, prase, presa, rapes,
     reaps, spare, spear]
 9: [anestri, antsier, nastier, ratines, retains, retinas, retsina,
     stainer, stearin]
10: [least, setal, slate, stale, steal, stela, taels, tales, teals,
     tesla]
 8: [arles, earls, lares, laser, lears, rales, reals, seral]
 8: [lapse, leaps, pales, peals, pleas, salep, sepal, spale]
 8: [aspers, parses, passer, prases, repass, spares, sparse, spears]
 8: [earings, erasing, gainers, reagins, regains, reginas, searing,
     seringa]
11: [alerts, alters, artels, estral, laster, ratels, salter, slater,
     staler, stelar, talers]
 9: [palest, palets, pastel, petals, plates, pleats, septal, staple,
     tepals]
 8: [enters, nester, renest, rentes, resent, tenser, ternes, treens]
 8: [peris, piers, pries, prise, ripes, speir, spier, spire]
Many of these words seem a bit bogus, but that's not the program's fault; they're in the dictionary file. Here's the dictionary file we used. It is derived from the Public Domain ENABLE benchmark reference word list, at http://personal.riverusers.com/~thegrendel/

Previous Page Lesson Contents Next Page Start of Tutorial > Start of Trail Search