org.apache.xerces.util
Class SymbolTable

java.lang.Object
  |
  +--org.apache.xerces.util.SymbolTable
Direct Known Subclasses:
ShadowedSymbolTable, SoftReferenceSymbolTable, SynchronizedSymbolTable

public class SymbolTable
extends java.lang.Object

This class is a symbol table implementation that guarantees that strings used as identifiers are unique references. Multiple calls to addSymbol will always return the same string reference.

The symbol table performs the same task as String.intern() with the following differences:

An instance of SymbolTable has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the SymbolTable, and the initial capacity is simply the capacity at the time the SymbolTable is created. Note that the SymbolTable is open: in the case of a "hash collision", a single bucket stores multiple entries, which must be searched sequentially. The load factor is a measure of how full the SymbolTable is allowed to get before its capacity is automatically increased. When the number of entries in the SymbolTable exceeds the product of the load factor and the current capacity, the capacity is increased by calling the rehash method.

Generally, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the time cost to look up an entry (which is reflected in most SymbolTable operations, including addSymbol and containsSymbol).

The initial capacity controls a tradeoff between wasted space and the need for rehash operations, which are time-consuming. No rehash operations will ever occur if the initial capacity is greater than the maximum number of entries the Hashtable will contain divided by its load factor. However, setting the initial capacity too high can waste space.

If many entries are to be made into a SymbolTable, creating it with a sufficiently large capacity may allow the entries to be inserted more efficiently than letting it perform automatic rehashing as needed to grow the table.

Version:
$Id: SymbolTable.java 1358350 2012-07-06 19:04:35Z mrglavas $
Author:
Andy Clark, John Kim, IBM
See Also:
SymbolHash

Inner Class Summary
protected static class SymbolTable.Entry
          This class is a symbol table entry.
 
Field Summary
protected  SymbolTable.Entry[] fBuckets
          Buckets.
protected  int fCollisionThreshold
          A new hash function is selected and the table is rehashed when the number of keys in the bucket exceeds this threshold.
protected  int fCount
          The total number of entries in the hash table.
protected  int[] fHashMultipliers
          Array of randomly selected hash function multipliers or null if the default String.hashCode() function should be used.
protected  float fLoadFactor
          The load factor for the SymbolTable.
protected  int fTableSize
          actual table size
protected  int fThreshold
          The table is rehashed when its size exceeds this threshold.
protected static int MAX_HASH_COLLISIONS
          Maximum hash collisions per bucket for a table with load factor == 1.
protected static int MULTIPLIERS_MASK
           
protected static int MULTIPLIERS_SIZE
           
protected static int TABLE_SIZE
          Default table size.
 
Constructor Summary
SymbolTable()
          Constructs a new, empty SymbolTable with a default initial capacity (101) and load factor, which is 0.75.
SymbolTable(int initialCapacity)
          Constructs a new, empty SymbolTable with the specified initial capacity and default load factor, which is 0.75.
SymbolTable(int initialCapacity, float loadFactor)
          Constructs a new, empty SymbolTable with the specified initial capacity and the specified load factor.
 
Method Summary
 java.lang.String addSymbol(char[] buffer, int offset, int length)
          Adds the specified symbol to the symbol table and returns a reference to the unique symbol.
 java.lang.String addSymbol(java.lang.String symbol)
          Adds the specified symbol to the symbol table and returns a reference to the unique symbol.
 boolean containsSymbol(char[] buffer, int offset, int length)
          Returns true if the symbol table already contains the specified symbol.
 boolean containsSymbol(java.lang.String symbol)
          Returns true if the symbol table already contains the specified symbol.
 int hash(char[] buffer, int offset, int length)
          Returns a hashcode value for the specified symbol information.
 int hash(java.lang.String symbol)
          Returns a hashcode value for the specified symbol.
protected  void rebalance()
          Randomly selects a new hash function and reorganizes this SymbolTable in order to more evenly distribute its entries across the table.
protected  void rehash()
          Increases the capacity of and internally reorganizes this SymbolTable, in order to accommodate and access its entries more efficiently.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

TABLE_SIZE

protected static final int TABLE_SIZE
Default table size.

MAX_HASH_COLLISIONS

protected static final int MAX_HASH_COLLISIONS
Maximum hash collisions per bucket for a table with load factor == 1.

MULTIPLIERS_SIZE

protected static final int MULTIPLIERS_SIZE

MULTIPLIERS_MASK

protected static final int MULTIPLIERS_MASK

fBuckets

protected SymbolTable.Entry[] fBuckets
Buckets.

fTableSize

protected int fTableSize
actual table size

fCount

protected transient int fCount
The total number of entries in the hash table.

fThreshold

protected int fThreshold
The table is rehashed when its size exceeds this threshold. (The value of this field is (int)(capacity * loadFactor).)

fLoadFactor

protected float fLoadFactor
The load factor for the SymbolTable.

fCollisionThreshold

protected final int fCollisionThreshold
A new hash function is selected and the table is rehashed when the number of keys in the bucket exceeds this threshold.

fHashMultipliers

protected int[] fHashMultipliers
Array of randomly selected hash function multipliers or null if the default String.hashCode() function should be used.
Constructor Detail

SymbolTable

public SymbolTable(int initialCapacity,
                   float loadFactor)
Constructs a new, empty SymbolTable with the specified initial capacity and the specified load factor.
Parameters:
initialCapacity - the initial capacity of the SymbolTable.
loadFactor - the load factor of the SymbolTable.
Throws:
java.lang.IllegalArgumentException - if the initial capacity is less than zero, or if the load factor is nonpositive.

SymbolTable

public SymbolTable(int initialCapacity)
Constructs a new, empty SymbolTable with the specified initial capacity and default load factor, which is 0.75.
Parameters:
initialCapacity - the initial capacity of the hashtable.
Throws:
java.lang.IllegalArgumentException - if the initial capacity is less than zero.

SymbolTable

public SymbolTable()
Constructs a new, empty SymbolTable with a default initial capacity (101) and load factor, which is 0.75.
Method Detail

addSymbol

public java.lang.String addSymbol(java.lang.String symbol)
Adds the specified symbol to the symbol table and returns a reference to the unique symbol. If the symbol already exists, the previous symbol reference is returned instead, in order guarantee that symbol references remain unique.
Parameters:
symbol - The new symbol.

addSymbol

public java.lang.String addSymbol(char[] buffer,
                                  int offset,
                                  int length)
Adds the specified symbol to the symbol table and returns a reference to the unique symbol. If the symbol already exists, the previous symbol reference is returned instead, in order guarantee that symbol references remain unique.
Parameters:
buffer - The buffer containing the new symbol.
offset - The offset into the buffer of the new symbol.
length - The length of the new symbol in the buffer.

hash

public int hash(java.lang.String symbol)
Returns a hashcode value for the specified symbol. The value returned by this method must be identical to the value returned by the hash(char[],int,int) method when called with the character array that comprises the symbol string.
Parameters:
symbol - The symbol to hash.

hash

public int hash(char[] buffer,
                int offset,
                int length)
Returns a hashcode value for the specified symbol information. The value returned by this method must be identical to the value returned by the hash(String) method when called with the string object created from the symbol information.
Parameters:
buffer - The character buffer containing the symbol.
offset - The offset into the character buffer of the start of the symbol.
length - The length of the symbol.

rehash

protected void rehash()
Increases the capacity of and internally reorganizes this SymbolTable, in order to accommodate and access its entries more efficiently. This method is called automatically when the number of keys in the SymbolTable exceeds this hashtable's capacity and load factor.

rebalance

protected void rebalance()
Randomly selects a new hash function and reorganizes this SymbolTable in order to more evenly distribute its entries across the table. This method is called automatically when the number keys in one of the SymbolTable's buckets exceeds the given collision threshold.

containsSymbol

public boolean containsSymbol(java.lang.String symbol)
Returns true if the symbol table already contains the specified symbol.
Parameters:
symbol - The symbol to look for.

containsSymbol

public boolean containsSymbol(char[] buffer,
                              int offset,
                              int length)
Returns true if the symbol table already contains the specified symbol.
Parameters:
buffer - The buffer containing the symbol to look for.
offset - The offset into the buffer.
length - The length of the symbol in the buffer.


Copyright 1999-2018 The Apache Software Foundation. All Rights Reserved.