The standard parser configuration for the Xerces2 reference
implementation of XNI is defined in the
org.apache.xerces.parsers.StandardParserConfiguration class.
This configuration consists of a number of components. Some of
these components are configurable; others are shared within the
configuration but do not implement the XMLComponent interface.
The following list details the set of components used in the
Xerces2 standard configuration. The components marked with an
asterisk (*) are configurable.
- Symbol Table
- Error Reporter (*)
- Document Scanner (*)
- DTD Scanner (*)
- Entity Manager (*)
- DTD Validator (*)
- Namespace Binder (*)
- Schema Validator (*)
There are additional components other than those in the above
list, such as the "Grammar Pool" and "Datatype Validator
Factory". However, the validation engine in the Xerces2
reference implementation is currently being re-designed and
re-implemented. Therefore, these components are subject to
change and should not be used or relied upon.
In general, there are levels of dependency among the components
in the standard configuration. Some components are required by
all of the configurable components, whereas others are required
only by certain components. The following diagram illustrates
these basic levels of dependency.
The dependencies of each component are detailed in subsequent
sections of this document but the basic dependencies are
listed below:
Each configurable component queries the components that it
depends on before each document is parsed. To make this
possible, the configuration is required to call each
XMLComponent's "reset" method before parsing begins. From the
XMLComponentManager object that is passed to the "reset" method,
the component can query the other components that it needs. For
this purpose, each component is assigned a unique property
identifier that is used to query it from the component manager.
The following example source code shows how one of the
standard Xerces2 components is queried within a configurable
component. However, for complete dependency details and the
property identifiers defined for each component, refer to the
appropriate sections of this document.
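import org.apache.xerces.util.SymbolTable;
import org.apache.xerces.xni.parser.XMLComponent;
import org.apache.xerces.xni.parser.XMLComponentManager;
import org.apache.xerces.xni.parser.XMLConfigurationException;

public class MyComponent
    implements XMLComponent {

    // Constants

    /** Property identifier used to look up the shared symbol table. */
    public static final String SYMBOL_TABLE =
        "http://apache.org/xml/properties/internal/symbol-table";

    // XMLComponent methods

    /** Called by the configuration before each document is parsed. */
    public void reset(XMLComponentManager manager)
        throws XMLConfigurationException {
        // query the symbol table registered with the component manager
        SymbolTable symbolTable =
            (SymbolTable)manager.getProperty(SYMBOL_TABLE);
    }

    // ... remaining XMLComponent methods omitted for brevity

}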
Symbol Table
Property information:
Property Id: http://apache.org/xml/properties/internal/symbol-table
Type: org.apache.xerces.util.SymbolTable
For performance reasons, the Xerces2 reference implementation
uses a custom symbol table in order to re-use common strings
that appear in the document. The symbol table is responsible
for keeping track of these common strings and always returns
the same java.lang.String reference for lexically equivalent
strings. This not only reduces the number of unique objects
created while parsing, it also allows components (e.g. the
validators) to perform comparisons directly on the references
for certain string objects without having to call the "equals"
method.
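The reference-sharing idea can be sketched without any Xerces2 classes. The SimpleSymbolTable class below is a hypothetical stand-in written for illustration, not the real org.apache.xerces.util.SymbolTable; it assumes only that the table hands back one canonical String per distinct character sequence.

```java
import java.util.HashMap;
import java.util.Map;

class SymbolTableDemo {
    public static void main(String[] args) {
        SimpleSymbolTable table = new SimpleSymbolTable();
        // new String(...) forces two distinct objects, much as a scanner
        // would create when reading the same name twice from a buffer.
        String first = table.addSymbol(new String("element"));
        String second = table.addSymbol(new String("element"));
        // Both calls yield the same object, so == is a valid comparison.
        System.out.println(first == second); // prints true
    }
}

// Hypothetical stand-in for a symbol table; not the Xerces2 class.
class SimpleSymbolTable {
    // Maps each distinct string value to its single canonical instance.
    private final Map<String, String> symbols = new HashMap<>();

    // Returns the canonical reference for the value, adding it if absent.
    String addSymbol(String value) {
        String existing = symbols.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }
}
```

Once every name has passed through the table, a component can compare two names with a reference check instead of a character-by-character comparison.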
Note:
Nearly all of the standard components depend on this component.
Therefore, if you write a parser configuration that re-uses any
of the standard components, you must have an instance of this
component registered with the appropriate property identifier.
Error Reporter
Property information:
Property Id: http://apache.org/xml/properties/internal/error-reporter
Type: org.apache.xerces.impl.XMLErrorReporter
Recognized features:
- http://apache.org/xml/features/continue-after-fatal-error
Recognized properties:
- http://apache.org/xml/properties/internal/error-handler
In any parser instance, there must be a way for components to
report errors in a uniform way. The "Error Reporter" component
serves this purpose and simplifies the process of localizing
error messages and notifying the registered
XMLErrorHandler.
In general, errors are identified by the domain of the error
and a unique key within that domain. The XMLErrorReporter class
allows message formatters to be set for each domain and then
delegates the formatting of error messages (with replacement
text) to the message formatter assigned to that error domain.
The localized error message is then sent to the registered
error handler.
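The delegation by error domain described above can be sketched as follows. SimpleFormatter and SimpleErrorReporter are hypothetical types invented for this illustration, not the Xerces2 classes, and the "domain#key" fallback for an unregistered domain is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class ReporterDemo {
    public static void main(String[] args) {
        SimpleErrorReporter reporter = new SimpleErrorReporter();
        reporter.putMessageFormatter("http://example.com/mydomain",
            (locale, key, arguments) -> "MY ERROR (" + key + ")");
        // The reporter picks the formatter registered for the domain.
        System.out.println(
            reporter.format("http://example.com/mydomain", "BadThing", null));
    }
}

// Hypothetical formatter contract: one message per (locale, key, args).
interface SimpleFormatter {
    String formatMessage(Locale locale, String key, Object[] args);
}

// Hypothetical reporter that delegates formatting by error domain.
class SimpleErrorReporter {
    private final Map<String, SimpleFormatter> formatters = new HashMap<>();

    void putMessageFormatter(String domain, SimpleFormatter formatter) {
        formatters.put(domain, formatter);
    }

    // Builds the message for (domain, key). When no formatter is registered
    // for the domain, falls back to a plain "domain#key" string.
    String format(String domain, String key, Object[] args) {
        SimpleFormatter formatter = formatters.get(domain);
        if (formatter == null) {
            return domain + "#" + key;
        }
        return formatter.formatMessage(Locale.getDefault(), key, args);
    }
}
```

Keying formatters by domain lets independently written components use the same error key without clashing, since each domain resolves to its own formatter.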
An error message formatter is any class that implements the
org.apache.xerces.util.MessageFormatter interface.
If you write a new parser component for use with the existing
Xerces2 components, you should implement your own message
formatter and register it with the Error Reporter. For
example:
import java.util.Locale;
import java.util.MissingResourceException;
import org.apache.xerces.util.MessageFormatter;

public class MyFormatter
    implements MessageFormatter {

    // MessageFormatter methods

    public String formatMessage(Locale locale, String key, Object[] args)
        throws MissingResourceException {
        // localize and format message based on locale, key,
        // and replacement text arguments
        return "MY ERROR ("+key+")";
    }

}

import org.apache.xerces.impl.XMLErrorReporter;
import org.apache.xerces.xni.parser.XMLComponent;
import org.apache.xerces.xni.parser.XMLComponentManager;
import org.apache.xerces.xni.parser.XMLConfigurationException;

public class MyComponent
    implements XMLComponent {

    // Constants

    public static final String ERROR_REPORTER =
        "http://apache.org/xml/properties/internal/error-reporter";
    public static final String DOMAIN = "http://example.com/mydomain";

    // XMLComponent methods

    public void reset(XMLComponentManager manager)
        throws XMLConfigurationException {
        XMLErrorReporter reporter =
            (XMLErrorReporter)manager.getProperty(ERROR_REPORTER);
        // register the formatter once per error domain
        if (reporter.getMessageFormatter(DOMAIN) == null) {
            reporter.putMessageFormatter(DOMAIN, new MyFormatter());
        }
    }

    // ... remaining XMLComponent methods omitted for brevity

}
Note:
It is strongly recommended that any new error domains that you
create follow the standard URI syntax. While there is no
requirement that the URI point to an actual resource on the
Internet, using a URI is a common way to separate domains, and
it provides more useful information to the application.
Note:
Nearly all of the standard components depend on this component.
Therefore, if you write a parser configuration that re-uses any
of the standard components, you must have an instance of this
component registered with the appropriate property identifier.
Document Scanner
Property information:
Property Id: http://apache.org/xml/properties/internal/document-scanner
Type: org.apache.xerces.xni.parser.XMLDocumentScanner
Required properties:
- http://apache.org/xml/properties/internal/symbol-table
- http://apache.org/xml/properties/internal/error-reporter
- http://apache.org/xml/properties/internal/entity-manager
- http://apache.org/xml/properties/internal/dtd-scanner
Recognized features:
- http://xml.org/sax/features/namespaces
- http://xml.org/sax/features/validation
- http://apache.org/xml/features/nonvalidating/load-external-dtd
- http://apache.org/xml/features/scanner/notify-char-refs
- http://apache.org/xml/features/scanner/notify-builtin-refs
The org.apache.xerces.impl.XMLDocumentScannerImpl class
implements the XNI document scanner interface and is written so
that it can also function as a "pull" parser. A pull parser
allows the application to drive the parsing of the document
instead of having all of the document events "pushed" to the
registered handlers.
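The pull style can be sketched with a toy scanner. ToyPullScanner is a hypothetical class written for this illustration and does not reflect the Xerces2 scanner API; it assumes only that each call performs one step of scanning and reports whether more input remains.

```java
import java.util.ArrayList;
import java.util.List;

class PullDemo {
    public static void main(String[] args) {
        ToyPullScanner scanner = new ToyPullScanner("<a> text </a>");
        // The application drives the loop and may stop early at any point;
        // here it stops after two events have been delivered.
        boolean more = true;
        while (more && scanner.delivered.size() < 2) {
            more = scanner.scanNext();
        }
        System.out.println(scanner.delivered); // prints [<a>, text]
    }
}

// Hypothetical scanner: each call to scanNext() delivers at most one event.
class ToyPullScanner {
    private final String[] tokens;
    private int position = 0;
    // Events delivered so far, recorded in order.
    final List<String> delivered = new ArrayList<>();

    ToyPullScanner(String input) {
        this.tokens = input.trim().split("\\s+");
    }

    // Scans one token, records it as an event, and reports whether
    // more input remains to be scanned.
    boolean scanNext() {
        if (position < tokens.length) {
            delivered.add(tokens[position++]);
        }
        return position < tokens.length;
    }
}
```

In a push design the loop would live inside the scanner and run to the end of the input; here the caller owns the loop and decides when to continue.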
DTD Scanner
Property information:
Property Id: http://apache.org/xml/properties/internal/dtd-scanner
Type: org.apache.xerces.xni.parser.XMLDTDScanner
Required properties:
- http://apache.org/xml/properties/internal/symbol-table
- http://apache.org/xml/properties/internal/error-reporter
- http://apache.org/xml/properties/internal/entity-manager
Recognized features:
- http://xml.org/sax/features/validation
- http://apache.org/xml/features/scanner/notify-char-refs
The org.apache.xerces.impl.XMLDTDScannerImpl class implements
the XNI DTD scanner interface and is written so that it can
also function as a "pull" parser. A pull parser allows the
application to drive the parsing of the DTD instead of having
all of the DTD events "pushed" to the registered handlers.
Entity Manager
Property information:
Property Id: http://apache.org/xml/properties/internal/entity-manager
Type: org.apache.xerces.impl.XMLEntityManager
Required properties:
- http://apache.org/xml/properties/internal/symbol-table
- http://apache.org/xml/properties/internal/error-reporter
Recognized features:
- http://xml.org/sax/features/validation
- http://xml.org/sax/features/external-general-entities
- http://xml.org/sax/features/external-parameter-entities
- http://apache.org/xml/features/allow-java-encodings
Recognized properties:
- http://apache.org/xml/properties/entity-resolver
Both the Document Scanner
and the DTD Scanner depend
on the Entity Manager. This component handles starting and
stopping entities automatically so that the scanners can continue
operation transparently even when entities go in and out of
scope.
The Entity Manager implements an Entity Scanner, which is a
low-level scanner for document and DTD information. Because
the document and DTD scanners interact only with the Entity
Scanner to scan the document, the scanners are shielded from
changes caused by starting and stopping entities. Changes in
the entities being scanned happen transparently within the
Manager/Scanner combination, but the scanner components are
notified of the start and end of each entity by implementing
the XMLEntityHandler interface, which is part of the Xerces2
reference implementation only.
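The way entity boundaries stay hidden from the scanners can be sketched with a stack of open entities. ToyEntityManager and ToyEntityHandler are hypothetical names used only for this illustration and are much simpler than the Xerces2 classes; the sketch assumes entities are plain in-memory strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class EntityDemo {
    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        ToyEntityManager manager = new ToyEntityManager(new ToyEntityHandler() {
            public void startEntity(String name) { events.add("start " + name); }
            public void endEntity(String name) { events.add("end " + name); }
        });
        manager.startEntity("doc", "ab");
        StringBuilder text = new StringBuilder();
        text.append((char) manager.read());   // 'a' comes from the document
        manager.startEntity("ent", "XY");     // an entity reference is expanded
        int c;
        while ((c = manager.read()) != -1) {
            text.append((char) c);
        }
        System.out.println(text);   // prints aXYb
        System.out.println(events);
    }
}

// Hypothetical callback standing in for start/end entity notifications.
interface ToyEntityHandler {
    void startEntity(String name);
    void endEntity(String name);
}

// Hypothetical manager: keeps a stack of open entities and exposes a single
// read() method, so callers never handle entity boundaries themselves.
class ToyEntityManager {
    private static final class Entity {
        final String name;
        final String content;
        int offset = 0;
        Entity(String name, String content) {
            this.name = name;
            this.content = content;
        }
    }

    private final Deque<Entity> stack = new ArrayDeque<>();
    private final ToyEntityHandler handler;

    ToyEntityManager(ToyEntityHandler handler) {
        this.handler = handler;
    }

    // Opens a new entity; subsequent reads come from it first.
    void startEntity(String name, String content) {
        stack.push(new Entity(name, content));
        handler.startEntity(name);
    }

    // Returns the next character, or -1 once every entity is exhausted.
    // Exhausted entities are popped and reported before reading continues.
    int read() {
        while (!stack.isEmpty()) {
            Entity current = stack.peek();
            if (current.offset < current.content.length()) {
                return current.content.charAt(current.offset++);
            }
            stack.pop();
            handler.endEntity(current.name);
        }
        return -1;
    }
}
```

The caller sees one continuous character stream, while the handler still learns where each entity started and ended.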
DTD Validator
Property information:
Property Id: http://apache.org/xml/properties/internal/validator/dtd
Type: org.apache.xerces.impl.dtd.XMLDTDValidator
Required properties:
- http://apache.org/xml/properties/internal/symbol-table
- http://apache.org/xml/properties/internal/error-reporter
Recognized features:
- http://xml.org/sax/features/namespaces
- http://xml.org/sax/features/validation
- http://apache.org/xml/features/validation/dynamic
The DTD Validator validates the document events that it receives
and may augment the streaming information set with default
attribute values and normalized attribute values.
Namespace Binder
Property information:
Property Id: http://apache.org/xml/properties/internal/namespace-binder
Type: org.apache.xerces.impl.XMLNamespaceBinder
Required properties:
- http://apache.org/xml/properties/internal/symbol-table
- http://apache.org/xml/properties/internal/error-reporter
Recognized features:
- http://xml.org/sax/features/namespaces
The Namespace Binder is responsible for detecting namespace bindings
in the startElement/emptyElement methods and emitting appropriate
start and end prefix mapping events. Namespace binding should always
occur after DTD validation (since namespace bindings may
have been defaulted from an attribute declaration in the DTD) and
before Schema validation.
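The detection of namespace declarations and the resulting prefix mapping events can be sketched as follows. ToyNamespaceBinder is a hypothetical class for illustration only, not the Xerces2 component. It assumes the attribute map it receives already contains any attributes defaulted from the DTD, which is why such a step would sit after DTD validation in the pipeline.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BinderDemo {
    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("xmlns:p", "http://example.com/ns");
        attrs.put("id", "x1");
        ToyNamespaceBinder binder = new ToyNamespaceBinder();
        List<String> declared = binder.startElement("p:root", attrs);
        binder.endElement("p:root", declared);
        binder.events.forEach(System.out::println);
    }
}

// Hypothetical binder: inspects an element's attributes for namespace
// declarations and records the events it would pass downstream.
class ToyNamespaceBinder {
    final List<String> events = new ArrayList<>();

    // Emits a start mapping event for each declaration on the element,
    // then the element event itself. Returns the declared prefixes so
    // the matching end events can be emitted later.
    List<String> startElement(String name, Map<String, String> attributes) {
        List<String> declared = new ArrayList<>();
        for (Map.Entry<String, String> attr : attributes.entrySet()) {
            String attrName = attr.getKey();
            String prefix;
            if (attrName.equals("xmlns")) {
                prefix = "";                                      // default namespace
            } else if (attrName.startsWith("xmlns:")) {
                prefix = attrName.substring("xmlns:".length());   // prefixed namespace
            } else {
                continue;                                         // ordinary attribute
            }
            declared.add(prefix);
            events.add("startPrefixMapping(" + prefix + "=" + attr.getValue() + ")");
        }
        events.add("startElement(" + name + ")");
        return declared;
    }

    // Emits the element end event followed by the end mapping events.
    void endElement(String name, List<String> declared) {
        events.add("endElement(" + name + ")");
        for (String prefix : declared) {
            events.add("endPrefixMapping(" + prefix + ")");
        }
    }
}
```

The mapping events bracket the element events, so a downstream component such as a schema validator sees a prefix bound before the element that uses it.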
Schema Validator
Property information:
Property Id: http://apache.org/xml/properties/internal/validator/schema
Type: org.apache.xerces.impl.xs.XMLSchemaValidator
Required properties:
- http://apache.org/xml/properties/internal/symbol-table
- http://apache.org/xml/properties/internal/error-reporter
Recognized features:
- http://xml.org/sax/features/namespaces
- http://xml.org/sax/features/validation
- http://apache.org/xml/features/validation/dynamic
The Schema Validator validates the document events that it
receives and may augment the streaming information set with
default simple type values and normalized simple type values.