grep and sort programs could be combined as follows:

    grep 'pylon' < input.txt | sort > output.txt

The grep program reads its input from the file input.txt and outputs only those lines that contain the word 'pylon'. The output from the grep program is then passed as input to the sort program, which sorts the lines of its input and outputs them to the file output.txt.
The filter classes presented here allow classes derived from them to be combined with each other in a similar fashion. For example, assuming the existence of filter classes analogous to each of the programs in the above example, it could be written using filter classes as

    Grep grep("pylon");
    Sort sort;
    Append input = Append::file("input.txt");
    WriteTo output = WriteTo::file("output.txt", WriteTo::OVERWRITE);
    Filter f = grep | sort;
    f.connect(input, output);

or more concisely as

    Filter f = Grep("pylon") | Sort();
    f.connect(Append::file("input.txt"), WriteTo::file("output.txt", WriteTo::OVERWRITE));

Note that filter classes can also be used by themselves. The code fragment

    Grep grep("pylon");
    grep.connect(cin, cout);

reads its input from the C++ standard input stream cin and outputs only those lines containing the word 'pylon' to cout, the C++ standard output stream.
The remainder of this document discusses the use and construction of filters in more detail. Section 2 describes how to use and combine existing filters; section 3 explains how to write new types of filters; and section 4 presents the hierarchy of filter base classes.
[ Contents | Introduction | Using Filters | Writing Filters | Hierarchy ]
connect() member function. For example, the following code fragment constructs a filter and then connects the standard input stream to the standard output stream through it:

    Take take10(10);
    take10.connect(cin, cout);

The first line constructs a filter, specifically a filter that outputs the first 10 lines of its input (or all of the lines if there are fewer than 10). The second line connects the standard input stream (cin) to the standard output stream (cout) through the filter named take10. This causes the take10 filter to read data from cin, output the first 10 lines to cout, and discard the rest.
The above code fragment could be written more concisely as

    Take(10).connect(cin, cout);  // outputs first 10 lines of cin to cout

This form is usually more convenient, provided that the filter isn't going to be used again. Note that the constructors of some filters may throw an exception if the filter cannot be constructed, and with this form the filter is constructed and connected in a single expression, which makes such exceptions harder to handle. This is especially true when several filters are combined, as discussed below.

    Take take12(12);                        // keeps first 12 lines of input
    Drop drop4(4);                          // discards first 4 lines of input
    Filter extractRange = take12 | drop4;   // extracts lines 5-12
    extractRange.connect(cin, cout);

The first line constructs the Take filter take12 that outputs the first 12 lines of its input, and the second line constructs the Drop filter drop4 that outputs all but the first 4 lines of its input. In the third line the take12 and drop4 filters are combined using the filter concatenation operator (|) to create the extractRange filter, a filter that outputs the fifth through the twelfth lines of its input. The fourth line of code connects the standard input stream to the standard output stream through the extractRange filter: the fifth through the twelfth lines read from the standard input stream are written to the standard output stream.
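
To make the line arithmetic concrete, here is a minimal standalone sketch that models Take and Drop as plain functions on a vector of lines. The names takeLines, dropLines and extractLines5To12 are hypothetical stand-ins and are not part of the filter classes; the sketch only illustrates why taking 12 lines and then dropping 4 of them leaves lines 5 through 12.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::string> Lines;

// Hypothetical stand-in for Take(n): keeps at most the first n lines.
Lines takeLines(const Lines &in, std::size_t n)
{
    const std::size_t keep = std::min(n, in.size());
    return Lines(in.begin(), in.begin() + static_cast<std::ptrdiff_t>(keep));
}

// Hypothetical stand-in for Drop(n): discards at most the first n lines.
Lines dropLines(const Lines &in, std::size_t n)
{
    const std::size_t skip = std::min(n, in.size());
    return Lines(in.begin() + static_cast<std::ptrdiff_t>(skip), in.end());
}

// Mirrors "take12 | drop4": take first, then drop from what was taken.
Lines extractLines5To12(const Lines &in)
{
    return dropLines(takeLines(in, 12), 4);
}
```

Note that the order matters in this model: dropping 4 lines first and then taking 12 would yield lines 5 through 16 instead.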
As can be seen in the previous example, the result of combining two
or more filters is a filter object of class Filter. A Filter object
stores a copy of each of its component filters (i.e. the filters that
are combined to create the Filter object) and connects these copies
together in the proper order. Then when the Filter object is connected
to an input stream and an output stream it reads data from the input
stream and passes it on to the first of its component filters, which
processes it and outputs it to the next component filter for use as
input, and so on until the last component filter writes its output to
the output stream to which the Filter object is connected. Thus, in the
last line of the previous example, connecting the standard input stream
and the standard output stream through the extractRange filter causes the following to occur:
1. the extractRange filter reads data from cin and passes it to the take12 filter as input;
2. take12 passes the first 12 lines of its input to the drop4 filter as input; and
3. drop4 writes all but the first 4 lines of its input to cout.

    Filter extractRange = buildLineExtractor(5, 12);
    extractRange.connect(cin, cout);
    . . .
    Filter buildLineExtractor(int startLine, int endLine)
    {
        // assumes startLine > 0 and startLine <= endLine
        Take take(endLine);
        Drop drop(startLine - 1);
        return take | drop;
    }

The filter returned by buildLineExtractor() will work correctly despite the fact that the filters take and drop are destroyed at the end of buildLineExtractor().
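
One way to picture why the returned filter stays valid is value semantics: if the combined object holds its own copies of its stages, the originals can safely go out of scope. The sketch below illustrates that general technique under stated assumptions and is not the library's implementation; Stage, Pipeline, take, drop and makeExtractor are all hypothetical names, and std::function stands in for a copyable filter.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

typedef std::vector<std::string> Lines;
typedef std::function<Lines(const Lines &)> Stage;

// Hypothetical composite: owns copies of its stages, in order.
class Pipeline
{
public:
    Pipeline(const Stage &s) : stages_(1, s) {}

    // Returns a longer pipeline holding a copy of one more stage.
    Pipeline operator|(const Stage &next) const
    {
        Pipeline result(*this);
        result.stages_.push_back(next);
        return result;
    }

    // Feeds the input through every stage, first to last.
    Lines run(Lines data) const
    {
        for (const Stage &s : stages_)
            data = s(data);
        return data;
    }

private:
    std::vector<Stage> stages_;
};

Stage take(std::size_t n)
{
    return [n](const Lines &in) {
        const std::size_t keep = std::min(n, in.size());
        return Lines(in.begin(), in.begin() + static_cast<std::ptrdiff_t>(keep));
    };
}

Stage drop(std::size_t n)
{
    return [n](const Lines &in) {
        const std::size_t skip = std::min(n, in.size());
        return Lines(in.begin() + static_cast<std::ptrdiff_t>(skip), in.end());
    };
}

// The local stages are destroyed on return; the Pipeline keeps its copies.
// Assumes startLine > 0 and startLine <= endLine, as in the text above.
Pipeline makeExtractor(std::size_t startLine, std::size_t endLine)
{
    Stage t = take(endLine);
    Stage d = drop(startLine - 1);
    return Pipeline(t) | d;
}
```

Calling makeExtractor(5, 12).run(lines) on a vector of 20 lines returns the fifth through the twelfth of them, even though t and d no longer exist by then.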
The buildLineExtractor() function could be rewritten as

    Filter buildLineExtractor(int startLine, int endLine)
    {
        // assumes startLine > 0 and startLine <= endLine
        return Take(endLine) | Drop(startLine - 1);
    }

but, as mentioned above, the constructors of some filters throw an exception if the filter can't be constructed, and this form of combining filters makes it difficult to determine which constructor threw a given exception.
Also note that a Filter object can be constructed from just one filter. The following version of buildLineExtractor(), which deals sensibly with all possible parameter values, illustrates this:

    Filter buildLineExtractor(int startLine, int endLine)
    {
        if (endLine <= 0 || startLine > endLine)  // empty line range
            return Take(0);
        else if (startLine <= 1)  // start at line 1
            return Take(endLine);
        else
            return Take(endLine) | Drop(startLine - 1);
    }

The if and else if clauses construct a Filter object from a single Take filter to handle the degenerate cases.
connect() that connects an input stream and an output stream through a filter, there are three other versions of connect() that connect, respectively,
In fact, filters can only connect together other filters: the three versions of connect() that connect a stream just create a filter that reads from or writes to each of the streams that it connects, then connect the two filters.
The FilterException class has a public member function named getMessage() that returns a String containing an explanation of why the exception was thrown. This message should make it easier to determine why a filter that threw a FilterException couldn't be constructed.
A common reason for the construction of a filter to fail is that its constructor has been passed an invalid argument. In such cases the filter's constructor will throw an exception of class InvalidFilterArgumentException, a class derived from FilterException.
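
The sketch below shows how catching the more specific exception first might look. Because the real class declarations are not reproduced in this document, FilterException and InvalidFilterArgumentException are redefined here as minimal hypothetical stand-ins (with std::string in place of the library's String), and makeTakeCount is an invented helper standing in for a filter constructor that validates its argument. Whether the real Take rejects negative counts is an assumption made purely for illustration.

```cpp
#include <string>

// Hypothetical stand-ins for the library's exception classes.
class FilterException
{
public:
    explicit FilterException(const std::string &msg) : msg_(msg) {}
    virtual ~FilterException() {}
    const std::string &getMessage() const { return msg_; }

private:
    std::string msg_;
};

class InvalidFilterArgumentException : public FilterException
{
public:
    explicit InvalidFilterArgumentException(const std::string &msg)
        : FilterException(msg) {}
};

// Invented stand-in for a filter constructor that rejects bad arguments.
int makeTakeCount(int n)
{
    if (n < 0)
        throw InvalidFilterArgumentException("line count must not be negative");
    return n;
}

// Constructing one filter per try block makes it clear which one failed.
// The derived class is caught before its base class.
std::string describeConstruction(int n)
{
    try {
        makeTakeCount(n);
        return "ok";
    } catch (const InvalidFilterArgumentException &e) {
        return "invalid argument: " + e.getMessage();
    } catch (const FilterException &e) {
        return "construction failed: " + e.getMessage();
    }
}
```

Keeping each construction in its own try block is the design choice that addresses the earlier concern about not knowing which constructor threw.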
Usually all of the characters from the null character to the character before the next newline character, inclusive, are lost, though this is not guaranteed.
This limitation may be eliminated in a later version of the filter classes.
DEFINE_FILTER_COPIER() macro function in its class declaration, passing it the name of the class; and
processLine().

    class MinimalFilter: public SingleFilter
    {
    public:
        MinimalFilter(const MinimalFilter &f);
        void processLine(const String &line);
        DEFINE_FILTER_COPIER(MinimalFilter);
    private:
        const MinimalFilter &operator =(const MinimalFilter &f);
    };

A filter like this wouldn't be very useful, though, especially since it has no constructor other than the copy constructor. Thus most filters also do one or more of the following:
beforeInput(), afterInput(), handleError() and minimize()
The remainder of this section concentrates on when and how to override the virtual member functions mentioned above.
f, or a filter of which f is a component, has one of its connect() member functions called, the following sequence of events occurs:
1. f.beforeInput() is called
2. f.processLine(inputLine) is called for each line inputLine passed to f as input
3. f.afterInput() is called
beforeInput() is used to do any set-up necessary before input is processed, processLine() processes or stores each line of input as it is received, and afterInput() is used to do any post-input cleanup and/or process and output any remaining input. processLine() is declared to be pure virtual, and so must be overridden; the default implementations of beforeInput() and afterInput() do nothing.
Note that the sequence of events listed above can occur several times during the existence of a filter, so it is important that beforeInput() and afterInput() ensure that the filter works the same way in subsequent uses as it does the first time it is used.
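
The three-step sequence and the reuse requirement can be sketched with a small standalone model. LineStage and LineCounter below are hypothetical classes written for this illustration (they do not derive from SingleFilter, and std::string replaces String); the point is only that per-use state is reset in beforeInput(), so a second use behaves like the first.

```cpp
#include <string>
#include <vector>

// Hypothetical minimal base class modelling the three-step protocol.
class LineStage
{
public:
    virtual ~LineStage() {}
    virtual void beforeInput() {}
    virtual void processLine(const std::string &line) = 0;
    virtual void afterInput() {}

    // Drives one complete use of the stage over a batch of lines.
    void run(const std::vector<std::string> &lines)
    {
        beforeInput();
        for (const std::string &l : lines)
            processLine(l);
        afterInput();
    }
};

// Counts lines; the running count is reset in beforeInput() so that
// reusing the same object does not carry over the previous total.
class LineCounter : public LineStage
{
public:
    LineCounter() : current_(0), lastTotal_(0) {}

    void beforeInput() override { current_ = 0; }
    void processLine(const std::string &) override { ++current_; }
    void afterInput() override { lastTotal_ = current_; }

    int lastTotal() const { return lastTotal_; }

private:
    int current_;
    int lastTotal_;
};
```

Running a LineCounter over three lines and then over one line reports 3 and then 1; without the reset in beforeInput() the second run would report 4.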
The processLine(), beforeInput() and afterInput() member functions can use the outputLine() member function to output a line of data to the next filter. The argument to outputLine() must be a line: it can contain the end-of-line character ComponentFilter::EOL only as its last character, and all but the last line of output must have the end-of-line character as its last character. The following processLine() function, which simply outputs each line of input unchanged, demonstrates the use of outputLine():

    void PassThrough::processLine(const String &line)
    {
        outputLine(line);
    }

If an error occurs in a filter's beforeInput(), processLine() or afterInput() function, use the outputError() member function to pass a FilterError object, constructed from a message describing the error, to any filters following this one in a sequence of filters. Alternatively, you could use the version of outputError() that accepts the error message directly.
handleError() member function receives and handles any FilterError objects output by a previous filter in a sequence of filters. The default implementation of handleError() simply calls outputError() to pass the FilterError on to the next filter, and usually isn't overridden. If a filter does override handleError(), it should either handle the error (which is usually difficult or impossible) or (eventually) pass the FilterError object to the next filter using outputError().
An example of a filter that would override handleError() is a filter that overwrites a file that might already exist: it could write its output to a temporary file, and if no errors occurred then it would replace the file with the temporary file. It would pass on any FilterError objects using outputError().
minimize() function. Some filters have resources that expand depending on the input they receive. For example, the size of a StringBuffer used to store the previous line of input will expand to be about the size of the longest input line. If a filter is going to be reused in the near future (a common occurrence), then it would be wasteful to reduce the size of such resources only to expand them again the next time the filter is used. Thus such resources are usually left at their expanded size at the end of the input (that is, they are not reduced in the filter's beforeInput() or afterInput() functions).
But there may be times when the user of a filter wants to minimize the resources used by the filter (for example, because the filter won't be used for a while, the system's resources are constrained, and/or abnormal input to the filter increased its reducible resources to wastefully large sizes). In such situations the user can call the filter's minimize() function to minimize the filter's resource usage.
If you override minimize() in your filter, then it should call its parent class's version of minimize() at some point. A filter must also be able to perform its function after its minimize() function is called: resources necessary to a filter's correct operation should not be eliminated by its minimize() function.
One common action that a filter's minimize() function performs is to minimize the size of any StringBuffers it uses during processing. Passing the filter member function minimizeBuffer() an empty (i.e. length() == 0) StringBuffer will minimize the amount of memory the StringBuffer uses.
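
As a rough analogy in standard C++, the sketch below releases the storage of an empty std::string by swapping it with a freshly constructed one. This is an assumption about the kind of operation minimizeBuffer() performs, not a description of its actual implementation; StringBuffer is replaced by std::string and minimizeIfEmpty is a hypothetical name.

```cpp
#include <string>

// If buf is empty, give back its storage by swapping with a fresh string;
// a non-empty buffer is left untouched so that no data is lost.
void minimizeIfEmpty(std::string &buf)
{
    if (buf.empty())
        std::string().swap(buf);
}
```

After the swap the buffer has whatever small capacity a default-constructed string has on the platform, and it grows again on demand the next time the filter is used.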
The virtual member functions discussed above are the only SingleFilter virtual member functions that should be overridden when creating a new filter class. Overriding any of the others could cause filters of that filter class, as well as any filters combined with such filters, to work incorrectly or not at all.
DEFINE_FILTER_COPIER() macro function in its class declaration, passing it the name of the class.
beforeInput() function is called after its implementation filter's beforeInput() function is called, and its afterInput() function is called before its implementation filter's afterInput() function is called.
ShellFilter's default implementation of processLine() just passes its line parameter on to its implementation filter. If you override processLine() in a class you derive from ShellFilter, you may need to access the implementation filter directly. This can be accomplished by using the ShellFilter member function getImplementation(), which returns a pointer to the implementation filter, or NULL if no implementation filter has been selected.
The implementation filter for a ShellFilter filter is usually selected in the filter's constructor and must be left unchanged for the rest of the filter's existence. To select an implementation filter, call ShellFilter's select() member function with the implementation filter as the parameter. The ShellFilter will make and use a copy of the filter passed to select(), so that filter can be destroyed after select() returns.
The following is an example of a constructor for a filter class derived from ShellFilter:

    ExtractRange::ExtractRange(int startLine, int endLine):
        ShellFilter()
    {
        if (endLine <= 0 || startLine > endLine)  // empty line range
            select(Take(0));
        else if (startLine <= 1)  // start at line 1
            select(Take(endLine));
        else
            select(Take(endLine) | Drop(startLine - 1));
    }

Note that the filters passed to select() in the above example are local to the constructor.
While rarely done, the implementation filter can be selected after the ShellFilter has been constructed, so long as it is selected before one of the ShellFilter's connect() member functions is called, and before the ShellFilter is used as a component of another filter.
connect() member functions, the only filter classes that should be derived directly from ComponentFilter are those that shouldn't be used directly, but only as components of other filters. Filter classes derived directly from ComponentFilter are written the same way as filter classes derived directly from SingleFilter.
new operator when it fails) do not have to be of a class derived from FilterException.
FilterException objects are constructed from a String explaining why the exception was thrown. This explanation should be designed to help the user of the filter to determine why the filter couldn't be constructed.
If a filter's constructor is passed an invalid argument (or an invalid combination of arguments), it should throw an exception of class InvalidFilterArgumentException, which is a class derived from FilterException.
In the following, the end-of-line character is the character ComponentFilter::EOL. It is usually the newline character.
in and copies the line, including the end-of-line character if present, into buf. It returns true if a line was successfully copied into buf, and false if there is no more data to read or an error occurred when reading from in.
line to the next filter. Note that line must actually be a line: it can only contain the end-of-line character as its last character, and it must have the end-of-line character as its last character unless it is the last line to be output.
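
The rule about where the end-of-line character may appear can be expressed as a small check. The sketch below is a standalone illustration, not a library function: isLine is a hypothetical name, std::string replaces String, and EOL is assumed to be the newline character, which the text says is the usual case.

```cpp
#include <string>

// Assumed value of the end-of-line character for this sketch.
const char EOL = '\n';

// True if s contains EOL nowhere except possibly as its final character.
// A string with no EOL at all also passes, which is only acceptable for
// the last line to be output.
bool isLine(const std::string &s)
{
    const std::string::size_type pos = s.find(EOL);
    return pos == std::string::npos || pos + 1 == s.size();
}
```

A filter under development could assert isLine(line) before each output call to catch strings that contain embedded end-of-line characters.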
str to buf and outputs each complete line of the result to the next filter. If the last part of str is not a complete line (that is, if str doesn't end with the end-of-line character), then this last, partial line is copied into buf; otherwise buf will be empty when this function returns. Note that both str and buf may contain end-of-line characters (though buf won't when this function returns).
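
The behaviour described above, appending to a buffer, emitting the complete lines and retaining any partial tail, can be sketched in standard C++ as follows. This is an illustration under assumptions rather than the library's code: appendAndSplit is a hypothetical name, std::string stands in for both String and StringBuffer, the emitted lines are collected in a vector instead of being passed to a next filter, and EOL is assumed to be the newline character.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumed value of the end-of-line character for this sketch.
const char EOL = '\n';

// Appends str to buf, moves every complete line (EOL included) into out,
// and leaves any trailing partial line in buf.
void appendAndSplit(const std::string &str, std::string &buf,
                    std::vector<std::string> &out)
{
    buf += str;
    std::size_t start = 0;
    std::size_t pos;
    while ((pos = buf.find(EOL, start)) != std::string::npos) {
        out.push_back(buf.substr(start, pos - start + 1));
        start = pos + 1;
    }
    buf.erase(0, start);  // keep only the unfinished tail, if any
}
```

Because the partial tail stays in buf, a later call that supplies the rest of the line completes it, so input can arrive in arbitrarily sized chunks.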
filename to buf and outputs each complete line of the result to the next filter. If the last part of the file is not a complete line (that is, if the file doesn't end with the end-of-line character), then this last, partial line is copied into buf; otherwise buf will be empty when this function returns. Note that both the file and buf may contain end-of-line characters (though buf won't when this function returns).
in to buf and outputs each complete line of the result to the next filter. If the last part of the data read from in is not a complete line (that is, if the last character read from in isn't the end-of-line character), then this last, partial line is copied into buf; otherwise buf will be empty when this function returns. Note that both buf and the data read from in may contain end-of-line characters (though buf won't when this function returns).
fe to the next filter. The second version constructs a FilterError object from the message msg that describes the error, then passes that FilterError object to the next filter.
buf is empty (that is, if buf.length() == 0), then this function minimizes buf's capacity, and hence the amount of memory that buf uses. This function is usually called from a filter class's minimize() member function.
ComponentFilter | +----SingleFilter | +----Filter | +----ShellFilter