I/O

IO Overview
Peter Komisar _©Conestoga College version 5.6 / 2010

IO Streams & Encoding

IO is an abbreviation that stands for input and output. The use
of the computer depends on input and output processes.

Input

Information has to be entered into the machine for it to be
processed. Data might be entered into the computer from the
keyboard. Data can also be supplied to a computer program
by loading a file.

// files, keyboards,

Output

In the early days, the standard means of outputting data from
a computer program was to send the processed information
to a printer. Long term storage was supplied by reels of
magnetic tape. The monitors we view have become the
commonest means of viewing a computer's output. Now data
is output to files which are stored on hard drives and other
media such as CDs, Flash drives and DVDs.

// files in turn stored on persistent medium, printers, monitors

The java.io & java.nio Packages

Java supplies the java.io package of classes to provide Java
programs with the capability to input and output data.

Later versions of Java have added the java.nio package which
adds an alternate framework for doing input and output.

Streams

It is easy to picture a stream of water running from a lake down
a grade to a small pond. The lake is the source for the stream of
water. The pond is the destination. Drawing the comparison to
the computer world, the water represents data or information.
The data source might be a file or a data structure in a program.
The destination might be another file, or an internet connection.

In this 'watery' metaphor, the stream runs through a trench or
perhaps is carried along in pipes or aquifers. In Java, the conduit
for data is supplied by the different classes of the java.io package.

// water ( or data ) streams from a source like a lake ( file or net
// connection) to a destination like a pond ( a program or another file.)

Encoding

Before streams can be used effectively, there has to be agreement
on both ends of a transmission just what the data means. A series
of 1s and 0s can be interpreted in a number of ways. Do you count
them in groups of 8, 16, 24 or 32 bits per character? In what order
are the bits interpreted within the byte? Is the first bit the most
significant bit or the least? If more than one byte represents a
character, which byte holds the bits of greater significance?

// a 'word' stored in big endian format places the least significant byte at the
// higher address and the most significant byte at the lower address -> 'big ends'
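The byte-order question can be made concrete with a minimal sketch. It assumes java.nio.ByteBuffer is available; the class name EndianDemo and the sample value 0x0A0B0C0D are illustrative only and are not part of the original note.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianDemo {
    // Lays out a 32-bit int as 4 bytes in the requested byte order.
    static byte[] toBytes(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Formats bytes as two-digit hex values separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        int value = 0x0A0B0C0D; // arbitrary sample value
        System.out.println("big endian:    " + hex(toBytes(value, ByteOrder.BIG_ENDIAN)));
        System.out.println("little endian: " + hex(toBytes(value, ByteOrder.LITTLE_ENDIAN)));
    }
}
```

In the big endian layout the most significant byte (0A) comes first, at the lowest position in the array; the little endian layout reverses the order.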

ASCII and ISO-8859-1

Encoding describes the schemes used to translate characters
into binary bit patterns, represented by 0s and 1s. Character sets
are composed of letters, numbers and symbols. The ASCII letter
capital 'A' has a decimal value, 65. In binary this can be described
in a single byte as 1000001. The original ASCII character set uses
the low seven bits of a byte to describe 128 characters.
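As a small illustration of the mapping from a character to its bit pattern, the following sketch prints the decimal and binary value of 'A'. The class name AsciiBits is an assumption made for this example.

```java
class AsciiBits {
    // Returns the binary digits of a character's numeric code.
    static String bits(char c) {
        return Integer.toBinaryString(c);
    }

    public static void main(String[] args) {
        char letter = 'A';
        System.out.println((int) letter); // decimal code of 'A'
        System.out.println(bits(letter)); // the same value written in binary
    }
}
```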

Using all 8 bits to describe 256 characters is a character set
called ISO-8859-1 by the ISO standards organization.

As time went on, different characters were mapped to the different
numeric values of the byte to derive other character sets, for example
ISO-8859-9. A bewildering array of these character sets have been
created describing not just different languages but also different
platform versions of each of these languages. Soon schemes were
being sought to bring these character sets into single manageable
groupings.

Unicode and UCS-2

Unicode uses 2 bytes to describe 65,536 characters. This is the
character set Java uses. From inception, Java was 'internationalized'.
Unicode currently has over 35,000 of its values assigned to characters
that make up the world's languages. Unicode is more or less the
same set described by a different name as UCS-2. UCS-2 is an
abbreviation for Universal Character Set and is an ISO standard. Like
Unicode, UCS-2 uses two bytes to describe each character.

UCS-4 // aka ISO-10646

Though it is likely that Unicode will supply everyone's character set
needs for some time to come, ISO recognizes that in addition to the
spoken languages there are non-spoken languages such as those
used in Mathematics, Science and Commerce. As well, there are
experimental invented languages. There are also many dialects,
both ancient and modern that are just now being discovered and
committed to script.

Considering these facts, it became apparent that Unicode would
ultimately not be big enough! The great character set endorsed by
ISO is UCS-4 which uses 4 bytes to represent every character. It
is also referred to as the ISO-10646 character set. Its first 65,536
values form the Basic Multilingual Plane (BMP), the range that
UCS-2 covers.

UTF-8 & UTF-16

As you can imagine, if you were sending a stream of data half way
around the world, in UCS-4 encoding, but all your characters were
ASCII values, you would be sending 3 blank bytes along with each
byte that held the ASCII character. This would be an inefficient use
of bandwidth. ( Bandwidth in this context relates to the number of
bytes of data that are used per character.)

// ASCII characters in UCS-4 --> 3 of 4 bytes would be null

UTF-8 and UTF-16 are clever schemes that allow a variable number
of bytes of data to be used depending on what type of character is
being sent. UTF stands for UCS (Universal Character Set) Transformation
Format. The encoder first determines which range the character falls in,
and that decides how many bytes are sent.

Because ASCII always has the most significant bit empty, UTF-8
is able to use this fact to send a corresponding single byte for each
ASCII character it is converting. For 16-bit Unicode values, one to three
bytes are used. For higher ranges, a fourth byte is used. As such, UTF-8
may be more or less efficient, in terms of bandwidth used, than
transmissions based solely on fixed-length character sets.

// see the DataInput Interface in the Java Documentation for a description of
// how Java implements UTF-8
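The variable length of UTF-8 can be observed directly. The following is a minimal sketch, assuming a JDK that provides java.nio.charset.StandardCharsets (Java 7 or later); the class name Utf8Length and the sample characters are choices made for this example.

```java
import java.nio.charset.StandardCharsets;

class Utf8Length {
    // Number of bytes the text occupies once encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // ASCII letter
        System.out.println(utf8Length("\u00E9"));       // e with acute accent
        System.out.println(utf8Length("\u20AC"));       // euro sign
        System.out.println(utf8Length("\uD83D\uDE00")); // a character beyond the 16-bit range
    }
}
```

The four calls report 1, 2, 3 and 4 bytes respectively, so pure ASCII text costs no more in UTF-8 than it does in ASCII.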

UTF-8 Notes // for reference

Following is an interesting note regarding UTF-8. It was invented by Ken Thompson,
a co-inventor of Unix. He also wrote the 'B' programming language, the precursor of
Dennis Ritchie's 'C' language.

// Quoted from "UTF-8, a transformation format of ISO 10646",
// http://www.ietf.org/rfc/rfc3629.txt

"UTF-8 was devised in September 1992 by Ken Thompson, guided by design criteria
specified by Rob Pike, with the objective of defining a UCS transformation format
usable in the Plan9 operating system in a non-disruptive manner. Thompson's design
was stewarded through standardization by the X/Open Joint Internationalization Group.
..."
 
 "The table below summarizes the format of these different octet types. The letter x 
indicates bits available for encoding bits of the character number."

 Char. number range  |        UTF-8 octet sequence
    (hexadecimal)    |              (binary)
 --------------------+---------------------------------------
 0000 0000-0000 007F | 0xxxxxxx
 0000 0080-0000 07FF | 110xxxxx 10xxxxxx
 0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
 0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
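The RFC table can be turned into code almost line for line. The sketch below is a hand-written encoder for a single code point, checked against the JDK's own UTF-8 encoder. The class and method names are hypothetical, the code does no validation (surrogate values and out-of-range inputs are not rejected), and it is meant only to illustrate the bit layout, not to replace the library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8ByHand {
    // Encodes one code point (0 to 0x10FFFF) following the RFC 3629 table.
    static byte[] encode(int cp) {
        if (cp < 0x80) {                       // 0xxxxxxx
            return new byte[] { (byte) cp };
        } else if (cp < 0x800) {               // 110xxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xC0 | (cp >> 6)),
                (byte) (0x80 | (cp & 0x3F)) };
        } else if (cp < 0x10000) {             // 1110xxxx 10xxxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xE0 | (cp >> 12)),
                (byte) (0x80 | ((cp >> 6) & 0x3F)),
                (byte) (0x80 | (cp & 0x3F)) };
        } else {                               // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xF0 | (cp >> 18)),
                (byte) (0x80 | ((cp >> 12) & 0x3F)),
                (byte) (0x80 | ((cp >> 6) & 0x3F)),
                (byte) (0x80 | (cp & 0x3F)) };
        }
    }

    // Compares the hand-rolled result with the JDK's own UTF-8 encoder.
    static boolean matchesJdk(int cp) {
        byte[] expected = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(expected, encode(cp));
    }

    public static void main(String[] args) {
        int[] samples = { 0x41, 0xE9, 0x20AC, 0x1F600 }; // one sample per table row
        for (int cp : samples) {
            System.out.println(Integer.toHexString(cp) + " matches JDK: " + matchesJdk(cp));
        }
    }
}
```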

Fortunately, character sets were devised to be backwards compatible with
earlier character sets, so ASCII is a subset of ISO-8859-1. ISO-8859-1 is a
subset of UCS-2 or Unicode. UCS-2 or Unicode are subsets of UCS-4.

Table depicting characters set relationships

    UCS-4 // ISO 10646

    UCS-2// Unicode

ISO-8859-1

   ASCII

Common Data Formats

 Format     | Description                   | Size              | Notes
 -----------+-------------------------------+-------------------+-----------------------------------------
 ASCII      | American Standard Code for    | 7 bits [1 byte]   | 128 mostly readable characters
            | Information Interchange       |                   |
 ISO 8859-1 | 256 ISO character code        | 8 bits [1 byte]   | adds many non-English characters
 Unicode    | synonymous with UCS-2         | 16 bits [2 bytes] | most of the world's characters
 UCS-2      | Universal Character Set       | 16 bits [2 bytes] | 1st plane of ISO/IEC 10646
            | two byte encoding             |                   | in two bytes (0 to 64K)
 UCS-4      | Universal Character Set       | 32 bits [4 bytes] | full ISO/IEC implementation
            | four byte encoding            |                   | in 4 bytes // ISO 10 // 2^31 characters
 UTF-8 *    | UCS Transformation Format,    | [1 to 6 bytes]    | if 1st bit is 0 --> 1 byte ASCII
            | versatile but complex         |                   | if 1st bits are 110 --> 2 bytes
            |                               |                   | if 1st bits are 1110 --> 3 bytes etc.
 UTF-16     | extended variant of UCS-2     | [2 to 4 bytes]    |
 binary     | data transfer in numeric form | [1 to 8 bytes]    | binary version of Java chars

 Java's own Object Oriented Streaming Object Type

 objects    | streaming java objects        | [variable length] | the serialization process

// ASCII is a subset of ISO 8859-1 which is a subset of Unicode/(UCS-2) which is
// a subset of UCS-4/(ISO/IEC 10646)

* A Table Describing Details of UTF-8 encoding

 Bits | Hex Min  | Hex Max  | UTF-8 Binary Encoding
 -----+----------+----------+------------------------------------------------------
    7 | 00000000 | 0000007F | 0xxxxxxx
   11 | 00000080 | 000007FF | 110xxxxx 10xxxxxx
   16 | 00000800 | 0000FFFF | 1110xxxx 10xxxxxx 10xxxxxx
   21 | 00010000 | 001FFFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
   26 | 00200000 | 03FFFFFF | 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
   31 | 04000000 | 7FFFFFFF | 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx

from 'Unicode support in Solaris Operation Systems', a white paper at the Sun Site.

// Note the RFC 3629 table quoted earlier uses at most four bytes.
// The above Sun example describes the use of up to 6. The use of 7 bytes would
// give you the capacity to encode a full UCS-4. 6 bytes gives a 2 billion character
// capacity which might explain the lack of interest in a 7th byte.

Character Set Translation between Java & the Operating System

As mentioned earlier, Java uses Unicode internally. The operating system
that Java is running on may not be using Unicode. Solaris for instance may
be using ASCII, ISO8859-1 or UTF. Mac uses ASCII coupled with a proprietary
character format. NT uses Unicode, ASCII or UCS. The Java IO functions take
care of translating between the character set(s) being used by the underlying
operating system and the Unicode character set Java uses.
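The translation step can be sketched with InputStreamReader, which accepts a character set and converts incoming bytes into Java chars. The example below assumes Java 8 or later (for StandardCharsets and UncheckedIOException) and uses an in-memory byte array in place of a real file; the class name DecodeDemo is made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Reads all bytes through an InputStreamReader, which converts them to Java chars.
    static String decode(byte[] raw, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(raw), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xE9 };              // e-acute stored as ISO-8859-1
        byte[] utf8   = { (byte) 0xC3, (byte) 0xA9 }; // the same character stored as UTF-8
        String a = decode(latin1, StandardCharsets.ISO_8859_1);
        String b = decode(utf8, StandardCharsets.UTF_8);
        System.out.println(a.equals(b)); // both byte sequences decode to the same Java char
    }
}
```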

Quote from Microsoft Site

"Windows is developed with English as the default user interface. From its
initial design, the Microsoft Windows NT® operating system incorporated
international support through the Unicode character encoding system."

For more Microsoft encoding information go to the following link. This
page is for XP features. You might want to check Windows 7.

http://www.microsoft.com/technet/prodtechnol/winxppro/evaluate/muiovw.mspx

The following link shows you the many encodings that Java in JDK1.5.x
supports with reference to different popular platforms.

http://java.sun.com/j2se/1.5.0/docs/guide/intl/encoding.doc.html

Levels of Organization

The Original Package Expanded to Include char Types

Very early in Java's life, at Java 1.0, a 'snafu' was discovered
that required a cumbersome but effective fix.

When Java was first released it had one set of IO classes
that were based on working with bytes of data. Unfortunately,
when this byte-based system of classes was put into service
it was discovered that there were problems associated with
translating the character encoding based on using multiple
bytes per character. The fix was to add a set of parallel classes.

Wherever applicable, for each byte-based stream class, a
char-based counterpart was created. The new set of classes
was designed to work with 2-byte Unicode characters instead
of bytes. Now in addition to the 8 bit Input/OutputStream classes
of JDK1.0 the Reader/Writer classes of JDK1.1 were added.

This has made the java.io package larger by 13 classes. The
IO package had already been criticized for having a design
that involved too many classes, and this augmented this criticism.

In any case, it is what it is. Although it is a little more work to
learn than perhaps it needed to be, once you get used to the
IO package, you will find it easy to use.

To summarize, the original IO classes descend from Input and
OutputStream which work with bytes. Reader and Writer are
the superclasses of the classes that were added in JDK 1.1
and work with char primitive types.

IO Class Subgroupings

One nice feature of the stream classes is they follow predictable
naming patterns. The stream class names relate to the categories
that they fall into.

Stream Width
/* based on 8 or 16 bit primitive data types */

The IO classes are built to work with 8-bit bytes or 16-bit characters.
Input & OutputStreams are the parents of the set of classes that work
with 8-bit byte streams while the newer Reader & Writer superclasses
are subclassed to produce a set of IO classes that work with 16-bit
character based streams. One works with the 'byte' primitive type
while the other works with the 'char' type.

Source / Destination or Function
/* What is the target source or destination i.e. File */

A second way that IO classes are categorized is based on what
they do, or, what vessel serves as a source or destination for their
functionality. For example, FileOutputStream streams data to a file and
CharArrayWriter writes to character arrays.

Direction of Input or Output
/* Does the class stream data in or out */

The final categorization scheme is based on the direction of flow.
Input classes will be suffixed with InputStream or Reader, depending
on if they work on bytes or char types and output classes will be
suffixed as OutputStream or Writer depending if they write bytes
or char types. Classes that are not named according to this system
are designed to do some other kind of work beyond just streaming
data.

Most of the time you can deduce the name of the class you need by
deciding the source/destination or function, whether you need an 8
or 16 bit stream, and which I/O direction you are going in.

To write an array of bytes out to a stream you would use the following.

Example

ByteArray + OutputStream --> ByteArrayOutputStream.

To read a file of 16 bit, char types in you would formulate the
name of the class as follows.

Example

File + Reader --> FileReader.
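The naming pattern can be tried out with two of the in-memory classes. This sketch pairs ByteArrayOutputStream (8-bit) with CharArrayWriter (16-bit); the class name NamingDemo and the text written are illustrative assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.CharArrayWriter;

class NamingDemo {
    // ByteArray + OutputStream: an 8-bit output stream whose destination is a byte array.
    static byte[] viaByteArrayOutputStream() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write('H');
        out.write('i');
        return out.toByteArray();
    }

    // CharArray + Writer: a 16-bit output stream whose destination is a char array.
    static char[] viaCharArrayWriter() {
        CharArrayWriter out = new CharArrayWriter();
        out.write('H');
        out.write('i');
        return out.toCharArray();
    }

    public static void main(String[] args) {
        System.out.println(viaByteArrayOutputStream().length + " bytes");
        System.out.println(new String(viaCharArrayWriter()));
    }
}
```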

Abstract Super Classes

InputStream & OutputStream / Reader & Writer

In general, the methods of the stream classes throw IOException.
The methods that do not cause exceptions, are involved with
processes other than streaming. IOException is the parent of
15 specialized Exception classes in the java.io package and
described in the JDK documentation.

Normally, the try{ } catch( ) { } construct is used to handle the
potential of exceptions being thrown, catching IOException.

Note the abstract superclasses are not all that abstract! Only
one to three methods in each class are abstract and require
implementation. Both Reader and Writer have an additional
constructor which takes an Object instance as an argument.
The object's lock is used to synchronize thread access to
shared code in a multithreaded environment.
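To show how little has to be implemented, here is a sketch of a custom InputStream that overrides only the abstract read( ) method and relies on the inherited read(byte[ ]) for everything else. CountdownInputStream is a hypothetical class invented for this example, and the wrapping in UncheckedIOException assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stream that yields the byte values n, n-1, ..., 1 and then ends.
class CountdownInputStream extends InputStream {
    private int remaining;

    CountdownInputStream(int start) {
        this.remaining = start;
    }

    // The only abstract method of InputStream: return one byte, or -1 at end of stream.
    @Override
    public int read() {
        return remaining > 0 ? remaining-- : -1;
    }

    // Uses the inherited read(byte[]) to show it works on top of read().
    static int readAll(int start) {
        byte[] buffer = new byte[16];
        try (InputStream in = new CountdownInputStream(start)) {
            return in.read(buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAll(5) + " bytes read by the inherited read(byte[])");
    }
}
```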

The functions contained in the 8-bit classes are mirrored in the
16 bit classes, distinguished primarily by the argument types,
being byte for Input/OutputStream and char for Reader/Writer
methods.

InputStream

 abstract int read( )          | reads one byte from a source, returning the
                               | value in the low-order 8 bits of an int type
 int read(byte[ ] dest)        | reads bytes from source into dest array, in
                               | this case returning an int value describing
                               | the number of bytes read
 int read(byte[ ] dest,        | reads up to length bytes into dest array,
     int offset, int length)   | starting at offset
 void close( )                 | releases system resources associated with
                               | the source i.e. the file descriptor
 int available( )              | returns the number of bytes that can be read
                               | or skipped from the given input stream
                               | without blocking
 long skip(long nbytes)        | attempts to skip and discard nbytes, returns
                               | the number actually skipped
 boolean markSupported( )*     | returns true if mark/reset mechanism is
                               | supported
 void mark(int readlimit)*     | sets a mark in the input stream
 void reset( )                 | resets the stream to repeat the read from mark

All three forms of read return -1 when no more data is available.

* doesn't throw IOException
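Several of these methods can be exercised on an in-memory stream. The sketch below uses ByteArrayInputStream, whose versions of these methods do not declare IOException, and records what each call returns; the class name MarkResetDemo and the byte values are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;

class MarkResetDemo {
    // Walks a 5-byte in-memory stream and records what each call returns.
    static String walk() {
        ByteArrayInputStream in = new ByteArrayInputStream(new byte[] { 10, 20, 30, 40, 50 });
        StringBuilder log = new StringBuilder();
        log.append(in.read()).append(' ');      // first byte
        in.mark(0);                             // remember this position (this class ignores the limit)
        log.append(in.read()).append(' ');      // second byte
        in.reset();                             // go back to the mark
        log.append(in.read()).append(' ');      // second byte again
        log.append(in.skip(1)).append(' ');     // number of bytes actually skipped
        log.append(in.available()).append(' '); // bytes left to read without blocking
        log.append(in.read()).append(' ');      // fourth byte
        log.append(in.read()).append(' ');      // fifth byte
        log.append(in.read());                  // end of stream
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(walk()); // prints "10 20 20 1 2 40 50 -1"
    }
}
```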

OutputStream

 abstract void write(int b)    | writes the byte in the low-order 8 bits
                               | of b, discarding the high 24
 void write(byte b[ ])         | writes an array of bytes, b
 void write(byte b[ ],         | writes an array subset, b, from offset,
     int offset, int length)   | length bytes long
 void flush( )                 | writes out any bytes which may have
                               | been buffered
 void close( )                 | releases system resources associated
                               | with the stream

1) OutputStream's methods are declared as throwing IOException
2) flush( ) buffers before close( )
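The 'low-order 8 bits' behaviour of write(int) is easy to check. In this sketch a ByteArrayOutputStream stands in for a real destination; the class name LowByteDemo and the sample value are illustrative.

```java
import java.io.ByteArrayOutputStream;

class LowByteDemo {
    // Writes one int and returns the single byte that actually reached the stream.
    static int writtenByte(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(value);
        return out.toByteArray()[0] & 0xFF;
    }

    public static void main(String[] args) {
        // Only the low-order byte, 0x41 ('A'), survives; the high 24 bits are discarded.
        System.out.println(Integer.toHexString(writtenByte(0x12345641)));
    }
}
```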

Reader

 int read( )                   | reads one character from source, returned
                               | in the low-order 16 bits of an int
 int read(char[] dest)         | reads characters from source into dest
                               | array, returns the number read
 abstract int read(char[] dest,| reads up to length chars into array dest
     int offset, int length)   | beginning at offset
 abstract void close( )        | releases system resources associated
                               | with source i.e. the file descriptor
 long skip(long nchars)        | attempts to skip and discard nchars,
                               | returns the number actually skipped
 boolean markSupported( )*     | returns true if mark/reset mechanism
                               | is supported
 void mark(int readlimit)      | sets a mark in the input stream
 void reset( )                 | resets the stream to repeat the read
                               | from mark
 boolean ready( )              | returns true if stream has data immediately
                               | available (so read( ) won't block)

All three forms of read return -1 when no more data is available.

* doesn't throw IOException

Writer // note Writer has two extra writes that take String or a String subset

 void write(int c)             | writes char in low-order 16 bits of
                               | argument
 void write(char[] c)          | writes an array of characters
 abstract void write(char[] c, | writes a subset of an array of characters
     int offset, int length)   |
 void write(String s)          | writes a string
 void write(String s,          | writes a subset of String s of given
     int offset, int length)   | length from offset
 abstract void flush( )*       | writes out any characters the stream
                               | has buffered
 abstract void close( )        | releases system resources associated
                               | with the stream

* flush( ) buffers before close( )

Miscellaneous // for reference

1) available( ) is replaced by ready( ) in Reader.

2) Because Reader and Writer methods convert native codesets
properly, they should be used when processing character data.

3) read( ) methods return an int, allowing for 16-bit char values
(0 to 0xFFFF) and the EOF int value, -1 ( 0xFFFFFFFF )

4) The number of bytes read by a method is dictated by the system's default encoding.

( i ) if ASCII, one byte is read and promoted internally to two-byte Unicode.

( ii ) if Unicode two bytes are read and no conversion is needed

(iii) if UTF is in effect, 1 to 3 bytes are read, and the corresponding
Unicode (Java) char is assembled

5) Currently the encoding in effect for a file cannot be changed.
// should be checked periodically
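Point 3 above, together with Writer's String form of write, can be sketched as follows. StringReader and StringWriter stand in for real sources and destinations, the class name CharCopyDemo is hypothetical, and UncheckedIOException assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;

class CharCopyDemo {
    // Copies characters one at a time until read( ) signals end of stream with -1.
    static String copy(String text) {
        StringWriter out = new StringWriter();
        try (Reader in = new StringReader(text)) {
            int c;                      // an int, so -1 is distinguishable from any char
            while ((c = in.read()) != -1) {
                out.write(c);           // writes the low-order 16 bits as a char
            }
            out.write(" [end]");        // Writer's String form of write
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(copy("Hello"));
    }
}
```

Because read( ) returns an int, even the char value 0xFFFF is copied correctly and is not mistaken for the end of the stream.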

Using the IO Classes

Basic Methodology

The basic approach to using the IO classes is to open a stream
by instantiating one of the IO classes, then using the read or write
methods to stream data in or out. Often, a while loop is used to
process the information until there is nothing left to read or write.

To use the IO classes the following steps are followed:

1 ) Select the appropriate I/O class for the objective.

2 ) Instantiate the class using the most appropriate constructor

3 ) High level streams may be layered over low-level streams
    by making the low-level stream objects the arguments to
    the high-level stream's constructor

4 ) Call the appropriate read( ) or write( ) methods on the top
    level stream.

5 ) Use a while loop if needed to process the information.
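The five steps above can be sketched as one small program. It layers a BufferedReader over whichever low-level Reader is supplied and loops until nothing is left. The class name LineCounter, the optional command-line file name and the in-memory fallback text are assumptions made for this illustration; Java 8 or later is assumed.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class LineCounter {
    // Steps 3 to 5: layer a BufferedReader over a low-level Reader and loop.
    static int countLines(Reader lowLevel) {
        int count = 0;
        try (BufferedReader in = new BufferedReader(lowLevel)) { // high-level over low-level
            while (in.readLine() != null) {                      // null means nothing left to read
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        try {
            // Steps 1 and 2: select and instantiate the class that fits the source.
            Reader source = args.length > 0
                    ? new FileReader(args[0])               // file named on the command line
                    : new StringReader("one\ntwo\nthree");  // in-memory stand-in
            System.out.println(countLines(source) + " lines");
        } catch (FileNotFoundException fnfe) {
            fnfe.printStackTrace();
        }
    }
}
```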

Note as a rule the methods will require IOException to be
caught. You will also notice in some of the following examples
FileNotFoundException is caught. This exception may arise
when opening a stream on a system file.

FileNotFoundException is a subclass of IOException, so catching
IOException suffices for catching the FileNotFoundException.

Example

try{
        FileInputStream fis=new FileInputStream("HT.html");
       // other stuff
        }
catch(IOException io){
        io.printStackTrace( );
        }

Opening an Input Stream

The following example shows the role performed by the stream
classes. In practice, the action of streaming data is accomplished
by instantiating a stream class on a source or destination then
calling a read or write method. Here a FileInputStream is being
set up to get bytes from a stored file.

Example

Computer memory <-- FileInputStream ( read ) --< File bytes

The first part in the coding sets up the 'pipe' and gets it ready to
transfer data. This is the instantiation of the stream class.

Example

FileInputStream fis = new FileInputStream("disk_file.txt");
// ready to call a read or write method on the reference

Once created, the read methods can be called on the stream
object reference to stream data from the file into computer
memory available to the program in the form of some Java
data type.

Example

int hold= fis.read( );

The following code sample shows the simple case where
one byte is read from the stream. Instantiating FileInputStream
requires that FileNotFoundException be caught. Using the
read( )method requires IOException to be caught. Keeping
catch clauses for both exception types is good form in that
it provides an opportunity for a fine-tuned response for each
error condition.

Code For Reading a Character & the EOF Symbol

Create, in the same directory that the following code will be
compiled into, a text file named Z.txt with a single letter in it such
as "A" (without quote marks). We will read this one character and
then attempt to read another. Since there are no others, we
will instead read the EOF symbol which is the int value, -1.

Code Sample

// needs a file named Z.txt with a single character in it, here an 'A'

import java.io.*;

class ReadOne{
    public static void main(String[] args){
        int oneCharacter=0;
        int EOFSymbol=0;
        try{
            FileInputStream fis=new FileInputStream("Z.txt");
            oneCharacter=fis.read( );   // the single byte stored in the file
            EOFSymbol=fis.read( );      // nothing left, so -1 is returned
            fis.close( );               // release the file descriptor
        }
        catch(FileNotFoundException fnfe){
            fnfe.printStackTrace();
        }
        catch(IOException io){
            io.printStackTrace();
        }
        System.out.println((char)oneCharacter);
        System.out.println(EOFSymbol);
    }
}

OUTPUT

> java ReadOne
A
-1

Opening an Output Stream

The FileOutputStream class is one of a few classes in the IO package
that will automatically create a file. If the file exists it will overwrite it.
The process of outputting data is essentially similar but the reverse
of reading data from a file. Here a FileOutputStream is being set
up to write data from a program to file for storage.

Example

From program -->
            via FileOutputStream
                                         --> bytes
                                                   --> to File

The instantiation that represents setting up the stream
class in the above process is as follows.

Example

FileOutputStream fos = new FileOutputStream("disk_file.txt");

Once created, the object's write methods can be called
to send bytes to file. The following lines represent the key
code that outputs to a file. The file T.txt will be created
containing the characters 'X' and 'Y'.

Example

FileOutputStream fos=new FileOutputStream("T.txt");
int i='X';
fos.write(i);
byte b='Y';
fos.write(b);

After running the following code do a directory check and
observe a new file, "T.txt" has been created with the letters
X and Y as contents.

Complete Code Sample Showing Characters Output to File

import java.io.*;

class WriteTwo
{
    public static void main(String[] args)
    {
        try {
            FileOutputStream fos=new FileOutputStream("T.txt");
            int i='X';
            fos.write(i);
            byte b='Y';
            // byte works as it is promoted to an int
            fos.write(b);
            fos.close( );   // release the file once both bytes are written
        }
        catch(FileNotFoundException fnfe)
        {
            fnfe.printStackTrace( );
        }
        catch(IOException io)
        {
            io.printStackTrace();
        }
    }
}

// to demonstrate the file is overwritten, change the values
// that are sent to file, recompile, rerun and read file.
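The overwrite behaviour can also be checked in code rather than by inspecting the file. The sketch below opens the same temporary file twice and compares the default overwrite mode with the append mode selected by FileOutputStream's boolean constructor argument. The class name AppendDemo and the temporary file name are assumptions for this example, and Java 8 or later is assumed.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class AppendDemo {
    // Opens the same temporary file twice, writing one byte each time,
    // and returns the resulting file length.
    static long lengthAfterTwoWrites(boolean append) {
        try {
            File file = File.createTempFile("append-demo", ".txt");
            file.deleteOnExit();
            for (int i = 0; i < 2; i++) {
                try (FileOutputStream fos = new FileOutputStream(file, append)) {
                    fos.write('X');
                }
            }
            return file.length();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("overwrite: " + lengthAfterTwoWrites(false) + " byte(s)");
        System.out.println("append:    " + lengthAfterTwoWrites(true) + " byte(s)");
    }
}
```

With append set to false the second open truncates the file, so one byte remains; with append set to true both bytes are kept.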

Opening a High Order Stream on a Low Order Stream

Low order streams open directly on sources or destinations.
High order streams are designed to supply higher functionality.
For instance a DataInputStream has methods that can extract
different data types from the raw bytes of a file.

Partial List of DataInputStream Methods

readByte( )	readChar ( )
readShort( )	readInt( )
readLong()	readFloat( )
readDouble( )	readUTF(DataInput in)
readUnsignedByte( )	readUnsignedShort( )
readUTF( )	readFully(byte[] b)

// DataInputStream's readLine( ) is deprecated as it does not properly convert bytes to characters

The following shows the idea behind layering. First an
input example is shown.

Example

ints, doubles etc <-- DataInputStream <-- bytes
<-- FileInputStream <-- bytes in a disk file

Following is how the layering is set up in Java. Notice the
DataInputStream is opened on the FileInputStream object.

Example

FileInputStream fis = new FileInputStream("disk_file.txt");
DataInputStream dis = new DataInputStream(fis);

Here the layered streams show a process of outputting specific
Java types via the DataOutputStream to a FileOutputStream
which sends them to a disk file as raw bytes.

Example

ints, doubles etc --> DataOutputStream--->bytes
--> FileOutputStream -->bytes in a disk file

Following is how the layering is represented in java. Here
the DataOutputStream is opened on the FileOutputStream
object.

Example

FileOutputStream fos = new FileOutputStream("disk_file.txt");
DataOutputStream dos = new DataOutputStream(fos);

Following are two concrete examples of writing primitive types to
and reading them from a file. Notice there is an append flag that can be
used with FileOutputStream writes, which we examine shortly in this note.

Writing Primitive Types Example

import java.io.*;

class WriteTypes
{
    public static void main(String[] args)
    {
        boolean bee = true;
        byte    bite =7;
        char    car='X';
        int     i =1234567;

        try
        {
            FileOutputStream fos= new FileOutputStream("Data.txt");
            DataOutputStream dos= new DataOutputStream(fos);

            dos.writeBoolean(bee);
            dos.writeByte(bite);
            dos.writeChar(car);
            dos.writeInt(i);

            dos.flush();
            dos.close();
            fos.flush();
            fos.close();
        }
        catch(IOException io)
        {
            System.out.println(io);
        }
    }
}

The following code reads back the primitives that were
stored in the Data.txt file created above.

Read Primitive Types Example

import java.io.*;

class ReadTypes
{
    public static void main(String[] args)
    {
        boolean boo;
        byte    by;
        char    car;
        int     i;

        try
        {
            FileInputStream fis= new FileInputStream("Data.txt");
            DataInputStream dis= new DataInputStream(fis);

            boo = dis.readBoolean();
            System.out.println("boolean: " + boo);

            by = dis.readByte();
            System.out.println("byte: " + by);

            car = dis.readChar();
            System.out.println("char: " + car);

            i = dis.readInt();
            System.out.println("int: " + i);

            // use flush on output buffers not incoming

            dis.close();
            fis.close();
        }
        catch(IOException io)
        {
            System.out.println(io);
        }
    }
}
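The same round trip can be tried without touching the disk, which also makes the stored size visible. This sketch swaps the file streams for ByteArrayOutputStream and ByteArrayInputStream; the class name DataRoundTrip is hypothetical and Java 8 or later is assumed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class DataRoundTrip {
    // Writes a boolean, byte, char and int, as WriteTypes does, but into memory.
    static byte[] write(boolean bee, byte bite, char car, int i) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(bytes)) {
            dos.writeBoolean(bee);
            dos.writeByte(bite);
            dos.writeChar(car);
            dos.writeInt(i);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the values back in the same order they were written.
    static String read(byte[] raw) {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(raw))) {
            return dis.readBoolean() + " " + dis.readByte() + " "
                    + dis.readChar() + " " + dis.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = write(true, (byte) 7, 'X', 1234567);
        System.out.println(raw.length + " bytes: " + read(raw));
    }
}
```

The four values occupy 1 + 1 + 2 + 4 = 8 bytes, and they must be read back in the order they were written because the stream stores no type information.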

System Streams

There are three streams that are automatically opened
when a program is started.

1) System.in // reads bytes from the keyboard
2) System.out // writes bytes to the screen, famous for System.out.println( )
3) System.err // separate output to screen to report errors

We take advantage of System.out frequently whenever we use
the statement System.out.println( ). Here we are using a static
PrintStream object defined in the System class. PrintStream
has a number of print( ) and println( ) methods that are available
for us to use to print to console.

Reading From System.in

System.in may serve some useful purposes. Unfortunately it is
a static instance of InputStream that doesn't have a convenient
'readString( )' method defined. The following formulation uses the
BufferedReader class's readLine( ) method.

Before it can be used however the 8-bit stream that InputStream
generates must be converted to a 16-bit character stream. This
function is provided by the special stream conversion class,
InputStreamReader. A BufferedReader object can be layered on
an InputStreamReader object which in turn takes the System.in
object. The following shows this construction.

Example

BufferedReader in=
new BufferedReader(new InputStreamReader(System.in));

The following code shows how BufferedReader's readLine( ) method
may be used. The program prompts the user to input a line. When
the carriage return is entered the line is sent to file and the program
exits.

Notice in the following code, the FileOutputStream constructor
form includes a boolean parameter. This decides if the target file
is written over or if the output is appended to the file's contents.

Code Example

import java.io.*;

public class ToBlog{
public static void main(String[]args){
   String line;
    // need to catch IOException
    PrintWriter out=null;
    try{
        out=new PrintWriter(new FileOutputStream("Blog",true));
         // boolean decides if file is appended or written over
        }
     catch(FileNotFoundException f){
     System.out.println("file not found");
     System.exit(0);
     }
     BufferedReader in=new BufferedReader
      (new InputStreamReader(System.in));

      System.out.println
      ("Type in a line of any length. Carriage return ends it.");
      try{
          line=in.readLine( );
          out.println(line);
          out.close();
          in.close();
          }
       catch(IOException io){
          System.out.println("IOException");
          }
       }
}

Setting in, out, and err streams

Each of System.in, 'out' and 'err' have an accompanying
set method that permits redirecting the output to another
stream. These methods are setIn( ), setOut( ) and setErr( ).

Setting Out

The following example shows output being set to deliver
to a file via a FileOutputStream class.

Example

import java.io.*;

class SetOut{
public static void main(String[]args){
try{
System.out.println
("You won't see screen output after this line as it is redirected to file." );

FileOutputStream fs1 = new FileOutputStream("log.txt",true);
System.setOut(new PrintStream(fs1));
}
catch(FileNotFoundException nf)
      {
      nf.printStackTrace();
      }

// this is the line that gets redirected
System.out.println
("Hello There! System.out. has been redirected to the log.txt file");
}
}
// check your current directory to find the log.txt file
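A redirected stream stays redirected until it is set again, so it is often useful to keep a reference to the original. The following sketch redirects System.out into memory, runs a task, then restores the console stream. The class name CaptureOut and its capture( ) helper are hypothetical and not part of the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class CaptureOut {
    // Runs the task with System.out redirected to memory, then restores the original stream.
    static String capture(Runnable task) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        System.setOut(new PrintStream(buffer, true));
        try {
            task.run();
        } finally {
            System.out.flush();
            System.setOut(original);   // put the console stream back
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String seen = capture(new Runnable() {
            public void run() {
                System.out.print("redirected text");
            }
        });
        System.out.println("captured: " + seen);
    }
}
```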

Setting In

The next example shows the input stream being set to
receive data from a file. To read this stream out, a
BufferedReader is built on an InputStreamReader which
takes System.in as an argument. The BufferedReader
class has the very useful readLine( ) method.

This method considers a line anything that is terminated by
a line feed ('\n'), a carriage return ('\r'), or a combination
of a carriage return followed directly by a linefeed. If there
are no more lines the method returns the null value. This
can be conveniently used in a while loop to end reading
when a file has no more lines to read.
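The three line terminators can be seen at work on an in-memory string. In this sketch a StringReader replaces the file, and the class name LineSplitDemo and the sample text are assumptions made for the example; Java 8 or later is assumed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineSplitDemo {
    // Collects every line readLine( ) returns until it reports null.
    static List<String> lines(String text) {
        List<String> result = new ArrayList<String>();
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = in.readLine()) != null) {
                result.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // one line feed, one carriage return + line feed pair, one bare carriage return
        System.out.println(lines("a\nb\r\nc\rd"));
    }
}
```

All three terminators are treated alike, so the sample text yields the four lines a, b, c and d, and the terminators themselves are not part of the returned strings.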

The following example shows a file read using System's
setIn( )method and the readLine( )method of the
BufferedReader class.

Example

import java.io.*;

class SetIn{
public static void main(String[]args){
String line;

try{
    FileInputStream fi = new FileInputStream("Read.txt");
    System.setIn(fi);
    BufferedReader in=new BufferedReader
    (new InputStreamReader(System.in));
// while in.readLine( ) returns not null continue
    while( (line=in.readLine())!=null){
      System.out.println(line);
     }
       in.close( );
    }
catch(FileNotFoundException io){
    System.out.println("File Not Found");
    }
catch(IOException io){
    System.out.println("IO Exception: ");
    }

}
}

Self Test

1) Basic Multilingual Plane 10646 is synonymous with which of the following?

a) ASCII
b) ISO-8859-1
c) UCS-2
d) UCS-4

2 ) Which of the following is a 16-bit stream

a) ASCII
b) ISO-8859-1
c) Unicode
d) UCS-4

3 ) The primary reason the io package was made bigger was

a) to add new functionality to the package
b) to overcome problems that arose in translating some of the character sets
c) to make it easier to use
d) to accommodate new developments in character set technology.

4) Pick the incorrect statement. IO classes get their names from

a) whether they handle byte or String type.
b) the width of the streams they handle, one or two bytes
c) where the stream is being sent or taken from or what they do
d) what direction the stream is going, whether being read or written.

5) Which of the following statements is not correct?

a) InputStream, OutputStream, Reader & Writer are all abstract classes.
b) Generally the exception that must be caught when using io classes is the
     IOException.
c) Both 8-bit and 16-bit abstract io superclasses have methods defined to
    read and write arrays.
d) 16-bit abstract io superclasses have methods defined to read and write
     String type.

6) True or False: System.out and System.in represent static instances of
different classes.

Exercise

1) Adapt the code in the System.in section of these notes to create a
command line program that queries an individual for his or her name,
phone number and e-mail address. This could be concatenated
together as a String. Use BufferedReader's readLine( ) method to
accept input.

Finally save the information to a file. (There is a constructor of
FileWriter that allows data to be appended to the file. )

Example

public FileWriter(String fileName, boolean append)
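
As a hint for the saving step only, here is a minimal sketch of
the append constructor in use. The class name AppendDemo, the
helper appendLine( ) and the file name contacts.txt are all
assumptions made for illustration; the exercise does not
prescribe them.

```java
import java.io.FileWriter;
import java.io.IOException;

// Hypothetical demo class, not a solution to the whole exercise
class AppendDemo {

    // Adds one line to the end of the named file, creating the file if needed
    static void appendLine(String fileName, String text) throws IOException {
        FileWriter out = new FileWriter(fileName, true); // true means append
        try {
            out.write(text);
            out.write(System.getProperty("line.separator"));
        } finally {
            out.close(); // close even if a write fails
        }
    }

    public static void main(String[] args) throws IOException {
        // contacts.txt is an assumed file name
        appendLine("contacts.txt", "first entry");
        appendLine("contacts.txt", "second entry");
    }
}
```

Running the program twice leaves four lines in the file. With
false (or the one-argument constructor) each run would start the
file over again.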

Optional

// if you have done some Swing recently you might like to do this

2) Create a JFrame with a JMenu that has Open and Save menu items.
Add listeners to these menu items whose action methods read a String
from a file or write a String to a file. Use a JTextArea to display the
text read from the file and to supply the text that is saved.

Encoding	Description	Width	Notes
ASCII	American Standard Code for Information Interchange	7 bits [1 byte]	128 mostly readable characters
ISO 8859-1	256-character ISO code	8 bits [1 byte]	adds many non-English characters
Unicode	synonymous with UCS-2	16 bits [2 bytes]	most of the world's characters
UCS-2	Universal Character Set two-byte encoding	16 bits [2 bytes]	first plane of ISO/IEC 10646 in two bytes (0 to 64K)
UCS-4	Universal Character Set four-byte encoding	32 bits [4 bytes]	full ISO/IEC 10646 implementation in four bytes (2^31 characters)
UTF-8 *	UCS Transformation Format, versatile but complex	[1 to 6 bytes]	if the first bit is 0, 1 byte (ASCII); if the first bits are 110, 2 bytes; if the first bits are 1110, 3 bytes; etc.
UTF-16	extended variant of UCS-2	[2 to 4 bytes]
binary	data transfer in numeric form	[1 to 8 bytes]	binary version of Java chars
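
The widths in the table above can be observed from a Java
program. The following is a minimal sketch; the class name
EncodingWidths and the helper byteCount( ) are assumptions made
for illustration. UTF-16BE is used rather than UTF-16 so that no
byte order mark is added to the count.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of the course code
class EncodingWidths {

    // Returns how many bytes the text occupies in the named encoding
    static int byteCount(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // four characters, the last is e-acute (U+00E9)
        String[] names = { "ISO-8859-1", "UTF-8", "UTF-16BE" };
        for (String name : names) {
            System.out.println(name + ": " + byteCount(text, name) + " bytes");
        }
        // ISO-8859-1: 4 bytes
        // UTF-8: 5 bytes
        // UTF-16BE: 8 bytes
    }
}
```

The accented character costs one byte in ISO 8859-1, two in
UTF-8, and two (like every other character here) in UTF-16.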

Bits	Hex Min	Hex Max	UTF-8 Binary Encoding
7	00000000	0000007F	0xxxxxxx
11	00000080	000007FF	110xxxxx 10xxxxxx
16	00000800	0000FFFF	1110xxxx 10xxxxxx 10xxxxxx
21	00010000	001FFFFF	11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
26	00200000	03FFFFFF	111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
31	04000000	7FFFFFFF	1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
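
The bit patterns in the table above can be applied by hand. The
following is a minimal sketch that covers only the first four
rows (values up to 1FFFFF hex); the class name Utf8ByHand and its
methods are assumptions made for illustration. In practice
String's getBytes( ) method does this work.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo class, not part of the course code
class Utf8ByHand {

    // Encodes one character value using the 1 to 4 byte rows of the table
    static byte[] encode(int cp) {
        if (cp < 0 || cp > 0x1FFFFF) {
            throw new IllegalArgumentException("outside the first four rows: " + cp);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (cp < 0x80) {                       // 0xxxxxxx
            out.write(cp);
        } else if (cp < 0x800) {               // 110xxxxx 10xxxxxx
            out.write(0xC0 | (cp >> 6));
            out.write(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 1110xxxx 10xxxxxx 10xxxxxx
            out.write(0xE0 | (cp >> 12));
            out.write(0x80 | ((cp >> 6) & 0x3F));
            out.write(0x80 | (cp & 0x3F));
        } else {                               // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out.write(0xF0 | (cp >> 18));
            out.write(0x80 | ((cp >> 12) & 0x3F));
            out.write(0x80 | ((cp >> 6) & 0x3F));
            out.write(0x80 | (cp & 0x3F));
        }
        return out.toByteArray();
    }

    // Formats the bytes as two-digit hex values separated by spaces
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] samples = { 0x41, 0xE9, 0x20AC, 0x1F600 };
        for (int cp : samples) {
            System.out.println(String.format("U+%04X -> %s", cp, toHex(encode(cp))));
        }
        // U+0041 -> 41
        // U+00E9 -> C3 A9
        // U+20AC -> E2 82 AC
        // U+1F600 -> F0 9F 98 80
    }
}
```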

Methods of InputStream
abstract int read( )	reads one byte from a source returning the value in the low-order 8 bits of an int type
int read(byte[ ] dest)	reads bytes from source into dest array, in this case returning an int value describing the number of bytes read
int read(byte[ ] dest, int offset, int length)	reads length bytes into dest array, starting at offset. All three forms of read return -1 when no more data is available
void close( )	releases system resources associated with the source i.e. the file descriptor
int available( )	returns the number of bytes that can be read or skipped from the given input stream without blocking
long skip(long nbytes)	attempts to skip and discard nbytes, returns the number actually skipped
boolean markSupported( )*	returns true if mark/reset mechanism is supported
void mark( int readlimit )*	sets a mark in the input stream
void reset( )	resets the stream to repeat the read from mark
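
Several of the InputStream methods listed above can be tried on
an in-memory stream. The following is a minimal sketch; the class
name InputStreamDemo and the helper countBytes( ) are assumptions
made for illustration, and a ByteArrayInputStream is used because
it supports mark and reset.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of the course code
class InputStreamDemo {

    // Counts the remaining bytes by reading into a small array until read( ) returns -1
    static int countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[4];
        int total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("abcdefghij".getBytes("US-ASCII"));
        System.out.println(in.available());       // 10
        if (in.markSupported()) {
            in.mark(16);                          // remember the current position
            System.out.println((char) in.read()); // a
            in.skip(2);                           // discard b and c
            System.out.println((char) in.read()); // d
            in.reset();                           // go back to the mark
        }
        System.out.println(countBytes(in));       // 10
        in.close();
    }
}
```

Because reset( ) returns the stream to the marked position, the
final count covers all ten bytes again.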

Methods of OutputStream
abstract void write(int b)	writes the byte in the low-order 8 bits of b, discarding the high 24
void write(byte b[ ])	writes an array of bytes, b
void write(byte b[ ], int offset, int length)	writes an array subset, b, from offset, length bytes long
void flush( )	writes out any bytes which may have been buffered
void close( )	releases system resources associated with the stream
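
The three forms of write in the OutputStream table above can be
seen at work on an in-memory stream. The following is a minimal
sketch; the class name OutputStreamDemo and the method
writeSample( ) are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical demo class, not part of the course code
class OutputStreamDemo {

    // Uses each form of write( ) and returns the bytes that were written
    static byte[] writeSample() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStream out = sink;
        out.write(0x141);          // only the low-order byte, 0x41 ('A'), is written
        byte[] data = { 'w', 'x', 'y', 'z' };
        out.write(data);           // the whole array
        out.write(data, 1, 2);     // a subset: 'x' and 'y'
        out.flush();
        out.close();
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(new String(writeSample(), "US-ASCII")); // Awxyzxy
    }
}
```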

Methods of Reader
int read( )	reads one character from source, returned in the low-order 16 bits of an int
int read(char[] dest)	reads characters from source into dest array, returns the number read
abstract int read(char[] dest, int offset, int length)	reads length chars into array dest beginning at offset. All three forms of read return -1 when no more data is available
abstract void close( )	releases system resources associated with source i.e. the file descriptor
long skip(long nchars)	attempts to skip and discard nchars, returns the number actually skipped
boolean markSupported( )*	returns true if mark/reset mechanism is supported
void mark(int readlimit)*	sets a mark in the input stream
void reset( )	resets the stream to repeat the read from mark
boolean ready( )	returns true if stream has data immediately available (so read( ) won't block)

Methods of Writer
void write(int c)	writes char in low-order 16 bits of argument
void write(char[] c)	writes an array of characters
abstract void write(char[] c, int offset, int length)	writes a subset of an array of characters
void write(String s)	writes a string
void write(String s, int offset, int length)	writes a subset of String s of given length from offset
abstract void flush( )*	writes out any characters the stream has buffered
abstract void close( )	releases system resources associated with the stream
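
The character stream methods in the Reader and Writer tables
above can be tried without touching a file. The following is a
minimal sketch; the class name CharStreamDemo and its methods are
assumptions made for illustration, and StringWriter and
StringReader stand in for a real destination and source.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical demo class, not part of the course code
class CharStreamDemo {

    // Builds a String using four of the write( ) forms
    static String writeGreeting() throws IOException {
        Writer out = new StringWriter();
        out.write('H');                  // write(int c)
        out.write("ello ");              // write(String s)
        out.write("the World!", 4, 5);   // write(String s, int offset, int length) writes "World"
        out.write(new char[] { '!' });   // write(char[] c)
        out.flush();
        out.close();
        return out.toString();
    }

    // Reads everything from the Reader, five chars at a time, until read( ) returns -1
    static String readAll(Reader in) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[5];
        int n;
        while ((n = in.read(buffer)) != -1) {
            text.append(buffer, 0, n);
        }
        in.close();
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String greeting = writeGreeting();
        System.out.println(readAll(new StringReader(greeting))); // Hello World!
    }
}
```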