Joshua
open source statistical hierarchical phrase-based machine translation system
joshua.util.io.LineReader Class Reference

Public Member Functions

 LineReader (String filename) throws IOException
 LineReader (InputStream in)
 LineReader (BufferedReader reader)
void close () throws IOException
boolean ready () throws IOException
String readLine () throws IOException
Iterator< String > iterator ()
boolean hasNext ()
String next () throws NoSuchElementException
void remove () throws UnsupportedOperationException
int countLines () throws IOException

Static Public Member Functions

static final InputStream getInputStream (String filename) throws IOException
static void main (String[] args)

Protected Member Functions

void finalize () throws Throwable

Private Attributes

BufferedReader reader
String buffer
IOException error

Static Private Attributes

static final Charset FILE_ENCODING = Charset.forName("UTF-8")

Detailed Description

This class provides an Iterator interface to a BufferedReader. This covers the most common use-cases for reading from files without ugly code to check whether we got a line or not.
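The idea of putting an Iterator interface on top of a BufferedReader can be sketched roughly as follows. This is not the Joshua source: the class name LineIteratorSketch and the helper linesOf are made up for illustration, and the field names merely mirror the private attributes listed above. It assumes one line of look-ahead and an I/O error that is held back until close is called.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for LineReader; not the Joshua implementation.
final class LineIteratorSketch implements Iterable<String>, Iterator<String> {
    private final BufferedReader reader;
    private String buffer;      // one line of look-ahead, or null if nothing is buffered
    private IOException error;  // first I/O error seen during iteration, if any

    LineIteratorSketch(BufferedReader reader) {
        this.reader = reader;
    }

    @Override
    public Iterator<String> iterator() {
        return this; // single pass: the object is its own iterator
    }

    @Override
    public boolean hasNext() {
        if (buffer != null) return true;
        if (error != null) return false;
        try {
            buffer = reader.readLine(); // null at end of input
        } catch (IOException e) {
            error = e; // remember the error instead of throwing from hasNext()
        }
        return buffer != null;
    }

    @Override
    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        String line = buffer;
        buffer = null;
        return line;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }

    // Closes the underlying reader and rethrows any error deferred during iteration.
    void close() throws IOException {
        IOException deferred = error;
        error = null;
        try {
            reader.close();
        } catch (IOException e) {
            if (deferred == null) deferred = e;
        }
        if (deferred != null) throw deferred;
    }

    // Collects every line of a string, to show the for-each loop in use.
    static List<String> linesOf(String text) {
        LineIteratorSketch lines = new LineIteratorSketch(new BufferedReader(new StringReader(text)));
        List<String> out = new ArrayList<>();
        try {
            try {
                for (String line : lines) out.add(line);
            } finally {
                lines.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

The for-each loop in linesOf is the use-case the description refers to: the caller never tests a returned line against null.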

Author:
wren ng thornton wren@users.sourceforge.net
Version:
LastChangedDate: 2009-03-26 15:06:57 -0400 (Thu, 26 Mar 2009)

Constructor & Destructor Documentation

joshua.util.io.LineReader.LineReader ( String  filename) throws IOException

Opens a file for iterating line by line. If the file name ends in ".gz" then we automatically open it with GZIP. File encoding is assumed to be UTF-8.

Parameters:
filename  the file to be opened
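One plausible way to get the ".gz" detection and UTF-8 decoding described above, using only the standard library, is sketched here. The class GzipAwareOpenSketch and its two methods are hypothetical names, and the suffix check is an assumption about how the detection might be done rather than a copy of the Joshua code.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical helper; not part of Joshua.
final class GzipAwareOpenSketch {
    // Opens a file, decompressing transparently when the name ends in ".gz".
    static InputStream open(String filename) throws IOException {
        InputStream in = new FileInputStream(filename);
        if (!filename.endsWith(".gz")) return in;
        try {
            return new GZIPInputStream(in); // reads and validates the gzip header
        } catch (IOException e) {
            in.close(); // do not leak the file handle if the header is invalid
            throw e;
        }
    }

    // Wraps the stream in a reader that decodes the bytes as UTF-8.
    static BufferedReader openReader(String filename) throws IOException {
        return new BufferedReader(new InputStreamReader(open(filename), StandardCharsets.UTF_8));
    }
}
```

Checking only the file name keeps the logic simple, at the cost of misreading a gzipped file that lacks the ".gz" suffix.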

joshua.util.io.LineReader.LineReader ( InputStream  in)

Wraps an InputStream for iterating line by line. Stream encoding is assumed to be UTF-8.

joshua.util.io.LineReader.LineReader ( BufferedReader  reader)

Uses a BufferedReader for iterating line by line.


Member Function Documentation

void joshua.util.io.LineReader.close ( ) throws IOException

This method closes the file handle and raises any exceptions that occurred during iteration. The method is idempotent, and all calls after the first are no-ops (unless the thread was interrupted or killed). For correctness, you must call this method before the object falls out of scope.

int joshua.util.io.LineReader.countLines ( ) throws IOException

Iterates over all lines, ignoring their contents, and returns the count of lines. If some lines have already been read, this will return the count of remaining lines. Because no lines will remain after calling this method, we implicitly call close.

Returns:
the number of lines read
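The "count of remaining lines" behaviour can be illustrated with a plain BufferedReader. In this sketch the class CountLinesSketch and its methods are hypothetical; it only assumes that counting drains the reader and then closes it, as the description states.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration; not the Joshua implementation.
final class CountLinesSketch {
    // Counts the lines not yet consumed from the reader, then closes it.
    static int countRemaining(BufferedReader reader) throws IOException {
        try {
            int count = 0;
            while (reader.readLine() != null) count++;
            return count;
        } finally {
            reader.close(); // nothing is left to read, so release the handle
        }
    }

    // Reads one of three lines first, so only two remain to be counted.
    static int demo() {
        try {
            BufferedReader reader = new BufferedReader(new StringReader("one\ntwo\nthree\n"));
            reader.readLine(); // consume the first line
            return countRemaining(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```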

void joshua.util.io.LineReader.finalize ( ) throws Throwable [protected]

We attempt to avoid leaking file descriptors if you fail to call close before the object falls out of scope. However, the language spec makes no guarantees about timeliness of garbage collection. It is a bug to rely on this method to release the resources. Also, the garbage collector will discard any exceptions that have queued up, without notifying the application in any way.

Having a finalizer means the JVM can't do "fast allocation" of LineReader objects (or subclasses). This isn't too important due to disk latency, but may be worth noting.

See also:
Performance Tips
Techniques
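Because finalization is not timely, callers are better off releasing the reader explicitly. The sketch below shows the try/finally shape with a plain BufferedReader; the class ExplicitCloseSketch and its methods are hypothetical, and the anonymous subclass exists only to observe that close was called.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration of deterministic cleanup; not part of Joshua.
final class ExplicitCloseSketch {
    // Reads the first line and always closes the reader, even if reading throws.
    static String firstLine(BufferedReader reader) throws IOException {
        try {
            return reader.readLine();
        } finally {
            reader.close(); // released here, with no reliance on finalize()
        }
    }

    // Reports whether firstLine closed the reader it was given.
    static boolean closesOnExit() {
        final boolean[] closed = {false};
        BufferedReader reader = new BufferedReader(new StringReader("x\n")) {
            @Override
            public void close() throws IOException {
                closed[0] = true;
                super.close();
            }
        };
        try {
            firstLine(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return closed[0];
    }
}
```

Unlike the finalizer, this path also lets an exception raised during close reach the caller.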

static final InputStream joshua.util.io.LineReader.getInputStream ( String  filename) throws IOException [static]

Returns an InputStream for a filename, using Joshua's canonical means for interpreting that name (e.g. detecting gzipped files). This is used by the LineReader constructor that accepts a String argument.

Deprecated:
This method is provided in order for joshua.decoder.DecoderThread to open files in the canonical way for handing off to joshua.decoder.segment_file.SegmentFileParser. The SegmentFileParser interface can't be made more liberal (e.g. to accept a java.io.Reader) because javax.xml.parsers.SAXParser can't parse that argument and no common java.io.Reader gives access to the underlying InputStream. This method is considered a hack which should be removed once a better solution presents itself.

boolean joshua.util.io.LineReader.hasNext ( )

Returns true if the iteration has more elements. (In other words, returns true if next would return an element rather than throwing an exception.)

Iterator<String> joshua.util.io.LineReader.iterator ( )

Return self as an iterator.

static void joshua.util.io.LineReader.main ( String[]  args) [static]

Example usage code.

String joshua.util.io.LineReader.next ( ) throws NoSuchElementException

Returns the next line of the file. If there is no line to be read, or if an error is encountered while reading, NoSuchElementException is thrown. In the error case, the actual IOException encountered is thrown later, when the LineReader is closed.

String joshua.util.io.LineReader.readLine ( ) throws IOException

This method is like next() except that it throws the IOException directly. If there are no lines to be read then null is returned.

boolean joshua.util.io.LineReader.ready ( ) throws IOException

Determine if the reader is ready to read a line.

void joshua.util.io.LineReader.remove ( ) throws UnsupportedOperationException

Unsupported.


Member Data Documentation

IOException joshua.util.io.LineReader.error [private]
final Charset joshua.util.io.LineReader.FILE_ENCODING = Charset.forName("UTF-8") [static, private]
BufferedReader joshua.util.io.LineReader.reader [private]