Dynamic string substitution in reader streams
I have a (text) file on disk that I need to read into a library with a Reader object.
While reading this file, I want to perform regular expression string substitution on the data.
My current solution is to read the entire file into memory as a String, do String replacement, and then create a StringReader for that String and pass it back to the library as a Reader.
This works, but for large files, especially when running multithreaded, performance is an issue.
What I would like to do is read the file one line at a time, perform the substitution on each line, and transparently hand the result to the consumer of the Reader, but I couldn’t think of how to do that.
Is there a better way to accomplish this task?
I’m using Java 7.
Here’s an example of my current solution: reading from “file”, replacing every “a” with “b”, and passing the resulting stream to the consumer.
public void loadFile(final File file) throws Exception
{
    final Pattern regexPattern = Pattern.compile("a");
    final String replacementString = "b";
    try (BufferedReader cleanedBufferedReader = new BufferedReader(new StringReader(
            replaceInBufferedReader(new BufferedReader(new FileReader(file)), regexPattern, replacementString))))
    {
        new StreamSource(cleanedBufferedReader).doSomething();
    }
}

private static String replaceInBufferedReader(final BufferedReader reader, final Pattern pattern, final String replacement) throws IOException
{
    final StringBuilder builder = new StringBuilder();
    String str;
    while ((str = reader.readLine()) != null)
    {
        builder.append(str).append(System.lineSeparator());
    }
    return pattern.matcher(builder.toString()).replaceAll(replacement);
}
Solution
You just need to subclass BufferedReader and override readLine().
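class MyBufferedReader extends BufferedReader {
    MyBufferedReader(Reader r) {
        super(r);
    }

    // Must be public and declare IOException to match BufferedReader.readLine()
    @Override
    public String readLine() throws IOException {
        String line = super.readLine();
        // perform replacement here (line is null at end of stream)
        return line;
    }
}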
Open your file as usual, but wrap it in your subclass instead of wrapping it in BufferedReader.
try (Reader r = ...;
     BufferedReader br = new MyBufferedReader(r)) {
    String line;
    while ((line = br.readLine()) != null) {
        // use returned line
    }
}
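To make the "perform replacement here" step concrete, here is a minimal sketch that passes a precompiled Pattern and a replacement string to the subclass. The names RegexLineReader and RegexLineReaderDemo, and the constructor parameters, are made up for illustration and are not part of the answer above. Keep in mind that only callers of readLine() see the substituted text: read(...) calls still return the raw characters, and a pattern can never match across a line break.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.regex.Pattern;

// Hypothetical name: a BufferedReader whose readLine() applies a regex replacement.
class RegexLineReader extends BufferedReader {
    private final Pattern pattern;
    private final String replacement;

    RegexLineReader(Reader in, Pattern pattern, String replacement) {
        super(in);
        this.pattern = pattern;
        this.replacement = replacement;
    }

    @Override
    public String readLine() throws IOException {
        String line = super.readLine();
        // null signals end of stream and must be passed through untouched
        return line == null ? null : pattern.matcher(line).replaceAll(replacement);
    }
}

class RegexLineReaderDemo {
    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the file so the sketch runs on its own.
        Reader source = new StringReader("banana\ncabana\n");
        try (BufferedReader br = new RegexLineReader(source, Pattern.compile("a"), "b")) {
            String line;
            while ((line = br.readLine()) != null) {
                System.out.println(line); // prints "bbnbnb", then "cbbbnb"
            }
        }
    }
}
```

This is enough when your own code does the reading. It does not help if the library pulls data through read(char[], int, int), which is the case the Update below addresses.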
Update
Below is a Reader that lets you transform the input stream line by line while still presenting a plain Reader interface to the consumer of the stream.
Internally, the raw stream is wrapped in a BufferedReader and read one line at a time. Any required transformation can be applied to each line as it is read. The translated line is then wrapped in a StringReader. When the consumer of the stream invokes any read(...) operation, the request is directed to that StringReader. If the StringReader runs out of characters, the next line is read from the BufferedReader and translated, so that read(...) can continue to supply input.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

abstract public class TranslatingReader extends Reader {
    private BufferedReader input;
    private StringReader output;

    public TranslatingReader(Reader in) {
        input = new BufferedReader(in);
        output = new StringReader("");
    }

    // Implement this to transform one line (passed without its line terminator)
    abstract public String translate(String line);

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0)
            return 0; // a zero-length request must not be reported as end of stream
        int read = 0;
        while (len > 0) {
            int nchars = output.read(cbuf, off, len);
            if (nchars == -1) {
                // The current line is exhausted: load and translate the next one
                String line = input.readLine();
                if (line == null) {
                    break;
                }
                line = translate(line);
                line += "\n"; // Add the newline which was removed by readLine()
                output = new StringReader(line);
            } else {
                read += nchars;
                off += nchars;
                len -= nchars;
            }
        }
        if (read == 0)
            read = -1;
        return read;
    }

    @Override
    public void close() throws IOException {
        input.close();
        output.close();
    }
}
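For the regex case from the question, the same line-buffering idea can be sketched as a standalone class with the translate step fixed to Pattern.matcher(...).replaceAll(...). The names RegexReplacingReader, RegexReplacingReaderDemo and drain are hypothetical and only serve the example. The demo uses a StringReader in place of the file and a deliberately tiny buffer, so that read(...) has to cross line boundaries several times.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.regex.Pattern;

// Hypothetical class: line-by-line regex replacement behind a plain Reader interface.
class RegexReplacingReader extends Reader {
    private final BufferedReader input;
    private final Pattern pattern;
    private final String replacement;
    private StringReader output = new StringReader("");

    RegexReplacingReader(Reader in, Pattern pattern, String replacement) {
        this.input = new BufferedReader(in);
        this.pattern = pattern;
        this.replacement = replacement;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int total = 0;
        while (len > 0) {
            int n = output.read(cbuf, off, len);
            if (n == -1) {
                // Current line is used up: fetch and transform the next one.
                String line = input.readLine();
                if (line == null) {
                    break; // end of the underlying stream
                }
                // readLine() strips the terminator, so each line gets "\n" appended
                output = new StringReader(pattern.matcher(line).replaceAll(replacement) + "\n");
            } else {
                total += n;
                off += n;
                len -= n;
            }
        }
        return total == 0 ? -1 : total;
    }

    @Override
    public void close() throws IOException {
        input.close();
    }
}

class RegexReplacingReaderDemo {
    // Drains a Reader through read(char[]), the way a library consumer might.
    static String drain(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4]; // tiny on purpose, to force several refills
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Reader source = new StringReader("banana\ncab");
        try (Reader r = new RegexReplacingReader(source, Pattern.compile("a"), "b")) {
            System.out.print(drain(r)); // prints "bbnbnb" and "cbb", each followed by a newline
        }
    }
}
```

Two consequences of working line by line are worth knowing: every line terminator is normalized to "\n" (and a final line without a terminator gains one), and a pattern cannot match text that spans a line break. If either matters for your data, this approach needs adjusting.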