Java – Unexpected StreamTokenizer behavior in Android

I’m having this weird issue: the same code produces different results in native Java than it does in Android.

InputStreamReader reader = new InputStreamReader(in, "UTF-8");
BufferedReader m_reader = new BufferedReader(reader);
StreamTokenizer m_tokenizer = new StreamTokenizer(m_reader);
m_tokenizer.nextToken();
System.out.println(m_tokenizer.toString());
m_tokenizer.nextToken();
System.out.println(m_tokenizer.toString());
m_tokenizer.nextToken();
System.out.println(m_tokenizer.toString());
m_tokenizer.nextToken();
System.out.println(m_tokenizer.toString());
int c = m_reader.read();
System.out.println(c);
m_tokenizer.nextToken();
System.out.println(m_tokenizer.toString());
m_tokenizer.nextToken();
System.out.println(m_tokenizer.toString());

Given the following input stream (read from a file):

(;FF[4]CA[UTF-8]

native Java prints:

Token['('], line 1
Token[';'], line 1
Token[FF], line 1
Token['['], line 1
52
Token[']'], line 1
Token[CA], line 1

As expected. But in Android I get:

Token['('], line 1
Token[';'], line 1
Token[FF], line 1
Token['['], line 1
93
Token[n=4.0], line 1
Token[CA], line 1

Why does it behave differently in Android Java? In Android, the character ‘]’ is somehow taken out of the stream before the tokenizer gets there. I’ve read the Java docs and the Android docs and the classes seem to be the same.

My API level is set to 7. I get the same result on both the Android 2.1 emulator and the Android 4.0 emulator, and also on a real device.

Best Solution

Basically, the Android StreamTokenizer implementation is a mess. From the source code, nextToken() parses the character that was read by the preceding nextToken(), unless it is the first character in the stream. In my case, the ‘[’ character has already been read by the third nextToken(). When the 4th nextToken() is called, the number 4 is read from the stream but ‘[’ is printed. read() then reads ‘]’, which is the next unread character. The 5th nextToken() then prints out ‘4’, which was read in by the 4th nextToken(), and it continues like this. Therefore, given the current implementation, read() and nextToken() cannot be mixed together.
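One way to act on that conclusion is to pull every character through the tokenizer and never call read() on the underlying reader. The following is only a minimal sketch of that idea, not a fix for the Android implementation itself; the class name TokenizerOnly and the helper tokenize are hypothetical names chosen for illustration, and the input is taken from a String rather than a file to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class (not part of the original question or answer).
public class TokenizerOnly {

    // Collects all tokens using nextToken() only, so any read-ahead inside
    // the tokenizer can never get out of step with a separate read() call.
    static List<String> tokenize(String input) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        List<String> tokens = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_WORD:
                        tokens.add(st.sval);                 // e.g. FF, CA
                        break;
                    case StreamTokenizer.TT_NUMBER:
                        tokens.add(String.valueOf(st.nval)); // the 4 arrives here as a number
                        break;
                    default:
                        // Ordinary characters such as '(' ';' '[' ']'
                        tokens.add(String.valueOf((char) st.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Same input as in the question.
        System.out.println(tokenize("(;FF[4]CA[UTF-8]"));
    }
}
```

With this approach the character between the brackets is delivered as a number token by nextToken() instead of being fetched with read(), so the code does not depend on whether the tokenizer has already consumed a character ahead of the current token.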
