Java – How to unit test bidirectional text processing

I’m trying to write unit tests for some string-formatting code. In some cases, the formatted output may contain bidirectional text, that is, a mix of left-to-right (LTR) and right-to-left (RTL) runs.

I’ve verified empirically that when running on an Android device or emulator, the output looks correct for all combinations of LTR and RTL inputs and outputs. I want to capture this in unit tests, though, and I’m not sure how to properly specify the expected output in my test case.

For example, I want to assert that the returned string will be rendered like this:

-د.ك.123,456.78

That is, the glyphs should appear in order from left to right:

- 1 2 3 , 4 5 6 . 7 8 followed by the glyphs of the currency symbol, ending with د at the far right

(You don’t know how hard it is to edit it to the correct sequence in the SO edit box!)

I have tried using the standard string comparison method in my test case as follows:

assertEquals("-د.ك.123,456.78", formattedOutput);

But this fails because the text of the expected string literal in the test code is itself being reordered. In fact, the literal is displayed differently depending on the tool I use to view the source code (Android Studio vs. GitHub in Chrome), so I don’t believe it’s testing the right thing.

I also tried building the expected output value piece by piece to avoid confusion in the editor, although this constructs the same string behind the scenes:

assertEquals("-123,456.78" + SYMBOL, formattedOutput);

How can I compare the visual order of glyphs instead of the logical order? Can BidiFormatter for Android help me?

Solution

A plain assertEquals between that string literal and the result of a method returning an equal String (that is, a new String object) works fine for me in IntelliJ 14.0.3 Ultimate, with the source file’s encoding set to UTF-8.
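As a rough illustration of why such a comparison can work: a Java String stores its characters in logical order, and equals (which assertEquals relies on) compares that order, not the rendered one. Writing the right-to-left characters as Unicode escapes keeps the editor from reordering the literal on screen. In the sketch below, the class name, the format helper, and the assumption that the formatter emits sign, symbol, and amount in that logical order are all hypothetical and not taken from the question's code.

```java
import java.util.Locale;

final class LogicalOrderCheck {
    // Currency symbol as it appears in the question's literal, in logical order:
    // ARABIC LETTER DAL, full stop, ARABIC LETTER KAF, full stop.
    static final String SYMBOL = "\u062F.\u0643.";

    // Hypothetical stand-in for the formatter under test (assumed output order).
    static String format(String sign, String symbol, String amount) {
        return sign + symbol + amount;
    }

    // Lists the code points of a string in logical order, e.g. "U+002D U+062F".
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format(Locale.ROOT, "U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // The escapes make the logical order unambiguous in any editor.
        String expected = "-\u062F.\u0643.123,456.78";
        String actual = format("-", SYMBOL, "123,456.78");
        if (!expected.equals(actual)) {
            throw new AssertionError(codePoints(expected) + " vs " + codePoints(actual));
        }
        System.out.println(codePoints(actual));
    }
}
```

In a JUnit test the check would simply be assertEquals(expected, actual). Dumping the code points on failure also exposes invisible directional marks such as U+200E or U+200F, if the code under test happens to insert any.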

Another approach you can try is to get the bytes of both the expected string and the returned string (explicitly specifying an encoding) and then compare the two byte arrays with assertArrayEquals. I see no reason why comparing the bytes of strings obtained with an explicitly set encoding would not work.
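The byte-level comparison might look roughly like the following. This is a minimal sketch in plain Java so that it runs without a test framework. The class and helper names are made up for illustration, and UTF-8 is an assumed choice of encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ByteComparison {
    // Encodes with an explicit charset so the platform default never matters.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // True when both strings encode to exactly the same byte sequence.
    static boolean sameBytes(String expected, String actual) {
        return Arrays.equals(utf8(expected), utf8(actual));
    }

    public static void main(String[] args) {
        String symbol = "\u062F.\u0643.";
        String expected = "-123,456.78" + symbol;
        // Stand-in for the value returned by the code under test.
        String actual = new StringBuilder("-123,456.78").append(symbol).toString();
        if (!sameBytes(expected, actual)) {
            throw new AssertionError(Arrays.toString(utf8(expected))
                    + " vs " + Arrays.toString(utf8(actual)));
        }
        System.out.println(Arrays.toString(utf8(symbol)));
    }
}
```

With JUnit, the equivalent check is assertArrayEquals(utf8(expected), utf8(actual)). Note that the bytes follow the string's logical character order, so this verifies the same thing as a String comparison; it does not by itself check the visual order.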

If you wish, you can check out this page for inspiration: https://docs.oracle.com/javase/tutorial/i18n/text/string.html
