Java – Removing special characters from Java strings

I’m trying to remove symbols and special characters from raw text in Java, but I can’t find a way to do it. The text comes from a free-text field on a website, so it may contain any characters. I get this text from an external source and have no control over its settings, so I have to handle it on my end.
Some examples:

1) Belem ???? should be -> belem

2) Ariana ???? should be -> Ariana

3) Harlem ???? should be -> Harlem

4) Yz ????️ ???? should be -> Yz

5) ここさけは7回は見に行くぞ ???????? should be -> ここさけは7回は見に行くぞ

6) دمي اcreami وطني اcreami ???????????????????? should be ->

Any help would be appreciated.

Solution

You can try this regular expression, which matches many common emojis in a string (code points U+1F000 to U+1F7FF and U+2600 to U+27FF):

regex = "[\\ud83c\\udc00-\\ud83c\\udfff]|[\\ud83d\\udc00-\\ud83d\\udfff]|[\\u2600-\\u27ff]"

Then use the replaceAll() method to remove every match from the text:

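// "????????" stands in for the emojis in the original input
String text = "ここさけは7回は見に行くぞ ???????? ";
// No spaces between the alternatives: a space inside the pattern would be matched literally
String regex = "[\\ud83c\\udc00-\\ud83c\\udfff]|[\\ud83d\\udc00-\\ud83d\\udfff]|[\\u2600-\\u27ff]";
System.out.println(text.replaceAll(regex, ""));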

Output:

ここさけは7回は見に行くぞ
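
As a possible refinement, the regex approach above can be wrapped in a small helper. The following is only a sketch under stated assumptions: the class name EmojiStripper and the method removeEmojis are hypothetical, the pattern is written with code points (\x{...}) rather than surrogate pairs, and it adds two things the answer's regex does not cover, namely the range U+1F900 to U+1FAFF and the variation selector U+FE0F that often trails an emoji. It also trims the whitespace left at the ends, which is an assumption based on the expected results in the question.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the class and method names are not from the original answer.
final class EmojiStripper {

    // Same ranges as the regex above, written as code points instead of surrogate pairs:
    //   U+1F000 to U+1F7FF and U+2600 to U+27FF.
    // Assumed additions in this sketch:
    //   U+1F900 to U+1FAFF (newer emoji blocks) and U+FE0F (emoji variation selector).
    private static final Pattern EMOJI = Pattern.compile(
            "[\\x{1F000}-\\x{1F7FF}\\x{1F900}-\\x{1FAFF}\\x{2600}-\\x{27FF}\\x{FE0F}]");

    // Removes every matched code point, then trims leftover leading/trailing whitespace.
    static String removeEmojis(String text) {
        return EMOJI.matcher(text).replaceAll("").trim();
    }

    public static void main(String[] args) {
        // Emojis are written as escapes so the source file stays ASCII-only.
        System.out.println(removeEmojis("Harlem \uD83C\uDFB7"));            // prints "Harlem"
        System.out.println(removeEmojis("Yz \u2764\uFE0F \uD83D\uDE0A"));   // prints "Yz"
    }
}
```

Compiling the pattern once avoids recompiling it on every call, which replaceAll() on a String would do. Note that an emoji in the middle of a sentence leaves its surrounding spaces behind, and that these ranges are not an exhaustive list of emoji code points.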
