How do I automatically generate an umlaut table?
I want tables of vowels with diacritics, but I don't want to search the symbol tables manually.
Is it possible to generate such a table from a list of base vowels and a list of diacritics using one of the following: Java, PHP, Wolfram Mathematica, a .NET language, and so on?
The output must consist of Unicode characters.
Java solutions
Unicode has a feature intended for exactly this, normalization: http://en.wikipedia.org/wiki/Unicode_normalization
Java has supported it since 1.6 through java.text.Normalizer: http://docs.oracle.com/javase/6/docs/api/java/text/Normalizer.html
So, the sample code is:
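import java.text.Normalizer;
import javax.swing.JOptionPane;

public class UmlautTable {
    public static void main(String[] args) {
        String vowels = "aeiou";
        // Combining macron, acute, grave, caron
        char[] diacritics = {'\u0304', '\u0301', '\u0300', '\u030C'};
        StringBuilder sb = new StringBuilder();
        for (int v = 0; v < vowels.length(); ++v) {
            for (int d = 0; d < diacritics.length; ++d) {
                // Base vowel followed by the combining mark
                sb.append(vowels.charAt(v));
                sb.append(diacritics[d]);
                sb.append(' ');
            }
            // End each row with the bare vowel
            sb.append(vowels.charAt(v));
            sb.append('\n');
        }
        // NFC merges each vowel + mark pair into one precomposed character
        String ans = Normalizer.normalize(sb.toString(), Normalizer.Form.NFC);
        JOptionPane.showMessageDialog(null, ans);
    }
}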
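That is, we put the combining diacritic right after the vowel letter and then apply NFC normalization to the string, which replaces each vowel + diacritic pair with the precomposed character where Unicode defines one.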
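The same idea can be packaged as a small reusable helper. This is only a sketch: the class name DiacriticTable and the methods compose and table are made up for illustration, and it prints to the console instead of a dialog. It assumes every vowel/diacritic pair has a precomposed form; where none exists, NFC leaves the two code points as they are.

```java
import java.text.Normalizer;

public class DiacriticTable {
    // Compose one base letter with one combining mark. NFC merges the pair
    // into a single precomposed character when Unicode defines one.
    static String compose(char base, char mark) {
        return Normalizer.normalize("" + base + mark, Normalizer.Form.NFC);
    }

    // One row per vowel: the bare vowel, then one tab-separated cell per mark.
    static String table(String vowels, char[] marks) {
        StringBuilder sb = new StringBuilder();
        for (char v : vowels.toCharArray()) {
            sb.append(v);
            for (char m : marks) {
                sb.append('\t').append(compose(v, m));
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Combining macron (U+0304), acute (U+0301), grave (U+0300), caron (U+030C)
        char[] marks = {'\u0304', '\u0301', '\u0300', '\u030C'};
        System.out.print(table("aeiou", marks));
    }
}
```

Checking that compose returns a string of length 1 is a quick way to tell whether a given pair actually has a precomposed character.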
Solution
To be honest, I haven't fully understood what Szabolcs' code is doing, but in this particular case the following seems to produce the same result with slightly less code in Mathematica:
data = Import["http://unicode.org/Public/UNIDATA/NamesList.txt", "Lines"];
codes = Cases[data,
   b_String /; StringMatchQ[
      b, ___ ~~ "LATIN " ~~ "CAPITAL" | "SMALL" ~~ " LETTER " ~~
       "A" | "E" | "I" | "O" | "U" ~~ " WITH " ~~ ___] :>
    FromDigits[StringTake[b, 4], 16], Infinity];
FromCharacterCode[codes]
This produces:
"ÀÁÂÃÄÅÈÉÊËÌÍÎÏÒÓÔÕÖØÙÚÛÜàáâãäåèéêëìíîïòóôõöøùúûüĀāĂ㥹ĒēĔĕĖėĘęĚěĨĩĪīĬ\
ĭĮįİŌōŎŏŐőŨũŪūŬŭŮůŰűŲųƗƟƠơƯưǍǎǏǐǑǒǓǔǕǖǗǘǙǚǛǜǞǟǠǡǪǫǬǭǺǻǾǿȀȁȂȃȄȅȆȇȈȉȊȋȌȍ\
ȎȏȔȕȖȗȦȧȨȩȪȫȬȭȮȯȰȱȺɆɇɨᶏᶒᶖᶙḀḁḔḕḖḗḘḙḚḛḜḝḬḭḮḯṌṍṎṏṐṑṒṓṲṳṴṵṶṷṸṹṺṻẚẠạẢảẤấẦầẨ\
ẩẪẫẬậẮắẰằẲẳẴẵẶặẸẹẺẻẼẽẾếỀềỂểỄễỆệỈỉỊịỌọỎỏỐốỒồỔổỖỗỘộỚớỜờỞởỠỡỢợỤụỦủỨứỪừỬửỮ\
ữỰựⱥⱸⱺꝊꝋꝌꝍ"
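For comparison, a similar name-based filter can be sketched in Java without downloading NamesList.txt. This assumes Java 7 or later, where Character.getName is available; the class name AccentedVowelsByName and the method accentedVowels are hypothetical. The exact result depends on the Unicode version bundled with the JDK, so it may differ slightly from the list above.

```java
import java.util.regex.Pattern;

public class AccentedVowelsByName {
    // Same name shape as the Mathematica pattern:
    // LATIN CAPITAL/SMALL LETTER <vowel> WITH <something>
    static final Pattern NAME = Pattern.compile(
        "LATIN (CAPITAL|SMALL) LETTER [AEIOU] WITH .*");

    // Collect every character in the Basic Multilingual Plane whose
    // Unicode name matches the pattern, in code point order.
    static String accentedVowels() {
        StringBuilder sb = new StringBuilder();
        for (int cp = 0; cp <= 0xFFFF; cp++) {
            String name = Character.getName(cp); // null if unassigned
            if (name != null && NAME.matcher(name).matches()) {
                sb.appendCodePoint(cp);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(accentedVowels());
    }
}
```

Unlike the normalization approach, this finds every accented variant Unicode defines, but it gives no control over which diacritics appear or in what order.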