
How to retrieve the matching macrolanguage locale from a given ISO language code?

Tags:

java

locale

Given an ISO 639-2/T language code of scope individual, how can I programmatically find the matching macrolanguage code, if such a match exists?

For example, how do I go from "nob" (Norwegian Bokmål, scope individual) to "nor" (Norwegian, scope macrolanguage)?

In general, there can be multiple individual languages that are not part of the same macrolanguage in the same country, so grouping by country alone will give false positives.

java.util.Locale knows about ISO 639 three-letter language codes and recognizes both codes in the example above, but it has no concept of scope or macrolanguage.

A heuristic without false positives would also be helpful in my case.

Per Christian Henden asked Dec 06 '25 14:12



1 Answer

You could build your own list of macrolanguages and their corresponding individual languages.

Here's the list: https://iso639-3.sil.org/code_tables/639/data/all?title=&field_iso639_cd_st_mmbrshp_639_1_tid=All&name_3=&field_iso639_element_scope_tid=76&field_iso639_language_type_tid=All&items_per_page=200
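If you would rather not maintain the list by hand, the same mapping can be loaded from a tab-separated file. The following is only a sketch: it assumes a file with a header row and three columns, M_Id (macrolanguage), I_Id (individual language) and I_Status ("A" for active, "R" for retired), which is how I understand the SIL macrolanguage download to be laid out. The class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MacroLanguageTable {

    /**
     * Builds a map from individual language code to macrolanguage code.
     * Assumed row layout (tab-separated): M_Id, I_Id, I_Status.
     */
    public static Map<String, String> parse(Iterable<String> lines) {
        Map<String, String> individualToMacro = new HashMap<>();
        for (String line : lines) {
            String[] columns = line.split("\t");
            // Skip the header row and anything that does not have three columns.
            if (columns.length < 3 || columns[0].trim().equals("M_Id")) {
                continue;
            }
            // Keep only active mappings; retired ones are assumed to be marked "R".
            if (columns[2].trim().equals("A")) {
                individualToMacro.put(columns[1].trim(), columns[0].trim());
            }
        }
        return individualToMacro;
    }

    public static void main(String[] args) {
        // Inline sample rows; a real program could pass Files.readAllLines(path) instead.
        List<String> sample = List.of(
                "M_Id\tI_Id\tI_Status",
                "nor\tnob\tA",
                "nor\tnno\tA");
        System.out.println(parse(sample).get("nob")); // prints "nor"
    }
}
```

Check the column layout against the file you actually download before relying on this.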

Here's a selection I made some time ago:

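// Maps each individual ISO 639-3 language code to the code of its macrolanguage.
// Requires java.util.HashMap and java.util.Map.
public static final Map<String, String> macroLanguages = new HashMap<>();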
static {
    macroLanguages.put("aao", "ara"); //https://iso639-3.sil.org/code/ara
    macroLanguages.put("abh", "ara");
    macroLanguages.put("abv", "ara");
    macroLanguages.put("acm", "ara");
    macroLanguages.put("acq", "ara");
    macroLanguages.put("acw", "ara");
    macroLanguages.put("acx", "ara");
    macroLanguages.put("acy", "ara");
    macroLanguages.put("adf", "ara");
    macroLanguages.put("aeb", "ara");
    macroLanguages.put("aec", "ara");
    macroLanguages.put("afb", "ara");
    macroLanguages.put("ajp", "ara");
    macroLanguages.put("apc", "ara");
    macroLanguages.put("apd", "ara");
    macroLanguages.put("arb", "ara");
    macroLanguages.put("arq", "ara");
    macroLanguages.put("ars", "ara");
    macroLanguages.put("ary", "ara");
    macroLanguages.put("arz", "ara");
    macroLanguages.put("auz", "ara");
    macroLanguages.put("avl", "ara");
    macroLanguages.put("ayh", "ara");
    macroLanguages.put("ayl", "ara");
    macroLanguages.put("ayn", "ara");
    macroLanguages.put("ayp", "ara");
    macroLanguages.put("bbz", "ara");
    macroLanguages.put("pga", "ara");
    macroLanguages.put("shu", "ara");
    macroLanguages.put("ssh", "ara");

    macroLanguages.put("ekk", "est"); //https://iso639-3.sil.org/code/est
    macroLanguages.put("vro", "est");

    macroLanguages.put("bos", "hbs"); //https://iso639-3.sil.org/code/hbs
    macroLanguages.put("hrv", "hbs");
    macroLanguages.put("srp", "hbs");
    macroLanguages.put("cnr", "hbs");

    macroLanguages.put("ltg", "lav"); //https://iso639-3.sil.org/code/lav
    macroLanguages.put("lvs", "lav");

    macroLanguages.put("nno", "nor"); //https://iso639-3.sil.org/code/nor
    macroLanguages.put("nob", "nor");

    macroLanguages.put("aae", "sqi"); //https://iso639-3.sil.org/code/sqi
    macroLanguages.put("aat", "sqi");
    macroLanguages.put("aln", "sqi");
    macroLanguages.put("als", "sqi");

    macroLanguages.put("ydd", "yid"); //https://iso639-3.sil.org/code/yid
    macroLanguages.put("yih", "yid");

    macroLanguages.put("ccx", "zha"); //https://iso639-3.sil.org/code/zha
    macroLanguages.put("ccy", "zha");
    macroLanguages.put("zch", "zha");
    macroLanguages.put("zeh", "zha");
    macroLanguages.put("zgb", "zha");
    macroLanguages.put("zgm", "zha");
    macroLanguages.put("zgn", "zha");
    macroLanguages.put("zhd", "zha");
    macroLanguages.put("zhn", "zha");
    macroLanguages.put("zlj", "zha");
    macroLanguages.put("zln", "zha");
    macroLanguages.put("zlq", "zha");
    macroLanguages.put("zqe", "zha");
    macroLanguages.put("zyb", "zha");
    macroLanguages.put("zyg", "zha");
    macroLanguages.put("zyj", "zha");
    macroLanguages.put("zyn", "zha");
    macroLanguages.put("zzj", "zha");

    macroLanguages.put("cdo", "zho"); //https://iso639-3.sil.org/code/zho
    macroLanguages.put("cjy", "zho");
    macroLanguages.put("cmn", "zho");
    macroLanguages.put("cpx", "zho");
    macroLanguages.put("czh", "zho");
    macroLanguages.put("czo", "zho");
    macroLanguages.put("gan", "zho");
    macroLanguages.put("hak", "zho");
    macroLanguages.put("hsn", "zho");
    macroLanguages.put("lzh", "zho");
    macroLanguages.put("mnp", "zho");
    macroLanguages.put("nan", "zho");
    macroLanguages.put("wuu", "zho");
    macroLanguages.put("yue", "zho");
    macroLanguages.put("cnp", "zho");
    macroLanguages.put("csp", "zho");

    macroLanguages.put("pes", "fas"); //https://iso639-3.sil.org/code/fas
    macroLanguages.put("prs", "fas");
}
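To answer the "nob" to "nor" example from the question, a lookup on such a map could be wrapped as below. This is a minimal sketch: the class and the findMacroLanguage helper are hypothetical names, and only a few entries from the map are repeated so the example compiles on its own (Map.of needs Java 9 or later).

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

public class MacroLanguageLookup {

    // A few entries copied from the full map; use the complete map in practice.
    private static final Map<String, String> MACRO_LANGUAGES = Map.of(
            "nob", "nor",
            "nno", "nor",
            "cmn", "zho",
            "ekk", "est");

    /**
     * Hypothetical helper: returns the macrolanguage code for an individual
     * language code, or an empty Optional when no mapping is listed.
     */
    public static Optional<String> findMacroLanguage(String iso3Code) {
        if (iso3Code == null) {
            return Optional.empty();
        }
        // Normalise case so that "NOB" and "nob" are treated the same.
        String key = iso3Code.toLowerCase(Locale.ROOT);
        return Optional.ofNullable(MACRO_LANGUAGES.get(key));
    }

    public static void main(String[] args) {
        System.out.println(findMacroLanguage("nob")); // Optional[nor]
        System.out.println(findMacroLanguage("swe")); // Optional.empty
    }
}
```

Because a code that is not in the map yields an empty result instead of a guess, this kind of lookup cannot produce false positives; it can only miss mappings that were never added.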
Alfred Faltiska answered Dec 08 '25 08:12



