Wikivoyage:UTF-8 text entry

From Wikivoyage

You can enter any Unicode character up to U+FFFF as follows:

  1. Find the character's code point (its hexadecimal number) in the charts at http://www.unicode.org/charts/.
  2. Enter that number at http://www.lojban.org/jbovlaste/util/hextoutf8.html and press the button.
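
If the converter page is unavailable, the same lookup can be done locally. The following is only a minimal sketch in Python; the function name utf8_bytes is hypothetical and is not part of any tool mentioned on this page.

```python
def utf8_bytes(hex_code_point: str) -> bytes:
    """Return the UTF-8 bytes for a code point given in hex, e.g. '05D0' or 'U+05D0'."""
    text = hex_code_point.strip()
    # Accept the common "U+" prefix as a convenience (an assumption of this sketch).
    if text.upper().startswith("U+"):
        text = text[2:]
    code_point = int(text, 16)
    # chr() turns the number into a one-character string; encode() yields its UTF-8 bytes.
    return chr(code_point).encode("utf-8")


if __name__ == "__main__":
    # U+05D0 is the Hebrew letter alef.
    print(utf8_bytes("05D0").hex())  # d790
```

The bytes are printed in hexadecimal; the character itself is simply chr(0x05D0).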

Directional embedding

In Hebrew and Arabic, words are written from right to left, but numbers are written from left to right (except when written in Hebrew numerals, which are letters assigned numerical values). Punctuation marks are neutral, meaning that they do not have an intrinsic direction. This can lead to messes such as the following:

Shlomo said, "אתן לך 200 אגורות?" to Rina.

The question mark, which should be to the left of "agorot", is in the wrong place. Because the overall direction of the sentence is left-to-right, the neutral question mark and the second quotation mark are both placed just left of "to". To fix this sort of problem, Unicode provides five characters for embedding and overriding the direction of text:

&#x202A; (U+202A): Begins an embedded left-to-right piece of text.
&#x202B; (U+202B): Begins an embedded right-to-left piece of text.
&#x202C; (U+202C): Ends an embedded piece of text.
&#x202D; (U+202D): Begins an override, an embedded piece of text where the direction is forced left-to-right.
&#x202E; (U+202E): Begins an override, an embedded piece of text where the direction is forced right-to-left.
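
The directional properties described above can be inspected with Python's standard unicodedata module. This is only an illustrative sketch; the SAMPLES table and the bidi_class helper are hypothetical names chosen for this example.

```python
import unicodedata

# Characters from the example sentence, plus the five control characters.
SAMPLES = {
    "Latin letter": "S",
    "Hebrew letter": "\u05d0",
    "digit": "2",
    "question mark": "?",
    "quotation mark": '"',
    "embed LTR": "\u202a",
    "embed RTL": "\u202b",
    "end embedding": "\u202c",
    "override LTR": "\u202d",
    "override RTL": "\u202e",
}


def bidi_class(ch: str) -> str:
    """Return the Unicode bidirectional class of a single character."""
    return unicodedata.bidirectional(ch)


if __name__ == "__main__":
    for label, ch in SAMPLES.items():
        print(f"{label:15} U+{ord(ch):04X} {bidi_class(ch)}")
```

Letters report a strong direction (L or R), the digit reports EN (European number), and the punctuation marks report ON (other neutral), which is why their placement depends on the surrounding text.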

You probably will not need the overrides; they are used when the numbers as well as the letters are written right-to-left. Don't try to insert the actual Unicode characters; they are invisible, their ends are far apart, and they will confuse the dickens out of anyone trying to figure out what you did. Use the HTML code instead. It looks confusing (the ampersand and number sign are neutral, so they will end up in reverse order somewhere else), but at least it's visible.

With the embedding codes added, the sentence above looks like this:

Shlomo said, "אתן לך 200 אגורות?" to Rina.

Now the question mark is in the right place.
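
The fix can also be sketched in Python as a helper that wraps a right-to-left phrase in the visible HTML codes rather than the invisible characters. The names embed_rtl, RLE_CODE and PDF_CODE are hypothetical, and the sketch assumes the question mark belongs inside the embedded Hebrew text.

```python
# HTML numeric character references for U+202B and U+202C.
RLE_CODE = "&#x202B;"  # begins an embedded right-to-left piece of text
PDF_CODE = "&#x202C;"  # ends the embedded piece of text


def embed_rtl(text: str) -> str:
    """Wrap text in visible HTML codes that mark it as a right-to-left embedding."""
    return f"{RLE_CODE}{text}{PDF_CODE}"


# The Hebrew phrase, including its question mark, goes inside the embedding
# so that the neutral punctuation takes the right-to-left direction.
hebrew = "אתן לך 200 אגורות?"
sentence = f'Shlomo said, "{embed_rtl(hebrew)}" to Rina.'

if __name__ == "__main__":
    print(sentence)
```

The printed string is the source text to type into the edit box; the browser turns the two codes into the invisible control characters when it renders the page.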