Wikivoyage talk:Romanization/Archive 2004-2023
This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Inspired by some creative messes for both Chinese and Japanese place names, here's my attempt at setting a standard for the two languages in question, strongly modeled on Wikipedia (which does a pretty good job). Comments welcome, and additional languages (Thai, Hindi, Korean...) even more so. (WT-en) Jpatokal 03:44, 3 Aug 2004 (EDT)
- So, first of all, great idea for a guideline. We needed it.
- Second, I moved the link from Project:policies and guidelines to Project:manual of style. In general, interaction stuff is policies, and content stuff goes in the manual of style.
- Lastly, it'd be good to reconcile this with Project:foreign words and Project:article naming conventions. --(WT-en) Evan 11:33, 3 Aug 2004 (EDT)
- One more thing: we don't follow Wikipedia guidelines for anything else by default, and I don't see why we should do so here. So, I took that comment out. Also, I removed the point about article titles, and simply referenced the article naming conventions. --(WT-en) Evan 12:23, 3 Aug 2004 (EDT)
- Objection -- the nutshell summaries I provided on the page are not sufficient for transliterating the more difficult cases. (WT-en) Jpatokal 21:45, 3 Aug 2004 (EDT)
- Lastly, can I ask for some kind of justification for the romanization standards used? What are they, and why should we use one rather than another for one place or another? Are travelers more likely to see Wade-Giles transliterations in Taiwan? --(WT-en) Evan 12:26, 3 Aug 2004 (EDT)
- This, you see, is why I want to reference Wikipedia, who have thought about this long and hard! But in a nutshell, Hanyu pinyin is the system used on mainland China, while Wade-Giles remains the more common system on Taiwan (although tongyong pinyin is slowly supplanting it). In Hong Kong, Yale is the most common system for Cantonese (a separate language). For Japanese, Hepburn (written by an American for foreigners) has been the de facto standard of romanization for the past 100 years esp. in publications geared to foreigners, while official standard Kunrei (written by Japanese for Japanese) is used very little. Kunrei matches Japan's kana syllabary nicely but is impossible to pronounce without prior knowledge, whereas Hepburn is based on English phonology and lets most people give a fair stab at it. (WT-en) Jpatokal 21:45, 3 Aug 2004 (EDT)
- I agree with the second thing you just said about Hepburn, but what you said about it being the de-facto standard isn't exactly true. What you will see in subways, street signs, etc. in Japan is Kunrei-shiki, and Kunrei is used by most Japanese people.
- Sorry, this is just not correct. JR and every private railway/subway/monorail I have ever seen uses Hepburn. The official road standard for signage is also Hepburn, see . (WT-en) Jpatokal 13:15, 13 Jun 2005 (EDT)
- Perhaps the most difficult issue regarding Japanese romanisation is the n vs m (shimpo vs shinpo, shimbun vs shinbun, sambyakuen vs sanbyakuen, panpu vs pampu)... What are your thoughts?
- Both revised Hepburn and Kunrei say always use 'n', so I think the answer is obvious, except in a few cases where the m-spelling is firmly established (Namba, Asahi Shimbun, etc.) This is also Wikipedia's policy and I'll add a note here as well. (WT-en) Jpatokal 13:15, 13 Jun 2005 (EDT)
- Re Taiwan, Tongyong Pinyin is official nationwide. Some places are most commonly known in English by their Minnan or Hakka names, which makes it especially difficult.
- In Taipei alone, their street signs are mixed and matched between Tongyong Pinyin, Hanyu Pinyin (gasp! yes, the komunist leterz!!!), and Wade-Giles.
- Taiwan has, incomprehensibly, decided that every municipality can decide their own standard. sigh I think all we can do is stick with Wade-Giles until hits for Taibei on Google outnumber those for Taipei... (WT-en) Jpatokal 13:28, 13 Jun 2005 (EDT)
- In Hong Kong, Jyutping (yueyu pinyin) is official now, but as you said Yale is still the most common. Penkyamp has also gained a following, especially among people who feel that Jyutping follows too closely the hanyu pinyin of the mainland. --(WT-en) Node ue 12:36, 13 Jun 2005 (EDT)
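The n-versus-m point raised above for Japanese (shimbun vs shinbun) can be sketched in a few lines of Python. This is only an illustration, not part of the guideline: the function name `prefer_n` and the `ESTABLISHED` set are hypothetical, and the set holds just the two exceptions mentioned in the thread. It assumes lowercase or capitalized input.

```python
import re

# Hypothetical list of firmly established m-spellings (examples taken from the thread).
ESTABLISHED = {"namba", "asahi shimbun"}


def prefer_n(romaji, established=ESTABLISHED):
    """Rewrite the syllabic nasal as 'n' before b, m or p, unless the spelling is established."""
    if romaji.lower() in established:
        return romaji
    # In romanized Japanese an 'm' that starts a syllable is followed by a vowel
    # or 'y', so an 'm' directly before b/m/p can only be the syllabic nasal.
    return re.sub(r"m(?=[bmp])", "n", romaji)


for word in ["shimbun", "sambyakuen", "pampu", "Namba"]:
    print(word, "->", prefer_n(word))
```

A real exception list would have to be maintained by hand, since "firmly established" is a judgment call rather than something a rule can detect.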
Romanization of Hebrew
I think we need to set a standard for romanizing Hebrew before it's too late. WikiPedia:Romanization of Hebrew and WikiPedia:Hebrew alphabet provide useful starting points.
In my opinion (and please note that I only have a passing acquaintance with Hebrew), the LOC system is too academic as the middle dots are never used in casual writing. I'd be inclined to favor what the Hebrew alphabet article calls "Israeli" transliteration, because this seems to be what they use on the expressway signs; based on a quick eyeballing this seems quite close to the UN system. (WT-en) Jpatokal 08:51, 14 Mar 2005 (EST)
- For lack of better ideas I've plunged ahead and adopted the UN system, except that I've replaced the weirdo underscores for ח and צ with 'ch' and 'tz'. I'm also tempted to keep the spelling of ק as 'k' (as normally pronounced) instead of the rather artificial 'q'. Opinions? (WT-en) Jpatokal 09:59, 7 Jun 2005 (EDT)
- I think that 'ch' should be 'kh' and 'tz' should be 'ts'. k vs q is a difficult question -- there was a real difference between the two in old Hebrew, but in revived Hebrew it doesn't exist in most pronunciations. Similarly, tav is generally pronounced as a "t", while ancient Hebrew had a distinction between a "t" and a "th" sound, which are distinguished in Ashkenazic religious pronunciation, but not in Sephardic. --MW
- MW, I very strongly disagree. In the entire Western world outside of the US and Britain, we pronounce KH as K. We pronounce CH as you pronounce a Chet. That's the modern way of the modern world. Further, TZ is easier to read than TS. And K is also MUCH better than Q. In short: ch, tz, k. --(WT-en) Daniel575 07:22, 11 July 2006 (EDT)
Chinese romanization
Edited this section since
- It is not possible to "use Mandarin" as we are not reading this aloud.
- Oh yes we are! So 上海 is "shànghǎi" (putonghua), not "zɑ̃ hɛ" (shanghainese). (WT-en) Jpatokal 10:35, 20 Jun 2005 (EDT)
- Most people would romanise it as "zanhe" or "zanghe", or occasionally "zoenghe". I do believe though that we should include the local pronunciation. And that, where more widely spread, we should prefer it. (Hong Kong vs Xianggang, Quemoy [ie, Kinmng], Amoy (-> e-mng, but it's fallen out of favour) vs Xiamen) --(WT-en) Node ue 18:32, 23 Jun 2005 (EDT)
- The street names in Taiwan are now either in Tongyong or Hanyu pinyin (for the most part). For example, in Taipei, all street names are in Hanyu Pinyin, even though the old bastardized Wade-Giles "Taipei" is still being used. --(WT-en) Jiang 08:11, 20 Jun 2005 (EDT)
- I rolled this back, and then I rolled back this rollback. I think this needs some editing, but I'm not sure how to do it. --~~
- I'm open to suggestions on how to handle Taiwan. The politics are one thing but I can't believe they can't even sort out their bloody spelling... (WT-en) Jpatokal 10:35, 20 Jun 2005 (EDT)
- I removed this line: "Include Chinese characters when possible." I don't know of a definition of "romanization" that covers having Chinese characters in the romanized version.
- Err, that seems a terribly circumscribed way of looking at romanization. What the line means, and you'll note that similar exhortations can be found for other languages, is that you should enter both the original characters and their romanization. I'll try to make this more generic. (WT-en) Jpatokal 13:15, 20 Jun 2005 (EDT)
- I like the changed version better. I've removed the "Include Chinese characters" again, since you've taken it out for other languages. --(WT-en) Evan 14:47, 20 Jun 2005 (EDT)
- I also removed "Use the most romanization rendered in English. This generally means:". It doesn't make any sense. If there's some more rigorous rule of thumb ("Use the most common English-language romanization"?), we should use that. --(WT-en) Evan 12:34, 20 Jun 2005 (EDT)
- That is what I meant. I accidentally left out the "commonly" after "most". --61.216.116.116 11:02, 21 Jun 2005 (EDT)
- What about addresses in China? For example, we could write Tian He Bei Lu like that. Or we could write Tianhe Beilu. Or even Tianhebeilu. Sometimes the roads are translated to English: Tian He Road North. Or how about Tian He Rd North? I tried to look at existing China articles, but there seems to be a complete mixture of all kinds of different styles in there... -- (WT-en) Trsqr 11:59, 23 October 2006 (EDT)
- Write syllables together and in lowercase. 南大街 is Nándàjiē, not "Nan Da Jie" or "NanDaJie", says the guideline right now. Please feel free to expand the section, and I would not oppose an entire "Manual of Style" for Chinese destinations. (WT-en) Jpatokal 21:45, 23 October 2006 (EDT)
- Yes, I have noticed, but since this part of the guideline is followed so badly, I thought that there's maybe room for discussion. I am used to the format that most of the guidebooks seem to use: Tianhe Beilu. So that the actual name of the road (Tianhe) and the type of the street (Beilu = North Road) are separated. I think that's a bit easier to read, but obviously this is arguable. -- (WT-en) Trsqr 03:59, 24 October 2006 (EDT)
- I also agree that that looks better. How about I change the guidelines to "write words together" and use Tianhe Beilu as the example? (WT-en) Jpatokal 04:24, 24 October 2006 (EDT)
- Yes. Using space to separate words, like Tianhe Beilu or Jiefang Dajie, is correct Pinyin and how most street signs and street maps in mainland China do Pinyin. So this is how we should do it for mainland China, too. --(WT-en) Rogerhc 00:07, 16 July 2007 (EDT)
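The "write words together" convention agreed on above can be illustrated with a minimal Python sketch. It assumes the caller already knows where the word boundaries fall (the name `join_pinyin` and the nested-list input format are hypothetical), and it does not handle the apostrophe that Pinyin places before a syllable starting with a, o or e inside a word.

```python
def join_pinyin(words):
    """Join syllables within each word, capitalize each word, and separate words with spaces.

    `words` is a list of words, each given as a list of pinyin syllables.
    """
    return " ".join("".join(syllables).capitalize() for syllables in words)


# The road name (Tianhe) and the street type (Beilu = North Road) are separate words.
print(join_pinyin([["tian", "he"], ["bei", "lu"]]))
print(join_pinyin([["jie", "fang"], ["da", "jie"]]))
```

Deciding what counts as a word is the hard part and is left to the editor; the sketch only shows the spacing and capitalization once that decision is made.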
Italic romanization?
Should we italicize romanizations? We have been doing so. Why? It strikes me as unorthodox and unnecessary. Unless we have a good reason for italicizing romanizations I think we should stop doing it. I have italicized the romanization examples in the guideline only so that they match our habit, not because I agree with it. I'm seeking clarification on this. Please comment below. Thank you. --(WT-en) Rogerhc 18:28, 16 July 2007 (EDT)
- I prefer to write the 'pinyin name' in italics. In addition, I think the name in characters and pinyin has been added to most articles (eg Chengdu, Guangzhou) in the following way: Shanghai (上海 Shànghǎi). Any objections to changing the policy accordingly? Otherwise, it will need a lot of effort to remove the italics from each article .... (WT-en) WindHorse 00:18, 16 July 2007 (EDT)
- Many were done without italics and look correct to me. Open all the cities listed on the China page to see. --(WT-en) Rogerhc 19:11, 17 July 2007 (EDT)
- Italic Pinyin Name is the way the listing tags are doing it. Why? --(WT-en) Rogerhc 18:28, 16 July 2007 (EDT)
- See earlier discussion at Talk:China#Consistency and my/WindHorse's comments below. (WT-en) Jpatokal 22:47, 17 July 2007 (EDT)
- Example:
- <sleep name="Some Name" alt="小中国 Xiǎo Zhōngguó" address="83 Some Road, Some District, Some City" phone="00-000-000-000" fax="" price="¥100">Description goes here.</sleep>
- Italics may be unorthodox and unnecessary there. I'm not sure. Mainly, I'd like to see consistency across Wikivoyage's style on this either way. The style guide should be clear about this even if we change it later. --(WT-en) Rogerhc 00:48, 16 July 2007 (EDT)
- At the moment the policy is not clear. It states: "Shanghai (上海 Shànghǎi) is a city. Place parentheses around Chinese characters and their pinyin readings. Do not use italics, bold or quote marks." The example shows the pinyin in italics, but the instructions state not to use italics. This is confusing so, for consistency, I'll remove the part of the instruction that states 'Do not use italics'. Later, if the consensus moves the other way, it can be re-added and the example amended accordingly. (WT-en) WindHorse 19:38, 16 July 2007 (EDT)
The basic meaning of italics is that "this word is not English", and writing foreign (romanized) words in italics is a worldwide convention also observed on Wikipedia and Wikivoyage. However, for the specific case of pinyin, there was concern earlier that the italics make the tone marks harder to read. I don't have a strong opinion either way, but now that we're using the listing tags, it's just easier to go with the flow and italicize pinyin as well. FWIW, the guidelines for Vietnamese also suggest italics, even though Vietnamese is even more diacritic-happy than pinyin Chinese... (WT-en) Jpatokal 22:54, 16 July 2007 (EDT)
- I fully agree. Take a look at any reference book. There will be the English title followed by a transliteration of the word in a foreign language, which will be written in parentheses and in italics. If there is more than one reading, then the languages are identified in an abbreviated form and placed before the transliterations - eg: Lotus (Skt: padma; Pali: paduma; Ch: liánhuā). Using parentheses and italics is a standard and internationally used practice. (WT-en) WindHorse 01:52, 17 July 2007 (EDT)
- It's also Wikivoyage style for foreign words. --(WT-en) Evan 00:41, 18 July 2007 (EDT)
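For anyone cleaning up articles in bulk, the "Shanghai (上海 Shànghǎi)" pattern discussed above can be generated mechanically. The following Python sketch is an assumption-laden illustration, not an existing tool: `format_name` is a hypothetical helper, and it emits the pinyin wrapped in doubled apostrophes, which is the wiki markup for italics.

```python
def format_name(english, characters, romanized, italic=True):
    """Return 'English (characters romanization)', optionally italicizing the romanization."""
    reading = f"''{romanized}''" if italic else romanized
    return f"{english} ({characters} {reading})"


print(format_name("Shanghai", "上海", "Shànghǎi"))
print(format_name("Shanghai", "上海", "Shànghǎi", italic=False))
```

Keeping the italics behind a flag means the output can follow whichever way the consensus goes without touching the rest of the pattern.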
Lao
Here's a tough and obscure nut to crack: romanizing Lao. According to my very limited understanding of the topic, there are two main systems in use: the old French style (where everything is written together, ou is used for "u" and there are some random accents -- eg. Louangphrabang, Houayxai), and LRT, which is very closely modeled on Thai RTGS and does away with the Frenchified mess (eg. Luang Prabang, Huay Xai). You can probably already guess which one I prefer, but there's no official government standard as far as I can tell. I'd wager that the French style remains marginally more common, but LRT is more user-friendly for the English speaker. And for what it's worth, LP uses LRT (more or less), CIA uses Frenchy, and Wikipedia doesn't have an opinion. Does anybody else? (WT-en) Jpatokal 07:49, 22 Jun 2005 (EDT)
Other languages
So, I'm counting some other languages with non-Latin scripts:
- Arabic
- Greek
- Russian
They may be somewhat easier, since they're alphabetic, but it might be good to give them a shot. --(WT-en) Evan 16:04, 10 April 2006 (EDT)
Romanization of Japanese addresses
I have a question about how to display Japanese addresses in articles. Obviously you can display the kanji:
- Hotel Socia (ホテルソシア), 〒877-0013大分県日田市元町17-3
but that isn't much use to people who can't read Japanese. So therefore should we display the kanji and the romaji?
- Hotel Socia (ホテルソシア), 〒877-0013大分県日田市元町17-3 (Oita-ken, Hita-shi, Moto-machi, 17-3)
If so, doesn't that look a little unwieldy? Is there an established policy for this? (WT-en) Bobo12345 06:40, 22 July 2006 (EDT)
- I think displaying kanji in the English edition is not essential, but in some cases it is very useful. Can't you imagine showing the printed-out pages to a local taxi driver and asking him/her to take you to that hotel (even the English edition can be used this way, right)? -(WT-en) Shoestring 07:37, 22 July 2006 (EDT)
- Established policy is to give the romanized address only, so "Hotel Socia (ホテルソシア), Motomachi 17-3" is enough. The postcode is not needed and the -ken, -shi are obvious from the Hita article itself. That said, I think "Motomachi (元町) 17-3" could also be useful sometimes... (WT-en) Jpatokal 13:37, 22 July 2006 (EDT)
Indic languages - Hindi, Bengali, Nepali etc.
We have inconsistent and sometimes archaic systems. For example WikiTravel has articles for Birgunj and Nepalgunj (नेपालगञ्ज) which are towns in Nepal. -ganj seems a better transliteration, however Google produces more hits with -gunj. (WT-en) LADave 17:13, 10 June 2007 (EDT)
- Indeed, ganj would be the more correct transliteration, I think... the "a" is pronounced "uh", which is my guess as to why someone would misspell it as gunj – (WT-en) cacahuate talk 01:11, 17 July 2007 (EDT)
- Well, I speak Hindi, which essentially follows the same script and pronunciation system as Nepali, and the u in Nepalgunj is pronounced like allow, not like shut -- (WT-en) Upamanyuwikivoyage • ( Talk ) • ( (WT-en) Travel ) • 06:38, 3 September 2007 (EDT)
It's all Greek to me
So, I think we need a standard scheme for romanizing Greek. I know little about the topic, but Wikipedia has this. Which of these are standardized in Greece itself for road signs etc? (WT-en) Jpatokal 15:09, 9 March 2008 (EDT)
- I don't think the answer would be very useful, since the table linked to lists two or more alternate Roman characters to correspond with most of the Greek ones, with no guidelines on which to use when. I think the important thing is to be consistent, since a stylistic inconsistency in transcribing is one of those things which can make the layout look amateurish. On a related note, what would be really useful would be a standard to always put accent marks in any Greek words used, including place names, in their Greek form transcribed to Roman characters. The stress accent is a very important part of a Greek word, and many travelers who try to say Greek words they know only from print have no idea where the accents should go. (WT-en) Sailsetter 18:53, 14 March 2008 (EDT)
- The Wikipedia page I linked to has several different standards, and yes, obviously we need to pick one. The best option would be one most commonly used in Greece -- which one is it?
- And I'm all for indicating accents: the easiest way is just to capitalize the stressed syllable, as you've already done for Hydra ("EE-thra"). (WT-en) Jpatokal 12:13, 15 March 2008 (EDT)
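The capitalize-the-stressed-syllable idea above is simple enough to sketch in Python. This is a hypothetical helper (`mark_stress` is not an existing function anywhere), and it assumes the editor supplies the syllable split and the position of the stress, since neither can be derived from the spelling alone.

```python
def mark_stress(syllables, stressed):
    """Join pronunciation syllables with hyphens, upper-casing the stressed one.

    `stressed` is the zero-based index of the stressed syllable.
    """
    return "-".join(
        s.upper() if i == stressed else s.lower()
        for i, s in enumerate(syllables)
    )


# Hydra, as given in the thread.
print(mark_stress(["ee", "thra"], 0))
```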
Russian
Would something on Russian be useful in this article? Rules are quite straightforward, at least at first sight. I can add a first draft if it can be useful here. --(WT-en) DenisYurkin 17:47, 24 October 2008 (EDT)
- It would be useful. To me it doesn't seem at all straightforward, as the romanisation for English differs from that for e.g. German or Swedish. That means that some names have established romanisations that differ from what the rules would say. Then we have names in countries where Russian is a significant language but the name comes from another one, often with a different romanisation of the same Cyrillic letters. You need to know from what language you are transcribing and what to watch out for. –LPfi (talk) 16:44, 7 February 2023 (UTC)
- I can work on a romanization of Russian section if needed. I'm not aware of any formal guidelines à la Japanese with Hepburn, but it is straightforward 99% of the time (I am fluent in Russian). It's generally harder to cyrillicize English than it is to romanize Russian/other Cyrillic-based languages, since Russian is phonetic. I can also work on one for Ukrainian as well. Tuyuhun (talk) 05:38, 12 June 2023 (UTC)
Romanisation of Bengali
As I read w:Romanisation of Bengali and its talk page, I learnt that there are several ways to romanise the language, none of which render the actual pronunciation. Should we establish our own romanisation scheme for the sake of the travellers? Sbb1413 (he) (talk • contribs) 13:28, 7 February 2023 (UTC)
- We should try to give some advice, but I know nothing about Bengali and possible pitfalls. –LPfi (talk) 16:46, 7 February 2023 (UTC)
- The problems with using existing Indic romanisation schemes in Bengali are the following:
- The inherent vowel (equivalent to Devanagari अ) is romanised as a in most romanisations. This is perfectly fine for most Indic languages where the inherent vowel is pronounced like the "a" in "about". However, the inherent vowel (written as অ) is pronounced like the vowel in "dawn" or "boat" (monophthong) in Bengali.
- The Indic consonant equivalent to Devanagari य is romanised as y in most romanisations. This is perfectly fine for most Indic languages where the consonant is pronounced like the "y" in "yes". However, the consonant (written as য) is pronounced like the "dg" in "edge" in Bengali.
- Sbb1413 (he) (talk • contribs) 17:20, 7 February 2023 (UTC)
- I don't think those are real problems. Think about how to pronounce French! You need to learn something about how to pronounce the language anyway (if you don't want to sound too much like a tourist), and a few letters are easy to explain in Talk and in the phrasebook. –LPfi (talk) 17:45, 7 February 2023 (UTC)
- (And an Anglophone may pronounce "a" as in "hey" or whatever anyway.) –LPfi (talk) 17:53, 7 February 2023 (UTC)
Tamil romanization
I have introduced the Indic section in the Romanization page without discussion, as there was no romanization of Indic words at all. However, as the Tamil script, an Indic script, cannot distinguish between voiced and voiceless consonants, I think the plosives should be romanized voiced or voiceless as per their general pronunciations, so that travellers can pronounce Tamil words correctly. So I've gone ahead and introduced the Tamil section. Sbb1413 (he) (talk • contribs) 04:41, 12 June 2023 (UTC)