net.sourceforge.pinyin4j
Class PinyinHelper

java.lang.Object
  extended by net.sourceforge.pinyin4j.PinyinHelper

public class PinyinHelper
extends java.lang.Object

Provides utility functions that convert Chinese characters (both Simplified and Traditional) into various Chinese romanization representations.

Author:
Li Min (xmlerlimin@gmail.com)

Constructor Summary
private PinyinHelper()
           
 
Method Summary
private static java.lang.String[] convertToGwoyeuRomatzyhStringArray(char ch)
           
private static java.lang.String[] convertToTargetPinyinStringArray(char ch, PinyinRomanizationType targetPinyinSystem)
           
private static java.lang.String getFirstHanyuPinyinString(char ch, HanyuPinyinOutputFormat outputFormat)
          Deprecated. Do not use this method: the first pinyin string retrieved may be the wrong pronunciation in a given sentence context. This function will be removed in the next release.
private static java.lang.String[] getFormattedHanyuPinyinStringArray(char ch, HanyuPinyinOutputFormat outputFormat)
          Return the formatted Hanyu Pinyin representations of the given Chinese character (both Simplified and Traditional) in array format.
private static java.lang.String[] getUnformattedHanyuPinyinStringArray(char ch)
          Delegate function
static java.lang.String[] toGwoyeuRomatzyhStringArray(char ch)
          Get all unformatted Gwoyeu Romatzyh representations of a single Chinese character (both Simplified and Traditional)
static java.lang.String toHanyuPinyinString(java.lang.String str, HanyuPinyinOutputFormat outputFormat, java.lang.String seperater)
          Deprecated. Do not use this method: the first pinyin string retrieved may be the wrong pronunciation in a given sentence context. This interface will be removed in the next release.
static java.lang.String[] toHanyuPinyinStringArray(char ch)
          Get all unformatted Hanyu Pinyin representations of a single Chinese character (both Simplified and Traditional)
static java.lang.String[] toHanyuPinyinStringArray(char ch, HanyuPinyinOutputFormat outputFormat)
          Get all Hanyu Pinyin representations of a single Chinese character (both Simplified and Traditional)
static java.lang.String[] toMPS2PinyinStringArray(char ch)
          Get all unformatted MPS2 (Mandarin Phonetic Symbols 2) representations of a single Chinese character (both Simplified and Traditional)
static java.lang.String[] toTongyongPinyinStringArray(char ch)
          Get all unformatted Tongyong Pinyin representations of a single Chinese character (both Simplified and Traditional)
static java.lang.String[] toWadeGilesPinyinStringArray(char ch)
          Get all unformatted Wade-Giles representations of a single Chinese character (both Simplified and Traditional)
static java.lang.String[] toYalePinyinStringArray(char ch)
          Get all unformatted Yale Pinyin representations of a single Chinese character (both Simplified and Traditional)
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PinyinHelper

private PinyinHelper()
Method Detail

toHanyuPinyinStringArray

public static java.lang.String[] toHanyuPinyinStringArray(char ch)
Get all unformatted Hanyu Pinyin representations of a single Chinese character (both Simplified and Traditional)

For example,
If the input is '间', the return will be an array with two Hanyu Pinyin strings:
"jian1"
"jian4"

If the input is '李', the return will be an array with a single Hanyu Pinyin string:
"li3"

Special Note: If the return is "none0", the input Chinese character exists in the Unicode CJK table but has no pronunciation in Chinese.

Parameters:
ch - the given Chinese character
Returns:
a String array containing all unformatted Hanyu Pinyin representations with tone numbers; null for a non-Chinese character
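
The return contract above has three cases: null, one or more readings, and the "none0" marker. The following is a minimal sketch of how a caller might normalize them into a list. To keep the example runnable without the pinyin4j jar, the lookup is passed in as a function, and stubLookup is a hypothetical stand-in that only knows the two characters from the example above. In real code the assumption is that PinyinHelper::toHanyuPinyinStringArray would be passed instead. Characters are written as Unicode escapes so the source compiles under any file encoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class ReadingsDemo {
    /** Marker documented above for characters that have no pronunciation. */
    static final String NO_PRONUNCIATION = "none0";

    /**
     * Normalizes the documented return contract into a list.
     * A null result (non-Chinese character) and the "none0" marker
     * both end up as "no readings".
     */
    static List<String> readings(char ch, Function<Character, String[]> lookup) {
        List<String> result = new ArrayList<>();
        String[] raw = lookup.apply(ch);
        if (raw == null) {
            return result; // non-Chinese character
        }
        for (String reading : raw) {
            if (!NO_PRONUNCIATION.equals(reading)) {
                result.add(reading);
            }
        }
        return result;
    }

    /** Hypothetical stand-in for PinyinHelper.toHanyuPinyinStringArray(char). */
    static String[] stubLookup(char ch) {
        switch (ch) {
            case '\u95f4': // U+95F4, the "jian" character from the example
                return new String[] {"jian1", "jian4"};
            case '\u674e': // U+674E, the "li" character from the example
                return new String[] {"li3"};
            default:
                return null;
        }
    }

    public static void main(String[] args) {
        Function<Character, String[]> lookup = ReadingsDemo::stubLookup;
        System.out.println(readings('\u95f4', lookup)); // [jian1, jian4]
        System.out.println(readings('\u674e', lookup)); // [li3]
        System.out.println(readings('A', lookup));      // []
    }
}
```

Returning an empty list instead of null lets callers iterate without a separate null check. Callers that need to distinguish "not Chinese" from "no pronunciation" would have to keep the two cases apart.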

toHanyuPinyinStringArray

public static java.lang.String[] toHanyuPinyinStringArray(char ch,
                                                          HanyuPinyinOutputFormat outputFormat)
                                                   throws BadHanyuPinyinOutputFormatCombination
Get all Hanyu Pinyin representations of a single Chinese character (both Simplified and Traditional)

For example,
If the input is '间', the return will be an array with two Hanyu Pinyin strings:
"jian1"
"jian4"

If the input is '李', the return will be an array with a single Hanyu Pinyin string:
"li3"

Special Note: If the return is "none0", the input Chinese character is in the Unicode CJK table but has no pronunciation in Chinese.

Parameters:
ch - the given Chinese character
outputFormat - describes the desired format of the returned Hanyu Pinyin strings
Returns:
a String array containing all Hanyu Pinyin representations, formatted according to outputFormat; null for a non-Chinese character
Throws:
BadHanyuPinyinOutputFormatCombination - if outputFormat specifies an invalid combination of format options
See Also:
HanyuPinyinOutputFormat, BadHanyuPinyinOutputFormatCombination
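
The options of HanyuPinyinOutputFormat are not listed on this page. When only the tone-numbered form shown in the examples ("jian1", "li3") is available, the syllable and tone digit can be separated by hand. The sketch below is an illustration of that string handling only, not the library's formatting logic; the class and method names are hypothetical.

```java
class ToneDemo {
    /** True when the reading ends in a digit, as in "jian4". */
    static boolean hasToneDigit(String reading) {
        int n = reading.length();
        return n > 0 && Character.isDigit(reading.charAt(n - 1));
    }

    /** Returns the reading without a trailing tone digit, e.g. "jian4" -> "jian". */
    static String syllable(String reading) {
        return hasToneDigit(reading)
                ? reading.substring(0, reading.length() - 1)
                : reading;
    }

    /** Returns the trailing tone digit as a number, or 0 when there is none. */
    static int tone(String reading) {
        return hasToneDigit(reading)
                ? Character.digit(reading.charAt(reading.length() - 1), 10)
                : 0;
    }

    public static void main(String[] args) {
        System.out.println(syllable("jian4") + " / tone " + tone("jian4")); // jian / tone 4
        System.out.println(syllable("li3") + " / tone " + tone("li3"));     // li / tone 3
    }
}
```

The "none0" marker also ends in a digit, so it should be filtered out before readings are passed to these helpers.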

getFormattedHanyuPinyinStringArray

private static java.lang.String[] getFormattedHanyuPinyinStringArray(char ch,
                                                                     HanyuPinyinOutputFormat outputFormat)
                                                              throws BadHanyuPinyinOutputFormatCombination
Return the formatted Hanyu Pinyin representations of the given Chinese character (both Simplified and Traditional) in array format.

Parameters:
ch - the given Chinese character
outputFormat - Describes the desired format of returned Hanyu Pinyin string
Returns:
The formatted Hanyu Pinyin representations of the given codepoint in array format; null if no record is found in the hashtable.
Throws:
BadHanyuPinyinOutputFormatCombination

getUnformattedHanyuPinyinStringArray

private static java.lang.String[] getUnformattedHanyuPinyinStringArray(char ch)
Delegate function

Parameters:
ch - the given Chinese character
Returns:
unformatted Hanyu Pinyin strings; null if the record is not found

toTongyongPinyinStringArray

public static java.lang.String[] toTongyongPinyinStringArray(char ch)
Get all unformatted Tongyong Pinyin representations of a single Chinese character (both Simplified and Traditional)

Parameters:
ch - the given Chinese character
Returns:
a String array containing all unformatted Tongyong Pinyin representations with tone numbers; null for a non-Chinese character
See Also:
toHanyuPinyinStringArray(char)

toWadeGilesPinyinStringArray

public static java.lang.String[] toWadeGilesPinyinStringArray(char ch)
Get all unformatted Wade-Giles representations of a single Chinese character (both Simplified and Traditional)

Parameters:
ch - the given Chinese character
Returns:
a String array containing all unformatted Wade-Giles representations with tone numbers; null for a non-Chinese character
See Also:
toHanyuPinyinStringArray(char)

toMPS2PinyinStringArray

public static java.lang.String[] toMPS2PinyinStringArray(char ch)
Get all unformatted MPS2 (Mandarin Phonetic Symbols 2) representations of a single Chinese character (both Simplified and Traditional)

Parameters:
ch - the given Chinese character
Returns:
a String array containing all unformatted MPS2 (Mandarin Phonetic Symbols 2) representations with tone numbers; null for a non-Chinese character
See Also:
toHanyuPinyinStringArray(char)

toYalePinyinStringArray

public static java.lang.String[] toYalePinyinStringArray(char ch)
Get all unformatted Yale Pinyin representations of a single Chinese character (both Simplified and Traditional)

Parameters:
ch - the given Chinese character
Returns:
a String array containing all unformatted Yale Pinyin representations with tone numbers; null for a non-Chinese character
See Also:
toHanyuPinyinStringArray(char)

convertToTargetPinyinStringArray

private static java.lang.String[] convertToTargetPinyinStringArray(char ch,
                                                                   PinyinRomanizationType targetPinyinSystem)
Parameters:
ch - the given Chinese character
targetPinyinSystem - the target Chinese romanization system to convert to
Returns:
string representations in the target Chinese romanization system corresponding to the given Chinese character, in array format; null if an error occurs
See Also:
PinyinRomanizationType

toGwoyeuRomatzyhStringArray

public static java.lang.String[] toGwoyeuRomatzyhStringArray(char ch)
Get all unformatted Gwoyeu Romatzyh representations of a single Chinese character (both Simplified and Traditional)

Parameters:
ch - the given Chinese character
Returns:
a String array containing all unformatted Gwoyeu Romatzyh representations with tone numbers; null for a non-Chinese character
See Also:
toHanyuPinyinStringArray(char)

convertToGwoyeuRomatzyhStringArray

private static java.lang.String[] convertToGwoyeuRomatzyhStringArray(char ch)
Parameters:
ch - the given Chinese character
Returns:
Gwoyeu Romatzyh string representations corresponding to the given Chinese character, in array format; null if an error occurs
See Also:
PinyinRomanizationType

toHanyuPinyinString

public static java.lang.String toHanyuPinyinString(java.lang.String str,
                                                   HanyuPinyinOutputFormat outputFormat,
                                                   java.lang.String seperater)
                                            throws BadHanyuPinyinOutputFormatCombination
Deprecated. Do not use this method: the first pinyin string retrieved may be the wrong pronunciation in a given sentence context. This interface will be removed in the next release.

Get a string in which all Chinese characters are replaced by their corresponding main (first) Hanyu Pinyin representation.

Special Note: If the returned string contains "none0", the corresponding Chinese character is in the Unicode CJK table but has no pronunciation in Chinese.

Parameters:
str - the given string containing Chinese characters
outputFormat - describes the desired format of the returned Hanyu Pinyin string
seperater - the string appended after each Chinese character (except the last Chinese character at the end of the sentence). Note: the separator is not appended after a non-Chinese character
Returns:
a String identical to the original except that all recognizable Chinese characters are converted into their main (first) Hanyu Pinyin representation
Throws:
BadHanyuPinyinOutputFormatCombination
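
Because this method always takes the first reading, one way to avoid relying on it is to build the string from the per-character arrays and let the caller decide which reading to use. The sketch below is an illustration under stated assumptions, not the library's implementation: romanize, chooser, and stubLookup are hypothetical names, the lookup stands in for PinyinHelper.toHanyuPinyinStringArray(char), and the separator rule is read as "append after every converted character unless it is the last character of the input".

```java
import java.util.function.Function;

class SentenceDemo {
    /**
     * Replaces each character that has readings with the reading picked by
     * chooser. Other characters are copied through without a separator.
     */
    static String romanize(String text,
                           Function<Character, String[]> lookup,
                           Function<String[], String> chooser,
                           String separator) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            String[] readings = lookup.apply(ch);
            if (readings == null || readings.length == 0) {
                out.append(ch); // non-Chinese character: keep as-is
                continue;
            }
            out.append(chooser.apply(readings));
            if (i != text.length() - 1) {
                out.append(separator); // assumed rule: no separator at the very end
            }
        }
        return out.toString();
    }

    /** Hypothetical stand-in for PinyinHelper.toHanyuPinyinStringArray(char). */
    static String[] stubLookup(char ch) {
        switch (ch) {
            case '\u95f4': // U+95F4, two readings
                return new String[] {"jian1", "jian4"};
            case '\u674e': // U+674E, one reading
                return new String[] {"li3"};
            default:
                return null;
        }
    }

    public static void main(String[] args) {
        String text = "A\u674e\u95f4";
        // Always taking the first reading reproduces the deprecated behaviour.
        System.out.println(romanize(text, SentenceDemo::stubLookup, r -> r[0], " "));
        // A different chooser yields a different reading for U+95F4.
        System.out.println(romanize(text, SentenceDemo::stubLookup, r -> r[r.length - 1], " "));
    }
}
```

With the stub data, the first call produces "Ali3 jian1" and the second "Ali3 jian4". A real chooser would need sentence context, which the character-level API on this page does not provide.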

getFirstHanyuPinyinString

private static java.lang.String getFirstHanyuPinyinString(char ch,
                                                          HanyuPinyinOutputFormat outputFormat)
                                                   throws BadHanyuPinyinOutputFormatCombination
Deprecated. Do not use this method: the first pinyin string retrieved may be the wrong pronunciation in a given sentence context. This function will be removed in the next release.

Get the first Hanyu Pinyin of a Chinese character.

Parameters:
ch - the given Unicode character
outputFormat - describes the desired format of the returned Hanyu Pinyin string
Returns:
the first Hanyu Pinyin of the given Chinese character; null if the input is not a Chinese character
Throws:
BadHanyuPinyinOutputFormatCombination