java - ’ character gets converted to â€™ in JDBC

Author: Internet

I am trying to read a UTF-8 string from a MySQL database that I created with the following command:

CREATE DATABASE april
  DEFAULT CHARACTER SET utf8
  DEFAULT COLLATE utf8_general_ci;

I created the table of interest with:

DROP TABLE IF EXISTS `article`;
CREATE TABLE `article` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `text` longtext NOT NULL,
  `date_created` timestamp DEFAULT NOW(),
  PRIMARY KEY (`id`)
) CHARACTER SET utf8;

If I run select * from article in the MySQL command-line utility, I get:

OIL sands output at Nexen’s Long Lake project dropped in February.

However, when I do this:

ResultSet rs = st.executeQuery(QUERY);

long id = -1;
String text = null;
Timestamp date = null;
while (rs.next()) {
    text = rs.getString("text");
    LOGGER.debug("text=" + text);
}

the output I get is:

text=OIL sands output at Nexenâ€™s Long Lake project dropped in February.

I obtain the connection like this:

DriverManager.getConnection("jdbc:" + this.dbms + "://" + this.serverHost + ":" + this.serverPort + "/" + this.dbName + "?useUnicode&user=" + this.username + "&password=" + this.password);

Instead of the useUnicode parameter, I have also tried:

characterEncoding=UTF-8
and
characterEncoding=utf8
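
For illustration, here is a minimal sketch of a URL that sets both parameters together rather than one at a time. `useUnicode` and `characterEncoding` are the parameters named above. The class name, the helper methods, and the host and port values are hypothetical. Passing the credentials through a `Properties` object instead of the query string is a design choice of this sketch, not something the question does. Setting these parameters may not be sufficient on its own.

```java
import java.util.Properties;

public class JdbcUrlSketch {
    // Builds the JDBC URL with both encoding parameters given explicit values.
    static String buildUrl(String dbms, String host, int port, String dbName) {
        return "jdbc:" + dbms + "://" + host + ":" + port + "/" + dbName
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    // Keeps user and password out of the URL string (hypothetical helper).
    static Properties credentials(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        return props;
    }

    public static void main(String[] args) {
        // Host and port are placeholder values, assumed for this example.
        String url = buildUrl("mysql", "localhost", 3306, "april");
        System.out.println(url);
        // With a driver on the classpath, the connection would be opened as:
        // DriverManager.getConnection(url, credentials("april", "secret"));
    }
}
```

Passing `Properties` also avoids having to URL-escape a password that contains `&` or `=`.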

I have also tried, instead of the line text = rs.getString("text"):

byte[] temp = rs.getBytes("text");
String[] encodings = new String[]{"US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16", "Latin1"};
for (String encoding : encodings) {
    text = new String(temp, encoding);
    LOGGER.debug(encoding + ": " + text);
}
// Which outputted:
US-ASCII: OIL sands output at Nexen��������s Long Lake project dropped in February.
ISO-8859-1: OIL sands output at Nexenââ¬â¢s Long Lake project dropped in February.
UTF-8: OIL sands output at Nexenâ€™s Long Lake project dropped in February.
UTF-16BE: 佉䰠獡湤猠潵瑰畴⁡琠乥硥滃ꋢ芬ꉳ⁌潮朠䱡步⁰牯橥捴⁤牯灰敤⁩渠䙥扲畡特�
UTF-16LE: 䥏⁌慳摮⁳畯灴瑵愠⁴敎數썮겂蓢玢䰠湯⁧慌敫瀠潲敪瑣搠潲灰摥椠敆牢慵祲�
UTF-16: 佉䰠獡湤猠潵瑰畴⁡琠乥硥滃ꋢ芬ꉳ⁌潮朠䱡步⁰牯橥捴⁤牯灰敤⁩渠䙥扲畡特�
Latin1: OIL sands output at Nexenââ¬â¢s Long Lake project dropped in February.

I loaded the strings into the DB using some predefined SQL from a file. The file is UTF-8 encoded.

mysql -u april -p -D april < insert_articles.sql

The file includes the following line:

 INSERT INTO article (text) value ("OIL sands output at Nexen’s Long Lake project dropped in February.");

When I print that file from my application using:

BufferedReader reader = new BufferedReader(new FileReader(new File("/home/path/to/file/sql_article_inserts.sql")));
 String str;
 while((str = reader.readLine()) != null) {
     LOGGER.debug("LINE: " + str);
 }

I get the correct, expected output:

LINE: INSERT INTO article (text) value ("OIL sands output at Nexen’s Long Lake project dropped in February.");
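
One caveat about the file-reading snippet above: `FileReader(File)` decodes with the platform default charset, so a correct result only shows that the default happens to match the file. A minimal sketch that pins the charset to UTF-8 might look like this. The class and method names are hypothetical, and the file path is taken from the command line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class Utf8FileDump {
    // Reads every line of the file, always decoding the bytes as UTF-8.
    static List<String> readLinesUtf8(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the SQL file (an assumption of this sketch).
        for (String line : readLinesUtf8(Paths.get(args[0]))) {
            System.out.println("LINE: " + line);
        }
    }
}
```

Unlike `FileReader`, `Files.newBufferedReader` throws an exception on malformed input instead of silently substituting replacement characters, which makes an encoding mismatch easier to spot.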

Any help would be much appreciated.

Some system details:
I am running on Linux (Ubuntu).

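Edits:
* Edited to specify the OS.
* Edited to add the detailed output from reading the SQL input file.
* Edited to add more information about how the data is inserted into the database.
* Edited to fix typos in the code and clarify the examples.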

Solution:

Is it possible that you are reading the log file with the wrong encoding? Windows-1252, I'm guessing.

UTF-8: OIL sands output at Nexenâ€™s Long Lake project dropped in February.

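If this appears in the log, take a hex dump of the log file. If the data is UTF-8, you would expect the sequence Nexen’s to be 4E 65 78 65 6E E2 80 99 73. If another application reads those bytes using the native ANSI encoding, it will decode them as Nexenâ€™s.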
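
The byte sequence described above can be reproduced in Java without a database. This is a minimal sketch: the class and method names are hypothetical, and it assumes the JVM provides the `windows-1252` charset, which is not among the charsets every Java platform is required to support.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Formats bytes as space-separated upper-case hex pairs, like a hex dump.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes the text as UTF-8, then decodes those bytes with the wrong
    // charset (windows-1252), which is how the garbled text arises.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        String original = "Nexen\u2019s"; // U+2019 is the right single quotation mark
        System.out.println(toHex(original.getBytes(StandardCharsets.UTF_8)));
        System.out.println(misread(original));
    }
}
```

The three bytes E2 80 99 encode the single character U+2019 in UTF-8. Read one byte at a time as Windows-1252, they become â, € and ™.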

To confirm, you can also dump the individual characters of the returned value to see whether they are correct in UTF-16:

//untested
for(char ch : text.toCharArray()) {
   System.out.printf("%04x%n", (int) ch);
}

I am assuming all the data is in the BMP, so you can just look up the results in the Unicode charts.
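
If the data might contain characters outside the BMP, a variant that walks code points rather than UTF-16 code units could be used instead. This is a minimal sketch with hypothetical class and method names. It relies on `String.codePoints()`, which is available from Java 8.

```java
public class CodePointDump {
    // Returns each code point of the text in U+XXXX notation, space-separated.
    // Surrogate pairs are combined into a single code point.
    static String dump(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // A correctly decoded apostrophe shows up as the single value U+2019.
        System.out.println(dump("Nexen\u2019s"));
    }
}
```

A correctly decoded string shows U+2019 for the apostrophe, while the garbled form shows U+00E2 U+20AC U+2122 in its place.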

Tags: character-encoding, utf-8, jdbc, java, mysql
Source: https://codeday.me/bug/20191208/2092316.html