Fix UTF-8 BOM handling
robincodex committed Mar 30, 2024
1 parent 60c71d5 commit 7c4053c
Showing 8 changed files with 920 additions and 446 deletions.
16 changes: 1 addition & 15 deletions README-zh-cn.md
@@ -24,21 +24,7 @@ yarn add easy-keyvalues

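 ## UTF-8 BOM

-This library fails on `UTF-8 BOM` files because the first character of such a file has char code `65279`, which causes a parsing error. Remove the `BOM` beforehand with a library such as [iconv-lite](https://github.com/ashtuchkin/iconv-lite).
-
-~~The library has been adapted for nodejs: it uses [chardet](https://www.npmjs.com/package/chardet) to detect the encoding and
-[iconv-lite](https://github.com/ashtuchkin/iconv-lite) to decode and encode when reading and writing.~~
-
-~~With a custom adapter, use a library such as [iconv-lite](https://github.com/ashtuchkin/iconv-lite) to remove the
-`BOM` beforehand.~~
-
-The nodejs adapter no longer uses `chardet` and `iconv-lite`. Automatically detecting and converting file encodings produces unpredictable results, so that support has been removed and an `encoding` parameter has been added.
-
-```js
-const buf = readFileSync(join(__dirname, 'chat_english.txt'));
-const text = iconvLite.decode(buf, 'utf8');
-const kv = KeyValues.Parse(text);
-```
+The BOM is now removed automatically when a file in this format is encountered.

 # KeyValues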

16 changes: 1 addition & 15 deletions README.md
@@ -29,21 +29,7 @@ yarn add easy-keyvalues

 ## UTF-8 BOM

-This library will get errors when encountering `UTF-8 BOM` files, because the first char code of
-`UTF-8 BOM` files will be 65279, which causes parsing errors. So you need to use a library such as
-[iconv-lite](https://github.com/ashtuchkin/iconv-lite) to remove the `BOM` in advance.
-
-~~The library has been adapted for nodejs, uses [chardet](https://www.npmjs.com/package/chardet) to
-determine the encoding format, and uses [iconv-lite](https://github.com/ashtuchkin/iconv-lite) for
-decoding and encoding when reading and writing.~~
-
-The nodejs adaptation has removed `chardet` and `iconv-lite`. Due to the uncertainty of automatically determining file encodings and automatically converting encodings, this support has been removed and an `encoding` parameter has been added.
-
-```js
-const buf = readFileSync(join(__dirname, 'chat_english.txt'));
-const text = iconvLite.decode(buf, 'utf8');
-const kv = KeyValues.Parse(text);
-```
+The BOM is now removed automatically when a file in this format is loaded.

 # KeyValues

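The README text in this commit attributes the old parse failures to the first char code of a BOM file being 65279, which is U+FEFF. The following is a minimal sketch of what removing that character can look like once the file has been decoded to a string. The `stripBom` helper and the sample input are hypothetical and written for illustration only; they are not taken from the easy-keyvalues source, which may handle the BOM differently.

```typescript
// U+FEFF (decimal 65279) is the byte order mark as it appears in a decoded string.
const BOM_CHAR_CODE = 0xfeff;

// Hypothetical helper for illustration; not part of the easy-keyvalues API.
// Drops a single leading BOM and leaves any other input unchanged.
function stripBom(text: string): string {
    return text.charCodeAt(0) === BOM_CHAR_CODE ? text.slice(1) : text;
}

// Simulated file content: a BOM followed by one KeyValues pair.
const withBom = '\uFEFF"Key" "Value"';
const cleaned = stripBom(withBom);

// Text without a BOM passes through untouched.
const untouched = stripBom('"Key" "Value"');
```

Checking only the first character keeps the helper cheap and avoids altering a U+FEFF that appears later in the text, where it is ordinary content rather than a byte order mark.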
4 changes: 2 additions & 2 deletions __tests__/KeyValues.test.ts
@@ -1,7 +1,7 @@
-import { join } from 'path';
-import { KeyValues, getKeyValuesAdapter } from '../src/node';
 import { describe, expect, test } from '@jest/globals';
 import crypto from 'crypto';
+import { join } from 'path';
+import { KeyValues, getKeyValuesAdapter } from '../src/node';

 function testKV(kv: KeyValues) {
     expect(kv.GetChildCount()).toBe(3);
31 changes: 16 additions & 15 deletions package.json
@@ -31,21 +31,22 @@
   ],
   "license": "MIT",
   "devDependencies": {
-    "@jest/globals": "^29.6.1",
-    "@rollup/plugin-commonjs": "25.0.2",
-    "@rollup/plugin-node-resolve": "15.1.0",
-    "@rollup/plugin-typescript": "11.1.2",
-    "@types/jest": "^29.5.2",
-    "@types/mocha": "^10.0.1",
-    "@types/node": "^20.4.0",
-    "jest": "^29.6.1",
+    "@jest/globals": "^29.7.0",
+    "@rollup/plugin-commonjs": "25.0.7",
+    "@rollup/plugin-node-resolve": "15.2.3",
+    "@rollup/plugin-typescript": "11.1.6",
+    "@types/jest": "^29.5.12",
+    "@types/mocha": "^10.0.6",
+    "@types/node": "^20.11.30",
+    "iconv-lite": "^0.6.3",
+    "jest": "^29.7.0",
     "jest-coverage-badges": "^1.1.2",
-    "prettier": "^3.0.0",
-    "rollup": "3.26.2",
-    "rollup-plugin-dts": "^5.3.0",
-    "ts-jest": "^29.1.1",
-    "ts-node": "^10.9.1",
-    "tslib": "^2.6.0",
-    "typescript": "5.1.6"
+    "prettier": "^3.2.5",
+    "rollup": "4.13.2",
+    "rollup-plugin-dts": "^6.1.0",
+    "ts-jest": "^29.1.2",
+    "ts-node": "^10.9.2",
+    "tslib": "^2.6.2",
+    "typescript": "5.4.3"
   }
 }