[Panic] byte index 2 is not a char boundary #9145

Closed
yuuuxt opened this issue Dec 15, 2023 · 3 comments · Fixed by #9146

Comments

@yuuuxt

yuuuxt commented Dec 15, 2023

Using v2023.56.0 and 1.85.1 on Windows 10.

It seems that a docstring line starting with a Chinese character causes this.

In a Jupyter notebook, I created a function:

def sample_func(xx):
    """
    转置 (transpose)
    """
    return xx.T

The error log:

thread 'main' panicked at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1\library\core\src\str\mod.rs:660:13:
byte index 2 is not a char boundary; it is inside '转' (bytes 0..3) of `转置`
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
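
The panic message can be read off the UTF-8 layout of the docstring's first characters. The following is a minimal Python sketch, used purely for illustration (it is not Ruff's code, and the variable names are made up here), of why byte index 2 falls inside `转`:

```python
text = "转置"
encoded = text.encode("utf-8")

# Each of these two CJK characters takes three bytes in UTF-8,
# so the valid character boundaries are byte offsets 0, 3 and 6.
first_char_len = len("转".encode("utf-8"))
boundaries = [0]
for ch in text:
    boundaries.append(boundaries[-1] + len(ch.encode("utf-8")))

# Cutting the byte string at offset 2 leaves an incomplete sequence.
# Rust refuses to slice a str at such an offset and panics instead;
# in Python the equivalent mistake shows up as a decode error.
try:
    encoded[:2].decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False

print(first_char_len, boundaries, split_ok)
```

This matches the `(bytes 0..3)` range reported in the log: offset 2 is the last byte of `转`, not a boundary between characters.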
@MichaReiser
Member

What's the command you're running? Is it ruff check or ruff format?

MichaReiser added the bug label Dec 15, 2023
@yuuuxt
Author

yuuuxt commented Dec 15, 2023

I'm not running any command; the check runs automatically (I didn't trigger formatting manually), so I assume it's similar to running ruff check.

@manunio
Contributor

manunio commented Dec 15, 2023

@MichaReiser I think the bug affects both check and format. I was able to reproduce it by running check against an .ipynb file containing the sample_func above.

ruff check .\test.ipynb
error: Panicked while linting test.ipynb: This indicates a bug in Ruff. If you could open an issue at:

    https://github.com/astral-sh/ruff/issues/new?title=%5BLinter%20panic%5D

...with the relevant file contents, the `pyproject.toml` settings, and the following stack trace, we'd be very appreciative!

panicked at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1\library\core\src\str\mod.rs:660:13:
byte index 2 is not a char boundary; it is inside '转' (bytes 0..3) of `转置`
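
For anyone who wants to recreate the notebook without opening Jupyter, here is a hedged sketch that writes a minimal nbformat 4 file containing only the `sample_func` cell. The exact notebook used in the reproduction above is not shown in the thread, so the cell layout, the metadata, and the temporary-directory location are assumptions:

```python
import json
import tempfile
from pathlib import Path

# The cell source, one string per line, as notebooks store it.
source = [
    "def sample_func(xx):\n",
    '    """\n',
    "    转置 (transpose)\n",
    '    """\n',
    "    return xx.T\n",
]

# Minimal nbformat 4 notebook with a single code cell (assumed layout).
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": source,
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

path = Path(tempfile.gettempdir()) / "test.ipynb"
# ensure_ascii=False keeps the Chinese characters as raw UTF-8 in the file.
path.write_text(json.dumps(notebook, ensure_ascii=False, indent=1), encoding="utf-8")
print(path)
```

Running `ruff check` on the printed path with an affected version should then correspond to the command shown above.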

konstin self-assigned this Dec 15, 2023
konstin added a commit that referenced this issue Dec 15, 2023
dhruvmanila pushed a commit that referenced this issue Dec 15, 2023
The example below used to panic because we tried to split at 2 bytes in
the 3-byte character `转`.
```python
def sample_func(xx):
    """
    转置 (transpose)
    """
    return xx.T
```

Fixes #9145
Fixes astral-sh/ruff-vscode#362

The second commit is a small test refactoring.
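
As an illustration of the general guard against this class of bug, the sketch below checks whether a byte offset is a UTF-8 character boundary before splitting. It is not the change made in #9146; the helper names are hypothetical, with `is_char_boundary` named after Rust's `str::is_char_boundary`, and Python is used only to keep the example short:

```python
def is_char_boundary(data: bytes, index: int) -> bool:
    """Return True if `index` sits at the start or end of a UTF-8 sequence."""
    if index < 0 or index > len(data):
        return False
    if index == 0 or index == len(data):
        return True
    # Continuation bytes have the bit pattern 10xxxxxx.
    return (data[index] & 0xC0) != 0x80


def floor_char_boundary(data: bytes, index: int) -> int:
    """Move `index` left until it sits on a character boundary."""
    index = max(0, min(index, len(data)))
    while not is_char_boundary(data, index):
        index -= 1
    return index


line = "转置 (transpose)".encode("utf-8")
naive = 2  # the offset from the panic message
safe = floor_char_boundary(line, naive)
head, tail = line[:safe], line[safe:]
print(safe, tail.decode("utf-8"))
```

With the offset rounded down to 0, both halves decode cleanly, whereas splitting at the naive offset would cut `转` in two.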
charliermarsh pushed a commit that referenced this issue Dec 19, 2023
We've had bugs related to non-Latin scripts, most recently #9145, where
just starting a docstring with multi-byte characters would panic. I've
added https://github.com/binary-husky/gpt_academic to catch those in the
ecosystem checks; it's a popular repo with mixed English and Chinese
comments and symbols.