2024-07-12
https://github.com/meta-llama/llama3/issues/80
Loading the model works fine, but the following error shows up during inference:
RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16'
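For anyone hitting the same thing, here is a minimal sketch that reproduces the failing kernel directly, without loading any model. My assumption is that the op behind the error is the triu/tril call used when building the causal attention mask; on installs where that kernel has no BFloat16 CUDA variant, the snippet raises the same RuntimeError, while float16 goes through.

import torch

# Minimal repro of the op named in the error message (requires a CUDA GPU).
x = torch.ones(4, 4, dtype=torch.bfloat16, device="cuda")
try:
    torch.triu(x)
    print("triu on bfloat16 CUDA tensors works on this install")
except RuntimeError as e:
    print("reproduced:", e)

torch.triu(torch.ones(4, 4, dtype=torch.float16, device="cuda"))  # succeeds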
————————————————
When I try to use the AutoProcessor from transformers, it raises:
RuntimeError: Failed to import transformers.models.auto.processing_auto because of the following error (look up to see its traceback):
Detected that PyTorch and torchvision were compiled with different CUDA versions. PyTorch has CUDA Version=11.8 and torchvision has CUDA Version=11.7. Please reinstall the torchvision that matches your PyTorch install.
So the CUDA versions of my torch and torchvision don't match? But I originally installed them following the official PyTorch instructions...
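A quick way to check what is actually installed (a small sketch; it reads the torchvision/torchaudio versions from package metadata instead of importing them, since importing a mismatched torchvision is exactly what seems to trigger this error):

import torch
from importlib.metadata import version

# torch.version.cuda is the CUDA toolkit the installed torch wheel was built with;
# the +cuXXX suffix in the version strings tells you which build each wheel is.
print("torch       :", torch.__version__, "| built with CUDA", torch.version.cuda)
print("torchvision :", version("torchvision"))
print("torchaudio  :", version("torchaudio"))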
My installed versions were as follows:
torch 2.0.0+cu118
torchaudio 2.0.1
torchvision 0.15.1
Strangely, the last two have no +cu118 suffix. So I went back to the official PyTorch website and reinstalled:
pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118
This time it came out right. I had only uninstalled torchvision, so torchaudio was not updated:
torch 2.0.0+cu118
torchaudio 2.0.1
torchvision 0.15.1+cu118
And this is when the RuntimeError at the top started appearing.
————————
I am loading Qwen1.5-7B with torch_dtype=torch.bfloat16. After changing it to torch_dtype=torch.float16, inference works. Rolling torchvision back to the previous build also works.
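For reference, a minimal sketch of the float16 workaround, assuming the stock transformers loading path and the Qwen/Qwen1.5-7B-Chat checkpoint (substitute your own model id or local path):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-7B-Chat"  # assumption: adjust to the checkpoint you actually use

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # workaround: float16 instead of bfloat16
    device_map="auto",          # requires accelerate
)

inputs = tokenizer("Hello, who are you?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))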
But torch.float16 and torch.bfloat16 are two completely different things. It doesn't seem right to just switch them...
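The difference is easy to see with torch.finfo: bfloat16 keeps float32's exponent range but has far fewer mantissa bits, while float16 has more precision but a much smaller range, so swapping one for the other changes overflow and rounding behavior.

import torch

# bfloat16: 8 exponent bits, 7 mantissa bits  -> huge range, coarse precision
# float16 : 5 exponent bits, 10 mantissa bits -> finer precision, max value 65504
print(torch.finfo(torch.bfloat16))
print(torch.finfo(torch.float16))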
——————————————
With torch_dtype="auto", transformers will automatically use bfloat16, since that is the dtype recorded in the checkpoint's config.json.
I also did some experiments, printing model.config under different settings, but the output is too long to paste in full here.
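For anyone who wants to repeat the comparison, a sketch of the kind of check I mean (assuming the same Qwen1.5-7B-Chat checkpoint; model.dtype is the dtype the parameters were actually loaded in, and model.config.torch_dtype is what the config reports):

import torch
from transformers import AutoModelForCausalLM

model_id = "Qwen/Qwen1.5-7B-Chat"  # assumption: substitute your checkpoint

# Note: this reloads a 7B model once per setting, so it is slow and memory-hungry.
for dtype in (None, "auto", torch.float16, torch.bfloat16):
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
    # Compare what ends up in the config vs. the dtype the weights actually use.
    print(f"torch_dtype={dtype!s:<22} config.torch_dtype={model.config.torch_dtype} model.dtype={model.dtype}")
    del model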