Labels: fixed - pending confirmation (Fixed, waiting for confirmation from poster)
Description
- Did you update? Yes (running on Colab)
- Number of GPUs used: 1
- Which notebook? Please link!
- Which Unsloth version, TRL version, transformers version, PyTorch version? (a sketch for printing these follows the list)
  - Unsloth version: 2025.8.4
  - Transformers version: 4.56.0.dev0
  - Torch version: 2.8.0+cu128
- Which trainer? SFTTrainer, GRPOTrainer, etc.: SFTTrainer
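Not in the original report: a minimal sketch of how these version strings can be printed inside the Colab runtime, assuming each package exposes a standard __version__ attribute.

# Hypothetical helper cell, not from the original report.
import unsloth       # unsloth is conventionally imported first
import transformers
import torch

print("Unsloth version:", unsloth.__version__)
print("Transformers version:", transformers.__version__)
print("Torch version:", torch.__version__)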
from transformers import TextStreamer  # needed for the streamer below

messages = [
    {"role": "user", "content": "Solve x^5 + 3x^4 - 10 = 3."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt = True,
    return_tensors = "pt",
    return_dict = True,
    reasoning_effort = "low", # **NEW!** Set reasoning effort to low, medium or high
).to(model.device)
_ = model.generate(**inputs, max_new_tokens = 512, streamer = TextStreamer(tokenizer))
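Not part of the original report: device-side asserts during generate are often triggered by a token id falling outside the embedding range, which seems plausible here given the new reasoning_effort special tokens. A quick sanity-check sketch over the inputs built above, assuming the model and inputs objects from the repro:

# Hedged sanity check (assumption: an out-of-range token id is the trigger).
vocab_size = model.get_input_embeddings().weight.shape[0]  # embedding rows
ids = inputs["input_ids"].cpu()  # check on CPU so it works even if CUDA is wedged
print("max token id:", int(ids.max()), "| embedding rows:", vocab_size)
print("out-of-range ids:", ids[(ids < 0) | (ids >= vocab_size)].tolist())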
Cell/Error Output
<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
Current date: 2025-08-13
Reasoning: low
# Valid channels: analysis, commentary, final. Channel must be included for every message.
Calls to these tools must go to the commentary channel: 'functions'.<|end|><|start|>user<|message|>Solve x^5 + 3x^4 - 10 = 3.<|end|><|start|>assistant<|channel|>analysis<|message|>Equation: x^5 + 3x^4 - 10 = 3. So x^5 + 3x^4 - 13 =0. Solve for real roots? maybe numeric. Let's try approximate.
We can test integer roots: try x=1 => 1+3-13=-9. x=2 =>32+48-13=67. So root between 1 and 2. Try x=1.5 => (1.5)^5=7.59375 +3*(1.5^4=5.0625)=15.1875 total 22.78125-13=9.78125>0. So root between 1 and 1.5. Try x=1.2: 1.2^5=2.48832 +3*1.2^4=3*2.0736=6.2208 total=8.70912-13=-4.29088 negative. x=1.3: 1.3^5=3.71293 +3*1.3^4=3*2.856=8.568 total=12.28093-13=-0.71907. x=1.35: 1.35^5=1.35^4*1.35. 1.35^2=1.8225, ^4=3.325, times 1.35=4.4918. plus 3*1.35^4=3*3.325=9.975 total=14.4668-13=1.4668. So root between 1.3 and 1.35. Interpolate. try 1.32: 1.32^5: (1.32^2=1.7424, ^4=3.0364, *1.32=4.0055) plus
---------------------------------------------------------------------------
AcceleratorError Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/unsloth/models/vision.py in unsloth_base_fast_generate(self, *args, **kwargs)
232 with torch.inference_mode(), autocaster:
--> 233 output = self._old_generate(*args, **kwargs)
234 except:
(9 intermediate frames hidden)
/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py in decorate_context(*args, **kwargs)
119 with ctx_factory():
--> 120 return func(*args, **kwargs)
121
/usr/local/lib/python3.11/dist-packages/transformers/generation/utils.py in generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, negative_prompt_ids, negative_prompt_attention_mask, use_model_defaults, custom_generate, **kwargs)
2475 # 12. run assisted generate
-> 2476 result = self._assisted_decoding(
2477 input_ids,
/usr/local/lib/python3.11/dist-packages/transformers/generation/utils.py in _assisted_decoding(self, input_ids, candidate_generator, logits_processor, stopping_criteria, generation_config, synced_gpus, streamer, **model_kwargs)
4875 # Ensure we don't generate beyond max_len or an EOS token
-> 4876 if is_done_candidate and n_matches == candidate_length:
4877 n_matches -= 1
AcceleratorError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
During handling of the above exception, another exception occurred:
AcceleratorError Traceback (most recent call last)
/tmp/ipython-input-1892116402.py in <cell line: 0>()
12 ).to(model.device)
13
---> 14 _ = model.generate(**inputs, max_new_tokens = 512, streamer = TextStreamer(tokenizer))
/usr/local/lib/python3.11/dist-packages/peft/peft_model.py in generate(self, *args, **kwargs)
884 with self._enable_peft_forward_hooks(*args, **kwargs):
885 kwargs = {k: v for k, v in kwargs.items() if k not in self.special_peft_forward_args}
--> 886 return self.get_base_model().generate(*args, **kwargs)
887
888 def _get_base_model_class(self, is_prompt_tuning=False):
/usr/local/lib/python3.11/dist-packages/unsloth/models/vision.py in unsloth_base_fast_generate(self, *args, **kwargs)
236 kwargs.pop("prompt_lookup_num_tokens", None)
237 with torch.inference_mode(), autocaster:
--> 238 output = self._old_generate(*args, **kwargs)
239 finally:
240 pass
/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py in decorate_context(*args, **kwargs)
118 def decorate_context(*args, **kwargs):
119 with ctx_factory():
--> 120 return func(*args, **kwargs)
121
122 return decorate_context
/usr/local/lib/python3.11/dist-packages/transformers/generation/utils.py in generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, negative_prompt_ids, negative_prompt_attention_mask, use_model_defaults, custom_generate, **kwargs)
2311
2312 device = inputs_tensor.device
-> 2313 self._prepare_special_tokens(generation_config, kwargs_has_attention_mask, device=device)
2314
2315 # decoder-only models must use left-padding for batched generation.
/usr/local/lib/python3.11/dist-packages/transformers/generation/utils.py in _prepare_special_tokens(self, generation_config, kwargs_has_attention_mask, device)
2053 return torch.tensor(token, device=device, dtype=torch.long)
2054
-> 2055 bos_token_tensor = _tensor_or_none(generation_config.bos_token_id, device=device)
2056 eos_token_tensor = _tensor_or_none(generation_config.eos_token_id, device=device)
2057 pad_token_tensor = _tensor_or_none(generation_config.pad_token_id, device=device)
/usr/local/lib/python3.11/dist-packages/transformers/generation/utils.py in _tensor_or_none(token, device)
2051 if isinstance(token, torch.Tensor):
2052 return token.to(device)
-> 2053 return torch.tensor(token, device=device, dtype=torch.long)
2054
2055 bos_token_tensor = _tensor_or_none(generation_config.bos_token_id, device=device)
AcceleratorError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
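As the error text itself suggests, the asynchronous CUDA assert can be pinned to the actual failing kernel by forcing synchronous launches. A minimal debugging sketch; the variable must be set before torch initializes CUDA, so restart the runtime and run this in the first cell:

import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # per the hint in the error message above

import torch  # imported only after the env var is set so it takes effect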