Stanisław Szymczyk
sszymczyk
AI & ML interests
None yet
Recent Activity
updated a model 5 days ago
sszymczyk/DeepSeek-V3.2-Speciale-light-GGUF
published a model 5 days ago
sszymczyk/DeepSeek-V3.2-Speciale-light-GGUF
updated a model 5 days ago
sszymczyk/DeepSeek-V3.2-Speciale-nolight-GGUF
Organizations
None yet
Running the model with a dense attention
4
#35 opened 3 months ago
by
sszymczyk
Quick Start section in README.md is a bit misleading
👍 1
2
#10 opened 21 days ago
by
sszymczyk
Temperature and top_p values are swapped in the example code:
👍 1
1
#2 opened about 2 months ago
by
sszymczyk
Recommended sampling parameters?
👍 1
2
#3 opened about 2 months ago
by
sszymczyk
Problems with logical reasoning performance of GLM-4.7-Flash
👀 1
1
#35 opened about 2 months ago
by
sszymczyk
Recommended sampling parameters
🤝 2
5
#6 opened 2 months ago
by
sszymczyk
Thinking mode
2
#2 opened 2 months ago
by
anikifoss
Feedback
17
#1 opened 2 months ago
by
bibproj
Please test the QwQ-32B-Preview model
➕ 2
#3 opened over 1 year ago
by
sszymczyk
This model performs worse in complex problems compared to the DeepSeek R1
17
#254 opened about 1 year ago
by
sszymczyk
Requesting Support for GGUF Quantization of MiniMax-Text-01 through llama.cpp
👍 24
7
#1 opened about 1 year ago
by
Doctor-Chad-PhD
Missing tokenizer.json and tokenizer_config.json files
1
#2 opened about 1 year ago
by
sszymczyk
Please add the "tokenizer.model" file
2
#3 opened about 1 year ago
by
ken133
Hardware Requirements
6
#1 opened over 1 year ago
by
Lightchain
Can you provide code for inference with MCTS?
👍 6
6
#3 opened over 1 year ago
by
sszymczyk
Reason behind not using special tokens in the prompt format?
2
#2 opened over 1 year ago
by
Doctor-Shotgun
The curse of the Consolidated Safetensors strikes again...
2
#4 opened over 1 year ago
by
jukofyork
The model often enters infinite generation loops
👍 5
13
#32 opened over 1 year ago
by
sszymczyk