Help me please
Hello Sherif,
Thank you so much for the OCR model for Arabic to English handwritten text.
When I tested https://huggingface.co/spaces/sherif1313/Arabic-English-handwritten-OCR on an image of Arabic handwriting, the results were excellent. However, when I tried running the model myself on Google Colab, the results were completely different from those of the Hugging Face Space. The extracted text appeared as random characters instead of the original text.
Could you please tell me the following:
What heuristic code is used in Space?
Are there any pre-processing steps for the image?
Is there any recommended guidance for achieving the best OCR results?
What is the actual model used in Space?
How can I obtain it so that the results on Colab match those on Space?
Thank you for your great work.
You will find everything here: https://github.com/sherif1313/Arabic-English-handwritten-OCR-v3 (see the files app.py and requirements.txt), then install the dependencies with:
pip install -r requirements.txt
Feel free to ask any questions
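For the preprocessing question: app.py in the repository is the authoritative reference for what the Space actually does to images. As a generic illustration only (not necessarily the Space's steps), typical handwriting-OCR preprocessing with Pillow might look like:

```python
from PIL import Image, ImageOps

def preprocess_for_ocr(img, long_side=1024):
    """Generic OCR preprocessing sketch: grayscale, auto-contrast,
    and downscale so the longest side is at most `long_side` pixels.
    Illustrative only; the Space's real steps live in app.py."""
    gray = ImageOps.autocontrast(img.convert("L"))
    scale = long_side / max(gray.size)
    if scale < 1:
        gray = gray.resize((round(gray.width * scale),
                            round(gray.height * scale)))
    return gray
```

Feeding the model the same preprocessed image on Colab as in the Space is often enough to make the outputs match.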
Thank you, Mr. Sherif, for replying to my message. However, the model validation point is incomplete: almost all weights are reported missing when the model loads, including the visual encoder layers and the entire language model. Only a few of the converter weights are present. Could you please upload the complete set of model weights?
I checked this link
https://github.com/sherif1313/Arabic-English-handwritten-OCR-v3 and all the files in it, but I couldn't find the correct way to load the model and get the same result I got in the Space.
Run the model from your own device to make sure the problem isn't on Google Colab's side.
Hello Sherif,
I am trying to run the model locally on WSL with an RTX 4060 8 GB GPU. When I run the following code:
# Load the model with optimized settings
import torch
from transformers import Qwen2_5_VLForConditionalGeneration

model_name = "sherif1313/Arabic-English-handwritten-OCR-v3"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_name,
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
I receive the following output
#output
Fetching 2 files: 100%|██████████| 2/2 [00:00<00:00, 18517.90it/s]
Loading weights: 100%|██████████| 824/824 [00:42<00:00, 19.48it/s]
Qwen2_5_VLForConditionalGeneration LOAD REPORT from: sherif1313/Arabic-English-handwritten-OCR-v3
Key | Status |
---------------+---------+-
lm_head.weight | MISSING |
Notes:
- MISSING: those params were newly initialized because missing from the checkpoint. Consider training on your downstream task.
Why is lm_head.weight missing? Am I implementing something wrong?
Note that I also tried device_map="cuda" instead of device_map="auto".
This error is very strange, because the weights should not become separated from the model. I think it was a download error. If it's easy for you, please download the files again (or the quantized version), but first verify the SHA256 checksums: the files should hash to
2758295a1894d02e35b9053c2a7aa65e4d59266750c11986c6e90d82d1e970be and
763a1686ba9e9917fe17f0c7ce124a405885aa1a953063b38877c34796de877c. Check the SHA256 values and let me know the result first.
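To compute those checksums, a small hashlib script works; the filename below is only a placeholder for whichever downloaded shard you want to check:

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 in 1 MiB chunks so large
    checkpoint shards never have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Example (placeholder filename; use your actual downloaded file):
# print(sha256_of("model.safetensors"))
```

Compare the printed digest against the expected values above; any difference means the download is corrupt and should be repeated.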
Hi Sherif,
It worked! The issue was a transformers version mismatch: the model requires 4.57.1, but I had 5.5 installed.
Once I downgraded transformers, it worked fine.
Anyone reading this thread: beware of version mismatches between Python packages.
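One lightweight guard against this kind of mismatch (a sketch, not taken from the repository) is to check the installed version at the top of the script and fail loudly before loading anything:

```python
from importlib.metadata import version, PackageNotFoundError

def parse_version(v):
    """Turn '4.57.1' into (4, 57, 1); non-numeric suffixes are ignored."""
    parts = []
    for piece in v.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if digits:
            parts.append(int(digits))
    return tuple(parts)

def matches_pin(package, pinned):
    """True if `package` is installed and its version equals `pinned`."""
    try:
        return parse_version(version(package)) == parse_version(pinned)
    except PackageNotFoundError:
        return False

# e.g. at the top of the OCR script:
# assert matches_pin("transformers", "4.57.1"), "pin transformers==4.57.1"
```

Pinning exact versions in requirements.txt (transformers==4.57.1 rather than a bare transformers) avoids the problem in the first place.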