Help me please
Hello Sherif,
Thank you so much for the OCR model for Arabic to English handwritten text.
When I tested https://huggingface.co/spaces/sherif1313/Arabic-English-handwritten-OCR on an image of Arabic handwriting, the results were excellent. However, when I tried running the model myself on Google Colab, the results were completely different from those of the Hugging Face Space. The extracted text appeared as random characters instead of the original text.
Could you please tell me the following:
What heuristic code is used in Space?
Are there any pre-processing steps for the image?
Is there any recommended guidance for achieving the best OCR results?
What is the actual model used in Space?
How can I obtain it so that the results on Colab match those on Space?
Thank you for your great work.
You will find everything here: https://github.com/sherif1313/Arabic-English-handwritten-OCR-v3 (see the files app.py and requirements.txt), then install the dependencies with:
pip install -r requirements.txt
Feel free to ask any questions
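For the preprocessing question: app.py in the repository is the authoritative reference for what the Space actually does to images. As a generic illustration only (not necessarily the Space's steps), typical handwriting-OCR preprocessing with Pillow might look like:

```python
from PIL import Image, ImageOps

def preprocess_for_ocr(img, long_side=1024):
    """Generic OCR preprocessing sketch: grayscale, auto-contrast,
    and downscale so the longest side is at most `long_side` pixels.
    Illustrative only; the Space's real steps live in app.py."""
    gray = ImageOps.autocontrast(img.convert("L"))
    scale = long_side / max(gray.size)
    if scale < 1:
        gray = gray.resize((round(gray.width * scale),
                            round(gray.height * scale)))
    return gray
```

Feeding the model the same preprocessed image on Colab as in the Space is often enough to make the outputs match.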
Thank you, Mr. Sherif, for replying to my message. However, the model validation point is incomplete: almost all weights are reported missing when the model loads, including the visual encoder layers and the entire language model. Only a few of the converter weights are present. Could you please upload the complete set of model weights?
I checked this link
https://github.com/sherif1313/Arabic-English-handwritten-OCR-v3 and all the files in it, but I couldn't find the correct way to load the model and get the same result I got in the Space.
Run the model from your own device to make sure the problem isn't on Google Colab's side.
Hello Sherif,
I am trying to run the model locally on WSL with an RTX 4060 8 GB GPU. When I run the following code:
# Load the model with optimized settings
import torch
from transformers import Qwen2_5_VLForConditionalGeneration

model_name = "sherif1313/Arabic-English-handwritten-OCR-v3"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_name,
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
I receive the following output
#output
Fetching 2 files: 100%|██████████| 2/2 [00:00<00:00, 18517.90it/s]
Loading weights: 100%|██████████| 824/824 [00:42<00:00, 19.48it/s]
Qwen2_5_VLForConditionalGeneration LOAD REPORT from: sherif1313/Arabic-English-handwritten-OCR-v3
Key | Status |
---------------+---------+-
lm_head.weight | MISSING |
Notes:
- MISSING: those params were newly initialized because missing from the checkpoint. Consider training on your downstream task.
Why is lm_head.weight missing? Am I implementing something wrong?
Note that I also tried device_map="cuda" instead of device_map="auto".
This error is very strange, because the weights should not become separated from the model. I think it was a download error. If it's easy for you, please download the files again (or the quantized version), but first verify the SHA256 checksums: the files should hash to
2758295a1894d02e35b9053c2a7aa65e4d59266750c11986c6e90d82d1e970be and
763a1686ba9e9917fe17f0c7ce124a405885aa1a953063b38877c34796de877c. Check the SHA256 values and let me know the result first.
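To compute those checksums, a small hashlib script works; the filename below is only a placeholder for whichever downloaded shard you want to check:

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 in 1 MiB chunks so large
    checkpoint shards never have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Example (placeholder filename; use your actual downloaded file):
# print(sha256_of("model.safetensors"))
```

Compare the printed digest against the expected values above; any difference means the download is corrupt and should be repeated.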
Hi Sherif,
It worked! The issue was a transformers version mismatch: the model requires 4.57.1, but I had 5.5 installed.
Once I downgraded transformers, it worked fine.
Anyone reading this thread: beware of version mismatches between Python packages.
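One lightweight guard against this kind of mismatch (a sketch, not taken from the repository) is to check the installed version at the top of the script and fail loudly before loading anything:

```python
from importlib.metadata import version, PackageNotFoundError

def parse_version(v):
    """Turn '4.57.1' into (4, 57, 1); non-numeric suffixes are ignored."""
    parts = []
    for piece in v.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if digits:
            parts.append(int(digits))
    return tuple(parts)

def matches_pin(package, pinned):
    """True if `package` is installed and its version equals `pinned`."""
    try:
        return parse_version(version(package)) == parse_version(pinned)
    except PackageNotFoundError:
        return False

# e.g. at the top of the OCR script:
# assert matches_pin("transformers", "4.57.1"), "pin transformers==4.57.1"
```

Pinning exact versions in requirements.txt (transformers==4.57.1 rather than a bare transformers) avoids the problem in the first place.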