
Plan_Q-RAG

Setup on a Rented GPU

parent_dir/
├── Q-RAG/      ← [Q-RAG](https://github.com/griver/Q-RAG.git)
└── datasets/   ← [datasets Hotpotqa and Musique](https://huggingface.co/datasets/Q-RAG/Hotpotqa_and_Musique)
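
The layout above can be created up front. A minimal sketch, using /tmp/qrag_demo as a stand-in for parent_dir (on the rented GPU box this would typically be /workspace):

```shell
# Create the expected parent layout.
# /tmp/qrag_demo is only for illustration; substitute your actual
# parent directory, e.g. /workspace.
PARENT_DIR=/tmp/qrag_demo
mkdir -p "$PARENT_DIR/Q-RAG" "$PARENT_DIR/datasets"
ls "$PARENT_DIR"
```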

Clone the datasets for Q-RAG

```shell
git clone https://huggingface.co/datasets/Q-RAG/Hotpotqa_and_Musique
cd Hotpotqa_and_Musique
unzip hotpotqa+musique.zip -d /workspace/datasets
cd ..
rm -rf Hotpotqa_and_Musique
du -h /workspace/datasets  # verify the extracted dataset size
```

Clone the Q-RAG repo

```shell
git clone https://github.com/griver/Q-RAG.git
cd Q-RAG
# Only needed if you don't have a self-trained HotpotQA model yet
git clone https://huggingface.co/Q-RAG/qrag-ft-e5-on-hotpotqa
```

Environment Setup

```shell
# Create and activate a conda environment
conda create -n qrag python=3.12 -y
conda activate qrag

python -m pip install -U pip wheel
pip install vllm  # pulls compatible PyTorch, Transformers, Triton, etc.
pip install hydra-core tensorboard rotary-embedding-torch pandas nltk sortedcontainers accelerate datasets
```

```shell
# Check the environment
python -c "from rl.agents.pqn import PQNActor; print('✅ Q-RAG installed successfully')"
```
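
Beyond that import check, a stdlib-only script can confirm that the key dependencies resolve and that the NVIDIA driver tools are visible. A sketch, with the package names assumed from the pip installs above:

```python
import importlib.util
import shutil

# Report which of the installed training dependencies are importable,
# and whether nvidia-smi (the NVIDIA driver tool) is on PATH.
for pkg in ["torch", "vllm", "hydra", "pandas", "nltk"]:
    status = "ok" if importlib.util.find_spec(pkg) else "MISSING"
    print(f"{pkg}: {status}")
print("nvidia-smi on PATH:", shutil.which("nvidia-smi") is not None)
```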

Train: Logging with Timestamps

```shell
# eval_interval is reduced from the original 100
python train_q_rag_logt.py \
   envs=hotpotqa \
   algo=pqn_e5_hotpotqa \
   envs.data_path="/workspace/datasets/hotpotqa" \
   steps_count=10000 \
   batch_size=12 \
   accumulate_grads=8 \
   eval_interval=50 \
   envs_parallel=1 \
   max_action_length=220
```
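
With gradient accumulation, the effective optimization batch is batch_size × accumulate_grads. For the flags above:

```python
# Effective batch size per optimizer step for the command above
batch_size = 12
accumulate_grads = 8
effective = batch_size * accumulate_grads  # gradients accumulate over 8 micro-batches
print(effective)  # → 96
```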

Original Training Command

```shell
python train_q_rag.py \
   envs=hotpotqa \
   algo=pqn_e5_hotpotqa \
   envs.data_path="/workspace/datasets/hotpotqa" \
   steps_count=10000 \
   batch_size=12 \
   accumulate_grads=8 \
   eval_interval=100 \
   envs_parallel=1 \
   max_action_length=220
```

Compute Resources

Model trained on HotpotQA+Musique (combined, GTE embedder); the Q-RAG paper does not report test results for this setting.

  • Training time: 18:07:48
  • GPU: Pro 6000 96GB
  • VRAM usage: 60 GB ± 0.5 GB (screenshot taken at the end of the run)
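
Assuming that run covered the steps_count=10000 configured above, the wall-clock throughput works out to roughly:

```python
from datetime import timedelta

# Wall time 18:07:48 for an assumed 10,000-step run
wall_s = timedelta(hours=18, minutes=7, seconds=48).total_seconds()
steps = 10_000
print(f"{steps / wall_s:.3f} steps/s")  # → 0.153 steps/s
```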

HotpotQA Inference

  • Run time: 00:12:26
  • GPU: NVIDIA A100-SXM4-80GB
  • VRAM usage: 30 GB ± 1 GB (screenshot taken at the end of the run)