Deqing Fu PRO
deqing
AI & ML interests
None yet
Recent Activity
updated a model 2 minutes ago: deqing/llama-300M-v5-isolate_two
updated a model 43 minutes ago: deqing/llama-300M-v5-window_64
updated a model about 1 hour ago: deqing/llama-300M-v5-window_8
Organizations
Convergent Evolution (Architecture and Optimizer)
- deqing/convergent-llama-300M-muon-original
  Text Generation • 0.3B • Updated • 314
- deqing/convergent-gdn-300M-muon-original
  Text Generation • 0.3B • Updated • 286
- deqing/convergent-mamba2-300M-muon-original
  Text Generation • 0.3B • Updated • 284
- deqing/convergent-lstm-4layer-muon-original
  Text Generation • 0.2B • Updated • 296
Convergent Evolution (Addition)
- deqing/convergent-llama-300M-muon-addition_3digit
  Text Generation • 0.3B • Updated • 782
- deqing/convergent-llama-300M-muon-addition_3digit_seed123
  0.3B • Updated • 18
- deqing/convergent-llama-300M-muon-addition
  Text Generation • 0.3B • Updated • 1.72k
- deqing/convergent-llama-300M-adamw-addition_3digit
  Text Generation • 0.3B • Updated • 624
Convergent Evolution (Data)
- deqing/convergent-llama-300M-muon-original
  Text Generation • 0.3B • Updated • 314
- deqing/convergent-llama-300M-muon-unigram
  Text Generation • 0.3B • Updated • 226
- deqing/convergent-llama-300M-muon-isolate
  Text Generation • 0.3B • Updated • 264
- deqing/convergent-llama-300M-muon-swap_numbers
  Text Generation • 0.3B • Updated • 259