Pretrained models from the paper "Predicting the Order of Upcoming Tokens Improves Language Modeling"
Zayd Muhammad Kawakibi Zuhri PRO
zaydzuhri
AI & ML interests
I really like watching loss go down
Recent Activity
updated a dataset about 20 hours ago
zaydzuhri/selective-copy-256
published a dataset about 20 hours ago
zaydzuhri/selective-copy-256
updated a dataset about 21 hours ago
zaydzuhri/reverse-copy-256
Organizations
None yet