Direct Preference Optimization - Your Language Model is Secretly a Reward Model

AI / Machine Learning @compuverse.uk manitcor @lemmy.intai.tech
