Title: Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2
Authors: Hamish Ivison, Yizhong Wang, Valentina Pyatkin, Nathan Lambert, Matthew Peters, Pradeep Dasigi, Joel Jang, David Wadden, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi
Published: 17 November 2023
Link: http://arxiv.org/abs/2311.10702v2
Abstract
Since the release of TÜLU [Wang et al., 2023b], open resources for instruction tuning have developed quickly, from better base models to new finetuning techniques. We test and incorporate a number of these advances into TÜLU, resulting in TÜLU 2, a suite of improved TÜLU models for advancing the understanding and best practices of adapting pretrained language models to downstream tasks and user preferences. Concretely, we release: (1) TÜLU-V2-mix, an improved collection of high-quality instruction datasets; (2) TÜLU 2, LLAMA-2 models finetuned on the V2 mixture; (3) TÜLU 2+DPO, TÜLU 2 models trained with direct preference optimization (DPO), including the largest DPO-trained model to date (TÜLU 2+DPO 70B); (4) CODE TÜLU 2, CODE LLAMA models finetuned on our V2 mix that outperform CODE LLAMA and its instruction-tuned variant, CODE LLAMA-Instruct. Our evaluation from multiple perspectives shows that the TÜLU 2 suite achieves state-of-the-art performance among open models and matches or exceeds the performance of GPT-3.5-turbo-0301 on several benchmarks. We release all the checkpoints, data, training and evaluation code to facilitate future open efforts on adapting large language models.
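For context on item (3), the following is a sketch of the DPO objective introduced by Rafailov et al. (2023), which the abstract refers to but does not define; the notation (chosen/rejected responses y_w and y_l, reference policy pi_ref, and temperature beta) follows that paper rather than anything stated here:

\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}} \left[ \log \sigma\!\left( \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)} \right) \right]

Here \pi_\theta is the policy being trained, \pi_{\mathrm{ref}} is a frozen reference model (presumably the instruction-tuned TÜLU 2 checkpoint in this setting), and \beta controls how far the policy may drift from the reference while preferring chosen over rejected responses.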