question about the training loss? #38

lufanma · 2024-12-03T02:56:58Z

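Great and meaningful work!!

I trained on 8 A100 GPUs with the original configuration in llava15_train.sh (batch size setting unchanged); the loss and some metric logs are shown in the figure above. The loss appears to oscillate, swinging high and low. Also, so far 'rewards_train/chosen' is often lower than 'rewards_train/rejected'. Is this normal or not?

Looking forward to your reply @yiranyyu @Haoye17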
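
When the per-step loss is noisy, as it tends to be with a small batch, a smoothed curve can make the trend easier to judge. The sketch below applies a plain exponential moving average to a list of logged values. The function name `ema_smooth`, the `alpha` value, and the sample losses are all made up for illustration; they are not taken from the repository or from the logs in this issue.

```python
def ema_smooth(values, alpha=0.1):
    """Exponential moving average of a sequence of logged scalars.

    alpha close to 0 smooths heavily; alpha = 1 returns the input unchanged.
    """
    smoothed = []
    running = None
    for v in values:
        # The first value seeds the average; later values are blended in.
        running = v if running is None else alpha * v + (1 - alpha) * running
        smoothed.append(running)
    return smoothed


# Hypothetical noisy per-step losses (illustrative numbers only).
losses = [0.69, 0.90, 0.45, 0.80, 0.40, 0.75, 0.35, 0.60]
print(ema_smooth(losses, alpha=0.3))
```

If the smoothed curve still shows no downward trend over many steps, the oscillation is probably more than sampling noise.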

lufanma · 2024-12-03T10:38:32Z

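If I set the batch size to 8, i.e. a per-GPU batch size of 8, and keep max_steps and the other parameters unchanged, training on 8 A100 GPUs covers roughly 2 epochs of the latest 83.1k dataset. With this change the final loss does decrease, and rewards_train/accuracies reaches about 0.80.

The paper states a batch size of 8. Is that a total batch size of 8 or a per-GPU batch size of 8?

Looking forward to your reply @yiranyyu @Haoye17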

yiranyyu · 2024-12-03T11:07:06Z

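Thanks for your interest in our work!

DPO optimizes the reward margin, so a negative value for that metric is fairly abnormal; the other metrics have to be analyzed in light of the specific data and training run. Also, the 80k+ samples we open-sourced come from different multimodal models across several sets of experiments, so the data distribution may be more complex than the one encountered in the paper's experiments. Increasing the batch size may help stabilize training.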
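
For readers unfamiliar with how these quantities relate, here is a minimal sketch of the standard DPO formulation: each implicit reward is `beta` times the log-probability ratio between the policy and the reference model, the margin is the chosen reward minus the rejected reward, and the loss is the negative log-sigmoid of the margin. This is not the repository's training code; the function name, the dictionary keys, and `beta=0.1` are assumptions chosen for illustration.

```python
import math


def _neg_log_sigmoid(x):
    # -log(sigmoid(x)) = log(1 + exp(-x)), written in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_metrics(policy_chosen_logps, policy_rejected_logps,
                ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Batch-averaged DPO loss and reward statistics from sequence log-probs."""
    n = len(policy_chosen_logps)
    # Implicit rewards: beta * (log pi_policy - log pi_ref) for each response.
    chosen = [beta * (p - r)
              for p, r in zip(policy_chosen_logps, ref_chosen_logps)]
    rejected = [beta * (p - r)
                for p, r in zip(policy_rejected_logps, ref_rejected_logps)]
    margins = [c - r for c, r in zip(chosen, rejected)]
    return {
        "loss": sum(_neg_log_sigmoid(m) for m in margins) / n,
        "reward_chosen": sum(chosen) / n,
        "reward_rejected": sum(rejected) / n,
        "reward_margin": sum(margins) / n,
        # Fraction of pairs where the chosen response has the higher reward.
        "reward_accuracy": sum(m > 0 for m in margins) / n,
    }
```

Under this formulation a pair with zero margin contributes a loss of log 2 (about 0.693), so a persistently negative average margin goes together with a loss above that value.
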
I can't view the rlaifv_llava_train image on my end.
Batch size here means the batch size in the optimization sense; it is independent of the number of GPUs.
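
As a rough illustration of "batch size in the optimization sense": in data-parallel training the number of samples behind each optimizer step is usually the per-GPU batch size times the number of GPUs times the number of gradient accumulation steps. The helper names below are hypothetical, and a gradient accumulation of 1 is an assumption, since the thread does not say what llava15_train.sh uses.

```python
import math


def effective_batch_size(per_device_batch, num_devices, grad_accum_steps=1):
    """Samples contributing to one optimizer step under data parallelism."""
    return per_device_batch * num_devices * grad_accum_steps


def steps_for_epochs(num_epochs, dataset_size, eff_batch):
    """Optimizer steps needed to pass over the dataset num_epochs times."""
    return math.ceil(num_epochs * dataset_size / eff_batch)


# Per-GPU batch size 1 on 8 GPUs (gradient accumulation assumed to be 1).
print(effective_batch_size(1, 8))   # 8
# Per-GPU batch size 8 on 8 GPUs.
print(effective_batch_size(8, 8))   # 64
# Steps for about 2 epochs of 83.1k samples at 64 samples per step.
print(steps_for_epochs(2, 83_100, 64))   # 2597
```

Under these assumptions, a per-GPU batch size of 1 on 8 GPUs already gives an optimization batch size of 8, while a per-GPU batch size of 8 gives 64.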

lufanma · 2024-12-05T06:27:29Z

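Thanks for the reply!
After increasing the per-GPU batch size from 1 to 8, the loss finally converges on the 83.1k dataset and the reward margin gradually increases. Training log plot: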

lufanma · 2024-12-05T06:40:15Z

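@yiranyyu Hello, I have a question.
In divide_and_conquer for preference data construction, if an MLLM other than omniLMM, minicpm, etc. is used as the labeler model to generate the yes/no feedback, for example qwen2-vl, how should the tokenizer token ids for yes/Yes/no/No be obtained? I ask because the two provided autocheck implementations obtain these ids in different ways.

How is the way of obtaining these token ids determined?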

```python
# omniLMM
yes_id = tokenizer.encode('\n<|assistant|>\nyes')[-1]
Yes_id = tokenizer.encode('\n<|assistant|>\nYes')[-1]
no_id = tokenizer.encode('\n<|assistant|>\nno')[-1]
No_id = tokenizer.encode('\n<|assistant|>\nNo')[-1]

# minicpm
yes_id = self.tokenizer.encode(f'{self.tokenizer.bos_token}yes')[-1]
Yes_id = self.tokenizer.encode(f'{self.tokenizer.bos_token}Yes')[-1]
no_id = self.tokenizer.encode(f'{self.tokenizer.bos_token}no')[-1]
No_id = self.tokenizer.encode(f'{self.tokenizer.bos_token}No')[-1]
```

I'm new to this and looking forward to your reply. Thanks!!
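
The thread contains no maintainer answer to this question, so what follows is only one reading of the two snippets above: both encode the answer word directly after the text that precedes the model's answer (the assistant marker for omniLMM, the BOS token for minicpm) and keep the last token id, presumably because the id of "yes" can depend on what comes before it. Below is a sketch of that pattern for an arbitrary tokenizer, with a check that the word really adds exactly one trailing token. `answer_token_ids` and `ToyTokenizer` are hypothetical names; the toy class only stands in for a real tokenizer exposing `encode(text) -> list[int]`.

```python
import re


def answer_token_ids(tokenizer, prefix, answers=("yes", "Yes", "no", "No")):
    """Map each answer word to the id of the token it adds after `prefix`."""
    base = tokenizer.encode(prefix)
    ids = {}
    for word in answers:
        full = tokenizer.encode(prefix + word)
        # The word must add exactly one token at the end and leave the prefix
        # tokens untouched; otherwise full[-1] is not the answer token
        # (e.g. an appended EOS, or a word split into several pieces).
        if full[:-1] != base:
            raise ValueError(f"{word!r} is not a single trailing token")
        ids[word] = full[-1]
    return ids


class ToyTokenizer:
    """Stand-in tokenizer for the demo; NOT a real model tokenizer."""

    def __init__(self):
        self.vocab = {}

    def encode(self, text):
        # Split into special markers, words, whitespace and punctuation.
        pieces = re.findall(r"<\|[^|]*\|>|\w+|\s|[^\w\s]", text)
        return [self.vocab.setdefault(p, len(self.vocab)) for p in pieces]


tok = ToyTokenizer()
print(answer_token_ids(tok, "<|im_start|>assistant\n"))
```

With a real labeler model one would pass its own tokenizer and the exact text that precedes the answer in its chat template. The prefix `<|im_start|>assistant\n` used here is my assumption for qwen2-vl and should be checked against that model's template before use.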

lufanma changed the title ~~question about the training logs?~~ question about the training loss? Dec 3, 2024

lufanma closed this as completed Dec 5, 2024

lufanma reopened this Dec 5, 2024
