cs.AI updates on arXiv.org · 13:18, two days ago
A Transformer-Based Neural Architecture Search Method

This paper proposes a neural architecture search method based on the Transformer architecture, optimizing multi-head attention computation schemes across different encoder-decoder combinations to achieve better translation results. Perplexity is used as an auxiliary evaluation metric, and a multi-objective genetic algorithm iteratively optimizes the neural network structures. Experimental results show that the searched architectures outperform all baseline models, and that introducing the auxiliary metric finds better models than considering the BLEU score alone.

arXiv:2505.01314v1 Announce Type: cross Abstract: This paper presents a neural architecture search method based on the Transformer architecture, searching across multi-head attention computation schemes for different numbers of encoder and decoder combinations. To search for neural network structures with better translation results, we considered perplexity as an auxiliary evaluation metric for the algorithm in addition to BLEU scores, and iteratively improved each individual neural network within the population using a multi-objective genetic algorithm. Experimental results show that the neural network structures found by the algorithm outperform all baseline models, and that the introduction of the auxiliary evaluation metric finds better models than considering only the BLEU score as an evaluation metric.
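The abstract describes selecting architectures under two objectives at once: maximize BLEU and minimize perplexity. The core of any such multi-objective genetic algorithm is Pareto dominance between candidates. The following is a minimal sketch of that selection step, not the paper's implementation; the `Candidate` class, configuration names, and scores are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    """A searched encoder-decoder configuration (fields are illustrative)."""
    name: str
    bleu: float        # translation quality, higher is better
    perplexity: float  # language-model fit, lower is better

def dominates(a: Candidate, b: Candidate) -> bool:
    """a Pareto-dominates b if it is no worse on both objectives
    and strictly better on at least one."""
    no_worse = a.bleu >= b.bleu and a.perplexity <= b.perplexity
    strictly_better = a.bleu > b.bleu or a.perplexity < b.perplexity
    return no_worse and strictly_better

def pareto_front(pop: List[Candidate]) -> List[Candidate]:
    """Keep only candidates not dominated by any other population member."""
    return [c for c in pop
            if not any(dominates(o, c) for o in pop if o is not c)]

# Hypothetical population of searched architectures:
population = [
    Candidate("6enc-6dec", bleu=27.1, perplexity=5.2),
    Candidate("4enc-8dec", bleu=27.4, perplexity=5.6),
    Candidate("8enc-4dec", bleu=26.0, perplexity=5.9),  # dominated by 6enc-6dec
]
front = pareto_front(population)
# The first two candidates trade BLEU against perplexity, so both survive.
```

A full genetic algorithm would then apply crossover and mutation to members of this front; keeping the whole front, rather than a single BLEU-best model, is what lets the auxiliary perplexity metric surface models that BLEU alone would discard.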


Related tags

Neural architecture search · Transformer · Multi-objective genetic algorithm · Translation quality · Perplexity