philschmid RSS feed · September 30
Introducing the IGEL LLM

🔹 IGEL is a family of LLMs developed specifically for German. The first version is adapted from BigScience BLOOM, translated to German by Malte Ostendorff. The model aims to provide accurate and reliable language understanding for a wide range of natural language understanding tasks, including sentiment analysis, language translation, and question answering.

🔸 The IGEL family currently includes instruct-igel-001 and chat-igel-001 (coming soon). instruct-igel-001 is a proof of concept that validates the feasibility of creating a German instruction-tuned model from an open-source model and a German-translated instruction dataset.

🔶 By fine-tuning the pre-trained BLOOM model (6B) on the translated instruction dataset, the model successfully learned to generate instruction-based responses. The model can be tried for free on Hugging Face, or downloaded and run with the transformers library.

🌐 The model shows potential for content generation, product descriptions, marketing emails, and similar applications, but it still has limitations such as hallucination, toxicity, and stereotyping.

🚀 Future work includes finishing the chat model to enable a conversational interface, and improving data quality to further boost model performance.

IGEL is an LLM family developed for German. The first version of IGEL is built on top of BigScience BLOOM, adapted to German by Malte Ostendorff. IGEL is designed to provide accurate and reliable language understanding capabilities for a wide range of natural language understanding tasks, including sentiment analysis, language translation, and question answering.

You can try out the model at igel-playground.

The IGEL family currently includes instruct-igel-001 and chat-igel-001 (coming soon).

Model Description

The 001 version of IGEL is designed as a naive proof of concept to determine whether it is possible to create a German instruction-tuned model using a set of available open-source models and a German-translated instruction dataset. The goal is to explore the potential of LLMs for German language modeling tasks that require instruction-based responses.

To achieve this goal, we used a pre-trained adapted BLOOM model (6B) and fine-tuned it using the translated instruction-based dataset. The dataset was created by taking instructions in English and translating them into German using an automated translation tool. While this approach may introduce errors in the translated content, we wanted to test whether the model could still learn to generate instruction-based responses.
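The translation step described above can be sketched in a few lines. The post does not name the translation tool that was used, so `translate` below is a hypothetical stand-in backed by a tiny demo table; a real pipeline would call a machine translation API here:

```python
# Sketch of turning English instruction records into German ones.
# `translate` is a hypothetical stand-in for an automated translation
# tool; a real pipeline would call an MT API instead of this demo table.
def translate(text: str) -> str:
    demo = {
        "Summarize the following text.": "Fasse den folgenden Text zusammen.",
    }
    return demo.get(text, text)  # fall back to the input in this demo

def translate_record(record: dict) -> dict:
    # translate every field of one instruction example
    return {key: translate(value) for key, value in record.items()}

english = {"instruction": "Summarize the following text.", "input": "", "output": ""}
german = translate_record(english)
print(german["instruction"])  # Fasse den folgenden Text zusammen.
```

As the post notes, automated translation can introduce errors into the dataset, but the experiment was to see whether the model learns the instruction-following behavior regardless.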

We are pleased to announce that we had success. 🥳 Instruct-igel-001 is a LoRA-tuned BLOOM-CLP German (6.4B parameters) with merged weights to make it easy to load and use with Hugging Face Transformers.

Samples

You can test out the model for free on Hugging Face: https://huggingface.co/spaces/philschmid/igel-playground

Or you can download the model and run it using transformers: philschmid/instruct-igel-001

Question Answering

[Figure: question-answering example, https://www.philschmid.de/static/blog/introducing-igel/question-answering.png]

Content Generation

How to use the model

The model is available on Hugging Face at philschmid/instruct-igel-001.

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# load model
tokenizer = AutoTokenizer.from_pretrained("philschmid/instruct-igel-001")
model = AutoModelForCausalLM.from_pretrained("philschmid/instruct-igel-001")

# load pipeline
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

# run generation
generator("### Anweisung:\n{{input}}\n\n### Antwort:")
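The snippet above leaves the `{{input}}` placeholder in the prompt template unfilled. A small helper (our own convenience function, not part of the model card) makes the prompt format explicit before passing it to the pipeline:

```python
def build_prompt(instruction: str) -> str:
    # IGEL's instruction prompt format: the instruction goes after
    # "### Anweisung:" and the model generates after "### Antwort:"
    return f"### Anweisung:\n{instruction}\n\n### Antwort:"

prompt = build_prompt("Nenne die Hauptstadt von Deutschland.")
# pass `prompt` to the pipeline; the reply appears after "### Antwort:"
```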

Training data

instruct-igel-001 is trained on naively translated instruction datasets without much data cleaning, filtering, or post-processing.

Known limitations

instruct-igel-001 also exhibits several common deficiencies of language models, including hallucination, toxicity, and stereotypes.

For example, in the following figure, instruct-igel-001 wrongly says that the chancellor of Germany is Angela Merkel.


Next Steps

The next steps are to finish the chat model, which goes beyond a simple request-response concept to provide a conversational interface, and to improve the data quality.

If you are interested in collaborating or improving IGEL's capabilities or would like to learn how you can adapt and improve IGEL for your company’s needs, please contact me via email, Twitter, or LinkedIn.
