Jay Alammar · September 12
Sharing My Work on the Cohere Team

 


A little less than a year ago, I joined the awesome Cohere team. The company trains massive language models (both GPT-like and BERT-like) and offers them as an API (which also supports finetuning). Its founders include Google Brain alums including co-authors of the original Transformers paper. It’s a fascinating role where I get to help companies and developers put these massive models to work solving real-world problems.

I love that I get to share some of the intuitions developers need to start problem-solving with these models. Even though I’ve been working very closely on pretrained Transformers for the past several years (for this blog and in developing Ecco), I’m enjoying the convenience of problem-solving with managed language models as it frees up the restrictions of model loading/deployment and memory/GPU management.

These are some of the articles I wrote and collaborated on with colleagues over the last few months:

Intro to Large Language Models with Cohere

This is a high-level intro to large language models for people who are new to them. It establishes the difference between generative (GPT-like) and representation (BERT-like) models and gives example use cases for each.

This is one of the first articles I got to write. It's extracted from a much larger document that I wrote to explore some of the visual language to use in explaining the application of these models.

A visual guide to prompt engineering

Massive GPT models open the door for a new way of programming. If you structure the input text in the right way, you can get useful (and often fascinating) results for a lot of tasks (e.g., text classification, copywriting, summarization, etc.).

This article visually demonstrates four principles for creating prompts effectively.
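To give a flavor of what "structuring the input text" means, here is a minimal sketch of assembling a few-shot classification prompt. The labels and example reviews are invented for illustration; in practice the resulting string would be sent to a generation API such as Cohere's.

```python
# Build a few-shot prompt: labeled examples followed by the new input.
# The model is expected to continue the pattern and emit the label.

def build_prompt(examples, query):
    """Assemble labeled examples and a new input into one prompt string."""
    lines = []
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    # End with the unlabeled query so the model completes the label.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n".join(lines)

examples = [
    ("The food was delicious and the staff were friendly.", "positive"),
    ("I waited an hour and the order was still wrong.", "negative"),
]

prompt = build_prompt(examples, "Great atmosphere, will come back!")
print(prompt)
```

The key design choice is ending the prompt right where the answer should go, so the model's most natural continuation is the label itself.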

Text Summarization

This is a walkthrough of creating a simple summarization system. It links to a Jupyter notebook which includes the code to start experimenting with text generation and summarization.

The end of this notebook shows an important idea I want to spend more time on in the future: how to rank, filter, and select the best output from among multiple generations.
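As a toy sketch of that idea: generate several candidates, score each one, and keep the winner. The candidates here are hard-coded strings and the scoring heuristic (discard outputs missing a key term, then prefer the most concise) is invented for illustration; a real system might rank by model likelihood or a learned reranker.

```python
# Select the best output from multiple candidate generations.

def score(candidate, required_keyword):
    """Toy scoring heuristic: require a keyword, then prefer brevity."""
    if required_keyword.lower() not in candidate.lower():
        return float("-inf")  # discard candidates missing key content
    return -len(candidate)    # among the rest, shorter scores higher

candidates = [
    "The quarterly report shows revenue grew 12% year over year.",
    "Revenue grew 12%.",
    "The weather was nice.",
]

best = max(candidates, key=lambda c: score(c, "revenue"))
print(best)  # → "Revenue grew 12%."
```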

Semantic Search

Semantic search has to be one of the most exciting applications of sentence embedding models. This tutorial implements a "similar questions" feature using sentence embeddings and a vector search library.

The vector search library used here is Annoy, from Spotify. There are a bunch of others out there; Faiss is widely used, and I've experimented with PyNNDescent as well.
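Libraries like Annoy and Faiss do this approximately and at scale, but the core idea is just nearest neighbors by similarity. Here is a brute-force stand-in: the tiny 3-d "embeddings" are made up for illustration (real sentence embeddings have hundreds of dimensions and come from a model).

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# A toy "index" mapping questions to pretend embeddings.
index = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "What is your refund policy?": [0.1, 0.9, 0.1],
    "Where can I change my password?": [0.85, 0.2, 0.05],
}

# Pretend embedding of an incoming query like "password help".
query_embedding = [0.9, 0.12, 0.0]

# Rank stored questions by similarity to the query.
ranked = sorted(index, key=lambda q: cosine(index[q], query_embedding),
                reverse=True)
print(ranked[0])  # most similar stored question
```

A vector search library replaces that `sorted` scan with an approximate index so lookups stay fast over millions of embeddings.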

Finetuning Representation Models

Finetuning tends to lead to the best results language models can achieve. This article explains the intuitions around finetuning representation/sentence embedding models. I've added a couple more visuals to the Twitter thread.

The research in this area is very interesting. I've thoroughly enjoyed papers like Sentence-BERT and Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval.
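The contrastive intuition behind much of that work can be shown with a toy triplet objective: the loss is zero once an anchor sits closer to its positive (e.g., a paraphrase) than to a negative by some margin. The 2-d vectors and the margin value here are invented for illustration, not taken from any of those papers.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Penalize when the positive isn't closer than the negative by `margin`."""
    d_pos = 1 - cosine(anchor, positive)  # distance to the paraphrase
    d_neg = 1 - cosine(anchor, negative)  # distance to unrelated text
    return max(0.0, d_pos - d_neg + margin)

anchor   = [1.0, 0.0]
positive = [0.9, 0.1]  # a paraphrase: should embed nearby
negative = [0.0, 1.0]  # unrelated text: should embed far away

print(triplet_loss(anchor, positive, negative))  # → 0.0 (already satisfied)
```

Finetuning nudges the embedding model's weights so that batches of real triplets drive this kind of loss toward zero.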

Controlling Generation with top-k & top-p

This one is a little more technical. It explains the parameters you tweak to adjust a GPT model's decoding strategy -- the method by which the system picks output tokens.
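The two filters are simple enough to sketch directly. This toy next-token distribution is made up; a real decoder applies these filters to the model's softmax output at every generation step, then samples from what remains.

```python
def top_k(probs, k):
    """Keep only the k most probable tokens, renormalized to sum to 1."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

def top_p(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability
    reaches p (nucleus sampling), renormalized."""
    kept, cum = [], 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, prob))
        cum += prob
        if cum >= p:
            break
    total = sum(q for _, q in kept)
    return {tok: q / total for tok, q in kept}

probs = {"cat": 0.5, "dog": 0.3, "car": 0.15, "the": 0.05}
print(top_k(probs, 2))    # keeps "cat" and "dog", renormalized
print(top_p(probs, 0.75)) # also keeps "cat" and "dog" here
```

Note the difference in spirit: top-k always keeps a fixed number of tokens, while top-p keeps more tokens when the distribution is flat and fewer when the model is confident.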

Text Classification Using Embeddings

This is a walkthrough of one of the most common use cases of embedding models -- text classification. It is similar to A Visual Guide to Using BERT for the First Time, but uses Cohere's API.
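A minimal sketch of the idea: average each class's example embeddings into a centroid, then label new texts by the nearest centroid. The 2-d "embeddings" are invented for illustration; real ones would come from an embedding API, and the article itself trains a proper classifier on top of them.

```python
import math

def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Pretend embeddings of labeled training texts.
train = {
    "sports":   [[0.9, 0.1], [0.8, 0.2]],
    "politics": [[0.1, 0.9], [0.2, 0.8]],
}
centroids = {label: centroid(vs) for label, vs in train.items()}

def classify(embedding):
    """Label a new text's embedding by its nearest class centroid."""
    return min(centroids, key=lambda label: euclidean(centroids[label], embedding))

print(classify([0.85, 0.15]))  # → "sports"
```

Because the embeddings already encode meaning, even a classifier this simple can work surprisingly well with only a handful of labeled examples per class.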

You can find these and upcoming articles in the Cohere docs and notebooks repo. I have quite a number of experiments and interesting workflows I'd love to share in the coming weeks, so stay tuned!
