01
视频教程
02
课程介绍
在前几个课程中,我们分别为大家介绍了:
启发灵感:从想法到实践
搭建界面的基础:Gradio 框架基础
使用模型的流程:从魔搭开源模型到模型 API 服务
前后端联调及应用发布:如何把应用完整串联并部署创空间
接下来,我们会在这节课程中让你把之前几节课学到的内容融会贯通,通过单词卡的案例来介绍一个更具实用性的完整 AI 应用的实现。
03
效果展示
大家在学习过程中可能会有各种办法来帮忙记英语单词,比如也会使用单词卡,通过例句、画面等来帮助自己记忆一些单词,这次我们就来开发一个帮忙生成专属单词的 AI 应用。
创空间体验:
Notebook:
https://modelscope.cn/notebook/share/ipynb/12ab6058/word_memory_cards.ipynb
04
IDEA
希望生成一个功能全面、内容丰富的单词卡,需要下面的几个步骤。
05
实现过程
1. 跑通模型验证
生成单词信息 => 大语言模型:Inference API
try:url = "https://api-inference.modelscope.cn/v1/images/generations"payload = {"model": "MusePublic/489_ckpt_FLUX_1", # ModelScope Model-Id, required"prompt": "一只长颈鹿", # prompt, required}headers = {"Authorization": f"Bearer {MODELSCOPE_ACCESS_TOKEN}",# provide your modelscope sdk token"Content-Type": "application/json"}response = requests.post(url, data=json.dumps(payload, ensure_ascii=False), headers=headers)response_data = response.json()image_url = response_data['images'][0]['url']except Exception as e:print(e)
生成单词封面图 => 文生图模型:Inference API or SwingDeploy or AIGC 结合
try:url = "https://api-inference.modelscope.cn/v1/images/generations"payload = {"model": "MusePublic/489_ckpt_FLUX_1", # ModelScope Model-Id, required"prompt": "一只长颈鹿", # prompt, required}headers = {"Authorization": f"Bearer {MODELSCOPE_ACCESS_TOKEN}",# provide your modelscope sdk token"Content-Type": "application/json"}response = requests.post(url, data=json.dumps(payload, ensure_ascii=False), headers=headers)response_data = response.json()image_url = response_data['images'][0]['url']except Exception as e:print(e)
生成例句语音 => 语音生成模型:SDK pipeline
audio_model_id = 'iic/speech_sambert-hifigan_tts_zh-cn_16k'sambert_hifigan_tts = pipeline(task=Tasks.text_to_speech, model=audio_model_id)output = sambert_hifigan_tts(input="The giraffe stretched its long neck to reach the leaves at the treetop.", voice='zhitian_emo')wav = output[OutputKeys.OUTPUT_WAV]timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")filename = f"{timestamp}.wav"file_path = os.path.join(directory_path, filename)with open(file_path, 'wb') as f:f.write(wav)
生成界面卡片 => Coder 模型:Inference API
GenerateUiCodeSystemPrompt = """你是一个网页开发工程师,根据下面的指示编写网页。所有代码写在一个代码块中,形成一个完整的代码文件进行展示,不用将HTML代码和JavaScript代码分开。**你更倾向集成并输出这类完整的可运行代码,而非拆分成若干个代码块输出**。对于部分类型的代码能够在UI窗口渲染图形界面,生成之后请你再检查一遍代码运行,确保输出无误。仅输出 html,不要附加任何描述文案。"""GenerateUiCodePromptTemplate = """创建一个HTML页面,用于介绍英语单词 $title 的词源、含义及用法。页面应该包括以下部分:- **标题区域**:分成三行展示单词“$title”及其音标 $phonetic_symbols 和基本含义:$translation_meaning。字体颜色为深绿色。- **词源解释区域**:$etymological_explanation。字体颜色为深灰色。- **图片带链接区域**:包含一张与单词相关的图片,图片链接为 $image_url。- **例句区域**:提供一个使用单词 $title 的例句:“$example_sentence” 其中“$title”一词被高亮显示为深绿色,其他颜色为深灰色。英文显示。- **播放例句**:提供一个音频播放按钮,按钮上有文字“播放例句”,点击后播放一段音频,音频链接为:$audio_url。请确保页面布局美观,易于阅读,所有的内容居中对齐,不限制页面高度,背景颜色为浅色,边框颜色为深绿色。"""template = Template(GenerateUiCodePromptTemplate)prompt = template.substitute(infos)print('generate_ui_code:', prompt)messages = [{'role': 'system', 'content': GenerateUiCodeSystemPrompt },{'role': 'user', 'content': prompt},]display_messages = display_messages + messagestry:gen = client.chat.completions.create(model="Qwen/Qwen2.5-Coder-32B-Instruct",messages=messages,stream=True)full_response = ""display_messages.append({'role': 'assistant', 'content': full_response})for chunk in gen:content = chunk.choices[0].delta.contentfull_response += contentdisplay_messages[-1]['content'] = full_responseis_stop = chunk.choices[0].finish_reason == 'stop'yield {"display_messages": display_messages,"content": full_response,"is_stop": is_stop,}except Exception as e:yield {"display_messages": display_messages,"content": str(e),"is_stop": True,}
2. 串联流程开发
async def generate_media(infos):return await asyncio.gather(generate_audio(infos['example_sentence']),generate_image(infos['example_sentence_image_prompt']))def run_flow(query, request: gr.Request):display_messages = []yield {steps: gr.update(current=0),drawer: gr.update(open=True),}for info_result in generate_word_info(query, display_messages):if info_result['is_stop']:word_info_str = info_result['content']breakelse:yield {display_chatbot: covert_display_messages(info_result['display_messages']),}infos = json.loads(word_info_str)yield {steps: gr.update(current=1),display_chatbot: covert_display_messages(info_result['display_messages']),}display_messages.append({'role': 'assistant','content': f"根据这些内容生成插图和例句发音:\n 插图:{infos['example_sentence_image_prompt']}\n 例句发音:{infos['example_sentence']}",})yield {display_chatbot: covert_display_messages(display_messages),}generate_results = asyncio.run(generate_media(infos))root = get_root_url(request=request, route_path="/gradio_api/queue/join", root_path=demo.root_path)root = root.replace("http:", "https:")print('root:', root)infos['audio_url'] = f"{root}/gradio_api/file={demo.move_resource_to_block_cache(generate_results[0])}"infos['image_url'] = generate_results[1]yield {steps: gr.update(current=2),}for ui_code_result in generate_ui_code(infos, display_messages):if ui_code_result['is_stop']:ui_code_str = ui_code_result['content']breakelse:yield {display_chatbot: covert_display_messages(ui_code_result['display_messages']),}yield {drawer: gr.update(open=False),display_chatbot: covert_display_messages(ui_code_result['display_messages']),sandbox_output: send_to_sandbox(remove_code_block(ui_code_str)),}
3. 设计交互界面和实现 UI
with gr.Blocks(css=css) as demo:history = gr.State([])with ms.Application():with antd.ConfigProvider(locale="zh_CN"):with antd.Row(gutter=[32, 12]) as layout:with antd.Col(span=24, md=8):with antd.Flex(vertical=True, gap="middle", wrap=True):header = gr.HTML("""<div class="left_header"><img src="//img.alicdn.com/imgextra/i3/O1CN01Q501Kf1IjzMXFpjvL_!!6000000000930-0-tps-768-1024.jpg" width="200px" /><h2>随心单词卡</h2></div>""")input = antd.InputTextarea(size="large", allow_clear=True, placeholder="请输入你想要记什么单词?")btn = antd.Button("生成", type="primary", size="large")antd.Divider("示例")with antd.Flex(gap="small", wrap=True):with ms.Each(DEMO_LIST):with antd.Card(hoverable=True, as_item="card") as demoCard:antd.CardMeta()demoCard.click(demo_card_click, outputs=[input])antd.Divider("设置")view_process_btn = antd.Button("查看生成过程")with antd.Col(span=24, md=16):with ms.Div(elem_classes="right_panel"):with antd.Drawer(open=False, width="1200", title="生成过程") as drawer:with ms.Div(elem_classes="step_container"):with antd.Steps(0) as steps:antd.Steps.Item(title="容我查一下词典", description="正在生成单词的各类信息")antd.Steps.Item(title="容我补些素材", description="正在生成单词例句的助记图和发音")antd.Steps.Item(title="即将大功告成", description="正在生成单词卡的界面")display_chatbot = gr.Chatbot(type="messages", elem_classes="display_chatbot", height=800, show_label=False, )sandbox_output = gr.HTML("""<div align="center"><h4>在左侧输入或选择你想要的单词卡开始制作吧~</h4></div>""")
4. 完成完整开发和验证优化
06
进阶作业和扩展课题
进阶作业
下面内容供大家尝试去修改和实践
扩展课题
点击阅读原文,即可跳转课程合集~
?点击关注ModelScope公众号获取
更多技术信息~
