MarkTechPost@AI, July 15, 2024
IBM Researchers Propose ExSL+granite-20b-code: A Granite Code Model to Simplify Data Analysis by Enabling Generative AI to Write SQL Queries from Natural Language Questions

 

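IBM researchers have proposed a model called ExSL+granite-20b-code that uses generative AI to convert natural language questions into SQL queries, simplifying data analysis. It performed strongly on the BIRD benchmark and gives business decision-makers broader access to data.

🔍 ExSL+granite-20b-code uses an extractive schema-linking technique to understand database structure and retrieve the relevant tables and columns, optimizing both the identification of data columns and the linking of data values.

💡 The model improves text-to-SQL generation through a three-step process: schema linking, content linking, and SQL code generation. The schema-linking step significantly speeds up matching keywords in the question to the relevant tables and columns.

🚀 On the BIRD benchmark, IBM's solution performed well in both accuracy and execution speed: it scored 80 for code execution speed, second only to the 90 scored by human engineers, while other AI systems scored 65.

🔬 Although the system answered only 68% of questions correctly, below the 93% achieved by human engineers, its performance marks important progress in automated SQL generation.

🌟 IBM's advance gives businesses a more convenient way to query data, reduces the need for SQL expertise, and lets a wider range of users obtain insights from data.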

Researchers at IBM address the difficulty of extracting valuable insights from large databases, especially in businesses. The sheer volume and variety of data make it hard for employees to locate the information they need, and writing the SQL code required to retrieve data across multiple schemas and tables can be complex. This limitation keeps businesses from fully leveraging their data when making strategic decisions.

Current methods for querying databases rely heavily on SQL, the dominant language for database interactions. However, SQL proficiency is typically limited to a small group of data professionals within an organization, which restricts broader access to data insights. Researchers at IBM proposed a Granite code model, ExSL+granite-20b-code, to simplify data analysis by enabling generative AI to write SQL queries from natural language questions. The proposed model achieved top performance on the BIRD benchmark, which measures the effectiveness of AI models in translating natural language into SQL.

ExSL+granite-20b-code incorporates an extractive schema-linking technique to understand database organization and retrieve relevant data tables and columns. The researchers tuned three versions of the Granite 20B model to optimize the process of identifying pertinent data columns, establishing linkages between data values, and generating accurate SQL code. 

IBM’s approach to improving text-to-SQL generation involves a three-step process: schema linking, content linking, and SQL code generation. The schema linking step matches keywords in the question to relevant data tables and columns. An extractive method speeds up this process significantly. In the content linking step, sub-tables are converted into string representations and passed to another model instance trained to generate multiple pieces of SQL code. This model compares columns with specific values relevant to the query. Finally, the third instance of the Granite model generates and selects the best SQL queries by analyzing execution results.
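The three steps can be illustrated with a deliberately simplified sketch. It is not IBM's implementation: the article describes tuned Granite 20B models at each step, whereas the hypothetical helpers below (`link_schema`, `link_content`, `select_best`) use plain keyword and value matching against an in-memory SQLite database, and the candidate queries are written by hand instead of being generated by a model.

```python
import re
import sqlite3


def link_schema(question, schema):
    """Step 1 (toy stand-in): keep tables and columns whose names appear in the question."""
    words = set(re.findall(r"[a-z0-9_]+", question.lower()))
    linked = {}
    for table, columns in schema.items():
        hits = [c for c in columns if c.lower() in words]
        if table.lower() in words or hits:
            # If only the table name matched, fall back to all of its columns.
            linked[table] = hits or list(columns)
    return linked


def link_content(conn, linked, question):
    """Step 2 (toy stand-in): find stored text values that the question mentions."""
    q = question.lower()
    matches = []
    for table, columns in linked.items():
        for col in columns:
            rows = conn.execute(f'SELECT DISTINCT "{col}" FROM "{table}"').fetchall()
            for (value,) in rows:
                if isinstance(value, str) and value.lower() in q:
                    matches.append((table, col, value))
    return matches


def select_best(conn, candidates):
    """Step 3 (toy stand-in): execute candidates, keep the first that runs and returns rows."""
    for sql in candidates:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # discard candidates that fail to execute
        if rows:
            return sql, rows
    return None, []


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, city TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "Paris", 20.0), (2, "Berlin", 35.5), (3, "Paris", 10.0)],
    )
    schema = {"orders": ["id", "city", "total"], "staff": ["name", "role"]}
    question = "What is the total of orders where city is Paris?"

    print(link_schema(question, schema))
    print(link_content(conn, link_schema(question, schema), question))
    # Hand-written candidates; the first is invalid SQL on purpose.
    candidates = [
        "SELECT SUM(total) FROM order WHERE city = 'Paris'",
        "SELECT SUM(total) FROM orders WHERE city = 'Paris'",
    ]
    print(select_best(conn, candidates))
```

The point of the sketch is the division of labor: narrowing the schema first keeps the later steps small, value matching ties the question to concrete column contents, and executing several candidates lets the final step reject queries that do not run.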

IBM’s solution stood out in the BIRD benchmark for both accuracy and execution speed. It achieved an 80 in code execution speed, just below the 90 earned by human engineers, while other AI systems scored 65. The extractive method for schema linking and a generative approach for content linking were key factors in this performance. Despite the system answering only 68% of questions correctly compared to human engineers’ 93%, its performance represents a significant step forward in automating SQL generation.
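The article does not say how a question counts as answered correctly. One common way to score text-to-SQL output is to execute the predicted query and a reference query and compare their results; the sketch below assumes that approach and uses hypothetical helper names (`execution_match`, `accuracy`). It is an illustration, not the benchmark's official scoring code.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, reference_sql):
    """Return True when both queries yield the same rows, ignoring row order."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a query that fails to run counts as incorrect
    reference = conn.execute(reference_sql).fetchall()
    return Counter(predicted) == Counter(reference)


def accuracy(conn, pairs):
    """Fraction of (predicted, reference) pairs whose results match."""
    if not pairs:
        return 0.0
    correct = sum(execution_match(conn, p, r) for p, r in pairs)
    return correct / len(pairs)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, city TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "Paris", 20.0), (2, "Berlin", 35.5), (3, "Paris", 10.0)],
    )
    pairs = [
        ("SELECT id FROM orders WHERE city = 'Paris' ORDER BY id DESC",
         "SELECT id FROM orders WHERE city = 'Paris'"),
        ("SELECT COUNT(*) FROM orders", "SELECT COUNT(id) FROM orders"),
        ("SELECT id FROM orders WHERE city = 'Berlin'",
         "SELECT id FROM orders WHERE city = 'Paris'"),
        ("SELECT missing_column FROM orders", "SELECT id FROM orders"),
    ]
    print(accuracy(conn, pairs))
```

Comparing results instead of query text means two differently written queries are both accepted as long as they return the same data.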

In conclusion, IBM has made significant advancements in leveraging generative AI to simplify data querying processes for businesses. By reducing the need for SQL proficiency, IBM’s text-to-SQL generator offers a promising way to broaden access to data insights.

