Second Brain: Crafted, Curated, Connected, Compounded on October 2, 21:16
Building a Unified Data Access Interface: The Advantages of GraphQL

 

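With the explosive growth of cloud architectures and tools, data teams face the challenge of providing a unified data access interface. Users in different roles (data analysts, product managers, data scientists, business analysts, and others) have different data needs and different ways of accessing data. This article explores how a standardised GraphQL interface with built-in documentation and on-the-spot validation can abstract heterogeneous data stores and serve the varied data needs of machine learning, business intelligence, permission management, internal applications, external customers, and management, while simplifying authentication and authorisation and improving the efficiency and consistency of data engineering.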

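📊 **Diverse data access needs challenge a unified interface:** Modern data teams must serve many user groups and use cases, from machine learning experiments and real-time BI reporting to external data extraction. ML specialists need an API for experiments in Jupyter Notebooks, BI users need a SQL interface that responds within seconds for real-time analysis and reporting, and data engineers have to deal with complex backends such as data lakes and OLAP cubes, which makes providing a single, efficient interface particularly difficult.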

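💡 **GraphQL offers an elegant solution:** The article argues that a standardised GraphQL interface with on-the-spot validation and built-in documentation is currently the best way to solve complex data access problems. It can abstract the underlying heterogeneous data stores and give different users a consistent access experience, for example a notebook interface for ML users and a SQL connector for BI users, while also supporting permission management, data updates, and similar functions, which reduces the complexity of data engineering.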

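🔐 **Centralised authentication and authorisation simplify management:** Traditional authentication and authorisation management requires creating new user groups for every system, which is tedious and error-prone. A GraphQL interface can centralise authentication and authorisation by integrating them once into the core API, giving users safer and more convenient access, avoiding repeated configuration in each system, and improving overall security and management efficiency.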

Cloud architecture is more complex than ever, especially with the latest explosion of tools and technology. Today, every data team wants data to be readily available to decision-makers in the company. Whether a Data Analyst, Product Manager, Data Scientist, or Business Analyst approaches them, it’s hard to provide a single interface that abstracts all heterogeneous data stores away and lets them query all the data. On top of that, new principles and architectures are picking up old ideas, for example, decentralised data products in Data Mesh and a centralised cloud data warehouse.

Xavier Gumara Rigol from Adevinta says that each dataset should have at least two interfaces: SQL for fast access, and programmatic access via notebooks when more complex processing is needed.
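
The two-interface idea can be sketched with nothing but Python's standard library. This is a minimal illustration, not how Adevinta implements it: the `orders` table, its columns, and the helper names `query_sql` and `load_rows` are all hypothetical, and an in-memory SQLite database stands in for a real dataset.

```python
import sqlite3

# Hypothetical dataset: table name, columns, and rows are made up for illustration.
def build_dataset() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, country TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (country, amount) VALUES (?, ?)",
        [("ES", 10.0), ("ES", 15.0), ("CH", 40.0)],
    )
    conn.commit()
    return conn

# Interface 1: plain SQL for fast, ad-hoc access (what a BI tool would send).
def query_sql(conn, sql, params=()):
    return conn.execute(sql, params).fetchall()

# Interface 2: programmatic access that returns rows as dicts, so a notebook
# user can continue with arbitrary Python processing.
def load_rows(conn, table):
    # Identifiers cannot be bound as SQL parameters, so only known names pass.
    if table not in {"orders"}:
        raise ValueError(f"unknown dataset: {table}")
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]

conn = build_dataset()

# SQL path: an aggregate a dashboard might request.
totals_sql = query_sql(
    conn,
    "SELECT country, SUM(amount) FROM orders GROUP BY country ORDER BY country",
)

# Programmatic path: fetch everything, then process in Python.
rows = load_rows(conn, "orders")
largest = max(rows, key=lambda r: r["amount"])
```

Both paths read the same underlying table, so the numbers a BI user sees and the rows a notebook user processes cannot drift apart.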

On the other hand, if you have a single Postgres database or any other simple architecture, it probably doesn’t make sense to build an Analytics API and route everything through it. Let’s have a look at different data teams and what they struggle with today:

- Machine Learning folks want an API to experiment with particular data within a Jupyter Notebook.
- Business Intelligence users need to report how the company is doing with their dashboard tool of choice. They need a SQL connector. Response time must be within seconds, as they want to slice and dice in real time and demo the numbers in meetings. If possible, company-wide KPIs are already precalculated and ready to use.
- Power users want to update and fix incorrect data. They need an interface or clear documentation of how to do that. More importantly, it shouldn’t matter whether they are doing it on a data lake, an OLAP cube, configs, etc.
- Internal applications and pipelines that apply the product logic have different requirements, including ingesting new data, fixing invalid states, automatic maintenance such as compacting massive data sources, or implementing complex business logic.
- External customers want to extract data for their data warehouse.
- Managers want to see the overall numbers at a glance.

As these stakeholders have different use cases and skills, it is tough to support them all. A standardised GraphQL interface, validated on the spot and with built-in documentation, is the best approach we have today. It is also a chance to make updates consistent and safe, instead of giving people direct access 🚒.
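
To make "validated on the spot" and "built-in documentation" concrete, here is a toy sketch in plain Python. It does not use a GraphQL library and is not the article's implementation; it only mimics the idea that a single schema both rejects unknown fields before any backend is touched and serves as the documentation. The `Order` type, its fields, and the function names are assumptions made up for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Field:
    type_name: str    # type exposed to clients, e.g. "Float"
    description: str  # documentation that lives next to the definition

# Hypothetical schema with a single type; a real GraphQL server would derive
# this from its schema definition.
SCHEMA = {
    "Order": {
        "id": Field("ID", "Unique order identifier."),
        "country": Field("String", "ISO country code of the buyer."),
        "amount": Field("Float", "Order value in EUR."),
    }
}

def validate_selection(type_name, fields):
    """Return a list of error messages; an empty list means the request is valid."""
    if type_name not in SCHEMA:
        return [f"Unknown type '{type_name}'"]
    known = SCHEMA[type_name]
    return [
        f"Cannot query field '{name}' on type '{type_name}'"
        for name in fields
        if name not in known
    ]

def describe(type_name):
    """Render documentation straight from the schema, so it cannot go stale."""
    return "\n".join(
        f"{name}: {field.type_name} - {field.description}"
        for name, field in SCHEMA[type_name].items()
    )

ok_errors = validate_selection("Order", ["id", "amount"])
bad_errors = validate_selection("Order", ["id", "revenue"])
docs = describe("Order")
```

A request for the non-existent `revenue` field is refused with a precise message before any data store is queried, which is the behaviour that lets very different consumers share one interface safely.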

Authorisation and authentication deserve a special mention: instead of creating new groups and users in every system, it’s essential to implement them once. But that is very hard if you do not have such an API. Of course, you could integrate your identity and access management solution elsewhere, but baked into the central API with GraphQL it is a pragmatic and elegant solution.
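
A rough sketch of "implement it once" follows, again in plain Python and under stated assumptions: the roles, the permission table, and the `read`/`write` entry points are hypothetical, and only the authorisation half is shown. Authentication, meaning establishing who the caller is, is assumed to have happened upstream.

```python
# Hypothetical role-to-permission mapping, defined once in the central API
# instead of once per backend system.
PERMISSIONS = {
    "analyst": {("orders", "read")},
    "power_user": {("orders", "read"), ("orders", "write")},
}

# Stand-in for heterogeneous backends (data lake, OLAP cube, ...).
BACKENDS = {"orders": [{"id": 1, "amount": 10.0}]}

class Forbidden(Exception):
    """Raised when a role lacks the permission for an action."""

def authorize(role, dataset, action):
    # The single place where access rules are enforced.
    if (dataset, action) not in PERMISSIONS.get(role, set()):
        raise Forbidden(f"role '{role}' may not {action} '{dataset}'")

def read(role, dataset):
    authorize(role, dataset, "read")
    return list(BACKENDS[dataset])  # copy, so callers cannot mutate the store

def write(role, dataset, row):
    authorize(role, dataset, "write")
    BACKENDS[dataset].append(row)

analyst_view = read("analyst", "orders")
write("power_user", "orders", {"id": 2, "amount": 5.0})

try:
    write("analyst", "orders", {"id": 3, "amount": 1.0})
    analyst_write_blocked = False
except Forbidden:
    analyst_write_blocked = True
```

Because every entry point calls the same `authorize` function, adding a backend or a role means changing one table rather than reconfiguring users and groups in each system.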


Origin: Building an Analytics API with GraphQL: The Next Level of Data Engineering? | ssp.sh
References: Analytics API
Last Modified: 2022-02-19
Created: 2022-02-19
