TechCrunch News, September 30
DeepSeek: The Rise and Challenges of Chinese AI

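Chinese AI lab DeepSeek recently drew widespread attention after its chatbot app climbed to the top of the Apple App Store charts. DeepSeek trains its AI models with compute-efficient techniques, which has prompted Wall Street analysts and technologists to debate the U.S. lead in AI and the demand for AI chips. DeepSeek is backed by High-Flyer Capital Management, a quantitative hedge fund that uses AI in its trading. High-Flyer started it in 2023 as an AI research lab, and the lab was later spun off into its own company. Despite U.S. hardware export restrictions, DeepSeek has trained models on chips such as the Nvidia H800. Its DeepSeek-V2 and DeepSeek-V3 models perform well on both capability and cost, and on some benchmarks even outperform Meta’s Llama and OpenAI’s GPT-4o. As Chinese-developed AI, however, DeepSeek’s models are subject to review by China’s internet regulator to ensure their responses embody “core socialist values,” and they may be restricted on sensitive questions. The low cost and open licensing have attracted many developers but have also raised international concerns, including data security and potential foreign influence.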

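🚀 **DeepSeek’s rise and technical strengths**: DeepSeek, a Chinese AI company, has quickly gained international prominence on the strength of its efficient model-training techniques. Its DeepSeek-V2 and DeepSeek-V3 model families perform well on a range of AI benchmarks, in some cases outperforming well-known models such as Meta’s Llama and OpenAI’s GPT-4o. Especially notable is its R1 “reasoning” model, which is said to match OpenAI’s o1 model on key benchmarks and to be more reliable in domains such as physics, science, and math, although it can take somewhat longer to reach an answer.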

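💰 **Disruptive cost efficiency and business model**: DeepSeek prices its products and services very competitively and gives some models away for free, in sharp contrast to the high-cost model common in the industry. This pricing forced domestic competitors such as ByteDance and Alibaba to cut their model usage prices. Despite heavy venture capital interest, DeepSeek has not taken outside funding. Its business model appears to rest on efficiency gains from technical breakthroughs, though the details are unclear and some experts dispute the figures the company has supplied.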

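🌍 **International attention and regulatory challenges**: DeepSeek’s rapid growth has drawn broad international attention, particularly in the U.S., where it has sparked debate over whether it threatens the country’s lead in AI. Because of its Chinese origins, DeepSeek’s models are also subject to review by China’s internet regulator to ensure their content conforms to particular values, which can limit how they handle certain sensitive topics. In addition, U.S. government agencies, New York state, and countries such as South Korea have begun restricting or banning DeepSeek on government devices, and companies such as Microsoft bar employees from using it over data security and propaganda concerns, underscoring how much geopolitics and data security now shape AI development.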

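💡 **A positive response from developers**: Despite the controversy and restrictions, DeepSeek’s models, R1 in particular, have been welcomed by developers because of their permissive licenses, which allow commercial use. On platforms such as Hugging Face, developers have created a large number of “derivative” models based on R1 that have been downloaded millions of times. This openness and ease of use has sped up iteration and application innovation, and shows DeepSeek’s potential to advance the AI ecosystem.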


DeepSeek has gone viral.

Chinese AI lab DeepSeek broke into the mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts (and Google Play, as well). DeepSeek’s AI models, which were trained using compute-efficient techniques, have led Wall Street analysts and technologists alike to question whether the U.S. can maintain its lead in the AI race and whether demand for AI chips will hold up.

But where did DeepSeek come from, and how did it rise to international fame so quickly?

DeepSeek’s trader origins

DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Liang, who reportedly began dabbling in trading while a student at Zhejiang University, launched High-Flyer Capital Management as a hedge fund in 2019 focused on developing and deploying AI algorithms.

In 2023, High-Flyer started DeepSeek as a lab dedicated to researching AI tools separate from its financial business. With High-Flyer as one of its investors, the lab spun off into its own company, also called DeepSeek.

From day one, DeepSeek built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by U.S. export bans on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100 chip available to U.S. companies.

DeepSeek’s technical team is said to skew young. The company reportedly recruits AI researchers with doctorates from top Chinese universities aggressively. DeepSeek also hires people without any computer science background to help its tech better understand a wide range of subjects, per The New York Times.

DeepSeek’s strong models

DeepSeek unveiled its first set of models (DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat) in November 2023. But it wasn’t until spring 2024, when the startup released its next-gen DeepSeek-V2 family of models, that the AI industry started to take notice.

DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well in various AI benchmarks — and was far cheaper to run than comparable models at the time. It forced DeepSeek’s domestic competition, including ByteDance and Alibaba, to cut the usage prices for some of their models, and make others completely free.

DeepSeek-V3, launched in December 2024, only added to DeepSeek’s notoriety.

According to DeepSeek’s internal benchmark testing, DeepSeek-V3 outperforms both downloadable, openly available models like Meta’s Llama and “closed” models that can only be accessed through an API, like OpenAI’s GPT-4o.

Equally impressive is DeepSeek’s R1 “reasoning” model. DeepSeek claims that R1, released in January, performs as well as OpenAI’s o1 model on key benchmarks.

Being a reasoning model, R1 effectively fact-checks itself, which helps it to avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer — usually seconds to minutes longer — to arrive at solutions compared to a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.

There is a downside to R1, DeepSeek-V3, and DeepSeek’s other models, however. Being Chinese-developed AI, they’re subject to benchmarking by China’s internet regulator to ensure that their responses “embody core socialist values.” In DeepSeek’s chatbot app, for example, R1 won’t answer questions about Tiananmen Square or Taiwan’s autonomy.

In March, DeepSeek surpassed 16.5 million visits. “[F]or March, DeepSeek is in second place, despite seeing traffic drop 25% from where it was in February, based on daily visits,” David Carr, editor at Similarweb, told TechCrunch. It still pales in comparison to ChatGPT, which surged past 500 million weekly active users in March.

In May, DeepSeek released an updated version of its R1 reasoning AI model on the developer platform Hugging Face.

DeepSeek unveiled a new experimental model called V3.2-exp in September, designed to have dramatically lower inference costs when used in long-context operations.

A disruptive approach

If DeepSeek has a business model, it’s not clear what that model is, exactly. The company prices its products and services well below market value — and gives others away for free. It’s also not taking investor money, despite a ton of VC interest.

The way DeepSeek tells it, efficiency breakthroughs have enabled it to maintain extreme cost competitiveness. Some experts dispute the figures the company has supplied, however.

Whatever the case may be, developers have taken to DeepSeek’s models, which aren’t open source as the phrase is commonly understood but are available under permissive licenses that allow for commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek’s models, developers on Hugging Face have created over 500 “derivative” models of R1 that have racked up 2.5 million downloads combined.

DeepSeek’s success against larger and more established rivals has been described as “upending AI” and “over-hyped.” The company’s success was at least in part responsible for causing Nvidia’s stock price to drop by 18% in January, and for eliciting a public response from OpenAI CEO Sam Altman. In March, U.S. Commerce Department bureaus told staffers that DeepSeek would be banned on their government devices, according to Reuters.

Microsoft announced that DeepSeek is available on its Azure AI Foundry service, Microsoft’s platform that brings together AI services for enterprises under a single banner. When asked about DeepSeek’s impact on Meta’s AI spending during its first-quarter earnings call, CEO Mark Zuckerberg said spending on AI infrastructure will continue to be a “strategic advantage” for Meta. In March, OpenAI called DeepSeek “state-subsidized” and “state-controlled,” and recommended that the U.S. government consider banning models from DeepSeek.

During Nvidia’s fourth-quarter earnings call, CEO Jensen Huang emphasized DeepSeek’s “excellent innovation,” saying that it and other “reasoning” models are great for Nvidia because they need so much more compute.

At the same time, some companies are banning DeepSeek, and so are entire countries and governments, including South Korea. New York state also banned DeepSeek from being used on government devices.

In May, Microsoft vice chairman and president Brad Smith said in a Senate hearing that Microsoft employees aren’t allowed to use DeepSeek due to data security and propaganda concerns.

As for what DeepSeek’s future might hold, it’s not clear. Improved models are a given. But the U.S. government appears to be growing wary of what it perceives as harmful foreign influence. In March, The Wall Street Journal reported that the U.S. would likely ban DeepSeek on government devices.

This story was originally published January 28, 2025, and will be updated regularly.
