
The Chinese AI startup DeepSeek recently launched its eponymous free app, which has swiftly gained a large user base on the App Store in markets such as the United States. The app runs on an open-source AI model, DeepSeek-V3, which reportedly outperforms Meta’s Llama 3.1 and rivals Anthropic’s Claude 3.5 and OpenAI’s GPT-4. Despite this performance, the model was reportedly trained with far fewer hardware resources than its competitors required, at a development cost of less than $6 million.
DeepSeek was established in April 2023 by Liang Wenfeng, who also founded the quantitative hedge fund High-Flyer, and it is funded by that hedge fund. Because it does not rely on external venture capital, the company has greater flexibility in its operational decisions.
DeepSeek’s first AI model, DeepSeek Coder, was offered free to researchers and was also licensed for commercial use. The company then introduced its first large-scale natural language model, DeepSeek LLM, followed by DeepSeek-V2 in May 2024. DeepSeek-V2 was widely adopted because it offered superior performance at a lower cost, which pressured Chinese tech giants such as ByteDance, Tencent, Baidu, and Alibaba to cut the usage fees for their own AI models to retain users.
The latest model, DeepSeek-V3, has 671 billion parameters, exceeding the 405 billion of Meta’s Llama 3.1. Remarkably, it was reportedly trained on only 2,048 NVIDIA H800 GPUs in about two months, at a cost of roughly $5.6 million, a fraction of what other tech companies spend on training.
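As a rough sanity check on those figures, the short Python sketch below estimates the GPU-hours and cost of such a run. It assumes a full 60-day run with every GPU busy the whole time and a hypothetical rental rate of $2 per GPU-hour; neither assumption comes from DeepSeek or from this article, and the function names are illustrative only.

```python
# Back-of-the-envelope estimate of the reported DeepSeek-V3 training budget.
# GPU count and total cost come from the article; everything else is assumed.

GPUS = 2048                  # NVIDIA H800 GPUs (from the article)
REPORTED_COST_USD = 5.6e6    # reported training cost (from the article)
TRAINING_DAYS = 60           # ASSUMPTION: "two months" taken as 60 full days
PRICE_PER_GPU_HOUR = 2.0     # ASSUMPTION: hypothetical rental rate in USD


def gpu_hours(gpus: int, days: int) -> int:
    """Total GPU-hours if every GPU runs 24 hours a day for `days` days."""
    return gpus * days * 24


def training_cost(gpus: int, days: int, price_per_gpu_hour: float) -> float:
    """Estimated cost in USD at a flat hourly price per GPU."""
    return gpu_hours(gpus, days) * price_per_gpu_hour


def implied_rate(total_cost: float, gpus: int, days: int) -> float:
    """Hourly price per GPU that would reproduce a given total cost."""
    return total_cost / gpu_hours(gpus, days)


if __name__ == "__main__":
    hours = gpu_hours(GPUS, TRAINING_DAYS)
    cost = training_cost(GPUS, TRAINING_DAYS, PRICE_PER_GPU_HOUR)
    rate = implied_rate(REPORTED_COST_USD, GPUS, TRAINING_DAYS)
    print(f"GPU-hours (upper bound): {hours:,}")
    print(f"Cost at ${PRICE_PER_GPU_HOUR:.2f}/GPU-hour: ${cost:,.0f}")
    print(f"Rate implied by reported cost: ${rate:.2f}/GPU-hour")
```

Under these assumptions the estimate lands just under $6 million, the same order of magnitude as the reported figure. A shorter run or a lower hourly rate would bring it closer to $5.6 million.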
DeepSeek’s AI models are accessible through its website, its apps, or its API. The DeepSeek-R1 model, released under the widely used and permissive MIT license, allows unrestricted commercial use, which further encourages industry adoption.
Other tech companies invest billions of dollars and procure vast quantities of GPUs to develop AI models. DeepSeek’s emergence suggests that high-performance AI can be achieved at significantly lower cost. This disruptive approach has even contributed to a sharp decline in the share prices of NVIDIA and other technology companies.
DeepSeek’s rapid ascent also has a geopolitical dimension. The company reportedly leverages NVIDIA’s earlier A100 accelerators alongside the H800 GPUs used for DeepSeek-V3, rather than the newest chips. This shows resilience against U.S. government technology export restrictions and suggests that advanced AI can still be developed under such limitations.
The rise of DeepSeek underscores the feasibility of building cost-effective, high-performance AI. It also casts doubt on the heavy spending by major U.S. tech companies on AI development, and it may steer more players toward affordable, efficient, and fast development strategies, reshaping the competitive landscape of AI innovation in the United States.