After the initial furore following DeepSeek’s release in January 2025, we can now take a bit of time to look back at DeepSeek: what happened and what its impact might be.
05-02-2025
What is DeepSeek?
What is DeepSeek? Well, you may be forgiven for asking why we need another blog on DeepSeek. After all, it has been all over the news since it suddenly exploded onto every news outlet around the world! Before mid-January 2025 many (perhaps most) people would probably not have heard of DeepSeek unless they were very closely following the world of LLMs and Generative AI. Those who were may have noted this Chinese company and its releases of DeepSeek Coder in November 2023, DeepSeek-V2 in May 2024 and DeepSeek-V3 in December 2024, before DeepSeek-R1 in January 2025.
DeepSeek is yet another Chatbot style AI tool. Like others of its kind (such as OpenAI’s ChatGPT, Google’s Gemini and Meta’s Llama) you can ask it questions and, in most cases, it can give you surprisingly good answers very, very quickly. As with systems such as ChatGPT, it employs Natural Language Processing (NLP) to ‘understand’ your questions or prompts and, once it has generated its answers, it again uses NLP to create understandable and meaningful outputs. Between these two stages DeepSeek utilises various AI based tools (and optimization techniques) to generate these responses, including an element of context awareness.
Technically DeepSeek is a Chinese company that develops several tools, most of which have DeepSeek as part of their name, such as DeepSeek Coder (intended as an intelligent code assistant) or DeepSeek-R1 (the latest release of the Chatbot style application at the time of writing). DeepSeek is itself a subsidiary of the rather fancifully named High-Flyer Capital Management (a quantitative analysis company launched in 2015). DeepSeek was only founded in 2023, although it has come a long way in that time. Both DeepSeek and its parent company (High-Flyer Capital Management) were created by Liang Wenfeng.
Liang Wenfeng is an entrepreneur and businessman who, although not a computer scientist per se, has a long history of working with AI tools for a variety of industries; it is in finance that he has been most successful. Indeed, High-Flyer has made extensive use of AI techniques as part of its operations.
A core element of its approach is the use of LLMs (or Large Language Models). LLMs are language models with a very large number of parameters that are trained on very large amounts of text so that they can ‘understand’ a wide variety of topics. In recent years OpenAI, with its GPT models, has been the leader in the field of LLMs, particularly in terms of performance and the results obtained – until now!
Above I somewhat cryptically said that DeepSeek can ‘in most cases’ give you surprisingly good answers. Why only in most cases? Well, this has to do with where the DeepSeek company is based: China. It is a Chinese company and so asking it questions which the Chinese government might not like is taboo. For example, if you ask it about Tiananmen Square you will find that it doesn't want to tell you anything.
This is because, as with other Chinese AI based Chatbots, it is ‘trained’ (if that is the right word) to avoid answering politically sensitive or politically charged questions. From a western point of view this raises numerous ethical and legal questions.
What is all the Hype About?
So, what is the hype about? Well, it has been claimed that DeepSeek has been able to achieve the performance levels of OpenAI’s ChatGPT and Meta’s Llama at a fraction of the cost and in a much shorter time. It should be noted, though, that there are those who are sceptical of this claim and others who claim that DeepSeek has been able to piggyback off OpenAI itself!
For example, DeepSeek claims that it has been able to develop the current version of its ChatBot for about £4.8 million (or $6 million). This is significantly less than the figure of more than $100 million suggested by OpenAI boss Sam Altman with respect to GPT-4.
This means that in terms of development cost, and consequently its operating costs, DeepSeek is able to significantly undercut its American rivals. In fact, DeepSeek has been released as a fully open-source model which anyone can download and use to build their own system.
How has DeepSeek achieved this?
It appears that in order to cut the cost of development, DeepSeek’s engineers have used several different techniques. These techniques in and of themselves don't seem that unique, but combining them and focussing on their benefits seems to have been the key thing.
These include the use of pure reinforcement learning, reward engineering, distillation and representations that use less memory, as well as something called an emergent behaviour network.
Reinforcement learning is a little bit like a child learning not to put their hand on something hot. They do it once and the result is ‘bad’ so they don't do it again (mostly). In a similar way, DeepSeek learns in a trial-and-error manner, trying things out and seeing if the result is beneficial or not. Compared to other companies DeepSeek seems to have relied more heavily on pure reinforcement learning.
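To make the trial-and-error idea a little more concrete, here is a minimal sketch of a classic toy reinforcement learning problem (a so-called multi-armed bandit with an epsilon-greedy strategy). This is purely illustrative: the function name, the reward probabilities and the parameter values are my own assumptions and it is not how DeepSeek trains its models. The 'learner' tries actions, observes whether each one paid off and gradually favours the action that has worked best so far.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Toy trial-and-error learner (hypothetical example, not DeepSeek's method)."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)        # how often each action was tried
    estimates = [0.0] * len(reward_probs)   # running average reward per action

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(len(reward_probs)), key=lambda a: estimates[a])

        # The environment returns a 'good' (1.0) or 'bad' (0.0) result.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Update the running average for the chosen action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


# Three made-up actions; the third pays off most often.
estimates, counts = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("Best action:", best_action)
print("Estimated rewards:", [round(e, 2) for e in estimates])
```

Nobody tells the learner which action is best; it works that out from the rewards alone, which is the essence of the approach described above.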
Distillation and compact data representations are again nothing new, but DeepSeek has used them to create a fast and efficient system. Distillation essentially takes the results of the other learning systems and compresses them down into smaller, simpler systems, with fewer parameters, that either focus on a specific set of tasks or act as summaries of the main process.
DeepSeek’s innovation around the emergent behaviour of its systems is that, using the results of the reinforcement learning, it can avoid expensive programmer input and rely on the system itself to develop naturally towards an emergent behaviour.
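As a rough illustration of distillation, the sketch below shows one common textbook formulation in which a small 'student' model is scored on how closely its output probabilities match those of a larger 'teacher' model. I am assuming this formulation for illustration only; it is not necessarily the recipe DeepSeek used, and the function names and numbers are made up.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives softer ones."""
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's and the student's softened outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Made-up scores for three possible answers.
teacher = [4.0, 1.0, 0.5]
good_student = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
poor_student = [0.5, 1.0, 4.0]   # prefers a different answer

print("Good student loss:", distillation_loss(teacher, good_student))
print("Poor student loss:", distillation_loss(teacher, poor_student))
```

The student that mimics the teacher more closely gets the lower loss, so training the student to minimise this loss pushes the smaller model towards the larger model's behaviour.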
Does it really matter?
Price Disrupter!
For many US, and in fact Chinese, companies DeepSeek’s pricing model is a major issue; it is simply undercutting its competition by some margin. This undermines the financial viability of some companies, particularly US based ones. If they can't charge enough to recoup their costs, then their AI divisions will need to be funded from another part of the organisation or by some external agency.
Stock Market Valuation Impact
This has already been seen around the world, but particularly in America. Thanks to DeepSeek’s fundamentally cheaper model, stock markets have become concerned about the darlings of the AI world and fear that these companies may be lame ducks. Of course, the Stock Market is a notoriously bad predictor of successful technology companies, and investors rarely understand the complexities and intricacies of Generative Artificial Intelligence systems.
US v China v Rest of the World
The US had assumed that it was at the leading edge of AI research in this area. What DeepSeek has shown is that China is at least level with the US and possibly ahead of it in some areas. This could be a moment similar to the Russians getting into space, which might lead to a space race situation for AI with both countries trying to push ahead of each other. The rest of the world will probably tag along as well.
Open Source!
DeepSeek has open sourced its model! So what do other tech companies do now? Where is the value in keeping part or all of their technologies hidden? Some reports even suggest that DeepSeek has been helped by open sourcing its codebase, as other users / developers have helped move its technologies on.
AI is in the Big Time now!
AI is right there in the middle of all the news; its time has come! This is the AI technology revolution, just as the internet was the revolution of the 90s and the PC that of the 80s!
Or is it another South Sea Bubble?
Or maybe it isn’t such a big revolution and this might be something of a South Sea Bubble with the technology reaching its peak in 2025 and then tailing off to just be a useful technology for creating automated ChatBots.
Artificial General Intelligence
Except DeepSeek’s avowed aim is not the creation of a ChatBot per se but the development of Artificial General Intelligence (or AGI). AGI is a type of Artificial Intelligence that aims to create computer systems that have human like cognitive abilities. That is, they can learn, perceive, reason and adapt to new situations just like a human being. So who knows where DeepSeek will end up?
Summary
In the long run, these last few weeks may seem like an inconsequential blip in the west’s, and particularly the US’s, drive towards AI! However, even if this is indeed the case, I suspect that there will still be an effect to note; like Russia and the space race in the 1960s, DeepSeek has sent a jolt through the somewhat complacent view in the US that American companies were way ahead of everyone else. It has also sent a jolt through the stock markets, as the value of AI companies has been hit (even if temporarily) by its release, which may make investors and the stock market more wary of AI companies in the near future.
Update!
According to an article on TechRadar, the Chinese AI, which purportedly cost an order of magnitude less than mainstream western counterparts to develop, may have done so using 50,000 specialised Nvidia GPUs - which would have cost an estimated $1.25 billion (at $25k per unit). It's more of a warren than a rabbit hole, but it does go to show that the stakes are high, and a lot of money is being thrown at this latest computing frontier by all sides.
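As a quick sanity check on these figures, the short calculation below reproduces the hardware estimate and compares it with the roughly $6 million development cost claimed earlier. All the numbers come from this post; the variable names are simply my own.

```python
# Figures quoted in this post (not independently verified).
gpu_count = 50_000                 # reported number of Nvidia GPUs
unit_cost_usd = 25_000             # estimated cost per GPU
claimed_dev_cost_usd = 6_000_000   # development cost claimed by DeepSeek

hardware_cost_usd = gpu_count * unit_cost_usd
ratio = hardware_cost_usd / claimed_dev_cost_usd

print(f"Estimated hardware cost: ${hardware_cost_usd / 1e9:.2f} billion")
print(f"Hardware estimate is about {ratio:.0f} times the claimed development cost")
```

If the GPU report is accurate, the hardware alone would dwarf the headline development figure, which is why the claimed cost has attracted so much scepticism.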