DeepSeek makes the V4 Pro price discount permanent

(api-docs.deepseek.com)

433

248

by Tiberium

> (3) The deepseek-v4-pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC.

https://x.com/deepseek_ai/status/2057854261699195173

alyxya
921d
Once they have their own coding agent which they seem to be working towards, I may start predominantly using their models. They seem to be doing all the "right" things, open sourcing models, publishing research, and keeping prices low for everyone.
1. ammar_x
  461d
  You can use V4 Pro with Claude Code [1].
  I tried it and it's impressive.
  [1]: https://api-docs.deepseek.com/quick_start/agent_integrations...
  1. KronisLV
    01d
    I'm working on a custom launcher for hooking up Claude Code with various providers (it groups env variables in profiles), because DeepSeek doesn't have vision and sometimes I need browser use with screenshots or Opus reasoning; for other tasks it's fine: https://ccode.kronis.dev/
    # After installed (or when run portably with ./ccode)
    ccode init-config
    ccode edit-config
    # Run with default profile
    ccode
    # Run with named profile
    ccode --deepseek
    # Set default profile
    ccode set-default-profile deepseek
    Also turns out that with a local proxy you can get Remote Control working and see the DeepSeek sessions in the desktop app, screenshots on the page. Other than that, I'm happy that it works pretty well and the discount is enough to make me consider going from Anthropic's Max subscription to Pro and using it only where DeepSeek is insufficient. With that proxy I eventually hope to be able to transparently switch models mid-task, if I need Opus for like 5 turns or something.
    Overall though I'm not sure exactly how well Claude Code would stack up against OpenCode, since the latter overall feels a bit less hacky with 3rd party models and is even getting niche but nice features like a locally runnable web version: https://opencode.ai/docs/web/
2. LaurensBER
  01d
  It works very well with OpenCode. My team keeps hitting the 5h limits on other subscriptions and it's pretty good to have Deepseek as a backup. I just put 50 bucks on there and it feels like it'll never run out.
  It's not good enough to fully replace any of the frontier models yet but it's definitely great to have as a backup!
3. lambda
  101d
  Why do you need them to provide a coding agent? Just use their model with any off the shelf coding agent. I happen to prefer Pi, but use whatever works for you.
  1. alyxya
    21d
    I probably have an unfounded assumption that whatever coding agent they make will work really well with their models, better than external harnesses. I don't have a good sense for how all the model + harness combinations compare, nor any good way to compare them myself, but generally believe model companies train their models to work best with their own harness.
4. smoe
  71d
  Earlier this week I started testing Chinese models on my codebase. I haven’t really looked at interactive coding yet, but more at issue triage, bug auto-fixing, log analytics, etc.
  I used DeepSeek, Kimi, GLM, Qwen, and MiMO against GPT-5.5 high as reference, all running in Pi harness without anything installed.
  So far, Kimi and MiMO look the most promising to me. I haven’t tested them rigorously enough to make a strong statement, but my first impression is that, in practice, all those models may be less behind on typical daily tasks than people think.
  They are a bit "work hard, not smart": they get to same-ish results more slowly and use more tokens, but at a fraction of the price.
  1. try-working
    01d
5. jdboyd
  31d
  I would prefer a coding agent to be somewhat independent of the model provider. Providers are trading off on quality, features, and price so frequently, and I don't want to keep changing my agent every time.
  I am looking forward to things slowing down and stabilizing. I'm not saying that should happen today, just I am looking forward to it.
  1. gaolei8888
    223h
    I think this will happen much sooner than we thought. Maybe it will happen in the next 6 months.
6. tequila_shot
  01d
  You no longer need "their coding agent". You can hook up claude code to use Deepseek. Works perfectly.
7. minimaxir
  01d
  Zed's Agent natively supports a DeepSeek API key now. (do not use it through OpenRouter if you want to save the most cost)
8. vinhnx
  01d
  You can use DeepSeek with my coding agent VT Code. Recently I've added DeepSeek V4 Pro and DeepSeek V4 Flash support with all providers, via: Official DeepSeek API, HuggingFace, Ollama Cloud, OpenRouter providers.
  > https://github.com/vinhnx/vtcode
9. zozbot234
  41d
  antirez's ds4-agent works quite fine. It runs on any Apple Silicon device with 96GB RAM or more.
  1. rjh29
    11d
    I wonder how many years it'll take for the API token cost to exceed the money spent on ram.
10. raincole
  01d
  All the major coding agents already support DeepSeek.
11. teekert
  019h
  Why not OpenCode? Genuine question, not an expert..
12. cultofmetatron
  11d
  OpenCode works with them today. I've been using it full-time for 2 weeks so far.
  1. sunaookami
    01d
    Using it with Pi and can only report good things so far. I'm very impressed by how good it is (although it's way slower than Claude Sonnet and GPT-5.5 and often thinks "too much" before starting).
13. potsandpans
  01d
  Give pi a try if you haven't already. Avoid vendor harness lock-in.
14. [deleted]
  01d
  [deleted]
15. linzhangrun
  022h
  There is already an open-source deepseek-tui coding agent. Besides, you can always connect DeepSeek to opencode.
16. jack_pp
  01d
  I have done some amazing things for 5 dollars using opencode. Give it a shot, it is incredibly cheap.
17. ReptileMan
  41d
  Pi, opencode and Zed all work amazingly well with DeepSeek.
  1. Guillaume86
    31d
    You seem to have tried a few things. If you don't mind, I have a few questions, as someone currently on Claude Code who would prefer not to lock myself into a commercial ecosystem (and their pricing change regarding headless usage is annoying me):
    - how do/would you add the WebSearch tool to your harness? pay for a separate service or does deepseek offer something with their subscriptions?
    - do pi/opencode support pasting images in prompts?
    - how do you handle reading images? deepseek is not multi modal IIRC? do you pay for another model and route to it?
    Any of these missing would really annoy me in day to day use...
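Several replies above describe pointing an existing coding agent at DeepSeek by setting environment variables, and one launcher groups those variables into named profiles. A minimal Python sketch of that idea follows. The profile table and the `build_env` helper are hypothetical and are not ccode's actual config format. `ANTHROPIC_BASE_URL`, `ANTHROPIC_AUTH_TOKEN` and `ANTHROPIC_MODEL` are, as far as I know, the variables Claude Code reads; the endpoint URL is an assumption that should be checked against DeepSeek's agent integration docs, and the model id is the one quoted in the pricing announcement.

```python
import os

# Hypothetical profile table: each profile is just a bundle of environment
# variable overrides. This is NOT ccode's real config format.
PROFILES = {
    "anthropic": {},  # no overrides: the agent talks to its default backend
    "deepseek": {
        # Assumed Anthropic-compatible endpoint; verify in DeepSeek's docs.
        "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
        # Model id as quoted in the pricing announcement.
        "ANTHROPIC_MODEL": "deepseek-v4-pro",
    },
}


def build_env(profile, base_env, api_key=None):
    """Return a copy of base_env with the profile's overrides applied."""
    if profile not in PROFILES:
        raise KeyError(f"unknown profile: {profile}")
    env = dict(base_env)
    env.update(PROFILES[profile])
    if api_key is not None:
        env["ANTHROPIC_AUTH_TOKEN"] = api_key
    return env


if __name__ == "__main__":
    env = build_env(
        "deepseek", os.environ, api_key=os.environ.get("DEEPSEEK_API_KEY")
    )
    # An agent could be launched with subprocess.run([...], env=env);
    # printing keeps this sketch free of side effects.
    for name in ("ANTHROPIC_BASE_URL", "ANTHROPIC_MODEL"):
        print(f"{name}={env[name]}")
```

Keeping each provider as a plain dictionary of overrides means switching back to the default backend is just the empty profile, which is roughly the per-task switching the launcher comment is after.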
wg0
171d
If you have not tried DeepSeek V4, you're missing out. The pricing makes it unbelievably good.
The chains of thought for DeepSeek are very, very interesting reads. OpenCode won't show them, but do read them and you'll be surprised at how underrated the model is.
My model usage is very low, but I still pay DeepSeek directly and regularly, both as a tribute for open-sourcing their models and to show support for what I consider positive for the overall social good.
1. abyssin
  51d
maltalex
151d
This looks suspiciously cheap.
The same model hosted by other providers is much more expensive [0]. So either DeepSeek can host it much cheaper than anyone else, or their business model is different. I suspect the latter, especially since their privacy policy [1] says personal data, including “User Input,” can be used "To improve and develop the Services and to train and improve our technology".
[0]: https://openrouter.ai/deepseek/deepseek-v4-pro/providers
[1]: https://cdn.deepseek.com/policies/en-US/deepseek-privacy-pol...
1. Palmik
  118h
minimaxir
101d
I'm more curious about the caching:
> (2) For all models, the input cache hit price has been reduced to 1/10 of the launch price. This price adjustment takes effect from 2026/4/26 12:15 UTC.
There is no end date. Currently, it's 2% of the input price for DeepSeek V4 Flash and 0.8% with this new V4 Pro pricing, which is extremely low compared to competitors to the point that it affects the unit economics a bit and I thought it would be temporary.
In the case of V4 Pro, the effective cost is ~$0.04/M input tokens given the caching (based on OpenRouter's metrics: https://openrouter.ai/deepseek/deepseek-v4-pro), which is significantly cheaper than even small models from competitors.
1. Palmik
  018h
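The "effective cost" in the comment above is a blend of cached and uncached input prices, so it can be inverted to see what cache hit rate it implies. The sketch below does that arithmetic. It assumes the $0.003626/M cache-read price quoted elsewhere in this thread, and it derives the uncached price from the "0.8%" figure rather than from an official price list, so treat the result as a rough consistency check only. The function names are illustrative.

```python
def blended_input_price(hit_rate, hit_price, miss_price):
    """Average price per million input tokens at a given cache hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_price + (1.0 - hit_rate) * miss_price


def implied_hit_rate(effective_price, hit_price, miss_price):
    """Invert blended_input_price: which hit rate yields effective_price?"""
    return (miss_price - effective_price) / (miss_price - hit_price)


# Figures quoted in the thread, in USD per million tokens.
HIT_PRICE = 0.003626                   # cache-read price quoted in the thread
HIT_FRACTION = 0.008                   # cache hits cost "0.8%" of the input price
MISS_PRICE = HIT_PRICE / HIT_FRACTION  # derived, not an official list price
EFFECTIVE = 0.04                       # "~$0.04/M" effective input cost

rate = implied_hit_rate(EFFECTIVE, HIT_PRICE, MISS_PRICE)
print(f"derived uncached input price: ${MISS_PRICE:.5f}/M")
print(f"implied cache hit rate: {rate:.1%}")
```

Under these assumptions the quoted effective cost corresponds to a cache hit rate of roughly 92%, which is plausible for agent workloads that resend a long, mostly unchanged context on every turn.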
Sphax
21d
That is some insane value. I've been using GLM Coding Plan Max with GLM 5.1 for a while, and I've tested DeepSeek V4 Pro for maybe 3 weeks now and found it to be better than GLM 5.1 for complex coding tasks. I've used 65M tokens and at that price it cost me $1.50, which is really cheap.
1. DeathArrow
  11d
  I think DeepSeek uses many more tokens than other models.
Reubend
91d
Props to them. That makes DeepSeek V4 Pro extremely cheap compared to others, even in the same category. Look at these prices per million output tokens:
DeepSeek V4 Pro: $0.87
Qwen 3.7 Max: $7.50
Grok 4.3: $2.50
GLM 1.5: $3.08
Opus 4.7: $25.00
GPT-5.5: $30.00
1. Arcuru
  11d
  It's actually even cheaper when you look at the cache read costs. Those costs can dominate in agent workflows, and DeepSeek's cost for cache reads is insanely low by comparison: $0.003626/M tokens, whereas the cheapest other model on your list is >$0.20/M tokens. That's on the scale of 100x cheaper.
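To put the figures in this subthread side by side, here is a small sketch that expresses each listed output price as a multiple of the DeepSeek V4 Pro price and computes a lower bound on the cache-read gap. The prices are copied from the comments above and are not independently verified; `price_multiples` is an illustrative helper name.

```python
# Output prices in USD per million tokens, copied from the list above.
OUTPUT_PRICE = {
    "DeepSeek V4 Pro": 0.87,
    "Qwen 3.7 Max": 7.50,
    "Grok 4.3": 2.50,
    "GLM 1.5": 3.08,
    "Opus 4.7": 25.00,
    "GPT-5.5": 30.00,
}


def price_multiples(prices, baseline):
    """Express every price as a multiple of the baseline model's price."""
    base = prices[baseline]
    return {name: price / base for name, price in prices.items()}


multiples = price_multiples(OUTPUT_PRICE, "DeepSeek V4 Pro")
for name, multiple in sorted(multiples.items(), key=lambda item: item[1]):
    print(f"{name:16s} {multiple:5.1f}x")

# Cache reads: $0.003626/M versus a quoted lower bound of $0.20/M for others.
cache_ratio_lower_bound = 0.20 / 0.003626
print(f"cache-read gap: at least {cache_ratio_lower_bound:.0f}x")
```

Because the $0.20/M figure is stated as a lower bound, the computed cache-read ratio is also only a lower bound; the actual gap for any particular competitor would be larger.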
doctoboggan
141d
I am more worried about an accidental data leak (an agent reading an env file, for example) with the Chinese-hosted models compared to the US-hosted models. Am I wrong to suspect that the Chinese government might be more likely to scan all chats and save useful information than the US government or a US company?
I hesitated to even post this comment as it sounds biased and xenophobic. I would love for someone to convince me I am wrong. Does anyone have any insight into the company behind deepseek hosting, and what their history of respecting data privacy is?
1. 3s
  11d
gertlabs
11d
Even with the V4 Pro discount, the V4 Flash model gives you the best performance per dollar, and better performance overall for agentic, tool-heavy workloads. V4 Pro is smarter in one-shot reasoning, but significantly slower. The combination of performance, cost, and speed makes V4 Flash our top flash model today by far.
Data at https://gertlabs.com/rankings
1. dyauspitr
  01d
  In my use cases (mainly very large summarization and idea extraction) it’s pretty shit though compared to Pro.
cold_harbor
41d
their MLA architecture cuts KV cache by ~5-13x vs standard attention. that's why inference is actually cheaper to run, not just a price war to gain market share.
1. zozbot234
  01d
  That's also a game changer for local inference. It unlocks long contexts, batched inference and storing the KV cache to disk on ordinary consumer platforms.
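The KV cache claim above can be made concrete with a back-of-the-envelope size estimate. Standard attention caches one key and one value vector per KV head per layer for every token, while a latent-attention design such as MLA caches one compressed vector per layer instead. The sketch below compares the two. Every dimension in it is a hypothetical placeholder chosen for illustration and is not DeepSeek's published configuration, so the resulting ratio only shows how a figure in the quoted range can arise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionConfig:
    n_layers: int
    n_kv_heads: int
    head_dim: int
    latent_dim: int          # width of the compressed KV latent
    rope_dim: int            # positional key dims cached next to the latent
    bytes_per_elem: int = 2  # fp16/bf16 storage


def standard_kv_bytes_per_token(cfg):
    # One key and one value vector per KV head, per layer.
    return 2 * cfg.n_layers * cfg.n_kv_heads * cfg.head_dim * cfg.bytes_per_elem


def latent_kv_bytes_per_token(cfg):
    # One shared latent (plus its positional part) per layer.
    return cfg.n_layers * (cfg.latent_dim + cfg.rope_dim) * cfg.bytes_per_elem


# Hypothetical dimensions, for illustration only.
cfg = AttentionConfig(
    n_layers=60, n_kv_heads=16, head_dim=128, latent_dim=512, rope_dim=64
)
context_tokens = 1_000_000

std = standard_kv_bytes_per_token(cfg) * context_tokens
lat = latent_kv_bytes_per_token(cfg) * context_tokens
print(f"standard KV cache: {std / 2**30:.1f} GiB")
print(f"latent KV cache:   {lat / 2**30:.1f} GiB")
print(f"reduction:         {std / lat:.1f}x")
```

The reduction factor depends entirely on how many KV heads the baseline uses and how wide the latent is, which is one reason a range rather than a single number gets quoted.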
g023
01d
If anyone is looking to hook it up to copilot, I made a proxy script to handle the connection a bit back that might be handy: https://gist.github.com/g023/c2bb7b540ffe64cee76023f18f6f936...
smallerfish
013h
They may be state backed, in which case the loss-leading could be a geopolitical move. It's a useful model regardless.
China sells lithium at a loss to make it unprofitable for Australian/US miners, for example (https://www.miningweekly.com/article/china-is-oversupplying-...).
wolttam
01d
I was hoping they were going to do this.
I'll keep running Flash locally for the stuff where I care about data privacy, but the value of Pro through their API is unreal for anything else (and I want to give them my training data as long as they keep putting out open models).
jorl17
01d
I've been extremely impressed with DeepSeek V4 flash.
We've been working on a project which can be thought of as an agent, just not for coding. So we've been building everything: agents, sub-agents, RAG, dynamic intent detection, changing models based on what's being done, etc. In our tests, DeepSeek V4-flash is the cheapest model with acceptable replies (few hallucinations, while finding the right information). It's not the cheapest one we run overall (we're actually surviving with 3B models for some tasks), but it's definitely the one powering the system and driving the main "agent".
margorczynski
231d
Maybe the Chinese are playing the long game by trying to bankrupt the US competition? Because there's no way this is financially viable.
1. ecommerceguy
  01d
  Small team, cheap electricity, very efficient models. Many western companies operate at a loss to gain market share. Why can't the Chinese?
ascotan
013h
DeepSeek's official privacy policy explicitly states: “To provide you with our services, we directly collect, process and store your Personal Data in the People's Republic of China.”
US companies don't sell AI services in China (as far as I know), but DeepSeek markets to US companies and customers.
bel8
01d
Great! I have been using DeepSeek 4 Flash high for everything lately.
First accessible model with a usable 1 million context window for me.
onlyrealcuzzo
91d
I just canceled Claude Code and Codex today.
RIP.
Claude literally refuses to finish tasks in auto mode and just keeps saying "now is a good stopping point" when it's 1% done (and doing the EXACT OPPOSITE of what I tell it).
Codex is barely better...
May as well pay 1/20th the price for DeepSeek.
Claude seems to have something that looks at how long you've been a customer and then just massively degrades quality.
When I started my subscription, Claude had none of these problems.
2 months into subscriptions Claude is completely unusable garbage, and Codex is not much better.
1. eiek
  41d
spudlyo
01d
I use it with Pi and with Gptel and I'm extremely happy about the price. The speed of deepseek-v4-pro, though, leaves something to be desired. I do love how detailed its chain of thought reasoning is, and it's pretty wild watching it think at ~2400 baud. It's much more transparent than Gemini 3.5 flash in that regard, but maybe 4-5x slower? For my Latin language morphology and linguistic tasks it seems to be up to the job, and on the plus side I can analyze a handful of sentences in parallel without worrying about breaking the bank.
belinder
111d
Anyone using deepseek through a gateway (not sure if right term) so there's no data retention? At work we're going through a few hundred million tokens a day in our app (using anthropic models), and we're looking for something significantly cheaper
1. wkcheng
  01d
  Use it through Azure! Azure hosts DeepseekV4-Pro and DeepseekV4-Flash themselves. We're using it and it works great.
  You don't get the discount that Deepseek is providing, but it's still a cheap model (v4-pro is cheaper than sonnet)
Palmik
018h
I really hope Huawei ramps up Ascend production and DeepSeek open sources their optimized inference engine (they already open source a lot of their kernels -- kudos to them). This could shake things up.
zmmmmm
01d
I will testify I have used V4 Pro as a coding agent and it did a great job solving a complex problem. It worked with Pi over something like an hour, iterating and running tests. I paid API rates via OpenRouter and it cost me less than $1 I think. I've had single prompts cost that much with Anthropic. I was very impressed.
louiereederson
21d
I wonder if/when the US limits market entry of Deepseek and other Chinese model vendors like they have done with Huawei
1. mmastrac
  01d
  How would that be technically feasible? Would we get IP bans?
tacone
11d
TIL I might be able to use DeepSeek directly from VS Copilot https://github.com/Vizards/deepseek-v4-for-copilot (disclaimer: I have yet to try it).
1. vitaflo
  01d
  Deepseek has instructions on how to do this on their website (along with many other agents):
  https://api-docs.deepseek.com/quick_start/agent_integrations...
velomash
11d
I found that DSV4 wasn't as cheap as its token price. It burns tokens at a pretty high rate
1. bel8
  01d
  try high variant instead of max.
  max is really chatty for minimal gain.
picardo
31d
I tried it with Claude Code for a while, but the lack of a WebSearch tool became a dealbreaker for me. Does anyone know if they will provide support for it?
1. freakynit
  219h
  You can integrate a search MCP server. I use it this way and it works flawlessly.
dburkland
01d
I've had a ton of success when pairing Opus 4.7 for planning w/ DeepSeek V4 Flash in opencode. Best part is DeepSeek V4 Flash is Free through opencode Zen.
sidcool
221h
I love DeepSeek, but there is a deep-rooted pro-China opinion in it. Test it out for yourself.
1. ReptileMan
  119h
  I choose pro china over pro woke every day of the week.
  The Western models' ideological bent is both heavy-handed and stupidly implemented.
kingjimmy
11d
is this the Huawei chip difference?
1. chvid
  01d
  That is probably why they were a few months delayed. But could be interesting to see their hosting / network / colocation setup.
keithfawcett
01d
Minimax M2.7 is surprisingly cheap as well, especially on their subscription plan.
Havoc
01d
Neat. I like DS for secondary checks on code. Sometimes spots things other models don't
nelox
01d
China says thank you.
sourcecodeplz
01d
Honestly I haven't even tried the Pro model. Flash was just so much more than I expected I just keep working with it. Thank you deepseek team
vladgur
11d
Which models do folks use for openclaw nowadays?
1. npilk
  01d
  I've been using DeepSeek Flash to replace Sonnet once the subscription stopped working. Haven't really noticed a difference, although I don't usually have it doing anything very complicated.
jijji
01d
I just can't get past the deepseek-CCP connection... as good as it might be I'd wonder when your machine gets backdoored by the CCP or at least your data gets stolen
rvz
01d
Someone can afford to race everyone to zero.
Remember Jevons paradox? [0] It isn't at Anthropic or Microsoft [0], but it is at DeepSeek.
[0] https://www.thelowdownblog.com/2026/05/microsoft-cancels-int...
guelo
41d
Even at these prices I find claude and codex subscriptions to be cheaper than per-token pricing when my usage is hovering around the session limits. I guess the subscriptions are heavily subsidized.
1. guelo
  31d
  I guess I got downvoted because people don't believe me that it's cheaper? But I spent $5 a couple days ago in one hour with deepseek v4 in a coding agent. That's way more expensive than a $20/month claude subscription. Even if I hit claude's 5h limit in one hour I can do that many times in a month.
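The comparison in this subthread comes down to a break-even calculation between a flat subscription and metered usage. A minimal sketch, using the $20/month and $5/hour figures from the comment above; the helper names are illustrative, and the model ignores subscription rate limits and any variation in hourly spend.

```python
def break_even_hours(subscription_per_month, api_cost_per_hour):
    """Hours of metered use per month at which both options cost the same."""
    if api_cost_per_hour <= 0:
        raise ValueError("api_cost_per_hour must be positive")
    return subscription_per_month / api_cost_per_hour


def cheaper_option(hours_per_month, subscription_per_month, api_cost_per_hour):
    """Pick the cheaper option for a given number of usage hours per month."""
    api_total = hours_per_month * api_cost_per_hour
    return "subscription" if api_total > subscription_per_month else "per-token"


hours = break_even_hours(subscription_per_month=20.0, api_cost_per_hour=5.0)
print(f"break-even at {hours:.1f} hours of metered use per month")
print(cheaper_option(10, 20.0, 5.0))
print(cheaper_option(2, 20.0, 5.0))
```

At that spend rate the subscription wins after a few hours of heavy use per month, whereas other commenters in this thread report far lower per-token spend, so the answer depends heavily on the workload.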
[deleted]
01d
[deleted]
dyauspitr
01d
Oh shit that changes everything. This might be the biggest thing to happen to LLMs this year.
[deleted]
01d
[deleted]
