Alibaba’s $0.40 Qwen moves closed, multimodal AI closer to every workflow
Qwen3.7-Plus pairs text, video, and image input with low token costs, but Alibaba is now selling it through a proprietary gate.

Alibaba released Qwen3.7-Plus, a multimodal AI model priced at $0.40 per 1M input tokens and $1.60 per 1M output tokens, while keeping it under a closed commercial license. For decision-makers, that means lower inference costs and stronger agentic capabilities are now tied to vendor lock-in and compliance scrutiny, not open-weight flexibility.
The combination of low pricing and a closed commercial license is the whole story behind Qwen3.7-Plus. The model is cheaper than Alibaba’s prior text-only Qwen3.7-Max, which had a total cost of $10.00 per 1M tokens in VentureBeat’s snapshot, and it comes with more capabilities, not fewer. In plain English: Alibaba is offering a lower-cost model that can handle text, video, imagery, screenshots, and enterprise visuals, but only through proprietary APIs and Qwen Chat.
That matters because Alibaba’s Qwen line built much of its reputation on powerful open-source models, or at least open-weight releases that enterprises could inspect, adapt, and deploy more freely. Qwen3.7-Plus marks a big departure from that strategy. The immediate tradeoff is obvious for teams that care about budget and performance: the model is among the cheaper powerful AI options available right now, coming in just above MiniMax-M3’s limited-time discount pricing. The bigger tradeoff is strategic. Companies that had been relying on open Qwen models, including U.S. giants such as Airbnb, which the source mentions by name, may find that Alibaba’s newest releases no longer fit the same procurement, security, or customization playbook.
The pitch is still compelling if you care about agentic workflows. Alibaba says Qwen3.7-Plus ships with a 1-million-token context window and up to 256K tokens reserved for internal chain-of-thought processing. The source frames this as a response to a common failure mode in autonomous agents: state decay, or the tendency for a system to lose its analytical thread over long, multi-step tasks. That is not a sexy phrase, but it is a real problem. If you are trying to automate something like a cloud migration, a codebase audit, or a complex terminal workflow, the model needs to keep track of dependencies, previous decisions, and edge cases without forgetting where it started. Alibaba’s answer is a feature called preserve_thinking, which retains the model’s internal thinking blocks across conversational turns so it can keep reasoning without dropping context midstream.
This is not just Alibaba inventing a cute label for an old idea. The source says preserve_thinking was introduced during the prior Qwen3.6 generation and was integrated into both the open-weight Qwen3.6-27B and the proprietary Max models. It also notes that the broader industry has converged on similar mechanisms. Anthropic calls its version Extended Thinking, and OpenAI uses an encrypted reasoning pass-back mechanism for models like GPT-5.5. Different names, same pressure: if a model is going to do complex tool use, it has to remember its own logic. For builders, that means the conversation is no longer just about raw model intelligence. It is about whether the model can hold a plan together while it executes.
Benchmark results suggest Qwen3.7-Plus is competitive, though not the absolute front-runner. On Terminal Bench 2.0-Terminus, which tests a model’s ability to run terminal-level code safely and iteratively, it scored 70.3, ahead of DeepSeek-V4-Pro Max at 67.9 and Gemini-3.1 Pro at 63.5. On ScreenSpot Pro, a computer vision benchmark focused on localized interface understanding, it scored 79.0, beating GPT-5.4 xhigh at 67.4 and Claude-Opus-4.6 at 49.5. The source also says that, on raw capability metrics overall, it still trails many leading U.S. proprietary models, including prior generations such as Anthropic’s Claude Opus 4.6 and OpenAI’s GPT-5.4. So the picture here is not “Alibaba just won everything.” It is more specific than that: Qwen3.7-Plus looks strong in the places that matter for multimodal agents and enterprise workflow automation, even if it is not universally dominant.
The pricing structure is where the product gets interesting for operators. Standard input processing sits at $0.40 per 1M tokens, but cached reads can fall to $0.04 per 1M tokens, a tenth of the standard rate. That matters if your agent keeps revisiting a large code repository, a static UI kit, or a persistent enterprise knowledge base across hundreds of loops. Cheap caching turns expensive repetitive work into something much more usable at scale. Alibaba also says the API is OpenAI-compatible, which lowers switching friction for teams already built around that ecosystem, and the model can run through local terminal setups by changing the base environment targets those tools point to. In other words, Alibaba is not just selling a model. It is selling a migration path.
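To make the caching math concrete, here is a rough back-of-the-envelope sketch in Python. The three prices are the ones quoted above; everything else is an assumption for illustration: the workload (a 200,000-token context re-read on each of 100 loops, with 2,000 output tokens per loop), the function names, and the simplification that the first loop pays the full input rate while every later loop is billed entirely at the cached rate, with no cache-write surcharge or cache expiry. Alibaba's actual billing rules may differ.

```python
# Prices quoted in the article, in USD per 1M tokens.
INPUT_PRICE = 0.40
CACHED_INPUT_PRICE = 0.04
OUTPUT_PRICE = 1.60


def call_cost(fresh_input: int, cached_input: int, output: int) -> float:
    """Cost in USD of one call, given token counts for each price tier."""
    total = (
        fresh_input * INPUT_PRICE
        + cached_input * CACHED_INPUT_PRICE
        + output * OUTPUT_PRICE
    )
    return total / 1_000_000


def agent_loop_cost(
    context_tokens: int, output_tokens: int, loops: int, use_cache: bool
) -> float:
    """Estimate the cost of an agent re-reading the same context every loop.

    Assumption (hypothetical, not from the article): with caching, the first
    loop pays the standard input rate and all later loops pay the cached rate.
    """
    if loops <= 0:
        return 0.0
    if not use_cache:
        return loops * call_cost(context_tokens, 0, output_tokens)
    first = call_cost(context_tokens, 0, output_tokens)
    rest = (loops - 1) * call_cost(0, context_tokens, output_tokens)
    return first + rest


if __name__ == "__main__":
    uncached = agent_loop_cost(200_000, 2_000, 100, use_cache=False)
    cached = agent_loop_cost(200_000, 2_000, 100, use_cache=True)
    print(f"without caching: ${uncached:.2f}")  # $8.32
    print(f"with caching:    ${cached:.2f}")    # $1.19
```

Under these assumptions, the same hypothetical 100-loop job drops from about $8.32 to about $1.19, and the savings grow with the share of each request that is repeated context rather than fresh input or output.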
Still, the license is the catch. The source says legal and security teams will see the lack of open source licensing or open weights as a primary compliance question, especially for enterprises that previously liked the Qwen family because it offered Apache 2.0 or customized open-use options. That is the tension decision-makers now have to price in. Lower unit costs and stronger multimodal performance are attractive, but proprietary access changes how models are audited, governed, and swapped out later. For founders, CFOs, and platform teams, Qwen3.7-Plus is a reminder that the AI market is moving toward a blunt new bargain: the best features are often getting cheaper, but they are also getting more closed. The real question is no longer just what the model can do. It is how much control you are willing to give up to use it.