Qwen-3, ChatGPT Shopping feature, The Leaderboard Illusion, Visa launches AI agents for shopping, Sleep-time Compute
AI Connections is back! AI Connections #49 - a weekly newsletter about interesting blog posts, articles, videos, and podcast episodes about AI
NEWS
"Qwen3: Think Deeper, Act Faster" - blog post by Qwen Team: READ
Qwen3 is the latest open-weight large language model series from the Qwen team, featuring both dense and Mixture-of-Experts models like Qwen3-235B-A22B, which rival top-tier models in coding, math, and reasoning. It introduces hybrid "thinking" and "non-thinking" modes for flexible inference, supports 119 languages, and significantly improves agentic capabilities and efficiency. Trained on 36 trillion tokens and available under Apache 2.0, Qwen3 models are easy to deploy using tools like Hugging Face, vLLM, and Qwen-Agent.
Google Q3 earnings call: CEO's remarks about AI - blog post by Google: READ
Alphabet had a strong Q3 driven by rapid AI innovation. Its full-stack AI approach (spanning infrastructure, research, and global product reach) powered major product launches and operational efficiencies, including a 90% reduction in AI Overview costs and widespread adoption of Gemini models. Google Cloud revenue grew 35% YoY to $11.4B, fueled by demand for its AI infrastructure and platforms, while YouTube surpassed $50B in annual ad and subscription revenue and Waymo became the first autonomous vehicle company to exceed 1 million fully autonomous miles driven weekly.
ChatGPT now can help you shop
ChatGPT is rolling out new shopping features to help users find, compare, and buy products more easily, including improved product results, visuals with pricing and reviews, and direct purchase links. These features are not ads and are being gradually released to all user tiers, including Plus, Pro, Free, and logged-out users.
The urgency of AI interpretability - blog post by Dario Amodei (Anthropic): READ
Dario Amodei warns that AI interpretability is not keeping pace with rapidly advancing capabilities and calls for urgent investment to avoid dangerous blind spots. He highlights progress in mechanistic interpretability (mapping features and reasoning circuits in models like Claude) as a path to building an "MRI for AI" before highly autonomous systems emerge by 2026-2027.
AI's disruption of software jobs: new insights from Anthropic's Economic Index - blog post by Anthropic: READ
Anthropic's analysis shows developers increasingly use Claude, especially Claude Code, for automating coding tasks, with UI and web app work most affected. Startups lead adoption, and the trend may accelerate AI progress while reshaping software roles.
The Mechanics of Mafia - blog post by Peter Thiel: READ
Peter Thiel reflects on building the "PayPal Mafia" and argues that great company culture isn't built with perks but with deep alignment on mission and team. He emphasizes hiring people who genuinely want to work together on a unique problem, assigning each person one clear responsibility, and fostering strong internal bonds that resemble cult-like dedication, minus the crazy. The best startups, he says, aren't collections of talent but tightly knit tribes, fanatically right about something the world has missed.
Visa launches Intelligent Commerce: AI agents that shop for you
Visa unveiled Intelligent Commerce, a new AI-powered system of agents that can autonomously discover, shop, and buy on behalf of consumers, handling everything from product discovery to post-purchase support. The goal is to create a more personalized and secure shopping experience by streamlining the entire consumer journey with intelligent automation.
The Always-On Economy - blog post by Sequoia Capital: READ
In the next 5 to 7 years, AI won't just automate tasks; it will eliminate time constraints, ushering in an "always-on" economy where sectors like healthcare, security, education, and customer service operate 24/7. Hybrid human/AI systems will enhance access, efficiency, and global competition, with startups already leading the shift in areas like diagnostics, documentation, and support. Buhler, the post's author, argues this transition will redefine work patterns and business models, giving a massive edge to organizations that embrace continuous, AI-powered operations.
RESEARCH PAPERS
"Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory" - research paper by the Mem0 AI team: READ
Mem0 is a scalable memory architecture that enables LLMs to maintain long-term conversational coherence by dynamically extracting and retrieving key information, with a graph-based variant capturing complex relational structures. It outperforms six major baselines on the LOCOMO benchmark while reducing latency and token costs by over 90%, making it both more accurate and efficient for multi-session dialogue.
"Collaborating Action by Action: A Multi-agent LLM Framework for Embodied Reasoning" - research paper by University of California: READ
Researchers introduced MINDcraft, a platform for testing how LLM agents collaborate in the open-world game Minecraft, and MineCollab, a benchmark to evaluate embodied, multi-agent reasoning. Experiments show that current LLMs struggle with collaborative tasks, especially due to inefficiencies in natural language communication, which cause up to a 15% performance drop. The study highlights that today's LLM agents are not well-optimized for embodied collaboration and require approaches beyond in-context or imitation learning.
The Leaderboard Illusion - research paper by Cohere Labs: READ
Cohere Labs reveals flaws in Chatbot Arena, showing that private testing practices and selective score disclosures by major providers like Meta, Google, and OpenAI distort leaderboard fairness. Closed models are sampled more often and retain Arena presence longer, giving them disproportionate data access and leading to overfitting on Arena-specific dynamics rather than true model quality. The report calls for reforms to promote transparency and fairness in AI benchmarking.
Welcome to the Era of Experience - research paper by David Silver & Richard Sutton: READ
Silver and Sutton propose a shift in AI toward agents that learn primarily from experience, marking the dawn of a new era of superhuman capability. Rather than relying on static data, these next-gen systems will develop intelligence through interaction and continual learning, echoing the way humans learn over time.