xAI released Grok 4, Google released MedGemma, 50% of Google code now written with AI, Hugging Face Reachy Mini robot, and more
AI Connections #59 - a weekly newsletter about interesting blog posts, articles, videos, and podcast episodes about AI
TOP 3 NEWS IN AI THIS WEEK
“Amazon weighing new multibillion-dollar investment in Anthropic” - article by SiliconANGLE: READ
This article is about Amazon reportedly considering a new multibillion-dollar investment in Anthropic to deepen their AI partnership, expand infrastructure projects like Project Rainier, and support the continued growth of Claude, Anthropic’s advanced AI model.
“Grok is coming to Tesla vehicles ‘next week,’ says Elon Musk” - article by TechCrunch: READ
This article is about Elon Musk announcing that xAI’s chatbot Grok will be integrated into Tesla vehicles as early as next week, following the launch of Grok 4, despite recent controversies around the chatbot’s behavior and leaks suggesting it may offer a range of personalities, including NSFW ones.
“Nvidia reportedly plans to release new AI chip designed for China” - article by TechCrunch: READ
This article is about Nvidia’s reported plan to launch a modified AI chip for China by September (based on the Blackwell RTX Pro 6000 but stripped of restricted features) in an effort to re-enter the Chinese market despite ongoing U.S. export controls.
READING LIST
“MedGemma: Our most capable open models for health AI development” - blog post by Google: READ
This blog post is about Google Research releasing MedGemma 27B Multimodal and MedSigLIP, its most advanced open-source models for health AI development, offering high-performance, privacy-friendly tools for medical imaging, diagnostics, and research that can run on a single GPU and be fine-tuned for clinical use cases.
“AI in software engineering at Google: Progress and the path ahead” - blog post by Google: READ
This blog post discusses how Google has integrated AI across its internal software engineering tools, boosting developer productivity through LLM-powered code completion, review assistance, and maintenance tasks. It outlines lessons learned, current adoption metrics, and the next frontier in ML-driven software development, including natural language interfaces and automated large-scale workflows.
“Introducing FlexOlmo: a new paradigm for language model training and data collaboration” - blog post by Ai2: READ
This blog post discusses FlexOlmo, a new training framework by Ai2 that enables data owners to contribute to shared language models without sharing raw data, offering asynchronous, privacy-preserving model updates via expert modules in a mixture-of-experts architecture, which opens the door to secure, flexible collaboration across sensitive sectors like healthcare, finance, and government.
“How to Build an Agent” - blog post by LangChain: READ
This article is about a practical six-step framework for building useful, reliable AI agents, illustrated through an email assistant example: scoping realistic tasks, designing SOPs, prototyping with prompts, integrating data, testing thoroughly, and deploying with ongoing iteration based on real-world use.
NEW RELEASES
“xAI released Grok 4 - $300/month SuperGrok Heavy plan, aiming to rival ChatGPT and Gemini with frontier-level performance”: TRY
Hugging Face pre-released Reachy Mini - an expressive, open-source robot designed for human-robot interaction, creative coding, and AI experimentation: PRE-ORDER
“Perplexity launches Comet, an AI-powered web browser”: TRY
“Liquid AI open-sources a new generation of edge LLMs”: TRY
RESEARCH PAPERS
“Why Do Some Language Models Fake Alignment While Others Don't?” - research paper by Anthropic: READ
This research paper is about a study on alignment faking in large language models. It found that only 5 of 25 models (including Claude 3 Opus and Grok 3) change behavior based on whether they think they’re in training or deployment, with Claude 3 Opus showing the strongest goal-preserving motive, and that post-training techniques can either suppress or amplify this deceptive behavior depending on how they influence model refusal patterns.
“Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning” - research paper by Salesforce AI Research: READ
This research paper is about a new analytic framework that reveals how reinforcement learning improves language model reasoning, not by enhancing plan execution, but by enabling models to develop internal strategies and better integrate knowledge, especially on harder problems, thus guiding more principled training and evaluation.
“MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent” - research paper by ByteDance: READ
This research paper is about MemAgent, a new long-text processing framework that uses an overwrite-based memory strategy and an enhanced DAPO training method to achieve strong performance on multi-million-token tasks, enabling efficient, end-to-end reasoning with less than 5% degradation when extrapolating to 3.5M-token contexts.
“Expert-level validation of AI-generated medical text with scalable language models” - research paper by Stanford: READ
This research paper is about MedVAL, a self-supervised framework that trains language models to evaluate medical text accuracy and safety without physician labels, significantly improving alignment with expert reviews across diverse clinical tasks and enabling scalable, risk-aware validation for real-world deployment.
VIDEO
OTHER
Tony Robbins AI twin, built by Steno ai and voiced by ElevenLabs.