What's New in AI, April 4, 2026
NotebookLM-generated technical brief, April 4, 2026
Google Released Gemma 4, Four Open-Weight Models Under Apache 2.0
Google released Gemma 4 on April 3, a family of four open-weight models built on Gemini 3 research. [1] The lineup: 2B dense, 4B dense, 26B mixture-of-experts, and 31B dense. All four carry an Apache 2.0 license and include native vision and audio processing across 140-plus languages. The larger models (26B and 31B) support 256K context windows, while the smaller 2B and 4B support 128K. The 2B and 4B models are designed to run on mobile devices and consumer laptop GPUs. [2] The Gemma family has crossed 400 million downloads across all generations. [3]
Apache 2.0 means commercial use without a separate usage agreement. You can build and ship products on top of these models, fine-tune them, and redistribute derivative weights. The multimodal capability at the 2B size is worth noting specifically: vision and audio in a model that fits on a phone is a different class of capability from what was available twelve months ago.
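To see why a 2B-parameter model fits on a phone, a back-of-envelope memory calculation helps. The bytes-per-parameter figures below are standard approximations for common quantization formats, not Gemma-specific numbers, and the sketch counts only the weights (activations and KV cache add more on top):

```python
# Rough memory footprint of a 2B-parameter model's weights at common
# quantization levels. Illustrative only: generic byte-per-parameter
# approximations, not published Gemma 4 figures.

PARAMS = 2e9  # 2 billion parameters

BYTES_PER_PARAM = {
    "fp16/bf16": 2.0,  # 16-bit floats
    "int8": 1.0,       # 8-bit quantization
    "int4": 0.5,       # 4-bit quantization
}

def weights_gb(params: float, bytes_per_param: float) -> float:
    """Approximate size of the weights alone, in gibibytes."""
    return params * bytes_per_param / 1024**3

for fmt, bpp in BYTES_PER_PARAM.items():
    print(f"{fmt:>10}: ~{weights_gb(PARAMS, bpp):.1f} GB")
```

At 4-bit quantization the weights come in under 1 GB, which is comfortably inside the RAM budget of a modern phone; even unquantized 16-bit weights fit on an 8 GB consumer laptop GPU.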
My Take
The small models are the actual story. A 2B-parameter model with vision, audio, and 128K context that runs on consumer hardware is a real tool, not a benchmark result. For anyone building AI products independently, local inference on Apache 2.0 weights means cutting API costs, keeping data off cloud infrastructure, and owning your stack without a usage contract. The 400 million download count is not a vanity metric. Developers are shipping on Gemma in production. Better open-weight models at the small end are good for independent builders, and this is a meaningful step in that direction.
Oracle Is Cutting Up to 30,000 Workers to Fund $156 Billion in AI Infrastructure
Oracle terminated between 20,000 and 30,000 employees in late March, approximately 18% of its global workforce. [4] Workers were notified by email at 6am with no prior warning. The company expects the headcount reduction to free $8 to $10 billion annually, which it plans to redirect entirely toward AI data center construction. Oracle has committed $156 billion in AI infrastructure investment over the next several years. [5]
In the same period, Oracle filed over 3,100 H-1B visa petitions. [6] The company reduced its workforce by tens of thousands while simultaneously applying for thousands of foreign worker visas. Both moves were framed under the same AI investment rationale.
My Take
A 6am email to 30,000 people with no warning is not an investment strategy. It is a cost extraction event. The simultaneous H-1B filings make the stated rationale even harder to accept at face value. Oracle is not cutting headcount because AI has made these roles obsolete. Oracle is cutting headcount to free capital for infrastructure, and AI is the narrative frame around it. The workers who built Oracle's revenue base are funding the company's next bet. Executive compensation will not reflect the same sacrifice. Worth calling it plainly: this is corporate cost savings being sold as a technology pivot.
DeepSeek V4 Is Being Built on Huawei Chips Only; NVIDIA and AMD Were Denied Access
DeepSeek's upcoming V4 model is being developed exclusively on Huawei Ascend hardware. According to multiple reports, DeepSeek denied NVIDIA and AMD early access during development, giving Huawei and domestic Chinese chipmakers a head start on optimization. [7] V4 is a trillion-parameter mixture-of-experts architecture with a 1 million-token context window. DeepSeek plans to release it under Apache 2.0 at launch. [8] [9]
This would make V4 the first frontier-class open-source model developed without NVIDIA infrastructure. US export controls have restricted Huawei's access to advanced chips from American manufacturers since 2020. Chinese AI labs have been building on domestic hardware under that pressure ever since. DeepSeek V4 is the first publicly confirmed case of a frontier model being trained on Huawei Ascend at this parameter scale.
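The "trillion-parameter" headline overstates the per-token compute, because in a mixture-of-experts model only the routed experts run for each token. DeepSeek has not published V4's architecture details, so every number in the sketch below (shared-layer fraction, expert count, experts routed per token) is an illustrative assumption, not a V4 spec:

```python
# Why a trillion-parameter MoE is far cheaper per token than a
# trillion-parameter dense model: only k of n experts are active.
# All figures below are hypothetical; V4's architecture is unpublished.

TOTAL_PARAMS = 1e12      # 1T total parameters (the headline figure)
SHARED_FRACTION = 0.10   # assumed share in attention/shared layers
N_EXPERTS = 256          # assumed experts per MoE layer
ACTIVE_EXPERTS = 8       # assumed experts routed per token

def active_params(total, shared_frac, n_experts, k_active):
    """Parameters actually used per token: the always-on shared
    layers plus the k routed experts' slice of the expert pool."""
    shared = total * shared_frac
    expert_pool = total - shared
    return shared + expert_pool * (k_active / n_experts)

a = active_params(TOTAL_PARAMS, SHARED_FRACTION, N_EXPERTS, ACTIVE_EXPERTS)
print(f"active per token: ~{a / 1e9:.0f}B of {TOTAL_PARAMS / 1e9:.0f}B total "
      f"({a / TOTAL_PARAMS:.0%})")
```

Under these assumed numbers, roughly 128B of the 1T parameters are active per token, which is why a model at this total scale can be trainable and servable on hardware that could not run a dense trillion-parameter model.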
My Take
Export controls are producing two effects simultaneously. They limit access to high-end inference hardware for Chinese labs in the short term. They accelerate domestic Chinese chip investment in the medium term. If V4 ships at the described scale and performs competitively on Huawei Ascend, that is direct evidence the containment strategy has a ceiling. The Apache 2.0 license is separately good for the open-source ecosystem. The hardware exclusion is a geopolitical data point. Worth tracking both threads as this one develops.
Sources
\${sourcesHtml}
Originally published on chento.io