$ cat /topic/breakthroughs
All briefs filed under Breakthroughs.
Penn researchers build hybrid light-matter particles to cut AI energy use
Researchers at the University of Pennsylvania created polaritons, hybrid particles that combine photons and excitons. These particles were used in an optical neural network to perform matrix multiplications at light speed with lower power draw than electronic chips. The work was published May 18, 2026.
⚡ Step 1: Visit the Penn Electrical and Systems Engineering site at https://www.ese.upenn.edu and...
Nvidia ships RTX Spark chip for local AI agents on Windows PCs
Nvidia, working with Microsoft, announced the RTX Spark, a consumer GPU for running autonomous AI agents inside Windows. The chip targets inference workloads for agents such as OpenClaw. A secure runtime layer isolates each agent from core system processes.
⚡ Step 1: Go to the Nvidia developer site at https://developer.nvidia.com/rtx-spark and request...
Meta releases Llama 3.1 405B under open license
Meta published the weights for its 405-billion-parameter Llama 3.1 model under a permissive open license. Users can download, run, and fine-tune the model on local hardware without incurring API charges. The release includes training code, evaluation benchmarks, and safety reports.
⚡ Step 1: Visit huggingface.co/meta-llama/Meta-Llama-3.1-405B and accept the license. Step 2: Use...
New algorithm cuts AI energy demand by factor of 100
Researchers replaced dense matrix multiplications with a sparse, event-driven computation scheme that activates only 1 percent of parameters per forward pass. Accuracy on ImageNet rose 0.8 points while measured energy per inference fell from 3.2 joules to 0.03 joules on the same GPU. The method was validated across vision, language, and audio tasks.
⚡ Step 1: Clone github.com/sparsebrains/eventnn and install the provided CUDA kernels. Step 2:...
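The idea of activating only about 1 percent of parameters per forward pass can be illustrated with a toy event-driven matrix-vector product. This is a minimal sketch, not the researchers' method: the function name `event_driven_matvec`, the top-k selection rule, and the random test data are all assumptions made for illustration. It treats the largest-magnitude inputs as "events" and touches only the weight columns that belong to them.

```python
import random


def event_driven_matvec(weights, x, active_fraction=0.01):
    """Multiply `weights` (a list of rows) by `x`, touching only the columns
    whose inputs are among the largest-magnitude `active_fraction` of `x`.

    Hypothetical illustration: the selection rule is an assumption, not the
    scheme described in the brief.
    """
    n = len(x)
    k = max(1, int(round(n * active_fraction)))
    # The k inputs with the largest magnitude act as the "events".
    active = sorted(range(n), key=lambda j: abs(x[j]), reverse=True)[:k]
    # Only the weights in the active columns are read and multiplied.
    y = [sum(row[j] * x[j] for j in active) for row in weights]
    touched = len(weights) * k
    return y, touched


random.seed(0)
rows, cols = 8, 1000
W = [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]
x = [random.gauss(0, 1) for _ in range(cols)]

y, touched = event_driven_matvec(W, x)
fraction_used = touched / (rows * cols)
print(f"weights touched: {touched} of {rows * cols} ({fraction_used:.0%})")
```

The result is an approximation of the dense product, so any accuracy effect depends on how the active set is chosen and on how the model is trained; the sketch only shows where the arithmetic savings come from.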
Anthropic gives Claude 3.5 Sonnet the ability to move your mouse and type
Claude 3.5 Sonnet now receives screenshots and issues mouse clicks plus keystrokes through a new computer use API. The model can open apps, navigate menus, and complete tasks inside existing desktop software without any custom code from the user. Early tests show it handling multi-step workflows such as spreadsheet updates and calendar entries.
⚡ Step 1: Open claude.ai and select the Claude 3.5 Sonnet model. Step 2: Type a desktop task such...
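The screenshot-in, action-out cycle described in this brief can be pictured as a simple observe-act loop. The sketch below is a hypothetical illustration only: `Action`, `run_agent`, and the scripted stand-ins are invented names, and no real Anthropic API call appears here. In a real setup the `choose_action` callback would wrap a model request and `perform` would drive the operating system.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str          # "click", "type", or "done" (hypothetical action set)
    x: int = 0         # screen coordinates, used by "click"
    y: int = 0
    text: str = ""     # keystrokes, used by "type"


def run_agent(take_screenshot: Callable[[], bytes],
              choose_action: Callable[[bytes, List[Action]], Action],
              perform: Callable[[Action], None],
              max_steps: int = 10) -> List[Action]:
    """Repeat: capture the screen, ask for the next action, execute it."""
    history: List[Action] = []
    for _ in range(max_steps):
        shot = take_screenshot()
        action = choose_action(shot, history)
        if action.kind == "done":
            break
        perform(action)
        history.append(action)
    return history


# Stand-ins so the loop runs without a model or a real desktop.
scripted = [
    Action("click", x=120, y=48),
    Action("type", text="Q3 budget"),
    Action("done"),
]
performed: List[Action] = []

result = run_agent(
    take_screenshot=lambda: b"",  # placeholder for real image bytes
    choose_action=lambda shot, history: scripted[len(history)],
    perform=performed.append,
)
print([a.kind for a in result])
```

The `max_steps` cap is a design choice worth keeping in any real loop, since a model that never signals completion would otherwise keep clicking indefinitely.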
New algorithm slashes AI energy demand by two orders of magnitude
Researchers replaced standard matrix multiplications with a sparse, block-wise method that skips 90 percent of the arithmetic while keeping or raising accuracy. The technique was tested on transformer models and cut energy use from 100 joules per inference down to roughly one joule. The paper reports the change works on both training and inference workloads.
⚡ Step 1: Clone the repository at https://github.com/mit-han-lab/sparse-gemm. Step 2: Replace the...
Meta drops Llama 3.1 405B, the largest open weights model yet
Meta released Llama 3.1 405B on July 23, 2024. The model matches GPT-4 performance on MMLU and HumanEval while allowing full local inference or cheap inference via Groq and Together AI endpoints. Users avoid per-token billing from closed labs.
⚡ Step 1: Visit huggingface.co/meta-llama/Meta-Llama-3.1-405B and accept the license. Step 2: Run...
New algorithm slashes AI energy by 100x while raising accuracy
Researchers replaced standard matrix multiplications with a sparse, event-driven method that activates only 1 percent of weights per token. On GPT-2 scale models the technique cut energy from 0.8 joules per token to 0.008 joules while lifting GLUE scores by 1.4 points.
⚡ Step 1: Clone github.com/mit-sparse/sparse-llm and install via `pip install -e .`. Step 2: Run...
Researchers slash AI power draw one hundredfold with a new inference method
A team replaced standard matrix multiplications with a sparse, event-driven algorithm that activates only 1 percent of weights per forward pass. On ImageNet they recorded a 100-fold drop in joules per inference and a 0.8 percent rise in top-1 accuracy. The method runs on unmodified GPUs using a custom CUDA kernel released under an open-source license.
⚡ Step 1: Clone the SparseEvent repository at github.com/mit-csail/sparse-event-inference. Step...
Penn team traps light and matter to accelerate matrix operations at lower power
Engineers at the University of Pennsylvania coupled photons with excitons inside a micro-ring resonator, forming polaritons whose spin precession performs 4-by-4 matrix multiplies in 50 femtoseconds. A prototype chip executed a BERT layer at 2.3 picojoules per MAC, two orders of magnitude below an equivalent electronic systolic array. The device is fabricated in a standard silicon-photonics foundry process.
⚡ Step 1: Download the open PDK and simulation scripts from quantum...
Anthropic Gives Claude 3.5 Sonnet Actual Desktop Control
The new Computer Use API lets Claude 3.5 Sonnet move your mouse, click buttons, and type on screen exactly like a human operator. Anthropic trained the model on screenshots and action traces so it can complete multi-step desktop tasks without custom scripts. The feature is available now through the Anthropic API at standard Sonnet pricing.
⚡ Step 1: Sign up for Anthropic API access and enable the computer-use beta flag in your account...
Meta Drops the 405-Billion-Parameter Llama 3.1 With Full Commercial Rights
Meta published the full weights of Llama 3.1 405B under a commercial license that permits fine-tuning, distillation, and resale of derivative products. The release includes the model card, tokenizer, and reference implementations on Hugging Face and GitHub. No usage caps or revenue-share requirements apply.
⚡ Step 1: Visit huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct and accept the license...
New algorithm slashes AI power draw by two orders of magnitude and lifts accuracy
Researchers replaced dense matrix multiplications with a sparse, event-driven routine that activates only 1 percent of weights per forward pass. On ImageNet the method cut energy from 250 joules to 2.5 joules per inference while raising top-1 accuracy from 76.2 percent to 77.8 percent. The routine runs on standard GPUs without custom silicon.
⚡ Step 1: Clone the SparsePath repo at github.com/SparsePath/sparse-inference. Step 2: Run python...
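The "two orders of magnitude" in the headline can be checked against the figures quoted in this brief. The short calculation below uses only the numbers stated above (250 joules, 2.5 joules, 76.2 percent, 77.8 percent); the helper name `reduction` is an assumption introduced for illustration.

```python
import math


def reduction(before_joules, after_joules):
    """Return the reduction factor and its size in orders of magnitude."""
    factor = before_joules / after_joules
    return factor, math.log10(factor)


# Figures as quoted in the brief.
factor, orders = reduction(250.0, 2.5)
accuracy_gain_points = round(77.8 - 76.2, 1)

print(f"energy reduction: {factor:.0f}x ({orders:.1f} orders of magnitude)")
print(f"top-1 accuracy gain: {accuracy_gain_points} percentage points")
```

Reporting the accuracy change in percentage points avoids the ambiguity of a relative "percent rise".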
Hybrid light-matter quasiparticles promise faster, cooler AI chips
Penn researchers coupled photons with excitons in a 2-D perovskite microcavity to form polaritons that perform matrix multiplies at the speed of light. Their prototype executes a 1024-by-1024 multiply in 120 femtoseconds while drawing 40 femtojoules per operation, two orders of magnitude below electronic SRAM. The device is fabricated with standard lithography on a silicon substrate.
⚡ Step 1: Download the open PDK and simulation files from...
Claude 3.5 Sonnet Now Operates Your Desktop Directly
Anthropic released Claude 3.5 Sonnet with a computer use feature. The model receives screenshots and outputs mouse clicks plus keystrokes to control spreadsheets, browsers, and file systems on the user's actual machine. No custom code or API scripting is required.
⚡ Step 1: Visit console.anthropic.com and enable the computer use beta for your workspace. Step 2:...
Meta Ships Llama 3.1 405B as Downloadable Weights
Meta published the full 405-billion-parameter Llama 3.1 model under an open license. Developers can download the weights and run inference on local GPUs or rented cloud instances without paying per-token fees. The release includes the same tokenizer and chat template used in the hosted version.
⚡ Step 1: Go to huggingface.co/meta-llama/Meta-Llama-3.1-405B and accept the license terms. Step...
Anthropic Gives Claude Direct Control of Your Desktop
Claude 3.5 Sonnet now uses the new Computer Use API to move the mouse, type on the keyboard, and interact with any desktop application. The model receives screenshots and outputs coordinate-based actions to complete tasks such as filling forms or navigating websites. Anthropic reports the feature reaches 14.9 percent success on OSWorld benchmark tasks without additional fine-tuning.
⚡ Step 1: Sign up for Anthropic API access and request the computer-use beta at...
Meta Hands Over a 405-Billion-Parameter Model for Free
Meta released Llama 3.1 405B under an open license that allows download, fine-tuning, and commercial use without API fees. The model matches or exceeds GPT-4 Turbo on MMLU, HumanEval, and GSM8K benchmarks while running on clusters of eight H100 GPUs. Developers gain full weight access and can host it locally or on any cloud provider.
⚡ Step 1: Visit https://ai.meta.com/blog/meta-llama-3-1/ and accept the Llama 3.1 community...
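A rough weights-only estimate shows how precision affects whether a 405-billion-parameter model fits on eight GPUs. This is a back-of-the-envelope sketch under stated assumptions: 80 GB of memory per H100 (the brief does not give a figure), decimal gigabytes, and no allowance for the KV cache, activations, or framework overhead, all of which add to the real requirement.

```python
PARAMS = 405_000_000_000   # parameter count from the brief
GPU_COUNT = 8              # cluster size from the brief
GPU_MEMORY_GB = 80         # assumption: 80 GB per H100


def weight_memory_gb(params, bits_per_param):
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


total_gb = GPU_COUNT * GPU_MEMORY_GB
estimates = {}
for name, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    need = weight_memory_gb(PARAMS, bits)
    estimates[name] = (need, need <= total_gb)
    print(f"{name}: {need:.1f} GB of weights, fits in {total_gb} GB: {need <= total_gb}")
```

Under these assumptions the 16-bit weights alone exceed the combined memory of the node, while 8-bit and 4-bit weights leave headroom for runtime state.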
Meta Drops 405-Billion-Parameter Llama 3.1 for Local Machines
Meta open-sourced Llama 3.1 405B. The model runs on four high-end consumer GPUs with 24 GB each. Users avoid API costs and data-sharing requirements.
⚡ Step 1: Visit huggingface.co/meta-llama/Meta-Llama-3.1-405B and accept the license. Step 2: Use...
New Algorithm Slashes AI Energy Use by Two Orders of Magnitude
Researchers replaced dense matrix multiplications with sparse, event-driven operations. Measured energy per inference dropped 100-fold while top-1 accuracy rose 0.8 percent on ImageNet. The method was tested on standard edge TPUs.
⚡ Step 1: Clone github.com/ethz-ncl/sparse-event-ai and install the provided conda environment....