OLMo-core
PyTorch building blocks for the OLMo ecosystem providing foundational training scripts, inference utilities, and model implementations for the OLMo family of open language models.
View repo ↗

Top AI News Weekly
Gemma 4 and Qwen 3.6 lead frontier models alongside Qwen 3.5 Omni and GLM-5V-Turbo. Agent ecosystem explodes with Claude Code Agent Teams, gstack, Superpowers, LangGraph, and 15+ agent frameworks. Research highlights include TRELLIS 2 for 3D generation, VOID video inpainting, DreamLite on-device diffusion, and OmniVoice 600-language TTS. Context-1 and Bonsai-8B push efficient inference boundaries.
61 launches and research drops that matter for enterprise AI builders—curated, tagged, and ready for your next roadmap sync.
New drops: 61
Unique sources: 56
Key themes: Research · Developer · Agents
New reasoning systems, world models, and alignment papers.
Enterprise-grade 4B-parameter vision-language model for document data extraction including charts, tables, and key-value pairs from images into structured formats like CSV, JSON, and HTML.
View model ↗

20B-parameter MoE model for agentic retrieval that decomposes complex queries into subqueries, iteratively searches corpora, and self-prunes irrelevant documents at 10x lower cost.
View model ↗

Open-source foundation model for financial market prediction that tokenizes K-line candlestick data and uses autoregressive Transformers for forecasting, trained on data from 45+ global exchanges.
View repo ↗

End-to-end 1-bit quantized 8B-parameter language model achieving 14x size reduction (1.15 GB) and 6x faster inference while maintaining competitive benchmark scores with full-precision models.
View model ↗

200M-parameter pretrained foundation model for time-series forecasting supporting 16K context length and continuous quantile forecasting up to 1K horizons.
View repo ↗

Latest release in Alibaba's Qwen open-weight model family with improvements to reasoning and agentic capabilities, continuing the series after Qwen 3.5.
View release ↗

Google's most advanced open-source language models purpose-built for advanced reasoning and agentic workflows, delivering exceptional performance-per-byte efficiency.
View release ↗

2.3B effective parameter multimodal model processing text, images, and audio with 128K context window, reasoning modes, and native function-calling optimized for on-device deployment.
View model ↗

End-to-end omni-modal model processing and generating text, images, audio, and video with real-time streaming voice and video interaction and state-of-the-art cross-modal performance.
View release ↗

Multimodal vision-language model for vision-based coding tasks, excelling at frontend code generation from design mockups, GUI automation, and agentic workflows requiring visual understanding.
View release ↗

Video, audio, and physics-native generation techniques.
Large-scale 3D asset generation model creating high-quality 3D assets from text or image prompts in multiple formats including Radiance Fields, 3D Gaussians, and meshes with models up to 2B parameters.
View model ↗

Video object removal framework performing physically-plausible inpainting using a vision-language model and video diffusion model for consistent results in complex interaction scenarios.
View project ↗

Embodied agents learning to act in complex worlds.
Low-level orchestration framework for building, managing, and deploying long-running stateful AI agents with durable execution, human oversight, and persistent memory.
View repo ↗

Lightweight vectorless personal memory system that captures your entire digital footprint locally and makes it queryable by CLI agents like Claude Code, Cursor, and others.
View repo ↗

Terminal-Bench 2.0 competition artifact extending Terminus-KIRA with environment snapshot injection, achieving 76.4% accuracy by reducing early exploration steps in agentic coding tasks.
View repo ↗

Open-source autonomous coding harness built and maintained by AI agents, orchestrating parallel coding sessions, testing, documentation, and workflow automation through modular tools.
View repo ↗

Production-ready automation workflows for marketing and sales including experiment engines, lead qualification, content scoring, and SEO intelligence for Claude Code and other AI agents.
View repo ↗

Node.js server and React UI for coordinating multiple AI agents to run autonomous businesses with org charts, task management, budget enforcement, and multi-agent orchestration.
View repo ↗

Multi-tenant AI agent platform in Go supporting 20+ LLM providers, 7 messaging channels, multi-agent orchestration with task delegation, and built-in prompt injection detection.
View repo ↗

Experimental feature for orchestrating multiple Claude Code instances as a coordinated team with shared task list, inter-agent messaging, and lead/teammate architecture via tmux.
View release ↗

Open-source toolkit transforming Claude Code into a virtual engineering team with 23 specialized AI agents covering CEO, designer, engineer, QA, and security roles with sprint-based workflows.
View repo ↗

Complete software development workflow for coding agents built on composable skills including brainstorming, TDD, planning, subagent-driven development, and git worktrees.
View repo ↗

Self-evolving skill platform enabling AI agents to automatically create, refine, and share reusable skills across a community, reducing token consumption by 46% while improving task performance.
View repo ↗

Multi-model orchestration system where Claude Code coordinates task routing between Gemini and Codex with 29+ slash commands, patch-based security, and parallel agent execution.
View repo ↗

Fully local multi-agent simulation engine generating hundreds of AI agents with unique personalities to simulate social media reactions, sentiment evolution, and opinion dynamics without cloud APIs.
View repo ↗

Curated collection of 1,060+ real-world agent skills from engineering teams at Anthropic, Google, Vercel, Stripe, and others for extending Claude Code, Codex, and Gemini CLI.
View repo ↗

Autonomous agent engineering framework allowing a meta-agent to iteratively improve an agent harness overnight by modifying prompts, tools, and configuration while hill-climbing on benchmark scores.
View repo ↗

Frameworks, playbooks, and OSS repos.
Pure JavaScript/TypeScript library for measuring and laying out multiline text without DOM interactions, avoiding layout reflow by implementing its own text measurement using the browser's font engine.
View repo ↗

Leading open-source data integration platform for ETL/ELT pipelines with 600+ connectors to move data from APIs, databases, and files to warehouses and lakehouses.
View repo ↗

Open-source business intelligence platform that lets anyone query, visualize, and analyze data without SQL expertise, now featuring AI-powered analytics via Metabot.
View repo ↗

Full TypeScript source code (~512K lines) of Anthropic's CLI agent tool for terminal-based coding, exposed via an npm source map in March 2026.
View repo ↗

Workflow layer for OpenAI Codex CLI that adds canonical skills like deep-interview, planning, and team coordination with persistent project state management.
View repo ↗

Curated collection of 55+ DESIGN.md files from popular websites, enabling AI coding agents to generate consistent pixel-perfect UI with design tokens, color palettes, and component styles.
View repo ↗

Adaptive web scraping framework whose parser learns from website changes and automatically relocates elements. Includes stealth fetchers bypassing anti-bot systems and a built-in MCP server for AI integration.
View repo ↗

AI-powered social media management platform providing cloud-based virtual phones for managing multiple accounts across mobile platforms without physical devices.
View release ↗

Real-time screen translator for PC games capturing on-screen text via multiple OCR engines with ML-based scoring. Supports translation to 30+ languages through DeepL, Google Translate, and others.
View repo ↗

Animated AI companions living on the macOS dock as transparent video characters. Click one to open an AI terminal supporting Claude Code, Codex, Copilot, and Gemini CLIs with thinking bubbles.
View repo ↗

Free open-source desktop app for creating product demos and screen recordings with automatic zoom effects, dual audio capture, annotations, motion blur, and variable speed control.
View repo ↗

Privacy-focused open-source chat interface supporting 50+ AI models via OpenRouter with parallel multi-model evaluation, input perturbation for red-teaming, and semantic output transformation.
View repo ↗

Agent skill converting complex terminal output into styled HTML pages and slide decks with interactive Mermaid diagrams, diff reviews, and responsive layouts for AI coding agents.
View repo ↗

AI code review tool that automatically analyzes pull requests with full codebase understanding, identifies bugs and security issues, and provides inline comments with diagrams.
View release ↗

Professional ComfyUI custom node pack for end-to-end PBR texture generation, enabling generation, refinement, upscaling, and visualization of physically-based rendering maps entirely within ComfyUI.
View repo ↗

Automated graphic design system mimicking human designer workflows by autonomously collecting theme-related assets and executing design tool operations to produce professional PSD files.
View release ↗