hckrnws
matt_d
Mon Apr 21, 2014 4:05pm PST
Karma: 19297
submitted
Fri Mar 20, 2026 10:13pm PST
SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GPU Kernels
@matt_d
3
Fri Mar 20, 2026 9:43pm PST
Tony Hoare and His Imprint on Computer Science
@matt_d
1
1
6
Fri Mar 20, 2026 6:16pm PST
The End of Dijkstra's Algorithm? Breaking the Sorting Barrier for Shortest Paths [video]
@matt_d
2
Fri Mar 20, 2026 6:02pm PST
AlgoVeri: An Aligned Benchmark for Verified Code Gen. On Classical Algorithms
@matt_d
2
Fri Mar 20, 2026 5:58pm PST
Specy: Learning Specifications for Distributed Systems from Event Traces [pdf]
@matt_d
2
Thurs Mar 19, 2026 10:30pm PST
Generalized Dot-Product Attention: Tackling Real-World Challenges in GPU Kernels
@matt_d
1
Thurs Mar 19, 2026 10:29pm PST
M^2RNN: Non-Linear RNNs with Matrix-Valued States for Scalable Language Modeling
@matt_d
2
Thurs Mar 19, 2026 9:22pm PST
Tools of the Trade: C2C Activation Offloading on Grace Blackwell
@matt_d
1
Thurs Mar 19, 2026 9:01pm PST
EsoLang-Bench: Evaluating Genuine Reasoning in LLMs via Esoteric Languages
@matt_d
22
57
97
Thurs Mar 19, 2026 8:49pm PST
Speed-Of-Light ExecBench: A benchmark of real-world DL kernel problems
@matt_d
1
Thurs Mar 19, 2026 6:36pm PST
Equality Saturation and Symbolic Regression
@matt_d
2
Thurs Mar 19, 2026 4:42pm PST
NCCL EP: Towards a Unified Expert Parallel Communication API for NCCL
@matt_d
3
Thurs Mar 19, 2026 4:40pm PST
Vectorization of Verilog Designs and its Effects on Verification and Synthesis
@matt_d
3
Wed Mar 18, 2026 10:39pm PST
LATTE ’26: Workshop on Languages, Tools, and Techniques for Accelerator Design
@matt_d
2
Wed Mar 18, 2026 9:35pm PST
Read Less, Steer More
@matt_d
4
Wed Mar 18, 2026 9:13pm PST
The Data Structures of Roads
@matt_d
2
Wed Mar 18, 2026 4:32pm PST
Verifying Move Borrow Checker in Lean: An Experiment in AI-Assisted PL Metatheory
@matt_d
1
1
4
Wed Mar 18, 2026 5:12am PST
Real or Slop? – Programming Languages Papers Edition
@matt_d
2
2
6
Tues Mar 17, 2026 10:45pm PST
Mamba-3
@matt_d
11
50
279
Tues Mar 17, 2026 6:02pm PST
EvoX: Letting AI Evolve Its Own Evolution Process
@matt_d
1
Tues Mar 17, 2026 4:17pm PST
Native DSLs Ops in PyTorch
@matt_d
1
Tues Mar 17, 2026 5:38am PST
Flash-KMeans: Fast and Memory-Efficient Exact K-Means
@matt_d
9
14
182
Mon Mar 16, 2026 3:20pm PST
Gluon: Explicit Performance
@matt_d
22
Sun Mar 15, 2026 3:22pm PST
Block Number Formats are (Still!) Direction Preservers
@matt_d
2
Sat Mar 14, 2026 5:19pm PST
cuTile Rust: a safe, tile-based kernel programming DSL for Rust
@matt_d
4
Sat Mar 14, 2026 5:18pm PST
KernelBlaster: A framework for in context learning for code optimization
@matt_d
1
Sat Mar 14, 2026 2:33am PST
Demystifying and Improving Lazy Promotion in Cache Eviction [pdf]
@matt_d
1
Fri Mar 13, 2026 5:40pm PST
Journeying through Optimization with Heuristics [video]
@matt_d
2
Fri Mar 13, 2026 12:19am PST
To Sparsify or to Quantize: A Hardware Architecture View
@matt_d
1
1
2
Thurs Mar 12, 2026 11:55pm PST
Efficient sparse computations using linear algebra aware compilers (2025)
@matt_d
4
7
64
Thurs Mar 12, 2026 7:32pm PST
A Field Guide to Reward Hacking in AI Kernel Generation
@matt_d
1
1
2
Wed Mar 11, 2026 9:33pm PST
AI and the Mixed-Consistency Future
@matt_d
2
Wed Mar 11, 2026 8:57pm PST
FIDES: End-to-end Compartments for Mixed-language Systems [pdf]
@matt_d
3
Wed Mar 11, 2026 8:47pm PST
Practical Type Inference: High‑Throughput Recovery of Real‑World Types
@matt_d
1
Wed Mar 11, 2026 6:48pm PST
Idempotent Slices with Applications to Code-Size Reduction
@matt_d
2
Wed Mar 11, 2026 5:09pm PST
Designing AI Chip Hardware and Software
@matt_d
1
Wed Mar 11, 2026 4:17pm PST
Refinement Modeling and Verification of RISC-V Assembly Using Knuckledragger
@matt_d
11
Wed Mar 11, 2026 1:16am PST
Breaking Control Flow Integrity by Abusing Modern C++ (Coroutines) – BH USA 2025 [video]
@matt_d
2
Wed Mar 11, 2026 1:11am PST
Programming the Loop
@matt_d
2
Tues Mar 10, 2026 9:00pm PST
Scalable Training of Mixture-of-Experts Models with Megatron Core
@matt_d
2
Tues Mar 10, 2026 2:00pm PST
PolyBlocks: A Compiler Infrastructure for AI Chips and Programming Frameworks
@matt_d
3
Tues Mar 10, 2026 1:51pm PST
Formalizing Data Structures and Algorithms with Agents
@matt_d
3
Mon Mar 9, 2026 6:12pm PST
Thinnings: Sublist Witnesses and de Bruijn Index Shift Clumping
@matt_d
1
2
20
Mon Mar 9, 2026 1:53pm PST
Advent of Computing: Dan Temkin – Forty-Four Esolangs
@matt_d
2
Mon Mar 9, 2026 2:36am PST
Checking Write Bandwidth on GPUs
@matt_d
1
Mon Mar 9, 2026 2:23am PST
Challenges in Decompilation and Reverse Engineering of CUDA-Based Kernels [pdf]
@matt_d
5
Sat Mar 7, 2026 7:00pm PST
Block Number Formats Are Direction Preservers
@matt_d
8
Sat Mar 7, 2026 2:51pm PST
Cutie Fly: CuTe Layout Representation and Algebra, CuTeDSL, FlyDSL
@matt_d
2
Sat Mar 7, 2026 2:38am PST
Converting Binary Floating-Point Numbers to Shortest Decimal Strings
@matt_d
3
5
21
Fri Mar 6, 2026 3:47pm PST
Controlling Floating-Point Determinism in NVIDIA CCCL
@matt_d
3
Fri Mar 6, 2026 2:08pm PST
Bootstrapping Fuzzers for Compilers of Low-Resource Language Dialects Using LLMs
@matt_d
3
Thurs Mar 5, 2026 9:23pm PST
Custom Data Structures in E-Graphs
@matt_d
3
Thurs Mar 5, 2026 5:37pm PST
Formal Verification in the Age of AI
@matt_d
1
1
2
Wed Mar 4, 2026 4:19am PST
CuTe Layout Representation and Algebra
@matt_d
4
Tues Mar 3, 2026 8:16pm PST
Bespoke OLAP: Synthesizing Workload-Specific One-Size-Fits-One Database Engines
@matt_d
2
Tues Mar 3, 2026 8:01pm PST
SkyDiscover: A Flexible Framework for AI-Driven Sci. and Algorithmic Discovery
@matt_d
3
Tues Mar 3, 2026 6:50pm PST
Silent Backwards Compatibility Breaking Changes in PyTorch
@matt_d
1
1
4
Tues Mar 3, 2026 2:25am PST
Building an Open-Source Verilog Simulator with AI: 580K Lines in 43 Days
@matt_d
3
Mon Mar 2, 2026 4:55pm PST
AgentCgroup: Understanding and Controlling OS Resources of AI Agents
@matt_d
2
Mon Mar 2, 2026 1:51pm PST
Equality Saturation for Circuit Synthesis and Verification
@matt_d
2
Mon Mar 2, 2026 1:41pm PST
An Introduction to Folios
@matt_d
2
Sun Mar 1, 2026 7:26pm PST
Perplexity Cannot Always Tell Right from Wrong
@matt_d
2
Sun Mar 1, 2026 7:09pm PST
Ganak: The Making of a Versatile, High Performance Model Counter
@matt_d
1
Sun Mar 1, 2026 3:03am PST
TorchLean: Formalizing Neural Networks in Lean
@matt_d
3
20
104
Sun Mar 1, 2026 1:09am PST
Fast Autoscheduling for Sparse ML Frameworks
@matt_d
1
Sun Mar 1, 2026 1:05am PST
TENSURE: Fuzzing Sparse Tensor Compilers (Registered Report)
@matt_d
1
Sun Mar 1, 2026 12:58am PST
A Reinforcement Learning Environment for Automatic Code Optimization in MLIR
@matt_d
1
Fri Feb 27, 2026 8:30pm PST
Metamorphic Testing for Infrastructure-as-Code Engines [pdf]
@matt_d
2
Thurs Feb 26, 2026 10:38pm PST
K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model
@matt_d
1
1
2
Thurs Feb 26, 2026 10:29pm PST
Midtraining Bridges Pretraining and Posttraining Distributions
@matt_d
1
Thurs Feb 26, 2026 7:19pm PST
Testing "Raw" GPU Cache Latency
@matt_d
2
Wed Feb 25, 2026 7:04pm PST
Hexagon-MLIR: An AI Compilation Stack for Qualcomm's NPUs
@matt_d
1
1
3
Wed Feb 25, 2026 5:31pm PST
Analyzing Latency Hiding and Parallelism in an MLIR-Based AI Kernel Compiler
@matt_d
1
Wed Feb 25, 2026 12:14am PST
Argus: Automated Discovery of Test Oracles for DBMSs Using LLMs
@matt_d
1
Wed Feb 25, 2026 12:09am PST
A Decade of Docker Containers
@matt_d
4
Tues Feb 24, 2026 11:11pm PST
In Pursuit of High-Fidelity GPU Kernel Benchmarking
@matt_d
1
Tues Feb 24, 2026 4:52am PST
From ASPLOS to Orbit: Unikernels Twelve Years Later
@matt_d
1
3
Mon Feb 23, 2026 11:05pm PST
VeriSoftBench: Repository-Scale Formal Verification Benchmarks for Lean
@matt_d
2
Sun Feb 22, 2026 9:40pm PST
CSLib: The Lean Computer Science Library
@matt_d
2
Sun Feb 22, 2026 7:08pm PST
Heliostat: Harnessing Ray Tracing Accelerators for Page Table Walks – ISCA 2025 [video]
@matt_d
1
1
2
Sat Feb 21, 2026 6:14am PST
LDOS: Toward a Learning-Directed Operating System
@matt_d
3
Sat Feb 21, 2026 2:42am PST
GenAI for Systems: Recurring Challenges & Design Principles from SW to Silicon
@matt_d
2
Sat Feb 21, 2026 1:18am PST
Precise exceptions in relaxed architectures [video]
@matt_d
1
1
2
Fri Feb 20, 2026 8:40pm PST
BitFields API: Type-Safe Bit Packing for Lock-Free Data Structures
@matt_d
1
Fri Feb 20, 2026 6:06pm PST
ThunderKittens 2.0: Even Faster Kernels for Your GPUs
@matt_d
2
2
3
Fri Feb 20, 2026 6:14am PST
Proof Assistants in the Age of AI
@matt_d
1
Fri Feb 20, 2026 2:07am PST
Open Source Software Projects Are Brands
@matt_d
2
Fri Feb 20, 2026 1:24am PST
Evaluating the Hardest CS Problems in the Age of LLMs
@matt_d
1
Fri Feb 20, 2026 12:00am PST
SE Radio 708: Jens Gustedt on C in 2026
@matt_d
1
1
16
Thurs Feb 19, 2026 11:14pm PST
Spaghetti Bench: Evaluating AI Agents on Concurrency Bug Fixes
@matt_d
1
Thurs Feb 19, 2026 10:24pm PST
Computer Science as Infrastructure: The Spine of the Lean CSLib
@matt_d
2
Thurs Feb 19, 2026 4:56pm PST
Problems with a weak tryLock operation in C and C++ standards
@matt_d
2
Thurs Feb 19, 2026 5:57am PST
Two mechanisms for dynamic type checks
@matt_d
2
Thurs Feb 19, 2026 3:12am PST
Semantics, Operations, and Properties of P3109 Floating-Point Formats in Lean
@matt_d
1
Wed Feb 18, 2026 10:53pm PST
Oral History of Michael J. Flynn [video]
@matt_d
3
Wed Feb 18, 2026 10:22pm PST
Productively Programming Accelerated Computing Systems – Rohan Yadav (Stanford) [video]
@matt_d
5