matt_d
Mon Apr 21, 2014 4:05pm PST
Karma: 18882
submitted
Fri Feb 6, 2026 9:35pm PST
European Lisp Symposium 2025: Talks
@matt_d
2
Thurs Feb 5, 2026 11:05pm PST
AutoOverlap: Enabling Fine-Grained Overlap of Computation and Communication
@matt_d
3
Thurs Feb 5, 2026 10:55pm PST
Axe: A Simple Unified Layout Abstraction for Machine Learning Compilers
@matt_d
1
1
Thurs Feb 5, 2026 7:22pm PST
35th ACM SIGPLAN International Conference on Compiler Construction (CC 2026)
@matt_d
2
Thurs Feb 5, 2026 5:25pm PST
VFlatten: Selective Value-Object Flattening Using Hybrid Static&Dynamic Analysis [pdf]
@matt_d
1
Thurs Feb 5, 2026 5:24pm PST
Agentic Proof-Oriented Programming
@matt_d
1
Thurs Feb 5, 2026 3:14am PST
MLIR-Tutor: Exercises for Learning MLIR (Originally Written for PPoPP 2026)
@matt_d
1
Wed Feb 4, 2026 10:19pm PST
Fast Autoscheduling for Sparse ML Frameworks
@matt_d
1
Wed Feb 4, 2026 2:23am PST
Replicate Forwards, Partial Backwards
@matt_d
1
Tues Feb 3, 2026 10:39pm PST
Frontier-CS 1.0 Release
@matt_d
2
Tues Feb 3, 2026 10:08pm PST
uops-again.info: corner-case behaviours of port assignment on Intel processors
@matt_d
1
1
1
Tues Feb 3, 2026 9:31pm PST
When magic meets multicore: OCaml and its elegant era of parallelism [video]
@matt_d
1
Tues Feb 3, 2026 9:15pm PST
FlashAttention-T: Towards Tensorized Attention
@matt_d
9
57
115
Tues Feb 3, 2026 9:15pm PST
Scaling GPU-to-CPU Migration for Efficient Distributed Execution on CPU Clusters
@matt_d
1
2
Tues Feb 3, 2026 9:09pm PST
MetaAttention: A Unified&Performant Attention Framework across Hardware Backends
@matt_d
1
Tues Feb 3, 2026 3:38pm PST
ga68, the GNU Algol 68 Compiler – FOSDEM 2026 [video]
@matt_d
1
1
4
Mon Feb 2, 2026 10:26pm PST
LLMs versus the Halting Problem: Revisiting Program Termination Prediction
@matt_d
1
Mon Feb 2, 2026 9:27pm PST
DTensor Erasure
@matt_d
1
Mon Feb 2, 2026 9:01pm PST
Let the Barbarians In: How AI Can Accelerate Systems Performance Research
@matt_d
1
Sun Feb 1, 2026 6:17pm PST
Triton Bespoke Layouts
@matt_d
13
Sat Jan 31, 2026 8:23pm PST
AMD64 Bit Matrix Multiply and Bit Reversal Instructions
@matt_d
2
3
8
Sat Jan 31, 2026 8:05pm PST
Demystifying ARM SME to Optimize General Matrix Multiplications
@matt_d
4
19
88
Sat Jan 31, 2026 7:41pm PST
Evolving the OCaml programming language – CSE Bytes: K C Sivaramakrishnan [video]
@matt_d
15
Sat Jan 31, 2026 3:45pm PST
Magellan: Autonomous Discovery of Compiler Optimization Heuristics w/AlphaEvolve
@matt_d
4
Thurs Jan 29, 2026 9:46pm PST
Automatic Data Enumeration for Fast Collections
@matt_d
1
Thurs Jan 29, 2026 8:15pm PST
An MLIR Lowering Pipeline for Stencils at Wafer-Scale
@matt_d
1
Wed Jan 28, 2026 8:49pm PST
The JAX sharding type system
@matt_d
1
Wed Jan 28, 2026 5:19pm PST
AutoSP: Unlocking Long-Context LLM Training via Compiler-Based SP (ICLR 2026)
@matt_d
1
Tues Jan 27, 2026 10:55pm PST
Disentangling unification and implicit coercion (subtyping interaction problem)
@matt_d
2
Tues Jan 27, 2026 10:15pm PST
Global vs. Local SPMD
@matt_d
1
Tues Jan 27, 2026 6:19pm PST
ACM SIGPLAN Symposium on Principles of Programming Languages (POPL) 2026 talks
@matt_d
3
Mon Jan 26, 2026 9:21pm PST
Long branches in compilers, assemblers, and linkers
@matt_d
1
Mon Jan 26, 2026 9:05pm PST
Megatron via shard_map
@matt_d
1
Mon Jan 26, 2026 8:40pm PST
Cloud-Hardware Co-Design for Memory Bandwidth-Bound HPC Workloads: Azure HBv5
@matt_d
1
Sun Jan 25, 2026 10:12pm PST
CuTile on Blackwell: NVIDIA's Compiler Moat Is Already Built
@matt_d
3
Sun Jan 25, 2026 5:00pm PST
Compiling Classical Sequent Calculus to Stock Hardware: Duality of Compilation [video]
@matt_d
1
Sun Jan 25, 2026 1:54pm PST
Computing Sharding with Einsum
@matt_d
27
Fri Jan 23, 2026 5:07am PST
What Is Control Flow Analysis for Lambda Calculus? [audio]
@matt_d
1
1
1
Fri Jan 23, 2026 4:22am PST
Introduction to Coinduction in Agda Part 1: Coinductive Programming
@matt_d
1
Thurs Jan 22, 2026 9:51pm PST
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in CLIs
@matt_d
2
Wed Jan 21, 2026 9:47pm PST
Multi-Modal Program Verification in Velvet
@matt_d
1
Tues Jan 20, 2026 2:44pm PST
Python, Is It Being Killed by Incremental Improvements?
@matt_d
2
Sat Jan 17, 2026 3:19pm PST
Terabyte-Scale Analytics in the Blink of an Eye
@matt_d
2
Fri Jan 16, 2026 7:05pm PST
What Is Control Flow Analysis for Lambda Calculus? – Iowa Type Theory Commute
@matt_d
1
14
Fri Jan 16, 2026 2:48pm PST
Benchmarking a Baseline Fully-in-Place Functional Language Compiler [pdf]
@matt_d
3
5
41
Fri Jan 16, 2026 2:33pm PST
Trends in Functional Programming (TFP) 2026
@matt_d
4
Thurs Jan 15, 2026 10:27pm PST
Categorical Foundations for CuTe Layouts
@matt_d
3
Thurs Jan 15, 2026 8:42pm PST
StackWarp: Exploiting Stack Layout Vulnerabilities in Modern Processors
@matt_d
3
Mon Jan 12, 2026 7:03pm PST
Cloud RAM
@matt_d
4
10
32
Mon Jan 12, 2026 5:50pm PST
Triton Linear Layout: Examples
@matt_d
1
Mon Jan 12, 2026 5:49pm PST
When XLA Isn't Enough: From Pallas to VLIW with Splash Attention on TPU
@matt_d
1
Sat Jan 10, 2026 5:44pm PST
Warp Specialization in Triton: Design and Roadmap
@matt_d
2
Fri Jan 9, 2026 5:19pm PST
Challenges and Research Directions for Large Language Model Inference Hardware
@matt_d
2
Fri Jan 9, 2026 2:00pm PST
Library Liberation-Competitive Performance Through Compiler-Composed Nanokernels
@matt_d
2
Fri Jan 9, 2026 12:15am PST
Non-Traditional Profiling: "you can just put whatever you want in a jitdump"
@matt_d
2
Wed Jan 7, 2026 6:36pm PST
Triton Extensions: a framework for developing and building compiler extensions
@matt_d
2
Wed Jan 7, 2026 6:17pm PST
FlashInfer-Bench: Building the Virtuous Cycle for AI-Driven LLM Systems
@matt_d
1
Tues Jan 6, 2026 7:29pm PST
High-Performance DBMSs with io_uring: When and How to use it
@matt_d
12
49
194
Mon Jan 5, 2026 4:43pm PST
Are DBMS Researchers Making Correct Assumptions about Transaction Workloads?
@matt_d
4
Mon Jan 5, 2026 2:19am PST
vLLM: An Efficient Inference Engine for Large Language Models
@matt_d
2
Sun Jan 4, 2026 10:55pm PST
Microarchitecture: What Happens Beneath – Matt Godbolt [video]
@matt_d
4
Tues Dec 30, 2025 1:02pm PST
SMTMSMT: Gluing Together CVC5 and Z3 Nelson Oppen Style
@matt_d
1
Mon Dec 29, 2025 8:55pm PST
Tilus: A Tile-Level GPGPU Programming Language for Low-Precision Computation
@matt_d
2
Mon Dec 29, 2025 8:54pm PST
Optimal Software Pipelining and Warp Specialization for Tensor Core GPUs
@matt_d
2
Tues Dec 23, 2025 9:18pm PST
Oral History of Jeffrey Ullman [video]
@matt_d
1
Sun Dec 21, 2025 10:17pm PST
CPU Autoscaling with a Kernel of Truth
@matt_d
1
Sat Dec 20, 2025 8:51pm PST
ACM Transactions on Programming Languages & Systems: New Year, New Paper Tracks
@matt_d
4
Sat Dec 20, 2025 5:07pm PST
An Empirical Study of Bugs in the rustc Compiler (OOPSLA 2025) [video]
@matt_d
2
Fri Dec 19, 2025 10:07pm PST
A "Ready-to-Use" Template for LLVM Out-of-Tree Passes
@matt_d
1
1
1
Fri Dec 19, 2025 9:54pm PST
Mini-SGLang: Efficient Inference Engine in a Nutshell
@matt_d
2
Fri Dec 19, 2025 9:35pm PST
FrontierCS: Evolving Challenges for Evolving Intelligence
@matt_d
1
Fri Dec 19, 2025 8:21pm PST
svc-hook: hooking system calls on ARM64 by binary rewriting
@matt_d
1
1
3
Thurs Dec 18, 2025 4:58pm PST
The Simple Essence of Monomorphization (OOPSLA 2025) [video]
@matt_d
1
Thurs Dec 18, 2025 3:21pm PST
Abusing x86 instructions to optimize PS3 emulation [RPCS3] [video]
@matt_d
2
Wed Dec 17, 2025 6:10pm PST
Decompiling the Synergy: Human–LLM Teaming in Reverse Engineering [pdf]
@matt_d
1
1
53
Wed Dec 17, 2025 5:55pm PST
Soteria Rust: the first symbolic execution engine with full Tree Borrows support [video]
@matt_d
1
Sun Dec 14, 2025 6:06pm PST
Testing and Benchmarking of AI Compilers
@matt_d
1
Sat Dec 13, 2025 9:27pm PST
Interpreters everywhere! – Lindsey Kuper [video]
@matt_d
1
Sat Dec 13, 2025 3:13pm PST
The Wild West of post-POSIX IO Interfaces [video]
@matt_d
2
Sat Dec 13, 2025 12:42am PST
Using the `vpternlogd` instruction for signed saturated arithmetic
@matt_d
2
Fri Dec 12, 2025 7:42pm PST
Indexed Reverse Polish Notation, an Alternative to AST
@matt_d
10
Wed Dec 10, 2025 3:34pm PST
ASM Visualizer: a new assembly visualization tool
@matt_d
2
Tues Dec 9, 2025 9:50pm PST
Oral History of Jensen Huang – Computer History Museum [video]
@matt_d
1
Tues Dec 9, 2025 5:37pm PST
The Equational Theories Project: Collaborative Mathematical Research at Scale
@matt_d
2
Tues Dec 9, 2025 3:15pm PST
The Quest Toward That Perfect Compiler – ACM SPLASH / OOPSLA 2025 Keynote [video]
@matt_d
2
Tues Dec 9, 2025 1:33am PST
Learning to love mesh-oriented sharding
@matt_d
2
Mon Dec 8, 2025 7:43pm PST
Microbenchmarking NVIDIA's Blackwell: An In-Depth Architectural Analysis
@matt_d
1
Sun Dec 7, 2025 9:49pm PST
tritonBLAS: Triton-based Analytical Approach for GEMM Kernel Parameter Selection
@matt_d
1
Fri Dec 5, 2025 7:17pm PST
RFC: Forming a Working Group on Formal Specification for LLVM
@matt_d
2
Fri Dec 5, 2025 4:33pm PST
hls4ml: A Flexible, OSS Platform for ML Acceleration on Reconfigurable Hardware
@matt_d
4
Tues Dec 2, 2025 6:58pm PST
Nice to Meet You: Synthesizing Practical MLIR Abstract Transformers [pdf]
@matt_d
1
Tues Dec 2, 2025 2:17am PST
SAT Etudes 2: Toy DPLL
@matt_d
1
Mon Dec 1, 2025 8:12pm PST
The Hitchhiker's Guide to Coherent Fabrics: 5 Programming Rules
@matt_d
8
Mon Dec 1, 2025 7:11pm PST
Optimizing libdwarf .eh_frame enumeration
@matt_d
4
Mon Dec 1, 2025 5:23pm PST
GSoC 2025: ClangIR Upstreaming
@matt_d
3