Thurs Sep 5, 2024 11:18pm PST
Qwen2-7B-Instruct with TensorRT-LLM: consistently high tokens/sec