FireAttention — Serving Open Source Models 4x faster than vLLM by quantizing with ~no tradeoffs
Fireworks Raises the Quality Bar with Function Calling Model and API Release
Mixtral 8x7B on Fireworks: faster, cheaper, even before the official release
LLM Inference Performance Benchmarking (Part 1)
New in Fireworks: Image-to-Image and ControlNet support for SSD-1B and SDXL!