Fireworks AI · San Mateo, CA
At Fireworks, we’re building the future of generative AI infrastructure. Our platform delivers the highest-quality models with the fastest and most scalable inference in the industry. We’ve been independently benchmarked as the leader in LLM inference speed and are driving cutting-edge innovation through projects like our own function calling and multimodal models. Fireworks is a Series C company valued at $4 billion and backed by top investors including Benchmark, Sequoia, Lightspeed, Index, and Evantic. We’re an ambitious, collaborative team of builders, founded by veterans of Meta PyTorch and Google Vertex AI.
We're looking for a Software Engineer focused on Performance Optimization to help push the boundaries of speed and efficiency across our AI infrastructure. In this role, you'll take ownership of optimizing performance at every layer of the stack—from low-level GPU kernels to large-scale distributed systems. A key focus will be maximizing the performance of our most demanding workloads, including large language models (LLMs), vision-language models (VLMs), and next-generation video models.
You’ll work closely with teams across research, infrastructure, and systems to identify performance bottlenecks, implement cutting-edge optimizations, and scale our AI systems to meet the demands of real-world production use cases. Your work will directly impact the speed, scalability, and cost-effectiveness of some of the most advanced generative AI models in the world.
Total compensation for this role also includes meaningful equity in a fast-growing startup, along with a competitive salary and comprehensive benefits package. Base salary is determined by a range of factors including individual qualifications, experience, skills, interview performance, market data, and work location. The listed salary range is intended as a guideline and may be adjusted.
Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.
Shape the Future of Generative AI

At Fireworks AI, we’re building the infrastructure that powers the next generation of AI applications. From real-time inference to model optimization, our platform empowers developers and enterprises to deploy, scale, and innovate with cutting-edge AI, faster and smarter than ever before.

Why Fireworks AI?

Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.
Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.
Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI: no bureaucracy, just results.
Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.

We’re seeking builders, tinkerers, and visionaries who:

🚀 Push boundaries in AI/ML infrastructure, distributed systems, or high-performance computing.
💡 Think creatively to solve problems others deem impossible.
🤝 Collaborate fearlessly in a fast-paced, no-ego environment.
🌍 Care deeply about democratizing AI and making it accessible, scalable, ...