Staff Software Engineer, Stream Infrastructure

Stripe
Toronto, Canada (Remote)

OVERVIEW

About the Role

Stripe is a financial infrastructure platform for businesses. Millions of companies—from the world’s largest enterprises to the most ambitious startups—use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone’s reach while doing the most important work of your career.

The Stream Infrastructure team builds and operates Stripe’s real-time, event-driven platform, which powers asynchronous communication between services and high-throughput streaming workloads across the company. We run globally distributed systems with high reliability and performance to meet Stripe’s scaling, availability, and product needs. The team operates dozens of Apache Kafka clusters with industry-leading reliability and efficiency, and we continually reduce operational toil by investing in automation and self-service tooling for upgrades, maintenance, and day-to-day operations. The team is distributed across Seattle, Toronto, and remote locations.

What you’ll do

You’ll partner with other infrastructure engineers, leaders on adjacent teams, and the product engineers and managers who depend on our platform to define and deliver the next generation of Stripe’s streaming infrastructure. You’ll help set a long-term technical direction that scales with Stripe’s growth while enabling reliable, efficient operations for years to come.

Responsibilities

Design, build, and operate event-driven infrastructure, leveraging technologies like Apache Kafka, Temporal, and AWS services

Collaborate with teams across Stripe to understand their requirements, unblock adoption, and identify opportunities to improve how streaming infrastructure is used

Define and implement Kafka operational best practices—such as shuffle sharding, cellular architecture, and load shedding—to improve resilience and reliability at scale

Drive a shift from “pets” to “cattle”: build automation, standardized workflows, and self-healing systems that reduce manual operations

Who you are

We’re looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.

Requirements

This is a Staff-level role, which typically means 10+ years of experience building, operating, and evolving large-scale production systems

Experience as a technical lead for team(s) working on distributed systems, including scaling them in fast-moving environments

Hands-on experience with big data technologies such as Kafka, Pulsar, Flink, or Pinot

Comfortable operating with high autonomy and ownership

Growth mindset and a willingness to learn quickly, explore ambiguous problem spaces, and dive deep when needed

Strong written and verbal communication skills, including the ability to produce clear technical documentation

Desirable Qualifications

Experience operating streaming technologies as a platform (e.g., Kafka, Pulsar, Flink, Pinot) for internal customers at scale

Experience building or operating control planes for managing large-scale infrastructure
