Site Reliability Engineer (Washington, DC)

District Partners is supporting one of the most advanced engineering organizations in the country. Their teams build AI-native, software-defined systems across autonomy, robotics, networking, and distributed mission platforms. They’re widely considered the gold standard for modern national security engineering, and this role supports a flagship program requiring deep technical rigor and real-world impact.

This Site Reliability Engineer joins a mission-focused Platform Discovery team working at the intersection of cloud, networking, autonomy, and large-scale system integration. The work is fast-moving, deeply technical, and tied directly to operational outcomes.

Candidates who excel here typically bring a strong academic foundation and professional experience from high-performing engineering environments.

You’ll take on reliability, tooling, deployment, and operational challenges across highly complex distributed systems. This is hands-on engineering work: keeping systems running at scale, in secure environments, with real operators depending on them.

SREs on this team operate with a practical, “whatever it takes” mindset. You’ll make sound engineering tradeoffs, build scalable mechanisms, and partner closely with software, data, and operations teams to support live mission environments.

What You’ll Do:

Improve operational capabilities through automation, tooling, and large-scale deployment mechanisms
Lead post-mortems and drive continuous improvement across technical teams
Diagnose and resolve issues across cloud, networking, robotics-adjacent, and distributed system architectures
Build frameworks and processes that support reliable delivery as the program scales
Design and implement solutions using modern infrastructure technologies
Partner with internal and external engineers to identify and solve technical challenges

What You Must Have:

Active Top Secret clearance
STEM degree or equivalent hands-on technical training
5+ years in operations, SRE, systems, infrastructure, or platform engineering
Strong experience with Terraform, Ansible, and IaC workflows
Proficiency with AWS, Azure, or GCP
Proficiency with Docker and Kubernetes
Ability to navigate, debug, and support large, complex distributed systems
Demonstrated success in technically rigorous environments where expectations are high and quality matters

Preferred Experience:

Experience in reputable engineering organizations known for strong technical standards
Ability to drive alignment across diverse engineering and operational teams
Experience deploying into secure, air-gapped, or hardened environments
Background supporting autonomous or distributed mission systems
Strong data-driven root-cause analysis capabilities
Ability to read or debug Go, Python, Rust, or C++
Experience designing scalable systems with clear implementation paths

Location: Washington, DC (onsite)

Clearance: Active Top Secret required

Comp: $160K–$220K base + bonus + equity (total comp typically up to ~$250K, depending on experience)
