
Site Reliability Engineer


Seattle, WA, US
  • Job Type: Full-Time
  • Function: IT
  • Industry: Enterprise
  • Post Date: 11/24/2022
  • Company Address: 111 S Jackson St., Seattle, WA, 98104

About Fabric

Fabric is the headless commerce platform purpose-built for growth. Customers like BuildDirect, ABC Carpet & Home, and Universal Lacrosse trust Fabric for its open and modular design, which allows them to go live in weeks without having to replatform. Fabric is a force multiplier on retailers’ existing technology investments, proven to grow digital revenue by up to 3x.

Job Description

Who we are: 

fabric is a modern commerce platform that gives retailers tools to create world-class shopping experiences for mid-market enterprises. We champion a new, harmonious way of doing business that emphasizes connectedness and collaboration over competition and dominance. This is showcased in our products that rely on microservices, APIs, and easy integrations, and in our globally distributed team that genuinely cares about its customers. Our founders directed groundbreaking commerce initiatives at Amazon, Staples, Google and eBay. We're growing fast and looking for more awesome people to join us.


The Site Reliability Engineer (multiple positions open) at Commerce Fabric Inc. in Seattle, Washington will be responsible for building software systems and application tools, for both internal and customer use, that enable the engineering teams to operate safely at high speed and wide scale. The duties include:

  • developing tools and automated solutions to support hosted services for high availability and resiliency;
  • implementing monitoring and alerting to improve mean time to detect (MTTD) and mean time to recover (MTTR);
  • evaluating the average time elapsed between one failure and the next, and assessing the time it takes to complete a repair after a failure;
  • resolving issues escalated in the software production environment;
  • troubleshooting performance, reliability, and scalability issues on distributed systems;
  • collaborating with application engineers and training developers;
  • monitoring, alerting, and administering license management on cloud services including Amazon Web Services (AWS);
  • developing a monitoring and alerting framework utilizing Datadog, CloudWatch, X-Ray, Sedai (for auto-remediation), Grafana, and Prometheus;
  • software programming using web technologies and infrastructure automation;
  • administering and automating Linux servers in cloud services; and
  • building on industry-leading infrastructure tools and technologies including Terraform, CloudFront, and AWS to create tailored solutions on a wide scale.

This is a 100% remote position. The employee may work from home anywhere in the United States.


The position requires a Bachelor's degree in information systems or a directly related field, plus five years of software development experience developing tools and automated solutions to support hosted services for high availability. The five years of experience must include five years of experience with each of the following:

  (1) setting up Datadog or CloudWatch dashboards for monitoring applications or systems;
  (2) setting up alarms to identify application or server failures;
  (3) AWS services including EKS, serverless (Lambda), CloudFront, VPC, API Gateway, EC2, CloudWatch, ECS, RDS, SNS, or IAM;
  (4) coordinating infrastructure planning for capacity planning analysis, disaster recovery, and load balancing activities in the hosting environment;
  (5) writing Terraform or Python automations for configuration management and deployment of applications to servers;
  (6) architecting highly available and resilient systems with autoscaling;
  (7) analyzing technical solutions that fit scalable and distributed systems on the AWS cloud;
  (8) designing and deploying microservice environments on Amazon EKS;
  (9) orchestrating CI/CD pipelines on GitLab, Bitbucket, or Jenkins for infrastructure deployment, serverless API deployment, and EKS deployments;
  (10) working with Terraform or Terragrunt code for infrastructure deployment;
  (11) building secure, scalable multi-VPC networks; and
  (12) orchestrating microservices or storefront Next.js applications on EKS/ECS clusters.

This notice is subject to Commerce Fabric’s employee referral program.

Interested candidates must apply online at  

What we bring to the table:

  • Competitive compensation packages
  • PTO and Holiday plans
  • Benefits packages which include Medical, Dental, Life, and Vision
  • Fast-paced, fun and collaborative environment 
  • A team invested in you both personally and professionally

*fabric is an equal opportunity employer as well as a government contractor that shall abide by the requirements of 41 CFR 60-300.5(a), which prohibits discrimination against qualified protected veterans, and the requirements of 41 CFR 60-741.5(a), which prohibits discrimination against qualified individuals on the basis of disability.