Senior Site Reliability Engineer

SF

Domino has an ambitious vision for data science and machine learning. Our platform helps data science teams accelerate research, increase collaboration, and rapidly deploy predictive models. Our customers are the most sophisticated analytical organizations in the world, including Monsanto, Allstate, and Instacart. Backed by Sequoia Capital, Zetta Venture Partners, Bloomberg Beta, and In-Q-Tel, we are at the epicenter of the data science revolution, helping companies build better cars, develop more effective medicine, or simply recommend the best song to play next.

You will join a team of high-performing engineers and have a significant impact on managing a growing infrastructure. You’ll be tasked with maintaining the health of the Domino platform in a variety of environments (AWS, VPC, on-prem): de-risking our stack, improving our availability, and customizing our DevOps and deployment toolchain.

Responsibilities

  • Maintain and scale infrastructure configuration management software (SaltStack)
  • Maintain the availability of Domino in a variety of environments (on-premises, AWS, and VPC)
  • Build and enhance Domino’s deployment technology
  • Provide feedback to product team and management
  • Participate in 24/7 on call rotation and support escalations

Qualifications

  • Exceptional coding/scripting ability (Python, Bash, Scala)
  • System administration and integration fundamentals (Storage, networking, Linux OS)
  • Experience with Amazon Web Services (AWS, EC2, S3)
  • Experience with Docker Container Management
  • Experience with Configuration Management Tools (SaltStack, Chef, Puppet)

Nice to have:

  • Knowledge of Hadoop, HDFS, and Spark