BitMEX is hiring a

Site Reliability Engineer

San Francisco, United States

100x Group explores, incubates and pursues opportunities and investments as part of its mission to reshape the modern digital financial system into one that is inclusive and empowering. It was created by Arthur Hayes, Ben Delo and Sam Reed, the founders of HDR Global Trading, the company behind the cryptocurrency derivatives trading platform BitMEX.

The 100x Group Infrastructure team sits at the core of the business and is responsible for the reliability and scalability of all the services that power the BitMEX platform and its developers. In only a few years, BitMEX has become the leading crypto-products trading platform worldwide. It handles tens of thousands of low-latency transactions per second, representing several billion dollars traded every day. We specialize in systems, whether that means networking, the Linux kernel, or a more specific interest in scaling, algorithms, or distributed systems.

Responsibilities:

  • Run our infrastructure with Chef, Terraform and Kubernetes.
  • Make monitoring and alerting fire on symptoms, not on outages.
  • Document every action so findings turn into repeatable actions, and then into automation.
  • Improve the deployment process to make it as boring as possible.
  • Design, build and maintain core infrastructure pieces that allow BitMEX to scale to hundreds of thousands of concurrent users.
  • Debug production issues across services and levels of the stack.
  • Plan the growth of BitMEX’s infrastructure.
  • Join a pager rotation to respond to BitMEX platform availability incidents and support service engineers with customer incidents.

About You: 

  • Think about systems: edge cases, failure modes, behaviors, specific implementations
  • Have experience with Nginx, HAProxy, Docker, Kubernetes, Terraform, or similar technologies
  • 6+ years of professional experience, with a proven track record of designing, implementing, managing, and testing infrastructure at scale on AWS for high-value environments
  • Strong engineering skill set with a firm grasp of fundamental Computer Science principles and a modular, maintainable, agile & test-driven approach to software development
  • Capacity to multitask and give equal attention to a variety of functions while under pressure
  • Strong technical troubleshooting, diagnostic and problem-solving skills
  • Ability to adapt to changing priorities within a fast-moving industry and startup culture
  • A Bachelor’s degree or equivalent work experience preferred
