Senior Site Reliability Engineer - Brasília, Brazil - Sigma Software Group

    Sigma Software Group, Brasília, Brazil

    3 weeks ago

    Description

    We have an excellent opportunity for a bright, smart, and highly motivated Senior Site Reliability Engineer to join our mature project team.

    You have a unique chance to become part of our team and work with best practices and methodologies. You will have the opportunity to lead and perform at your best.

    Customer:
    Our customer is Beeswax, a rapidly growing US AdTech company. Founded by three former Google specialists, it has a highly technical team and an excellent technological culture.

    Beeswax provides extremely high-scale Bidder-as-a-Service solutions in advertising technology, works with global businesses, and has raised $28M (including the most recent Series B raise of $15M) to date.

    Sigma Software works with Beeswax to provide numerous key components of the platform. It is looking for engineers to complement the Beeswax engineering team and drive further platform development.


    Project:
    We're seeking a skilled Senior Site Reliability Engineer to be responsible for the Cloud Infrastructure and Observability solutions for the Client's platform and to ensure all systems run smoothly.

    If you're passionate about complex tasks, optimizing systems, driving innovation, providing the highest quality, and collaborating with top talent, this is the perfect opportunity.

    The project is an easy-to-use, massive-scale, and highly available demand-side platform.

    Backed by Amazon Web Services and Kubernetes, the team has embraced Infrastructure as code to manage thousands of applications, servers, and containers running in multiple regions worldwide.

    Bring your expertise to our dynamic and forward-thinking environment.

    Responsibilities:
    - Design and build infrastructure and tooling to provide high scalability, reliability, and sub-second performance levels, following industry security best practices
    - Write code and scripts to support Infrastructure as Code (IaC), configuration management, and automated incident resolution
    - Support and extend the observability stack to capture and alert on any system issues
    - Participate in on-call rotations and be an escalation contact for service incidents
    - Write systems documentation, troubleshooting playbooks, and other instruction manuals
    - Other duties and responsibilities as assigned

    Qualifications:
    - Bachelor's or higher degree in computer science, computer engineering, or a relevant technical field, or equivalent practical experience
    - Expertise in solution architecture and system design
    - Experience in analyzing and troubleshooting large-scale distributed systems
    - At least 6 years of administration experience with Linux, AWS, and Kubernetes
    - At least 6 years of experience in configuration management using CloudFormation, Terraform, and Ansible or similar
    - At least 3 years of experience with Python
    - Strong problem-solving skills
    - Strong verbal and written communication skills
    - At least an Upper-Intermediate level of English