Job Details
Description

Duties and Responsibilities:
- Perform effective Incident Management from incident start through resolution, partnering with Development to determine root causes, and drive rigorous Problem Management to follow through on actions
- Proactively identify and resolve issues
- Own the production environment: monitor availability, ensure holistic system health, and handle application deployments
- Triage and remediate production systems
- Resolve issues within SLA
- Achieve 90% automation and reduce manual intervention
- Be the primary operational support engineer for multiple large, distributed, critical software applications

Job Requirements:
- Engineering degree with 5+ years of experience in Application Support
- Strong understanding of modern monitoring and logging technologies (Logzio, CloudWatch, Splunk, DynaTrace, New Relic, AppDynamics, etc.)
- Understanding of microservice architecture
- Experience in Unix, Shell scripting/Python, SQL, AWS, etc.
- Experience in troubleshooting complex application and environment issues
- Excellent communication, presentation, and documentation skills
- Strong experience with Intake, Problem Management, and Service Availability Management
- Basic knowledge of CI/CD tools and concepts
- Knowledge of ITIL processes
- Ready to work in shifts