DevOps – Site Reliability Engineer

Full Time
Posted 3 months ago

Global Document Management
Project Overview: Reference Data Technology Document Management

DevOps – Site Reliability Engineer – Document Management

– Design, code, test and deliver software to automate manual operational work
– Troubleshoot priority incidents, facilitate blameless post-incident evaluations, and ensure permanent closure of incidents
– Engage with development teams throughout the life cycle to help develop software for reliability and scale, ensuring minimal refactoring or changes
– Identify application patterns and analytics in support of better service level objectives
– Design and develop self-healing and resiliency patterns
– Design and develop performance tests, identify bottlenecks and opportunities for optimization and capacity demands, and present solutions for continuous improvements
– Implement best-in-class monitoring frameworks to accomplish end-to-end flow monitoring and noiseless alerting
– Develop automated software and product upgrades, change management and release management solutions
– Influence developers and other teams across the globe to ensure resiliency and stability standards
– Effectively split time between operational work and engineering work
– Contribute to around-the-clock support coverage as needed

– BS/BA degree or equivalent experience in a software engineering discipline
– 5 years of experience working with at least one technology stack, designing, coding, testing, and delivering software
– Previous or current experience with software development, infrastructure development, or development and operations
– Working knowledge or understanding of infrastructure components such as routers, load balancers, cloud products, container systems, compute, storage, and networks
– Excellent debugging and troubleshooting skills
– Experience with Linux infrastructure, SQL databases (MS SQL), CI/CD tools (Jenkins, Jules, Maven), scripting languages such as JavaScript, Python, Perl, or Ruby, and Scrum/Kanban/Agile methodologies
– 5+ years of experience in Core Java, Oracle, MS SQL Server, Unix Shell Scripting
– Basic knowledge of development in J2EE, Spring Boot, MVC, etc.
– Experience with support models: incident management, problem management
– Knowledge of version control systems such as Git
– Understanding of NoSQL databases, e.g., MarkLogic, Hadoop, MongoDB, etc.
– Working knowledge of centralized logging (Splunk) or logging as a service
– Preferred: experience with application monitoring tools such as Apica, AppDynamics, or Dynatrace
– Good to have: experience with cloud-native technologies, e.g., AWS, Kubernetes, and Pivotal GAIA
– Good interpersonal and communication skills with all levels of management and end users; ability to participate in incident management and issue resolution across multiple teams
– Perform production releases and support triage of issues during or after releases
– Problem solver: thinks through problems and proposes alternatives to resolve them in the short term and/or long term
– Able to prioritize and manage time efficiently

Job Features

Job Category: Finance, Technology
