No more applications are being accepted for this job
- Plan, manage, and oversee all aspects of a Production Environment
- Define strategies for application performance monitoring and optimization in the production environment.
- Respond to incidents, improve the platform based on feedback, and measure the reduction of incidents over time.
- Ensure that batch production scheduling and processes are accurate and timely.
- Perform ad hoc requests from users such as data research, file manipulation/transfer, research of process issues, etc.
- Take a holistic approach to problem solving by connecting the dots during a production event across the various technology stacks that make up the platform, to optimize mean time to recover.
- Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation and refinement.
- Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.
- Support services before they go live through activities such as system design consulting, capacity planning and launch reviews.
- Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead in DevOps automation and best practices.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Scale systems sustainably through mechanisms like automation and evolving systems by pushing for changes that improve reliability and velocity.
- Work with a global team spread across tech hubs in multiple geographies and time zones.
- Ability to share knowledge and explain processes and procedures to others
- Experience in Linux.
- Knowledge of ITSM/ITIL.
- Good to have: experience with industry-standard CI/CD tools such as Git/Bitbucket, Jenkins, and Chef.
- Experience with scripting, pipeline management, and software design.
- Solid grasp of at least one database (Cassandra/Postgres/Oracle). Strong fundamentals in writing SQL queries.
- Experience in PCF (Pivotal Cloud Foundry)
- Knowledge of Kafka
- Knowledge of monitoring tools such as Dynatrace/Splunk/Grafana
- Support experience for event framework/event-driven applications, Java/J2EE/Spring/Spring Boot based applications, and cloud-based microservices
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
- Ability to help debug and optimize code and automate routine tasks.
- Ability to support many different stakeholders. Experience in dealing with difficult situations and making decisions with a sense of urgency is needed.
- Appetite for change and pushing the boundaries of what can be done with automation.
- Experience in working across development, operations, and product teams to prioritize needs and to build relationships is a must.
- Experience designing and implementing an effective and efficient CI/CD flow that gets code from dev to prod with high quality and minimal manual effort is desired.
- Good handle on the change management and release management aspects of software
Sr System Reliability Engineer - Dublin, Ireland - Fulcrum Digital
Description
Job Description
Who are we
Fulcrum Digital is an agile and next-generation digital accelerating company providing digital transformation and technology services right from ideation to implementation. These services have applicability across a variety of industries, including banking & financial services, insurance, retail, higher education, food, healthcare, and manufacturing.
The Role
Requirements