Site Reliability Engineer, Atlanta, GA (Hybrid) (Local to GA)

Role: AWS Site Reliability Engineer

Location: Atlanta, GA (Hybrid)

Job Description:

The team's primary responsibility will be to enhance observability coverage across mission-critical delta applications. Success criteria will ultimately be defined around reduced MTTI (Mean Time to Identify, the time taken to identify issues in a failing system).
It will be very difficult to find one profile that fits all the asks; however, we can evaluate candidates and mix and match. I will be available to screen the profiles and then conduct an L-1 discussion.
Primary:
Dynatrace – On-Prem and SaaS | Candidate should have hands-on experience in setting up Dynatrace and designing dashboards
Observability – Must have complete understanding of SLI/SLO/SLA: how to set, measure, track, and communicate them
Open Source Observability Stack – Good understanding of OpenTelemetry and how to instrument applications to get the desired metrics, traces, logs, etc.
AWS Services – CloudWatch, X-Ray, Lambda, overall data flow
OpenShift ROSA – Red Hat OpenShift Service on AWS
Development Experience – Any language; should be able to read code and develop utilities as required
Good to have:
Extensive SRE org setup/stakeholder management/assessment experience
DevOps pipeline experience
Quality gate implementation experience to enhance the reliability of applications
Extensive development experience in Python managing time series data
Chaos Engineering – Gremlin, Chaos Monkey

Thanks,

Ankit Kumar Mishra

Direct: 732-832-3488 Ext: 239

MSR Technology Group LLC

An MSRcosmos Group Company

Giga Giglet – Tech Jobs & Career Updates in the U.S.