Site Reliability Engineer, Trainline

Salary not provided
AWS
Python
Linux
Terraform
ELK
Flow
Data Flow
Grafana
TeamCity
Junior, Mid and Senior level
London

2+ days a week in office (Holborn, London)

Trainline

Train ticketing platform

Job no longer available

1001+ employees

B2C, B2B, Travel, Sustainability, Transport, eCommerce

Company mission

We're building the world's no.1 rail platform while empowering greener travel choices, connecting people and places.

Role

Who you are

  • Experience of SRE concepts such as SLI, SLO and error budgets
  • Hands-on experience with observability tooling such as New Relic, Elastic Cloud (ELK Stack), Influx and Grafana, with a good understanding of APM and MELT (metrics, events, logs, traces)
  • Strong understanding of HTTP/TCP (status codes, nuances of headers, cookies, connection/request life cycle)
  • Understanding of load balancing and reverse proxy concepts, upstream config concepts, upstream health checks, worker & data flow concepts
  • Application architecture concepts (threading, queuing, readiness checks, health checks, circuit breakers, timeouts, exponential backoff, throttling)
  • Experience building, maintaining and evolving time series data, including retention, cardinality, deviation, moving averages and other functions
  • Experience working with cloud providers, preferably AWS
  • Experience with build, deployment & configuration management tooling such as TeamCity, GitHub Actions, and Terraform
  • Experience troubleshooting Linux operating systems
  • Experience of scripting in at least one language, preferably Python

What the job involves

  • Trainline is a fast-growing company that loves utilising new technology to build world-class products for our customers
  • We run a diverse platform hosted on AWS which, coupled with our own tooling, allows us to embrace CI/CD, DevOps practices, SRE disciplines and cloud native services to their full potential
  • ReliabilityOps are at the forefront of platform observability, maintaining availability, latency, performance, efficiency, capacity, CI/CD delivery co-ordination, critical incident response and cloud infrastructure automation and provisioning
  • We are looking for a Site Reliability Engineer to join the team, helping to own observability and build tooling that supports operational engineering
  • We are looking for a strong technical team player who has experience implementing SRE practices within a team and contributing to advocating SRE principles
  • Critical incident response in production, from the initial event through participating in rapid response and driving service restoration to identifying follow-up measures
  • Building and implementing tooling to improve observability, identification and resolution of incidents with a strong emphasis on reducing MTTD
  • Supporting product engineering teams to ensure applications are operationally launch ready and that CI/CD activities are carried out in a safe manner with reliability in mind
  • Reducing MTTR by working with product engineering teams to understand issues, surface and present the right data to influence change
  • Contributing to incident retrospectives with deep technical knowledge to explain what may have occurred at HTTP, TCP, DNS layers of the stack
  • Promoting and expanding SRE concepts to the engineering community in both a consultancy and hands-on fashion, being a champion of observability engineering and reliability principles
  • Improving platform reliability, identifying metrics to base decisions on, surfacing them if we don’t record them, identifying continuous improvement across our pillars of observability
  • Seeing data presentation as a socio-technological problem: we need the most pertinent information presented quickly in a human-consumable way to affect the resolution of a real-time incident
  • Delivering key roadmap deliverables and ensuring that initiatives contribute to the achievement and improvement of the SRE team, reliability of the platform & business OKRs
  • Participating in the SRE on-call rota, assuming the role of incident commander and ensuring our platform is supported 24/7 for our customers

Otta's take

Xav Kearney

CTO of Otta

Trainline make it super easy to book transport tickets. Their app is intuitive and well designed, and they're even pioneering paperless travel, allowing you to store your tickets in the app.

They're hyper focused on customer experience and have a product culture of fast iteration to deliver great outcomes for customers.

Their customers make 170,000 journeys a day and they sell 200 tickets a minute.

They're continuing to innovate to stay ahead. They're using machine learning to predict price movements and are integrating with Siri and Google Assistant so people can buy tickets just by speaking to their phone.

Insights

Some candidates hear back within 2 weeks

42% female employees

14% employee growth in 12 months

Company

Employee endorsements

Challenging work

"The complex nature of the market as we work with different suppliers is an amazing opportunity to create fluid solutions. A very well segmented..."

Funding (1 round)

Jan 2006

$206.2m

LATE VC

Total funding: $206.2m

Company benefits

  • Flexible working (40% min office attendance)
  • Cycle to work scheme
  • Gym memberships
  • Private Health and Dental Insurance
  • Pension (matched contributions)
  • Income Protection
  • Share Incentive Plan (buy one get one free)
  • Learning budget
  • Primary & Secondary Caregiver Leave
  • Shared Parental Leave
  • Enhanced sick pay
  • Free Perkbox subscription

Company values

  • Own It - We care about every customer, partner and journey
  • Do Good - We make a positive impact
  • Travel Together - We're one team
  • Think Big - We are building the future of rail

Company HQ

Farringdon, London, UK

Founders

Jody Ford

(CEO, not founder)

Previously Non-exec Director of Moonpig and Photobox Group.

Milena Nikolic

(Chief Technology Officer, not founder)

Previously Engineering Director at Google.

Lisa Hillier

(Chief People Officer, not founder)

Previously Chief People Officer at Gousto, The Restaurant Group and Just Eat.

Mike Hyde

(Chief Data Officer, not founder)

Previously at Meta.

Dave Price

(Chief Product Officer, not founder)

Previously at Spotify and GoEuro.


People progressing

Joined as a Product Manager. Promoted to Product Manager after 1 year. Then promoted again to Principal Product Manager after 7 months and is currently Product Director.

Joined as Business Performance Analyst. Promoted to Senior Analyst after 1.5 years. Then promoted again to Commercial Manager after 7 months.

Diversity & Inclusion at Trainline

  • 💚 Diversity drives us forward: We know that having a diverse team is crucial to Trainline's ongoing success. A team that is diverse in all forms – gender, ethnicity, sexuality, disability, nationality and diversity of thought – is key to Trainline being an inclusive environment where every Trainliner can be their true self. We're committed to creating workplaces where everyone belongs, is celebrated and their differences are valued, creating an awesome employee and customer experience.
  • 💚 Meet our Networks: Our diversity networks are inclusive communities developed and led by Trainliners, with sponsorship and support from senior leaders. They're all about empowering and supporting underrepresented groups, by providing a safe space to talk, a place to come up with new ideas and a channel for voices to be heard.
  • 🚀 Women in Leadership
  • 💜 Artemis (gender equality)
  • 🏳️‍🌈 Rainbow Train (LGBTQIA+)
  • 🌻 Sunflower (accessibility)
  • 🌍 Ethnic Diversity Network
  • 🐣 Parent & Carers
