We’re on a mission to build the best platform in the world for engineers to understand and scale their systems, applications, and teams. We operate at high scale—trillions of data points per day—providing always-on alerting, metrics visualization, logs, and application tracing for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way.
Our Alerting (Events Intake) team owns the logs collection agent, event edge services, processing and enrichment that deliver data to the Datadog platform. They build, scale, and operate some of the low-latency, high-throughput data systems that power the growth of our business. The systems are built on top of high-scale open source frameworks and languages: Kafka, Go, Elasticsearch, Cassandra, Kubernetes.
On the Alerting (Events Intake) team, you will build a key part of our platform and own and improve systems running at high scale. You’ll tackle challenging projects, make an impact, and grow quickly.
You’ll join at an ideal time to make a big impact: the product is seeing very high growth, with many new features to build and a need to scale up dramatically. You will be a key part of the product’s success.
- Own and support our data pipelines, and remove scaling bottlenecks in critical services
- Mentor other engineers on your team, developing their skills and helping advance their careers
- Write a significant amount of code, and lead architectural decisions for new and existing services
- Run a lightweight agile process to pace and track what we work on, and deliver high-quality work
- You have been building applications for some years and know the systems you’ve worked on from top to bottom
- You have architected, built, and operated distributed systems to solve problems at high scale
- You want to work in a fast-paced, high-growth startup environment to build a brand new product
- You’ve built a non-trivial application from scratch that sees significant user traffic
- You’ve worked at high scale with systems like Elasticsearch, Cassandra, Kafka
- You have significant experience with Go or Python
Is this you? Let’s chat!