What do you need?
5+ years of experience
At least one of the following: SciPy, PySpark, Spark Streaming, or MLBase
At least one of the following: Apache Spark, Apache Beam, or Apache Airflow
Must-haves: Python, SQL, Java
An understanding of how to make a real impact with TDD and agile methodologies in software development
Creativity, pragmatism, curiosity, and a good sense of humor
Working knowledge of: algorithms, data structures, and design patterns; functional programming; relational and non-relational databases; data processing at scale; and build-and-release toolchains
What will you do?
Create and refine the bounded (batch) and unbounded (streaming) ETL and ML data pipelines that make up our production systems (a streaming-pipeline sketch follows this list).
Develop applications, libraries, and workflows with Python, Java, Apache Spark, Apache Beam, and Apache Airflow (a workflow sketch follows the list).
Design and implement systems that run at scale on Google’s Dataproc, Dataflow, Kubernetes, Pub/Sub, and BigQuery platforms.
Implement algorithms and machine learning operations at scale using the SciPy, PySpark, Spark Streaming, and MLBase libraries (a model-fitting sketch follows the list).
Employ test-driven development, performance benchmarking, a rapid release schedule, and continuous integration (a unit-test sketch follows the list).
Mentor and advance the development of your colleagues.
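To give a feel for the unbounded pipelines above, here is a minimal sketch of a streaming Beam job that reads events from Pub/Sub, windows them, and appends rows to BigQuery. The project, topic, table, and schema names are placeholders rather than real resources, and a production job would typically run on the Dataflow runner.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms.window import FixedWindows


def run():
    # streaming=True marks the pipeline as unbounded (Pub/Sub is an
    # unbounded source); on GCP this would run on the Dataflow runner.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadEvents" >> beam.io.ReadFromPubSub(
                topic="projects/example-project/topics/events")  # placeholder topic
            | "DecodeJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            | "WindowIntoMinutes" >> beam.WindowInto(FixedWindows(60))
            | "WriteRows" >> beam.io.WriteToBigQuery(
                "example-project:analytics.events",  # placeholder table
                schema="user_id:STRING,event_type:STRING,ts:TIMESTAMP",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()
```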
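Airflow work means defining DAGs of tasks. A minimal sketch, assuming Airflow 2.4+; the DAG id, schedule, and task bodies are illustrative stand-ins.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("extracting")  # stand-in; a real task might pull from an API or Pub/Sub


def load():
    print("loading")  # stand-in; a real task might write to BigQuery


with DAG(
    dag_id="example_daily_etl",  # placeholder name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
):
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task  # load runs only after extract succeeds
```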
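On the machine learning side, here is a sketch of a model-fitting step with PySpark's MLlib (pyspark.ml), the kind of job that would typically run on Dataproc; the bucket paths and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("model-fit-sketch").getOrCreate()

# Hypothetical training data exported to Cloud Storage; the path and
# columns are placeholders, not a real schema.
df = spark.read.parquet("gs://example-bucket/exports/training/")

# Pack the raw numeric columns into the single vector column MLlib expects.
assembler = VectorAssembler(
    inputCols=["sessions_7d", "purchases_30d", "days_since_signup"],
    outputCol="features",
)
train = assembler.transform(df).select("features", "label")

# Fit a simple classifier on the cluster and persist the model.
model = LogisticRegression(maxIter=20).fit(train)
model.write().overwrite().save("gs://example-bucket/models/example_lr")
```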
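And test-driven development here means ordinary pytest-style unit tests written before the code they exercise. The helper below is a made-up example, shown together with the test that would be written first.

```python
# A pipeline helper and its pytest-style unit test; under TDD the test
# below is written first. Both the helper and the data are illustrative.

def dedupe_events(events, key):
    """Keep only the latest event (by 'ts') for each value of `key`."""
    latest = {}
    for event in events:
        k = event[key]
        if k not in latest or event["ts"] > latest[k]["ts"]:
            latest[k] = event
    return list(latest.values())


def test_dedupe_events_keeps_latest_per_key():
    events = [
        {"user_id": "a", "ts": 1, "event_type": "view"},
        {"user_id": "a", "ts": 2, "event_type": "purchase"},
        {"user_id": "b", "ts": 1, "event_type": "view"},
    ]
    result = dedupe_events(events, key="user_id")
    assert sorted(e["user_id"] for e in result) == ["a", "b"]
    assert next(e for e in result if e["user_id"] == "a")["ts"] == 2
```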