Lead Data Engineer

Apply now

Pendulum is a venture-backed, seed-stage startup that empowers companies and governments to detect, deter, and counteract the impacts of harmful online narratives with machine-learning-enabled tools. We are seeking a Lead Data Engineer to serve as a technical leader and subject matter expert, leading the development of scalable, robust data workflows for our data platform, which empowers our customers to define and track narratives and social media discussions relevant to their business.

As one of the first hires at a venture-backed company, you will have the opportunity to grow with the company and to make a real impact with enormous upside.

Your Responsibilities

  • Partner with technical and non-technical co-workers across the company to elicit their business and data requirements
  • Architect, build, and launch large, efficient, reliable, scalable, and secure data pipelines in partnership with Product Engineering
  • Research new problem spaces, disambiguate them, define tasks, and lead team members to high-impact project delivery
  • Design and develop data workflows for both immediate and future use cases
  • Build and manage data workflows in the cloud using Apache Airflow
  • Optimize workflows and ETL/ELT jobs
  • Integrate existing data platforms with AWS services like Amazon S3, Amazon Redshift, Amazon EMR, AWS Batch, and Amazon SageMaker
  • Design and develop data resources to enable self-serve data consumption
  • Build tools for testing workflow DAGs and validating data
  • Define and share best practices on data modeling, workflow management, and ETL tuning
  • Maintain workflows and troubleshoot complex data, deployment, and security issues
  • Serve as a trusted technical leader and subject matter expert, providing technical guidance and consultation to team members and stakeholders across the company
  • Coach and mentor team members

Required Qualifications

  • A solid track record of driving complex projects to success by leading designs, justifying technical trade-offs, and persistently solving hard challenges
  • Experience with workflow management solutions like Airflow
  • Strong knowledge of AWS Cloud services
  • Experience in schema design and dimensional data modeling
  • Strong skills in SQL and Python coding
  • Strong skills in distributed system optimization (e.g. Spark, Presto, Hadoop, Hive)
  • Strong track record of building large-scale data sets, preferably at petabyte scale and beyond
  • Ability to perform basic statistical analysis to inform business decisions
  • Experience with Docker and with engineering development tools like Git
  • Excellent communication skills, both written and verbal

Preferred Qualifications

  • Knowledge of Terraform or AWS CloudFormation
  • Experience with AWS Lake Formation
  • Experience with notebook-based Data Science workflows
  • Familiarity with A/B testing and machine learning

About Pendulum

Pendulum is an information forensics company that classifies non-textual, publicly available content to identify the source, message, audience, and evolution of harmful narratives. Our tools help detect, analyze, track, and counter harmful narratives and disinformation flowing across social media platforms, combating the threats and chaos sown by bad actors on the social internet. A sneak peek at our capabilities can be found in reports published on the Pendulum website.

Pendulum values inclusion and diversity and aspires to be among the tech industry’s most inclusive work environments. We are committed to diversity in our workforce and are a proud equal opportunity employer. We do not make hiring or employment decisions on the basis of race, color, religion, creed, gender, national origin, age, sex, gender expression or identity, sexual orientation, disability status, marital status, or veteran status.

Apply now

Interested applicants should include a resume and cover letter.