Role Name: GCP Data Migration Lead
Location: Chicago, IL – onsite
Contract Role
Key Responsibilities:
- Serve as Data Migration team leader for a large data and application migration to the Google Cloud
platform.
- Own the team's end-to-end data architecture and migration planning, supporting both the migration
effort and future-state client efforts on the Google Cloud platform.
- Collaborate closely with the overall Google Cloud migration team leadership to deliver a successful
application and data migration.
- Design and build data pipelines: Develop and maintain reliable and scalable batch and real-time data
pipelines using GCP tools such as Cloud Dataflow (based on Apache Beam), Cloud Pub/Sub, and
Cloud Composer (managed Apache Airflow).
- Create and manage data storage solutions: Implement data warehousing and data lake solutions
using GCP products like BigQuery, Cloud Storage, and other transactional or NoSQL databases such
as Cloud SQL or Bigtable.
- Ensure data quality and integrity: Develop and enforce procedures for data governance, quality
control, and validation throughout the data pipeline to ensure data is accurate and reliable.
- Optimize performance and cost: Monitor data infrastructure and pipelines to identify and resolve
performance bottlenecks, ensuring that all data solutions are cost-effective and scalable.
- Collaborate with other teams: Work closely with data scientists, analysts, and business stakeholders
to gather requirements and understand data needs, translating them into technical specifications.
- Automate and orchestrate workflows: Automate data processes and manage complex workflows
using tools like Cloud Composer.
- Implement security: Design and enforce data security and access controls using GCP Identity and
Access Management (IAM) and security best practices.
- Maintain documentation: Create and maintain clear documentation for data pipelines, architecture,
and operational procedures.
Required Skills & Qualifications:
- Must have a GCP Professional or Associate certification; candidates with GCP
Professional certifications will be given higher preference
- 8+ years of data engineering experience developing large data pipelines in very complex
environments
- Very strong SQL skills and the ability to build very complex transformation data pipelines using a custom
ETL framework in a Google BigQuery environment
- Very strong understanding of data migration methods and tooling, with hands-on experience in at
least three (3) data migrations to Google Cloud
- Google Cloud Platform: Hands-on experience with key GCP data services is essential, including:
  - BigQuery: For data warehousing and analytics.
  - Cloud Dataflow: For building and managing data pipelines.
  - Cloud Storage: For storing large volumes of data.
  - Cloud Composer: For orchestrating workflows.
  - Cloud Pub/Sub: For real-time messaging and event ingestion.
  - Dataproc: For running Apache Spark and other open-source frameworks.
- Programming languages: Strong proficiency in at least one programming language, most commonly
Python, is mandatory. Experience with Java or Scala is preferred.
- SQL expertise: Advanced SQL skills for data analysis, transformation, and optimization within
BigQuery and other databases.
- ETL/ELT: Deep knowledge of Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT)
processes.
- Infrastructure as Code (IaC): Experience with tools like Terraform for deploying and managing cloud
infrastructure.
- CI/CD: Familiarity with continuous integration and continuous deployment (CI/CD) pipelines using
tools such as GitHub Actions or Jenkins.