GCP Data Migration Lead
| Ref No.: | 25-01356 |
| Location: | Chicago, Illinois |
| Position Type: | Contract |
Role Name: GCP Data Migration Lead
Location: Chicago, IL - onsite
Contract Role
Job description:
Data Migration Lead
- Design and build data pipelines: Develop and maintain reliable, scalable batch and real-time data pipelines using GCP tools such as Cloud Dataflow (based on Apache Beam), Cloud Pub/Sub, and Cloud Composer (managed Apache Airflow).
- Create and manage data storage solutions: Implement data warehousing and data lake solutions using GCP products such as BigQuery and Cloud Storage, along with transactional or NoSQL databases such as Cloud SQL or Bigtable.
- Ensure data quality and integrity: Develop and enforce procedures for data governance, quality control, and validation throughout the data pipeline so that data is accurate and reliable.
- Optimize performance and cost: Monitor data infrastructure and pipelines to identify and resolve performance bottlenecks, keeping all data solutions cost-effective and scalable.
- Collaborate with other teams: Work closely with data scientists, analysts, and business stakeholders to gather requirements and understand data needs, translating them into technical specifications.
- Automate and orchestrate workflows: Automate data processes and manage complex workflows using tools such as Cloud Composer.
- Implement security: Design and enforce data security and access controls using GCP Identity and Access Management (IAM) and security best practices.
- Maintain documentation: Create and maintain clear documentation for data pipelines, architecture, and operational procedures.
Required Skills & Qualifications:
- Must hold a GCP Professional or Associate certification; candidates with a GCP Professional certification will be given preference
- 8+ years of data engineering experience developing large data pipelines in very complex environments
- Very strong SQL skills and the ability to build complex transformation pipelines using a custom ETL framework in a Google BigQuery environment
- Very strong understanding of data migration methods and tooling, with hands-on experience in at least three (3) data migrations to Google Cloud
- Google Cloud Platform: Hands-on experience with key GCP data services is essential, including:
  - BigQuery: For data warehousing and analytics.
  - Cloud Dataflow: For building and managing data pipelines.
  - Cloud Storage: For storing large volumes of data.
  - Cloud Composer: For orchestrating workflows.
  - Cloud Pub/Sub: For real-time messaging and event ingestion.
  - Dataproc: For running Apache Spark and other open-source frameworks.
- Programming languages: Strong programming proficiency, most commonly in Python, is mandatory. Experience with Java or Scala is preferred.
- SQL expertise: Advanced SQL skills for data analysis, transformation, and optimization within BigQuery and other databases.
- ETL/ELT: Deep knowledge of Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT) processes.
- Infrastructure as Code (IaC): Experience with tools like Terraform for deploying and managing cloud infrastructure.
- CI/CD: Familiarity with continuous integration and continuous deployment (CI/CD) pipelines using tools such as GitHub Actions or Jenkins.