Data Engineer - Bilingual Mandarin required

Cwill · Cary, NC

Full-time · Hybrid · 2d ago

Python · SQL

About the role

CWILL (pronounced "quill") is the post-purchase and retention suite built for Shopify. With strong product-market fit and expanding US operations, we're building out our security and compliance capabilities to meet global data privacy standards. Learn more: www.cwill.com

I. Basic Information

Work Authorization: Green Card / U.S. Citizen required (we do not sponsor)
Job Title: Data Engineer
Focus Areas: Data ingestion, data lakehouse, data warehouse, data platform, data service APIs, data quality & engineering agent development
Level: Junior to mid-level with high growth potential
Location: United States - on-site, remote, or hybrid (per company requirements)
Employment Type: Full-time
Collaborating Teams: CWILL Data Engineering, Data Analytics, Business, Product, and Technology teams
Language: English required; Mandarin is a strong plus
Cross-Timezone Work: Must maintain a regular collaboration window with the China team (approx. 2 hrs/day overlap needed); strong async communication and documentation skills required
Collaboration Frequency: Every 1-2 days; approx. 2 hrs per session. Candidates in western US time zones preferred for scheduling.

II. Role Positioning

CWILL is building data infrastructure to support business operations, product capabilities, customer service, analytics, and intelligent applications. As a US-side data engineer, you will participate in multi-source data ingestion, data lakehouse and warehouse development, data quality governance, data platform capability building, and exploration of AI Agent engineering automation.

We are looking for candidates with a solid foundation in SQL, Python, and data engineering: someone who can, with guidance from the existing data team, progressively take ownership of data ingestion, modeling, quality, and service tasks, while collaborating effectively with the China-based data engineering, analytics, and business teams.

This is not a pure data analysis, BI reporting, or one-off scripting role.
It is a comprehensive data engineering position focused on data integration, data warehouse development, data platform capabilities, data services, and engineering automation.

III. Role Mission

Through stable, well-structured, and scalable data engineering capabilities, help the company unify, govern, model, and serve data scattered across business systems, SaaS platforms, external channels, and internal systems - improving the usability, accuracy, timeliness, and reusability of CWILL's data assets.

This role is expected to continuously drive:
- More standardized data source ingestion
- Clearer data lakehouse and warehouse structure
- More automated data quality monitoring
- More platform-driven data service capabilities
- Progressive adoption of agent-based and automated approaches for data development, troubleshooting, documentation, and quality checks

IV. Key Responsibilities

1. Data Ingestion & Pipeline Development
- Ingest data from internal and external business systems, third-party platforms, SaaS products, and external data sources; handle data collection, sync, cleansing, and loading
- Participate in building offline and real-time data pipelines using SeaTunnel, Kafka, Flink, Spark, or similar technologies to improve ingestion stability and processing efficiency
- Handle practical challenges in data sync: authentication, pagination, rate limiting, failure retry, incremental sync, backfill, schema changes, and task anomalies

2. Data Warehouse & Data Modeling
- Participate in layered data warehouse development across ODS, DWD, DWS, and ADS layers; build and maintain data models
- Support business domain modeling, metric standardization, shared data model development, and core table maintenance
- Optimize data organization and query performance on OLAP engines such as Doris to provide stable data support for product, operations, growth, customer success, and management analytics

3. Data Quality & Data Governance
- Build and maintain data quality rules for core data pipelines; ensure data accuracy, completeness, consistency, and timeliness
- Participate in data validation, anomaly detection, alerting, and issue resolution; help improve stability of critical data pipelines
- Contribute to data governance capabilities including DataHub or similar tools; improve metadata management, data lineage, data asset catalog, and data standards

4. Data Platform & Data Services
- Participate in building data platform capabilities including data development, task scheduling, monitoring, quality management, governance, and service delivery modules
- Use tools such as DolphinScheduler and StreamPark for task management, scheduling orchestration, and real-time task operations
- Support the data service layer by delivering standardized APIs, metric services, and data capabilities to internal systems, analytics applications, and business tools
- Support underlying data for tools like Superset; ensure data availability for BI dashboards, metric boards, and business monitoring

5. AI Agent & Engine
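
For candidates wondering what the data sync challenges listed under "Data Ingestion & Pipeline Development" look like in practice, here is a minimal Python sketch of three of them together: pagination, failure retry with exponential backoff, and incremental sync driven by a watermark. It is illustrative only and does not describe CWILL's actual pipelines. Every name in it (TransientError, fetch_with_retry, incremental_sync, fake_fetch_page, the updated_at field) is a hypothetical chosen for the example, and the data source is an in-memory fake so the snippet runs without a network.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, str]
# A page fetcher takes (cursor, updated_since) and returns (records, next_cursor).
FetchPage = Callable[[Optional[str], str], Tuple[List[Record], Optional[str]]]


class TransientError(Exception):
    """Hypothetical stand-in for timeouts or rate-limit responses."""


def fetch_with_retry(fetch_page: FetchPage, cursor: Optional[str],
                     updated_since: str, max_attempts: int = 4,
                     base_delay: float = 0.5,
                     sleep: Callable[[float], None] = time.sleep):
    """Fetch one page, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch_page(cursor, updated_since)
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the scheduler
            sleep(base_delay * 2 ** (attempt - 1))


def incremental_sync(fetch_page: FetchPage, watermark: str,
                     sleep: Callable[[float], None] = time.sleep):
    """Pull every record updated after `watermark`, page by page.

    Returns the records and the new watermark to persist for the next run.
    Timestamps are assumed to be same-format ISO-8601 UTC strings, so plain
    string comparison orders them correctly.
    """
    records: List[Record] = []
    cursor: Optional[str] = None
    while True:
        page, cursor = fetch_with_retry(fetch_page, cursor, watermark, sleep=sleep)
        records.extend(page)
        if cursor is None:  # the source signals the last page
            break
    new_watermark = max((r["updated_at"] for r in records), default=watermark)
    return records, new_watermark


# --- Demo against an in-memory fake source (assumption: no real API) ---
ROWS = [
    {"id": "1", "updated_at": "2024-01-01T00:00:00Z"},
    {"id": "2", "updated_at": "2024-01-02T00:00:00Z"},
    {"id": "3", "updated_at": "2024-01-03T00:00:00Z"},
    {"id": "4", "updated_at": "2024-01-04T00:00:00Z"},
]
failures = {"remaining": 1}  # the first call fails once to exercise the retry


def fake_fetch_page(cursor: Optional[str], updated_since: str,
                    page_size: int = 2):
    if failures["remaining"] > 0:
        failures["remaining"] -= 1
        raise TransientError("simulated rate limit")
    fresh = [r for r in ROWS if r["updated_at"] > updated_since]
    start = int(cursor or 0)
    next_start = start + page_size
    next_cursor = str(next_start) if next_start < len(fresh) else None
    return fresh[start:next_start], next_cursor


synced, watermark = incremental_sync(
    fake_fetch_page, "2024-01-01T00:00:00Z", sleep=lambda s: None
)
print([r["id"] for r in synced], watermark)
```

Persisting the returned watermark only after a successful load is one common way to make reruns safe: a failed run simply starts again from the previous watermark. A backfill can reuse the same function by passing an older watermark.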

Interested in this role?

Apply on the company's website.
