PRINCIPAL DATA ENGINEER/ ARCHITECT
Company: Lancesoft
Location: San Diego
Posted on: November 18, 2024
Job Description:
Temp to perm
Full time - Pay $70 to $100/hr based on experience
Remote

As the Senior Software Engineer, you will lead a team of data engineers in designing, building, and maintaining high-performance software systems to manage the analytical data pipelines that fuel the organization's data strategy, using software engineering best practices. Beyond technical expertise, you will also serve as a change leader, guiding teams through the adoption of new tools, technologies, and workflows to improve data management and processing.

This position requires extensive hands-on data system design and coding experience, as well as the development of modern data pipelines (AWS Step Functions, Prefect, Airflow, Luigi, Python, Spark, SQL) and associated code in AWS.

You will work closely with stakeholders across the business to understand their data needs, ensure scalability, and foster a culture of innovation and learning within the data engineering team and beyond.

Key Responsibilities:
- Own the overall architecture of a specific module within a product (e.g., data ingestion, near-real-time data processing), lead its design, and assist with implementation, taking system characteristics into account to achieve optimal performance, reliability, and maintainability.
- Provide technical guidance to team members, ensuring they are
working towards the product's architectural goals.
- Create and manage RFCs (Requests for Comments), ADRs (Architecture Decision Records), design notes, and technical documentation for your module, following the architecture governance processes.
- Lead a team of data engineers, providing mentorship, setting
priorities, and ensuring alignment with business goals.
- Architect, design, and build scalable data pipelines for
processing large volumes of structured and unstructured data from
various sources.
- Collaborate with software engineers, architects, and product
teams to design and implement systems that enable real-time and
batch data processing at scale.
- Be the go-to person for PySpark-based solutions, ensuring
optimal performance and reliability for distributed data
processing.
- Ensure that data engineering systems adhere to the best data
security, privacy, and governance practices in line with industry
standards.
- Perform code reviews for the product, ensuring adherence to
company coding standards and best practices.
- Develop and implement monitoring and alerting systems to ensure
timely detection and resolution of data pipeline failures and
performance bottlenecks.
- Act as a champion for new technologies, helping ease transitions and addressing concerns or resistance from team members.

Ideal Candidate:
- Experience leading a data engineering team with a strong focus on software engineering principles such as KISS, DRY, and YAGNI.
- Candidates MUST have experience owning large, complex system architectures, along with hands-on experience designing and implementing data pipelines across large-scale systems.
- Experience implementing and optimizing data pipelines with AWS
is a must.
- Production delivery experience in cloud-based PaaS Big Data technologies (EMR, Snowflake, Databricks, etc.)
- Experienced in multiple cloud PaaS persistence technologies, with in-depth knowledge of cloud-based ETL offerings and orchestration technologies (AWS Step Functions, Airflow, etc.)
- Experienced in stream-based and batch processing, applying
modern technologies.
- Working experience with distributed file systems (S3, HDFS,
ADLS), table formats (HUDI, Iceberg), and various open file formats
(JSON, Parquet, CSV, etc.)
- Strong programming experience in PySpark, SQL, Python,
etc.
- Database design skills including normalization/de-normalization
and data warehouse design.
- Knowledge and understanding of relevant legal and regulatory
requirements, such as SOX, PCI, HIPAA, Data Protection.
- Experience in the healthcare industry, a plus.
- A collaborative and informative mentality is a must!

Toolset:
- AWS; AWS Certified Data Engineer and AWS Certified Solutions Architect certifications preferred.
- Proficiency in at least one programming language such as C#, Go, or JavaScript/ReactJS.
- Spark / Python / SQL.
- Snowflake / Databricks / Synapse / MS SQL Server.
- ETL / orchestration tools (Step Functions, dbt, etc.)
- ML / Notebooks.

Education and experience required:
- Bachelor's or Master's degree in Computer Science, Information Systems, or an engineering field, or equivalent relevant experience.
- 10+ years of related experience in developing data solutions and data movement.

This role can be REMOTE.