Data Engineer Trainee

Date: Jan 31, 2024

Location: MADRID, ES, 28037

Company: HOLCIM Group


The Data Engineer will join a growing team of analytics experts responsible for expanding and optimizing data, data infrastructure, data processing and data wrangling.
The position is responsible for creating data integration frameworks, setting up data flow solutions, gathering and cleansing data, understanding the meaning of datasets and their compatibility, and implementing business rules.
The Data Engineer will support data scientists, product managers and analysts on data initiatives and will ensure that optimal data delivery is consistent across projects. The Data Engineer must be self-directed and comfortable supporting the data needs of multiple teams, systems and products. This role is expected to optimize or even redesign our data infrastructure to support our next generation of products, services and data initiatives. The Data Engineer will also follow the directives defined by Cloud Architects and Enterprise Architects.
The output produced by the Data Engineer will be used to generate insights that support strategic decision-making, along with ideas to automate, innovate, and enhance the measurement and reporting process.
Reporting Line: Direct report to the Department Manager



  • Work with stakeholders throughout the organization to identify opportunities for leveraging company data to drive business solutions.
  • Create and schedule data flow tasks and automated data validation methods, using the pipelines defined in Jenkins and following the recommendations and standards defined by the End to End (E2E) team.
  • Work with the current architecture defined in AWS (Postgres database, OpenShift, AWS services…)
  • Assemble, gather and cleanse data sets that meet business requirements.
  • Query and/or integrate data using SQL, SQL-like, ELT or ETL tools and techniques.
  • Create data models by delivering calculations, aggregations and derivations from input data.
  • Implement the business rules required to meet business needs.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and ‘big data’ technologies.
  • Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
  • Mine and analyze data from company databases or external data to drive optimization and improvement of product development.
  • Develop custom data models and algorithms to apply to data sets. 
  • Find acceptable alternatives that satisfy the needs of multiple stakeholders.
  • Develop processes and tools to monitor and analyze model performance and data accuracy.



  • University degree in Computer Science (strongly preferred), Statistics or Mathematics. A Master’s degree in big data / data analytics is strongly preferred.
  • Good understanding of data integration practices, including how to schedule data flow tasks in AWS, the use of APIs, cloud connectors, etc.
  • Good understanding of Java and JavaScript would be an advantage.
  • Good knowledge of statistical computing languages (R, Python, SQL, etc.) to manipulate data and draw insights from large data sets.
  • Good knowledge of data analytics architectures and standards, including data warehouses, master data management, ETL, OLAP, data quality management, advanced analytics, BI, visualization and service execution rigor / discipline
  • Good understanding of collaborative development environments and the use of Git.
  • Good understanding of deployment tools (Docker, Kubernetes…)
  • Good knowledge of development tools such as Anaconda, Jupyter, AWS SageMaker, Visual Studio…
  • Good knowledge of data integration techniques and data management
  • General knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages/drawbacks.
  • General knowledge of statistical and data mining techniques: GLM/Regression, Random Forest, Boosting, Trees, etc.
  • General knowledge of visualizing/presenting data for stakeholders using QlikView, Qlik Sense, Business Objects, Angular, Python (Jupyter), etc.
  • Pursues everything with energy, drive and the need to finish. Persists in accomplishing objectives despite obstacles and setbacks.
  • Adapts quickly to changing resource requirements and enjoys multitasking, applying knowledge of the organization (internal structures, processes and culture) to advance multiple objectives.
  • Pushes self and helps others achieve results. Partners with others to get work done. Works cooperatively with others across the organization to achieve shared objectives.
  • Works effectively with third parties (including offshore and nearshore service providers).
  • Assists in prioritizing and executing tasks in a high-pressure environment.
  • Excellent written, oral and interpersonal communication skills in English.
  • Spanish, French, Arabic, German and/or other languages used in the countries in which we operate would be an advantage.
  • Ability to effectively work in a team-oriented and collaborative environment. 
  • Highly motivated and attentive to detail. 



  • Values inclusion in day-to-day responsibilities by respecting others’ perspectives and convictions, engaging with others’ opinions, and creating a safe environment where people, ideas and opinions are valued within the team and among “internal” customers and external partners.
  • Respects and takes into consideration diversity by valuing different worldviews, challenges and cultures that represent all walks of life and all backgrounds.
  • Is sensitive to how people, cultures and organizations function. Deals comfortably with organizational politics. Steers through the organizational maze to get things done.
  • Demonstrates a positive mindset, consistently identifying highlights.
  • Shows a can-do attitude in good and bad times and acts as a role model in terms of ethics and self-awareness.