Your mission
Make sure the system never lies, and rarely fails, no matter the complexity
Dunia is building AI-native, automated laboratories for materials discovery. Our systems combine hardware, software, chemistry, and automation in tightly coupled workflows. In this environment, reliability and quality are not support functions. They are foundational.
As Reliability & Test Engineer, you will own the design and enforcement of reliability, testing, and safety practices across Dunia’s facilities. Your role is to minimize downtime, prevent failure propagation, and ensure that every experiment, run, and dataset meets a consistently high quality bar.
This role exists to make sure the system works every time, not just most of the time.
Your tasks will include:
Design reliability into the system
- Identify where failures are likely to occur before they happen
- Eliminate single points of failure across hardware, automation, and workflows
- Build recovery strategies that minimize impact when things go wrong
- Define tests that reflect real operating conditions, not ideal scenarios
- Ensure consistency and reproducibility across runs and systems
- Prevent subtle degradation from quietly eroding quality
- Anticipate how increased automation changes risk
- Design safeguards that work without constant human intervention
- Take ownership of safety as a system property, not a checklist
- Investigate incidents and near-misses deeply
- Translate findings into durable system improvements
- Raise the reliability bar across the organization