Data Engineer
Before applying, read the Hiring Process so you know what you’re signing up for.
Read this first (self-selection)
Stop reading if:
- You’re optimizing for comfort, stability, or a “stay in your lane” role.
- You want perfectly defined specs, endless alignment meetings, and a low-responsibility environment.
- You equate performance with effort instead of outcomes.
- You dislike direct feedback, written communication, and being held accountable for what ships.
Continue only if:
- You’re a high-agency builder: you see problems, you propose solutions, you execute.
- You care about craft (clean design, maintainability, performance, security) and you’re willing to defend trade-offs.
- You want your work to hit production and matter quickly.
- You like a small, fast team where quality is non-negotiable.
About Revas
We are building RevasOS, the WorkOS for the next generation of business.
We believe the modern software stack is broken—too many tools, too little intelligence, and zero privacy. We are fixing this by centralizing the chaos.
RevasOS is a secure, sovereign Cloud SaaS designed to be the central nervous system of a company. We combine data-first strategies with AI-native workflows to create a platform that is powerful enough to run a business, but private enough to own it. We don't build "productivity tools"; we build the engine room.
We are experts in distributed cloud systems and ML/AI workflows.
- Stage: Bootstrapped.
- Team: 2 people. Nicolò manages technology and the company; Davide manages design, marketing, sales, and customers.
- Location: Remote. We also love meeting up to work together in person.
The role
You’re joining as a Data Engineer.
Your mission: turn messy inputs into trusted data products and production-grade AI workflows.
Baseline expectations (everyone, every role)
Regardless of title, you’re expected to:
- Collaborate closely with customers. You will seek clarity from real users, validate assumptions, and treat customer feedback as a first-class input (not an interruption).
- Write and maintain documentation. If you change the system, you change the docs. We keep the knowledge base (KB) clean, structured, and easy to navigate.
- Be an elite human engineer. High standards of professionalism: reliability, ownership, clear written communication, ethical behavior at work, and zero tolerance for creating unnecessary work for others.
What you’ll do
Roughly 60% system design and coding, 30% documentation, 10% customer management.
- Build and maintain ingestion pipelines from product, third-party APIs, and operational systems (see the sketch after this list).
- Model data for analytics and operations (definitions, lineage, documentation, quality checks).
- Design retrieval systems (RAG/GraphRAG) with measurable quality and operational guardrails.
- Build LLM workflows that are observable: tracing, evaluation, prompt/version control, and rollbacks.
- Own data reliability: SLAs, monitoring, backfills, cost controls, and incident response.
- Automate internal operations with integrations (APIs, webhooks, event-driven jobs).
- Build and validate backup and restore architectures for data and critical artifacts.
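To make “production-grade” concrete, here’s a minimal sketch of the kind of idempotent ingestion step we mean. It’s illustrative only: the table names, schema, and SQLite backend are hypothetical stand-ins for whatever the real pipeline uses.

```python
# Illustrative sketch only: table names, schema, and the SQLite backend are
# hypothetical. The point is the shape: re-running the same batch (after a
# retry or during a backfill) must not duplicate rows or corrupt state.
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS events (event_id TEXT PRIMARY KEY, payload TEXT);
CREATE TABLE IF NOT EXISTS ingested_batches (batch_id TEXT PRIMARY KEY, loaded_at TEXT);
"""

def ingest_batch(conn: sqlite3.Connection, batch_id: str, rows: list[dict]) -> None:
    """Load one batch of events; safe to re-run with the same batch_id."""
    with conn:  # one transaction: the whole batch lands, or none of it does
        if conn.execute("SELECT 1 FROM ingested_batches WHERE batch_id = ?",
                        (batch_id,)).fetchone():
            return  # idempotency: a retried or replayed batch is a no-op
        conn.executemany(
            # upsert on the natural key, so late-arriving corrections overwrite
            "INSERT INTO events (event_id, payload) VALUES (:event_id, :payload) "
            "ON CONFLICT(event_id) DO UPDATE SET payload = excluded.payload",
            rows,
        )
        conn.execute("INSERT INTO ingested_batches VALUES (?, ?)",
                     (batch_id, datetime.now(timezone.utc).isoformat()))

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    batch = [{"event_id": "e1", "payload": "signup"}]
    ingest_batch(conn, "2024-01-01T00", batch)
    ingest_batch(conn, "2024-01-01T00", batch)  # replay: still exactly one row
    assert conn.execute("SELECT COUNT(*) FROM events").fetchone()[0] == 1
```

The design point: a batch ledger plus a natural-key upsert makes retries and backfills boring, which is exactly what we want.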
What you won’t do
- You won’t be judged on “cool demos” that can’t run twice.
- You won’t be expected to hand-curate data manually as a long-term solution.
- You won’t be a pure dashboard builder; the focus is data products and pipelines.
What success looks like
In the first 30 days
- Map the data landscape: sources, sinks, definitions, and the most fragile/expensive flows.
- Ship one reliability win (freshness monitoring, schema drift detection, or idempotent backfills); a minimal sketch follows.
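For the flavor, a deliberately tiny freshness check. The SLA thresholds and table names are invented for illustration:

```python
# Illustrative sketch only: SLA thresholds and table names are invented.
# Freshness monitoring means noticing when a source quietly stops delivering.
from datetime import datetime, timedelta, timezone

FRESHNESS_SLA = {
    "events": timedelta(hours=1),         # hypothetical: product events land hourly
    "crm_contacts": timedelta(hours=24),  # hypothetical: CRM sync is daily
}

def stale_tables(last_loaded: dict[str, datetime]) -> list[str]:
    """Return tables whose most recent load is older than their SLA."""
    now = datetime.now(timezone.utc)
    never = datetime.min.replace(tzinfo=timezone.utc)  # treat "never loaded" as stale
    return [t for t, max_age in FRESHNESS_SLA.items()
            if now - last_loaded.get(t, never) > max_age]

now = datetime.now(timezone.utc)
print(stale_tables({"events": now - timedelta(hours=3),       # breached: 3h > 1h
                    "crm_contacts": now - timedelta(hours=2)}))  # fine: 2h < 24h
```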
By 90 days
- Own a complete pipeline end-to-end (ingest → transform → serve) with quality checks and alerts.
- Deliver one production AI capability with strong guardrails (caching, rate limits, PII controls, safe fallbacks).
By 180 days
- Lead a major initiative: warehouse architecture evolution, unified event model, or scalable knowledge system.
- Establish a durable “AI shipping standard”: evals in CI, versioned prompts, traceability, and cost observability (a tiny sketch of the CI-evals idea follows).
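What “evals in CI” means in practice, as a deliberately tiny sketch. The prompt registry, eval case, and model stub are hypothetical; in CI the stub would be a real model call or a cached trace:

```python
# Illustrative sketch only: the prompt registry, eval case, and model stub
# are hypothetical. The point: a prompt change must pass regression checks
# (run by pytest in CI) before it ships.
PROMPTS = {  # versioned prompts live in the repo, not in someone's head
    "summarize@v3": "Summarize this support ticket in one sentence: {ticket}",
}

EVAL_CASES = [
    {"ticket": "Login fails with a 500 error after password reset.",
     "must_mention": "log in"},
]

def run_model(prompt: str) -> str:
    # Stand-in for the real LLM call; CI would hit the model or a cached trace.
    return "Users cannot log in after resetting their password."

def test_summarize_v3_regressions():
    for case in EVAL_CASES:
        out = run_model(PROMPTS["summarize@v3"].format(ticket=case["ticket"]))
        assert case["must_mention"] in out.lower(), f"regression on: {case}"
```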
The environment (what it feels like)
This is a high-ownership environment.
- We prefer small teams, clear goals, and shipping.
- We move fast, but we don’t worship chaos.
- We value written clarity and directness.
- We expect you to be resilient: reality is messy, production is real, customers have opinions.
Requirements
You should have most of these:
- Strong Python skills for data engineering (pipelines, services, tooling).
- Proven experience shipping data pipelines in production: orchestration, retries, idempotency, backfills.
- Experience with cloud data systems (warehouses, object storage, managed compute).
- Familiarity with LLM workflows and practical constraints (latency, cost, caching, hallucinations).
- Hands-on experience with LangChain and/or LangGraph (or equivalent workflow orchestration).
- Understanding of RAG patterns: chunking, embeddings, retrieval strategies, evaluation (a chunking sketch follows this list).
- Ability to build integrations and automation reliably (APIs, webhooks, queues, scheduling).
- Strong operational discipline: monitoring, data quality checks, and incident handling.
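Since we mention RAG patterns above: one concrete building block, sketched with hypothetical defaults. Real chunking is usually token- and structure-aware; this only shows the overlap idea:

```python
# Illustrative sketch only: sizes and overlap are hypothetical defaults.
def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows for embedding."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    # Overlap keeps context that straddles a boundary retrievable from both sides.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

parts = chunk("x" * 2000, size=800, overlap=100)
print(len(parts), [len(p) for p in parts])  # 3 chunks: 800, 800, 600
```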
If you don’t have every single requirement but you’re strongly aligned with our culture and mission, and genuinely interested in this role, apply anyway.
Nice to have
- GraphRAG experience and knowledge graph modeling.
- MCP and/or A2A ecosystem experience (tooling, agent protocols, orchestration).
- Experience with dbt, semantic layers, and metric consistency.
- Experience with streaming/event-driven data (Pub/Sub/Kafka-style patterns).
- Experience with privacy/compliance constraints (GDPR, retention, PII minimization).
- Experience designing backup/restore and disaster recovery for data systems.
Tools & stack
- Language: Python.
- LLM stack: LangChain, LangGraph; RAG/GraphRAG; agent/tool protocols (MCP, A2A).
- Data: cloud warehouse patterns, ELT/ETL, object storage, data quality checks.
- Integration/automation: API integrations, webhooks, schedulers, queues.
- Infra: Docker; IaC with Terraform and Pulumi.
- Reliability: monitoring/alerting for freshness, volume, schema drift, and cost.
- Backups: versioned storage, restore drills, documented recovery procedures.
Equal Opportunity & Accessibility
We welcome applicants regardless of gender, gender identity/expression, age, nationality, ethnicity, religion/belief, disability, sexual orientation, or family/caregiver status.
We aim to provide a safe and accessible environment, including for people living with disabilities. If you need reasonable accommodations at any step of the hiring process (e.g., assistive technologies, additional time, alternative formats), tell us and we’ll adapt.
Candidate Privacy (GDPR)
If you apply, your personal data will be processed for recruitment and selection purposes in accordance with the GDPR (Regulation (EU) 2016/679), including the Art. 13 information duties, and applicable local laws.
Candidate privacy notice:
Compensation & perks
We try to be explicit and realistic.
- Compensation philosophy: We’re bootstrapped. We may not match top-of-market packages at larger/VC-backed companies; our goal is fair compensation and continuous improvement over time.
- Contract & classification: Employment contract aligned with the applicable CCNL (level defined in the offer).
- Salary: Base annual gross salary (RAL) depends on experience and scope; we will share a range early.
- Benefits / welfare: Profit-sharing when applicable; welfare/fringe benefits via compliant tools, tax-efficient when possible and within statutory thresholds (subject to taxation and local rules).
- On-call rotation: This role may participate in a scheduled on-call rotation for production data/AI systems. Rotations are planned in advance and designed to protect rest and recovery.
- Equipment: Company laptop + needed peripherals.
- Working mode: Remote-first, flexible hours, outcome-driven.
How to apply (high-signal)
Send one email to nicolo.gardoni@revas.io with subject “Data Engineer — {Your Name}”.
Include:
- Evidence of work: GitHub, portfolio, or 1–3 projects (links).
- Hard problems: describe 2–3 of the hardest problems you solved and exactly how you solved them. For each, cover:
  - Context
  - Constraints
  - Trade-offs you considered
  - What you shipped (and how you measured success)
- One failure: the worst bug/incident you caused (or owned), what happened, and what you changed so it won’t repeat.
- Why Revas / why this role: 5–10 lines.
If this feels like “too much work”, this won’t be a fit.