We’re looking for a senior engineer with deep hands-on expertise in Rook (Ceph on Kubernetes) to help stabilize, and potentially redesign, a production object storage stack. This is a part-time, long-term opportunity.
What we need help with
This project is focused on Rook-based architecture and production hardening, including:
• diagnosing root causes of instability and performance collapse
• designing a reliable Rook/Ceph architecture for Kubernetes
• improving upgrade safety, operational stability, and performance
• advising whether we should stay on Ceph (via Rook) or migrate away
We are also open to alternatives (e.g., managed object storage like Wasabi), but the primary goal is to engage someone who can own Rook/Ceph decisions end-to-end.
Environment (current)
• ~10 Kubernetes nodes
• VMs running on Proxmox
Current challenge
We are currently running a self-hosted Ceph-based object storage setup that becomes unstable under:
• heavy read/write traffic (S3 + RADOS writes)
• rebalancing events
• ongoing Ceph upgrades
Additional context:
• No namespacing in the current design
• ~0.5TB stored data
• Current infrastructure costs: ~$15–20k/month (Leaseweb)
Requirements
• Strong production experience with Rook (Ceph on Kubernetes)
• Deep understanding of Ceph internals (rebalancing, OSD behavior, CRUSH, recovery tuning, etc.)
• Comfortable owning architecture decisions in real production systems
• Fluent English (daily communication)








Submit your CV, LinkedIn, and GitHub via the form. We’ll review your profile.
If your skills align, we'll reach out for a quick conversation to understand your experience and project preferences.
Once selected, we’ll match you with a client project that fits your expertise. A brief onboarding ensures you're set up with our tools and ready to start.