whitesky cloud platform

Data is your most valuable asset.
Hardware failure is not a matter of if, but when.

The difference between data loss and data safety does not come from marketing claims or SLAs; it comes from architectural decisions made from day one. The whitesky storage platform is engineered for reality: disks fail, servers fail, networks partition, and maintenance must happen without downtime.

This page explains how whitesky delivers data safety by design, from high-level principles to the technical foundations underneath.

Engineering for reality, not best-case scenarios

Traditional storage platforms are often optimized for benchmarks and ideal conditions. whitesky starts from a different assumption: failure is expected.

Every architectural choice is shaped by this premise:

Safety takes precedence over raw efficiency
Scalability is built in, not retrofitted
Failure is isolated, not amplified
Recovery is automatic, not heroic

Architecture determines resilience. Hardware fails. Power fluctuates. Networks partition.
Your storage platform must handle these realities transparently.

whitesky storage at a glance

The whitesky storage platform is a distributed, software-defined system designed to withstand the cascading failures that plague traditional infrastructure.

Even during:

disk failures
server outages
network partitions
planned maintenance

the platform maintains data availability and consistency.

Platform layers

Compute layer: Virtual machine orchestration and workload management with seamless failover.
Block storage layer: High-performance virtual disks with distributed transaction logging and cheap snapshotting.
Object storage backend: Erasure-coded data distribution across fault domains with automatic self-healing.
Backup layer: Independent snapshot architecture using S3-compatible, immutable storage.

Multiple layers of fault tolerance

Device-level protection: beyond RAID

Traditional RAID protects disks inside a single server or controller, which leaves that server or controller as a single point of failure. whitesky uses erasure coding instead.

Data is split into fragments with calculated redundancy. These fragments are distributed across different physical disks.
Even multiple simultaneous disk failures do not result in data loss.
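
To make the idea of fragments with calculated redundancy concrete, here is a minimal sketch of the simplest possible erasure code: k data fragments plus one XOR parity fragment. It is an illustration only. The function names are hypothetical, and a single parity fragment survives just one lost fragment; tolerating several simultaneous failures, as described above, requires a code with more parity fragments (for example a Reed-Solomon style code), which this page does not specify.

```python
from __future__ import annotations
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal fragments and append one XOR parity fragment."""
    size = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(size * k, b"\0")      # pad so all fragments match in size
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    return fragments + [reduce(xor_bytes, fragments)]


def recover(fragments: list[bytes | None]) -> list[bytes]:
    """Rebuild at most one missing fragment (marked None) from the survivors."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one lost fragment")
    repaired = list(fragments)
    if missing:
        survivors = [f for f in fragments if f is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


def decode(fragments: list[bytes | None], k: int, length: int) -> bytes:
    """Reassemble the original data, repairing one lost fragment if needed."""
    return b"".join(recover(fragments)[:k])[:length]


payload = b"whitesky stores fragments"
stored = encode(payload, k=4)     # 4 data fragments + 1 parity fragment
stored[2] = None                  # simulate a failed disk
assert decode(stored, k=4, length=len(payload)) == payload
```

Each of the five fragments would live on a different physical disk, so losing any one disk leaves enough information to rebuild the data.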

Server-level protection: distributed by design

Fragments are deliberately spread across different physical servers, so no single server holds data that cannot be rebuilt from the others.

When a server fails:

data remains accessible
fragments are automatically rebuilt
redistribution happens without manual intervention

For decision makers: losing disks or servers does not mean losing data.
Failure is routine — not catastrophic.

Storage blocks: independent failure domains

Instead of monolithic storage clusters, whitesky uses storage blocks.

A storage block is a small, independent failure domain:

3 to 6 storage servers per block
always tolerates the loss of 1 full server
tolerates multiple disk failures at the same time

Why this matters

Failure containment: Failures stay inside one block. There is no cascading “blast radius” across your entire cloud.
Predictable recovery: Each block has known recovery characteristics. No surprises during incidents.
Autonomous operation: Each block operates independently. Issues in one block never impact the operational state of others.
Multiple storage blocks combine into a full cloud location, enabling scale without sacrificing safety.
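
The tolerance of a storage block can be reasoned about with simple arithmetic. The sketch below assumes a hypothetical policy of k data fragments and m parity fragments per object, spread as evenly as possible over the servers of one block, with every fragment on a different disk. The class and function names, and the 9+3 policy used in the example, are illustrative assumptions and not whitesky's actual configuration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    data_fragments: int     # k: fragments needed to read an object
    parity_fragments: int   # m: fragments that may be lost without data loss


def max_fragments_per_server(policy: Policy, servers: int) -> int:
    """Largest share of one object's fragments that a single server can hold."""
    total = policy.data_fragments + policy.parity_fragments
    return math.ceil(total / servers)


def survives(policy: Policy, servers: int,
             lost_servers: int = 1, lost_disks: int = 0) -> bool:
    """Conservative worst case: every lost server held the maximum share and
    every additionally failed disk held one more fragment of the same object."""
    lost = lost_servers * max_fragments_per_server(policy, servers) + lost_disks
    return lost <= policy.parity_fragments


policy = Policy(data_fragments=9, parity_fragments=3)
# 12 fragments over 6 servers: at most 2 fragments per server.
assert survives(policy, servers=6, lost_servers=1, lost_disks=1)
# One server plus two further disks would exceed the 3 parity fragments.
assert not survives(policy, servers=6, lost_servers=1, lost_disks=2)
```

Because the check only involves the servers of one block, the result for a block does not change as more blocks are added to a location.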

Scaling without compromising safety

Linear scale-out architecture

Capacity and performance scale by adding storage blocks. No re-architecture. No migration events. No redesign.

You can:

start with a single block for edge or small deployments
grow to dozens of blocks for regional data centers

Each block adds predictable capacity, performance, and fault tolerance.

Efficient storage economics

Erasure coding overhead can be as low as 33%.

This is far more efficient than triple replication, which carries 200% overhead: three bytes of raw capacity are stored for every byte of data. You get enterprise-grade safety without hyperscaler-level storage waste.
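
The overhead figures above follow from basic arithmetic. In the sketch below, overhead is expressed as extra raw capacity relative to usable data. The 6+2 scheme is an assumed example of an erasure coding policy that yields 33%; this page does not state which scheme is actually used.

```python
def erasure_overhead(k: int, m: int) -> float:
    """Extra raw capacity per unit of usable data for k data + m parity fragments."""
    return m / k


def replication_overhead(copies: int) -> float:
    """Extra raw capacity per unit of usable data when storing full copies."""
    return float(copies - 1)


def raw_needed(usable_tb: float, overhead: float) -> float:
    """Raw capacity required to store a given amount of usable data."""
    return usable_tb * (1 + overhead)


ec = erasure_overhead(k=6, m=2)           # assumed 6+2 policy
rep = replication_overhead(copies=3)      # triple replication

print(f"erasure coding overhead: {ec:.0%}")         # 33%
print(f"triple replication overhead: {rep:.0%}")    # 200%
print(f"raw TB for 100 TB usable, erasure coded: {raw_needed(100, ec):.1f}")   # 133.3
print(f"raw TB for 100 TB usable, replicated:    {raw_needed(100, rep):.1f}")  # 300.0
```

Under these assumptions, 100 TB of data needs roughly 133 TB of raw capacity with erasure coding, compared with 300 TB under triple replication.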

Built-in data protection

Native backup integration

VM snapshots are stored directly in S3-compatible object storage. The backup layer is fully independent from the primary block storage layer.

If primary storage is impacted, backups remain intact and accessible.

Immutable snapshot design

Once written, snapshots cannot be modified or deleted through normal operational paths. This protects against:

accidental deletion
ransomware attacks targeting backup repositories
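
The write-once behaviour can be pictured with a small in-memory model. This is a conceptual sketch only: the class and method names are hypothetical, and it does not represent the S3-compatible API or the mechanism whitesky uses to enforce immutability.

```python
class ImmutableSnapshotError(Exception):
    """Raised when a caller tries to change or remove a stored snapshot."""


class WriteOnceStore:
    """Toy snapshot repository: objects can be written once and read many times."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        if key in self._objects:
            raise ImmutableSnapshotError(f"{key} already exists and cannot be overwritten")
        self._objects[key] = bytes(data)   # store a private copy

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def delete(self, key: str) -> None:
        # The normal operational path offers no way to remove a snapshot.
        raise ImmutableSnapshotError(f"{key} cannot be deleted")


store = WriteOnceStore()
store.put("vm-42/snapshot-001", b"disk image contents")
try:
    store.put("vm-42/snapshot-001", b"encrypted by ransomware")
except ImmutableSnapshotError:
    pass   # the original snapshot is untouched
assert store.get("vm-42/snapshot-001") == b"disk image contents"
```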

Rapid recovery

Because backup and storage are integrated, recovery does not depend on external systems. This shortens recovery time and makes aggressive recovery time objectives (RTO) achievable.

Deployment models

Hyper-converged deployment

Compute and storage run on the same physical servers.

Best suited for:

smaller clusters
edge locations
cost-efficient regional deployments

Benefits:

lower hardware footprint
simplified operations
reduced capital investment

Disaggregated deployment

Dedicated compute nodes and dedicated storage nodes operate independently.

Best suited for:

large production environments
performance-sensitive workloads
independent scaling of compute and storage

This model simplifies failure handling and maintenance at scale.

Storage media configurations

Full flash

All layers run on SSD:

write buffers
distributed transaction logs
metadata services
erasure-coded storage layer

Delivers:

lowest latency
consistent performance
throughput bounded by network, not disks

Hybrid (flash + HDD)

Flash accelerates hot paths:

write buffers
metadata
cache

HDDs store cold data economically.

Delivers:

strong cost-per-TB efficiency
flash-like performance for active datasets
intelligent background data placement

Both configurations provide identical data safety guarantees.

Optional security: protection against physical theft

When this option is enabled, all storage devices use encrypted filesystems. Encryption keys are stored in TPM 2.0 hardware modules.

There are:

no centralized key vaults
no shared secrets
no single point of compromise

Encryption is transparent to workloads and requires no application changes.

Security guarantee: Physical theft of disks does not result in data access. Without TPM-secured keys, stolen devices contain only encrypted fragments that cannot be reassembled.

Designed for real-world operations

Maintenance without downtime

Rolling upgrades allow software updates without service interruption. Servers can be replaced transparently while workloads keep running.

Self-healing architecture

Background agents continuously:

monitor health
repair fragments
rebalance capacity
verify data integrity

No manual intervention is required.
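
As an illustration of what an integrity-verifying repair agent does, the sketch below checks each fragment against a recorded SHA-256 checksum and repairs anything missing or corrupted. The function names, the choice of SHA-256, and the `rebuild` callback (which stands in for erasure-code reconstruction) are assumptions made for this example, not a description of whitesky's agents.

```python
import hashlib
from typing import Callable


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def scrub(fragments: dict[str, bytes], expected: dict[str, str]) -> list[str]:
    """Return the IDs of fragments that are missing or fail their checksum."""
    damaged = []
    for frag_id, digest in expected.items():
        data = fragments.get(frag_id)
        if data is None or checksum(data) != digest:
            damaged.append(frag_id)
    return damaged


def heal(fragments: dict[str, bytes], expected: dict[str, str],
         rebuild: Callable[[str], bytes]) -> list[str]:
    """Replace damaged fragments with rebuilt ones, verifying each before use."""
    repaired = []
    for frag_id in scrub(fragments, expected):
        candidate = rebuild(frag_id)
        if checksum(candidate) == expected[frag_id]:
            fragments[frag_id] = candidate
            repaired.append(frag_id)
    return repaired


original = {"f0": b"alpha", "f1": b"beta", "f2": b"gamma"}
expected = {frag_id: checksum(data) for frag_id, data in original.items()}

on_disk = dict(original)
on_disk["f1"] = b"bit rot"    # silent corruption
del on_disk["f2"]             # failed disk

assert scrub(on_disk, expected) == ["f1", "f2"]
heal(on_disk, expected, rebuild=original.__getitem__)
assert scrub(on_disk, expected) == []
```

A real agent would run a loop like this continuously in the background and at a controlled rate, so that repair traffic does not compete with workload I/O.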

Operational simplicity

The platform absorbs complexity internally. Operators work with predictable states instead of emergency procedures.

This replaces heroic troubleshooting with calm, repeatable operations.

Technical deep dive (for architects & engineers)

Object storage backend

Core components:

OSDs storing object fragments on HDD or SSD
Arakoon clusters providing distributed consensus and metadata
Namespace managers tracking object locations
Stateless proxies for client access
Background maintenance agents for repair and rebalancing

Write path:

object split into fragments
erasure coding applied
fragments distributed across fault domains
metadata stored in namespace manager

Read path:

fragment locations resolved
missing fragments reconstructed automatically if needed

Redundancy policies define how many node and disk failures are tolerated.
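
The write and read paths above can be sketched end to end with a toy model. Here each fault domain is a plain dictionary standing in for an OSD, the namespace map stands in for the namespace manager, and the erasure code is a single XOR parity fragment so the example stays short. All names are hypothetical, and a production policy would use more parity fragments than one.

```python
from __future__ import annotations
from functools import reduce


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


class ObjectStore:
    """Toy backend: one fragment per fault domain, k data + 1 parity."""

    def __init__(self, domains: list[str]) -> None:
        if len(domains) < 2:
            raise ValueError("need at least two fault domains")
        self.domains: dict[str, dict[str, bytes]] = {name: {} for name in domains}
        # object name -> (original length, [(domain, fragment id), ...])
        self.namespace: dict[str, tuple[int, list[tuple[str, str]]]] = {}

    def put(self, name: str, data: bytes) -> None:
        k = len(self.domains) - 1
        size = max(1, -(-len(data) // k))
        padded = data.ljust(size * k, b"\0")
        frags = [padded[i * size:(i + 1) * size] for i in range(k)]  # split
        frags.append(reduce(_xor, frags))                            # erasure coding
        locations = []
        for index, (domain, frag) in enumerate(zip(self.domains, frags)):
            frag_id = f"{name}/{index}"
            self.domains[domain][frag_id] = frag                     # distribute
            locations.append((domain, frag_id))
        self.namespace[name] = (len(data), locations)                # metadata

    def get(self, name: str) -> bytes:
        length, locations = self.namespace[name]                     # resolve
        frags = [self.domains[d].get(fid) for d, fid in locations]
        missing = [i for i, f in enumerate(frags) if f is None]
        if len(missing) > 1:
            raise OSError("more fragments lost than the policy tolerates")
        if missing:                                                  # reconstruct
            frags[missing[0]] = reduce(_xor, [f for f in frags if f is not None])
        return b"".join(frags[:-1])[:length]


store = ObjectStore(["server-a", "server-b", "server-c", "server-d"])
store.put("vm-disk-chunk", b"log-structured data block")
store.domains["server-b"].clear()    # simulate losing one fault domain
assert store.get("vm-disk-chunk") == b"log-structured data block"
```

The read path never needs to know which domain failed in advance: it resolves locations from metadata, notices the gap, and rebuilds the missing fragment from the survivors.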

Virtual block device layer

Virtual disks are exposed via a custom protocol.

Key characteristics:

log-structured object aggregation
cheap snapshots and clones
logical-block-to-object mapping kept in metadata servers
distributed transaction log protects in-flight data

Each virtual disk:

is owned by exactly one volume driver
can fail over to another driver automatically
supports live migration during maintenance

Ownership fencing ensures split-brain conditions cannot corrupt data.
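
One common way to implement ownership fencing is an epoch (fencing token) that increases every time ownership changes, with the backend rejecting writes that carry an older epoch. The sketch below shows that general technique with hypothetical names; this page does not describe how whitesky implements fencing internally.

```python
class StaleOwnerError(Exception):
    """Raised when a fenced-off driver tries to write."""


class VolumeBackend:
    """Toy backend that accepts writes only from the current owner epoch."""

    def __init__(self) -> None:
        self.epoch = 0
        self.blocks: dict[int, bytes] = {}

    def acquire(self) -> int:
        """Hand ownership to a new driver; all earlier epochs are fenced off."""
        self.epoch += 1
        return self.epoch

    def write(self, epoch: int, lba: int, data: bytes) -> None:
        if epoch != self.epoch:
            raise StaleOwnerError(f"epoch {epoch} is fenced, current is {self.epoch}")
        self.blocks[lba] = data


backend = VolumeBackend()
old_owner = backend.acquire()
backend.write(old_owner, lba=0, data=b"v1")

# The old driver becomes unreachable and the volume fails over.
new_owner = backend.acquire()
backend.write(new_owner, lba=0, data=b"v2")

# The old driver comes back and still believes it owns the disk.
try:
    backend.write(old_owner, lba=0, data=b"stale")
except StaleOwnerError:
    pass
assert backend.blocks[0] == b"v2"
```

Because the check happens at the backend, a driver that is partitioned away cannot overwrite newer data even if it never learns that it lost ownership.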

Why customers trust whitesky storage

Failure-aware architecture: Built from first principles to isolate, contain, and recover automatically.
Sovereignty and control: Transparent operation under your control, aligned with European data sovereignty.
Scale without compromise: From edge to data center with consistent safety characteristics.

whitesky does not avoid failure — it engineers for it.
