Cloud Storage Isn't Cheap Anymore. Try Cold Backups
Why “just store it in S3” stopped being obvious, and how cold storage changes the math.
Storage costs creep up quietly
Most teams don’t wake up one day and decide cloud storage is too expensive. They notice it months later: a line item that looked harmless at 200 GB has become 20 TB, plus requests, plus data retrieval, plus transfer out.
That creep happens because object storage bills are never just GB-month. Providers charge across multiple dimensions — storage class, requests, retrieval, transitions, data transfer out — so “we store more” quietly becomes “we store more and touch it more and move it more.”
Why hot cloud storage feels expensive
Hot storage is priced for immediate availability, high throughput, and frequent access. That pricing is great for serving app assets, logs under active analysis, and user uploads you expect to read constantly.
But many backup workloads aren’t “hot.” They’re write-once, rarely-read insurance, and paying hot-tier economics for cold behavior is how storage turns into a slow monthly budget leak.
The moment the team asks: “Do we really need instant access?”
The turning point usually isn’t price. It’s an RTO question: If we had to restore this, how fast do we actually need it? Archive storage exists precisely because many datasets don’t require millisecond retrieval.
Once you admit “hours is fine,” the pricing model changes: you stop optimizing for “always online” and start optimizing for “recoverable when needed,” which is exactly what cold backup storage is designed for.
What “cold backup” actually means
Cold backup means you intentionally trade retrieval speed (and sometimes retrieval cost mechanics) for lower ongoing storage cost, and you plan restores as events, not as normal reads.
Cold backup does not mean “unreliable” or “not durable.” It means retrieval is intentionally slower and often priced differently.
Cold backup also does not mean “free to retrieve.” Many cold tiers introduce explicit retrieval fees, request fees, and minimum storage durations, so you pay less per month but you must model restore behavior honestly.
Three common “cold-ish” options teams actually use
1) Amazon S3 Glacier (true cold/archive inside AWS)
Glacier is the canonical “cold backup” path. Glacier Flexible Retrieval offers multiple retrieval speeds (expedited minutes, standard hours, bulk hours) and Glacier Deep Archive targets very infrequent access with longer retrieval windows (12–48 hours).
The hidden gotchas are usually lifecycle mechanics: minimum storage duration and the fact that restores can involve additional charges beyond simple storage.
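A restore from Glacier is not a normal GET: you initiate an asynchronous restore job, wait for it to complete, and only then download the object. A minimal sketch of that flow using boto3’s `restore_object` call (the bucket and key names are hypothetical, and the AWS call itself requires credentials, so it is shown but not exercised here):

```python
# Hedged sketch: restoring a Glacier object is an asynchronous job,
# not an ordinary read. Bucket/key names below are hypothetical.

def build_restore_request(days: int, tier: str = "Bulk") -> dict:
    """Payload for s3.restore_object: keep the temporary restored copy
    for `days` days, retrieved at the given tier."""
    if tier not in ("Expedited", "Standard", "Bulk"):
        raise ValueError("unknown retrieval tier: " + tier)
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}

def request_restore(bucket: str, key: str,
                    days: int = 7, tier: str = "Bulk") -> None:
    import boto3  # requires AWS credentials; not run in this sketch
    s3 = boto3.client("s3")
    # Starts the restore job; the object is only readable after the job
    # finishes (check the "Restore" field of head_object to find out).
    s3.restore_object(Bucket=bucket, Key=key,
                      RestoreRequest=build_restore_request(days, tier))
```

Note the two costs hiding in that payload: the retrieval tier you pick (expedited vs. bulk) and the `Days` you keep the restored copy around, both of which show up on the bill.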
2) Backblaze B2 (cheap “always hot” backup storage)
Backblaze B2 isn’t cold in the Glacier sense. It is “always-hot object storage,” but it is priced aggressively for backup and archiving patterns.
B2 also makes the request-cost dimension more explicit by grouping API calls into classes, with different free allowances and per-call pricing once you exceed them. This matters if your backup tool creates millions of small objects or lists constantly.
3) Cloudflare R2 (predictable egress, different knobs)
Cloudflare R2 is “hot object storage” with zero egress fees, but it charges for storage plus Class A/B operations. Its Infrequent Access tier adds a per-GB retrieval fee and a minimum storage duration requirement.
R2 can be a “cold backup” win when your problem is unpredictable egress or cross-cloud movement (egress is free), but you must still model operations and, if using Infrequent Access, retrieval processing fees and minimum duration.
The real comparison: it’s not price per GB, but the dimensions that bite you
If you only compare “$/GB-month,” you’ll pick the wrong tool for your restore behavior because real bills are driven by multiple axes: storage class, retrieval pricing, request pricing, and transfer pricing.
Here are the dimensions that usually matter more than the headline storage rate:
- Storage cost over time: GB-month (and which tier/class you’re in) is the baseline you can’t avoid.
- Retrieval cost: Some classes charge per GB retrieved (and sometimes per request), which turns “we restored once” into a surprise bill.
- Retrieval speed: Minutes vs. hours vs. days must match your RTO.
- API/request cost: GET/LIST/PUT requests can matter a lot when you have many small objects or chatty backup software.
- Egress/data transfer: Some platforms remove egress fees; others don’t, and that often dominates real-world “restore” or “migration” costs.
- Minimum storage duration + early delete penalties: Deleting “too soon” can cost more than you expect.
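The axes above can be folded into a toy cost model. The rates below are placeholders, not any provider’s real prices; the point is that a single restore event can dwarf a quiet month of storage:

```python
def estimate_monthly_bill(
    storage_gb: float,
    storage_rate: float,        # $/GB-month for the chosen class
    retrieval_gb: float = 0,
    retrieval_rate: float = 0,  # $/GB retrieved (archive tiers)
    requests: int = 0,
    request_rate: float = 0,    # $ per 1,000 API calls
    egress_gb: float = 0,
    egress_rate: float = 0,     # $/GB transferred out
) -> float:
    """Sum the cost dimensions that real object-storage bills span."""
    return (storage_gb * storage_rate
            + retrieval_gb * retrieval_rate
            + (requests / 1000) * request_rate
            + egress_gb * egress_rate)

# 20 TB sitting cold all month vs. the same month with one 2 TB restore
# pulled out to another network (all rates are made up for illustration):
quiet = estimate_monthly_bill(20_000, 0.004)                    # 80.0
restore = estimate_monthly_bill(20_000, 0.004,
                                retrieval_gb=2_000, retrieval_rate=0.02,
                                egress_gb=2_000, egress_rate=0.09)  # 300.0
```

Even with invented numbers, the shape is instructive: the restore month costs several times the quiet month, and none of that shows up in a $/GB-month comparison.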
- Operational complexity: Archive tiers introduce restore jobs, timing windows, and extra steps, which changes how your incident runbooks work.
Common mistakes teams make when choosing cold storage
Teams often choose cold storage based on the cheapest storage number, then discover later that their access patterns (or tooling) turn “cold” into “expensive.”
Typical failure modes look like this:
- Putting “warm” data into “cold” tiers: If you read it weekly, it’s not archival. Retrieval fees and restore latency will punish you.
- Ignoring object count and request volume: Millions of small files can make request costs and listing costs non-trivial and can slow restores.
- Forgetting minimum storage duration: Moving backups in and out quickly can trigger minimum-duration charges (common with poorly tuned retention).
- Assuming “restore” means “download”: In cold and archive models, restore is often a distinct process with its own timing and pricing implications.
- Choosing a provider without modeling egress: If you might restore to another cloud or to on-prem frequently, transfer economics can dominate.
- No restore drills: The first time you test retrieval shouldn’t be during an outage, especially when your storage tier has hours-long retrieval times.
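The minimum-duration failure mode is easy to quantify. A sketch, assuming the common billing rule that deleting early charges the remaining days at the storage rate (exact rules vary by provider, so treat this as illustrative):

```python
def early_delete_charge(size_gb: float, stored_days: int,
                        min_days: int, rate_per_gb_month: float) -> float:
    """Prorated charge for the unserved days of a minimum storage
    duration, assuming a 30-day billing month."""
    remaining = max(0, min_days - stored_days)
    return size_gb * rate_per_gb_month * (remaining / 30)

# 1 TB deleted after 30 days from a tier with a 180-day minimum:
penalty = early_delete_charge(1_000, 30, 180, 0.00099)  # ~4.95: five
# extra months billed at once, just for deleting "too soon"
```

This is why aggressive retention policies and deep-archive tiers interact badly: churning data through a 180-day-minimum class every month roughly sextuples its effective storage cost.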
A practical decision framework (questions to ask before you pick)
Cold backups are a product decision disguised as a storage decision. You’re really choosing what you can tolerate during recovery.
Ask these before you touch pricing tables:
- What’s our RTO for a full restore vs. a single-file restore? (Minutes? Hours? A day?)
- How often do we actually restore in practice? (Quarterly drills? Only during incidents?)
- How big is a “real” restore event? (10 GB? 10 TB? 200 TB?)
- Will we restore across clouds or to the public internet? (If yes, egress is a first-class cost dimension.)
- How many objects do we generate? (Millions of small chunks vs. fewer large archives changes request cost and restore complexity.)
- Do we need an “infrequent but instant” tier? (Sometimes “archive instant retrieval” patterns are cheaper than true cold when humans are waiting.)
Once you can answer those, you can compare providers on the right axes: storage + requests + retrieval + transfer + operational steps, not just $/GB-month.
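The first two questions already narrow the field a lot. A rough heuristic, where the thresholds are my assumptions rather than any provider’s guidance:

```python
def suggest_tier(rto_hours: float, restores_per_month: float) -> str:
    """Toy heuristic mapping restore requirements to a tier family.
    Thresholds are illustrative assumptions, not provider guidance."""
    if rto_hours < 1 or restores_per_month > 4:
        return "hot"         # urgent or frequent reads: pay hot-tier rates
    if rto_hours < 12:
        return "infrequent"  # occasional, same-day restores: IA-style tiers
    return "archive"         # hours-to-days RTO: Glacier-style tiers

suggest_tier(0.5, 1)    # "hot"
suggest_tier(6, 0.25)   # "infrequent"
suggest_tier(24, 0.1)   # "archive"
```

The real version of this function is your runbook, not code, but forcing the decision into explicit thresholds is a useful exercise: it surfaces the RTO assumptions nobody has written down.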
Cold backups don’t eliminate cost; they move it
When you lower storage cost, you usually increase the need for operational discipline: restore planning, runbooks, retention hygiene, and periodic verification that restores still work.
In other words, the bill shifts from “monthly GB-month” toward “engineering time + process maturity + occasional retrieval events,” especially in archive tiers where restores are asynchronous and have multiple retrieval options.
What mature teams actually do (and why it works)
Mature teams rarely pick one tier and call it done. They tier backups by age and business value because “yesterday’s backup” and “last year’s audit archive” are different products.
A common pattern is:
- Recent backups stay hot/warm for fast restores while they’re most likely to be needed.
- Older backups transition to archive tiers where ongoing storage cost is minimized and restore is acceptable in hours/days.
- They model restore bills (including retrieval + transfer) and explicitly budget for “rare but big” events.
- They run restore drills and document the exact steps, timing, and dependencies (DNS, IAM, keys, tooling versions).
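On AWS, the age-based tiering pattern above maps directly onto S3 lifecycle rules. A sketch of the configuration dict that boto3’s `put_bucket_lifecycle_configuration` expects; the prefix, day thresholds, and retention window are illustrative assumptions, not a recommendation:

```python
# Hedged sketch of an age-based tiering policy for backups; the
# "backups/" prefix and all day thresholds are made-up examples.
lifecycle = {
    "Rules": [{
        "ID": "tier-backups-by-age",
        "Status": "Enabled",
        "Filter": {"Prefix": "backups/"},
        "Transitions": [
            # recent backups stay in the default (hot) class for 30 days
            {"Days": 30, "StorageClass": "GLACIER"},
            {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
        ],
        # drop backups past the retention window (~7 years here)
        "Expiration": {"Days": 2555},
    }]
}

# Applying it (requires credentials; shown for context only):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-backup-bucket", LifecycleConfiguration=lifecycle)
```

Remember that each transition is itself a billable request, and that transitioning into a class restarts its minimum storage duration, so the thresholds deserve the same modeling as the storage rates.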
This is also where teams start comparing real-world trade-offs between providers like R2, B2, and Glacier: not on “cheap storage,” but on restore friction, egress risk, and operational overhead during incidents.
The moment teams actually choose a provider
Once teams accept that cold backups are a product decision, not just cheaper storage, the choice usually comes down to which cost you want to be predictable:
- Predictable long-term storage cost → archive tiers (Glacier / Deep Archive)
- Predictable retrieval and migration cost → egress-free or low-egress models (R2, B2)
- Predictable operations → fewer lifecycle transitions, simpler restore flows
There’s no universal best choice. The right one is the storage model that matches your restore reality, not your pricing spreadsheet.
Closing
Cold storage isn’t a pricing trick. It saves money only when you’ve actually aligned it with how you restore: retrieval speed, restore size, frequency, and where you’re restoring to. The winning move isn’t “find the cheapest GB-month.” It’s “design a backup strategy that matches what an incident actually looks like for us.”
The best storage setup makes costs predictable and restores boring. Know your tiers, model retrieval and transfer, and practice recovery as part of the system — not something you figure out under pressure at 2 a.m. Cold backups change the math, but only when operations owns the math.
Written by the Infra Atlas author
I work on infrastructure and software systems across layers: writing code, shipping products, and dealing with the practical trade-offs of hosting, memory, and network behavior in production. When this site says it covers “layer 3 to layer 9,” it’s half a joke and half a truth: from routing and packets, up through operating systems, applications, and the human decisions that actually cause outages.
Infra Atlas is a collection of field notes from that work. Some pages may include affiliate or referral links as a low-key way to support the site. Think of it as buying me a coffee while I write about why systems behave the way they do.