Cloudthread Y Combinator
December 19, 2022

CloudCostClip: AWS Intelligent tiering and other storage optimization strategies

Ilia Semenov:

Hello, hello, this is Ilia from Cloudthread, and Cloud Cost Clips. Today, we have Vitaly, from Align. And we will be talking about storage cost management on AWS, this is a big topic, a lot of money is wasted on S3. And probably everybody who is dealing with cost management knows about it, right? S3 is consistently among the top three highest spending services that we see among our clients. And Vitaly is a real expert. So we'll start with an intro. And then we will go into the weeds of the topic.

Vitaly Belyasov:

Hey, I'm Vitaly, I lead Cloud FinOps at Align Technology. And I've been in this space for a while now. Before Align, I was working as a Cloud Infrastructure Consultant at Accenture and within my career, I saw how the infrastructure evolves over time. So here I have a more specific focus on cost management. But before that, at Accenture I had a broader scope of things.

Ilia Semenov:

Yeah, awesome. And, man, I know that you've been dealing with storage costs at Align quite a bit lately. And I wonder what you think about this topic in general? Like, why is it usually so difficult to deal with S3 costs? Why don't they make it easier? What issues do you see?

Vitaly Belyasov:

Well, great question. I think it all comes from the naming. The names of the storage tiers are really confusing: Standard is the top-performance storage tier that everybody has, and Infrequent Access is kind of the number two performance tier. When you start talking with developers about storage cost, they say, “Okay, why can’t we just use the Standard tier?” And you're like, “But this is the most expensive tier, and you don't need it in 100% of the cases, so why should you use it?” And when you start working with them to understand whether Infrequent Access works for the use case, the naming just leads to this unnecessary conversation: “We need our data to be accessed frequently. We don't need infrequent access.” So if it was done intentionally, it's a great monetization policy, because, man, there is always a conversation about the name. So this is one. The other thing is, when you think of AWS as a global cloud provider, you think that it is consistent across all the regions.

It is consistent from a use case perspective, but it isn't consistent in terms of pricing. And guess what, it's not only expensive to live in Northern California, it's also expensive to host your resources in Northern California, so you need to carefully pick the region for your data to be stored in. Spoiler alert: US West is not the cheapest region, and, as you can imagine, not knowing that can lead to a significant architectural mistake. The situation is similar in Europe: some regions are more expensive than others. The next big issue that comes to my mind is the amount of data that you're storing. It's obviously a key cost driver, but you need to be careful when you're creating additional use cases for the data. We've been trying to identify where we have data duplication across multiple applications or multiple use cases. And it's kind of hard because, at the beginning, the storage cost is not high and you just forget about it. Then you start to review it and you figure out that there is, like, a backup bucket that was created a long time ago, before versioning was introduced to S3. You don't need it now, but somehow you forgot about it. So with a large amount of data, you need to carefully review the data storage use cases. And speaking about versioning, it's also going to cost you money, so you need to be careful where you configure versioning on S3.

Ilia Semenov:

Yeah, it all makes sense. When I was dealing with Cloud Cost Management as a practitioner at EA, I went through those issues as well. So, yeah, it has always been a bit frustrating. You know, AWS doesn't make it really intuitive. Right? Well, that's how they make money. Probably, that's the answer. And they make a lot of money. I also remember dealing a lot with retention policies for S3 data. And AWS has been making some progress on that front, on the help side. I think last re:Invent (the one we had before the most recent one last week) they introduced Intelligent Tiering, right, which was supposed to make the transfer from a standard tier to Reduced Redundancy or Glacier much easier, basically automated. And that seemed like, you know, a great solution, like all you need, but it turns out it's not exactly that. I wonder if you've had any experience with that. And how are you dealing with data retention policies on S3?

Vitaly Belyasov:

Yeah, I mean, Intelligent Tiering was kind of well presented during re:Invent, but right after that, there were a lot of discussions around how effective it could actually be. And speaking about retention policies, when I was talking about the different use cases that you need to identify in order to optimize your storage cost, one of the use cases could be that you need temporary storage for your app, just to store data between different calculations. In that case, you don't need the data for a long period of time. So if you configured that storage as Intelligent Tiering, it would just archive that data. In that case, a retention policy would be the better solution, so that you make sure the data is deleted after a certain period of time. But speaking about Intelligent Tiering, it's a great starting point when you don't know anything about the different storage types or the different strategies to optimize storage costs. It can also be a little bit hard to identify if it actually suits you. In our case, what we found out when we were testing it is that when you have many objects of a smaller size, the monitoring fees are actually super high. So in order to identify if it actually suits you, you need to implement it on a temporary bucket with a similar data structure, just to monitor how much you will be paying in monitoring fees. And on top of the monitoring fees, you also need to account for the transition fees because, guess what, moving from one tier to another is not free; you need to pay for each transition between tiers. So this could actually end up in you paying more. So what AWS can do on their end is make it more user friendly and ensure that you're not paying more than the Standard tier. That way they would incentivize you to use Intelligent Tiering, because you know that potentially it can really optimize your storage cost.
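The check Vitaly describes (does tiering beat plain Standard once monitoring and transition fees are counted?) can be roughed out before running a test bucket. The following is only a minimal sketch: every price in it is an illustrative placeholder rather than an actual AWS rate, and the `Prices` class, the function names, and the `cold_fraction` parameter are hypothetical names introduced here, not anything from the conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """Illustrative placeholder prices in USD; NOT actual AWS rates."""
    standard_per_gb_month: float = 0.023
    colder_tier_per_gb_month: float = 0.0125
    monitoring_per_1k_objects_month: float = 0.0025
    transition_per_1k_requests: float = 0.01


def standard_monthly_cost(total_gb: float, prices: Prices) -> float:
    """Storage-only cost of keeping everything in the standard tier."""
    return total_gb * prices.standard_per_gb_month


def tiered_first_month_cost(
    total_gb: float, object_count: int, cold_fraction: float, prices: Prices
) -> float:
    """First-month cost when a fraction of the data moves to a colder tier.

    Assumes (hypothetically) that every object is monitored and that each
    object moved to the colder tier incurs one transition request.
    """
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    warm_gb = total_gb * (1.0 - cold_fraction)
    cold_gb = total_gb * cold_fraction
    storage = (warm_gb * prices.standard_per_gb_month
               + cold_gb * prices.colder_tier_per_gb_month)
    monitoring = object_count / 1000 * prices.monitoring_per_1k_objects_month
    transitions = (object_count * cold_fraction / 1000
                   * prices.transition_per_1k_requests)
    return storage + monitoring + transitions


if __name__ == "__main__":
    prices = Prices()
    baseline = standard_monthly_cost(1000, prices)
    # Same 1,000 GB, stored as many small objects vs. a few large ones.
    many_small = tiered_first_month_cost(1000, 100_000_000, 0.5, prices)
    few_large = tiered_first_month_cost(1000, 1_000, 0.5, prices)
    print(f"standard: {baseline:.2f}")
    print(f"tiered, many small objects: {many_small:.2f}")
    print(f"tiered, few large objects: {few_large:.2f}")
```

With these placeholder numbers, the same volume of data comes out cheaper than Standard when it is a few large objects and far more expensive when it is many small ones, which is the pattern described above. Real rates and eligibility rules should be taken from the current AWS pricing page before drawing any conclusion.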

Ilia Semenov:

Some guardrails? And like no downside for this feature, I think that is so logical. It's unclear why they didn't do it.

Vitaly Belyasov:

Yep. And yeah, I guess you need to carefully review the monitoring and transition fees for your dataset, so that you understand if it actually suits you. I guess for many use cases this could work out, and you would remove all the additional complexity that comes with having these retention policies, because you need to review them, and it's not as straightforward as just implementing Intelligent Tiering.

Ilia Semenov:

Yeah, and just to summarize for your use case Intelligent Tiering basically didn't work out. So, again, what are you using instead?

Vitaly Belyasov:

We're just using the retention policy. So we're basically moving objects from an infrequent access tier to kind of more towards the archive, within a certain period of time. So basically we implemented the same logic as Intelligent Tiering without the monitoring fees.
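A rule of the kind described here might look roughly like the following. This is a sketch, not Align's actual configuration: the day thresholds, the `tmp/` prefix, the rule ID, and the helper names are assumptions made for illustration. The dictionary follows the shape accepted by boto3's `put_bucket_lifecycle_configuration`; the call itself is wrapped in a function that is never invoked here, so the snippet runs without AWS credentials.

```python
from typing import Optional


def build_lifecycle_config(
    prefix: str,
    ia_after_days: int,
    archive_after_days: int,
    expire_after_days: Optional[int] = None,
) -> dict:
    """Build an S3 lifecycle configuration with time-based transitions.

    All thresholds are hypothetical; pick them from your own access patterns.
    """
    if archive_after_days <= ia_after_days:
        raise ValueError("archive_after_days must be greater than ia_after_days")
    rule = {
        "ID": "age-based-tiering",  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": archive_after_days, "StorageClass": "GLACIER"},
        ],
    }
    if expire_after_days is not None:
        if expire_after_days <= archive_after_days:
            raise ValueError("expire_after_days must come after archiving")
        # Deletes objects outright, e.g. for temporary working data.
        rule["Expiration"] = {"Days": expire_after_days}
    return {"Rules": [rule]}


def apply_to_bucket(bucket_name: str, config: dict) -> None:
    """Apply the configuration; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the sketch runs without boto3 installed

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket_name, LifecycleConfiguration=config
    )


if __name__ == "__main__":
    config = build_lifecycle_config("tmp/", 30, 90, expire_after_days=365)
    print(config)
```

Unlike Intelligent Tiering, a rule like this moves objects purely by age and does not look at how often they are accessed, so it only matches the same outcome when access really does drop off over time.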

Ilia Semenov:

Gotcha. Okay. That makes perfect sense. And to our listeners, I guess that can be actual advice. Before using a new AWS feature (probably it applies to any feature AWS releases), make sure that, you know, it really fits your use case. You know, in this case, check the math, right, if it will be like saving you money. And if not, there are lifecycle policies there, or certain workarounds that are a bit more complex from an engineering perspective, but, you know, they lead to the same result, like with actual savings. Cool, Vitaly, thank you so much. It's been really, really insightful!

Vitaly Belyasov:

Thank you, Ilia, for making this happen.

Ilia Semenov:

Thank you.

Copyright © 2024 CloudThread Inc.
All rights reserved.