Cache describe_regions using lru_cache from stdlib #803
I can probably add cache expiration to avoid a stale region list:

import threading
import time
from functools import wraps
from typing import Any, Callable


def cache_for(seconds: int) -> Callable:
    """
    Caches the result of a function for a specified number of seconds."""
    def decorator(func: Callable) -> Callable:
        lock = threading.Lock()
        cache = {}
        hits = misses = 0

        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            nonlocal hits, misses
            with lock:
                key = str(args) + str(kwargs)
                current_time = time.time()
                if key in cache:
                    result, timestamp = cache[key]
                    if current_time - timestamp < seconds:
                        hits += 1
                        return result
                misses += 1
                result = func(*args, **kwargs)
                cache[key] = (result, current_time)
                return result

        def cache_stats() -> dict:
            """
            Returns the cache statistics.
            :return: A dictionary containing the cache statistics"""
            with lock:
                return {'hits': hits, 'misses': misses}

        wrapper.cache_stats = cache_stats
        return wrapper
    return decorator


@cache_for(seconds=60)
def describe_regions(all_regions: bool = True) -> Any:
    """
    Fetches all regions from AWS and returns the response.
    :return: The response from the describe_regions method
    """
    return get_ec2_client().describe_regions(AllRegions=all_regions)


# Example usage with AWS regions
import boto3


def get_ec2_client():
    return boto3.client('ec2')


# Example usage to access cache statistics
if __name__ == "__main__":
    describe_regions(all_regions=False)
    print(describe_regions.cache_stats())  # {'hits': 0, 'misses': 1}
    describe_regions(all_regions=False)
    print(describe_regions.cache_stats())  # {'hits': 1, 'misses': 1}
    describe_regions(all_regions=False)
    print(describe_regions.cache_stats())  # {'hits': 2, 'misses': 1}
    describe_regions(all_regions=False)
    print(describe_regions.cache_stats())  # {'hits': 3, 'misses': 1}
    describe_regions(all_regions=False)
    print(describe_regions.cache_stats())  # {'hits': 4, 'misses': 1}
    describe_regions(all_regions=False)
    print(describe_regions.cache_stats())  # {'hits': 5, 'misses': 1}
    describe_regions(all_regions=False)
    print(describe_regions.cache_stats())  # {'hits': 6, 'misses': 1}
    describe_regions(all_regions=False)
    print(describe_regions.cache_stats())  # {'hits': 7, 'misses': 1}

I get the following output:
|
Or, we can use a third-party library like https://cachetools.readthedocs.io/en/latest/

from typing import Any
from cachetools.func import ttl_cache


@ttl_cache(ttl=1800)  # 30 minutes
def describe_regions(all_regions: bool = True) -> Any:
    """
    Fetches all regions from AWS and returns the response.
    :return: The response from the describe_regions method
    """
    print("Fetching regions from AWS...")
    return get_ec2_client().describe_regions(AllRegions=all_regions)


# Example usage with AWS regions
import boto3


def get_ec2_client():
    return boto3.client('ec2')


# Example usage to access cache statistics
if __name__ == "__main__":
    # print(dir(describe_regions))
    describe_regions(all_regions=False)
    print(describe_regions.cache_info())  # hits=0, misses=1
    describe_regions(all_regions=False)
    print(describe_regions.cache_info())  # hits=1, misses=1
    describe_regions(all_regions=False)
    print(describe_regions.cache_info())  # hits=2, misses=1
    describe_regions(all_regions=False)
    print(describe_regions.cache_info())  # hits=3, misses=1
    describe_regions(all_regions=False)
    print(describe_regions.cache_info())  # hits=4, misses=1
    describe_regions(all_regions=False)
    print(describe_regions.cache_info())  # hits=5, misses=1
    describe_regions(all_regions=False)
    print(describe_regions.cache_info())  # hits=6, misses=1
    describe_regions(all_regions=False)
    print(describe_regions.cache_info())  # hits=7, misses=1

Output:

Fetching regions from AWS...
CacheInfo(hits=0, misses=1, maxsize=128, currsize=1)
CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)
CacheInfo(hits=2, misses=1, maxsize=128, currsize=1)
CacheInfo(hits=3, misses=1, maxsize=128, currsize=1)
CacheInfo(hits=4, misses=1, maxsize=128, currsize=1)
CacheInfo(hits=5, misses=1, maxsize=128, currsize=1)
CacheInfo(hits=6, misses=1, maxsize=128, currsize=1)
CacheInfo(hits=7, misses=1, maxsize=128, currsize=1) |
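For comparison, expiration can also be approximated with only the standard library by adding a time bucket to the `lru_cache` key. This is a minimal sketch, not what the PR implements: `ttl_lru_cache`, its `timer` parameter, and the fake clock are hypothetical names introduced here, and the stubbed `describe_regions` returns a made-up response instead of calling AWS.

```python
import time
from functools import lru_cache, wraps
from typing import Any, Callable


def ttl_lru_cache(seconds: int, timer: Callable[[], float] = time.monotonic) -> Callable:
    """Hypothetical helper: lru_cache whose entries are keyed by a time bucket."""
    def decorator(func: Callable) -> Callable:
        @lru_cache(maxsize=128)
        def cached(_bucket: int, *args: Any, **kwargs: Any) -> Any:
            # _bucket is only part of the cache key; it is not passed on.
            return func(*args, **kwargs)

        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # The bucket number changes every `seconds`, which forces a miss.
            return cached(int(timer() // seconds), *args, **kwargs)

        wrapper.cache_info = cached.cache_info
        return wrapper
    return decorator


# Fake clock so the expiry can be exercised without sleeping (assumption for the demo).
fake_now = [0.0]
calls = []


@ttl_lru_cache(seconds=60, timer=lambda: fake_now[0])
def describe_regions(all_regions: bool = True) -> dict:
    # Stub standing in for the real EC2 call.
    calls.append(all_regions)
    return {"Regions": [{"RegionName": "eu-west-1"}]}


describe_regions(all_regions=False)  # miss: first call in bucket 0
describe_regions(all_regions=False)  # hit: same bucket, same arguments
fake_now[0] = 61.0                   # move the fake clock into bucket 1
describe_regions(all_regions=False)  # miss: the bucket changed
```

One caveat of this approach: entries expire at bucket boundaries, so a cached value can live for less than `seconds`, and stale buckets stay in the cache until `lru_cache` evicts them.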
I wanted to see exactly what is happening in our code and why this is an issue. I wrote this while I was testing it:
The problem comes from these lines: elastic-serverless-forwarder/handlers/aws/handler.py, lines 147 to 152 in 8be4fc4
We need How do we obtain the
|
Thanks for the in-depth analysis. I tested the CloudWatch Lambda trigger on the AWS console and ESF. As of today, it seems CloudWatch Lambda triggers can only work with log groups in the same region as the Lambda functions. For example, if I deploy ESF on Given this limit, there is no reason to keep calling the EC2:DescribeRegions API on every event. I plan to remove this API call from ESF. Here's my two-step plan:
WDYT? |
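If the log group is always in the same region as the function, the region could in principle be taken from the function's own environment instead of from EC2:DescribeRegions. The following is only a sketch under that assumption and is not taken from the ESF code: `get_function_region` and `build_log_group_arn` are hypothetical helpers, and the account ID and log group name are made-up values. It relies on the `AWS_REGION` environment variable that the Lambda runtime sets.

```python
import os
from typing import Mapping, Optional


def get_function_region(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Hypothetical helper: region the function runs in, or None outside Lambda."""
    # AWS_REGION is set by the Lambda runtime; no API call is needed.
    return environ.get("AWS_REGION")


def build_log_group_arn(region: str, account_id: str, log_group_name: str) -> str:
    """Hypothetical helper: build a CloudWatch Logs log group ARN from its parts."""
    return f"arn:aws:logs:{region}:{account_id}:log-group:{log_group_name}:*"


# Example with made-up values; a plain dict stands in for the Lambda environment.
region = get_function_region({"AWS_REGION": "eu-west-1"})
arn = build_log_group_arn(region, "123456789012", "/aws/lambda/example")
```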
I am fine with approving the PR as it is. You currently need to change the version of ESF (I believe you need to update the changelog and version.py). After that, the release workflow will be triggered; if we push this change as it is, nothing will happen. |
Thanks! On it. |
I agree!
I'll work on removing the EC2:DescribeRegions API call later this week. |
What does this PR do?
Caches the response of EC2:DescribeRegions API calls.
Why is it important?
On high-volume deployments, ESF can hit the EC2:DescribeRegions API requests limit, causing throttling errors like the following:
ESF needs the list of existing regions to parse incoming events from the cloudwatch-logs input. Since new AWS regions are not added frequently, fetching and caching the list of existing regions at function startup seems adequate.
The list of existing AWS regions is available at https://aws.amazon.com/about-aws/global-infrastructure/regions_az/
Checklist
CHANGELOG.md