-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
91 add endpoint for image proxying #100
Conversation
ac26149
to
3e11c83
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @WarmBeer looks good. I would extract some methods. I've added some suggestions but nothing critical to stopping it from merging.
let image_cache_manager_config = ImageCacheManagerConfig { | ||
max_image_request_timeout_ms, | ||
max_image_size, | ||
user_quota_period_seconds, | ||
user_quota_bytes, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @WarmBeer, why don't you use torrust_index_backend::config::Cache
, something like:
let cache_config = cfg.settings.get_cache_configuration().await;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @josecelano , I see the image cache as an isolated plugin to the backend. This is why I thought that it should own its own config struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, you mean it could have its crate without specifying how the configuration should be stored in a toml file, do you? I suppose that kind of extension should also provide a specification for a config file section. But maybe they could also use an independent config file. Anyway, for the time being, config.settings.get_cache_configuration
could return an ImageCacheManagerConfig
. I think that's easy to change in the future if you extract the extension.
EntrySizeLimitExceedsTotalCapacity, | ||
BytesExceedEntrySizeLimit, | ||
CacheCapacityIsTooSmall, | ||
CacheFull, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WarmBeer It seems CacheFull
is not being used. Anyway, it's full for ...? I assume it's full for adding a new entry.
} | ||
|
||
// With a limit for individual entry sizes. | ||
pub fn with_entry_size_limit(esl: usize) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WarmBeer Generally, I prefer to be explicit with entry_size_limit
instead of esl
. Unless the abbreviation is really common, like cfg
, ...
new | ||
} | ||
|
||
// With bot a total capacity limit and an individual entry size limit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both
instead of bot
pub fn met(&self) -> bool { | ||
self.usage >= self.max_usage | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WarmBeer I think "reached" is more common than "met", but it's OK.
|
||
/// Get an image from the url and insert it into the cache if it isn't cached already. | ||
/// Unauthenticated users can only get already cached images. | ||
pub async fn get_image_by_url(&self, url: &str, opt_user: Option<UserCompact>) -> Result<Bytes, Error> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WarmBeer some people use "maybe_" prefix instead of "opt_". It makes more sense to me. "opt" sounds like a configuration option. However, "maybe_" is only used in Rust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @josecelano , I can't find a naming convention for this in Rust anywhere. I have come across code using the opt
prefix some time ago and I just stuck with that I guess.
There is a Reddit thread that discusses this naming convention: https://www.reddit.com/r/rust/comments/9hpyam/question_what_names_do_you_use_for_option_and/ .
Both prefixes maybe
and opt
seem to be used.
I don't have a preference for any over the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @josecelano ,
Thank you for the review! I have given my thoughts and put a thumbs up were I fully agree with you and will update things. I think however that we should merge this now and then I will make these changes later on as other tasks take priority currently. Is that ok?
Yes, it's OK.
// TODO: Update the cache on a separate thread, so that the client does not have to wait. | ||
// Update image cache. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WarmBeer, maybe you can split this function into two parts here. You can return the bytes, return the images to the client and later call another method with the rest of the tasks: update cache, quota, etcetera. I think that's very common in web frameworks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also extract a method for each task: request a remote image, update cache, update quota, etcetera.
pub image_cache_user_quota_period_seconds: u64, | ||
pub image_cache_user_quota_bytes: usize, | ||
} | ||
|
||
#[derive(Debug, Clone, Serialize, Deserialize)] | ||
pub struct TorrustConfig { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WarmBeer rename to TorrustIndexBackendConfig
? However, I think it's not configuration but a kind of AppContainer
with tall the application service dependencies.
Err(e) => unsafe { | ||
// Handling status codes in the frontend other tan OK is quite a pain. | ||
// Return OK for now. | ||
let (_status_code, error_image_bytes): (StatusCode, Bytes) = match e { | ||
Error::UrlIsUnreachable => (StatusCode::GATEWAY_TIMEOUT, ERROR_IMAGE_URL_IS_UNREACHABLE.clone()), | ||
Error::UrlIsNotAnImage => (StatusCode::BAD_REQUEST, ERROR_IMAGE_URL_IS_NOT_AN_IMAGE.clone()), | ||
Error::ImageTooBig => (StatusCode::BAD_REQUEST, ERROR_IMAGE_TOO_BIG.clone()), | ||
Error::UserQuotaMet => (StatusCode::TOO_MANY_REQUESTS, ERROR_IMAGE_USER_QUOTA_MET.clone()), | ||
Error::Unauthenticated => (StatusCode::UNAUTHORIZED, ERROR_IMAGE_UNAUTHENTICATED.clone()), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @WarmBeer why is this unsafe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @josecelano ,
All the error image bytes are stored in mutable static variables. Reading these variables is unsafe in Rust, because you don't know what other threads do with them. Since the variables are instantiated in a once_cell, we can consider them safe as there are no other places where they would be mutated.
I do not like using static and unsafe code, so I will update this in the future.
Hey @josecelano , Thank you for the review! I have given my thoughts and put a thumbs up were I fully agree with you and will update things. I think however that we should merge this now and then I will make these changes later on as other tasks take priority currently. Is that ok? |
#[tokio::test] | ||
async fn set_multiple_bytes_cache_with_capacity_and_entry_size_limit_should_have_len_of_two() { | ||
let mut bytes_cache = BytesCache::with_capacity_and_entry_size_limit(12, 6).unwrap(); | ||
let bytes: Bytes = Bytes::from("abcdef"); | ||
|
||
assert!(bytes_cache.set("1".to_string(), bytes.clone()).await.is_ok()); | ||
assert!(bytes_cache.set("2".to_string(), bytes).await.is_ok()); | ||
|
||
assert_eq!(bytes_cache.len().await, 2) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @WarmBeer, I'd refactor this test to:
#[tokio::test]
async fn given_a_bytes_cache_with_a_capacity_and_entry_size_limit_it_should_allow_adding_new_entries_if_the_limit_is_not_exceeded() {
let bytes: Bytes = Bytes::from("abcdef");
let mut bytes_cache = BytesCache::with_capacity_and_entry_size_limit(bytes.len() * 2, bytes.len()).unwrap();
// Add first entry (6 bytes)
assert!(bytes_cache.set("key1".to_string(), bytes.clone()).await.is_ok());
// Add second entry (6 bytes)
assert!(bytes_cache.set("key2".to_string(), bytes).await.is_ok());
// Both entries were added because we did not reach the limit
assert_eq!(bytes_cache.len().await, 2)
}
If what we are testing is the cache limit.
let image_cache_manager_config = ImageCacheManagerConfig { | ||
max_image_request_timeout_ms, | ||
max_image_size, | ||
user_quota_period_seconds, | ||
user_quota_bytes, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, you mean it could have its crate without specifying how the configuration should be stored in a toml file, do you? I suppose that kind of extension should also provide a specification for a config file section. But maybe they could also use an independent config file. Anyway, for the time being, config.settings.get_cache_configuration
could return an ImageCacheManagerConfig
. I think that's easy to change in the future if you extract the extension.
|
||
/// Get an image from the url and insert it into the cache if it isn't cached already. | ||
/// Unauthenticated users can only get already cached images. | ||
pub async fn get_image_by_url(&self, url: &str, opt_user: Option<UserCompact>) -> Result<Bytes, Error> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @josecelano ,
Thank you for the review! I have given my thoughts and put a thumbs up were I fully agree with you and will update things. I think however that we should merge this now and then I will make these changes later on as other tasks take priority currently. Is that ok?
Yes, it's OK.
8dd997b
to
83afc51
Compare
closes #91 .
With this PR we can let clients proxy torrent description images through the backend.
How it works
The following endpoint has been added to the backend
/proxy/image/{original_image_url_urlencoded}
This endpoint accepts the optional header
"Authentication": "Bearer TOKEN"
Authenticated users
When an authenticated user requests an image by its url from the backend proxy, the backend will first check whether the image is already in a local cache and then serve it to the user. If it is not in the local cache already, the backend will request the image from the supplied url and then store it in the local cache.
Unauthenticated users
Guests can only request images that are already in the cache.
Quota
Too mitigate spam, the proxy endpoint is limited by a quota per user. Whenever an authenticated user requests an image that is not already in the local cache, the size of the requested image will be deducted from the user's quota.
When the quota is met, the user will only be able to request images that are already in the cache.
The quota resets after a certain time period that can be set in the config.
Error feedback
In the case of an error, such as (but not limited to):
The backend will return an image with the error text in red.
Configuration