Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/http range/v23 #6312

Closed
wants to merge 12 commits into from

Conversation

catenacyber
Copy link
Contributor

Link to redmine ticket:
https://redmine.openinfosecfoundation.org/issues/1576

Describe changes:

  • Uses a generic container (a thread-safe hash table whose key is the url/filename) to store file parts with HTTP Range header
  • handles reassembly of these parts, while enforcing a global memcap

suricata-verify-pr: 517
OISF/suricata-verify#517

Continues #6308 :

  • Adds more comments
  • Added a SC_ATOMIC_SUB for memcap when HTTP_RANGES_RB_INSERT finds a duplicate. Is it right ? (it looked strange to have free and not diminishing the memcap)
  • Refactors the code to be more clear that we can avoid the lock in HTPFileStoreChunk
    • Adds a fileOwning field to show
    • refactoring HttpRangeAppendData to append not as a default case, but only if fileOwning is set, and asserting that the default case is the last case remaining
    • Handling gaps not while appending, but on closing
    • not using directly files->tail->size everywhere but a lastsize field who gets synced when we have the ownership of files

For testing : https://github.com/jasonish/suricata-http-range-test

victorjulien and others added 8 commits August 24, 2021 10:20
Compares two buffers with their two sizes
adds a container, ie a thread safe hash table whose
key is the filename

keep a tree of unordered ranges, up to a memcap limit

adds HTPFileOpenWithRange to handle like HTPFileOpen
if there is a range : open 2 files, one for the whole reassembled,
and one only for the current range
Simplify locking by using the THashData lock instead of a separate
range lock.

Avoid size_t in function arguments.

Clean up file handling functions.

Implement handling of alloc errors.

Rename yaml entry to byterange

Unify public api naming
@catenacyber catenacyber requested a review from a team as a code owner August 27, 2021 15:30
Better structure design to ensure that one flow maximum
is owning and appending into the file, adding fileOwning field.

Adds also a gap field in a range buffer, so that we can
feed the gap on closing, when we are protected from concurrency
by a lock, (lock which got removed in the append path)

Fixes memcap when encountering a duplicate while inserting
in red and black tree

Adds many comments
@catenacyber catenacyber force-pushed the feature/http-range/v23 branch 2 times, most recently from b46d128 to 75ce5e6 Compare August 27, 2021 15:31
@@ -438,6 +458,7 @@ File *HttpRangeClose(HttpRangeContainerBlock *c, uint16_t flags)
HTTP_RANGES_RB_INSERT(&c->container->fragment_tree, c->current);
if (res) {
SCLogDebug("duplicate range fragment");
(void)SC_ATOMIC_SUB(ContainerUrlRangeList.ht->memuse, c->current->buflen);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@victorjulien is this correct ?

Everywhere else we SCFree(*->buffer), we diminish the memcap

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the question. We update the memcap on this free, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We update the memcap on this free, right?

I added this in this latest PR.
It was not there before

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. This is why I don't like updates to existing PRs btw.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why I don't like updates to existing PRs btw.

In this case, the change was added between #6308 and #6312, so it was right ?

@catenacyber
Copy link
Contributor Author

I added last commit because CI failed quickly while rebasing S-V PR...

@catenacyber
Copy link
Contributor Author

Left TODO : implement for HTTP2

@codecov
Copy link

codecov bot commented Aug 27, 2021

Codecov Report

Merging #6312 (08eca7e) into master (7551247) will decrease coverage by 0.02%.
The diff coverage is 68.24%.

@@            Coverage Diff             @@
##           master    #6312      +/-   ##
==========================================
- Coverage   76.95%   76.93%   -0.03%     
==========================================
  Files         611      612       +1     
  Lines      185955   186368     +413     
==========================================
+ Hits       143102   143378     +276     
- Misses      42853    42990     +137     
Flag Coverage Δ
fuzzcorpus 52.83% <45.19%> (-0.03%) ⬇️
suricata-verify 51.21% <71.15%> (+0.09%) ⬆️
unittests 63.00% <3.55%> (-0.12%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

@suricata-qa
Copy link

Information: QA ran without warnings.

Pipeline 3934

@victorjulien victorjulien marked this pull request as draft September 2, 2021 08:46
uint16_t flags;
/** wether a HttpRangeContainerBlock is currently
owning the FileContainer in order to append to the file */
bool fileOwned;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style file_owned

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed in next PR

/** pointer to the main file container, where to directly append data */
HttpRangeContainerFile *container;
/** we own the container's file for now */
bool fileOwning;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style file_owning

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed in next PR

@@ -438,6 +458,7 @@ File *HttpRangeClose(HttpRangeContainerBlock *c, uint16_t flags)
HTTP_RANGES_RB_INSERT(&c->container->fragment_tree, c->current);
if (res) {
SCLogDebug("duplicate range fragment");
(void)SC_ATOMIC_SUB(ContainerUrlRangeList.ht->memuse, c->current->buflen);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the question. We update the memcap on this free, right?

* As this belongs to a flow, appending data to it is ensured to be thread-safe
* Only one block per file has the pointer to the container
*/
typedef struct HttpRangeContainerBlock {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after our conversation the other day I expected a FileContainer pointer to be here, where the flow handling the in-order chunk would move the FileContainer pointer into this block and return it to the flow on close.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had understood a FileContainer pointer or a boolean...
Switching to the FileContainer pointer then ;-)

DEBUG_VALIDATE_BUG_ON(c->container->files == NULL);
if (data == NULL) {
// gap overlaping already known data
r = FileAppendData(c->container->files, NULL, len - c->toskip);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like this is accessed w/o a lock in the HTPFileStoreChunk path. It may work due to c->fileOwning, but I'm not happy with this setup. I don't want to see any access of a shared data structure w/o holding a lock.

Like I wrote in another comment, I thought the idea was to move c->container->files into HttpRangeContainerBlock for the flow handling the in-order chunk. That way we can append to it w/o needing to access the shared data at c->container

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworking with the FileContainer pointer from comment above

@catenacyber
Copy link
Contributor Author

Replaced by #6331

@catenacyber catenacyber closed this Sep 2, 2021
AkakiAlice added a commit to AkakiAlice/suricata that referenced this pull request Oct 22, 2024
victorjulien pushed a commit to victorjulien/suricata that referenced this pull request Oct 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants