
http/fetch: Optimise memory consumption #378

Merged
merged 1 commit into fluxcd:main from pjbgf:streamfetch on Oct 18, 2022
Conversation

@pjbgf (Member) commented Oct 14, 2022

The previous http/fetch logic loaded the entire tar file into memory, so large files increased the likelihood that concurrent reconciliations would cause OOM.

The Fetch func downloads a file, hashes its contents, and, if the checksum matches, extracts its contents. Given that resp.Body is not an io.ReadSeeker, to avoid loading the full file into memory we need to save it to a temporary file and then read that file in the subsequent operations. With this approach the memory consumption per operation was reduced from 23MB to 2.1MB:

```
Benchmark_Fetch-16      5  227630480 ns/op  23003358 B/op  19511 allocs/op
Benchmark_FetchNew-16   5  227570375 ns/op   2106795 B/op  19504 allocs/op
```

The tar file used in the benchmark was 7MB.

Expanding on defensive programming, the download process and subsequent operations are short-circuited once the max download size is exceeded. With the max limit set to 100 bytes, the error message yielded is:

`artifact is 7879239 bytes greater than the max download size of 100 bytes`

Further details of improvements:

github.com/fluxcd/pkg/http/fetch.(ArchiveFetcher).Fetch

```
  Total:        720B   327.95MB (flat, cum) 93.20%
    168            .          .           		}
    169            .          .           		u.Host = r.hostnameOverwrite
    170            .          .           		archiveURL = u.String()
    171            .          .           	}
    172            .          .
    173            .     6.91kB           	req, err := retryablehttp.NewRequest(http.MethodGet, archiveURL, nil)
    174            .          .           	if err != nil {
    175            .          .           		return fmt.Errorf("failed to create a new request: %w", err)
    176            .          .           	}
    177            .          .
    178            .    39.12kB           	resp, err := r.httpClient.Do(req)
    179            .          .           	if err != nil {
    180            .          .           		return fmt.Errorf("failed to download archive, error: %w", err)
    181            .          .           	}
    182            .          .           	defer resp.Body.Close()
    183            .          .
    184            .          .           	if code := resp.StatusCode; code != http.StatusOK {
    185            .          .           		if code == http.StatusNotFound {
    186            .          .           			return FileNotFoundError
    187            .          .           		}
    188            .          .           		return fmt.Errorf("failed to download archive from %s, status: %s", archiveURL, resp.Status)
    189            .          .           	}
    190            .          .
    191         720B       720B           	var buf bytes.Buffer
    192            .          .
    193            .          .           	// verify checksum matches origin
    194            .   299.94MB           	if err := r.verifyChecksum(checksum, &buf, resp.Body); err != nil {
    195            .          .           		return err
    196            .          .           	}
    197            .          .
    198            .          .           	// extract
    199            .    27.96MB           	if err = tar.Untar(&buf, dir, tar.WithMaxUntarSize(-1)); err != nil {
    200            .          .           		return fmt.Errorf("failed to extract archive, error: %w", err)
    201            .          .           	}
    202            .          .
    203            .          .           	return nil
    204            .          .           }
```

github.com/fluxcd/pkg/http/fetch.(ArchiveFetcher).FetchNew

```
  Total:           0    19.40MB (flat, cum)  5.51%
     70            .          .           		}
     71            .          .           		u.Host = r.hostnameOverwrite
     72            .          .           		archiveURL = u.String()
     73            .          .           	}
     74            .          .
     75            .     4.61kB           	req, err := retryablehttp.NewRequest(http.MethodGet, archiveURL, nil)
     76            .          .           	if err != nil {
     77            .          .           		return fmt.Errorf("failed to create a new request: %w", err)
     78            .          .           	}
     79            .          .
     80            .    25.72kB           	resp, err := r.httpClient.Do(req)
     81            .          .           	if err != nil {
     82            .          .           		return fmt.Errorf("failed to download archive, error: %w", err)
     83            .          .           	}
     84            .          .           	defer resp.Body.Close()
     85            .          .
     86            .          .           	if code := resp.StatusCode; code != http.StatusOK {
     87            .          .           		if code == http.StatusNotFound {
     88            .          .           			return FileNotFoundError
     89            .          .           		}
     90            .          .           		return fmt.Errorf("failed to download archive from %s, status: %s", archiveURL, resp.Status)
     91            .          .           	}
     92            .          .
     93            .     1.78kB           	f, err := os.CreateTemp("", "fetch.*.tmp")
     94            .          .           	if err != nil {
     95            .          .           		return fmt.Errorf("failed to create temp file: %w", err)
     96            .          .           	}
     97            .          .           	defer os.Remove(f.Name())
     98            .          .
     99            .          .           	// Save temporary file, but limit download to the max download size.
    100            .          .           	if r.maxDownloadSize > 0 {
    101            .          .           		// Headers can lie, so instead of trusting resp.ContentLength,
    102            .          .           		// limit the download to the max download size and error in case
    103            .          .           		// there are still bytes left.
    104            .          .           		// Note that discarding of remaining bytes in resp.Body is a
    105            .          .           		// requirement for Go to effectively reuse HTTP connections.
    106            .          .           		_, err = io.Copy(f, io.LimitReader(resp.Body, int64(r.maxDownloadSize)))
    107            .          .           		n, _ := io.Copy(io.Discard, resp.Body)
    108            .          .           		if n > 0 {
    109            .          .           			return fmt.Errorf("artifact is %d bytes greater than the max download size of %d bytes", n, r.maxDownloadSize)
    110            .          .           		}
    111            .          .           	} else {
    112            .   320.16kB           		_, err = io.Copy(f, resp.Body)
    113            .          .           	}
    114            .          .           	if err != nil {
    115            .          .           		return fmt.Errorf("failed to copy temp contents: %w", err)
    116            .          .           	}
    117            .          .
    118            .          .           	// We have just filled the file, to be able to read it from
    119            .          .           	// the start we must go back to its beginning.
    120            .          .           	_, err = f.Seek(0, 0)
    121            .          .           	if err != nil {
    122            .          .           		return fmt.Errorf("failed to seek back to beginning: %w", err)
    123            .          .           	}
    124            .          .
    125            .          .           	// Ensure that the checksum of the downloaded file matches the
    126            .          .           	// known checksum.
    127            .   325.97kB           	if err := r.verifyChecksumNew(checksum, f); err != nil {
    128            .          .           		return err
    129            .          .           	}
    130            .          .
    131            .          .           	// Jump back at the beginning of the file stream again.
    132            .          .           	_, err = f.Seek(0, 0)
    133            .          .           	if err != nil {
    134            .          .           		return fmt.Errorf("failed to seek back to beginning again: %w", err)
    135            .          .           	}
    136            .          .
    137            .          .           	// Extracts the tar file.
    138            .    18.74MB           	if err = tar.Untar(f, dir, tar.WithMaxUntarSize(-1)); err != nil {
    139            .          .           		return fmt.Errorf("failed to extract archive (check whether file size exceeds max download size): %w", err)
    140            .          .           	}
    141            .          .
    142            .       320B           	return nil
    143            .          .           }
    144            .          .
    145            .          .           // verifyChecksum computes the checksum of the tarball and returns an error if the computed value
    146            .          .           // does not match the artifact advertised checksum.
    147            .          .           func (r *ArchiveFetcher) verifyChecksumNew(checksum string, reader io.Reader) error {
```

Should further decrease the likelihood of: fluxcd/kustomize-controller#725

PS: Note that fetch/untar are not the highest memory consumers in kustomize-controller, so any further optimisations here would yield only marginal gains.

@pjbgf pjbgf added the area/kustomize Kustomize related issues and pull requests label Oct 14, 2022
@pjbgf pjbgf added this to the GA milestone Oct 14, 2022
@pjbgf pjbgf requested a review from a team October 14, 2022 11:05
```
}

// Extracts the tar file.
if err = tar.Untar(f, dir, tar.WithMaxUntarSize(-1)); err != nil {
```
@pjbgf (Member, Author) commented:

Not really keen on exposing MaxUntarSize, as that does not feel like Fetch's responsibility. Forcing an unlimited max untar size is also not good, as that causes issues with highly compressible files. It would be good to get some ideas for this.

A Member replied:

I see nothing wrong with having MaxUntarSize as an option here; it's the Fetcher's responsibility to download and extract the archive.

The previous `http/fetch` logic would load into memory the tar file,
causing large files to increase the likelihood of concurrent
reconciliations to cause OOM.

The Fetch func downloads a file, hashes its contents, and, if the
checksum matches, extracts its contents. The `resp.Body` is not an
`io.ReadSeeker`, which means that to avoid loading the full file into
memory we must save it to a temporary file and then read that file in
the subsequent operations. With this approach the memory consumption
per operation was reduced from 23MB to 2.1MB:
```
Benchmark_Fetch-16      5  227630480 ns/op  23003358 B/op  19511 allocs/op
Benchmark_FetchNew-16   5  227570375 ns/op   2106795 B/op  19504 allocs/op
```
The tar file used was 7MB.

Expanding on defensive programming, the download process and subsequent
operations are short-circuited once the max download size is exceeded.
With the max limit set to 100 bytes, the error message yielded is:

`artifact is 7879239 bytes greater than the max download size of 100 bytes`

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
@stefanprodan (Member) left a comment:

LGTM

Thanks @pjbgf 🏅

@pjbgf pjbgf merged commit fe2ef0a into fluxcd:main Oct 18, 2022
@pjbgf pjbgf deleted the streamfetch branch October 18, 2022 14:12
stefanprodan added a commit to fluxcd/kustomize-controller that referenced this pull request Oct 19, 2022
- update fluxcd/pkg/tar to v0.2.0 (fluxcd/pkg#377)
- update fluxcd/pkg/http/fetch to v0.2.0 (fluxcd/pkg#378)

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>