[Feature Request] Make StreamContext and WriteContext more extensible #17245

Closed
jed326 opened this issue Feb 4, 2025 · 7 comments
Labels: enhancement (Enhancement or improvement to existing feature or request), Storage:Remote, untriaged

jed326 (Collaborator) commented Feb 4, 2025

Is your feature request related to a problem? Please describe

I am working on a feature in the k-NN plugin (see: opensearch-project/k-NN#2465) for which I want to upload flat vectors to a remote repository using the asyncBlobUpload path, which requires both a WriteContext and a corresponding StreamContext.

Today asyncBlobUpload is used by the remote store to copy segment files to the remote store after a refresh. In my use case, however, I would like to write the flat vector files to remote object storage during flush or merge operations, so there is no existing file on disk yet; an existing on-disk file is one of the assumptions WriteContext makes.

Describe the solution you'd like

I would like to convert both WriteContext and StreamContext into interfaces and refactor the current concrete implementations into RemoteStoreWriteContext and RemoteStoreStreamContext so I can provide my own custom interface implementations in the k-NN plugin without the same assumptions made by the existing classes.
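
A rough sketch of the shape I have in mind (method names are illustrative only, loosely based on the attributes WriteContext carries today, not a final API):

// Illustrative sketch only -- not the actual OpenSearch API. WriteContext becomes an
// interface, the current behavior moves into a concrete RemoteStoreWriteContext, and
// the k-NN plugin supplies its own implementation without the on-disk-file assumptions.
public interface WriteContext {
    String getFileName();
    long getFileSize();
    boolean isFailIfAlreadyExists();
    WritePriority getWritePriority();
    StreamContext getStreamProvider(long partSize);
}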

Related component

Storage:Remote

Describe alternatives you've considered

Both WriteContext and StreamContext are already public classes today, so I can still extend them from the k-NN plugin. However, this isn't ideal: for example, I would still need to provide a fileName to the super() call of WriteContext. Additionally, these classes are annotated as @opensearch.internal today, which implies plugins probably should not be extending them this way.

Additional context

Other related design docs for the k-NN feature:

jed326 (Collaborator, Author) commented Feb 4, 2025

Tagging some folks who worked on #7000 for feedback: @raghuvanshraj @vikasvb90 @Bukhtawar @ashking94 @reta @gbbafna

vikasvb90 (Contributor) commented Feb 4, 2025

> there is no existing file on disk yet; an existing on-disk file is one of the assumptions WriteContext makes

@jed326 This statement is incorrect. There's a StreamContextSupplier, which is a supplier that creates streams based on the provided part sizes. Users are free to use it for a single part or multiple parts. Also, since these interfaces are stream-based, it doesn't matter whether the streams are backed by a file or by an in-memory buffer.
As for why there are file-specific attributes: file name, size, and part sizes are all attributes of a remote store and are mandatory before the upload starts. The file size is needed upfront to allocate resources for the upcoming upload; without it, the entire content may be loaded into memory before the upload even begins. I suggest going through the existing remote store constraints before designing your uploads.
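
For reference, the supplier extension point has roughly this shape (paraphrased; check the actual StreamContextSupplier interface in OpenSearch core for the exact package and signature):

@FunctionalInterface
public interface StreamContextSupplier {
    // Given the part size chosen by the repository implementation, return the
    // StreamContext that knows how to open a stream for each part of the upload.
    StreamContext supplyStreamContext(long partSize);
}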

jed326 (Collaborator, Author) commented Feb 4, 2025

Thanks @vikasvb90! I somehow missed that StreamContextSupplier is defined as a functional interface so I can use that extension point instead of rolling my own StreamContext.

> The file size is needed upfront to allocate resources for the upcoming upload; without it, the entire content may be loaded into memory before the upload even begins.

The rest I am already handling: like you said, instead of backing the stream by a file I am backing it with a memory buffer to limit the memory consumption (for example).
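
Roughly what I have in mind, as a sketch only (a byte[] buffer stands in for the real DocIdSetIterator-backed stream, the helper class is hypothetical, and the StreamContext / InputStreamContainer constructor shapes and package names are my reading of the current code, so they may need adjusting):

import java.io.ByteArrayInputStream;

import org.opensearch.common.StreamContext;
import org.opensearch.common.blobstore.stream.write.StreamContextSupplier;
import org.opensearch.common.io.InputStreamContainer;

// Hypothetical helper, not existing k-NN or OpenSearch code.
public final class InMemoryVectorStreamSupplier {

    // Returns a supplier whose parts are slices of an in-memory buffer, so no file
    // needs to exist on disk before the upload starts.
    public static StreamContextSupplier of(byte[] buffer) {
        return partSize -> {
            long fileSize = buffer.length;
            int numberOfParts = (int) ((fileSize + partSize - 1) / partSize);
            long lastPartSize = fileSize - (long) (numberOfParts - 1) * partSize;
            return new StreamContext(
                // (part number, part size, offset) -> stream over that slice of the buffer
                (partNumber, size, position) -> new InputStreamContainer(
                    new ByteArrayInputStream(buffer, Math.toIntExact(position), Math.toIntExact(size)),
                    size,
                    position),
                partSize,
                lastPartSize,
                numberOfParts);
        };
    }
}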

To that point, though, today the part size in question is calculated by the specific vendor implementation:

/**
 * Calculates the optimal part size of each part request if the upload operation is carried out as multipart upload.
 */
public long calculateOptimalPartSize(long contentLengthOfSource, WritePriority writePriority, boolean uploadRetryEnabled) {
    if (contentLengthOfSource < ByteSizeUnit.MB.toBytes(5)) {
        return contentLengthOfSource;
    }
    if (uploadRetryEnabled && (writePriority == WritePriority.HIGH || writePriority == WritePriority.URGENT)) {
        return new ByteSizeValue(5, ByteSizeUnit.MB).getBytes();
    }
    double optimalPartSize = contentLengthOfSource / (double) MAX_UPLOAD_PARTS;
    optimalPartSize = Math.ceil(optimalPartSize);
    return (long) Math.max(optimalPartSize, minimumPartSize);
}

Are there any other ways to control this part size? Since our InputStream is ultimately backed by a DocIdSetIterator, it may not be as performant to create 10k parts, as that would require 10k iterators, although that is something we will only know concretely after more benchmarking.
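
For concreteness, plugging a 40 GB upload into the formula above (assuming MAX_UPLOAD_PARTS = 10,000 and a 5 MB minimum part size, which is what the snippet suggests for the S3 plugin):

long contentLength = 40L * 1024 * 1024 * 1024;                     // ~40 GB of flat vectors
long minimumPartSize = 5L * 1024 * 1024;                           // assumed 5 MB minimum
long optimalPartSize = (long) Math.ceil(contentLength / 10_000d);  // ~4.3 MB per part
long partSize = Math.max(optimalPartSize, minimumPartSize);        // clamped up to 5 MB
long numberOfParts = (contentLength + partSize - 1) / partSize;    // 8,192 parts

So a file of that size would end up at the 5 MB floor with roughly 8K parts.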

vikasvb90 (Contributor) commented

Talking specifically about the S3 plugin:

  1. If 10K parts are created, that means the file size is 10K * 5 MB (the minimum part size) = ~50 GB. Are you expecting to upload a 50 GB file? With the default merge policy, 5 GB is the max size of a segment file.
  2. Even if you end up with a very large segment file, say as a result of a force merge, not all parts are created at once. They are blocked if there isn't sufficient capacity to accommodate the uploads. Please check this code pointer.

vikasvb90 (Contributor) commented

To add, there is a priority order defined to distinguish between uploads of remote cluster state (URGENT), remote translog (HIGH), remote segments (NORMAL), and large files (>15 GB) plus background jobs (LOW). A semaphore is acquired for NORMAL and LOW priority transfers. I believe your use case falls into NORMAL priority, so you should be good to create any number of parts.

jed326 (Collaborator, Author) commented Feb 4, 2025

> Are you expecting to upload a 50 GB file?

As you said, this would only happen in the force merge case, and we wouldn't expect it to be a common occurrence. Rough math: a 1k-dimension fp32 vector takes 4 KB to store, so with 10M documents the vectors would take up roughly 40 GB.
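
Spelled out (sizes assumed for illustration, just to show where the 40 GB comes from):

long bytesPerVector = 1_000L * Float.BYTES;       // 1k dims * 4 bytes (fp32) = 4,000 bytes per vector
long totalBytes = 10_000_000L * bytesPerVector;   // 10M docs -> 40,000,000,000 bytes, i.e. ~40 GB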

Thanks for the pointers on the semaphore and the priority transfers as well; we will keep those in mind in our perf tuning and analysis.

jed326 (Collaborator, Author) commented Feb 4, 2025

I think my questions are answered here, so I'm closing this issue. Thanks again @vikasvb90!

jed326 closed this as completed on Feb 4, 2025
github-project-automation bot moved this from 🆕 New to ✅ Done in the Storage Project Board on Feb 4, 2025