Is it possible to allow changes to the encoding_format of the embedded options? #51

@JadynWong

Currently, EmbeddingClient hard-codes the encoding_format value to base64 for better performance. The option is internal, and the Embedding constructor always decodes the response as base64:

// CUSTOM: Made internal. We always request the embedding as a base64-encoded string for better performance.
/// <summary>
/// The format to return the embeddings in. Can be either `float` or
/// [`base64`](https://pypi.org/project/pybase64/).
/// </summary>
internal InternalEmbeddingGenerationOptionsEncodingFormat? EncodingFormat { get; set; }

internal Embedding(int index, BinaryData embeddingProperty, InternalEmbeddingObject @object, IDictionary<string, BinaryData> serializedAdditionalRawData)
{
    Index = (int)index;
    EmbeddingProperty = embeddingProperty;
    Object = @object;
    _serializedAdditionalRawData = serializedAdditionalRawData;
    // Handle additional custom properties.
    Vector = ConvertToVectorOfFloats(embeddingProperty);
}

The value can't be changed, even if I want the float format. I want to use this client with text-embeddings-inference, which currently does not support the encoding_format parameter and returns the embedding as an array of floats.
This results in the following error:

The input is not a valid Base64 string of encoded floats.

I know the better fix would be for other projects to support encoding_format, but many OpenAI-compatible APIs are slow to catch up with changes like this.

Is it possible to allow users to change the encoding_format value?
Of course, since this is the official OpenAI SDK, I would understand if it is only meant to be compatible with OpenAI.

For now I can work around this by using the protocol methods and handling the serialization myself.
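
As an illustration of the parsing half of that workaround, here is a minimal sketch that accepts either format. It assumes the raw response JSON has already been obtained (for example from a protocol method) and that each item's "embedding" value is either a JSON array of numbers or a base64 string of little-endian 32-bit floats. EmbeddingParsing and ReadVector are hypothetical names and are not part of the SDK.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper, not part of the OpenAI SDK.
public static class EmbeddingParsing
{
    // Reads one "embedding" JSON value as a float vector, whether the server
    // returned a base64 string or a plain array of numbers.
    public static float[] ReadVector(JsonElement embedding)
    {
        if (embedding.ValueKind == JsonValueKind.String)
        {
            // Assumption: the payload is packed little-endian 32-bit floats.
            byte[] bytes = embedding.GetBytesFromBase64();
            if (bytes.Length % sizeof(float) != 0)
            {
                throw new FormatException("Base64 payload length is not a multiple of 4 bytes.");
            }

            float[] vector = new float[bytes.Length / sizeof(float)];
            for (int i = 0; i < vector.Length; i++)
            {
                int offset = i * sizeof(float);
                if (!BitConverter.IsLittleEndian)
                {
                    // Swap to host byte order on big-endian machines.
                    Array.Reverse(bytes, offset, sizeof(float));
                }
                vector[i] = BitConverter.ToSingle(bytes, offset);
            }
            return vector;
        }

        if (embedding.ValueKind == JsonValueKind.Array)
        {
            // Servers that ignore encoding_format return the floats directly.
            float[] vector = new float[embedding.GetArrayLength()];
            int i = 0;
            foreach (JsonElement item in embedding.EnumerateArray())
            {
                vector[i++] = item.GetSingle();
            }
            return vector;
        }

        throw new FormatException("Unexpected JSON kind for embedding: " + embedding.ValueKind);
    }
}
```

With a helper like this, the same code path would handle a server that honors encoding_format and one that ignores it, since the choice is made from the shape of the response and not from the request.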

Labels: documentation