language: re-work encoding property #2054

Merged
merged 2 commits into from
Mar 9, 2017
65 changes: 55 additions & 10 deletions packages/language/src/document.js
@@ -44,8 +44,6 @@ var prop = require('propprop');
* object to specify the encoding and/or language of the document, use this
* property to pass the inline content of the document or a Storage File
* object.
* @param {string} options.encoding - `UTF8`, `UTF16`, or `UTF32`. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType).
* @param {string} options.language - The language of the text.
* @return {module:language/document}

*
@@ -69,13 +67,10 @@ function Document(language, config) {
var content = config.content || config;

this.api = language.api;
this.encodingType = this.detectEncodingType_(config);

this.document = {};

if (config.encoding) {
this.encodingType = config.encoding.toUpperCase().replace(/[ -]/g, '');
}

if (config.language) {
this.document.language = config.language;
}
@@ -98,6 +93,10 @@ function Document(language, config) {
});
} else {
this.document.content = content;

if (Buffer.isBuffer(content)) {
this.encodingType = 'UTF8';
}
}
}
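
The constructor hunk above defaults to `UTF8` when the inline content is a Buffer. A minimal sketch of why that default seems sensible, assuming Node.js: offsets into a Buffer count bytes, while offsets into a JavaScript string count UTF-16 code units, so the two disagree as soon as the text contains non-ASCII characters. `defaultEncodingFor` is a hypothetical helper and not part of the library; its `UTF16` fallback for strings mirrors the default stated in the doc comments of this PR, not code shown in this hunk.

```javascript
// Hypothetical helper (not library API): choose the default that the doc
// comments in this PR describe, based on the type of the inline content.
function defaultEncodingFor(content) {
  return Buffer.isBuffer(content) ? 'UTF8' : 'UTF16';
}

var text = 'caf\u00e9!'; // "café!"
var bytes = Buffer.from(text, 'utf8');

// U+00E9 is one UTF-16 code unit but two UTF-8 bytes, so the position of
// '!' depends on which unit is being counted.
var utf16Offset = text.indexOf('!');
var utf8Offset = bytes.indexOf('!');

console.log(defaultEncodingFor(text), utf16Offset);
console.log(defaultEncodingFor(bytes), utf8Offset);
```

Matching the encoding type to the content type keeps offsets reported by the API usable directly against the value the caller already holds.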

@@ -232,6 +231,10 @@ Document.PART_OF_SPEECH = {
*
* @param {object=} options - Configuration object. See
* [documents.annotateText](https://cloud.google.com/natural-language/docs/reference/rest/v1/documents/annotateText#features).
* @param {string} options.encoding - `UTF8` (also, `buffer`), `UTF16` (also
* `string`), or `UTF32`. (Alias for `options.encodingType`). Default:
* 'UTF8' if a Buffer, otherwise 'UTF16'. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType)
* @param {boolean} options.entities - Detect the entities from this document.
* By default, all features (`entities`, `sentiment`, and `syntax`) are
* enabled. By overriding any of these values, all defaults are switched to
@@ -547,7 +550,7 @@ Document.prototype.annotate = function(options, callback) {
this.api.Language.annotateText({
document: this.document,
features: features,
encodingType: this.encodingType
encodingType: this.detectEncodingType_(options)
}, function(err, resp) {
if (err) {
callback(err, null, resp);
@@ -587,6 +590,10 @@ Document.prototype.annotate = function(options, callback) {
*
* @param {object=} options - Configuration object. See
* [documents.analyzeEntities](https://cloud.google.com/natural-language/reference/rest/v1/documents/analyzeEntities#request-body).
* @param {string} options.encoding - `UTF8` (also, `buffer`), `UTF16` (also
* `string`), or `UTF32`. (Alias for `options.encodingType`). Default:
* 'UTF8' if a Buffer, otherwise 'UTF16'. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType)
* @param {boolean} options.verbose - Enable verbose mode for more detailed
* results. Default: `false`
* @param {function} callback - The callback function.
@@ -699,7 +706,7 @@ Document.prototype.detectEntities = function(options, callback) {

this.api.Language.analyzeEntities({
document: this.document,
encodingType: this.encodingType
encodingType: this.detectEncodingType_(options)
}, function(err, resp) {
if (err) {
callback(err, null, resp);
@@ -720,6 +727,10 @@ Document.prototype.detectEntities = function(options, callback) {
*
* @param {object=} options - Configuration object. See
* [documents.analyzeSentiment](https://cloud.google.com/natural-language/reference/rest/v1/documents/analyzeSentiment#request-body).
* @param {string} options.encoding - `UTF8` (also, `buffer`), `UTF16` (also
* `string`), or `UTF32`. (Alias for `options.encodingType`). Default:
* 'UTF8' if a Buffer, otherwise 'UTF16'. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType)
* @param {boolean} options.verbose - Enable verbose mode for more detailed
* results. Default: `false`
* @param {function} callback - The callback function.
@@ -784,7 +795,7 @@ Document.prototype.detectSentiment = function(options, callback) {

this.api.Language.analyzeSentiment({
document: this.document,
encodingType: this.encodingType
encodingType: this.detectEncodingType_(options)
}, function(err, resp) {
if (err) {
callback(err, null, resp);
@@ -812,6 +823,10 @@ Document.prototype.detectSentiment = function(options, callback) {
*
* @param {object=} options - Configuration object. See
* [documents.analyzeSyntax](https://cloud.google.com/natural-language/reference/rest/v1/documents/analyzeSyntax#request-body).
* @param {string} options.encoding - `UTF8` (also, `buffer`), `UTF16` (also
* `string`), or `UTF32`. (Alias for `options.encodingType`). Default:
* 'UTF8' if a Buffer, otherwise 'UTF16'. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType)
* @param {boolean} options.verbose - Enable verbose mode for more detailed
* results. Default: `false`
* @param {function} callback - The callback function.
@@ -952,7 +967,7 @@ Document.prototype.detectSyntax = function(options, callback) {

this.api.Language.analyzeSyntax({
document: this.document,
encodingType: this.encodingType
encodingType: this.detectEncodingType_(options)
}, function(err, resp) {
if (err) {
callback(err, null, resp);
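
The four call-site hunks above replace the stored `this.encodingType` with a lookup performed on every request, so one document can be analyzed with a different encoding per call. A rough sketch of that behavior with a stubbed client follows; `FakeDocument` and `fakeApi` are hypothetical stand-ins, not the library's classes, and the resolution logic is a simplified copy of what the PR adds.

```javascript
// Stubbed API client (hypothetical): records the encodingType it receives.
var sent = [];
var fakeApi = {
  analyzeEntities: function(request, callback) {
    sent.push(request.encodingType);
    callback(null, {});
  }
};

function FakeDocument(content) {
  this.content = content;
  // Construction-time default, as in the constructor hunk of this PR.
  this.encodingType = Buffer.isBuffer(content) ? 'UTF8' : undefined;
}

FakeDocument.prototype.detectEntities = function(options, callback) {
  // Resolved on every call: per-call option first, then the instance default.
  var encoding = options.encoding || options.encodingType || this.encodingType;

  fakeApi.analyzeEntities({
    document: { content: this.content },
    encodingType: encoding && encoding.toUpperCase().replace(/[ -]/g, '')
  }, callback);
};

var doc = new FakeDocument(Buffer.from('hello'));
doc.detectEntities({}, function() {});                     // falls back to the Buffer default
doc.detectEntities({ encoding: 'utf-16' }, function() {}); // per-call override

console.log(sent);
```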
@@ -1126,6 +1141,36 @@ Document.sortByProperty_ = function(propertyName) {
};
};

/**
* Check if the user provided an encodingType, and map it to its API value.
*
* @param {object} options - Configuration object.
* @param {string} options.encoding - `UTF8` (also, `buffer`), `UTF16` (also
* `string`), or `UTF32`. (Alias for `options.encodingType`). Default:
* 'UTF8' if a Buffer, otherwise 'UTF16'. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType)
* @return {string} - The encodingType, as understood by the API.
*/
Document.prototype.detectEncodingType_ = function(options) {
var encoding = options.encoding || options.encodingType || this.encodingType;

if (!encoding) {
return;
}

encoding = encoding.toUpperCase().replace(/[ -]/g, '');

if (encoding === 'BUFFER') {
encoding = 'UTF8';
}

if (encoding === 'STRING') {
encoding = 'UTF16';
}

return encoding;
};
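
The normalization in `detectEncodingType_` can be tried in isolation. The following is a standalone sketch, not the library's code: `resolveEncoding` and its `fallback` parameter are illustrative names standing in for the method and for `this.encodingType`, and the guard against a missing `options` object is an addition for the sketch.

```javascript
// Standalone sketch of the normalization performed by detectEncodingType_.
// `resolveEncoding` and `fallback` are illustrative names, not library API.
function resolveEncoding(options, fallback) {
  options = options || {};
  var encoding = options.encoding || options.encodingType || fallback;

  if (!encoding) {
    return undefined; // nothing specified: leave the request field unset
  }

  // 'utf-8', 'UTF 8' and 'utf8' all collapse to 'UTF8'.
  encoding = encoding.toUpperCase().replace(/[ -]/g, '');

  // Friendly aliases named after the JavaScript type of the content.
  if (encoding === 'BUFFER') {
    return 'UTF8';
  }
  if (encoding === 'STRING') {
    return 'UTF16';
  }

  return encoding;
}

console.log(resolveEncoding({ encoding: 'utf-8' }));
console.log(resolveEncoding({ encoding: 'string' }));
console.log(resolveEncoding({ encodingType: 'buffer' }));
console.log(resolveEncoding({}, 'UTF8'));
```

Note that `options.encoding` takes precedence over `options.encodingType`, and the fallback is consulted only when neither is set.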

/*! Developer Documentation
*
* All async methods (except for streams) will return a Promise in the event
30 changes: 16 additions & 14 deletions packages/language/src/index.js
@@ -81,8 +81,10 @@ function Language(options) {
* File object.
* @param {object=} options - Configuration object. See
* [documents.annotateText](https://cloud.google.com/natural-language/reference/rest/v1/documents/annotateText#request-body).
* @param {string} options.encoding - `UTF8`, `UTF16`, or `UTF32`. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType).
* @param {string} options.encoding - `UTF8` (also, `buffer`), `UTF16` (also
* `string`), or `UTF32`. (Alias for `options.encodingType`). Default:
* 'UTF8' if a Buffer, otherwise 'UTF16'. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType)
* @param {string} options.language - The language of the text.
* @param {string} options.type - The type of document, either `html` or `text`.
* @param {boolean} options.verbose - Enable verbose mode for more detailed
@@ -173,8 +175,10 @@ Language.prototype.annotate = function(content, options, callback) {
* File object.
* @param {object=} options - Configuration object. See
* [documents.analyzeEntities](https://cloud.google.com/natural-language/reference/rest/v1/documents/analyzeEntities#request-body).
* @param {string} options.encoding - `UTF8`, `UTF16`, or `UTF32`. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType).
* @param {string} options.encoding - `UTF8` (also, `buffer`), `UTF16` (also
* `string`), or `UTF32`. (Alias for `options.encodingType`). Default:
* 'UTF8' if a Buffer, otherwise 'UTF16'. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType)
* @param {string} options.language - The language of the text.
* @param {string} options.type - The type of document, either `html` or `text`.
* @param {boolean} options.verbose - Enable verbose mode for more detailed
@@ -264,8 +268,10 @@ Language.prototype.detectEntities = function(content, options, callback) {
* File object.
* @param {object=} options - Configuration object. See
* [documents.analyzeSentiment](https://cloud.google.com/natural-language/reference/rest/v1/documents/analyzeSentiment#request-body).
* @param {string} options.encoding - `UTF8`, `UTF16`, or `UTF32`. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType).
* @param {string} options.encoding - `UTF8` (also, `buffer`), `UTF16` (also
* `string`), or `UTF32`. (Alias for `options.encodingType`). Default:
* 'UTF8' if a Buffer, otherwise 'UTF16'. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType)
* @param {string} options.language - The language of the text.
* @param {string} options.type - The type of document, either `html` or `text`.
* @param {boolean} options.verbose - Enable verbose mode for more detailed
@@ -346,8 +352,10 @@ Language.prototype.detectSentiment = function(content, options, callback) {
* File object.
* @param {object=} options - Configuration object. See
* [documents.analyzeSyntax](https://cloud.google.com/natural-language/reference/rest/v1/documents/analyzeSyntax#request-body).
* @param {string} options.encoding - `UTF8`, `UTF16`, or `UTF32`. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType).
* @param {string} options.encoding - `UTF8` (also, `buffer`), `UTF16` (also
* `string`), or `UTF32`. (Alias for `options.encodingType`). Default:
* 'UTF8' if a Buffer, otherwise 'UTF16'. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType)
* @param {string} options.language - The language of the text.
* @param {string} options.type - The type of document, either `html` or `text`.
* @param {boolean} options.verbose - Enable verbose mode for more detailed
@@ -436,8 +444,6 @@ Language.prototype.detectSyntax = function(content, options, callback) {
* object to specify the encoding and/or language of the document, use this
* property to pass the inline content of the document or a Storage File
* object.
* @param {string} options.encoding - `UTF8`, `UTF16`, or `UTF32`. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType).
* @param {string} options.language - The language of the text.
* @return {module:language/document}
*
@@ -481,8 +487,6 @@ Language.prototype.document = function(config) {
* @param {string|module:storage/file} content - Inline HTML content or a
* Storage File object.
* @param {object=} options - Configuration object.
* @param {string} options.encoding - `UTF8`, `UTF16`, or `UTF32`. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType).
* @param {string} options.language - The language of the text.
* @return {module:language/document}
*
@@ -531,8 +535,6 @@ Language.prototype.html = function(content, options) {
* @param {string|module:storage/file} content - Inline text content or a
* Storage File object.
* @param {object=} options - Configuration object.
* @param {string} options.encoding - `UTF8`, `UTF16`, or `UTF32`. See
* [`EncodingType`](https://cloud.google.com/natural-language/reference/rest/v1/EncodingType).
* @param {string} options.language - The language of the text.
* @return {module:language/document}
*