added gcs read for audio file #249

Merged
merged 20 commits into from
Jun 2, 2016
Changes from 5 commits
5 changes: 5 additions & 0 deletions speech/grpc/pom.xml
@@ -111,6 +111,11 @@ limitations under the License.

<!-- // [START dependency] -->
<dependencies>
<dependency>
<groupId>com.google.cloud</groupId>
Contributor
Is this gcloud-java dependency necessary? I don't see where you use anything from this package.

Contributor
Oh, never mind, I see the storage imports.

Contributor
Actually, after switching to the URI parameter (https://cloud.google.com/speech/reference/rest/v1/speech/recognize#audiorequest), you should not need this dependency.

<artifactId>gcloud-java</artifactId>
<version>0.2.2</version>
</dependency>
<dependency>
<groupId>commons-cli</groupId>
<artifactId>commons-cli</artifactId>
@@ -0,0 +1,75 @@
/*
* Copyright 2016 Google Inc. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/


package com.google.cloud.speech.grpc.demos;

import com.google.cloud.speech.v1.AudioRequest;
import com.google.cloud.storage.Blob;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import com.google.protobuf.ByteString;

import java.io.IOException;

import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/*
* AudioRequestFactory takes a URI as an input and creates an AudioRequest. The URI can point to a
* local file or a file on google cloud storage.
Contributor
nit. Google Cloud Storage.

*/
public class AudioRequestFactory {

private static final String FILE = "file";
private static final String GS = "gs";
Contributor
It's kind of terrible to name a string variable GS when its contents are "gs". It doesn't really describe the intention. (FILE is okay, since that's what it is.) I think CLOUD_STORAGE would be more appropriate.

Contributor Author
@tswast It's a URI scheme, so CLOUD_STORAGE is not appropriate. Hmm, maybe GS_SCHEME?

Contributor Author
Actually, the thing is that the name of this final constant is longer than the literal "gs" it holds, so it is unnecessary.


/**
* Takes an input URI of form $scheme:// and converts to audio request.
*
* @param uri input uri
* @return AudioRequest audio request
*/
public static AudioRequest createRequest(URI uri)
throws IOException {
if (uri.getScheme() == null || uri.getScheme().equals(FILE)) {
Path path = Paths.get(uri);
return audioFromBytes(Files.readAllBytes(path));
} else if (uri.getScheme().equals(GS)) {
Contributor
@jerjou Can the speech API read directly from Cloud Storage in the way that the Vision API does? Or do we have to download and re-upload like this does?

Contributor
I believe it can. https://cloud.google.com/speech/reference/rest/v1/speech/recognize#audiorequest Do not download and re-upload the audio; just pass the Cloud Storage URI in the URI parameter.
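A minimal sketch of the suggestion in this thread, assuming the request can carry either inline bytes or a Cloud Storage URI as the linked reference describes. `AudioSource` is a hypothetical stand-in, not the real `AudioRequest` class, and the builder call that would set the URI on the real request is not shown because it is not confirmed here. Scheme-less input is left out to keep the example short.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical stand-in for the request: exactly one of content or uri is set. */
final class AudioSource {
  final byte[] content; // inline audio bytes, or null when the audio is referenced by URI
  final String uri;     // gs:// reference, or null when the audio is sent inline

  private AudioSource(byte[] content, String uri) {
    this.content = content;
    this.uri = uri;
  }

  static AudioSource from(URI input) throws IOException {
    String scheme = input.getScheme();
    if ("gs".equals(scheme)) {
      // Pass the reference through untouched: no download, no re-upload.
      return new AudioSource(null, input.toString());
    }
    if ("file".equals(scheme)) {
      // Local audio still has to be read and sent inline.
      return new AudioSource(Files.readAllBytes(Paths.get(input)), null);
    }
    throw new IllegalArgumentException("scheme not supported: " + scheme);
  }
}
```

With this shape the `gs` branch needs no storage client at all, which is why the `gcloud-java` dependency questioned earlier in the review would become unnecessary.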

Storage storage = StorageOptions.defaultInstance().service();
String path = uri.getPath();
BlobId blobId = BlobId.of(uri.getHost(), path.substring(1,path.length()));
Contributor
nit. Missing space after comma. substring(1, path.length())
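For reference, the line under discussion splits a `gs://bucket/object` URI into a bucket and an object name. The sketch below shows that split on its own, using only `java.net.URI`; the class name `GcsLocation` is hypothetical. It uses `getAuthority()` rather than `getHost()` on the assumption that bucket names are not always valid host names (as far as I know, `getHost()` returns null for names containing an underscore), and it uses `substring(1)`, which is equivalent to `substring(1, path.length())`.

```java
import java.net.URI;

/** Hypothetical holder for the two parts of a gs:// URI. */
final class GcsLocation {
  final String bucket;
  final String object;

  private GcsLocation(String bucket, String object) {
    this.bucket = bucket;
    this.object = object;
  }

  static GcsLocation parse(URI uri) {
    if (!"gs".equals(uri.getScheme())) {
      throw new IllegalArgumentException("not a gs:// URI: " + uri);
    }
    String bucket = uri.getAuthority(); // the part between "gs://" and the next "/"
    String path = uri.getPath();        // starts with "/" when an object name is present
    if (bucket == null || path == null || path.length() < 2) {
      throw new IllegalArgumentException("expected gs://bucket/object but got: " + uri);
    }
    // Drop the leading "/" to get the object name.
    return new GcsLocation(bucket, path.substring(1));
  }
}
```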

Blob blob = storage.get(blobId);
return audioFromBytes(blob.content());
}
throw new RuntimeException("scheme not supported " + uri.getScheme());
}

/**
* Convert bytes to AudioRequest.
*
* @param bytes input bytes
* @return AudioRequest audio request
*/
private static AudioRequest audioFromBytes(byte[] bytes) {
return AudioRequest.newBuilder()
.setContent(ByteString.copyFrom(bytes))
.build();
}
}
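One detail in the local-file branch of `createRequest`: `Paths.get(URI)` throws `IllegalArgumentException` when the URI has no scheme, so the `uri.getScheme() == null` case would fail before any bytes are read. A minimal sketch of one way to handle both cases, assuming a scheme-less input should be treated as a plain file system path (the class and method names are hypothetical):

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

final class LocalAudioPath {
  private LocalAudioPath() {}

  /** Resolves a scheme-less or file: URI to a local Path. */
  static Path resolve(URI uri) {
    String scheme = uri.getScheme();
    if (scheme == null) {
      // No scheme: use the URI's path component as an ordinary file system path.
      return Paths.get(uri.getPath());
    }
    if ("file".equals(scheme)) {
      // Paths.get(URI) is fine here because the scheme is present.
      return Paths.get(uri);
    }
    throw new IllegalArgumentException("not a local file URI: " + uri);
  }
}
```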
@@ -32,7 +32,6 @@
import com.google.cloud.speech.v1.NonStreamingRecognizeResponse;
import com.google.cloud.speech.v1.RecognizeRequest;
import com.google.cloud.speech.v1.SpeechGrpc;
- import com.google.protobuf.ByteString;
import com.google.protobuf.TextFormat;

import io.grpc.ManagedChannel;
@@ -49,9 +48,7 @@
import org.apache.commons.cli.ParseException;

import java.io.IOException;
- import java.nio.file.Files;
- import java.nio.file.Path;
- import java.nio.file.Paths;
+ import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
@@ -72,7 +69,7 @@ public class NonStreamingRecognizeClient {

private final String host;
private final int port;
- private final String file;
+ private final URI input;
private final int samplingRate;

private final ManagedChannel channel;
@@ -81,11 +78,11 @@ public class NonStreamingRecognizeClient {
/**
* Construct client connecting to Cloud Speech server at {@code host:port}.
*/
- public NonStreamingRecognizeClient(String host, int port, String file, int samplingRate)
+ public NonStreamingRecognizeClient(String host, int port, URI input, int samplingRate)
throws IOException {
this.host = host;
this.port = port;
- this.file = file;
+ this.input = input;
this.samplingRate = samplingRate;

GoogleCredentials creds = GoogleCredentials.getApplicationDefault();
@@ -99,10 +96,7 @@ public NonStreamingRecognizeClient(String host, int port, String file, int sampl
}

private AudioRequest createAudioRequest() throws IOException {
- Path path = Paths.get(file);
- return AudioRequest.newBuilder()
- .setContent(ByteString.copyFrom(Files.readAllBytes(path)))
- .build();
+ return AudioRequestFactory.createRequest(this.input);
}

public void shutdown() throws InterruptedException {
@@ -115,10 +109,10 @@ public void recognize() {
try {
audio = createAudioRequest();
} catch (IOException e) {
- logger.log(Level.WARNING, "Failed to read audio file: " + file);
+ logger.log(Level.WARNING, "Failed to read audio uri input: " + input);
return;
}
- logger.info("Sending " + audio.getContent().size() + " bytes from audio file: " + file);
+ logger.info("Sending " + audio.getContent().size() + " bytes from audio uri input: " + input);
InitialRecognizeRequest initial = InitialRecognizeRequest.newBuilder()
.setEncoding(AudioEncoding.LINEAR16)
.setSampleRate(samplingRate)
@@ -147,8 +141,8 @@ public static void main(String[] args) throws Exception {
CommandLineParser parser = new DefaultParser();

Options options = new Options();
- options.addOption(OptionBuilder.withLongOpt("file")
- .withDescription("path to audio file")
+ options.addOption(OptionBuilder.withLongOpt("uri")
+ .withDescription("path to audio uri")
.hasArg()
.withArgName("FILE_PATH")
.create());
@@ -170,10 +164,10 @@ public static void main(String[] args) throws Exception {

try {
CommandLine line = parser.parse(options, args);
- if (line.hasOption("file")) {
- audioFile = line.getOptionValue("file");
+ if (line.hasOption("uri")) {
+ audioFile = line.getOptionValue("uri");
} else {
- System.err.println("An Audio file path must be specified (e.g. /foo/baz.raw).");
+ System.err.println("An Audio uri must be specified (e.g. file:///foo/baz.raw).");
System.exit(1);
}

@@ -203,7 +197,7 @@ public static void main(String[] args) throws Exception {
}

NonStreamingRecognizeClient client =
- new NonStreamingRecognizeClient(host, port, audioFile, sampling);
+ new NonStreamingRecognizeClient(host, port, URI.create(audioFile), sampling);
try {
client.recognize();
} finally {
Expand Down
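A note on the new `--uri` handling: `URI.create` throws `IllegalArgumentException` for input that is not a valid URI, for example a path containing a space. The sketch below is one possible way to turn the command-line value into a URI, assuming the tool should keep accepting plain file paths as well as `gs://` and `file:` URIs. The PR itself changed the usage message to ask for a `file:///` URI, so this is an alternative, not what was merged. `InputUris.fromArgument` is a hypothetical helper.

```java
import java.net.URI;
import java.nio.file.Paths;

final class InputUris {
  private InputUris() {}

  /** Converts a command-line value into a URI, accepting URIs and plain paths. */
  static URI fromArgument(String arg) {
    if (arg.startsWith("gs://") || arg.startsWith("file:")) {
      // Already a URI in one of the two supported schemes.
      return URI.create(arg);
    }
    // Path.toUri() returns an absolute file: URI and percent-encodes
    // characters such as spaces that URI.create would reject.
    return Paths.get(arg).toUri();
  }
}
```

For example, `InputUris.fromArgument("my file.raw")` yields a `file:` URI, whereas `URI.create("my file.raw")` throws.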