Most plugins' TransactionalFileInput has only one file (input stream), but the Embulk specification also supports multiple files (input streams).
In the latter case, the current implementation reads only the first file (input stream).
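The difference between reading only the first stream and reading all of them can be sketched in plain Java. This is a standalone simulation, not Embulk code: `MultiStreamDemo`, `parseFirstOnly`, `parseAll`, and `sampleResponses` are hypothetical names, the payloads are made up, and the three-records-per-response split is an assumption chosen only to match the three-versus-nine counts reported here.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class MultiStreamDemo {
    // Reads every non-empty line from one stream; each line stands in for one parsed record.
    public static List<String> readRecords(InputStream in) {
        List<String> records = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    records.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    // Mimics the reported behavior: only the first stream is consumed.
    public static List<String> parseFirstOnly(Iterator<InputStream> streams) {
        List<String> records = new ArrayList<>();
        if (streams.hasNext()) {
            records.addAll(readRecords(streams.next()));
        }
        return records;
    }

    // Expected behavior: advance through every stream until none are left.
    public static List<String> parseAll(Iterator<InputStream> streams) {
        List<String> records = new ArrayList<>();
        while (streams.hasNext()) {
            records.addAll(readRecords(streams.next()));
        }
        return records;
    }

    private static InputStream stream(String body) {
        return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
    }

    // Hypothetical data: three responses with three records each (assumed split).
    public static List<InputStream> sampleResponses() {
        return Arrays.asList(
            stream("{\"id\":1}\n{\"id\":2}\n{\"id\":3}\n"),
            stream("{\"id\":4}\n{\"id\":5}\n{\"id\":6}\n"),
            stream("{\"id\":7}\n{\"id\":8}\n{\"id\":9}\n"));
    }

    public static void main(String[] args) {
        System.out.println("first stream only: " + parseFirstOnly(sampleResponses().iterator()).size());
        System.out.println("all streams: " + parseAll(sampleResponses().iterator()).size());
    }
}
```

In this simulation the only difference between the two methods is `if` versus `while` when advancing to the next stream. Where the equivalent step sits in the actual plugins is not shown here.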
Execution results.
embulk-input-http invoked the GET request six times.
2023-08-24 09:29:13.973 +0900 [INFO] (0015:task-0000): GET "http://express.heartrails.com/api/json?method=getStations&x=135.0&y=35"
2023-08-24 09:29:15.686 +0900 [INFO] (0015:task-0000): GET "http://express.heartrails.com/api/json?method=getStations&x=135.0&y=35&name=%E9%BB%92%E7%94%B0%E5%BA%84"
2023-08-24 09:29:15.754 +0900 [INFO] (0015:task-0000): GET "http://express.heartrails.com/api/json?method=getStations&x=135.0&y=35&name=%E6%9C%AC%E9%BB%92%E7%94%B0"
2023-08-24 09:29:15.799 +0900 [INFO] (0015:task-0000): GET "http://express.heartrails.com/api/json?method=getStations&x=135.0&y=35&name=%E8%88%B9%E7%94%BA%E5%8F%A3"
2023-08-24 09:29:15.840 +0900 [INFO] (0015:task-0000): GET "http://express.heartrails.com/api/json?method=getStations&x=135.0&y=35&name=%E4%B9%85%E4%B8%8B%E6%9D%91"
2023-08-24 09:29:15.986 +0900 [INFO] (0015:task-0000): GET "http://express.heartrails.com/api/json?method=getStations&x=135.0&y=35&name=%E8%B0%B7%E5%B7%9D"
The fork of embulk-input-http maintained by trocco uses InputStreamFileInput.IteratorProvider.
(As far as I know, no other input plugin uses InputStreamFileInput.IteratorProvider.)
The original embulk-input-http doesn't use it, so that plugin issues just one HTTP GET request.
embulk: 0.9.25
embulk-input-http: 0.25.0 (rubygems version)
embulk-parser-jsonpath: 0.4.0
2023-08-25 10:21:30.748 +0900: Embulk v0.9.25
2023-08-25 10:21:31.645 +0900 [WARN] (main): DEPRECATION: JRuby org.jruby.embed.ScriptingContainer is directly injected.
2023-08-25 10:21:33.555 +0900 [INFO] (main): Gem's home and path are set by default: "/home/user/.embulk/lib/gems"
2023-08-25 10:21:34.286 +0900 [INFO] (main): Started Embulk v0.9.25
2023-08-25 10:21:34.432 +0900 [INFO] (0001:transaction): Loaded plugin embulk-input-http (0.25.0)
2023-08-25 10:21:34.556 +0900 [INFO] (0001:transaction): Using local thread executor with max_threads=16 / output tasks 8 = input tasks 1 * 8
2023-08-25 10:21:34.568 +0900 [INFO] (0001:transaction): {done: 0 / 1, running: 0}
2023-08-25 10:21:34.615 +0900 [INFO] (0015:task-0000): GET "http://express.heartrails.com/api/json?method=getStations&x=135.0&y=35"
{"prefecture":"兵庫県","distance":"320m","line":"JR加古川線","next":"黒田庄","prev":"比延","x":134.997633,"y":35.002069,"postal":"6770039","name":"日本へそ公園"}
{"prefecture":"兵庫県","distance":"1310m","line":"JR加古川線","next":"日本へそ公園","prev":"新西脇","x":134.995733,"y":34.988773,"postal":"6770033","name":"比延"}
{"prefecture":"兵庫県","distance":"2620m","line":"JR加古川線","next":"本黒田","prev":"日本へそ公園","x":134.992522,"y":35.022689,"postal":"6790313","name":"黒田庄"}
2023-08-25 10:21:34.936 +0900 [INFO] (0001:transaction): {done: 1 / 1, running: 0}
2023-08-25 10:21:34.941 +0900 [INFO] (main): Committed.
2023-08-25 10:21:34.941 +0900 [INFO] (main): Next config diff: {"in":{},"out":{}}
Overview.
The following configuration fetches JSON data over HTTP.
It should output nine entries, but the current parser outputs only three.
Environment
The reason.
The results are split across multiple HTTP responses. Each response is an independent JSON object, much like a separate file.
However, a single FileInputInputStream containing multiple InputStreams is constructed.
embulk-parser-json parses only the first InputStream, so it outputs only three entries.
embulk-parser-jsonpath has the same issue.
Simulate with the curl command.
Example output reproducing the issue