scan() with grouping returns empty string #14

Open
voke opened this issue Jun 25, 2019 · 7 comments

Comments

@voke

voke commented Jun 25, 2019

I have a case where scan() returns an empty string.

Using the jq command:

echo '{ "foo": "thisthat" }' | jq '.foo | scan("this(.+)")'

Output:

[
  "that"
]

Using ruby-jq gem:

irb(main):001:0> require 'jq'
=> true
irb(main):002:0> JQ::VERSION
=> "0.2.1"
irb(main):003:0> JQ('{ "foo": "thisthat" }').search('.foo | scan("this(.+)")')
=> [[""]]

I can only reproduce the problem when deployed to Heroku. To reproduce it, click the deploy button at https://github.com/voke/heroku-jq-debugger (which includes the libjq 1.6 buildpack).

Then run heroku run console --app YOUR_APP_NAME and follow the steps above.


At first I thought it must be an Oniguruma issue, since a regexp is involved, but the gsub command works as expected.

My second guess was the buildpack, since both my local jq (installed via Homebrew) and the ruby-jq gem on my machine return the correct output. But running the jq command on Heroku also returns the correct output:

heroku run "cd jq && echo '{ \"foo\": \"thisthat\" }' | ./jq '.foo | scan(\"this(.+)\")'" --app YOUR_APP_NAME

Output:

[
  "that"
]

Do you have any idea what the issue may be?
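
For anyone hitting the same symptom, a small smoke test could flag an affected build at deploy time. The following is only a sketch under stated assumptions: `reference_scan` and `capture_bug?` are hypothetical helper names (not part of ruby-jq), the expected value `[["that"]]` is inferred from the jq CLI output above, and Ruby's own `String#scan` serves purely as a reference for what the capture group should contain. The gem call uses the same `JQ(...).search(...)` form as the irb session.

```ruby
# Hypothetical smoke test; helper names are illustrative, not part of ruby-jq.
EXPECTED = [["that"]].freeze # inferred from the jq CLI output in this issue

# Ruby's built-in regex engine gives a reference answer for the same pattern.
def reference_scan(input, pattern)
  input.scan(pattern)
end

# True when the gem's result differs from the expected capture.
def capture_bug?(actual)
  actual != EXPECTED
end

reference = reference_scan("thisthat", /this(.+)/)
puts "reference: #{reference.inspect}"

begin
  require 'jq'
  actual = JQ('{ "foo": "thisthat" }').search('.foo | scan("this(.+)")')
  puts "ruby-jq:   #{actual.inspect}"
  warn "capture groups look broken in this build" if capture_bug?(actual)
rescue LoadError
  # The gem is optional here so the script still runs without it.
  puts "ruby-jq not installed; skipping the gem check"
end
```

Running this in both the local and the deployed environment should make it obvious whether only one of them returns the empty capture.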

@voke
Author

voke commented Jun 30, 2019

Use the following Dockerfile to reproduce the issue:

FROM ruby:2.6

ENV RUBYJQ_USE_SYSTEM_LIBRARIES=yes

RUN apt-get update && apt-get install -y libonig-dev

RUN \
  cd /tmp && \
  git clone https://github.com/stedolan/jq.git && \
  cd jq && \
  autoreconf -i && \
  ./configure --enable-shared --disable-docs --disable-maintainer-mode && \
  make && \
  make install && \
  ldconfig

RUN gem install ruby-jq

Build the image and start irb:

> docker build -t ruby-with-jq .
> docker run -it ruby-with-jq irb -r jq
JQ('{ "foo": "thisthat" }').search('.foo | scan("this(.+)")') # => [[""]]

@voke
Author

voke commented Jul 1, 2019

I investigated this further by passing the JQ_DEBUG_TRACE flag to jq_start: jq_start(jq, value, JQ_DEBUG_TRACE);

Running JQ('"foobar"').search('scan("foo(.+)")') # => [[""]] gives:

0000 TOP	
0001 CALL_JQ scan:1 @lambda:2	"foobar" (1)
0000 CALL_JQ match:0^1 @lambda:0 @lambda:1	"foobar" (1)
0000 PUSHK_UNDER false	"foobar" (1)
0002 SUBEXP_BEGIN	"foobar" (1)
0003 CALL_JQ mode:1	"foobar" (2)
0000 LOADK "g"	"foobar" (2)
0002 RET	"g" (2)
0007 SUBEXP_END	"g" (2) | "foobar" (1)
0008 SUBEXP_BEGIN	"foobar" (1)
0009 CALL_JQ re:0	"foobar" (2)
0000 TAIL_CALL_JQ re:0^1	"foobar" (2)
0000 LOADK "foo(.+)"	"foobar" (2)
0002 RET	"foo(.+)" (2)
0013 SUBEXP_END	"foo(.+)" (2) | "foobar" (1)
0014 CALL_BUILTIN _match_impl	"foobar" (1) | "foo(.+)" (2) | "g" (2) | false
0017 EACH	[{"offset":0,"length":6,"string":"foobar" (1),"captures":[{"offset":0,"string":"" (1),"length":0,"name":null} (1)] (1)} (1)] (1)

As you can see the last line shows an empty capture.

I wrote a simple C program to test libjq and to imitate the code in jq_core:

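#include <stdio.h>
#include <jq.h>

int main(void) {
  /* Create the jq interpreter state. */
  jq_state *jq = jq_init();
  if (jq == NULL) {
    fprintf(stderr, "jq_init failed\n");
    return 1;
  }

  const char *buf = "\"foobar\"";
  const char *prog = "scan(\"foo(.+)\")";

  /* jq_compile returns non-zero on success. */
  if (!jq_compile(jq, prog)) {
    fprintf(stderr, "jq_compile failed\n");
    jq_teardown(&jq);
    return 1;
  }

  /* jq_start takes ownership of the input value. */
  jv input = jv_parse(buf);
  jq_start(jq, input, JQ_DEBUG_TRACE);

  /* Fetch only the first result, as in the original test. */
  jv result = jq_next(jq);
  if (jv_is_valid(result)) {
    /* jv_dump_string consumes result; the returned string must be freed. */
    jv dumped = jv_dump_string(result, 0);
    printf("OUTPUT: %s\n", jv_string_value(dumped));
    jv_free(dumped);
  } else {
    jv_free(result);
  }

  jq_teardown(&jq);

  printf("Done!\n");
  return 0;
}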

Running this program gives:

0000 TOP	
0001 CALL_JQ scan:1 @lambda:2	"foobar" (1)
0000 CALL_JQ match:0^1 @lambda:0 @lambda:1	"foobar" (1)
0000 PUSHK_UNDER false	"foobar" (1)
0002 SUBEXP_BEGIN	"foobar" (1)
0003 CALL_JQ mode:1	"foobar" (2)
0000 LOADK "g"	"foobar" (2)
0002 RET	"g" (2)
0007 SUBEXP_END	"g" (2) | "foobar" (1)
0008 SUBEXP_BEGIN	"foobar" (1)
0009 CALL_JQ re:0	"foobar" (2)
0000 TAIL_CALL_JQ re:0^1	"foobar" (2)
0000 LOADK "foo(.+)"	"foobar" (2)
0002 RET	"foo(.+)" (2)
0013 SUBEXP_END	"foo(.+)" (2) | "foobar" (1)
0014 CALL_BUILTIN _match_impl	"foobar" (1) | "foo(.+)" (2) | "g" (2) | false
0017 EACH	[{"offset":0,"length":6,"string":"foobar" (1),"captures":[{"offset":3,"length":3,"string":"bar" (1),"name":null} (1)] (1)} (1)] (1)

The capture works and the output is ["bar"].

I still have no idea why it acts differently. Btw, all of this was run inside the Docker container (specified above).
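
To make the difference between the two traces explicit, the capture records from the final EACH lines can be compared field by field. This is a rough sketch, not anything from ruby-jq or jq itself: the two JSON strings are copied from the traces above with the "(1)" refcount annotations removed, `expected_capture` and `capture_diff` are hypothetical helpers, and Ruby's built-in Regexp is assumed to be an acceptable reference for what `foo(.+)` should capture in "foobar".

```ruby
require 'json'

# Capture records from the two EACH trace lines, refcount annotations removed.
BROKEN  = '{"offset":0,"string":"","length":0,"name":null}'
WORKING = '{"offset":3,"length":3,"string":"bar","name":null}'

# Hypothetical helper: builds the capture record for group 1 using Ruby's Regexp.
def expected_capture(input, pattern)
  m = pattern.match(input)
  { "offset" => m.begin(1), "length" => m[1].length, "string" => m[1], "name" => nil }
end

# Hypothetical helper: returns only the fields where actual and expected disagree.
def capture_diff(actual, expected)
  expected.each_with_object({}) do |(key, want), diff|
    got = actual[key]
    diff[key] = { "got" => got, "want" => want } unless got == want
  end
end

expected = expected_capture("foobar", /foo(.+)/)

puts "ruby-jq trace: #{capture_diff(JSON.parse(BROKEN), expected).inspect}"
puts "C program:     #{capture_diff(JSON.parse(WORKING), expected).inspect}"
```

The ruby-jq record differs in offset, length, and string, while the record from the standalone C program matches the reference, which points at the capture region data rather than at the jq program being evaluated.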

@mbajur

mbajur commented Dec 4, 2020

I have a pretty similar problem (I suppose): ruby-jq gives me different output on my local machine and on my production machine (Docker-based).

Local:

{ Start_Date: '07-05-2021' }.jq('.Start_Date | capture("(?<day>\\\d+)-(?<month>\\\d+)-(?<year>\\\d+)")')
# => [{"day"=>"07", "month"=>"05", "year"=>"2021"}]

Production:

{ Start_Date: '07-05-2021' }.jq('.Start_Date | capture("(?<day>\\\d+)-(?<month>\\\d+)-(?<year>\\\d+)")')
# => [{"day"=>"", "month"=>"07", "year"=>""}]

Notice that on production the month capture holds the day value (07) and the other two captures are empty, so the result is wrong throughout.

My Dockerfile:

FROM ruby:2.5.3-alpine3.9

ENV RAILS_ENV production

RUN apk --update --upgrade add \
  autoconf \
  automake \
  libtool \
  oniguruma-dev \
  build-base \
  libxml2-dev \
  libxslt-dev \
  postgresql-dev \
  tzdata \
  imagemagick \
  git \
  && rm -rf /var/cache/apk/*

And, as in the comments above, running the query directly from the command line gives the correct result:

bash-4.4# echo '{"Start_Date": "07-05-2021"}' | /usr/local/bundle/gems/ruby-jq-0.2.1/ext/ports/x86_64-alpine-linux-musl/jq/1.6/bin/jq '.Start_Date | capture("(?<day>\\d+)-(?<month>\\d+)-(?<year>\\d+)")'
{
  "day": "07",
  "month": "05",
  "year": "2021"
}
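
As a cross-check that does not depend on libjq at all, the same named-group pattern can be run through Ruby's built-in Regexp and compared with the production result quoted above. This is only an illustrative sketch: `reference_capture` and `wrong_groups` are hypothetical helpers, and Ruby's regex engine is assumed to agree with jq's on this simple pattern.

```ruby
# Reference result from Ruby's built-in Regexp, for comparison only.
DATE_PATTERN = /(?<day>\d+)-(?<month>\d+)-(?<year>\d+)/

# Hypothetical helper: named groups as a Hash, or nil when there is no match.
def reference_capture(input)
  match = DATE_PATTERN.match(input)
  match && match.named_captures
end

# Hypothetical helper: names of the groups whose values differ from the reference.
def wrong_groups(actual, expected)
  expected.keys.reject { |name| actual[name] == expected[name] }
end

expected   = reference_capture('07-05-2021')
production = { "day" => "", "month" => "07", "year" => "" } # value reported above

puts "expected:     #{expected.inspect}"
puts "wrong groups: #{wrong_groups(production, expected).inspect}"
```

All three groups disagree with the reference, which matches the observation that the production result is wrong for every capture and not just one.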

@voke
Author

voke commented Dec 4, 2020

@mbajur I solved my issue by compiling a fork of jq that uses Onigmo (the regex engine used by Ruby 2.0 and later) instead of Oniguruma. You can find it here: https://github.com/voke/jq

@mbajur

mbajur commented Dec 7, 2020

Thank you, @voke! In the end I dropped jq entirely (partly because this gem seems to be abandoned).

@sergio-opslevel

Had this problem as well. Compiling onigmo + voke-jq, then gem installing with RUBYJQ_USE_SYSTEM_LIBRARIES=yes worked for me.

@h0tw1r3

h0tw1r3 commented Oct 16, 2022

This is indeed a strange issue, and I could reproduce it. Whether or not I used system libraries, whenever I linked against Oniguruma the captures never worked correctly (there was always a blank capture before the actual one).

#20 should just build without any external dependencies against onigmo.
