Cannot write to /var/solr as 8983:8983 - dp-solr log #13

Open
jonwolds opened this issue Jan 16, 2023 · 19 comments

Comments

@jonwolds

Hi,

I created the docker containers using docker-compose.yaml and got many of the same issues as stated in the now closed issue. There are errors in the dp-front and dp-back logs as well (gunicorn-error.log isn't writable, redis.log -> can't open the log file: No such file or directory). The dp-solr container doesn't actually seem to start up though.

Any help much appreciated.

Jon

@mbanon
Member

mbanon commented Jan 16, 2023

Hi @jonwolds ,
the redis issue usually happens when the path where the redis config file is created does not exist inside the docker container. It's fixed by creating this path by hand.
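A minimal sketch of "creating this path by hand", written in Python so it can be run from any script inside the container. The file path used here is only a placeholder in a scratch directory; the real location depends on the redis configuration in your setup, which this thread does not state.

```python
import os
import tempfile


def ensure_parent_dir(file_path: str) -> str:
    """Create the directory that should contain file_path, if it is missing."""
    parent = os.path.dirname(os.path.abspath(file_path))
    os.makedirs(parent, exist_ok=True)  # no error if the directory already exists
    return parent


# Demonstration in a scratch directory (placeholder path, not the real redis path).
scratch = tempfile.mkdtemp()
created = ensure_parent_dir(os.path.join(scratch, "redis", "redis.log"))
print(os.path.isdir(created))
```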

@jonwolds
Author

Many thanks for that. I'm still getting errors in dp-front and dp-back: "gunicorn.errors.HaltServer: <HaltServer 'Worker failed to boot.' 3>".

Also, the Cannot write to /var/solr as 8983:8983 persists. I could edit the permissions of /var/solr by exec-ing in, but the container stops as soon as it starts, so I'd have to create a new container, I suppose, but then I'm not sure about how to link that back up with docker-compose. Any ideas?

@jonwolds
Author

Answering my own question, this worked:
(sudo) chown 8983:8983 /mnt/solr-data/solr

I hadn’t realised what the dp-solr container was when I first asked the question.
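For anyone hitting the same message, here is a small Python sketch that checks whether a directory is owned by the uid:gid named in the "Cannot write to /var/solr as 8983:8983" error before starting the containers. The function name is made up for illustration, and the demo runs against a scratch directory rather than the real /mnt/solr-data/solr mount.

```python
import os
import tempfile

# uid and gid taken from the "Cannot write to /var/solr as 8983:8983" message
SOLR_UID = 8983
SOLR_GID = 8983


def ownership_mismatch(path, uid=SOLR_UID, gid=SOLR_GID):
    """Return None if path is owned by uid:gid, otherwise a short description."""
    st = os.stat(path)
    if (st.st_uid, st.st_gid) == (uid, gid):
        return None
    return f"{path} is owned by {st.st_uid}:{st.st_gid}, expected {uid}:{gid}"


# Demo on a scratch directory; on the host you would pass the mounted data dir.
demo_dir = tempfile.mkdtemp()
print(ownership_mismatch(demo_dir))
```

If the function returns a description instead of None, the chown above is the fix.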

@mbanon
Member

mbanon commented Jan 19, 2023

Awesome!
Is everything working for you now?

@jonwolds
Author

localhost:5000 page fires up fine, but I guess I need to do a lot of configuration work.

Basically, I just want to load up a couple of the manufactured corpora from paracrawl.eu for my own personal use, so I’ve no need for the authentication system, and I don’t think I’m able to install the Google app because I don’t have access to Google Workspace.

Any tips on the best order to do things in would be very welcome.

Many thanks in advance!

@jonwolds
Author

I'm still struggling to get the setup working. The login process appears to work fine, but I then get sent to localhost:5000/search, where I get a 500 - Internal Server Error. The log produced (from dp-front) is below:

[2023-01-26 19:25:53 +0000] [15] [INFO] Booting worker with pid: 15
[2023-01-26 19:26:11,728] ERROR in app: Exception on /search/ [GET]
Traceback (most recent call last):
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask_login/utils.py", line 272, in decorated_view
    return func(*args, **kwargs)
  File "/opt/dp/front/app/blueprints/search/views.py", line 54, in search_view
    corpus_collection = base_corpus.solr_collection
AttributeError: 'NoneType' object has no attribute 'solr_collection'

I guess that's related to the set-up of the solr collection, which is where I'm struggling to follow the deployment instructions. I copied the solr.xml to the directory referenced by docker-compose.yaml and changed the permissions, but the dp-solr log still says:

2023-01-26 19:25:56.480 INFO (main) [] o.a.s.s.CoreContainerProvider Solr Home: /var/solr/data (source: system property: solr.solr.home)
2023-01-26 19:25:56.483 INFO (main) [] o.a.s.c.SolrXmlConfig solr.xml not found in SOLR_HOME, using built-in default

I put a core.properties file there, too, but it references a schema.xml and a solrconfig.xml, which I do not know how to set up (no instructions in the deployment guide).

Also, the deployment section (I may be jumping the gun here) says "Go to the web interface of your Solr instance", but localhost:5000 is the only port open, so I don't really understand what this means.

Any ideas?
Many thanks in advance,

Jon

@mbanon
Member

mbanon commented Jan 27, 2023

Hi again Jon!
I've been taking a look into the Dockerfiles and, according to https://github.com/paracrawl/corset/blob/master/docker-compose.yaml#L53 , I think the Solr web interface should be reachable at localhost:8090 (or maybe localhost:8090/solr).

Regarding the missing schema.xml, it's in the root folder of the repository (and also here). As for the solrconfig.xml, I am not 100% confident, but I think it's self-generated by solr.

@jonwolds
Author

Thanks for following up, Marta. It's much appreciated!

Here's my progress so far (I won't be working on this for the next week).

I got into the solr web interface by adding
ports:
  - 8983:8983
to the dp-solr section of the docker-compose.yaml file

In the end, I created the new core using:
./solr create -c name-of-your-new-core

I had to exec in to the dp-solr container to do this as my efforts via the web interface were not successful in creating a solrconfig.xml file. This new core then shows up in the solr web interface correctly. It probably needs adjusting using the schema.xml file from the corset directory, too. Changing the permissions (chown 8983:8983) is always necessary as well.
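As a rough way to confirm the new core exists without opening the web interface, the sketch below builds a Solr CoreAdmin STATUS request and interprets the reply. It assumes port 8983 is published as described above; the sample replies are illustrative stand-ins, not output captured from this setup, and the helper names are made up.

```python
from urllib.parse import urlencode


def core_status_url(core, base="http://localhost:8983/solr"):
    """Build the CoreAdmin STATUS URL for one core (base URL is an assumption)."""
    query = urlencode({"action": "STATUS", "core": core, "wt": "json"})
    return f"{base}/admin/cores?{query}"


def core_exists(status_response: dict, core: str) -> bool:
    """Solr reports an unknown core as an empty object under status[core]."""
    return bool(status_response.get("status", {}).get(core))


# Illustrative replies only (shape of a STATUS response, heavily trimmed).
reply_known = {"status": {"name-of-your-new-core": {"name": "name-of-your-new-core"}}}
reply_unknown = {"status": {"name-of-your-new-core": {}}}

print(core_status_url("name-of-your-new-core"))
print(core_exists(reply_known, "name-of-your-new-core"))
print(core_exists(reply_unknown, "name-of-your-new-core"))
```

Fetching the printed URL (for example with urllib.request.urlopen) and passing the parsed JSON to core_exists gives the same answer as looking at the core list in the admin UI.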

I'm still getting an internal server error at localhost:5000/search, but hopefully once I load some data into the core I've created things might improve.

@jonwolds
Author

jonwolds commented Feb 5, 2023

This is the error message I'm getting from dp-front

[2023-02-05 11:40:42,481] ERROR in app: Exception on /search/ [GET]
Traceback (most recent call last):
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask_login/utils.py", line 272, in decorated_view
    return func(*args, **kwargs)
  File "/opt/dp/front/app/blueprints/search/views.py", line 54, in search_view
    corpus_collection = base_corpus.solr_collection
AttributeError: 'NoneType' object has no attribute 'solr_collection'

I've tried to upload some data using tmxutils, but that hasn't been successful yet

@mbanon
Member

mbanon commented Feb 6, 2023

Hi again!
The error suggests that no corpora are registered in the DB (which makes sense because you are having trouble with that :))
What error are you getting when trying to upload data?
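To illustrate why an empty DB surfaces as this AttributeError, here is a minimal sketch. The class and function names are hypothetical and are not corset's actual code; the point is only that a lookup returning None fails on the attribute access, and that an explicit check gives a clearer message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaseCorpus:  # hypothetical stand-in for the real model
    name: str
    solr_collection: str


def collection_or_error(base_corpus: Optional[BaseCorpus]) -> str:
    """Return the Solr collection name, or fail with a readable message."""
    if base_corpus is None:
        # This is the situation behind "'NoneType' object has no attribute ..."
        raise LookupError("no base corpus is registered in the database")
    return base_corpus.solr_collection


print(collection_or_error(BaseCorpus("example corpus", "example_collection")))
try:
    collection_or_error(None)
except LookupError as exc:
    print(f"lookup failed: {exc}")
```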

@jonwolds
Author

jonwolds commented Mar 9, 2023

Hi again,
I've been having a look at this again, and I've now managed to upload data into solr, but I still can't manage to sort out the link between solr and dp-front.

My configuration is most likely wrong, but the information provided is not quite enough to get it working.

The specific error I'm getting in the gunicorn-error.log is:

[2023-03-09 18:37:23 +0000] [13] [INFO] Booting worker with pid: 13
[2023-03-09 18:37:37,216] ERROR in app: Exception on /search/ [GET]
Traceback (most recent call last):
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask_login/utils.py", line 272, in decorated_view
    return func(*args, **kwargs)
  File "/opt/dp/front/app/blueprints/search/views.py", line 54, in search_view
    corpus_collection = base_corpus.solr_collection
AttributeError: 'NoneType' object has no attribute 'solr_collection'

http://localhost:5000/search/ produces a 500 Internal Server Error.

Any ideas?

Cheers,
Jon

@jonwolds
Author

I got to the next stage and finally managed to get the /search/ page to appear properly by using an INSERT SQL command tailored to the solr collection I had created based on the model in the greyed-out part of the dpdb_initdb.sql file.

Unfortunately, the search function still doesn't find anything. I'm guessing there's more configuration to do with the dpdb tables in postgres.

@jonwolds
Author

jonwolds commented Mar 12, 2023

This is the error message in the gunicorn-error.log (dp-front)

[2023-03-12 17:34:53,188] ERROR in app: Exception on /query/ [GET]
Traceback (most recent call last):
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/dp/front/venv/lib/python3.9/site-packages/flask_login/utils.py", line 272, in decorated_view
    return func(*args, **kwargs)
  File "/opt/dp/front/app/blueprints/query/views.py", line 32, in query_view
    base_corpus = base_corpus_bo.get_base_corpora_by_pair(source_lang.code, target_langs[0].code)[0]
IndexError: list index out of range

@mbanon
Member

mbanon commented Mar 14, 2023

Hi Jon,
all your errors seem related to the fact that "get_base_corpora_by_pair" is not returning anything. This is probably caused by the DB being empty (or not properly filled by the INSERT you made by hand), or by the connection between the front, the back and the DB not working.

Some hints:

  • Check if there are log messages or errors in the DB log (indicating that connections from front/back are happening)
  • Run a SELECT on the DB "basecorpora" table to check that everything is in place.

@jonwolds
Author

jonwolds commented Mar 14, 2023

Hi again Marta,

This is what I have in the basecorpora table:

"id"	"name"	"description"	"source_lang"	"target_lang"	"sentences"	"size_mb"	"solr_collection"	"is_active"	"is_highlight"
1	"TMXcore FR-EN"	"French English tmx"	12	1	22093	20	"tmxcore"	true	true

Can you see anything obviously wrong? tmxcore is the name of the solr core.

Thanks again for your help!

Jon

@jonwolds
Author

jonwolds commented Mar 14, 2023

I can see that the search terms (e.g. charter here) are making it from dp-front to dp-solr, but no hits are displayed.
This is the log from dp-solr:

2023-03-14 19:52:12.238 INFO  (qtp1622458036-22) [ x:tmxcore] o.a.s.c.S.Request webapp=/solr path=/select params={q=trg:"charter"&hl=true&start=0&hl.fragsize=0&hl.fl=trg&sort=custom_score+desc&rows=50&wt=json} hits=38 status=0 QTime=11
2023-03-14 19:52:38.274 INFO  (qtp1622458036-25) [ x:tmxcore] o.a.s.c.S.Request webapp=/solr path=/select params={q=src:"charter"&hl=true&start=0&hl.fragsize=0&hl.fl=src&sort=custom_score+desc&rows=50&wt=json} hits=1 status=0 QTime=1
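Since the log shows hits=38 and hits=1, Solr itself is finding matches. To look at the raw JSON outside the frontend, the same request can be rebuilt from the logged parameters. This is a sketch: the base URL assumes port 8983 is published on localhost, and select_url is a made-up helper name.

```python
from urllib.parse import urlencode


def select_url(core, field, term, base="http://localhost:8983/solr"):
    """Rebuild the /select request seen in the dp-solr log for one field."""
    params = {
        "q": f'{field}:"{term}"',
        "hl": "true",
        "start": 0,
        "hl.fragsize": 0,
        "hl.fl": field,
        "sort": "custom_score desc",
        "rows": 50,
        "wt": "json",
    }
    return f"{base}/{core}/select?" + urlencode(params)


url = select_url("tmxcore", "trg", "charter")
print(url)
```

Opening the printed URL in a browser, or fetching it with urllib.request.urlopen, shows what dp-back receives, including how fields such as custom_score are typed in the response.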

@jonwolds
Author

jonwolds commented Mar 15, 2023

OK, I think by reversing the order of the languages, so English is first (source) and French the second (target), that initial error is averted. However, the following error is now showing up in the gunicorn-api-error.log in dp-back:

[2023-03-15 21:00:55,678] ERROR in app: Exception on /search [GET]
Traceback (most recent call last):
  File "/opt/dp/back/venv/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/dp/back/venv/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/flask_restx/api.py", line 375, in wrapper
    resp = resource(*args, **kwargs)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/flask/views.py", line 89, in view
    return self.dispatch_request(*args, **kwargs)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/flask_restx/resource.py", line 44, in dispatch_request
    resp = meth(*args, **kwargs)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/flask_login/utils.py", line 272, in decorated_view
    return func(*args, **kwargs)
  File "/opt/dp/back/api/resources/search.py", line 55, in get
    return SearchResponse.schema().dump(search_response), 200
  File "/opt/dp/back/venv/lib/python3.7/site-packages/dataclasses_json/mm.py", line 343, in dump
    dumped = Schema.dump(self, obj, many=many)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/schema.py", line 558, in dump
    result = self._serialize(processed_obj, many=many)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/schema.py", line 523, in _serialize
    value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 328, in serialize
    return self._serialize(value, attr, obj, **kwargs)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 716, in _serialize
    return [self.inner._serialize(each, attr, obj, **kwargs) for each in value]
  File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 716, in <listcomp>
    return [self.inner._serialize(each, attr, obj, **kwargs) for each in value]
  File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 583, in _serialize
    return schema.dump(nested_obj, many=many)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/dataclasses_json/mm.py", line 343, in dump
    dumped = Schema.dump(self, obj, many=many)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/schema.py", line 558, in dump
    result = self._serialize(processed_obj, many=many)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/schema.py", line 523, in _serialize
    value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 328, in serialize
    return self._serialize(value, attr, obj, **kwargs)
  File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 916, in _serialize
    ret = self._format_num(value)  # type: _T
  File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 891, in _format_num
    return self.num_type(value)
TypeError: float() argument must be a string or a number, not 'list'

@mbanon
Member

mbanon commented Mar 16, 2023

Hi! Yes, I think that having English first is mandatory.
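A small sketch of why the direction of the pair matters. It uses an in-memory SQLite table as a stand-in for the real Postgres "basecorpora" table, with the column names and the row posted earlier in this thread; the query itself is an assumption about how a pair lookup might work, not the project's actual SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the Postgres dpdb database
conn.execute(
    """CREATE TABLE basecorpora (
        id INTEGER PRIMARY KEY, name TEXT, description TEXT,
        source_lang INTEGER, target_lang INTEGER, sentences INTEGER,
        size_mb INTEGER, solr_collection TEXT,
        is_active BOOLEAN, is_highlight BOOLEAN)"""
)
# The row as originally inserted: source_lang 12, target_lang 1.
conn.execute(
    "INSERT INTO basecorpora VALUES "
    "(1, 'TMXcore FR-EN', 'French English tmx', 12, 1, 22093, 20, 'tmxcore', 1, 1)"
)


def corpora_by_pair(conn, source_lang, target_lang):
    """Hypothetical pair lookup: matches only in the stored direction."""
    return conn.execute(
        "SELECT name, solr_collection FROM basecorpora "
        "WHERE source_lang = ? AND target_lang = ?",
        (source_lang, target_lang),
    ).fetchall()


print(corpora_by_pair(conn, 12, 1))  # stored direction: one row
print(corpora_by_pair(conn, 1, 12))  # opposite direction: empty list
```

Indexing the empty result with [0] is what produces "IndexError: list index out of range", so a row stored in the opposite direction behaves like a missing row.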

As for the last error, I have never seen it before. I see that "flask_login" is mentioned in the traceback. As mentioned above, you were not using google login. How are you managing authorization and users? Login is needed in search requests (https://github.com/paracrawl/corset/blob/master/back/api/resources/search.py#L18)

@jonwolds
Author

Yes, it's a weird error.

I don't think it's linked to login, because that is now working perfectly. My earlier comment was incorrect!
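One hypothesis, not confirmed anywhere in this thread: the TypeError would appear if a numeric field reaches the serializer as a list rather than a single number, which can happen when a Solr field is defined as multi-valued. The sketch below reproduces the failure and shows one defensive conversion; to_float is a made-up helper, not part of corset or marshmallow.

```python
def to_float(value):
    """Convert value to float, unwrapping a single-element list or tuple first."""
    if isinstance(value, (list, tuple)):
        if len(value) != 1:
            raise ValueError(f"expected exactly one value, got {len(value)}")
        value = value[0]
    return float(value)


# float() on a list fails the same way as in the traceback above.
try:
    float([0.5])
except TypeError as exc:
    print(f"TypeError: {exc}")

print(to_float([0.5]))
print(to_float("0.25"))
```

If this is the cause, fixing the field definition in the Solr schema would be cleaner than unwrapping values in the application.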
