searx.space - shows incorrect data #29

Closed
ghost opened this issue Mar 10, 2020 · 7 comments

@ghost

ghost commented Mar 10, 2020

https://searx.space/ does not refresh the results for active instances, e.g. nibblehole.com: missing CSP grade. The search response time often shows incorrect data, suddenly 70% in red? I think the problem is with your network, maybe too much traffic on the server?

@dalf
Member

dalf commented Mar 11, 2020

TL;DR: I'm going to look into nibblehole.com in more detail.

Some technical details below:


missing CSP grade

All measures are done by searx-stats2 with two exceptions:

I noticed that two days ago many CSP grades were missing, and at the same time observatory.mozilla.org was taking a very long time to respond.

Note about the cache in searx-stats2: the CSP grade is updated every 24 hours, but not all searx instances are updated at the same time. For example, if nibblehole.com was first added at 03:00, its updates happen at 03:00 every day. Other instances can be updated at different times of the day. I know this is confusing, and it probably needs some improvement.

The CSP grade measurement could be embedded into searx-stats2 using https://github.com/mozilla/http-observatory. This could be a good thing, because all requests would then come from the same server, which makes searx-stats2 traffic clear to the instance owner. The downside is that searx-stats2 would need to follow the updates of http-observatory.
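To illustrate the schedule described above, here is a minimal sketch of how a per-instance 24-hour refresh anchored on the time of first check could be computed. This is not the searx-stats2 code; the function name `next_refresh` and the constant `REFRESH_INTERVAL` are hypothetical and only show why two instances can be refreshed at different times of the day.

```python
from datetime import datetime, timedelta

# Interval taken from the comment above; the name is hypothetical.
REFRESH_INTERVAL = timedelta(hours=24)


def next_refresh(first_checked: datetime, now: datetime) -> datetime:
    """Return the first refresh time strictly after `now`.

    Refreshes are anchored on `first_checked`, so an instance first
    checked at 03:00 is refreshed at 03:00 on every following day.
    """
    if now < first_checked:
        return first_checked
    # Number of full 24-hour periods elapsed since the first check.
    elapsed_periods = (now - first_checked) // REFRESH_INTERVAL
    return first_checked + (elapsed_periods + 1) * REFRESH_INTERVAL


if __name__ == "__main__":
    first = datetime(2020, 3, 8, 3, 0)
    print(next_refresh(first, datetime(2020, 3, 10, 15, 0)))
```

With this kind of anchoring, a grade that failed to load at one refresh stays missing until the next day's slot for that instance, which would match the behaviour reported in this issue.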


The search response time often shows incorrect data, suddenly 70% in red? I think the problem is with your network, maybe too much traffic on the server?

Perhaps there is too much traffic. Perhaps the Kimsufi network is not good enough. Perhaps it is related to the networks of some instances.

Some notes:

It would be better to measure from a dedicated server. This is related to https://github.com/dalf/searx-stats2/issues/1: basically, I was thinking of launching an AWS EC2 instance every 3 hours, which could be extended to instances in different locations later. If you have a better idea, you are welcome to share it.

Currently I'm working on https://github.com/dalf/searx-stats2/issues/9, so it will take some time before this is fixed.

@dalf
Member

dalf commented Mar 11, 2020

Actually, geckodriver (used to get the HTML grade) was writing gigabytes of logs.
I have disabled the HTML grade until there is a fix.

@ghost
Author

ghost commented Mar 26, 2020

Hi, two days ago we changed the IP address of https://nibblehole.com/. Unfortunately, https://searx.space/ now reports the status: Searx not found

@dalf
Member

dalf commented Mar 26, 2020

Yes, because of this line:
https://github.com/dalf/searx-stats2/blob/634c30527e1c31dd25558780873408898b33ef16/searxstats/fetcher/basic.py#L15

The regex doesn't match:
<meta name="generator" content="nibblehole powered by searx/0.16.0-60-822aee94">

Either:

  1. I make the regex more lenient.
  2. I use https://nibblehole.com/config to get the version.
  3. You remove the "nibblehole powered by " prefix.

Solution 2 (the .../config URL) is the most robust. I have not used it so far, to avoid fetching the .../config URL twice (it is also used in selfreport.py).

But I think I'm going to implement solution 2; just give me some time.
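As a rough illustration of options 1 and 2, here is a minimal sketch. The `STRICT` pattern is a hypothetical stand-in, not the actual regex in basic.py; it only reproduces the kind of mismatch described above. `TOLERANT` shows one way to accept an arbitrary prefix before `searx/`. The `version_from_config` helper assumes that the .../config response is JSON with a `version` key, which is an assumption to verify against a real instance.

```python
import json
import re

# Hypothetical strict pattern: the content must start with "searx/".
STRICT = re.compile(r'<meta name="generator" content="searx/([^"]+)">')

# Option 1: tolerate any text (without quotes) before "searx/".
TOLERANT = re.compile(r'<meta name="generator" content="[^"]*?searx/([^"]+)">')


def extract_version(html, pattern):
    """Return the searx version found in the generator meta tag, or None."""
    match = pattern.search(html)
    return match.group(1) if match else None


def version_from_config(config_text):
    """Option 2: read the version from the body of the .../config URL.

    Assumes the body is a JSON object with a "version" key.
    """
    return json.loads(config_text).get("version")


if __name__ == "__main__":
    tag = '<meta name="generator" content="nibblehole powered by searx/0.16.0-60-822aee94">'
    print(extract_version(tag, STRICT))    # None: the prefix breaks the match
    print(extract_version(tag, TOLERANT))  # 0.16.0-60-822aee94
```

The tolerant pattern still depends on the instance keeping `searx/` in the generator tag, which is why reading the version from .../config is the more robust of the two.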

@ghost
Author

ghost commented Mar 26, 2020

Done! We removed "nibblehole powered by".

@dalf
Member

dalf commented Mar 26, 2020

So https://nibblehole.com/ will appear online in less than 3 hours.

@ghost
Author

ghost commented Mar 26, 2020

thank you

This issue was closed.