An equivalent for nginx map module #2824

Closed
git001 opened this issue Oct 21, 2019 · 9 comments · Fixed by #3199

@git001
Contributor

git001 commented Oct 21, 2019

1. What would you like to have changed?

This is the issue for https://caddy.community/t/whats-a-equivalent-for-nginx-map-module/6111/2

We use the nginx map for some dynamic directory decisions.

Here is the config snippet.

# configuration file /home/nginx/server/conf/global_map_data_storage.conf:
map $camid_date $storageid {
  default 3;

  include /home/nginx/server/conf/global_map_data_storage_data.conf;
}

# configuration file /home/nginx/server/conf/global_map_data_storage_data.conf:
'7/2010/08/10' 10;
'7/2010/08/11' 10;
'7/2010/08/12' 10;
'7/2010/08/13' 10;
'7/2010/08/14' 10;
...
536491 more lines
11M file size

# configuration file /home/nginx/nginx.conf

    location ~ /cams/(\d+)/\w+\.(jpg|xml)$ {
      set  $camid $1;
      root /web/data/data$storageid;
    }

2. Why is this feature a useful, necessary, and/or important addition to this project?

With the map it is possible to make dynamic routing decisions.

3. What alternatives are there, or what are you doing in the meantime to work around the lack of this feature?

I still use nginx as the web server.

4. Please link to any relevant issues, pull requests, or other discussions.

https://caddy.community/t/whats-a-equivalent-for-nginx-map-module/6111

@git001 git001 added the feature ⚙️ New feature or request label Oct 21, 2019
@mholt
Member

mholt commented Oct 21, 2019

Is a literal map lookup all that is needed? Or do we have to consider fancy types of lookups like IP ranges? I ask because at scale the data structure matters significantly. Iterating a list of CIDR ranges for example is not efficient like a map lookup, but a hashmap lookup won't really do for CIDR ranges.

Do you just need to map a string to some value?

(When I say "you" I mean people who need this, in general -- I read the forum thread which you detailed pretty well for your case)

@git001
Contributor Author

git001 commented Oct 21, 2019

> Do you just need to map a string to some value?

Exactly. Well, it's like a string or regex search. That's the exact definition of the map module:

"The ngx_http_map_module module creates variables whose values depend on values of other variables."

That's the config signature.
map string $variable { ... }

Have you seen the docs?
http://nginx.org/en/docs/http/ngx_http_map_module.html#map

@mholt
Member

mholt commented Oct 21, 2019

Yep, I've seen it. But I frankly don't care what nginx does 😉 I want to know what your actual needs are.

Regexp is tricky because we can't just do a simple hashmap lookup. We'd need a different data structure. I don't know of one that implements lookup capabilities like a hashmap that is faster than O(n).
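One common compromise, sketched below in Go, is to keep literal keys in a hashmap and fall back to an ordered list of precompiled regular expressions only when no literal matches, so the O(n) cost applies to the regex entries alone. This is a sketch under stated assumptions, not a proposal for Caddy's actual design: `Mapper`, `regexRule`, and `NewMapper` are hypothetical names, and the "exact first, then first matching regex, then default" order follows the priority the nginx map documentation describes.

```go
package main

import "regexp"

// regexRule pairs a compiled pattern with the value it maps to.
type regexRule struct {
	Pattern *regexp.Regexp
	Value   string
}

// Mapper is a hypothetical two-tier lookup: exact strings, then regexes.
type Mapper struct {
	Default string
	Exact   map[string]string
	Regex   []regexRule
}

// NewMapper compiles every pattern once, up front, so that lookups never
// pay the compilation cost. Each pattern pair is {expression, value}.
func NewMapper(def string, exact map[string]string, patterns [][2]string) (*Mapper, error) {
	m := &Mapper{Default: def, Exact: exact}
	for _, p := range patterns {
		re, err := regexp.Compile(p[0])
		if err != nil {
			return nil, err
		}
		m.Regex = append(m.Regex, regexRule{Pattern: re, Value: p[1]})
	}
	return m, nil
}

// Lookup tries the hashmap first, then scans the regexes in declaration
// order, and finally returns the default.
func (m *Mapper) Lookup(s string) string {
	if v, ok := m.Exact[s]; ok {
		return v
	}
	for _, r := range m.Regex {
		if r.Pattern.MatchString(s) {
			return r.Value
		}
	}
	return m.Default
}
```

A table with half a million literal entries and a handful of patterns would then cost one map lookup plus, at worst, a few regex evaluations per request.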

@git001
Contributor Author

git001 commented Oct 24, 2019

> Yep, I've seen it. But I frankly don't care what nginx does 😉 I want to know what your actual needs are.

Okay.
My current requirement is mainly static string matches as shown in the initial description.

We also use the map for blocking requests like the one below.

# http request line: "GET /index.php?culture=../../../../../../../../../../windows/win.ini&name=SP.JSGrid.Res&rev=laygpE0lqaosnkB4iqx6mA%3D%3D&sections=All%3Cscript%3Ealert(12345)%3C/script%3Ez HTTP/1.1"
# http uri: "/index.php"
# http args: "culture=../../../../../../../../../../windows/win.ini&name=SP.JSGrid.Res&rev=laygpE0lqaosnkB4iqx6mA%3D%3D&sections=All%3Cscript%3Ealert(12345)%3C/script%3Ez"
# http exten: "php"

map $args $block {
  default 0;
  "~(boot|win)\.ini" 1;
  "~etc/passwd" 1;
}
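For comparison, the same blocking rule could be expressed as a small Go HTTP middleware. This is a hedged sketch: `shouldBlock` and `blockMiddleware` are hypothetical names, the two patterns are copied from the map above, and answering with 403 is an assumption, since the snippet above does not show what is done with `$block`.

```go
package main

import (
	"net/http"
	"regexp"
)

// blockPatterns mirrors the two case-sensitive regex entries of the map.
// They are compiled once at program start.
var blockPatterns = []*regexp.Regexp{
	regexp.MustCompile(`(boot|win)\.ini`),
	regexp.MustCompile(`etc/passwd`),
}

// shouldBlock reports whether the raw (undecoded) query string matches any
// of the patterns, like `$args` being matched in the nginx map.
func shouldBlock(rawQuery string) bool {
	for _, re := range blockPatterns {
		if re.MatchString(rawQuery) {
			return true
		}
	}
	return false
}

// blockMiddleware rejects matching requests with 403 (an assumed response)
// and passes all other requests on to next.
func blockMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if shouldBlock(r.URL.RawQuery) {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```

Wrapping a handler as `blockMiddleware(mux)` would apply the check to every request before routing.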

And for redirecting like the example below.

map $http_host $dest_name {
    hostnames;
    default notmapped;

    include /home/nginx/server/conf/frontentries-map.conf;
}
------
cat /home/nginx/server/conf/frontentries-map.conf
https://7.DOMAIN.com  https://sub-dom.DOMAIN.com;
https://8.DOMAIN.com  https://sub-dom.DOMAIN.com;
https://9.DOMAIN.com  https://sub-dom.DOMAIN.com;
https://13.DOMAIN.com https://sub-dom.DOMAIN.com;
https://16.DOMAIN.com https://sub-dom.DOMAIN.com;
https://17.DOMAIN.com https://sub-dom.DOMAIN.com;
https://19.DOMAIN.com https://sub-dom.DOMAIN.com;
https://21.DOMAIN.com https://sub-dom.DOMAIN.com;
https://22.DOMAIN.com https://sub-dom.DOMAIN.com;
-----
cat detail.conf
                   if ($dest_name != "notmapped") {
                       return 302 https://$dest_name$request_uri;
                   }
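A rough Go equivalent of this host-based redirect is sketched below. It rests on several assumptions that are not in the config above: the table keys and values are bare hostnames without a scheme or port, host matching is case-insensitive, and the wildcard masks that the `hostnames` parameter enables in nginx are not handled. `destFor` and `redirectMiddleware` are hypothetical names.

```go
package main

import (
	"net/http"
	"strings"
)

// notMapped is the sentinel used when a host has no entry, matching the
// `default notmapped;` line of the nginx map.
const notMapped = "notmapped"

// destFor looks up the destination host for the given request host.
// Keys in hosts are assumed to be lowercase bare hostnames.
func destFor(hosts map[string]string, host string) string {
	if d, ok := hosts[strings.ToLower(host)]; ok {
		return d
	}
	return notMapped
}

// redirectMiddleware sends a 302 to the mapped host, keeping the original
// path and query, and falls through to next for unmapped hosts.
func redirectMiddleware(hosts map[string]string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if dest := destFor(hosts, r.Host); dest != notMapped {
			http.Redirect(w, r, "https://"+dest+r.URL.RequestURI(), http.StatusFound)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```

Because the table is a plain Go map, adding thousands of host entries does not change the per-request cost of the lookup.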

> Regexp is tricky because we can't just do a simple hashmap lookup. We'd need a different data structure. I don't know of one that implements lookup capabilities like a hashmap that is faster than O(n).

Full Ack.
It looks like nginx and haproxy loop over the list of regexes, but they somehow build a map/list/array before they start regex matching. Maybe you can get some ideas from those implementations. I'm not the algorithm guy, so I can only suggest some ideas.

https://hg.nginx.org/nginx/file/tip/src/http/modules/ngx_http_map_module.c#l527
http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/pattern.c;hb=68680bb14e5e4b3f1c0245ab956e9aee669cdac0#l565

The last time I looked into pattern matching was when I helped integrate https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore%E2%80%93Horspool_algorithm into awffull https://salsa.debian.org/debian/awffull/blob/master/src/linklist.c, but that is a string-matching algorithm, not a regex one.

@mholt
Member

mholt commented Oct 26, 2019

Excellent, thank you for the details!! I am excited to work on this more, but right now my priorities are finishing up logging, some needed enhancements to the TLS app, working on the admin endpoint, and adding more tests.

In the meantime, feel free to continue discussion here and even submit a PR after that. Especially welcome would be specific implementation details for this feature.

@mholt mholt added this to the 2.0 milestone Oct 26, 2019
@mholt mholt added the v2 label Oct 26, 2019
@git001
Contributor Author

git001 commented Dec 15, 2019

Any update on this issue?

@mholt
Member

mholt commented Dec 16, 2019

Nope. It's not a priority right now, but we can get it implemented right away if you're able to fund its development. Right now our primary goal is to get 2.0 released.

Let me know if you'd like us to develop this for you!

@git001
Contributor Author

git001 commented Dec 16, 2019

Thank you for your offer. It's not as important as the last funded feature (#2012), so I will close the issue.

@git001 git001 closed this as completed Dec 16, 2019
@mholt mholt reopened this Dec 16, 2019
@mholt
Member

mholt commented Dec 16, 2019

I still want it eventually though so I'll keep this open :)
