[Bug]: When using an HDFS cluster protected by Kerberos, the fs.hdfs.impl.disable.cache=false configuration has no effect and FileSystem is unable to hit the cache #1857

Closed
1 task done
Tracked by #1930
861752346 opened this issue Aug 16, 2023 · 3 comments · Fixed by #1990
Labels
good first issue Good for newcomers priority:major type:bug Something isn't working

Comments

@861752346
Contributor

What happened?

A Kerberos account usually consists of a principal part and a realm part, in the form principal@Realm. When a catalog is created through AMS, the uploaded krb5.conf file already contains the realm information, so the account can be abbreviated to just the principal. If the abbreviated form is entered during catalog creation, FileSystem never hits the cache: new FileSystem instances are created on every access and are never garbage collected, which eventually leads to an out-of-memory error.
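The leak described above is consistent with how Hadoop's FileSystem cache builds its keys: the key includes the UserGroupInformation, and two UGI objects are equal only when they wrap the same login Subject, so each fresh login produces a fresh cache entry. Below is a minimal, Hadoop-free sketch of that effect. `Login`, `FsCacheSketch`, and `CacheMissDemo` are hypothetical stand-ins, not Amoro or Hadoop classes, and the URI and user name are made up.

```java
import java.util.HashMap;
import java.util.Map;

/** Demo driver: counts cached instances with and without a re-login per access. */
final class CacheMissDemo {
    static int instancesAfter(int accesses, boolean reLoginEachTime) {
        FsCacheSketch cache = new FsCacheSketch();
        Login shared = new Login("hdfs@EXAMPLE.COM");
        for (int i = 0; i < accesses; i++) {
            // A failed principal check forces a new login, i.e. a new identity object.
            Login login = reLoginEachTime ? new Login("hdfs@EXAMPLE.COM") : shared;
            cache.get("hdfs://nameservice1", login);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(instancesAfter(1000, false)); // prints 1
        System.out.println(instancesAfter(1000, true));  // prints 1000
    }
}

/** Hypothetical login identity; equality is by object identity, as assumed for a UGI. */
final class Login {
    final String userName;

    Login(String userName) {
        this.userName = userName;
    }
    // equals/hashCode are intentionally not overridden.
}

/** Hypothetical cache keyed by (login, uri); values stand in for FileSystem instances. */
final class FsCacheSketch {
    private final Map<Login, Map<String, Object>> cache = new HashMap<>();

    Object get(String uri, Login login) {
        return cache.computeIfAbsent(login, l -> new HashMap<>())
                    .computeIfAbsent(uri, u -> new Object());
    }

    int size() {
        return cache.values().stream().mapToInt(Map::size).sum();
    }
}
```

With a shared login the cache holds a single entry; with a new login per access it grows by one entry each time, which matches the unbounded growth reported here.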
I suggest changing the check on line 239 of com.netease.arctic.table.TableMetaStore from !ugi.getUserName().equals(krbPrincipal) to !ugi.getUserName().startsWith(krbPrincipal), so that the abbreviated Kerberos account form is also accepted.
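To make the difference between the two checks concrete, here is a small self-contained sketch that uses plain strings in place of ugi.getUserName() and krbPrincipal. The class name, method names, and sample principals are hypothetical; the real logic lives in TableMetaStore and is not reproduced here. It assumes the UGI user name carries the realm, as the report implies.

```java
/** Hypothetical sketch of the two principal checks discussed in this issue. */
final class PrincipalCheckSketch {

    /** Original check: a re-login is triggered unless the names are exactly equal. */
    static boolean needsReloginEquals(String ugiUserName, String krbPrincipal) {
        return !ugiUserName.equals(krbPrincipal);
    }

    /** Proposed check: accept a configured principal that is a prefix of the UGI user name. */
    static boolean needsReloginStartsWith(String ugiUserName, String krbPrincipal) {
        return !ugiUserName.startsWith(krbPrincipal);
    }

    public static void main(String[] args) {
        String ugiUserName = "hdfs@EXAMPLE.COM"; // assumed full form returned by the UGI

        // Abbreviated principal: the exact comparison always asks for a new login.
        System.out.println(needsReloginEquals(ugiUserName, "hdfs"));     // prints true
        // The prefix comparison accepts the abbreviated form.
        System.out.println(needsReloginStartsWith(ugiUserName, "hdfs")); // prints false
        // Both checks accept the full form.
        System.out.println(needsReloginEquals(ugiUserName, "hdfs@EXAMPLE.COM")); // prints false
    }
}
```

One design caveat: a pure prefix match would also accept "hdfs" against an unrelated user name such as "hdfsadmin@EXAMPLE.COM", so comparing only the part before the "@" is a stricter alternative.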

Affects Versions

0.5.0

What engines are you seeing the problem on?

No response

How to reproduce

Create a catalog with Kerberos authentication and fill in only the principal part in the Kerberos principal field, omitting the realm information.

Relevant log output

No response

Anything else

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@861752346 861752346 added the type:bug Something isn't working label Aug 16, 2023
@zhoujinsong
Contributor

Thank you very much for reporting this issue and identifying the root cause.

Would you be interested in fixing this problem?

@shidayang shidayang added the good first issue Good for newcomers label Aug 25, 2023
@861752346
Contributor Author

Yes, I'm interested in fixing this problem.

@861752346 861752346 reopened this Sep 9, 2023
861752346 pushed a commit to 861752346/amoro that referenced this issue Sep 9, 2023
… catalog creation is abbreviated, it will cause the FileSystem to be unable to hit the cache, constantly create new FileSystem instances without being garbage collected, eventually leading to an out of memory error (apache#1857)

* extract principal from username of ugi,and compare to krbPrincipal
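One possible reading of "extract principal from username of ugi, and compare to krbPrincipal" is sketched below. This is an assumption about the approach, not the code from the commit; `PrincipalMatchSketch`, `stripRealm`, and `samePrincipal` are hypothetical names, and the sample principals are made up.

```java
/** Hypothetical sketch: compare a configured principal with a UGI user name, realm-aware. */
final class PrincipalMatchSketch {

    /** Returns the part of a Kerberos name before the last '@', or the name itself if no realm is present. */
    static String stripRealm(String name) {
        int at = name.lastIndexOf('@');
        return at < 0 ? name : name.substring(0, at);
    }

    /**
     * True when the configured principal denotes the same account as the UGI user name.
     * A configured principal with a realm is compared in full; an abbreviated one
     * is compared against the realm-stripped UGI user name.
     */
    static boolean samePrincipal(String ugiUserName, String krbPrincipal) {
        if (krbPrincipal.indexOf('@') >= 0) {
            return ugiUserName.equals(krbPrincipal);
        }
        return stripRealm(ugiUserName).equals(krbPrincipal);
    }

    public static void main(String[] args) {
        System.out.println(samePrincipal("hdfs@EXAMPLE.COM", "hdfs"));             // prints true
        System.out.println(samePrincipal("hdfs@EXAMPLE.COM", "hdfs@EXAMPLE.COM")); // prints true
        System.out.println(samePrincipal("hdfsadmin@EXAMPLE.COM", "hdfs"));        // prints false
    }
}
```

Unlike a prefix match, this comparison does not treat "hdfs" as matching "hdfsadmin@EXAMPLE.COM".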
@wangtaohz
Contributor

wangtaohz commented Sep 14, 2023

> Yes, I'm interested in fixing this problem.

@861752346 Can you submit a PR to fix this issue? I'd like to include this fix in the 0.5.1 road map. 😄

I think you have already fixed this issue with this commit. If you have any questions about submitting a PR, please feel free to contact us.

@wangtaohz wangtaohz mentioned this issue Sep 14, 2023
56 tasks
861752346 pushed a commit to 861752346/amoro that referenced this issue Sep 18, 2023
… catalog creation is abbreviated, it will cause the FileSystem to be unable to hit the cache, constantly create new FileSystem instances without being garbage collected, eventually leading to an out of memory error (apache#1857)

* extract principal from username of ugi,and compare to krbPrincipal
baiyangtx added a commit that referenced this issue Sep 18, 2023
…() (#1990)

* [AMORO-1857] Fix If the kerberos account information filled in during catalog creation is abbreviated, it will cause the FileSystem to be unable to hit the cache, constantly create new FileSystem instances without being garbage collected, eventually leading to an out of memory error (#1857)

* extract principal from username of ugi,and compare to krbPrincipal

---------

Co-authored-by: 罗旭 <luo.xu@trs.com.cn>
Co-authored-by: baiyangtx <xiangnebula@163.com>
ShawHee pushed a commit to ShawHee/arctic that referenced this issue Dec 29, 2023
…() (apache#1990)

* [AMORO-1857] Fix If the kerberos account information filled in during catalog creation is abbreviated, it will cause the FileSystem to be unable to hit the cache, constantly create new FileSystem instances without being garbage collected, eventually leading to an out of memory error (apache#1857)

* extract principal from username of ugi,and compare to krbPrincipal

---------

Co-authored-by: 罗旭 <luo.xu@trs.com.cn>
Co-authored-by: baiyangtx <xiangnebula@163.com>