Display and accept binary HBase data in escaped form #3729

Open
stoty opened this issue May 7, 2024 · 9 comments
Labels
ENHANCEMENT Issue type for requesting new feature or improvement


@stoty

stoty commented May 7, 2024

Description

Much of the time data in HBase (not just values, but often rowkeys and even the column qualifier) is binary, with the encoding determined by an application.

Currently, Hue cannot display this data in a usable manner. It tries to interpret the data as a string (UTF-8?), and the output is full of placeholders for unprintable characters. (At least, I was not able to change this from the UI.)

Due to the lack of standard encodings, and metadata on the encoding, it is not possible to display the decoded contents in a reliable manner.

The HBase shell solves this same problem by escaping binary data. Bytes that are printable ASCII characters are displayed as those characters, while bytes outside this range are displayed as escaped hex codes.
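As a rough illustration of that display rule, here is a minimal Python sketch. The function name `to_string_binary` is hypothetical (borrowed from the Java method linked in this issue), and escaping the backslash itself is an assumption made so that the output stays unambiguous; check the linked HBase source before relying on the details.

```python
def to_string_binary(data: bytes) -> str:
    """Render bytes in an HBase-shell-like escaped form (illustrative sketch)."""
    out = []
    for b in data:
        # Printable ASCII (space through tilde) is shown as-is.
        # Assumption: the backslash is escaped too, so the result can be
        # parsed back without ambiguity.
        if 0x20 <= b <= 0x7E and b != 0x5C:
            out.append(chr(b))
        else:
            # Everything else becomes \xNN with two uppercase hex digits.
            out.append("\\x%02X" % b)
    return "".join(out)


if __name__ == "__main__":
    print(to_string_binary(b"binary_key\x00\x01\xff"))
```

With the sample rowkey used later in this thread, this prints `binary_key\x00\x01\xFF`.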

While this is still not a very user-friendly format, most HBase users are familiar with it, and have workflows to handle it.

I propose doing the same in Hue, using the same encoding to display all data (that is not otherwise identified and handled).

Additionally, this encoding could also be supported in the editor, by accepting an escaped string and converting to its binary representation.
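A minimal sketch of the reverse direction, in Python, might look like the following. The name `to_bytes_binary` is hypothetical, and two behaviours are assumptions of this sketch rather than anything specified here: malformed escapes are kept as literal characters, and non-ASCII input characters are rejected.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def to_bytes_binary(text: str) -> bytes:
    """Convert an escaped string such as 'key\\x00\\xFF' back to bytes (sketch)."""
    out = bytearray()
    i = 0
    while i < len(text):
        is_escape = (
            text[i] == "\\"
            and i + 3 < len(text)
            and text[i + 1] == "x"
            and text[i + 2] in HEX_DIGITS
            and text[i + 3] in HEX_DIGITS
        )
        if is_escape:
            # \xNN: decode the two hex digits into a single byte.
            out.append(int(text[i + 2:i + 4], 16))
            i += 4
        else:
            code = ord(text[i])
            # Assumption: only ASCII may appear unescaped in the input.
            if code > 0x7F:
                raise ValueError("non-ASCII character at index %d" % i)
            out.append(code)
            i += 1
    return bytes(out)


if __name__ == "__main__":
    print(to_bytes_binary("binary_key\\x00\\x01\\xff"))
```

An editor field could run user input through a function like this before sending the value to HBase.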

The escaping code is very simple. These are the Java methods for escaping / unescaping:

https://github.com/apache/hbase/blob/156e430dc56211c0aea15d792e8733b1b0e3de5c/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Bytes.java#L574
https://github.com/apache/hbase/blob/156e430dc56211c0aea15d792e8733b1b0e3de5c/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Bytes.java#L607

There are further possible enhancements, like being able to interpret the data as hex strings, or being able to dynamically switch the data encoding for cells/rows etc. to one of the standard encodings in org.apache.hadoop.hbase.util.Bytes, but those are less critical and can be handled separately.
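To give a feel for what such alternative views could look like, here is a small Python sketch of a hex view and a long/int view. The helper names are hypothetical, and the sketch assumes the value was written as a big-endian two's-complement number, which is how Bytes.toBytes(long) and Bytes.toBytes(int) serialize values; data written by an application with its own encoding would need a different decoder.

```python
import struct


def as_hex(data: bytes) -> str:
    """Show every byte as two lowercase hex digits."""
    return data.hex()


def as_long(data: bytes) -> int:
    """Decode an 8-byte big-endian signed long."""
    if len(data) != 8:
        raise ValueError("expected 8 bytes, got %d" % len(data))
    return struct.unpack(">q", data)[0]


def as_int(data: bytes) -> int:
    """Decode a 4-byte big-endian signed int."""
    if len(data) != 4:
        raise ValueError("expected 4 bytes, got %d" % len(data))
    return struct.unpack(">i", data)[0]


if __name__ == "__main__":
    value = struct.pack(">q", 1234567890)
    print(as_hex(value))   # 00000000499602d2
    print(as_long(value))  # 1234567890
```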

@stoty stoty added the ENHANCEMENT Issue type for requesting new feature or improvement label May 7, 2024
@bjornalm
Collaborator

bjornalm commented May 8, 2024

Hi @stoty, and thanks for reaching out. Can you provide a screenshot example of how this looks in HBase for a clearer picture?

@stoty
Author

stoty commented May 8, 2024

Creating and displaying binary data in hbase shell:

hbase:006:0>create 'demo', 'cf1';
Created table demo
Took 0.8557 seconds
=> Hbase::Table - demo
hbase:007:0>put 'demo', 'ascii_key', 'cf1:ascii_qialifier', 'ascii_value';
Took 1.2427 seconds
hbase:010:0> put 'demo', "binary_key\x00\x01\xff", "cf1:binary_qualifier\x00\x01\xff", "binary_value\x00\x01\xff";
Took 0.0089 seconds
hbase:019:0> scan 'demo'
ROW COLUMN+CELL
ascii_key column=cf1:ascii_qialifier, timestamp=2024-05-08T11:40:29.760, value=ascii_value
binary_key\x00\x01\xFF column=cf1:binary_qualifier\x00\x01\xFF, timestamp=2024-05-08T11:42:55.064, value=binary_value\x00\x01\xFF
2 row(s)
Took 0.0112 seconds

As you can see, the binary values are entered as escaped hex characters, and the results are displayed the same way.

The same data in Hue looks like this:

[Image: Screenshot from 2024-05-08 13-52-20]

The easiest and most HBase-like solution would be to use the same hex-escaped format in Hue.

This could be a toggle in the toolbar for backward compatibility.

@stoty
Author

stoty commented May 8, 2024

For ease of identification, I have used an ASCII prefix, but the data is often pure binary, like a long or an integer.

@bjornalm
Collaborator

bjornalm commented May 8, 2024

Thanks, let's leave this issue open to see if anyone in the community can create a PR for it.

@stoty
Author

stoty commented May 9, 2024

#77 is a similar issue.


github-actions bot commented Jun 9, 2024

This issue is stale because it has been open 30 days with no activity and is not labeled "Prevent stale". Remove "stale" label or comment or this will be closed in 10 days.

@stoty
Author

stoty commented Jun 20, 2024

For the record, this is NOT completed.

@MikhailShapovalov26

It would be cool to implement this task

@bjornalm
Collaborator

bjornalm commented Dec 2, 2024

@MikhailShapovalov26 Cool, why don't you give it a go?

@bjornalm bjornalm reopened this Dec 2, 2024
@github-actions github-actions bot removed the Stale label Dec 3, 2024