
Request: add support for snappy-compressed files delivered to OSS by Alibaba Cloud Log Service (SLS) #75

Open
cfgxy opened this issue Jul 5, 2021 · 2 comments

Comments

@cfgxy

cfgxy commented Jul 5, 2021

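Support for hadoop-snappy currently works fine.
However, the snappy-compressed files that SLS delivers to OSS do not appear to be hadoop-snappy.
A binary comparison shows that the header of an SLS-delivered snappy file is a few bytes shorter than that of a normal hadoop-snappy file, and the snzip tool only decompresses it correctly when given the option -t raw.
Within Alibaba Cloud's own ecosystem, adding support for this format seems reasonable.

For reference, the list of formats supported by snzip: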

snzip 1.0.4

  Usage: snzip [option ...] [file ...]

  general options:
   -c       output to standard output, keep original files unchanged
   -d       decompress
   -k       keep (don't delete) input files
   -t name  file format name. see below. The default format is framing2.
   -h       give this help

  raw_format option:
   -s size  size of input data when compressing.
            The default value is the file size if available.

  tuning options:
   -b num   internal block size in bytes
   -B num   internal block size. 'num'-th power of two.
   -R num   size of read buffer in bytes
   -W num   size of write buffer in bytes
   -T       trace for debug

  supported formats:
    NAME            SUFFIX  URL
    ----            ------  ---
    framing2        sz      https://github.com/google/snappy/blob/master/framing_format.txt
    hadoop-snappy   snappy  https://code.google.com/p/hadoop-snappy/
    raw             raw     https://github.com/google/snappy/blob/master/format_description.txt
    iwa             iwa     https://github.com/obriensp/iWorkFileFormat/blob/master/Docs/index.md#snappy-compression
    framing         sz      https://github.com/google/snappy/blob/0755c815197dacc77d8971ae917c86d7aa96bf8e/framing_format.txt
    snzip           snz     https://github.com/kubo/snzip
    snappy-java     snappy  https://github.com/xerial/snappy-java
    snappy-in-java  snappy  https://github.com/dain/snappy
    comment-43      snappy  http://code.google.com/p/snappy/issues/detail?id=34#c43
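
To illustrate what the "raw" format in the list above means, here is a minimal pure-Python sketch of a decoder written from the public snappy format_description.txt. It is not code from JindoFS, SLS, or snzip, and the function names `read_varint` and `decompress_raw_snappy` are hypothetical. A raw stream is just a varint holding the uncompressed length followed by literal and copy elements, with no container header in front, which may be why a reader expecting hadoop-snappy framing rejects these files.

```python
def read_varint(buf, pos):
    """Read a little-endian base-128 varint; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            return result, pos
        shift += 7


def decompress_raw_snappy(data):
    """Decode a raw snappy stream (no framing, no container header)."""
    expected_len, pos = read_varint(data, 0)
    out = bytearray()
    while pos < len(data):
        tag = data[pos]
        pos += 1
        kind = tag & 0x03
        if kind == 0:
            # Literal: upper 6 bits hold length-1, or 60..63 to signal
            # that length-1 is stored in the next 1..4 bytes.
            length = (tag >> 2) + 1
            if length > 60:
                extra = length - 60
                length = int.from_bytes(data[pos:pos + extra], "little") + 1
                pos += extra
            out += data[pos:pos + length]
            pos += length
            continue
        if kind == 1:
            # Copy with 1-byte offset: 3-bit length, 11-bit offset.
            length = 4 + ((tag >> 2) & 0x07)
            offset = ((tag >> 5) << 8) | data[pos]
            pos += 1
        elif kind == 2:
            # Copy with 2-byte little-endian offset.
            length = (tag >> 2) + 1
            offset = int.from_bytes(data[pos:pos + 2], "little")
            pos += 2
        else:
            # Copy with 4-byte little-endian offset.
            length = (tag >> 2) + 1
            offset = int.from_bytes(data[pos:pos + 4], "little")
            pos += 4
        if offset == 0 or offset > len(out):
            raise ValueError("invalid copy offset")
        start = len(out) - offset
        # Copy byte by byte because source and destination may overlap.
        for i in range(length):
            out.append(out[start + i])
    if len(out) != expected_len:
        raise ValueError("decoded length does not match the preamble")
    return bytes(out)


# Hand-built stream: length 9, literal "abc", then copy 6 bytes from offset 3.
sample = bytes([0x09, 0x08, 0x61, 0x62, 0x63, 0x09, 0x03])
decoded = decompress_raw_snappy(sample)
```

This sketch favors readability over speed; for real files a maintained snappy binding would be the better choice.
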
@adrian-wang
Collaborator

This has little to do with JindoFS; using emr-hadoop should solve your problem.

@drankye

drankye commented Jul 7, 2021

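> Within Alibaba Cloud's own ecosystem, adding support for this format seems reasonable.

Could you clarify this? For example, what kind of support do you need the JindoFS SDK to provide for data in this format on OSS?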
