Raw access denied #29

Closed
alexivkin opened this issue Feb 28, 2019 · 6 comments

Comments

@alexivkin

alexivkin commented Feb 28, 2019

The group is private and the crawler itself runs fine, but when I run the resulting bash script no messages are downloaded. When I paste the .../forum/message/raw?msg=... URL into the same browser I exported the cookies from, I get "access to groups.google.com was denied". Browsing works fine and messages show up correctly in the normal UI, but I noticed that there is no "show original" option in the drop-down next to the message.

I've checked all the permissions and could not find anything that refers to the "show original" option.

  1. Is there something I need to configure to get the original message?
  2. If not, how could I change the URL to download the message text? It's OK if it does not come in RFC 822 format; plain HTML or text would do.
@icy
Owner

icy commented Feb 28, 2019

Hi @alexivkin, what kind of group is it? The script can't download from adult-content groups; that kind of group causes a problem similar to yours.

@alexivkin
Author

It's a normal private GSuite group

@icy
Owner

icy commented Feb 28, 2019

> It's a normal private GSuite group

Please make sure that you have set the environment variable _ORG=Your_Gsuite_Domain before you start the script. If you have already done that, please check whether the group allows archive access:

  • Group Settings -> Content Control -> Group content classification: Should be Everyone, and Archive Options should be enabled.

My G Suite account has expired, so I don't have any better ideas right now. As long as the Show original option is not available for a message in your browser, the script can't download anything.
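For readers following along, the `_ORG` step mentioned above could be sketched roughly as follows. This is only an illustration, not part of the crawler: `example.com` is a placeholder domain and `require_org` is a hypothetical helper name that simply refuses to continue when `_ORG` is empty.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check, not part of the crawler itself.
# example.com is a placeholder; use your own G Suite domain.
export _ORG="${_ORG:-example.com}"

require_org() {
  # Stop early when _ORG is empty, because the maintainer asks for it
  # to be set before the script is started.
  if [ -z "${_ORG:-}" ]; then
    echo "_ORG is not set" >&2
    return 1
  fi
  echo "Using G Suite domain: ${_ORG}"
}

require_org
```

Running the check in the same shell session that later starts the crawler keeps the exported value visible to it.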

@alexivkin
Author

For some reason I can't find the content classification setting in the group settings, even though I am the owner.
I figured out a different solution: changing the wget URL from the raw endpoint to the print view,
https://groups.google.com/a/....com/forum/print/msg/....

Although the message headers and other metadata are missing, it's good enough for my needs. Thank you very much for the excellent script that you wrote!
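The URL change described above might be applied to a generated download script along these lines. This is a minimal sketch under assumptions: `raw_to_print` is a hypothetical helper name, the sample domain and message path are placeholders, and it assumes the part after `raw?msg=` can be reused unchanged after `print/msg/`.

```shell
#!/usr/bin/env bash
# Sketch: rewrite raw-message URLs to the print view.
# Assumption: whatever follows "raw?msg=" is also valid after "print/msg/".
raw_to_print() {
  # In a basic regular expression "?" is a literal character,
  # so the pattern matches the query string as written.
  sed -e 's#/forum/message/raw?msg=#/forum/print/msg/#g'
}

# Hypothetical sample line; domain, group, topic and message id are placeholders.
echo 'wget "https://groups.google.com/a/example.com/forum/message/raw?msg=mygroup/TOPIC/MSG"' | raw_to_print
```

With a generated script saved under a name such as `wget.sh` (a hypothetical file name), `raw_to_print < wget.sh > wget-print.sh` would produce a variant that fetches the print view instead. As noted in the thread, that view lacks the message headers and other metadata.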

@icy
Owner

icy commented Mar 2, 2019

Thanks a lot, @alexivkin. I'm happy that you could get around the problem. The print function, and how you found it, is really interesting. Let's leave this issue open in case someone can figure out more details.

Have a nice weekend.

icy closed this as completed on Apr 12, 2020
@icy
Owner

icy commented Apr 12, 2020

I have updated the README to mention the trick: https://github.com/icy/google-group-crawler#contributions. Thanks again.
