How about Stackexchange? #17

Closed
r7l opened this issue Feb 14, 2023 · 38 comments · Fixed by #99

Comments

@r7l
Contributor

r7l commented Feb 14, 2023

I've found AnonymousOverflow through Libredirect. But it seems it's limited to Stackoverflow only. Are there any plans to extend it to Stackexchange?

@httpjamesm
Owner

Totally possible since they use the same layouts, but the issue is distinguishing between stack exchange and stack overflow threads.

Both platforms use the same URL path format, so it's a question of the best way to either guess which platform the user wants or give them a choice, which may come off as annoying to some.

@r7l
Contributor Author

r7l commented Feb 14, 2023

You're right. Guessing the platform would probably be hard if not impossible. To make it even worse, they have subdomains on Stackexchange.

@r7l
Contributor Author

r7l commented Feb 16, 2023

Would it be possible to add some sort of configuration option specifying whether a hosted instance is for Stackexchange or Stackoverflow? That way, some instances would handle Stackoverflow and others Stackexchange.

To overcome the issue with subdomains of Stackexchange, you could move them into the path string. So for example a meta.stackexchange.com question would become domain.com/meta/question/....

@httpjamesm
Owner

That's a great idea. I'll implement it soon.

@httpjamesm
Owner

After thinking about it a little, I actually think just creating a /exchange/:sub endpoint would be better.

@httpjamesm
Owner

httpjamesm commented Feb 20, 2023

Version 1.8 has been released with StackExchange support and is deployed on the Whatever Social instance. Let me know if this fits your needs.

Example: https://code.whatever.social/exchange/apple/questions/445555/macbook-pro-2017-folder-when-booting-on-battery-boots-fine-when-connected
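
To make the mapping in the example concrete, here is a minimal Python sketch of how a question URL could be rewritten into the `/exchange/:sub` form. It is an illustration only, not code from AnonymousOverflow: the function name `to_instance_url` is hypothetical, and it assumes the example above corresponds to a question on `apple.stackexchange.com`.

```python
from urllib.parse import urlsplit

SE_SUFFIX = ".stackexchange.com"


def to_instance_url(source_url: str, instance: str) -> str:
    """Rewrite a Stack Overflow or *.stackexchange.com URL for an instance.

    Hypothetical helper that only illustrates the /exchange/:sub scheme.
    """
    parts = urlsplit(source_url)
    host = (parts.hostname or "").lower()
    if host in ("stackoverflow.com", "www.stackoverflow.com"):
        # Stack Overflow paths are kept unchanged at the instance root.
        prefix = ""
    elif host.endswith(SE_SUFFIX):
        # The subdomain moves into the path: apple -> /exchange/apple
        sub = host[: -len(SE_SUFFIX)]
        prefix = f"/exchange/{sub}"
    else:
        raise ValueError(f"unsupported host: {host!r}")
    return f"{instance.rstrip('/')}{prefix}{parts.path}"
```

Under this sketch, a site with its own domain (superuser.com, for instance) is rejected, because only the `*.stackexchange.com` pattern is recognized.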

@r7l
Contributor Author

r7l commented Feb 20, 2023

Thanks a lot for your work! Somehow it does not seem to work yet. I see a 404 on the link you've added, and I am not able to enter any Stackexchange URL into the input bar on the landing page.

@httpjamesm
Owner

Can you send a screenshot of the 404 message? It's working on both my end and others'.

@r7l
Contributor Author

r7l commented Feb 20, 2023

There isn't much to see. It's a blank white page showing only 404 page not found.

The front page loads fine. But when entering a URL into the input field, it shows: Error: Invalid stack overflow URL in an orange box above the field.

Both happen on your https://code.whatever.social site.

@r7l
Contributor Author

r7l commented Feb 20, 2023

Not sure if you did something, but it works now. Both of them.

@privacytime101
Contributor

Hi, we deployed a new node yesterday, and I had forgotten to update it until just now. The reason it was working for multiple people was that they were routed to the US server (proximity-based), which was on 1.8. Thanks for reporting this.

@r7l
Contributor Author

r7l commented Feb 20, 2023

Thanks a lot! Great work!

On a side note: you may want to mention Libredirect in your README file, as they've already implemented your tool. It's probably easier for most people to use that in their browsers instead of the custom redirect scripts currently suggested.

I'll head over there to inform them about this new feature and have them extend their browser extensions to automatically redirect Stackexchange as well.

@adastx

adastx commented Feb 22, 2023

How about superuser.com?

For example:
https://superuser.com/questions/769452/what-is-a-openpgp-gnupg-key-id
https://code.whatever.social/superuser/questions/769452/what-is-a-openpgp-gnupg-key-id

seriously why is stack exchange structured like this??

@adastx

adastx commented Feb 23, 2023

Hey, looking at this again now I'm wondering, why not just implement a
general solution that works across all Stack Exchange Network sites?

Seems to me like they all use the same format. (question, description,
comments, votes, answers, chosen answer, related, ...)

An example url rewrite could look like:
{instance_url}/{site}/[meta/]{question_id}/{question_title}

So in practice:
https://exchange.adast.xyz/travel/meta/8451/january-2023-photo-competition-new

Or:
https://exchange.adast.xyz/superuser/769452/what-is-a-openpgp-gnupg-key-id

I might be wrong here, but it seems like the /question/ sub-path is kinda
redundant. Also, I'm suggesting that meta sub-domains be handled by adding a
/meta/ sub-dir.
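
As a rough sketch of how the proposed `{site}/[meta/]{question_id}/{question_title}` path could be turned back into an original URL, consider the Python below. Everything in it is an assumption for illustration: the `STANDALONE` table, the function name `to_original_url`, and the convention that meta sites live at `{site}.meta.stackexchange.com` or `meta.{domain}`.

```python
# Hypothetical table of sites that do not live under *.stackexchange.com.
# A real deployment would need a complete list.
STANDALONE = {
    "superuser": "superuser.com",
    "askubuntu": "askubuntu.com",
}


def to_original_url(path: str) -> str:
    """Reverse {site}/[meta/]{question_id}/{question_title} (proposed format)."""
    segments = [s for s in path.split("/") if s]
    if not segments:
        raise ValueError("empty path")
    site, rest = segments[0], segments[1:]

    # An optional "meta" segment selects the site's meta sub-domain.
    is_meta = bool(rest) and rest[0] == "meta"
    if is_meta:
        rest = rest[1:]
    if len(rest) != 2 or not rest[0].isdigit():
        raise ValueError(f"unexpected path: {path!r}")
    question_id, title = rest

    domain = STANDALONE.get(site)
    if domain is None:
        # Assumed default: the site is a *.stackexchange.com sub-domain.
        host = f"{site}.meta.stackexchange.com" if is_meta else f"{site}.stackexchange.com"
    else:
        host = f"meta.{domain}" if is_meta else domain
    return f"https://{host}/questions/{question_id}/{title}"
```

Note that the short site name alone does not say whether `superuser` means superuser.com or a `*.stackexchange.com` sub-domain, which is why the sketch needs a lookup table.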

@httpjamesm
Owner

Question IDs are NOT exclusive to each platform. This means that it's impossible for the server to take a correct guess as to which platform you intended, as discussed above.

@adastx

adastx commented Feb 23, 2023

it's impossible for the server to take a correct guess as to which
platform you intended, as discussed above.

I don't understand.

Assuming we wanted to generalize to support all sites then:

It's not the server's job to guess the exchange. The server just receives a
request to some resource in the format I suggested, and reverses the url
transformation to find the original instance of the question.

Then, using some scripts or something like libredirect we have the 'hook'
for when the user makes a request to an official exchange site. We pick an
instance, transform the url, reach the minimal version of the question.

Am I misunderstanding something?

@httpjamesm
Owner

Oh I think I see what you mean. I missed the part about the site and meta path.

I suppose this would work, but we'd have to compile a list of these types of sites.

@adastx

adastx commented Feb 23, 2023

Could be something for me to look into when I get some free time then :)
But you would be open to the idea?

@httpjamesm
Owner

Sure. Backwards compatibility may be an issue, but it's totally solvable.

@jonas-w

jonas-w commented Mar 15, 2023

@adastx @httpjamesm

You can easily get a list of all current sites by using xq from https://github.com/kislyuk/yq#xml-support and the xml file https://stackexchange.com/feeds/sites.

This is just a proof of concept. Later on, you would probably want to generate this automatically, for example when starting AnonymousOverflow.

curl https://stackexchange.com/feeds/sites | xq '[.feed.entry[] | {name: .title["#text"], url: .id}]'

This currently gives you the following JSON array:

[
  {
    "name": "Solana Stack Exchange",
    "url": "https://solana.stackexchange.com"
  },
  {
    "name": "Bioacoustics Stack Exchange",
    "url": "https://bioacoustics.stackexchange.com"
  }
... etc, with 180 entries
]

I just checked the URLs and looked at your implementation.
Your implementation would need changes for:

https://es.stackoverflow.com
https://ru.stackoverflow.com
https://ja.stackoverflow.com
https://pt.stackoverflow.com
https://askubuntu.com
https://stackapps.com
https://mathoverflow.net
https://superuser.com
https://serverfault.com

The best approach would be to fetch the URLs on startup or at an interval and let the list act as a whitelist, instead of hardcoding %s.stackexchange.com etc.
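
A minimal Python sketch of the whitelist idea follows. It assumes the feed is Atom XML with one `<entry>` per site whose `<id>` holds the site URL, which is what the `.feed.entry[]` and `.id` fields in the xq query above suggest. The function names and the inline sample are hypothetical; in practice the XML would be downloaded from https://stackexchange.com/feeds/sites on startup or on a timer.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

ATOM = "{http://www.w3.org/2005/Atom}"  # assumed namespace of the feed


def allowed_hosts(feed_xml: str) -> set:
    """Collect the hostname of every site listed in the feed."""
    root = ET.fromstring(feed_xml)
    hosts = set()
    for entry in root.iter(f"{ATOM}entry"):
        site_url = entry.findtext(f"{ATOM}id")
        if site_url:
            host = urlsplit(site_url.strip()).hostname
            if host:
                hosts.add(host.lower())
    return hosts


def is_allowed(url: str, hosts: set) -> bool:
    """Accept a URL only if its host is exactly one of the listed sites."""
    host = urlsplit(url).hostname
    return host is not None and host.lower() in hosts


# Tiny hand-written sample standing in for the downloaded feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title type="text">Solana Stack Exchange</title>
    <id>https://solana.stackexchange.com</id></entry>
  <entry><title type="text">Super User</title>
    <id>https://superuser.com</id></entry>
</feed>"""

HOSTS = allowed_hosts(SAMPLE_FEED)
```

With such a set, a URL on superuser.com passes the check without any special-casing, while an unrelated domain is rejected.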

@GreenLunar

Totally possible since they use the same layouts, but the issue is distinguishing between stack exchange and stack overflow threads.

You might want to use URL parameters, similarly to ?lang= of Wikiless.

https://AnonymousOverflow/question/some-title?brand=stackexchange&subdomain=phosh&lang=it
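
One way to read the URL-parameter suggestion is sketched below in Python. The parameter names `brand` and `subdomain` come from the example URL; the `BRAND_DOMAINS` table, the function name `upstream_host`, and the default of Stack Overflow when no parameter is given are assumptions made for illustration. The `lang` parameter is left uninterpreted here.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical mapping from the "brand" parameter to a base domain.
BRAND_DOMAINS = {
    "stackoverflow": "stackoverflow.com",
    "stackexchange": "stackexchange.com",
}


def upstream_host(instance_url: str) -> str:
    """Pick the upstream host from ?brand=...&subdomain=... parameters."""
    query = parse_qs(urlsplit(instance_url).query)
    # Assumed default: no parameters means plain Stack Overflow,
    # so existing links keep working.
    brand = query.get("brand", ["stackoverflow"])[0]
    subdomain = query.get("subdomain", [""])[0].lower()

    if brand not in BRAND_DOMAINS:
        raise ValueError(f"unknown brand: {brand!r}")
    # Only allow simple labels so the parameter cannot smuggle in a host.
    if subdomain and not (subdomain.isascii() and subdomain.replace("-", "").isalnum()):
        raise ValueError(f"invalid subdomain: {subdomain!r}")

    domain = BRAND_DOMAINS[brand]
    return f"{subdomain}.{domain}" if subdomain else domain
```

Because the path is untouched, existing redirects that only rewrite the host would keep resolving to Stack Overflow under this reading.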

@httpjamesm
Owner

This issue has already been resolved as of release 1.8. Forgot to close the issue.

@jonas-w

jonas-w commented Mar 19, 2023

@httpjamesm as mentioned in my previous comment, some Stack Exchange URLs (those that are not *.stackexchange.com URLs) will still get flagged as invalid.

@httpjamesm httpjamesm reopened this Mar 19, 2023
@GreenLunar

This issue has already been resolved as of release 1.8.

Are there example URLs of several domains?

@p0da

p0da commented Jul 2, 2023

Any update on this? The aforementioned superuser url: https://superuser.com/questions/769452/what-is-a-openpgp-gnupg-key-id is still being flagged as invalid.

@httpjamesm
Owner

If you'd like to pick this up, you can create a PR. I may add this soon, but no guarantees

@GreenLunar

GreenLunar commented Jul 3, 2023 via email

@httpjamesm
Owner

I actually hadn't thought of that. Using a query parameter could be a more viable solution than restructuring the path parameters as mentioned above. I'll probably end up choosing that.

@GreenLunar

GreenLunar commented Jul 3, 2023 via email

@banaanihillo

Is there a consensus for sites like ask ubuntu and superuser yet?
Can they be implemented like the existing /exchange/${SITE} or is there a blocker I am unaware of?
Are the query parameter options mentioned a couple comments earlier proposing the same feature or are they a separate issue altogether?

@httpjamesm
Owner

I think the query parameter system can make special exceptions for cases like askubuntu. Otherwise, it can default to .stackexchange.com

@Sketch6307

https://serverfault.com/
https://superuser.com/
https://mathoverflow.net/
https://stackapps.com/
https://askubuntu.com/
+ stackoverflow.com + stackexchange.com

There's your list...

Why not just allow the entire URL to be placed after the AO root: https://anonymous.overflow/https://serverfault.com/questions/...

@httpjamesm
Owner

This would break implementations like Libredirect. The URL would also have to be verified regardless of the schema.

@mattfbacon
Contributor

Any update on this? I agree with the idea of e.g. https://domain/superuser.com/questions/..., can you clarify why it wouldn't work with libredirect?

Very annoying to have this artificial limitation on accessing sites like superuser when the actual scraping code here is obviously applicable. I want to help if I can.

@httpjamesm
Owner

Completely changing the way stack overflow links are converted to anonymous overflow links would break every current implementation.

I suggested the query parameter idea a while ago because it builds on the current implementation without major changes.

If you'd like to contribute, feel free to open a PR to kick-start this.

@mattfbacon
Contributor

I've just put together #99 which implements the required functionality in a backwards-compatible way.

@KaKi87

KaKi87 commented Apr 4, 2024

Hi,
How was this implemented?
It would be nice to document this usage in the README as well.
Thanks

@mattfbacon
Contributor

Good point. I changed the code so that if the exchange parameter has a dot, it is treated as a full domain. So e.g. to access superuser.com you would do /exchange/superuser.com/q/....
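
The rule described above can be sketched in a few lines of Python. This is an illustration of the described behaviour, not the code from #99: the function names are hypothetical, and the fallback to `{exchange}.stackexchange.com` for values without a dot is assumed from the existing `/exchange/:sub` scheme.

```python
def exchange_host(exchange: str) -> str:
    """Resolve the /exchange/:exchange path parameter to an upstream host."""
    exchange = exchange.lower()
    if "." in exchange:
        # A dot means the parameter is already a full domain.
        return exchange
    # Assumed fallback: a bare name is a *.stackexchange.com sub-domain.
    return f"{exchange}.stackexchange.com"


def upstream_url(path: str) -> str:
    """Map an instance path such as /exchange/superuser.com/... upstream."""
    prefix = "/exchange/"
    if not path.startswith(prefix):
        raise ValueError(f"not an exchange path: {path!r}")
    exchange, _, rest = path[len(prefix):].partition("/")
    if not exchange:
        raise ValueError("missing exchange parameter")
    return f"https://{exchange_host(exchange)}/{rest}"
```

Older links of the form `/exchange/apple/...` resolve exactly as before under this rule, which is what makes the change backwards compatible. A real deployment would presumably also check the resulting domain against a list of known Stack Exchange sites.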
