There are also some links that cannot be obtained, and the problem of abnormal acquisition #3

wongchenv · 2021-04-21T07:06:48Z

For some links the title still cannot be fetched, and for others it is fetched incorrectly:

zolrath · 2021-04-22T22:18:26Z

The mp.weixin.qq.com URLs load the title onto the page using JavaScript.
If we used a headless browser or something of that nature, I could let the JavaScript execute and then grab the title after the page has loaded, but that wouldn't work on mobile.

Hypothetically, if we had some kind of funding for this project, I could build a new CORS proxy around a headless browser and use that for URL fetching. That would fix both the JavaScript-loaded pages and the encoding issues, but I'd need to pay to host it publicly, so it's not something I'm considering at the moment.

Alternatively that browser/API could be run locally by a user and turned on in settings.
I'll think on this as well!
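To illustrate why a plain fetch comes back without a title on pages like these, here is a minimal Python sketch. It is not the plugin's code: `static_title` and `needs_rendering` are hypothetical names, and the sample HTML strings are made up. It reads the `<title>` from static HTML only, so a page whose title is filled in later by a script yields nothing, which is the signal that a rendering step (headless browser or similar) would be needed.

```python
from html.parser import HTMLParser
from typing import Optional


class _TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of static HTML."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <title> counts; later ones are ignored.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self) -> str:
        # Collapse runs of whitespace and trim the ends.
        return " ".join("".join(self._parts).split())


def static_title(html: str) -> Optional[str]:
    """Return the <title> text found in the raw HTML, or None if it is empty or missing."""
    parser = _TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title or None


def needs_rendering(html: str) -> bool:
    """Hypothetical check: no static title means scripts would have to run first."""
    return static_title(html) is None


# Made-up sample pages: one with a static title, one whose title is set by a script.
STATIC_PAGE = "<html><head><title>弱点 (豆瓣)</title></head><body></body></html>"
SCRIPTED_PAGE = (
    "<html><head><title></title>"
    "<script>document.title = 'set later by script';</script>"
    "</head><body></body></html>"
)
```

With these samples, `static_title(STATIC_PAGE)` finds the title directly, while `needs_rendering(SCRIPTED_PAGE)` is true because the script never runs in a plain fetch.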

zolrath · 2021-04-23T06:55:50Z

I've got a local scraping solution working on desktop. It still doesn't get a title out of weixin.qq.com, but it succeeds at the other two. I still need to run mobile tests.

[Title Unknown](https://mp.weixin.qq.com/mp/appmsgalbum?__biz=Mzg4MjAwNTUwNw==&action=getalbum&album_id=1448541657456295937&scene=173&from_msgid=2247484083&from_itemidx=1&count=10#wechat_redirect&scene=0&subscene=90&sessionid=1606652573&enterid=1606653138)

[弱点 (豆瓣)](https://movie.douban.com/subject/3552028/)

[手绘100张，耗时1个月，我终于破解了【达芬奇密码书】的全部秘密！_哔哩哔哩 (゜-゜)つロ 干杯~-bilibili](https://www.bilibili.com/video/BV1qy4y1t7fn?spm_id_from=333.851.b_7265636f6d6d656e64.3)

zolrath · 2021-04-23T18:06:23Z

Fixed for desktop in 1.2.0.
Mobile still relies on the CORS proxy, which doesn't support these characters.
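The thread does not say exactly how the proxy mishandles these characters. One common cause of this kind of problem, offered here only as an assumption, is decoding the response body as UTF-8 when the page declares another charset. The Python sketch below shows the idea; `charset_from_content_type` and `decode_body` are hypothetical helpers, not part of the plugin or the proxy.

```python
from email.message import Message


def charset_from_content_type(content_type: str, default: str = "utf-8") -> str:
    """Read the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset, or None if absent.
    return msg.get_content_charset() or default


def decode_body(body: bytes, content_type: str) -> str:
    """Decode a response body using the declared charset, falling back to UTF-8."""
    charset = charset_from_content_type(content_type)
    try:
        return body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset: keep going, replacing undecodable bytes.
        return body.decode("utf-8", errors="replace")


# Assumed example: a title served as GBK bytes with a matching header.
gbk_bytes = "弱点".encode("gbk")
decoded = decode_body(gbk_bytes, "text/html; charset=GBK")
```

If the same bytes are decoded without the header's charset, the fallback path produces replacement characters instead of the original title.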

DDDOH · 2023-12-05T05:01:26Z

WeChat links are failing again. Here is an example: https://mp.weixin.qq.com/s/nVilywouNxnZlb-l3Buj3w

Would it be possible to define a rule so that, for websites matching it, the whole page is fetched and the title is taken from that?
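A rule like the one suggested above could be as simple as a list of hostname patterns. The Python sketch below is only an illustration of that idea, not an existing setting in the plugin: `RENDER_RULES` and `should_render` are hypothetical names, and the `*.example.org` pattern is a placeholder.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical user-configured rules: hostnames (shell-style wildcards allowed)
# whose pages should be fully loaded before the title is read.
RENDER_RULES = ["mp.weixin.qq.com", "*.example.org"]


def should_render(url: str, rules=RENDER_RULES) -> bool:
    """Return True if the URL's hostname matches any full-page-fetch rule."""
    host = (urlsplit(url).hostname or "").lower()
    return any(fnmatchcase(host, rule.lower()) for rule in rules)
```

With these rules, `should_render("https://mp.weixin.qq.com/s/nVilywouNxnZlb-l3Buj3w")` is true, while the douban.com and bilibili.com links from earlier in the thread would keep using the normal fetch path.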

zolrath closed this as completed Apr 23, 2021
