8 Commits

Author SHA1 Message Date
Atlas Quan
bd80574f76 feat(route): add slowmist news (#10044)
* feat(route): add slowmist news

* feat(utils): remove useless ul for code section

* docs: add slowmist news route docs

* docs(quick-start): new router under /lib/v2

* test(utils): add test fixArticleContent for new wechat-mp code

* feat(router): update radar source of slowmist
2022-07-04 23:44:58 +08:00
MisLink
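df6f4caf8b feat(route): add National School of Development at Peking University (北京大学国家发展研究院) - Viewpoints (观点) (#9804)
* feat(route): add National School of Development at Peking University (北京大学国家发展研究院) - Viewpoints (观点)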

Signed-off-by: MisLink <gjq.uoiai@outlook.com>

* Fix https issue

Signed-off-by: MisLink <gjq.uoiai@outlook.com>

* fetch full text from wechat-mp and pku news

Signed-off-by: MisLink <gjq.uoiai@outlook.com>

* Fix ci

Signed-off-by: MisLink <gjq.uoiai@outlook.com>

* refactor: sort new route
2022-05-27 19:12:04 +08:00
任平生
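9d9926d0bf fix(utils): fix the error raised when fetching deleted WeChat articles (#9589)
* fix(utils): support fetching single-image WeChat MP articles

* fix(utils): support outputting the "Read original" link of reposted WeChat MP articles

* fix(utils): fix the error raised when fetching deleted WeChat articles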

* refactor: migrate to v2

Co-authored-by: blankyu(于海洋) <blankyu@tencent.com>
2022-04-21 21:40:16 +08:00
任平生
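f3e069d399 fix(utils): support fetching single-image WeChat MP articles; add output of the "Read original" link (#9557)
* fix(utils): support fetching single-image WeChat MP articles

* fix(utils): support outputting the "Read original" link of reposted WeChat MP articles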
2022-04-19 01:29:45 +08:00
Levi Zim
958be6266e feat(route): Shandong University (Weihai) News (山东大学(威海)新闻网) (#9537)
* feat(sduwh): add extractors.

* feat(route): add route for Shandong University (Weihai) News (山东大学(威海)新闻网)

* docs: for route sduwh/news

* docs: for route sduwh/news

(cherry picked from commit 831830167a)

* feat(radar): for route Shandong University (Weihai) News (山东大学(威海)新闻网)

* refactor: change `got.get` to `got`.

* refactor: prefer `parseDate()` to `new Date()`

Co-authored-by: Tony <TonyRL@users.noreply.github.com>

* fix: incomplete URL substring sanitization.

Make CodeQL happy.

* fix(radar): fix target field.

* fix: change route /sduwh to /sdu/wh

* fix: remove superfluous slash character in url.

* feat: look for exact date first.

* feat: extract exact date from news extractor.

* feat: extract exact date from view extractor.

* feat: extractor for www.sdrj.sdu.edu.cn

* refactor: semantic separation of sduwh with sdu

* feat(radar): more accurate name

* docs: update documentation

* refactor: migrate to v2

* refactor: fix deprecated url.resolve

* fix: update docs url

Co-authored-by: Tony <TonyRL@users.noreply.github.com>

* fix: sdu not working routes

* fix: accurate `ctx.state.data.url`

Co-authored-by: Tony <TonyRL@users.noreply.github.com>

* fix: better error handling for extractors.

* fix: timezone

Co-authored-by: Tony <TonyRL@users.noreply.github.com>

* fix: better error handling.

Co-authored-by: Tony <TonyRL@users.noreply.github.com>
2022-04-17 00:01:39 +08:00
任平生
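eb467afae1 fix(utils): support fetching the full text of reposted WeChat MP articles (#9534)
* feat: fix the date-time matching rules and remove some unnecessary comment elements

* fix(route)(fortunechina): fix Fortune China (财富中文网): 1. duplicated Chinese content in bilingual articles; 2. remove the large KOL avatar

* fix(route)(wechat): support fetching the full text of reposted WeChat MP articles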
2022-04-15 19:23:09 +08:00
Rongrong
74e1f88a32 feat(core)(utils/wechat-mp): normalize URL (#9497)
Signed-off-by: Rongrong <15956627+Rongronggg9@users.noreply.github.com>
2022-04-08 18:42:52 +08:00
Rongrong
a79cc20ec1 feat(utils): add utils for WeChat MP (#9487)
Motivation:
There are multiple routes that need to fetch articles from WeChat MP.
However, letting them fetch articles by themselves could potentially
lead to cache key collisions. Even if cache key collisions do not occur,
un-normalized URL could potentially lead to duplicated requests.
What's more, articles from WeChat MP have weird formats and need to be
fixed. Creating a universal function to do this work can create some
ease for new route contributors.

Note:
In order to make this PR atomic as much as possible, I did not touch
those broken routes. Once this PR is merged, I will try to fix them.

Signed-off-by: Rongrong <15956627+Rongronggg9@users.noreply.github.com>
2022-04-07 21:46:15 +08:00
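
The motivation given in a79cc20ec1 (shared cache keys, normalized URLs) can be illustrated with a minimal sketch. This is not the code from that commit: the helper names `normalizeUrl` and `cacheKey`, the `wechat-mp:` key prefix, and the choice of `__biz`, `mid`, `idx`, and `sn` as the identifying query parameters are all assumptions made for illustration.

```javascript
// Hypothetical sketch, NOT the implementation added in a79cc20ec1.
// Assumption: these query parameters are enough to identify one article.
const IDENTIFYING_PARAMS = ['__biz', 'mid', 'idx', 'sn'];

// Reduce different spellings of the same article URL to one canonical form.
function normalizeUrl(rawUrl) {
    const url = new URL(rawUrl);
    url.protocol = 'https:'; // treat http and https links as the same article
    url.hash = ''; // drop fragments such as #rd or #wechat_redirect
    if (url.pathname === '/s' && url.search) {
        // Long-form link: keep only the identifying parameters, in a fixed order.
        const kept = new URLSearchParams();
        for (const name of IDENTIFYING_PARAMS) {
            const value = url.searchParams.get(name);
            if (value !== null) {
                kept.set(name, value);
            }
        }
        url.search = kept.toString();
    }
    return url.toString();
}

// One prefixed key derived from the normalized URL, so that every route
// looking up the same article reads and writes the same cache entry.
function cacheKey(rawUrl) {
    return `wechat-mp:${normalizeUrl(rawUrl)}`;
}
```

With a helper like this, two routes that receive the same article under differently decorated links (extra tracking parameters, a fragment, `http` instead of `https`) would compute the same key and so issue a single request rather than two.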