Tony
6251a2dc02
fix(core): parse date utils ( #10789 )
...
* fix(core): parse date utils
* fix: regex
2022-09-14 21:59:32 +08:00
DIYgod
9329a44c80
feat(core): support proxy config for puppeteer ( #10714 )
...
* feat: support proxy config for puppeteer
* test: add puppeteer proxy detection
* fix: package.json
* test: fix regex
2022-09-07 00:07:04 +08:00
Tony
9fa248c324
fix(route): nmpa feed items on 3rd party site ( #10146 )
...
* fix(route): nmpa feed items on 3rd party site
* fix: regex unescaped .
2022-07-06 00:46:05 +08:00
Atlas Quan
bd80574f76
feat(route): add slowmist news ( #10044 )
...
* feat(route): add slowmist news
* feat(utils): remove useless ul for code section
* docs: add slowmist news route docs
* docs(quick-start): new router under /lib/v2
* test(utils): add test fixArticleContent for new wechat-mp code
* feat(router): update radar source of slowmist
2022-07-04 23:44:58 +08:00
Tony
2f26479fc9
feat(route): sspai series update ( #9901 )
2022-06-05 23:59:10 +08:00
github-actions[bot]
5bc8be9ac5
style: auto format
2022-06-05 14:42:33 +00:00
Tony
450e522167
feat(utils): decode cf protected email string ( #9900 )
...
* feat(utils): decode cf protected email string
* Update cf-email.js
2022-06-05 22:40:57 +08:00
Tony
a5fd8de30e
fix(docker): puppeteer stealth not working in docker ( #9896 )
2022-06-05 17:09:34 +08:00
MisLink
df6f4caf8b
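feat(route): add National School of Development, Peking University (北京大学国家发展研究院) - Opinions ( #9804 )
...
* feat(route): add National School of Development, Peking University (北京大学国家发展研究院) - Opinions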
Signed-off-by: MisLink <gjq.uoiai@outlook.com >
* Fix https issue
Signed-off-by: MisLink <gjq.uoiai@outlook.com >
* fetch full text from wechat-mp and pku news
Signed-off-by: MisLink <gjq.uoiai@outlook.com >
* Fix ci
Signed-off-by: MisLink <gjq.uoiai@outlook.com >
* refactor: sort new route
2022-05-27 19:12:04 +08:00
Tony
0f31bfa8b9
fix(utils): parse relative date with meridiem ( #9775 )
...
* fix(utils): parse relative date with meridiem
* fix: use regex
2022-05-17 20:21:47 +08:00
Rongrong
23fcb6bc5a
feat(core/utils/request-wrapper): request logging ( #9691 )
...
Signed-off-by: Rongrong <i@rong.moe >
2022-05-04 12:36:48 +10:00
Tony
0b544e1395
feat(utils): puppeteer-extra-plugin-stealth ( #9676 )
2022-05-03 01:17:17 +08:00
Rongrong
7a6be9a229
feat(core): customizable Chromium executable path ( #9670 )
...
* feat(core): customizable Chromium executable path
also build Chromium-bundled Docker image for arm/arm64
Signed-off-by: Rongrong <i@rong.moe >
* chore: fix typo
Signed-off-by: Rongrong <i@rong.moe >
* chore(CI/test): using build matrix
Signed-off-by: Rongrong <i@rong.moe >
* docs(install): fix punctuation
Signed-off-by: Rongrong <i@rong.moe >
2022-05-01 21:00:29 +08:00
Tony
34b58ebc64
fix(utils): request without hostname ( #9649 )
2022-04-28 19:35:31 +08:00
任平生
9d9926d0bf
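fix(utils): fix error thrown when fetching deleted WeChat articles ( #9589 )
...
* fix(utils): support fetching single-image WeChat Official Account articles
* fix(utils): support outputting the "Read the original" link for reposted WeChat Official Account articles
* fix(utils): fix error thrown when fetching deleted WeChat articles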
* refactor: migrate to v2
Co-authored-by: blankyu(于海洋) <blankyu@tencent.com >
2022-04-21 21:40:16 +08:00
dependabot[bot]
dd4a216648
chore(deps): bump socks-proxy-agent from 6.1.1 to 6.2.0 ( #9572 )
...
* chore(deps): bump socks-proxy-agent from 6.1.1 to 6.2.0
Bumps [socks-proxy-agent](https://github.com/TooTallNate/node-socks-proxy-agent ) from 6.1.1 to 6.2.0.
- [Release notes](https://github.com/TooTallNate/node-socks-proxy-agent/releases )
- [Changelog](https://github.com/TooTallNate/node-socks-proxy-agent/blob/master/CHANGELOG.md )
- [Commits](https://github.com/TooTallNate/node-socks-proxy-agent/compare/v6.1.1...v6.2.0 )
---
updated-dependencies:
- dependency-name: socks-proxy-agent
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
* fix: use dot notation
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: TonyRL <TonyRL@users.noreply.github.com >
2022-04-21 03:54:53 +08:00
Rongrong
0522c63d5f
fix(core/utils): invalid request header Server ( #9582 )
...
Signed-off-by: Rongrong <15956627+Rongronggg9@users.noreply.github.com >
2022-04-20 22:32:46 +08:00
任平生
f3e069d399
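fix(utils): support fetching single-image WeChat Official Account articles; output the "Read the original" link ( #9557 )
...
* fix(utils): support fetching single-image WeChat Official Account articles
* fix(utils): support outputting the "Read the original" link for reposted WeChat Official Account articles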
2022-04-19 01:29:45 +08:00
Levi Zim
958be6266e
feat(route): Shandong University (Weihai) News Network (山东大学(威海)新闻网) ( #9537 )
...
* feat(sduwh): add extractors.
* feat(route): add route for Shandong University (Weihai) News Network (山东大学(威海)新闻网)
* docs: for route sduwh/news
* docs: for route sduwh/news
(cherry picked from commit 831830167a )
* feat(radar): for route Shandong University (Weihai) News Network (山东大学(威海)新闻网)
* refactor: change `got.get` to `got`.
* refactor: prefer `parseDate()` to `new Date()`
Co-authored-by: Tony <TonyRL@users.noreply.github.com >
* fix: incomplete URL substring sanitization.
Make CodeQL happy.
* fix(radar): fix target field.
* fix: change route /sduwh to /sdu/wh
* fix: remove superfluous slash character in url.
* feat: look for exact date first.
* feat: extract exact date from news extractor.
* feat: extract exact date from view extractor.
* feat: extractor for www.sdrj.sdu.edu.cn
* refactor: semantic separation of sduwh with sdu
* feat(radar): more accurate name
* docs: update documentation
* refactor: migrate to v2
* refactor: fix deprecated url.resolve
* fix: update docs url
Co-authored-by: Tony <TonyRL@users.noreply.github.com >
* fix: sdu not working routes
* fix: accurate `ctx.state.data.url`
Co-authored-by: Tony <TonyRL@users.noreply.github.com >
* fix: better error handling for extractors.
* fix: timezone
Co-authored-by: Tony <TonyRL@users.noreply.github.com >
* fix: better error handling.
Co-authored-by: Tony <TonyRL@users.noreply.github.com >
2022-04-17 00:01:39 +08:00
任平生
eb467afae1
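fix(utils): support fetching the full text of reposted WeChat Official Account articles ( #9534 )
...
* feat: fix date-time matching rules and remove some unnecessary comment elements
* fix(route)(fortunechina): fix Fortune China (财富中文网): 1. duplicated Chinese content in bilingual articles; 2. remove large KOL avatars
* fix(route)(wechat): support fetching the full text of reposted WeChat Official Account articles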
2022-04-15 19:23:09 +08:00
Tony
bafb3534e1
feat(utils): random user agent ( #9449 )
...
* feat(utils): random ua
* chore: bump rand-user-agent to 1.0.58(no more deps)
2022-04-12 17:51:07 +08:00
Ethan Shen
a25fd4b67f
fix(utils): wrong date with same weekday in parse-date ( #9506 )
2022-04-10 23:40:35 +08:00
Chenxing Luo
ef94bcde8e
feat(route): add two blogs: Stratechery & Miris Whispers ( #9496 )
...
* Add two blogs
* fix(utils): add parseDate timezone common-config
* Update lib/v2/stratechery/index.js
Co-authored-by: Tony <TonyRL@users.noreply.github.com >
* Update docs/blog.md
Co-authored-by: Tony <TonyRL@users.noreply.github.com >
* Update docs/en/blog.md
Co-authored-by: Tony <TonyRL@users.noreply.github.com >
* Update lib/v2/miris/blog.js
2022-04-08 23:48:28 +08:00
Rongrong
74e1f88a32
feat(core)(utils/wechat-mp): normalize URL ( #9497 )
...
Signed-off-by: Rongrong <15956627+Rongronggg9@users.noreply.github.com >
2022-04-08 18:42:52 +08:00
Rongrong
a79cc20ec1
feat(utils): add utils for WeChat MP ( #9487 )
...
Motivation:
There are multiple routes that need to fetch articles from WeChat MP.
However, letting them fetch articles by themselves could potentially
lead to cache key collisions. Even if cache key collisions do not occur,
un-normalized URL could potentially lead to duplicated requests.
What's more, articles from WeChat MP have weird formats and need to be
fixed. Creating a universal function to do this work makes things
easier for new route contributors.
Note:
To keep this PR as atomic as possible, I did not touch
those broken routes. Once this PR is merged, I will try to fix them.
Signed-off-by: Rongrong <15956627+Rongronggg9@users.noreply.github.com >
2022-04-07 21:46:15 +08:00
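The motivation in the commit above (un-normalized URLs causing duplicated requests and inconsistent cache keys) can be illustrated with a minimal sketch. This is not the code added in that commit: the function name `normalizeWechatUrl` and the set of query parameters kept as identifying are assumptions made for illustration only.

```javascript
// Hypothetical sketch, not the helper from the commit.
// Assumption: these query parameters are enough to identify an article.
const KEPT_PARAMS = ['__biz', 'mid', 'idx', 'sn'];

function normalizeWechatUrl(rawUrl) {
    const url = new URL(rawUrl);
    url.protocol = 'https:'; // treat http and https links as the same article
    url.hash = ''; // drop fragments such as #wechat_redirect

    // Rebuild the query string in a fixed order, keeping only the assumed
    // identifying parameters, so equivalent links produce one cache key.
    const kept = new URLSearchParams();
    for (const name of KEPT_PARAMS) {
        const value = url.searchParams.get(name);
        if (value !== null) {
            kept.set(name, value);
        }
    }
    url.search = kept.toString();
    return url.toString();
}
```

With a helper of this shape, two links that differ only in scheme, parameter order, tracking parameters, or fragment map to a single string, which can then serve as the cache key and the request URL.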
Rongrong
5b35471e39
fix(core): offending RFC4287 ( #9441 )
...
* fix(core): offending RFC4287
should not leave `<updated>` blank when `<published>` is not blank
these two fields MUST conform to the "date-time" production in RFC3339
Signed-off-by: Rongrong <15956627+Rongronggg9@users.noreply.github.com >
* test(common-utils): complete tests
Signed-off-by: Rongrong <15956627+Rongronggg9@users.noreply.github.com >
* test(template): restrict expected value of `pubDate`
Signed-off-by: Rongrong <15956627+Rongronggg9@users.noreply.github.com >
2022-04-02 17:44:45 +08:00
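The rule stated in the commit above (do not leave `<updated>` blank when `<published>` is set, and emit both as RFC3339 "date-time" values) could look roughly like the sketch below. The function name `fillAtomDates` and the item fields `pubDate` and `updated` are assumptions for illustration, not the project's actual implementation.

```javascript
// Hypothetical sketch, not the code from the commit.
// Assumption: a feed item carries optional `pubDate` and `updated` fields.
function toRfc3339(value) {
    if (value === undefined || value === null || value === '') {
        return undefined;
    }
    const date = new Date(value);
    // Date#toISOString yields e.g. 2022-04-02T09:44:45.000Z, which matches
    // the RFC3339 "date-time" production; unparseable input stays undefined.
    return Number.isNaN(date.getTime()) ? undefined : date.toISOString();
}

function fillAtomDates(item) {
    const published = toRfc3339(item.pubDate);
    // Fall back to the publication date so <updated> is never blank
    // while <published> has a value.
    const updated = toRfc3339(item.updated) ?? published;
    return { ...item, pubDate: published, updated };
}
```

An item that only has a publication date then renders with identical `<published>` and `<updated>` values, and an item with neither date leaves both fields undefined.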
Rongrong
95db3b4e99
fix(core): torrent searching error ( #9407 )
...
Signed-off-by: Rongrong <15956627+Rongronggg9@users.noreply.github.com >
2022-03-29 21:12:16 +08:00
Tony
cd151f45ab
fix(utils): parseDate custom format not working
2022-03-28 22:25:40 +08:00
Ethan Shen
a3396029ec
fix(utils): wrong last weekday for relative date ( #9397 )
2022-03-27 23:39:44 +08:00
Ethan Shen
e42bdf3d97
fix(utils): typo in parse-date ( #9391 )
...
* fix(utils): typo in parse-date
* fix: add `后日`
2022-03-26 19:44:27 +08:00
Ethan Shen
d5648b37ee
fix(utils): parse relative dates with multiple time units ( #9365 )
...
* fix(utils): parse relative dates with multiple time units
* docs: remove warning
* fix: add more characters to match
* fix: rename to parse-date
2022-03-26 16:45:42 +08:00
Rongrong
c48ca6bd5b
fix(core): invalid feed fields ( #9286 )
...
Signed-off-by: Rongrong <15956627+Rongronggg9@users.noreply.github.com >
2022-03-22 02:13:15 +08:00
Tony
2c2bf4ed09
fix: typo in puppeteer options
2022-03-06 23:42:32 +08:00
DIYgod
35c9834049
fix: puppeteer userDataDir
2021-11-28 01:21:42 +00:00
NeverBehave
0792f7ba25
feat(core): first attempt to init script standard ( #8224 )
...
- lazy load
- rate limit per path
- init .debug.json support
- docs
- maintainer
- radar
2021-09-22 05:41:00 -07:00
Daniel Li (李丹阳)
d77a039f05
style(eslint): add no-implicit-coercion rule ( #8175 )
...
* refactor: add no-implicit-coercion rule for ESLint
* fix: errors from deepscan
* fix: errors from deepscan
* fix: errors from deepscan
* fix: errors from deepscan
* fix: errors from deepscan
* Update docs/en/joinus/quick-start.md
Co-authored-by: Sukka <isukkaw@gmail.com >
* Update docs/joinus/quick-start.md
Co-authored-by: Sukka <isukkaw@gmail.com >
* Update lib/routes/av01/tag.js
Co-authored-by: Sukka <isukkaw@gmail.com >
* Update lib/routes/gov/taiwan/mnd.js
Co-authored-by: Sukka <isukkaw@gmail.com >
* Update lib/routes/ps/product.js
Co-authored-by: Sukka <isukkaw@gmail.com >
* refactor: minify html string
Co-authored-by: Sukka <isukkaw@gmail.com >
2021-09-15 21:22:11 +08:00
Sukka
31720bbb1b
perf: lazy require dependencies ( #8025 )
2021-08-20 14:05:57 +08:00
Sukka
d82847f541
style/chore(eslint): enforce new rules ( #8040 )
...
* style: prefer object shorthand syntax
* refactor: prefer Array#map over Array#forEach
* style: prefer arrow callback
* chore(eslint): update rules
* style: auto fix by eslint
2021-08-17 22:23:23 +08:00
Sukka
6e3b58ed1d
refactor: avoid promise overhead ( #8028 )
2021-08-16 11:45:53 -07:00
Chih-Hsuan Yen
1c9c4ccfc8
fix(core): make sure timeout error messages include URLs ( #7981 )
...
Before this fix, timeout messages were not very useful:
> error: Request undefined fail, retry attempt #1 : TimeoutError: Timeout awaiting 'request' for 5000ms
2021-08-12 01:27:07 -07:00
Queensferry
65e74a1c5e
chore(utils): parse-date supports relative time & fix routes ( #7530 )
2021-05-14 23:08:33 -04:00
GitHub Action
e1b3b5d877
style: auto format
2021-05-08 21:49:05 +00:00
Queensferry
10f5bb7bce
refactor: timezone conversion in lib/utils/date.js ( #7438 )
2021-05-08 17:45:37 -04:00
DIYgod
89e82d88fa
feat: got request timeout
2021-02-01 20:06:49 +08:00
DIYgod
c5e3a27f44
feat: auto add headers.host
2020-12-17 18:54:29 +08:00
Herb Brewer
d4bdf8c7e8
feat: add Douban user "want to watch" list (豆瓣用户想看) ( #6285 )
2020-12-04 15:44:33 +00:00
Shun Zi
3442ca9196
feat: more readable twitter tweet ( #6051 )
2020-10-30 09:09:10 +00:00
Henry
ec49562269
Revert #5271
...
Incomplete PR. #5261
2020-07-29 17:25:41 +01:00
sabuaka18
eddba23099
feat: javbus routes: director label studio ( #5271 )
...
Co-authored-by: zrenca <42361841+zrenca@users.noreply.github.com >
2020-07-28 15:59:30 +01:00
hoilc
90030ec017
fix anitama date ( #5255 )
2020-07-28 05:54:43 +01:00