✨ feat: batch mode: support filtering by video publish time #146

Merged: 6 commits, Jun 10, 2023

Contributor

@lc4t lc4t commented Jun 8, 2023

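Motivation

related #133

Solution

Adds --batch-filter-start-time and --batch-filter-end-time, which together define a closed time interval for filtering. A filter is defined in utils as a global static variable, and UGC video info fetched in batch mode is filtered against that interval.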
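
A minimal sketch of the approach described above, not the code from this PR: the `Filter` class below, its `configure`/`accepts` methods, and the `_parse_day` helper are hypothetical names used only for illustration, and only the `%Y-%m-%d` format is handled.

```python
import argparse
import datetime
from typing import Optional


def _parse_day(text: str) -> datetime.datetime:
    # Hypothetical helper: only the %Y-%m-%d format is handled in this sketch.
    return datetime.datetime.strptime(text, "%Y-%m-%d")


class Filter:
    """Hypothetical holder for process-wide filter state (class attributes)."""

    batch_filter_start_time: Optional[datetime.datetime] = None  # None = no lower bound
    batch_filter_end_time: Optional[datetime.datetime] = None  # None = no upper bound

    @classmethod
    def configure(cls, start, end) -> None:
        # Set once after argument parsing; read later wherever uploads are listed.
        cls.batch_filter_start_time = start
        cls.batch_filter_end_time = end

    @classmethod
    def accepts(cls, pubdate: datetime.datetime) -> bool:
        # Closed interval: both endpoints are included.
        if cls.batch_filter_start_time is not None and pubdate < cls.batch_filter_start_time:
            return False
        if cls.batch_filter_end_time is not None and pubdate > cls.batch_filter_end_time:
            return False
        return True


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch-filter-start-time", type=_parse_day, default=None)
    parser.add_argument("--batch-filter-end-time", type=_parse_day, default=None)
    return parser


args = build_parser().parse_args(
    ["--batch-filter-start-time", "2023-06-01", "--batch-filter-end-time", "2023-06-10"]
)
Filter.configure(args.batch_filter_start_time, args.batch_filter_end_time)

uploads = [
    datetime.datetime(2023, 5, 31),
    datetime.datetime(2023, 6, 1),
    datetime.datetime(2023, 6, 10),
]
kept = [d for d in uploads if Filter.accepts(d)]
```

Class-level state keeps the call sites short, at the cost of hidden coupling between argument parsing and the extractors.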

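Type

  • ✨ feat: add a new feature
  • 🐛 fix: fix a bug
  • 📝 docs: documentation changes
  • ♻️ refactor: code refactoring (a code change that neither adds a feature nor fixes a bug)
  • ⚡ perf: code changes that improve performance
  • 🧑‍💻 dx: improve the developer experience
  • 🔨 workflow: workflow changes
  • 🏷️ types: type declaration changes
  • 🚧 wip: work in progress
  • ✅ test: add or modify test cases
  • 🔨 build: changes that affect the build system or external dependencies
  • 👷 ci: changes to CI configuration files and scripts
  • ❓ chore: other changes that touch neither source code nor tests
  • ⬆️ deps: dependency changes
  • 🔖 release: release a new version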

@lc4t lc4t marked this pull request as ready for review June 9, 2023 11:33
@lc4t lc4t requested a review from SigureMo June 9, 2023 11:33
@SigureMo SigureMo changed the title from "feat: batch mode: support filtering by video publish time" to "✨ feat: batch mode: support filtering by video publish time" Jun 10, 2023
@@ -123,15 +123,15 @@ async def get_ugc_video_info(session: ClientSession, avid: AvId) -> _UgcVideoInf
}


-async def get_ugc_video_list(session: ClientSession, avid: AvId) -> UgcVideoList:
+async def get_ugc_video_list(session: ClientSession, avid: AvId, pubdata_fmt: str = "%Y-%m-%d") -> UgcVideoList:
Member

Suggested change
-async def get_ugc_video_list(session: ClientSession, avid: AvId, pubdata_fmt: str = "%Y-%m-%d") -> UgcVideoList:
+async def get_ugc_video_list(session: ClientSession, avid: AvId, pubdate_fmt: str = "%Y-%m-%d") -> UgcVideoList:

- The `--batch-filter-start-time` and `--batch-filter-end-time` parameters specify the `start` and `end` times; the interval is closed on both ends
- Default: `no limit`
- Format: `%Y-%m-%d` or `%Y-%m-%d %H:%M:%S`
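
As an illustration of the two accepted formats, here is a minimal sketch of how such a value could be parsed. The function name `parse_filter_time` is hypothetical and is not taken from the PR.

```python
import datetime

# The two formats listed above, tried in order (most specific first).
_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def parse_filter_time(text: str) -> datetime.datetime:
    """Parse a filter bound given in either documented format."""
    for fmt in _FORMATS:
        try:
            return datetime.datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next format
    raise ValueError(f"unsupported time format: {text!r}")


start = parse_filter_time("2023-06-01")
end = parse_filter_time("2023-06-10 23:59:59")
```

Note that with this kind of parsing a date-only value resolves to midnight, so an end bound of `2023-06-10` would exclude videos published later that day unless the time is spelled out.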

Member

Let's give an example here~

Contributor Author

OK, I'll take care of it.

@@ -110,6 +110,9 @@ def cli() -> argparse.ArgumentParser:
"-af", "--alias-file", type=argparse.FileType("r", encoding="utf-8"), help="设置 url 别名文件路径"
)

group_common.add_argument("--batch-filter-start-time", help="批量下载时,只下载该时间之后发布的稿件")
Member

If this is only used for batch downloads, shouldn't it go in group_batch?

Contributor Author

Oh, let me take a look. I hadn't paid much attention to the argument groups.

Filter.batch_filter_start_time
<= datetime.datetime.strptime(datestr, "%Y-%m-%d %H:%M:%S")
<= Filter.batch_filter_end_time
)
Member

if start is None:
    start = datetime.datetime.min
if end is None:
    end = datetime.datetime.max
return start <= datetime.datetime.strptime(datestr, "%Y-%m-%d %H:%M:%S") <= end

Wouldn't this be a bit simpler?
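
For reference, the suggestion above can be wrapped into a self-contained function as follows. This is a sketch only: the function name `in_closed_range` and its parameter names are assumptions, not the names used in the PR.

```python
import datetime
from typing import Optional


def in_closed_range(
    datestr: str,
    start: Optional[datetime.datetime] = None,
    end: Optional[datetime.datetime] = None,
) -> bool:
    """Return True if datestr falls inside [start, end]; None means unbounded."""
    if start is None:
        start = datetime.datetime.min
    if end is None:
        end = datetime.datetime.max
    # Comparing datetime objects directly never calls .timestamp() on min/max.
    return start <= datetime.datetime.strptime(datestr, "%Y-%m-%d %H:%M:%S") <= end


no_limit = in_closed_range("2023-06-05 12:00:00")
inside = in_closed_range(
    "2023-06-05 12:00:00",
    start=datetime.datetime(2023, 6, 1),
    end=datetime.datetime(2023, 6, 10),
)
```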

     video_info = await get_ugc_video_info(session, avid)
     if avid not in [video_info["aid"], video_info["bvid"]]:
         avid = video_info["avid"]
     video_title = video_info["title"]
     result: UgcVideoList = {
         "title": video_title,
         "avid": avid,
-        "pubdate": get_time_str_by_stamp(video_info["pubdate"], "%Y-%m-%d"), # TODO: 可自由定制
+        "pubdate": get_time_str_by_stamp(video_info["pubdate"], pubdata_fmt),
Member

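Hmm, I don't think this is the right place to do the formatting. Let's return the timestamp here and format it in resolve_path_template.

The formatter would then support syntax like "{pubdate@%Y-%m-%d}", so users can customize it freely.

I'll open a PR to implement this a bit later.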
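
One way the proposed "{name@format}" syntax could work is sketched below. This is purely illustrative and is not the implementation of resolve_path_template: the function `render_template` and the regular expression are assumptions, and UTC is used only to keep the output deterministic.

```python
import datetime
import re

# Matches "{name}" or "{name@strftime-format}".
_FIELD = re.compile(r"\{(\w+)(?:@([^{}]*))?\}")


def render_template(template: str, values: dict) -> str:
    """Fill a path template, formatting timestamp fields that carry an @format."""

    def replace(match: "re.Match[str]") -> str:
        name, fmt = match.group(1), match.group(2)
        value = values[name]
        if fmt is not None and isinstance(value, (int, float)):
            # Assumption: a numeric value with a format is a Unix timestamp.
            moment = datetime.datetime.fromtimestamp(value, tz=datetime.timezone.utc)
            return moment.strftime(fmt)
        return str(value)

    return _FIELD.sub(replace, template)


# Build the timestamp from a known UTC moment rather than hard-coding it.
pubdate = int(datetime.datetime(2023, 6, 10, 12, tzinfo=datetime.timezone.utc).timestamp())
path = render_template("{title}-{pubdate@%Y-%m-%d}", {"title": "demo", "pubdate": pubdate})
```

Keeping the raw timestamp in the returned data and formatting only at render time means each template can choose its own date layout.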

Contributor Author

OK, I'll leave that one to you~

Member

@SigureMo SigureMo left a comment

Just a few small formatting suggestions~

lc4t and others added 3 commits June 10, 2023 20:55
Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
@SigureMo SigureMo merged commit bc73ca3 into yutto-dev:main Jun 10, 2023
@lc4t lc4t deleted the batch_set_start branch June 10, 2023 14:54
Contributor

FrankHB commented Jun 17, 2023

The default values appear to be unusable:

 File "/usr/lib/python3.11/site-packages/yutto/utils/filter.py", line 29, in verify_timer
    return Filter.batch_filter_start_time.timestamp() <= timestamp < Filter.batch_filter_end_time.timestamp()

ValueError: year 0 is out of range

https://bugs.python.org/issue31212
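
The traceback suggests the failure comes from calling `.timestamp()` on an extreme default bound: for a naive datetime that conversion goes through local time, which can fall outside the supported year range depending on platform and timezone. One possible workaround, sketched here with a hypothetical function name and `None` as the "no limit" sentinel, is to never convert an unbounded side at all. The half-open comparison mirrors the line shown in the traceback.

```python
import datetime
from typing import Optional


def verify_timestamp(
    timestamp: float,
    start: Optional[datetime.datetime] = None,
    end: Optional[datetime.datetime] = None,
) -> bool:
    """Check start <= timestamp < end, skipping any bound that is None."""
    if start is not None and timestamp < start.timestamp():
        return False
    if end is not None and timestamp >= end.timestamp():
        return False
    return True


start = datetime.datetime(2023, 6, 1)
end = datetime.datetime(2023, 6, 10)
sample = datetime.datetime(2023, 6, 5).timestamp()

unbounded_ok = verify_timestamp(sample)
bounded_ok = verify_timestamp(sample, start, end)
```

Only user-supplied bounds are ever converted, so the out-of-range conversion seen above is never attempted for the defaults.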

@FrankHB FrankHB mentioned this pull request Jun 17, 2023
15 tasks