Budzich Maxim #30

Open
wants to merge 21 commits into
base: master
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -23,6 +23,7 @@ wheels/
*.egg-info/
.installed.cfg
*.egg
.idea
MANIFEST

# PyInstaller
2 changes: 2 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,2 @@
include app/support_files/templates/fb2/*
include app/support_files/templates/html/*
114 changes: 114 additions & 0 deletions README.md
@@ -0,0 +1,114 @@
## It is a one-shot command-line RSS reader by Zviger.
### Installation
Install [Python3.8](https://www.python.org/downloads/)

Install [pip](https://pip.pypa.io/en/stable/installing/)

Install Git.
This [link](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
may be useful.
Clone this repository into the folder of your choice using this command
```text
git clone https://github.com/Zviger/PythonHomework
```
Then switch to the project branch by running
```text
git checkout final_proj
```
Now you can install the application itself using this command from the application folder
```text
pip install . --user
```
You will also need MongoDB. You can install and run
MongoDB on your system by following this [link](https://docs.mongodb.com/manual/installation/).
Alternatively, you can install [Docker](https://docs.docker.com/install/) and
[Docker-Compose](https://docs.docker.com/compose/install/) and run MongoDB in a container.


To start the container with MongoDB, run the following command in the application folder
```text
docker-compose up
```
* The application and database are connected through port 27017.

You can stop the MongoDB container with the command
```text
docker-compose stop
```
and start it again with
```text
docker-compose start
```
You can execute the command
```text
docker-compose down
```
but you will lose all saved data from the database.


Congratulations!

Run
```text
rss-reader --help
```
to learn about the features of the application and start using it.
### User interface
```text
usage: rss-reader [-h] [--version] [-l LIMIT] [--verbose] [--json] [--length LENGTH] [--date DATE] [--to_html PATH] [--to_fb2 PATH] [--colorize] source

positional arguments:
  source                RSS URL

optional arguments:
  -h, --help            show this help message and exit
  --version             Print version info
  -l LIMIT, --limit LIMIT
                        Limit news topics if this parameter provided
  --verbose             Outputs verbose status messages
  --json                Print result as JSON in stdout
  --length LENGTH       Sets the length of each line of news output
  --date DATE           Search past news by date in format yearmonthday (19991113)
  --to_html PATH        Save news by path in html format
  --to_fb2 PATH         Save news by path in fb2 format
  --colorize            Make console text display colorful
```

### JSON structure
```json
[
  {
    "title": "Yahoo News - Latest News & Headlines",
    "link": "https://www.yahoo.com/news",
    "items":
    [
      {
        "title": "Sorry, Hillary: Democrats don't need a savior",
        "link": "https://news.yahoo.com/sorry-hillary-democrats-dont-need-a-savior-194253123.html",
        "author": "no author",
        "published_parsed": [2019, 11, 13, 19, 42, 53, 2, 317, 0],
        "description": "With the Iowa caucuses fast approaching, Hillary Clinton is just the latest in the colorful cast of characters who seem to have surveyed the sprawling Democratic field, sensed something lacking and decided that \u201csomething\u201d might be them.",
        "img_links":
        [
          "http://l.yimg.com/uu/api/res/1.2/xq3Ser6KXPfV6aeoxbq9Uw--/YXBwaWQ9eXRhY2h5b247aD04Njt3PTEzMDs-/https://media-mbst-pub-ue1.s3.amazonaws.com/creatr-uploaded-images/2019-11/14586fd0-064d-11ea-b7df-7288f8d8c1a7"
        ]
      }
    ]
  }
]
```
### Caching
News is saved to the database whenever a news output command is executed. MongoDB is used as the database management system.
When the --date parameter is used, news is loaded from the database for the given date and RSS link.

Features:
* The --limit parameter affects the amount of data loaded into the database.
* Date must be written in the yearmonthday (example - 19991113) format.
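
The date lookup described above can be sketched without a database. This is a minimal sketch, assuming each item carries a nine-field `published_parsed` sequence as in the JSON structure; `parse_date_arg` and `same_day` are hypothetical helper names, and unlike the application this comparison does not adjust for time zones.

```python
from time import strptime, struct_time


def parse_date_arg(raw: str) -> struct_time:
    """Parse a --date value written as yearmonthday, e.g. 19991113."""
    return strptime(raw, "%Y%m%d")


def same_day(published_parsed, wanted: struct_time) -> bool:
    """Compare only the year, month and day of a nine-field time sequence."""
    return tuple(published_parsed[:3]) == (wanted.tm_year, wanted.tm_mon, wanted.tm_mday)


wanted = parse_date_arg("20191113")
print(same_day([2019, 11, 13, 19, 42, 53, 2, 317, 0], wanted))  # True
```

A value that does not match the format makes `strptime` raise `ValueError`, which is the error a caller would need to handle.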

### Saving to files
Using the "--to_html" and "--to_fb2" parameters, you can save news to files at a given path.
The path should be written in UNIX style (example: ./some/folder).
File names follow the "feed[index].[format]" template (example: feed13.html).
File indices are sequential: a new file takes the first free index, filling a gap in the sequence if there is one, and otherwise goes at the end.
For example, if the files "feed1.html" and "feed3.html" exist,
the new file will be named "feed2.html".
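
The naming rule can be sketched as a search for the first unused index. This is a minimal sketch, not the application's code: `next_feed_name` is a hypothetical helper, and it assumes that indices start at 1, as the "feed1.html" example suggests.

```python
import os
import tempfile


def next_feed_name(directory: str, fmt: str) -> str:
    """Return the first unused name of the form feed<index>.<fmt>, counting from 1."""
    existing = set(os.listdir(directory))
    index = 1
    while f"feed{index}.{fmt}" in existing:
        index += 1
    return f"feed{index}.{fmt}"


with tempfile.TemporaryDirectory() as folder:
    # Recreate the example: feed1.html and feed3.html already exist.
    for name in ("feed1.html", "feed3.html"):
        open(os.path.join(folder, name), "w").close()
    print(next_feed_name(folder, "html"))  # feed2.html
```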
1 change: 1 addition & 0 deletions app/__init__.py
@@ -0,0 +1 @@
__version__ = "5.1"
9 changes: 9 additions & 0 deletions app/core.py
@@ -0,0 +1,9 @@
from app.support_files.rss_reader import Reader


def main() -> None:
    Reader.exec_console_args()


if __name__ == "__main__":
    main()
Empty file added app/support_files/__init__.py
Empty file.
30 changes: 30 additions & 0 deletions app/support_files/app_logger.py
@@ -0,0 +1,30 @@
"""
This module provides functions to work with logging.
"""
import logging
import sys
from logging import Logger


def init_logger(name: str) -> Logger:
    """
    Initialize and return logger object.
    :param name: Name of the logger object.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # create a handler that writes log records to stdout
    stream_handler = logging.StreamHandler(sys.stdout)
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(message)s')
    stream_handler.setFormatter(formatter)
    # add handler to logger object
    logger.addHandler(stream_handler)
    return logger


def get_logger(name: str) -> Logger:
    """
    Return logger object.
    :param name: Name of the logger object.
    """
    return logging.getLogger(name)
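
Review note: `logging.getLogger` returns the same object for the same name, so calling `init_logger` twice with one name would attach two handlers and print each message twice. A minimal sketch of a guard, assuming one stdout handler per logger is the intent; `init_logger_once` and the logger name `rss-reader-demo` are hypothetical:

```python
import logging
import sys


def init_logger_once(name: str) -> logging.Logger:
    """Configure the named logger, attaching the stdout handler only once."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # skip if a handler was already attached
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("%(asctime)s - %(name)s - %(message)s"))
        logger.addHandler(handler)
    return logger


first = init_logger_once("rss-reader-demo")
second = init_logger_once("rss-reader-demo")
print(first is second, len(second.handlers))  # True 1
```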
29 changes: 29 additions & 0 deletions app/support_files/args_parser.py
@@ -0,0 +1,29 @@
"""
This module is a parser of console arguments for this project.
"""
import argparse
from argparse import Namespace

import app


def get_args() -> Namespace:
    """
    Parse console arguments.
    :return: An object that provides the values of parsed arguments.
    """
    parser = argparse.ArgumentParser(description="It is a python command-line rss reader")
    parser.add_argument("source", help="RSS URL")
    parser.add_argument("--version", action="version", version=f"%(prog)s {app.__version__}", help="Print version info")
    parser.add_argument("-l", "--limit", type=int, help="Limit news topics if this parameter provided", default=-1)
    parser.add_argument("--verbose", action="store_true", help="Outputs verbose status messages", default=False)
    parser.add_argument("--json", action="store_true", help="Print result as JSON in stdout", default=False)
    parser.add_argument("--length", type=int, help="Sets the length of each line of news output", default=120)
    parser.add_argument("--date", type=str, help="Search past news by date in format yearmonthday (19991113)",
                        default=None)
    parser.add_argument("--to_html", metavar="PATH", type=str, help="Save news by path in html format", default=None)
    parser.add_argument("--to_fb2", metavar="PATH", type=str, help="Save news by path in fb2 format", default=None)
    parser.add_argument("--colorize", action="store_true", help="Make console text display colorful", default=False)
    # parse once; the original called parse_args() twice
    return parser.parse_args()
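
Review note: passing an explicit list to `parse_args` lets the parser be exercised without touching `sys.argv`. The sketch below is illustrative only: `build_parser` is a hypothetical helper that covers a subset of the options, and the URL is a placeholder.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced parser with a few of the options from get_args()."""
    parser = argparse.ArgumentParser(prog="rss-reader")
    parser.add_argument("source", help="RSS URL")
    parser.add_argument("-l", "--limit", type=int, default=-1)
    parser.add_argument("--json", action="store_true")
    parser.add_argument("--date", type=str, default=None)
    return parser


# An explicit argument list replaces sys.argv[1:].
args = build_parser().parse_args(["https://example.com/rss", "--limit", "3", "--json"])
print(args.source, args.limit, args.json, args.date)  # https://example.com/rss 3 True None
```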
5 changes: 5 additions & 0 deletions app/support_files/config.py
@@ -0,0 +1,5 @@
"""
This module provides configuration strings
"""

APP_NAME = "rss-reader"
104 changes: 104 additions & 0 deletions app/support_files/db_manager.py
@@ -0,0 +1,104 @@
"""
This module contains class to work with database.
"""
import os
from dataclasses import asdict
from time import strptime, mktime, altzone, localtime, struct_time
from typing import Optional

from pymongo import MongoClient, errors

from app.support_files.app_logger import get_logger
from app.support_files.config import APP_NAME
from app.support_files.dtos import Feed, Item
from app.support_files.exeptions import FindFeedError, DateError, DBConnectError


class DBManager:
    """
    Class to work with database.
    """

    def __init__(self) -> None:
        mongo_host = os.getenv('MONGO_HOST', '127.0.0.1')
        client = MongoClient(f"mongodb://{mongo_host}:27017/")
        try:
            # force a round trip so that connection problems surface here
            client.server_info()
        except errors.ServerSelectionTimeoutError as err:
            raise DBConnectError(f"Can't connect to database with host - {mongo_host} and port - 27017") from err
        self._db = client["feed_db"]
        self._collection = self._db["feed_collection"]
        self._logger = get_logger(APP_NAME)

    def insert_feed(self, feed: Feed) -> None:
        """
        Insert feed in database.
        If this feed already exists in the database, only the news that is not there yet is added.
        :param feed: Feed, which should be inserted.
        """
        self._logger.info("Loading data from database to join with inserted data is started")
        cached_feed = self.find_feed_by_link(feed.rss_link)
        self._logger.info("Loading data from database to join with inserted data is finished")

        if cached_feed is not None:
            # the set union drops items that are already cached
            result_items = set(feed.items).union(cached_feed.items)
            result_items = list(map(asdict, result_items))
            self._collection.update_one({"rss_link": feed.rss_link}, {"$set": {"items": result_items}})
        else:
            self._collection.insert_one(asdict(feed))
        self._logger.info("New and old data are joined")

    def find_feed_by_link(self, link: str) -> Optional[Feed]:
        """
        Looks for feed in the database by rss link and returns it.
        :param link: Rss link.
        :return: Feed, if it exists, otherwise None.
        """
        dict_feed = self._collection.find_one({"rss_link": link})
        if dict_feed is None:
            return None
        del dict_feed["_id"]
        feed = Feed(**dict_feed)
        feed.items = [Item(**item) for item in dict_feed["items"]]
        return feed

    def find_feed_by_link_and_date(self, link: str, date: str, limit: int = -1) -> Feed:
        """
        Looks for feed in the database by rss link and date and returns it.
        Raises DateError if the date cannot be parsed and FindFeedError if the feed is not cached.
        :param link: Rss link.
        :param date: Date in the yearmonthday format (example - 19991113).
        :param limit: Limit count of returned items.
        :return: Feed with the items published on the given date.
        """
        try:
            parsed_date = strptime(date, "%Y%m%d")
        except ValueError as err:
            raise DateError(str(err)) from err
        feed = self.find_feed_by_link(link)
        if feed is None:
            raise FindFeedError("This feed is not cached")
        result_items = []
        # a non-positive limit never counts down to zero, so every matching item is returned
        count = limit
        for item in feed.items:
            item_date = struct_time(item.published_parsed)
            # shift the stored time by the local UTC offset (altzone) before comparing calendar dates
            l_i_date = localtime(mktime(tuple(item_date)) - altzone)
            if (l_i_date.tm_year, l_i_date.tm_mon, l_i_date.tm_mday) == \
                    (parsed_date.tm_year, parsed_date.tm_mon, parsed_date.tm_mday):
                result_items.append(item)
                count -= 1
                if count == 0:
                    break
        feed.items = result_items
        return feed

    def truncate_collection(self) -> None:
        """
        Remove all documents from the collection.
        """
        self._collection.delete_many({})


if __name__ == "__main__":
    DBManager()
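
Review note: the merge in `insert_feed` relies on items being hashable so that a set union removes duplicates. A minimal sketch of the idea without MongoDB follows. It swaps in a frozen dataclass (`DemoItem`, hypothetical), which gets a generated `__hash__`, instead of the string-based hash used by `Item`, and the sort is added only to make the output order deterministic.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class DemoItem:
    """Stand-in for Item; frozen=True makes instances hashable by field values."""
    title: str
    link: str


def merge_items(new: List[DemoItem], cached: List[DemoItem]) -> List[dict]:
    """Union new and cached items, dropping duplicates, and return them as dicts."""
    merged = set(new) | set(cached)
    return [asdict(item) for item in sorted(merged, key=lambda item: item.link)]


a = DemoItem("first", "https://example.com/1")
b = DemoItem("second", "https://example.com/2")
c = DemoItem("third", "https://example.com/3")
print(len(merge_items([a, b], [b, c])))  # 3
```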
33 changes: 33 additions & 0 deletions app/support_files/dtos.py
@@ -0,0 +1,33 @@
"""
This module contains data classes to work with feeds.
"""
from dataclasses import dataclass, field
from time import struct_time, localtime
from typing import List


@dataclass
class Item:
    """
    This class represents each item in feed.
    """
    title: str = "no title"
    link: str = "no link"
    author: str = "no author"
    # default_factory evaluates localtime() per instance instead of once at import time
    published_parsed: struct_time = field(default_factory=localtime)
    description: str = "description"
    img_links: List[str] = field(default_factory=list)

    def __hash__(self) -> int:
        return hash(str(self.__dict__))


@dataclass
class Feed:
    """
    This class represents feed.
    """
    rss_link: str
    title: str = "no title"
    link: str = "no link"
    items: List[Item] = field(default_factory=list)
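
Review note: for a time-valued default such as `published_parsed`, it matters when the default expression is evaluated. A plain call in the class body runs once, at class definition, while `default_factory` runs on every instantiation. A minimal sketch of the difference, using a counter instead of the clock so the result is deterministic; `EagerDefault` and `LazyDefault` are hypothetical names:

```python
from dataclasses import dataclass, field
from itertools import count

_counter = count(1)


@dataclass
class EagerDefault:
    # next(_counter) runs once, when the class body is executed
    value: int = next(_counter)


@dataclass
class LazyDefault:
    # the factory runs each time an instance is created
    value: int = field(default_factory=lambda: next(_counter))


print(EagerDefault().value, EagerDefault().value)  # 1 1
print(LazyDefault().value, LazyDefault().value)    # 2 3
```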
38 changes: 38 additions & 0 deletions app/support_files/exeptions.py
@@ -0,0 +1,38 @@
"""
This module provides exception classes.
"""


class FindFeedError(Exception):
    """
    Raised when there is a problem with getting a feed.
    """
    pass


class DateError(ValueError):
    """
    Raised when there is a problem with converting a date.
    """
    pass


class DirError(Exception):
    """
    Raised when the received path is not a directory.
    """
    pass


class DirExistsError(Exception):
    """
    Raised when the directory at the received path does not exist.
    """
    pass


class DBConnectError(Exception):
    """
    Raised when there is a problem with connecting to the database.
    """
    pass