248 changes: 226 additions & 22 deletions README.md
@@ -1,28 +1,232 @@
# How to create a PR with a homework task
RSS reader
=========

1. Create fork from the following repo: https://github.com/E-P-T/Homework. (Docs: https://docs.github.com/en/get-started/quickstart/fork-a-repo )
2. Clone your forked repo in your local folder.
3. Create separate branches for each session. Example: `session_2`, `session_3`, and so on.
4. Create a folder with your first and last name in your forked repo, in the created session branch.
5. Add your task into created folder
6. Push finished session task in the appropriate branch in accordance with written above.
You should get the structure that looks something like that
This is RSS reader version 1.0.

rss_reader.py is a Python script that fetches an RSS feed from a given source URL
and writes its content to standard output.

Please be careful when redirecting output to a file. In that case the CPython implementation
of the Python interpreter switches the output encoding from UTF-8 to
the system locale encoding (i.e. the ANSI code page).
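
One possible way to keep UTF-8 when output is redirected is to reconfigure the stream explicitly. The sketch below is not taken from rss_reader.py: `force_utf8` is a hypothetical helper, and it assumes Python 3.7 or newer, where text streams provide `reconfigure`.

```python
import io


def force_utf8(stream):
    """Switch a text stream to UTF-8 when the stream supports it."""
    # TextIOWrapper.reconfigure() is available since Python 3.7.
    if hasattr(stream, "reconfigure"):
        stream.reconfigure(encoding="utf-8", errors="replace")
    return stream


# Simulate redirected output that would otherwise use a legacy code page.
raw = io.BytesIO()
redirected = force_utf8(io.TextIOWrapper(raw, encoding="cp1252"))
redirected.write("Новости\n")
redirected.flush()
```

In a script the same helper could be applied to `sys.stdout` before anything is printed. Setting the environment variable `PYTHONIOENCODING=utf-8` before launching Python is an alternative that needs no code changes.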

This script will try to install all required packages from PyPI with pip in
the current environment.
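
How the script performs this step is not shown here. A minimal sketch of the idea, assuming a hypothetical `REQUIRED` mapping from import names to PyPI distribution names (the names themselves come from requirements.txt), might look like this:

```python
import importlib.util
import subprocess
import sys

# Hypothetical mapping: import name -> name of the distribution on PyPI.
REQUIRED = {"bs4": "beautifulsoup4", "requests": "requests", "fpdf": "fpdf2"}


def missing_distributions(required):
    """Return the PyPI names whose import name cannot be found."""
    return [dist for module, dist in required.items()
            if importlib.util.find_spec(module) is None]


def install(distributions):
    """Install the given distributions with the pip of this interpreter."""
    if distributions:
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", *distributions])
```

Calling `install(missing_distributions(REQUIRED))` would then install only what is absent. Using `sys.executable -m pip` keeps the installation in the environment that runs the script.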

Tests
------

To launch tests run

on Windows

```shell
python -m unittest tests.py
```

on Linux

```bash
python3 -m unittest tests.py
```

To check test coverage run

on Windows

```shell
python -m coverage run --source=rss_reader -m unittest tests.py
python -m coverage report -m
```

on Linux

```bash
python3 -m coverage run --source=rss_reader -m unittest tests.py
python3 -m coverage report -m
```

All of the commands above should be run from the directory that contains rss_reader.py.

How to execute without installation
------

Before installation there are two ways to start the RSS reader:

1. Using module loading. Run the following command from the directory that contains rss_reader.py

on Windows

```shell
python -m rss_reader ...
```

on Linux

```bash
python3 -m rss_reader ...
```

2. Specifying the script file. Run the following command from the directory that contains rss_reader.py

on Windows

```shell
python rss_reader.py ...
```

on Linux

```bash
python3 rss_reader.py ...
```

Installation
------

To install the script as a site package into the Python environment, run the following command

on Windows

```shell
python setup.py install
```
Branch: Session_2
DzmitryKolb
|___Task1.py
|___Task2.py
Branch: Session_3
DzmitryKolb
|___Task1.py
|___Task2.py

on Linux

```bash
python3 setup.py install
```

7. When you finish your work on task you should create Pull request to the appropriate branch of the main repo https://github.com/E-P-T/Homework (Docs: https://docs.github.com/en/github/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork).
Please use the following instructions to prepare good description of the pull request:
- Pull request header should be: `Session <Number of the session> - <FirstName> <LastName>`.
Example: `Session 2 - Dzmitry Kolb`
- Pull request body: You should write here what tasks were implemented.
Example: `Finished: Task 1.2, Task 1.3, Task 1.6`
How to execute after installation
------

After installation there are three ways to start the RSS reader:

1. Using module loading. Run from any directory

on Windows

```shell
python -m rss_reader ...
```

on Linux

```bash
python3 -m rss_reader ...
```

2. Specifying the script file. Run the following command from the directory that contains rss_reader.py

on Windows

```shell
python rss_reader.py ...
```

on Linux

```bash
python3 rss_reader.py ...
```

3. Using entry point. Run from any directory

```shell
rss_reader ...
```
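
An entry point like this is usually declared through `console_scripts` in setup.py. The actual setup.py is not part of this README, so the following is only a sketch: `SETUP_KWARGS` is a hypothetical name, and `main` is an assumed function inside rss_reader.py.

```python
# Hypothetical keyword arguments for setuptools.setup(); only the
# console_scripts entry matters for the `rss_reader` command.
SETUP_KWARGS = {
    "name": "rss_reader",
    "version": "1.0",
    "py_modules": ["rss_reader"],
    "entry_points": {
        # "<command>=<module>:<function>"; `main` is an assumed function name.
        "console_scripts": ["rss_reader=rss_reader:main"],
    },
}

# In a real setup.py this dictionary would be passed on:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```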

Command line format
-------

    usage: rss_reader.py [-h] [--version] [--json] [--verbose] [--limit LIMIT] [--date DATE] [--to-html HTML_DEST]
                         [--to-pdf PDF_DEST]
                         source

    Pure Python command-line RSS reader.

    positional arguments:
      source               RSS URL

    optional arguments:
      -h, --help           show this help message and exit
      --version            Print version info
      --json               Print result as JSON in stdout
      --verbose            Outputs verbose status messages
      --limit LIMIT        Limit news topics if this parameter provided
      --date DATE          Get from cache news that was published after specified date (date should be specified in format
                           YYYYmmdd, for example --date 20191020)
      --to-html HTML_DEST  Store feed in HTML as specified file
      --to-pdf PDF_DEST    Store feed in PDF as specified file
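
An `argparse` parser that accepts the same options could be sketched as follows. This is an illustration, not the script's own code: `build_parser` and `yyyymmdd` are hypothetical names, and the integer type for `--limit`, the date conversion, and the version string are assumptions.

```python
import argparse
from datetime import datetime


def yyyymmdd(text):
    """Convert a YYYYmmdd string such as 20191020 to a date."""
    return datetime.strptime(text, "%Y%m%d").date()


def build_parser():
    """Build a parser that mirrors the options in the usage text."""
    parser = argparse.ArgumentParser(
        prog="rss_reader.py",
        description="Pure Python command-line RSS reader.")
    parser.add_argument("source", help="RSS URL")
    parser.add_argument("--version", action="version",
                        version="Version 1.0", help="Print version info")
    parser.add_argument("--json", action="store_true",
                        help="Print result as JSON in stdout")
    parser.add_argument("--verbose", action="store_true",
                        help="Outputs verbose status messages")
    parser.add_argument("--limit", type=int,
                        help="Limit news topics if this parameter provided")
    parser.add_argument("--date", type=yyyymmdd,
                        help="Get cached news published after this date")
    parser.add_argument("--to-html", metavar="HTML_DEST",
                        help="Store feed in HTML as specified file")
    parser.add_argument("--to-pdf", metavar="PDF_DEST",
                        help="Store feed in PDF as specified file")
    return parser


args = build_parser().parse_args(
    ["--limit", "3", "--json", "https://example.com/rss"])
```

With this sketch the destination files are available as `args.to_html` and `args.to_pdf`, because `argparse` replaces the dash in an option name with an underscore.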

JSON representation
-------

```json
{
  "title": Title of the feed,
  "link": URL of feed,
  "description": Description of the feed,
  "items": [
    {
      "title": Item title if present,
      "pubDate": Publication date if present,
      "link": URL of the item if present,
      "description": Description of the item,
      "links": [
        [
          Link URL,
          Link type
        ],
        ...
      ]
    },
    ...
  ]
}
```
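
A parsed feed could be turned into this representation roughly as shown below. This is a sketch under assumptions: `feed_to_json` is a hypothetical function, the feed is assumed to be a dictionary of the shape described under "Cache storage format", and `images` is dropped because bytes cannot be serialized to JSON.

```python
import json

ITEM_KEYS = ("title", "pubDate", "link", "description")


def feed_to_json(feed):
    """Serialize a parsed feed to JSON, keeping only the fields listed above."""
    items = []
    for item in feed.get("items", []):
        # Copy only the keys that are present in the item.
        entry = {key: item[key] for key in ITEM_KEYS if key in item}
        # Tuples become two-element JSON arrays: [URL, type].
        entry["links"] = [list(link) for link in item.get("links", [])]
        items.append(entry)
    document = {
        "title": feed.get("title"),
        "link": feed.get("link"),
        "description": feed.get("description"),
        "items": items,
    }
    # ensure_ascii=False keeps non-ASCII text readable; default=str covers
    # values such as datetime objects that json cannot encode natively.
    return json.dumps(document, ensure_ascii=False, indent=2, default=str)


example_feed = {
    "title": "Example feed",
    "link": "https://example.com/rss",
    "description": "Demo",
    "items": [{
        "title": "First",
        "link": "https://example.com/1",
        "description": "Text",
        "images": {"https://example.com/a.png": b"\x89PNG"},
        "links": [("https://example.com/1", "link")],
    }],
}
print(feed_to_json(example_feed))
```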

Cache storage format
------

The news cache is stored in the file rss_reader.cache in the current working directory.

The cache file contains a dictionary serialized with the `pickle` module.

The keys of the dictionary are the URLs of retrieved feeds.

For each key, the value is the result of parsing the feed, with merged item lists.

Items from all retrievals of the same URL are merged into a single list.
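
The load, merge, and save cycle described here can be sketched as follows. `load_cache` and `update_cache` are hypothetical names, and the rule for detecting duplicate items (same `link`, `title`, and `pubDate`) is an assumption, since the README does not say how items are compared.

```python
import os
import pickle
import tempfile


def load_cache(path):
    """Return the cached dictionary, or an empty one if no cache exists yet."""
    if not os.path.exists(path):
        return {}
    with open(path, "rb") as cache_file:
        return pickle.load(cache_file)


def update_cache(path, url, feed):
    """Merge the items of a freshly parsed feed into the cache entry for url."""
    cache = load_cache(path)
    known = cache.get(url, {}).get("items", [])
    # Assumption: two items are the same if link, title and pubDate match.
    seen = {(i.get("link"), i.get("title"), i.get("pubDate")) for i in known}
    fresh = [i for i in feed.get("items", [])
             if (i.get("link"), i.get("title"), i.get("pubDate")) not in seen]
    cache[url] = {**feed, "items": known + fresh}
    with open(path, "wb") as cache_file:
        pickle.dump(cache, cache_file)
    return cache


# Two retrievals of the same URL end up in one merged item list.
cache_path = os.path.join(tempfile.mkdtemp(), "rss_reader.cache")
url = "https://example.com/rss"
update_cache(cache_path, url, {"title": "Feed", "items": [{"link": "a"}]})
merged = update_cache(
    cache_path, url, {"title": "Feed", "items": [{"link": "a"}, {"link": "b"}]})
```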

The result of parsing a feed is a dictionary with the following keys:

- `title` - title of the feed

- `link` - link to the feed

- `description` - description of the feed

- `items` - list of parsed items of the feed.

The result of parsing an item of a feed is a dictionary with the following keys:

- `title` - title of the item

- `pubDate` - publication date of the item

- `link` - link to resource related to the item

- `description` - description of the item

- `images` - dictionary of collected images (keys are URLs of images and values are their content as bytes object)

- `links` - a list of links collected for the item.

The list `links` contains one tuple for each link collected for the item. The tuple has two elements:

- URL of the link

- type of the link. It may be just link if the type is HTML or unknown,
or the type part of the resource's MIME type (for example, image for image/jpeg).
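
The rule for the second tuple element can be illustrated with a small function. `link_type` is a hypothetical name, and the sketch assumes that the literal string `"link"` is used for HTML and unknown types.

```python
def link_type(mime_type):
    """Map a MIME type to the second element of a `links` tuple."""
    if not mime_type:
        return "link"  # unknown type
    # Drop parameters such as "; charset=utf-8" before comparing.
    essence = mime_type.split(";", 1)[0].strip().lower()
    if essence == "text/html":
        return "link"  # HTML pages are plain links
    return essence.split("/", 1)[0]  # "image/jpeg" -> "image"
```

For example, `link_type("audio/mpeg")` gives `"audio"`, while `link_type(None)` and `link_type("text/html; charset=utf-8")` both give `"link"`.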
14 changes: 14 additions & 0 deletions requirements.txt
@@ -0,0 +1,14 @@
beautifulsoup4==4.11.1
bs4==0.0.1
certifi==2022.6.15
charset-normalizer==2.1.0
defusedxml==0.7.1
fpdf2==2.5.5
idna==3.3
lxml==4.9.0
Pillow==9.1.1
python-dateutil==2.8.2
requests==2.28.1
six==1.16.0
soupsieve==2.3.2.post1
urllib3==1.26.9