- Repository: https://github.com/harzo/pragmatics
- Docker settings in Dockerfile and docker-compose.yml
- Possible AWS deploy: http://18.216.240.105/
- db: PostgreSQL, hosted on a private server
- imdbparser: used in 'ybsuggestions/crawler' module (imdbpyparser.py)
- asyncio: used in the 'ybsuggestions/crawler' module (jobs.py); the main loop is created in the 'ybsuggestions' init
- pytest: tests are stored in the 'tests' directory
- YBParser in 'ybsuggestions/crawler' module.
- The 'feedparser' library is used to parse the XML feed
- The 'ptn' library is used to parse torrent names
- Because the YourBitTorrent server is often unavailable and shows a Cloudflare CAPTCHA screen, the deployment uses a similar feed from https://rarbg.to/rssdd.php?category=movies
- Job 'job_check_new_movies' in 'ybsuggestions/crawler/jobs.py' creates an async task for each movie.
- Async 'update_imdb_info' in 'ybsuggestions/crawler/jobs.py' runs IMDbSearchParser
- IMDbSearchParser in 'ybsuggestions/crawler' is based on BeautifulSoup, as used by 'imdbpy'.
- A side thread is created in the 'ybsuggestions' init.
- Job 'job_check_new_movies' is scheduled to run every 12 hours.
- Flask-SQLAlchemy is used to manage objects (Movie, Profile, Genre)
- A dedicated data access class, MovieDAO, was created for Movie in 'ybsuggestions/crawler/moviedao.py'
- The API is created as a Flask blueprint in 'ybsuggestions/application/apis.py'
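
The fan-out noted above, where 'job_check_new_movies' creates one async task per movie, could look roughly like the following minimal sketch. The function bodies, their signatures, and the shape of the returned dictionary are assumptions made for illustration; the real 'update_imdb_info' runs IMDbSearchParser, which is replaced here by a placeholder.

```python
import asyncio


async def update_imdb_info(title):
    # Placeholder for the real lookup (assumption): yield control to the
    # event loop once, then return an empty record for the title.
    await asyncio.sleep(0)
    return {"title": title, "rating": None, "genres": []}


async def job_check_new_movies(titles):
    # Drop duplicate titles while keeping first-seen order.
    unique_titles = list(dict.fromkeys(titles))
    # One task per movie, so lookups can overlap while waiting on I/O.
    tasks = [asyncio.create_task(update_imdb_info(t)) for t in unique_titles]
    # gather() returns results in the same order the tasks were passed in.
    return await asyncio.gather(*tasks)


results = asyncio.run(job_check_new_movies(["Movie A", "Movie B", "Movie A"]))
```

Creating the tasks first and awaiting them together lets the per-movie lookups wait on the network concurrently instead of one after another.
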
Mini-service suggesting movies to download from torrents
- setuptools
- docker
- db of choice (postgresql/mysql/mongo/couchdb)
- use the imdbpy source (http://imdbpy.sourceforge.net/) to figure out which IMDb API you need (please mind that this library will not work with Python 3.x)
- pytest
- asyncio
- Write a service that parses, twice a day, the RSS feed showing movies added that day: https://yourbittorrent.com/movies/rss.xml
- Save it to a db of choice, mind duplicates.
- For each movie fetch its current imdb rating and genres.
- Generate suggestions for a movie to download for each "profile" saved in the database. A single profile consists of:
- minimal imdb rating
- whitelist of genres (if at least one matches, it matches)
- blacklist of genres (if at least one matches, it doesn't match)
- profile_id
- Expose json API showing:
- GET to list the suggestions for profile_id sorted with newest on top
- POST call to dismiss a suggestion (mark it dismissed)
- POST call to say a suggestion was a good one (mark it as good)
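
The profile rules listed above (minimal rating, genre whitelist, genre blacklist, newest on top) can be illustrated with a small in-memory sketch. The `Movie` and `Profile` dataclasses, their field names, and the `matches` and `suggest` helpers are hypothetical and not taken from the repository. Two points the task leaves open are treated as assumptions here: the blacklist wins when a movie matches both lists, and an empty whitelist accepts any genre.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Movie:
    title: str
    rating: float
    genres: set
    added: datetime


@dataclass
class Profile:
    profile_id: int
    min_rating: float
    whitelist: set = field(default_factory=set)
    blacklist: set = field(default_factory=set)


def matches(profile, movie):
    # Reject movies rated below the profile's minimum.
    if movie.rating < profile.min_rating:
        return False
    # Assumption: one blacklisted genre rejects the movie, even if
    # another of its genres is whitelisted.
    if movie.genres & profile.blacklist:
        return False
    # Assumption: an empty whitelist places no restriction on genres.
    if profile.whitelist and not (movie.genres & profile.whitelist):
        return False
    return True


def suggest(profile, movies):
    # Keep matching movies and sort them with the newest on top.
    hits = [m for m in movies if matches(profile, m)]
    return sorted(hits, key=lambda m: m.added, reverse=True)


profile = Profile(1, 7.0, whitelist={"Drama"}, blacklist={"Horror"})
movies = [
    Movie("Old Drama", 8.1, {"Drama"}, datetime(2018, 1, 1)),
    Movie("New Drama", 7.5, {"Drama", "Crime"}, datetime(2018, 1, 2)),
    Movie("Scary Drama", 9.0, {"Drama", "Horror"}, datetime(2018, 1, 3)),
    Movie("Weak Drama", 5.0, {"Drama"}, datetime(2018, 1, 4)),
]
titles = [m.title for m in suggest(profile, movies)]
```

In this example only "New Drama" and "Old Drama" remain, in that order: "Scary Drama" is removed by the blacklist and "Weak Drama" by the minimal rating.
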