- Add `output_dir` to cli argument
- This argument allows you to save output files in a special folder, not just current dir
- Refactor 'cli.rs'
- Add `Builder` for `AppConfig`
- Add `Error` instead separated panics
- Upgrade dependencies
- display of partial downloads in the summary
- custom file name that is displayed after the summary ensuring it is visible
- log-to-file flag which specifies that logs will be sent to the default directory
- verbose flag (v) used to configure the log levels
- disabling the progress bars when logging to the terminal is active
- Add ReadabilityError field
- Refactor `article` getter in Extractor to return a &NodeRef. This
relies on the assumption that the article has already been parsed
and should otherwise panic.
- Map errors in `fetch_html` to include the source url
- Change `article_link` to `article_source`
- Add `Into` conversion for `UTF8Error`
- Collect errors in `generate_epubs` for displaying in a table
Using this custom error type, many instances of unwrap are replaced
with mapping to errors that are then logged in main.rs. This allows
paperoni to stop crashing when downloading articles when the errors
are possibly recoverable or should not affect other downloads.
This subsequently introduces ignoring the failed image downloads
and instead leaving the original URLs intact.
This change allows for parallel downloads of HTML pages upto a maximum
number of concurrent HTTP requests which is more efficient than
before where all HTTP requests are likely to begin at the same time.
This adds:
- More validation of responses to ensure the HTML response is valid.
- Better handling of redirecting URLs which allows for fetching of
links proxied to Medium.
- Prevent downloading images with base64 strings as the source
- Add escaping of quotation characters in the serializer
- Disable redirects when downloading images which fails on multiple sites
- Remove invalid characters for making the epub export file name
- Fix version number in release
The bug fixes are for:
- <base> elements with "/" as the href
- articles containing an ampersand in the title which would create
corrupted manifest files.
Change from using res directory for image downloads to using temp directories.
Update surf to v2 which required changing the way Content-Type headers are
read from.