The fix ensures the ToC is generated before serialization, because
generating it mutates the document and it would otherwise be missing from the output.
chore: add .vscode config to .gitignore
This commit adds the itertools crate, which is used to deduplicate the Vec
when downloading URLs.
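
A minimal std-only sketch of order-preserving deduplication. The commit does
not say which itertools method is used; `dedup_urls` is a hypothetical helper
that does by hand roughly what itertools' `unique()` adaptor provides.

```rust
use std::collections::HashSet;

/// Removes duplicate URLs while keeping the first occurrence of each,
/// so the download order matches the input order.
/// Hypothetical helper, not paperoni's actual code.
fn dedup_urls(urls: Vec<String>) -> Vec<String> {
    let mut seen = HashSet::new();
    urls.into_iter()
        // `insert` returns false when the value was already present.
        .filter(|url| seen.insert(url.clone()))
        .collect()
}
```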
fix: fix error message
feat: change the serif and mono font declarations
- restore default debug level when logging to file
- return early from generating epubs if there are no articles
- fix serialization bug in creating attributes
- Add ReadabilityError field
- Refactor `article` getter in Extractor to return a `&NodeRef`. This
relies on the assumption that the article has already been parsed;
the getter panics otherwise.
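
A rough sketch of a getter that returns a reference and panics if parsing has
not happened yet. `NodeRef` here is a stand-in struct, and the `Option` field
and `extract_content` method are assumptions made for illustration.

```rust
/// Stand-in for the `NodeRef` type named in the commit (assumption).
struct NodeRef {
    name: String,
}

struct Extractor {
    // `None` until the document has been parsed.
    article: Option<NodeRef>,
}

impl Extractor {
    fn new() -> Self {
        Extractor { article: None }
    }

    /// Hypothetical parse step that populates `article`.
    fn extract_content(&mut self) {
        self.article = Some(NodeRef { name: "article".to_string() });
    }

    /// Returns a reference to the parsed article.
    /// Panics if called before `extract_content`.
    fn article(&self) -> &NodeRef {
        self.article
            .as_ref()
            .expect("article has not been parsed yet")
    }
}
```

Returning `&NodeRef` instead of `Option<&NodeRef>` keeps call sites short, at
the cost of a panic when the parse step is skipped.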
Using this custom error type, many instances of unwrap are replaced
with mapping to errors that are then logged in main.rs. This allows
paperoni to stop crashing when downloading articles when the errors
are possibly recoverable or should not affect other downloads.
As a consequence, failed image downloads are now ignored and the
original URLs are left intact.
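
The pattern described above could look roughly like this. `FetchError`,
`fetch_image` and `src_after_download` are hypothetical names, and a local
file read stands in for the network request so the sketch stays std-only.

```rust
use std::fmt;
use std::fs;

/// Hypothetical error type; the variant is illustrative only.
#[derive(Debug)]
enum FetchError {
    Io(String),
}

impl fmt::Display for FetchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FetchError::Io(msg) => write!(f, "could not fetch image: {}", msg),
        }
    }
}

impl std::error::Error for FetchError {}

/// Stand-in for an image download (reads a local file instead).
/// The underlying error is mapped into `FetchError` rather than unwrapped.
fn fetch_image(source: &str) -> Result<Vec<u8>, FetchError> {
    fs::read(source).map_err(|e| FetchError::Io(format!("{}: {}", source, e)))
}

/// Returns the value to use for the image `src`: a local path on success,
/// or the untouched original source when the fetch fails.
fn src_after_download(source: &str) -> String {
    match fetch_image(source) {
        Ok(_bytes) => {
            // `rsplit` always yields at least one item, so the fallback is unused.
            let name = source.rsplit('/').next().unwrap_or(source);
            format!("images/{}", name)
        }
        Err(e) => {
            // Log and carry on instead of crashing the whole run.
            eprintln!("warning: {}", e);
            source.to_string()
        }
    }
}
```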
The bug fixes include:
- `<html>` nodes being added to the replaced image when `unwrap_noscript_tags`
is called.
- Remove `srcset` attribute of `<img>` tags after downloading the image. The
leftover attribute prevented readers like Foliate from displaying the downloaded image
- Prevent downloading images with base64 strings as the source
- Add escaping of quotation characters in the serializer
- Disable redirects when downloading images which fails on multiple sites
- Remove invalid characters when building the epub export file name
- Fix version number in release
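
Three of the fixes above (skipping base64 sources, escaping quotation
characters, cleaning the export file name) can be sketched as small helpers.
All function names are hypothetical, and the set of characters treated as
invalid is an assumption, since the commit does not list them.

```rust
/// True when the image source is an inline `data:` URI (for example a
/// base64-encoded image), so there is nothing to download.
fn is_inline_data(src: &str) -> bool {
    src.trim_start().to_ascii_lowercase().starts_with("data:")
}

/// Escapes double quotes so the value can sit inside a double-quoted
/// attribute. Other characters are left as they are.
fn escape_attr_value(value: &str) -> String {
    value.replace('"', "&quot;")
}

/// Drops characters that common file systems reject in file names.
/// The exact set is an assumption, not taken from the commit.
fn sanitize_file_name(title: &str) -> String {
    const INVALID: &[char] = &['/', '\\', ':', '*', '?', '"', '<', '>', '|'];
    title
        .chars()
        .filter(|c| !INVALID.contains(c) && !c.is_control())
        .collect()
}
```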
Change from using res directory for image downloads to using temp directories.
Update surf to v2, which required changing the way Content-Type headers are
read.