fix_lazy_images checks whether an img node is lazily loaded. An img is
considered lazily loaded if it does not have a src/srcset attribute, or
if its class contains the 'lazy' string. If an img is considered lazy,
fix_lazy_images attempts to replace its src.
However, if an img was missing the class attribute, it was incorrectly
assumed to be lazy and had its src replaced.
Fixes hipstermojo/paperoni#13
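The corrected check can be sketched roughly as follows. This is a minimal illustration, not the code in paperoni: `ImgAttrs` and `is_lazy` are hypothetical names, and it assumes "does not have a src/srcset attribute" means that both attributes are absent.

```rust
/// Hypothetical snapshot of the attributes of an `<img>` element.
struct ImgAttrs<'a> {
    src: Option<&'a str>,
    srcset: Option<&'a str>,
    class: Option<&'a str>,
}

/// Returns true when the image looks lazily loaded.
/// Assumption: an image with neither `src` nor `srcset` counts as lazy.
fn is_lazy(img: &ImgAttrs) -> bool {
    let has_source = img.src.is_some() || img.srcset.is_some();
    // A missing class attribute must not be treated as a 'lazy' class.
    let class_is_lazy = img.class.map_or(false, |c| c.contains("lazy"));
    !has_source || class_is_lazy
}
```

With this shape, an img that has a src but no class attribute is left alone, which is the case the fix targets.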
- Swap `unwrap` for an `if let` statement in `get_article_metadata`
- Add a default when extracting the title from a possible `<title>` element
- Fix extracting alternative titles from h1 tags
- Add ReadabilityError field
- Refactor the `article` getter in Extractor to return a `&NodeRef`. This
assumes the article has already been parsed; calling the getter before
parsing panics.
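A getter of the kind described in the last item might look like the sketch below. The field name `article_node`, the `parse` method, and the `NodeRef` stand-in type are assumptions made for illustration; the real `NodeRef` comes from the HTML library the project uses.

```rust
/// Stand-in for the DOM node type; the real `NodeRef` is not defined here.
struct NodeRef(String);

/// Hypothetical, stripped-down extractor.
struct Extractor {
    article_node: Option<NodeRef>,
}

impl Extractor {
    fn new() -> Self {
        Extractor { article_node: None }
    }

    /// Pretend parsing step: stores the input as the article node.
    fn parse(&mut self, html: &str) {
        self.article_node = Some(NodeRef(html.to_string()));
    }

    /// Returns a reference to the parsed article.
    /// Panics if `parse` has not been called first.
    fn article(&self) -> &NodeRef {
        self.article_node
            .as_ref()
            .expect("article has not been parsed yet")
    }
}
```

Returning a reference avoids handing out an `Option` at every call site, at the cost of a panic when the getter is called too early.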
The title retrieval code previously assumed that meta tags concerned
with the title always contain a value. Some sites leave the value
empty, so empty values are now checked for as well.
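The empty-value check can be illustrated with a small helper. This is only a sketch: `first_non_empty` is a hypothetical function, and the candidate values stand in for whatever the title-related meta tags yield.

```rust
/// Returns the first candidate that is present and not blank.
/// Each entry models the content of one title-related meta tag,
/// with `None` meaning the tag or its value attribute is missing.
fn first_non_empty<'a>(candidates: &[Option<&'a str>]) -> Option<&'a str> {
    candidates
        .iter()
        .copied()
        .flatten() // drop missing values
        .map(str::trim)
        .find(|value| !value.is_empty()) // drop empty or whitespace-only values
}
```

A caller would fall back to another title source, such as the `<title>` element, when this returns `None`.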
The bug fixes include:
- `<html>` nodes being added to the replaced image when `unwrap_noscript_tags`
is called.
- The `srcset` attribute of `<img>` tags not being removed after the image
is downloaded. The leftover attribute prevented readers like Foliate from
displaying the downloaded image.
The bug fixes are for:
- <base> elements with "/" as the href
- articles with an ampersand in the title, which produced corrupted
manifest files.
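For the ampersand case, one common approach is to escape the title before writing it into the manifest. The sketch below assumes the manifest is an XML document; `escape_xml` is a hypothetical helper and not necessarily how the fix was implemented.

```rust
/// Escapes the five characters that XML reserves in text and attributes.
fn escape_xml(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            _ => out.push(ch),
        }
    }
    out
}
```

A bare `&` in XML starts an entity reference, so a title such as `Q&A` written without escaping makes the file ill-formed.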
When calling `detach` in a for loop or `for_each` iterator consumer,
only the first node is ever deleted.
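The behaviour can be reproduced with a toy model. The `Node` type below is a deliberately simplified stand-in, not the DOM library's implementation: it assumes a lazy iterator that looks up the next sibling only when asked for it, and a `detach` that clears the node's sibling link. Collecting the nodes before detaching them is shown as one possible workaround.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// Toy node that only knows its next sibling.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
    detached: Cell<bool>,
}

impl Node {
    /// Marks the node as removed and clears its sibling link.
    fn detach(&self) {
        self.detached.set(true);
        self.next.borrow_mut().take();
    }
}

/// Builds a chain of `n` siblings (`n` must be at least 1).
fn build_siblings(n: usize) -> Vec<Rc<Node>> {
    let nodes: Vec<Rc<Node>> = (0..n)
        .map(|_| Rc::new(Node { next: RefCell::new(None), detached: Cell::new(false) }))
        .collect();
    for pair in nodes.windows(2) {
        *pair[0].next.borrow_mut() = Some(Rc::clone(&pair[1]));
    }
    nodes
}

/// Lazy walk: the link of the previously yielded node is read on the next call.
fn siblings(first: Rc<Node>) -> impl Iterator<Item = Rc<Node>> {
    let mut start = Some(first);
    let mut last: Option<Rc<Node>> = None;
    std::iter::from_fn(move || {
        let node = match last.take() {
            None => start.take()?,
            Some(prev) => {
                let next = prev.next.borrow().clone();
                next?
            }
        };
        last = Some(Rc::clone(&node));
        Some(node)
    })
}

/// Detaches inside the loop and returns how many nodes were detached.
fn detach_while_iterating(n: usize) -> usize {
    let nodes = build_siblings(n);
    for node in siblings(Rc::clone(&nodes[0])) {
        node.detach(); // clears the link the iterator needs for its next step
    }
    nodes.iter().filter(|node| node.detached.get()).count()
}

/// Collects all nodes first, then detaches them.
fn collect_then_detach(n: usize) -> usize {
    let nodes = build_siblings(n);
    let snapshot: Vec<Rc<Node>> = siblings(Rc::clone(&nodes[0])).collect();
    for node in snapshot {
        node.detach();
    }
    nodes.iter().filter(|node| node.detached.get()).count()
}
```

In this model, detaching inside the loop removes the link the iterator needs for its next step, so the walk stops after the first node. Taking a snapshot first visits every node.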
Fix replacement of table nodes in prep_article
Edit clean_conditionally to remove unnecessary assignment.
Extract constants from the code for easier reuse in some cases.
Change select queries for multiple elements to use the `,` operator
instead of calling `chain`.
Remove check for "null" in `fix_lazy_images`. The check works around a
JSDOM issue, so removing it does not affect the Rust code in any way.