Commit graph

144 commits

Author SHA1 Message Date
Kenneth Gitere
abaa7d37df dev: update packages and disable paperteer features 2022-02-01 20:16:29 +03:00
Kenneth Gitere
e777426c1b feat: add reinsertion of title as <h1> requested in #22 2021-12-30 07:58:19 +03:00
Kenneth Gitere
3bf0719c8e feat: add fetch_html_from_puppeteer fn 2021-10-18 10:03:09 +03:00
Kenneth Gitere
796a34a34c bump version 2021-08-24 07:40:05 +03:00
Kenneth Gitere
dc16f9f52b
Merge pull request #21 from hipstermojo/dev
v0.6.1 release
2021-08-24 07:24:56 +03:00
Kenneth Gitere
d4a23088a9 test: add cli tests 2021-08-24 07:20:23 +03:00
Kenneth Gitere
07479afeac refactor: refactor update_imgs_base64
chore: add doc comment on ResourceType alias

fix: add error when image MIME type is invalid on an image
2021-07-28 10:00:45 +03:00
Kenneth Gitere
0b19376f59 test: add tests for html module 2021-07-27 18:43:08 +03:00
Kenneth Gitere
0357eaebb6 fix: fix insert_appendix function when inserting HTML nodes
refactor: remove check for `<head>` in inline_css

The `<head>` element is automatically added when parsing an HTML document,
therefore, the program should panic if it still does not find the `<head>`
element
2021-07-27 18:42:17 +03:00
Kenneth Gitere
9c2232e37f fix: add validation when passing inline-images flag 2021-07-27 18:38:01 +03:00
Kenneth Gitere
3958261cda
Merge pull request #20 from hipstermojo/dev
v0.6.0 release
2021-07-24 13:54:50 +03:00
Kenneth Gitere
40cf5b06c9 chore: update README
chore: bump version
2021-07-24 13:29:55 +03:00
Kenneth Gitere
e6f901eb5a refactor: rename Extractor to Article 2021-07-24 12:43:40 +03:00
Kenneth Gitere
eac28da798 fix: add validation when passing --inline-toc
feat: add coloring when displaying CLI errors
2021-07-24 12:36:33 +03:00
Kenneth Gitere
2f4da824ba feat: add HTML exports with inlining of images
fix: typo fix
refactor: refactor `add_stylesheets` function
2021-07-24 12:08:18 +03:00
Kenneth Gitere
d1d1a0f3f4 feat: add no-css and no-header-css flags for #19
refactor: change to yaml configuration for the CLI

refactor: change all flags to kebab case
2021-07-22 08:50:08 +03:00
Kenneth Gitere
d67169425d fix: fix serialization of element attributes 2021-07-16 07:45:20 +03:00
Kenneth Gitere
6b1a826ccc
Merge pull request #18 from hipstermojo/dev
v0.5.0 release
2021-06-24 08:36:11 +03:00
Kenneth Gitere
92c97ca2cf fix: add .epub extension as fallback
chore: update dependencies and update README
chore: bump version
2021-06-24 08:26:40 +03:00
Kenneth Gitere
754365a42a feat: add inline-toc flag 2021-06-17 17:32:53 +03:00
Kenneth Gitere
c6c10689eb fix: fix broken links in toc generation
the fix involves ensuring the ToC is generated prior to serialization
because it mutates the document and will not work otherwise.

chore: add .vscode config to .gitignore
2021-06-16 18:09:05 +03:00
Kenneth Gitere
282d229754 fix: fix ordering issue with merged articles
This commit adds the itertools crate which is used to dedup the Vec
when downloading urls

fix: fix error message
feat: change the serif and mono fonts declarations
2021-06-11 14:21:41 +03:00
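The ordering fix above depends on removing duplicate URLs without disturbing the order in which they were supplied. The commit uses the itertools crate for this; the sketch below is only an illustration of the idea using the standard library instead, and `dedup_keep_order` is a hypothetical name, not a function from the repository.

```rust
use std::collections::HashSet;

/// Hypothetical helper: drops repeated URLs while keeping the first
/// occurrence of each one in its original position.
fn dedup_keep_order(urls: Vec<String>) -> Vec<String> {
    let mut seen = HashSet::new();
    urls.into_iter()
        // `insert` returns false when the value was already present,
        // so only the first occurrence of each URL passes the filter.
        .filter(|url| seen.insert(url.clone()))
        .collect()
}
```

Unlike sorting and then deduplicating, this keeps merged articles in the order the user listed them.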
Kenneth Gitere
4247fab1ea feat: add css library for EPUB exports 2021-06-09 08:04:50 +03:00
Kenneth Gitere
d50bbdfb58 fix: minor fixes
- restore default debug level when logging to file
- return early from generating epubs if there are no articles
- fix serialization bug in creating attributes
2021-06-09 07:26:52 +03:00
Kenneth Gitere
8691b0166f fix: fix panic when unwrapping a base URI
chore: add message when downloading articles to a specified output-dir
2021-06-08 20:37:20 +03:00
Kenneth Gitere
5fbfb9c806 refactor: move download function to http module
feat: add rendering of table for partial downloads
feat: add help message for enabling --log-to-file
chore: format flags to kebab-case and shorten --output-directory flag
2021-06-08 07:58:52 +03:00
Kenneth Gitere
95bd22f339 Merge branch 'dev' of github.com:hipstermojo/paperoni into dev 2021-06-07 22:44:51 +03:00
Kenneth Gitere
5b41e785b8 Fix get_header_level_toc_vec 2021-06-07 22:42:14 +03:00
Kenneth Gitere
16dc83ac62
Merge pull request #15 from sadsnake42/output-directory
Add `output_dir` to cli argument
2021-06-06 16:01:38 +03:00
Mikhail Gorbachev
67e86e4d74 Refactor LogError 2021-06-06 15:53:47 +03:00
Mikhail Gorbachev
aa9258e122 Fix from PR#15
- refactor comments
- move `cli::Error` to `errors::ErrorCli`
- removed mixing of order of input urls
- move pure functionality of `init_logger` to clear function
2021-06-06 13:25:28 +03:00
Kenneth Gitere
a1156e10fc Add generate_header_ids function
Add h4 to header level ToC and update implementation
Add tests
2021-06-06 13:02:31 +03:00
Kenneth Gitere
8220cf29f7 Change function replace_metadata_value to replace_escaped_characters 2021-06-06 12:59:25 +03:00
Kenneth Gitere
5548ba4ba5 Merge branch 'dev' of github.com:hipstermojo/paperoni into dev 2021-06-06 09:24:17 +03:00
Kenneth Gitere
751b5702fe
Merge pull request #17 from philwrenn/dev
Removed unwrap to prevent unexpected panic.
2021-06-06 09:23:01 +03:00
Philip Wrenn
fd161455b4 Removed unwrap to prevent unexpected panic. 2021-06-05 23:17:55 -04:00
Mikhail Gorbachev
13ad14e73d Add output_dir to cli argument
- Add `output_dir` to cli argument
    - This argument allows you to save output files in a special folder, not just current dir
- Refactor 'cli.rs'
    - Add `Builder` for `AppConfig`
    - Add `Error` instead of separate panics
- Upgrade dependencies
2021-06-01 18:18:14 +03:00
Kenneth Gitere
8c9783b596 feat: add header level table of contents for articles 2021-05-24 20:40:41 +03:00
Kenneth Gitere
3a8160412c refactor short_summary function in logs.rs to be less redundant 2021-05-24 20:40:41 +03:00
Kenneth Gitere
1cbbc7527f Update version 2021-05-24 20:33:05 +03:00
Kenneth Gitere
c916fb8493 Edit README 2021-05-13 12:26:23 +03:00
Kenneth Gitere
5ccbe1a17a Merge branch 'dev' of github.com:hipstermojo/paperoni into dev 2021-05-13 12:25:11 +03:00
Kenneth Gitere
102304544d
Merge pull request #14 from kxt/13-fix-lazy-images-laziness-check
Fix laziness check in fix_lazy_images
2021-05-12 07:12:46 +03:00
KOVACS Tamas
7649f6aa18 moz_readability/mod.rs: fix laziness check in fix_lazy_images
fix_lazy_images checks whether an img node is lazily loaded. An img is
considered lazily loaded if it does not have an src/srcset attribute, or
if its class contains the 'lazy' string. If an img is considered lazy,
fix_lazy_images will make attempts to replace its src.

However, if an img was missing the class attribute, it was incorrectly
assumed to be lazy and had its src replaced.

Fixes hipstermojo/paperoni#13
2021-05-10 10:08:33 +02:00
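The laziness rule described in this commit can be sketched as a small predicate. This is a minimal illustration, not the code in moz_readability/mod.rs: the `Img` struct and `is_lazy` function are hypothetical, and the sketch assumes "does not have an src/srcset attribute" means that both attributes are absent.

```rust
/// Hypothetical, simplified stand-in for the attributes of an `<img>` node.
struct Img {
    src: Option<&'static str>,
    srcset: Option<&'static str>,
    class: Option<&'static str>,
}

/// Returns true when the image should be treated as lazily loaded:
/// it has neither `src` nor `srcset`, or its class contains "lazy".
fn is_lazy(img: &Img) -> bool {
    let has_source = img.src.is_some() || img.srcset.is_some();
    // A missing class attribute must not count as lazy, which is the
    // case the commit message says was previously handled incorrectly.
    let class_is_lazy = img.class.map_or(false, |class| class.contains("lazy"));
    !has_source || class_is_lazy
}
```

With this rule, an image that has a `src` but no `class` attribute is left alone, while one whose class is, for example, `lazyload` is still treated as lazy.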
KOVACS Tamas
d50f08b875 moz_readability/mod.rs: add testcase for issue #13
This patch adds a testcase for issue #13, where an img node without
a class attribute is automatically assumed to be lazy and its src is
replaced.
2021-05-10 10:08:25 +02:00
Kenneth Gitere
312dff95e2
Merge pull request #12 from kxt/11-image-status-codes
Check response status for fetched images
2021-05-10 10:58:23 +03:00
KOVACS Tamas
8ec491ff06 http.rs: check response status for fetched images
This patch checks if fetching an image resulted in a non-success status
code. In case of non-success status, the response is discarded and an
error is emitted.

This relies on having 3xx codes already handled by surf's Redirect
middleware, so we should see 4xx and 5xx codes here.

Fixes hipstermojo/paperoni#11
2021-05-09 14:35:55 +02:00
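The status check described in this commit amounts to keeping the response body only for success codes. The sketch below shows that idea with plain integers rather than surf's response types; `ImgError` and `check_img_response` are hypothetical names chosen for illustration, and "success" is assumed to mean the 2xx range.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ImgError {
    /// The server answered with a non-success status code.
    BadStatus(u16),
}

/// Keeps the image bytes only for 2xx responses; for any other status
/// the body is discarded and an error carrying the status is returned.
fn check_img_response(status: u16, body: Vec<u8>) -> Result<Vec<u8>, ImgError> {
    if (200..300).contains(&status) {
        Ok(body)
    } else {
        Err(ImgError::BadStatus(status))
    }
}
```

Because redirects are assumed to be followed before this point, as the commit message notes, the error branch would in practice be reached for 4xx and 5xx codes.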
KOVACS Tamas
4581f07330 http.rs: extract process_img_response function 2021-05-08 21:32:15 +02:00
Kenneth Gitere
474d97c6bd
Merge pull request #10 from hipstermojo/dev
v0.4.0 release
2021-04-30 08:48:11 +03:00
Kenneth Gitere
538a65f6fd Update dependencies in lockfile 2021-04-30 08:34:09 +03:00