Commit graph

125 commits

Author SHA1 Message Date
Kenneth Gitere
754365a42a feat: add inline-toc flag 2021-06-17 17:32:53 +03:00
Kenneth Gitere
c6c10689eb fix: fix broken links in toc generation
the fix involves ensuring the ToC is generated prior to serialization
because it mutates the document and will not work otherwise.

chore: add .vscode config to .gitignore
2021-06-16 18:09:05 +03:00
Kenneth Gitere
282d229754 fix: fix ordering issue with merged articles
This commit adds the itertools crate which is used to dedup the Vec
when downloading urls

fix: fix error message
feat: change the serif and mono fonts declarations
2021-06-11 14:21:41 +03:00
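The commit above says the URL Vec is deduplicated using the itertools crate, without naming the adaptor used. As a rough illustration only, here is a std-only sketch of an order-preserving dedup, which is what keeps merged articles in input order. The function name `dedup_preserving_order` is hypothetical and does not come from the repository.

```rust
use std::collections::HashSet;

/// Hypothetical helper: removes duplicate URLs while keeping the first
/// occurrence of each one in its original position.
fn dedup_preserving_order(urls: Vec<String>) -> Vec<String> {
    let mut seen = HashSet::new();
    urls.into_iter()
        // `insert` returns false when the value was already present,
        // so later duplicates are filtered out.
        .filter(|url| seen.insert(url.clone()))
        .collect()
}
```

Unlike `Vec::dedup`, which only removes consecutive duplicates, this also drops duplicates that are not adjacent, and it does not require sorting the input first.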
Kenneth Gitere
4247fab1ea feat: add css library for EPUB exports 2021-06-09 08:04:50 +03:00
Kenneth Gitere
d50bbdfb58 fix: minor fixes
- restore default debug level when logging to file
- return early from generating epubs if there are no articles
- fix serialization bug in creating attributes
2021-06-09 07:26:52 +03:00
Kenneth Gitere
8691b0166f fix: fix panic when unwrapping a base URI
chore: add message when downloading articles to a specified output-dir
2021-06-08 20:37:20 +03:00
Kenneth Gitere
5fbfb9c806 refactor: move download function to http module
feat: add rendering of table for partial downloads
feat: add help message for enabling --log-to-file
chore: format flags to kebab-case and shorten --output-directory flag
2021-06-08 07:58:52 +03:00
Kenneth Gitere
95bd22f339 Merge branch 'dev' of github.com:hipstermojo/paperoni into dev 2021-06-07 22:44:51 +03:00
Kenneth Gitere
5b41e785b8 Fix get_header_level_toc_vec 2021-06-07 22:42:14 +03:00
Kenneth Gitere
16dc83ac62
Merge pull request from sadsnake42/output-directory
Add `output_dir` to cli argument
2021-06-06 16:01:38 +03:00
Mikhail Gorbachev
67e86e4d74 Refactor LogError 2021-06-06 15:53:47 +03:00
Mikhail Gorbachev
aa9258e122 Fix from PR#15
- refactor comments
- move `cli::Error` to `errors::ErrorCli`
- removed mixing of order of input urls
- move pure functionality of `init_logger` to clear function
2021-06-06 13:25:28 +03:00
Kenneth Gitere
a1156e10fc Add generate_header_ids function
Add h4 to header level ToC and update implementation
Add tests
2021-06-06 13:02:31 +03:00
Kenneth Gitere
8220cf29f7 Change function replace_metadata_value to replace_escaped_characters 2021-06-06 12:59:25 +03:00
Kenneth Gitere
5548ba4ba5 Merge branch 'dev' of github.com:hipstermojo/paperoni into dev 2021-06-06 09:24:17 +03:00
Kenneth Gitere
751b5702fe
Merge pull request from philwrenn/dev
Removed unwrap to prevent unexpected panic.
2021-06-06 09:23:01 +03:00
Philip Wrenn
fd161455b4 Removed unwrap to prevent unexpected panic. 2021-06-05 23:17:55 -04:00
Mikhail Gorbachev
13ad14e73d Add output_dir to cli argument
- Add `output_dir` to cli argument
    - This argument allows you to save output files in a specified folder, not just the current directory
- Refactor 'cli.rs'
    - Add `Builder` for `AppConfig`
    - Add `Error` instead of separate panics
- Upgrade dependencies
2021-06-01 18:18:14 +03:00
Kenneth Gitere
8c9783b596 feat: add header level table of contents for articles 2021-05-24 20:40:41 +03:00
Kenneth Gitere
3a8160412c refactor short_summary function in logs.rs to be less redundant 2021-05-24 20:40:41 +03:00
Kenneth Gitere
1cbbc7527f Update version 2021-05-24 20:33:05 +03:00
Kenneth Gitere
c916fb8493 Edit README 2021-05-13 12:26:23 +03:00
Kenneth Gitere
5ccbe1a17a Merge branch 'dev' of github.com:hipstermojo/paperoni into dev 2021-05-13 12:25:11 +03:00
Kenneth Gitere
102304544d
Merge pull request from kxt/13-fix-lazy-images-laziness-check
Fix laziness check in fix_lazy_images
2021-05-12 07:12:46 +03:00
KOVACS Tamas
7649f6aa18 moz_readability/mod.rs: fix laziness check in fix_lazy_images
fix_lazy_images checks whether an img node is lazily loaded. An img is
considered lazily loaded if it does not have an src/srcset attribute, or
if its class contains the 'lazy' string. If an img is considered lazy,
fix_lazy_images will make attempts to replace its src.

However, if an img was missing the class attribute, it was incorrectly
assumed to be lazy and had its src replaced.

Fixes 
2021-05-10 10:08:33 +02:00
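The laziness rule described in the commit above can be sketched as follows. This is a simplified illustration, not the code in moz_readability/mod.rs: the `Img` struct and `is_lazy` function are hypothetical stand-ins for a real DOM node, and the sketch assumes "does not have an src/srcset attribute" means that both attributes are absent.

```rust
/// Hypothetical, simplified stand-in for the attributes of an <img> node.
struct Img<'a> {
    src: Option<&'a str>,
    srcset: Option<&'a str>,
    class: Option<&'a str>,
}

/// Returns true if the image should be treated as lazily loaded:
/// it has neither `src` nor `srcset`, or its class contains "lazy".
fn is_lazy(img: &Img) -> bool {
    let missing_source = img.src.is_none() && img.srcset.is_none();
    // A missing class attribute means "no lazy marker",
    // not "lazy by default".
    let has_lazy_class = img.class.map_or(false, |class| class.contains("lazy"));
    missing_source || has_lazy_class
}
```

Under these assumptions, an image with a `src` and no `class` attribute is not lazy, so its `src` would be left untouched.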
KOVACS Tamas
d50f08b875 moz_readability/mod.rs: add testcase for issue
This patch adds a testcase for issue , where an img node without
a class attribute is automatically assumed to be lazy and its src is
replaced.
2021-05-10 10:08:25 +02:00
Kenneth Gitere
312dff95e2
Merge pull request from kxt/11-image-status-codes
Check response status for fetched images
2021-05-10 10:58:23 +03:00
KOVACS Tamas
8ec491ff06 http.rs: check response status for fetched images
This patch checks if fetching an image resulted in a non-success status
code. In case of non-success status, the response is discarded and an
error is emitted.

This relies on having 3xx codes already handled by surf's Redirect
middleware, so we should see 4xx and 5xx codes here.

Fixes 
2021-05-09 14:35:55 +02:00
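A minimal sketch of the status check described in the commit above, written against a plain numeric status code rather than surf's response type. The `FetchError` enum and `check_image_response` function are hypothetical names for illustration and are not taken from http.rs.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum FetchError {
    /// The server answered with a non-success status code.
    BadStatus(u16),
}

/// Keeps the image bytes only when the status is in the 2xx range.
/// Redirects (3xx) are assumed to have been followed already, so any
/// other code reaching this point is treated as a failure.
fn check_image_response(status: u16, body: Vec<u8>) -> Result<Vec<u8>, FetchError> {
    if (200..300).contains(&status) {
        Ok(body)
    } else {
        // The body is discarded and an error is reported instead.
        Err(FetchError::BadStatus(status))
    }
}
```

Returning a `Result` lets the caller collect the failed images and report them later instead of silently embedding an error page as image data.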
KOVACS Tamas
4581f07330 http.rs: extract process_img_response function 2021-05-08 21:32:15 +02:00
Kenneth Gitere
474d97c6bd
Merge pull request from hipstermojo/dev
v0.4.0 release
2021-04-30 08:48:11 +03:00
Kenneth Gitere
538a65f6fd Update dependencies in lockfile 2021-04-30 08:34:09 +03:00
Kenneth Gitere
f93017ab73 Fix README formatting 2021-04-30 08:29:08 +03:00
Kenneth Gitere
4fd71311a1 Fix bug when validating the download file name in merged mode 2021-04-30 07:47:25 +03:00
Kenneth Gitere
cae9227ab0 Update documentation 2021-04-30 06:55:02 +03:00
Kenneth Gitere
c00582ac29 Fix verbosity levels ordering 2021-04-30 06:42:08 +03:00
Kenneth Gitere
ae52cc4e13 Add features for logging and cli
- display of partial downloads in the summary
- custom file name that is displayed after the summary ensuring it is visible
- log-to-file flag which specifies that logs will be sent to the default directory
- verbose flag (v) used to configure the log levels
- disabling the progress bars when logging to the terminal is active
2021-04-29 20:02:08 +03:00
Kenneth Gitere
00d704fdd6 Move initializing logger to logs module 2021-04-28 07:47:45 +03:00
Kenneth Gitere
36c3eb65c6 Add appendix page for listing the source of the article 2021-04-28 07:46:07 +03:00
Kenneth Gitere
088699b2c3 Add debug flag 2021-04-24 15:50:43 +03:00
Kenneth Gitere
a9787d7b5a Add colored output and configuring of a paperoni root directory for logs 2021-04-24 15:13:44 +03:00
Kenneth Gitere
65f8ebda56 Add logs crate for dealing with printing out the final download summary 2021-04-24 13:58:03 +03:00
Kenneth Gitere
a3de3fb6ff Add ImgError struct for representing errors in downloading article images 2021-04-24 13:57:06 +03:00
Kenneth Gitere
910c45abf7 Add logging configured to send to a file by default 2021-04-24 13:56:02 +03:00
Kenneth Gitere
c0323a6ae4 Minor refactor and add non zero exit upon failure to download any article
- Move printing of the successfully downloaded articles into main.rs
- Add summary text
2021-04-24 09:00:18 +03:00
Kenneth Gitere
b496abb576 Fix serialization issue with poorly defined attribute names 2021-04-22 19:00:32 +03:00
Kenneth Gitere
313041a109 Update dependencies and restore redirect middleware in download_images 2021-04-22 18:01:23 +03:00
Kenneth Gitere
960f114dc6 Minor fixes in moz_readability
- swap unwrap for if let statement in `get_article_metadata`
- add default when extracting the title from a possible `<title>` element
- fix extracting alternative titles from h1 tags
2021-04-21 19:52:41 +03:00
Kenneth Gitere
dbac7c3b69 Refactor grab_article to return a Result
- Add ReadabilityError field
- Refactor `article` getter in Extractor to return a &NodeRef. This
  relies on the assumption that the article has already been parsed
  and should otherwise panic.
2021-04-21 19:11:57 +03:00
Kenneth Gitere
ae1ddb9386 Add printing of table for failed article downloads
- Map errors in `fetch_html` to include the source url
- Change `article_link` to `article_source`
- Add `Into` conversion for `UTF8Error`
- Collect errors in `generate_epubs` for displaying in a table
2021-04-20 21:33:24 +03:00
Kenneth Gitere
60fb30e8a2 Add url field in Extractor struct 2021-04-20 21:06:54 +03:00