chore: update README

chore: bump version
Kenneth Gitere 2021-07-24 13:29:14 +03:00
parent e6f901eb5a
commit 40cf5b06c9
4 changed files with 95 additions and 19 deletions

2
Cargo.lock generated

@ -1552,7 +1552,7 @@ dependencies = [
[[package]] [[package]]
name = "paperoni" name = "paperoni"
version = "0.5.0-alpha1" version = "0.6.0-alpha1"
dependencies = [ dependencies = [
"async-std", "async-std",
"base64", "base64",


@ -3,7 +3,7 @@ description = "A web article downloader"
homepage = "https://github.com/hipstermojo/paperoni" homepage = "https://github.com/hipstermojo/paperoni"
repository = "https://github.com/hipstermojo/paperoni" repository = "https://github.com/hipstermojo/paperoni"
name = "paperoni" name = "paperoni"
version = "0.5.0-alpha1" version = "0.6.0-alpha1"
authors = ["Kenneth Gitere <gitere81@gmail.com>"] authors = ["Kenneth Gitere <gitere81@gmail.com>"]
edition = "2018" edition = "2018"
license = "MIT" license = "MIT"


@ -8,7 +8,7 @@
</a> </a>
</div> </div>
Paperoni is a CLI tool made in Rust for downloading web articles as EPUBs. There is provisional<sup><a href="#pdf-exports">\*</a></sup> support for exporting to PDF as well. Paperoni is a CLI tool made in Rust for downloading web articles as EPUB or HTML files. There is provisional<sup><a href="#pdf-exports">\*</a></sup> support for exporting to PDF as well.
> This project is in an alpha release so it might crash when you use it. Please open an [issue on Github](https://github.com/hipstermojo/paperoni/issues/new) if it does crash. > This project is in an alpha release so it might crash when you use it. Please open an [issue on Github](https://github.com/hipstermojo/paperoni/issues/new) if it does crash.
@ -23,7 +23,7 @@ Check the [releases](https://github.com/hipstermojo/paperoni/releases) page for
Paperoni is published on [crates.io](https://crates.io). If you have [cargo](https://github.com/rust-lang/cargo) installed, then run: Paperoni is published on [crates.io](https://crates.io). If you have [cargo](https://github.com/rust-lang/cargo) installed, then run:
```sh ```sh
cargo install paperoni --version 0.5.0-alpha1 cargo install paperoni --version 0.6.0-alpha1
``` ```
_Paperoni is still in alpha so the `version` flag has to be passed._ _Paperoni is still in alpha so the `version` flag has to be passed._
@ -48,28 +48,43 @@ USAGE:
paperoni [OPTIONS] [urls]... paperoni [OPTIONS] [urls]...
OPTIONS: OPTIONS:
--export <type>
Specify the file type of the export. The type must be in lower case. [default: epub] [possible values:
html, epub]
-f, --file <file> -f, --file <file>
Input file containing links Input file containing links
-h, --help -h, --help
Prints help information Prints help information
--inline-images
Inlines the article images when exporting to HTML using base64.
This is used when you do not want a separate folder created for images during HTML export.
NOTE: It uses base64 encoding on the images which results in larger HTML export sizes as each image
increases in size by about 25%-33%.
--inline-toc --inline-toc
Add an inlined Table of Contents page at the start of the merged article. Add an inlined Table of Contents page at the start of the merged article. This does not affect the Table of Contents navigation
--log-to-file --log-to-file
Enables logging of events to a file located in .paperoni/logs with a default log level of debug. Use -v to Enables logging of events to a file located in .paperoni/logs with a default log level of debug. Use -v to
specify the logging level specify the logging level
--max-conn <max_conn> --max-conn <max-conn>
The maximum number of concurrent HTTP connections when downloading articles. Default is 8. The maximum number of concurrent HTTP connections when downloading articles. Default is 8.
NOTE: It is advised to use as few connections as needed i.e between 1 and 50. Using more connections can end NOTE: It is advised to use as few connections as needed i.e between 1 and 50. Using more connections can end
up overloading your network card with too many concurrent requests. up overloading your network card with too many concurrent requests.
-o, --output-dir <output_directory> --no-css
Directory for saving epub documents Removes the stylesheets used in the EPUB generation.
The EPUB file will then be laid out based on your e-reader's default stylesheets.
--merge <output_name> Images and code blocks may overflow when this flag is set and layout of generated
PDFs will be affected. Use --no-header-css if you want to only disable the styling on headers.
--no-header-css
Removes the header CSS styling but preserves styling of images and codeblocks. To remove all the default
CSS, use --no-css instead.
--merge <output-name>
Merge multiple articles into a single epub that will be given the name provided Merge multiple articles into a single epub that will be given the name provided
-o, --output-dir <output_directory>
Directory to store output epub documents
-V, --version -V, --version
Prints version information Prints version information
@ -112,6 +127,41 @@ These can also be read from a file using the `-f/--file` flag.
paperoni -f links.txt paperoni -f links.txt
``` ```
### Exporting articles
By default, Paperoni exports to EPUB files but you can change to HTML by passing the `--export html` flag.
```sh
paperoni https://en.wikipedia.org/wiki/Pepperoni --export html
```
HTML exports allow you to read the articles as plain HTML documents in your browser, and they can also be converted to PDF as explained [here](#).
When exporting to HTML, Paperoni will download the article's images to a folder with a name similar to the article's. For the command run above, the folder structure would look like this:
```
.
├── Pepperoni - Wikipedia
│ ├── 1a9f886e9b58db72e0003a2cd52681d8.png
│ ├── 216f8a4265a1ceb3f8cfba4c2f9057b1.jpeg
│ ...
└── Pepperoni - Wikipedia.html
```
If you would prefer to have the images inlined directly in the HTML export, pass the `--inline-images` flag:
```sh
paperoni https://en.wikipedia.org/wiki/Pepperoni --export html --inline-images
```
This is especially useful when exporting multiple links.
**NOTE**: The inlining of images for HTML exports uses base64 encoding which is known to increase the overall size of images by about 25% to 33%.
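The size increase in the note above follows from how base64 works: every 3 input bytes become 4 output characters. A minimal sketch of that arithmetic, using 300 zero bytes as a stand-in for an image (the sample data and variable names are assumptions for illustration, not part of Paperoni):

```sh
# Stand-in for an image: 300 zero bytes (hypothetical sample data, not a Paperoni file).
raw_size=300

# Encode the sample and count the output characters.
# Newlines are stripped because some base64 implementations wrap long lines.
encoded_size=$(head -c "$raw_size" /dev/zero | base64 | tr -d '\n' | wc -c | tr -d ' ')

# Percentage growth relative to the raw size (integer arithmetic).
growth=$(( (encoded_size - raw_size) * 100 / raw_size ))

echo "raw: ${raw_size} bytes, base64: ${encoded_size} bytes, growth: ${growth}%"
```

Real images will vary slightly because of padding, but the ratio stays close to 4:3.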
### Disabling CSS
The `--no-css` and `--no-header-css` flags can be used to remove the default styling added by Paperoni. Refer to `--help` for their usage.
### Merging articles ### Merging articles
By default, Paperoni generates an epub file for each link. You can also merge multiple links By default, Paperoni generates an epub file for each link. You can also merge multiple links
@ -153,7 +203,11 @@ There are also web pages it won't work on in general such as Twitter and Reddit
## PDF exports ## PDF exports
As of version 0.5-alpha1, you can now export to PDF using a third party tool. This requires that you install [Calibre](https://calibre-ebook.com/) which comes with a ebook conversion. You can convert the epub to a pdf through the terminal with `ebook-convert`: PDF conversion can be done using a third party tool. There are 2 options to do so:
### EPUB to PDF
This requires that you install [Calibre](https://calibre-ebook.com/), which comes with an ebook conversion tool. You can convert the EPUB to a PDF through the terminal with `ebook-convert`:
```sh ```sh
# Assuming the downloaded epub was called foo.epub # Assuming the downloaded epub was called foo.epub
@ -161,3 +215,25 @@ ebook-convert foo.epub foo.pdf
``` ```
Alternatively, you can use the Calibre GUI to do the file conversion. Alternatively, you can use the Calibre GUI to do the file conversion.
### HTML to PDF
The recommended approach is to use [WeasyPrint](https://weasyprint.org/start/), a free and open-source tool that converts HTML documents to PDF. It is available on Linux, macOS and Windows. Using the CLI, the conversion can be done as follows:
```sh
paperoni https://en.wikipedia.org/wiki/Pepperoni --export html
weasyprint "Pepperoni - Wikipedia.html" Pepperoni.pdf
```
Inlining images is not mandatory as WeasyPrint will be able to find the image files on its own.
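When several articles have been exported to HTML, the same command can be repeated for each file. A minimal sketch, assuming the exports sit in the current directory and `weasyprint` is on the `PATH`; the function names `pdf_name_for` and `convert_all_html` are hypothetical helpers, not part of Paperoni or the `weasyprint` CLI:

```sh
# Derive the PDF file name from an HTML file name (hypothetical helper).
pdf_name_for() {
    printf '%s\n' "${1%.html}.pdf"
}

# Convert every HTML file in the current directory to PDF (hypothetical helper).
convert_all_html() {
    for html in ./*.html; do
        # Skip the unexpanded pattern when no HTML files exist.
        [ -e "$html" ] || continue
        weasyprint "$html" "$(pdf_name_for "$html")"
    done
}
```

Calling `convert_all_html` after the export above would write `Pepperoni - Wikipedia.pdf` next to `Pepperoni - Wikipedia.html`.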
### Comparison of PDF conversion methods
Either of the conversion methods is sufficient for most use cases. The main differences are listed below:
| | EPUB to PDF | HTML to PDF |
|----------------------|----------------------------|------------------|
| Wrapping code blocks | Yes | No |
| CSS customization | No | Yes |
| Generated file size | Slightly larger | Slightly smaller |
The difference in file size is due to the additional fonts added to the PDF file by `ebook-convert`.


@ -49,7 +49,7 @@ args:
long: inline-toc long: inline-toc
requires: output-name requires: output-name
help: Add an inlined Table of Contents page at the start of the merged article. help: Add an inlined Table of Contents page at the start of the merged article.
long_help: Add an inlined Table of Contents page at the start of the merged article. This does not affect the Table of Contents navigation" long_help: Add an inlined Table of Contents page at the start of the merged article. This does not affect the Table of Contents navigation
- no-css: - no-css:
long: no-css long: no-css
conflicts_with: no-header-css conflicts_with: no-header-css