chore: update README

chore: bump version
2021-07-24 13:29:14 +03:00 · 2021-07-24 13:29:14 +03:00 · 40cf5b06c9
commit 40cf5b06c9
parent e6f901eb5a
4 changed files with 95 additions and 19 deletions
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -1552,7 +1552,7 @@ dependencies = [

 [[package]]
 name = "paperoni"
-version = "0.5.0-alpha1"
+version = "0.6.0-alpha1"
 dependencies = [
 "async-std",
 "base64",
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -3,7 +3,7 @@ description = "A web article downloader"
 homepage = "https://github.com/hipstermojo/paperoni"
 repository = "https://github.com/hipstermojo/paperoni"
 name = "paperoni"
-version = "0.5.0-alpha1"
+version = "0.6.0-alpha1"
 authors = ["Kenneth Gitere <gitere81@gmail.com>"]
 edition = "2018"
 license = "MIT"
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@
    </a>
 </div>

-Paperoni is a CLI tool made in Rust for downloading web articles as EPUBs. There is provisional<sup><a href="#pdf-exports">\*</a></sup> support for exporting to PDF as well.
+Paperoni is a CLI tool made in Rust for downloading web articles as EPUB or HTML files. There is provisional<sup><a href="#pdf-exports">\*</a></sup> support for exporting to PDF as well.

 > This project is in an alpha release so it might crash when you use it. Please open an [issue on Github](https://github.com/hipstermojo/paperoni/issues/new) if it does crash.

@@ -23,7 +23,7 @@ Check the [releases](https://github.com/hipstermojo/paperoni/releases) page for
 Paperoni is published on [crates.io](https://crates.io). If you have [cargo](https://github.com/rust-lang/cargo) installed, then run:

 ```sh
-cargo install paperoni --version 0.5.0-alpha1
+cargo install paperoni --version 0.6.0-alpha1
 ```

 _Paperoni is still in alpha so the `version` flag has to be passed._
@@ -48,28 +48,43 @@ USAGE:
    paperoni [OPTIONS] [urls]...

 OPTIONS:
+        --export <type>
+            Specify the file type of the export. The type must be in lower case. [default: epub]  [possible values:
+            html, epub]
    -f, --file <file>
            Input file containing links

    -h, --help
            Prints help information

+        --inline-images
+            Inlines the article images when exporting to HTML using base64.
+            This is used when you do not want a separate folder created for images during HTML export.
+            NOTE: It uses base64 encoding on the images which results in larger HTML export sizes as each image
+            increases in size by about 25%-33%.
        --inline-toc
-            Add an inlined Table of Contents page at the start of the merged article.
-
+            Add an inlined Table of Contents page at the start of the merged article. This does not affect the Table of Contents navigation
        --log-to-file
            Enables logging of events to a file located in .paperoni/logs with a default log level of debug. Use -v to
            specify the logging level
-        --max-conn <max_conn>
+        --max-conn <max-conn>
            The maximum number of concurrent HTTP connections when downloading articles. Default is 8.
            NOTE: It is advised to use as few connections as needed i.e between 1 and 50. Using more connections can end
            up overloading your network card with too many concurrent requests.
-    -o, --output-dir <output_directory>
-            Directory for saving epub documents
-
-        --merge <output_name>
+        --no-css
+            Removes the stylesheets used in the EPUB generation.
+            The EPUB file will then be laid out based on your e-reader's default stylesheets.
+            Images and code blocks may overflow when this flag is set and layout of generated
+            PDFs will be affected. Use --no-header-css if you want to only disable the styling on headers.
+        --no-header-css
+            Removes the header CSS styling but preserves styling of images and codeblocks. To remove all the default
+            CSS, use --no-css instead.
+        --merge <output-name>
            Merge multiple articles into a single epub that will be given the name provided

+    -o, --output-dir <output_directory>
+            Directory to store output epub documents
+
    -V, --version
            Prints version information

@@ -112,6 +127,41 @@ These can also be read from a file using the `-f/--file` flag.
 paperoni -f links.txt
 ```

+### Exporting articles
+
+By default, Paperoni exports to EPUB files but you can change to HTML by passing the `--export html` flag.
+
+```sh
+paperoni https://en.wikipedia.org/wiki/Pepperoni --export html
+```
+
+HTML exports let you read the articles as plain HTML documents in your browser, and they can also be converted to PDF as explained [here](#).
+
+When exporting to HTML, Paperoni downloads the article's images to a folder named after the article. The folder structure for the command run above would therefore look like this:
+
+```
+.
+├── Pepperoni - Wikipedia
+│   ├── 1a9f886e9b58db72e0003a2cd52681d8.png
+│   ├── 216f8a4265a1ceb3f8cfba4c2f9057b1.jpeg
+│   ...
+└── Pepperoni - Wikipedia.html
+```
+
+If you would prefer to have the images inlined directly in the HTML export, pass the `--inline-images` flag:
+
+```sh
+paperoni https://en.wikipedia.org/wiki/Pepperoni --export html --inline-images
+```
+
+This is especially useful when exporting multiple links.
+
+**NOTE**: The inlining of images for HTML exports uses base64 encoding which is known to increase the overall size of images by about 25% to 33%.
+
+### Disabling CSS
+
+The `--no-css` and `--no-header-css` flags can be used to remove the default styling added by Paperoni. Refer to `--help` to see the usage of the flags.
+
 ### Merging articles

 By default, Paperoni generates an epub file for each link. You can also merge multiple links
@@ -153,7 +203,11 @@ There are also web pages it won't work on in general such as Twitter and Reddit

 ## PDF exports

-As of version 0.5-alpha1, you can now export to PDF using a third party tool. This requires that you install [Calibre](https://calibre-ebook.com/) which comes with a ebook conversion. You can convert the epub to a pdf through the terminal with `ebook-convert`:
+PDF conversion can be done using a third-party tool. There are two ways to do so:
+
+### EPUB to PDF
+
+This requires that you install [Calibre](https://calibre-ebook.com/), which comes with an ebook conversion tool. You can convert the EPUB to a PDF from the terminal with `ebook-convert`:

 ```sh
 # Assuming the downloaded epub was called foo.epub
@@ -161,3 +215,25 @@ ebook-convert foo.epub foo.pdf
 ```

 Alternatively, you can use the Calibre GUI to do the file conversion.
+
+### HTML to PDF
+
+The recommended approach is to use [WeasyPrint](https://weasyprint.org/start/), a free and open-source tool that converts HTML documents to PDF. It is available on Linux, macOS and Windows. Using the CLI, the conversion can be done as follows:
+
+```sh
+paperoni https://en.wikipedia.org/wiki/Pepperoni --export html
+weasyprint "Pepperoni - Wikipedia.html" Pepperoni.pdf
+```
+
+Inlining images is not mandatory, as WeasyPrint can find the image files on its own.
+
+### Comparison of PDF conversion methods
+
+Either of the conversion methods is sufficient for most use cases. The main differences are listed below:
+| | EPUB to PDF | HTML to PDF |
+|----------------------|----------------------------|------------------|
+| Wrapping code blocks | Yes | No |
+| CSS customization | No | Yes |
+| Generated file size | Slightly larger | Slightly smaller |
+
+The difference in file size is due to the additional fonts added to the PDF file by `ebook-convert`.
--- a/src/cli_config.yml
+++ b/src/cli_config.yml
@@ -49,7 +49,7 @@ args:
      long: inline-toc
      requires: output-name
      help: Add an inlined Table of Contents page at the start of the merged article.
-      long_help: Add an inlined Table of Contents page at the start of the merged article. This does not affect the Table of Contents navigation"
+      long_help: Add an inlined Table of Contents page at the start of the merged article. This does not affect the Table of Contents navigation
  - no-css:
      long: no-css
      conflicts_with: no-header-css