Paperoni is a CLI tool made in Rust for downloading web articles as EPUB or HTML files. There is provisional<sup><a href="#pdf-exports">\*</a></sup> support for exporting to PDF as well.
> This project is in an alpha release, so it might crash when you use it. Please open an [issue on GitHub](https://github.com/hipstermojo/paperoni/issues/new) if it does crash.
## Installation
### Precompiled binaries
Check the [releases](https://github.com/hipstermojo/paperoni/releases) page for precompiled binaries. Currently there are only builds for Debian and Arch.
### Installing from crates.io
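Paperoni is published on [crates.io](https://crates.io). If you have [cargo](https://github.com/rust-lang/cargo) installed, you can install it with `cargo install`. Because the project is still in alpha, you may need to specify the pre-release version explicitly with `--version`.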
## Usage
By default, Paperoni exports to EPUB files, but you can switch to HTML by passing the `--export html` flag.
```sh
paperoni https://en.wikipedia.org/wiki/Pepperoni --export html
```
HTML exports let you read the articles as plain HTML documents in your browser, and they can also be converted to PDF as explained [here](#pdf-exports).
When exporting to HTML, Paperoni downloads the article's images to a folder named after the article. For the command run above, the folder structure would look like this:
```
.
├── Pepperoni - Wikipedia
│ ├── 1a9f886e9b58db72e0003a2cd52681d8.png
│ ├── 216f8a4265a1ceb3f8cfba4c2f9057b1.jpeg
│ ...
└── Pepperoni - Wikipedia.html
```
If you would prefer to have the images inlined directly into the HTML export, pass the `--inline-images` flag, i.e.:
```sh
paperoni https://en.wikipedia.org/wiki/Pepperoni --export html --inline-images
```
This is especially useful when exporting multiple links.
**NOTE**: Inlining images in HTML exports uses Base64 encoding, which increases the overall size of the images by roughly a third, since every 3 bytes of image data become 4 characters of text.
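To see where this overhead comes from, note that Base64 maps every 3 bytes of input to 4 ASCII characters. The sketch below measures this with the standard `base64` utility on 3000 bytes of dummy data. The dummy data is only a stand-in for an image file and is not something Paperoni produces.

```sh
# 3000 bytes of dummy data stand in for an image file (assumption for illustration).
raw_bytes=3000

# Encode the data, strip the line breaks some base64 implementations add,
# and count the resulting characters.
encoded_bytes=$(head -c "$raw_bytes" /dev/zero | base64 | tr -d '\n' | wc -c | tr -d ' ')

# Integer percentage by which the encoded form is larger than the raw data.
overhead=$(( (encoded_bytes - raw_bytes) * 100 / raw_bytes ))

echo "raw: $raw_bytes bytes, base64: $encoded_bytes bytes, overhead: ${overhead}%"
# prints "raw: 3000 bytes, base64: 4000 bytes, overhead: 33%"
```

If the encoded text keeps its line breaks, the overhead is slightly higher than this figure.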
### Disabling CSS
The `--no-css` and `--no-header-css` flags can be used to remove the default styling added by Paperoni. Refer to `--help` for details on how to use them.
### Logging
Logging is disabled by default. It can be enabled with either the `-v` flag or the `--log-to-file` flag. If the `--log-to-file` flag is passed, the logs are written to a file in the default Paperoni directory, `.paperoni/logs`, which is in your home directory. The `-v` flag sets the verbosity level as follows:
```
-v Logs only the error level
-vv Logs only the warn level
-vvv Logs only the info level
-vvvv Logs only the debug level
```
If only the `-v` flag is passed, the progress bars are disabled. If both `-v` and `--log-to-file` are passed then the progress bars will still be shown.
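As a rough illustration of combining the two flags described above, the sketch below requests the warn level and sends the logs to a file, which should leave the progress bars visible. The `url` and `log_dir` variables are hypothetical names introduced here, and the log directory is taken from the description above rather than checked against the tool.

```sh
# Hypothetical variables for illustration.
url="https://en.wikipedia.org/wiki/Pepperoni"
log_dir="$HOME/.paperoni/logs"

if command -v paperoni >/dev/null 2>&1; then
  # -vv selects the warn level; --log-to-file writes the logs to a file,
  # so the progress bars should still be shown in the terminal.
  paperoni "$url" -vv --log-to-file
  # List whatever log files have been written so far.
  ls "$log_dir"
else
  echo "paperoni is not installed; skipping" >&2
fi
```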
## How it works
Paperoni retrieves a possible article from the page using a [custom port](https://github.com/hipstermojo/paperoni/blob/master/src/moz_readability/mod.rs) of the [Mozilla Readability algorithm](https://github.com/mozilla/readability). The extracted article is then saved as an EPUB by default.
## PDF exports
PDF conversion can be done using a third-party tool. There are two options:
### EPUB to PDF
This requires that you install [Calibre](https://calibre-ebook.com/), which comes with an ebook conversion tool. You can convert the EPUB to a PDF from the terminal with `ebook-convert`, which takes the input file followed by the output file.
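A minimal sketch of that conversion is shown below. It assumes the EPUB is named `Pepperoni - Wikipedia.epub`, by analogy with the HTML export shown earlier; the actual file name may differ. The `input` and `output` variables are hypothetical names introduced for illustration.

```sh
# Assumed name of the EPUB that Paperoni produced for the example article.
input="Pepperoni - Wikipedia.epub"
# Derive the PDF name by swapping the file extension.
output="${input%.epub}.pdf"

# ebook-convert takes the input file followed by the output file.
if command -v ebook-convert >/dev/null 2>&1 && [ -f "$input" ]; then
  ebook-convert "$input" "$output"
else
  echo "ebook-convert or \"$input\" not found; skipping conversion" >&2
fi
```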
### HTML to PDF
The recommended approach is to use [WeasyPrint](https://weasyprint.org/start/), a free and open-source tool that converts HTML documents to PDF. It is available on Linux, macOS and Windows. Using the CLI, it can be done as follows:
```sh
paperoni https://en.wikipedia.org/wiki/Pepperoni --export html