Commit graph

  • ccdbbb5a16 Add initial implementation of grabArticle Kenneth Gitere 2020-10-17 07:16:15 +0300
  • 3254064c0d Fix calls to select to return an iterator excluding the original calling node. Kenneth Gitere 2020-10-16 14:57:26 +0300
  • 6377c01fb3 Add tests for clean_conditionally and fix_lazy_images Kenneth Gitere 2020-10-16 08:03:01 +0300
  • 78d6e16618 Add unit tests for clean, clean_styles, clean_headers and clean_matched_nodes Kenneth Gitere 2020-10-16 07:53:23 +0300
  • b661211f0f Refactored code to use regexes from regexes module Kenneth Gitere 2020-10-15 22:21:21 +0300
  • 75018894ae Add regexes module in moz_readability that contains the regular expressions used. For optimal performance, the regular expressions are compiled to static values to prevent recompiling in loops Kenneth Gitere 2020-10-12 21:33:01 +0300
  • d2bd31dc47 Add helper functions for the grabArticle function Kenneth Gitere 2020-10-07 20:46:08 +0300
  • 87ff21b676 Add regex and lazy_static crates Kenneth Gitere 2020-10-07 20:44:35 +0300
  • 7219198524 Change function signature of next_element to return an Option rather than mutate a given value. Kenneth Gitere 2020-09-23 22:36:01 +0300
  • 7fb09130e8 Add calls to remove_scripts and prep_document Kenneth Gitere 2020-08-31 20:40:37 +0300
  • e1debf5630 Add moz_readability initial code and accompanying unit tests Kenneth Gitere 2020-08-31 19:30:09 +0300
  • a27e45b5f3 Merge branch 'master' into dev Kenneth Gitere 2020-05-16 10:32:54 +0300
  • 5e7cf7ddfe Fixed img resolving bug Kenneth Gitere 2020-05-16 10:22:49 +0300
  • 6dab011cac Fixed img resolving bug Kenneth Gitere 2020-05-16 10:22:49 +0300
  • 9f56c58dd9 Add simple CLI wrapper Kenneth Gitere 2020-05-16 10:09:44 +0300
  • c30d5f732e Fix test data Kenneth Gitere 2020-05-05 12:29:08 +0300
  • 271d3c8951 Change download code to save images to a folder; add downloaded images to the output epub file Kenneth Gitere 2020-05-05 12:24:11 +0300
  • f02973157d Refactor downloading code to download images in parallel Kenneth Gitere 2020-05-05 09:40:44 +0300
  • 4e8812c1ee Add first attempt to save an epub file Kenneth Gitere 2020-05-02 19:25:31 +0300
  • e5a318282d Update img tags with new src values to point to the local files Kenneth Gitere 2020-05-02 19:06:03 +0300
  • 78ba40f57a Add image download functionality Kenneth Gitere 2020-05-02 18:33:45 +0300
  • f24e72e70f Change signature of extract_content to copy the reference to article DOM node instead of writing to file Kenneth Gitere 2020-05-02 14:51:53 +0300
  • 529704d227 Add test for extract content Kenneth Gitere 2020-05-01 20:42:41 +0300
  • b5336e078d Factor out text extraction into extractor module Kenneth Gitere 2020-05-01 16:17:59 +0300
  • 4527fb07d9 Initial extraction code to get meta information on a blog Kenneth Gitere 2020-04-30 11:05:53 +0300
  • 52f272f586 Initial commit Kenneth Gitere 2020-04-30 08:06:07 +0300