https://github.com/lnenad/newser
(there is a screenshot of the first page of the generated pdf)
It scrapes (news) websites for content and puts it into a PDF. I point the output at my Dropbox Supernote directory and run it daily, so there's a fresh PDF of news whenever I want it.
It's probably rough around the edges (so far I've added crawl support for The Verge, Ars Technica, and Engadget), but I think it's a good base, so if anyone wants to contribute, feel free. Things I'd like to add: pictures (maybe), and possibly parsing the article HTML so font styling and other formatting carry over.
I've tried to generalize it as much as possible, so the crawling is pretty much automatic and is controlled by a config file where you define "rules" for how to parse each website.
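
To give a rough idea of what rule-driven parsing can look like, here's a minimal Python sketch using only the standard library. It is not newser's actual code or config format: the rule fields (`tag`, `class`) and the names `RULES`, `RuleExtractor`, and `extract` are made up for illustration, and it assumes the matched tags are never nested inside themselves.

```python
from html.parser import HTMLParser

# Hypothetical rule set: field name -> which tag/class holds that content.
# These keys are illustrative only, not newser's real config schema.
RULES = {
    "title": {"tag": "h1", "class": "headline"},
    "body": {"tag": "p", "class": "article-text"},
}


class RuleExtractor(HTMLParser):
    """Collects the text of every element that matches one of the rules."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.found = {field: [] for field in rules}
        self._active = None  # field currently being captured, if any
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if self._active is not None:
            return  # already inside a matched element; keep collecting text
        classes = (dict(attrs).get("class") or "").split()
        for field, rule in self.rules.items():
            if tag == rule["tag"] and rule["class"] in classes:
                self._active = field
                self._buf = []
                break

    def handle_data(self, data):
        if self._active is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        # Close the capture when the matched element's own tag ends.
        if self._active is not None and tag == self.rules[self._active]["tag"]:
            self.found[self._active].append("".join(self._buf).strip())
            self._active = None


def extract(html, rules):
    """Apply the rules to an HTML string and return the captured text per field."""
    parser = RuleExtractor(rules)
    parser.feed(html)
    parser.close()
    return parser.found


if __name__ == "__main__":
    page = (
        '<h1 class="headline">Hello</h1>'
        '<p class="article-text">First <a href="#">link</a> para.</p>'
        '<p class="ad">skip</p>'
    )
    print(extract(page, RULES))
```

The point of this shape is that supporting a new site means adding another rule entry rather than writing new parsing code; the extracted text would then be handed to whatever builds the PDF.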