Commit graph

42 commits

Author SHA1 Message Date
Jeff Epler
34dd42b149 accumulated changes 2023-09-06 19:42:58 -05:00
Jeff Epler
534299d422 accumulated changes 2023-09-06 19:42:58 -05:00
Jeff Epler
9c55e80e12 show what feed encountered an error 2023-09-06 19:42:58 -05:00
Jeff Epler
0f3b710ad8 allow rss feeds to come from functions
this allows them to be extracted from non-rss sources
2023-09-06 19:42:58 -05:00
Jeff Epler
a20fbf112d remove debugging prints 2023-09-06 19:42:58 -05:00
Jeff Epler
249da95006 hash long keys
memcached doesn't like long keys, so when they naturally end up
long we'll have to make them unnatural keys instead.
2023-09-06 19:42:58 -05:00
Jeff Epler
f9e7c11bd9 messy wip 2023-09-06 19:42:58 -05:00
Jeff Epler
eaf362d249 Use link as entry id when id not available 2013-12-08 10:15:26 -06:00
Jeff Epler
280352f09d Fix link in entry 2013-12-08 10:15:07 -06:00
Jeff Epler
aa827980c6 Fix authors in entry 2013-12-08 10:15:00 -06:00
Jeff Epler
c5b44bc758 Fix authors in feed 2013-12-08 10:14:49 -06:00
Jeff Epler
967efbb9fe When ID is not available, use Link 2013-12-08 10:14:33 -06:00
Jeff Epler
ada22e5b49 Fix href attribute of link tag 2013-12-08 10:12:16 -06:00
Jeff Epler
8bdb49c437 Assume unicode URLs can be UTF-8 encoded 2013-12-08 10:11:50 -06:00
Jeff Epler
3f5e56c893 don't crash on non-ascii titles in debug log 2013-06-26 09:25:24 -05:00
Jeff Epler
82823e4208 index.html: produce an index of generated feeds 2013-06-26 09:24:59 -05:00
Jeff Epler
c0648de760 write_if_change: don't touch output files if there's no difference 2013-06-26 09:24:28 -05:00
Jeff Epler
25158abe44 get_url: fix cache in the case of redirect
this still is sort of smelly and probably needs a rethink...
2013-06-26 09:24:00 -05:00
Jeff Epler
37246f1438 A very simple 'make install' 2013-04-04 09:53:56 -05:00
Jeff Epler
bdb4591a8f Cache results of bsparse+do_extract
this is the most time-consuming step after fetching, so take
advantage of results from prior runs.  This changes a run where
all web results are time-cached to under 1s per 40 items.
2013-04-04 09:53:43 -05:00
Jeff Epler
8219bd38c6 Include 'published' tag in output entries 2013-04-04 09:51:44 -05:00
Jeff Epler
de75b5ec9e Print some progress messages 2013-04-04 09:51:21 -05:00
Jeff Epler
18f28460cf Properly get feed info in the output
I was getting this from the wrong location before, but protected
by LBYL checks so there was no overt error
2013-04-03 08:59:13 -05:00
Jeff Epler
7bf5807970 Fix a bug in 'shiftylook' feeds
whitespace in the <img src> attribute is weird and doesn't work
when showing this comic in newsblur.
2013-04-03 08:58:34 -05:00
Jeff Epler
baa9095654 note some other stuff that furss does for you 2013-03-26 18:11:04 -05:00
Jeff Epler
ef7fe88c3f note that multiple XPATHs are capable 2013-03-25 19:56:53 -05:00
Jeff Epler
995b6eb5e9 fix whitespace 2013-03-25 19:55:00 -05:00
Jeff Epler
c4056d1284 work in parallel via threads
since things like fetching URLs involves lots of latency, a modest
number of threads is expected to improve throughput.
2013-03-25 19:55:00 -05:00
Jeff Epler
a19d204439 replace output files atomically
.. well on unix anyway
2013-03-25 19:51:59 -05:00
Jeff Epler
7584302574 get rid of debug prints 2013-03-25 19:35:09 -05:00
Jeff Epler
2996ee5b79 can defer checking robots until after expiry match failed 2013-03-25 19:35:06 -05:00
Jeff Epler
93c588c961 Allow an rcfile to specify feeds to fetch and other config items 2013-03-25 19:34:39 -05:00
Jeff Epler
7e24083b51 fix time-based cache 2013-03-25 19:34:35 -05:00
Jeff Epler
8b097f119a mark script as executable 2013-03-25 19:33:50 -05:00
Jeff Epler
fa17930815 use memcache cacher 2013-03-25 19:33:50 -05:00
Jeff Epler
cf535d3a01 implement memcached cacher 2013-03-25 19:33:50 -05:00
Jeff Epler
366f4d8c70 It turns out I need Python 2.7 features 2013-03-25 18:53:00 -05:00
Jeff Epler
bdce5ddbf1 note license in README 2013-03-25 07:22:08 -05:00
Jeff Epler
010f1f6ea7 fix feeds which require the base to be specified 2013-03-25 07:16:55 -05:00
Jeff Epler
3a48072219 initial implementation of furss
it still needs a proper (persistent) cache and a commandline
interface which would make it useful to run from cron.
2013-03-25 07:01:22 -05:00
Jeff Epler
c2659ad211 ignore generated files 2013-03-25 06:55:16 -05:00
Jeff Epler
abc185d32e Describe what I hope to accomplish 2013-03-25 06:55:03 -05:00
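
Commit 249da95006 ("hash long keys") notes that memcached does not accept long keys, so keys that naturally end up long are replaced with "unnatural" ones. The code from that commit is not shown in this log. What follows is only a minimal sketch of the idea in Python 3, assuming memcached's 250-byte key limit. The names cache_key and MAX_KEY_LENGTH, the "sha1:" prefix, and the choice of SHA-1 are hypothetical and chosen for illustration; they are not taken from furss.

```python
import hashlib

# memcached limits keys to 250 bytes (assumed here as the cutoff).
MAX_KEY_LENGTH = 250


def cache_key(natural_key: str, max_length: int = MAX_KEY_LENGTH) -> str:
    """Return natural_key unchanged if it fits, else a fixed-length hashed key.

    Hypothetical helper: short keys stay readable, long keys are replaced
    by a SHA-1 digest so they always fit within the length limit.
    """
    encoded = natural_key.encode("utf-8")
    if len(encoded) <= max_length:
        return natural_key
    # The prefix marks hashed keys so they are easy to tell apart
    # from natural keys when inspecting the cache.
    return "sha1:" + hashlib.sha1(encoded).hexdigest()


if __name__ == "__main__":
    print(cache_key("http://example.com/feed"))
    print(cache_key("http://example.com/" + "a" * 300))
```

The hashed key is deterministic, so the same long URL maps to the same cache entry on every run. The trade-off is that a hashed key can no longer be read back as the original URL.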