+++ /dev/null
-2013-11-18 12:56:52 +0100 Antonio Ospite <ospite@studenti.unina.it>
-
- * tweeper.1.asciidoc: small fixes to the man page (HEAD, origin/master, master)
-
-2013-11-18 12:12:16 +0100 Antonio Ospite <ospite@studenti.unina.it>
-
- * tweeper.1.asciidoc: add a missing semicolon
-
-2013-11-18 01:01:29 +0100 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add a ChangeLog file (tag: v0.1)
-
-2013-11-18 00:59:58 +0100 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add a NEWS file
-
-2013-11-18 00:59:20 +0100 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add a Makefile rule to generate a Changelog file
-
-2013-11-18 00:43:00 +0100 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add a man page
-
-2013-11-08 16:22:26 +0100 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add a Makefile to simplify installation and packaging
-
-2013-11-08 16:17:48 +0100 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add a wrapper script intended to be called as an executable
-
-2013-11-08 15:01:25 +0100 Antonio Ospite <ospite@studenti.unina.it>
-
- * Write error messages on STDERR and return saner values in CLI mode
-
-2013-11-08 13:09:19 +0100 Antonio Ospite <ospite@studenti.unina.it>
-
- * TODO: add more info about checking UTF output
-
-2013-11-08 10:44:44 +0100 Antonio Ospite <ospite@studenti.unina.it>
-
- * Handle errors and warnings from loadHTML()
-
-2013-10-06 11:01:46 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Show the actual name of the user the tweet comes from
-
-2013-08-12 10:16:27 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Follow HTTP redirects in get_contents() too
-
-2013-08-12 01:25:56 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add some entries to the TODO file
-
-2013-08-12 01:22:35 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Merge branch 'generate-enclosure-element'
-
-2013-08-12 01:16:10 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Cosmetics: re-indent cURL options to follow the coding style (generate-enclosure-element)
-
-2013-08-12 01:13:56 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Use cURL for Tweeper::get_contents() too
-
-2013-08-12 01:06:23 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Remove double semicolon in Tweeper::get_info()
-
-2013-08-11 21:23:42 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Make get_url_info() and generate_enclosure() static methods
-
-2013-08-11 21:15:41 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Turn epoch_to_gmdate() and str_to_gmdate() into static methods
-
-2013-08-11 21:11:03 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Make get_contents() a static method
-
-2013-08-11 20:57:02 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Cosmetics: sort supported_content_types, remove unneeded spaces
-
-2013-08-11 20:52:47 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Use an array to list supported content types for enclosures
-
-2013-08-11 20:44:37 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Make it optional to generate the <enclosure/> element
-
-2013-08-11 20:27:36 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Use getopt() to parse command line options
-
-2013-08-11 20:08:37 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Split parsing CLI options from parsing QUERY_STRING ones
-
-2013-08-11 13:43:05 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Use templates to generate enclosures
-
-2013-08-11 12:48:21 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Merge https://github.com/grote/Tweeper into generate-encolure-elements
-
-2013-08-11 12:43:42 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Fix a typo: s/tweeter/Twitter/
-
-2013-08-04 23:22:02 +0200 Torsten Grote <t@grobox.de>
-
- * only enclosify certain mimetypes, use same user agent
-
-2013-08-04 22:00:51 +0200 Torsten Grote <t@grobox.de>
-
- * add initial support for enclosures
-
-2013-08-03 20:56:55 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Fix a typo in an error message
-
-2013-07-28 22:34:06 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add an RSS conversion stylesheet for dilbert.com
-
-2013-07-28 22:30:26 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * TODO: mention the <ttl/> RSS element
-
-2013-07-28 22:28:55 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * rss_converter_twitter.com.xsl: use concat() more
-
-2013-07-27 17:14:07 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add an example with identi.ca
-
-2013-07-27 17:05:03 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Mention in the README that other sites can be converted to RSS (local-ao2)
-
-2013-07-27 16:51:38 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add initial support for scraping Pump.io activity streams
-
-2013-07-27 16:46:23 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Change mode of tweeper.php
-
-2013-07-27 16:45:47 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add -h and --help options
-
-2013-07-27 16:38:46 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add another date conversion routine
-
-2013-07-27 16:36:36 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Update the documentation to use URLs as arguments
-
-2013-07-27 16:35:47 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Mention http://rssitfor.me as an alternative service
-
-2013-07-27 16:04:41 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Use __DIR__ when building the stylesheet path name
-
-2013-07-27 16:01:36 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Rename formatDate() function to epoch_to_gmdate()
-
-2013-07-27 13:31:59 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Be more verbose in error messages
-
-2013-07-27 13:24:44 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Make stylesheet file name parametric
-
-2013-07-27 13:09:08 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Change of behavior| Now a URL is required as an argument
-
-2013-07-27 12:49:21 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Factor out a usage() function
-
-2013-07-27 12:43:16 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Use php_sapi_name() to check for CLI interface
-
-2013-07-07 15:34:21 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Fix a typo
-
-2013-07-07 15:33:26 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Add more info about how to call Tweeper from command line
-
-2013-07-07 01:22:47 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Embed the full HTML content of the tweet in the description field
-
-2013-07-06 23:06:12 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Format dates using an external php function
-
-2013-07-06 21:51:53 +0200 Antonio Ospite <ospite@studenti.unina.it>
-
- * Initial import
clean:
rm -f tweeper.1
-changelog:
- git log --pretty="format:%ai %aN <%aE>%n%n%x09* %s%d%n" > ChangeLog
-
docs:
a2x -f manpage tweeper.1.asciidoc
install -m755 tweeper $(DESTDIR)$(TWEEPER_DIR)
install -d $(DESTDIR)$(BIN_DIR)
ln -sf $(TWEEPER_DIR)/tweeper $(DESTDIR)$(BIN_DIR)/tweeper
- @echo -e "\n\nINTALLATION COMPLETE"
+ @echo -e "\n\nINSTALLATION COMPLETE"
@echo -e "Make sure '$(PHP_SCRIPT_DIR)' is in PHP include_path!\n"
+News for v0.3:
+==============
+
+ * Support generating enclosures for "audio/ogg" links
+ * Always specify xml:base to improve the expansion of local URLs in some cases
+ * Support both the classic and the new Twitter profile pages
+ * Fix getting the profile picture of Twitter users
+ * Add support for Howtoons.com
+
News for v0.2:
==============
<!--
Stylesheet to convert Dilbert daily strips to RSS.
- Copyright (C) 2013 Antonio Ospite <ospite@studenti.unina.it>
+ Copyright (C) 2013-2014 Antonio Ospite <ao2@ao2.it>
This file is part of tweeper.
<xsl:template match="/">
<rss version="2.0">
+ <xsl:attribute name="xml:base"><xsl:value-of select="$BaseURL" /></xsl:attribute>
<channel>
<generator>Tweeper</generator>
<title>
--- /dev/null
+<!--
+ Stylesheet to convert Howtoons.com to RSS.
+
+ Copyright (C) 2014 Antonio Ospite <ao2@ao2.it>
+
+ This file is part of tweeper.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+-->
+
+<!--
+ The RSS feed link is broken on http://howtoons.com so just work around it.
+
+ Howtoons uses WordPress, so maybe this style sheet can be used as a base for
+ scraping other WordPress sites.
+-->
+
+<xsl:stylesheet version="1.0"
+ xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
+ xmlns:php="http://php.net/xsl"
+ xsl:extension-element-prefixes="php">
+
+ <xsl:output method="xml" indent="yes"/>
+
+ <xsl:variable name="BaseURL">
+ <xsl:text>http://howtoons.com</xsl:text>
+ </xsl:variable>
+
+ <xsl:template match="//div[contains(@id, 'post-')]">
+ <item>
+ <title>
+ <xsl:value-of select="normalize-space(.//div[@class='post-headline']//a)"/>
+ </title>
+ <link>
+ <xsl:value-of select=".//div[@class='post-headline']//a/@href"/>
+ </link>
+ <pubDate>
+ <xsl:variable name="date" select="substring-before(.//div[@class='post-byline'], ',')"/>
+ <!-- date format is MM.DD.YY -->
+ <xsl:variable name="month" select="substring($date, 1, 2)"/>
+ <xsl:variable name="day" select="substring($date, 4, 2)"/>
+ <xsl:variable name="year" select="substring($date, 7, 2)"/>
+ <xsl:variable name="iso-date" select="concat('20', $year, '-', $month, '-', $day)"/>
+ <xsl:value-of select="php:functionString('Tweeper::str_to_gmdate', $iso-date)"/>
+ </pubDate>
+ <description>
+ <xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
+ <xsl:copy-of select=".//div[contains(@class, 'post-bodycopy')]/p"/>
+ <xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
+ </description>
+ </item>
+ </xsl:template>
+
+ <xsl:template match="/">
+
+ <rss version="2.0">
+ <xsl:attribute name="xml:base"><xsl:value-of select="$BaseURL" /></xsl:attribute>
+ <channel>
+ <generator>Tweeper</generator>
+ <title>
+ <xsl:value-of select="//title"/>
+ </title>
+ <link>
+ <xsl:value-of select="$BaseURL"/>
+ </link>
+ <description>
+ <xsl:text>The world's greatest D.I.Y. comic website! Tools of mass construction!</xsl:text>
+ </description>
+ <image>
+ <url>
+ <xsl:text>http://www.howtoons.com/wp-content/themes/atahualpa/images/header/tuck1000.png</xsl:text>
+ </url>
+ </image>
+ <xsl:apply-templates select="//div[contains(@id, 'post-')]"/>
+ </channel>
+ </rss>
+ </xsl:template>
+</xsl:stylesheet>
<!--
Stylesheet to convert Pump.io activity streams to RSS.
- Copyright (C) 2013 Antonio Ospite <ospite@studenti.unina.it>
+ Copyright (C) 2013-2014 Antonio Ospite <ao2@ao2.it>
This file is part of tweeper.
<xsl:output method="xml" indent="yes"/>
+ <xsl:variable name="domain-name" select="substring-after(//div[@id='profile-block']/@data-profile-id, '@')"/>
+ <xsl:variable name="BaseURL" select="concat('https://', $domain-name)"/>
+
<xsl:variable name="user-name" select="substring-after(//div[@id='profile-block']/@data-profile-id, ':')"/>
<xsl:template match="//div[@id='user-content-activities']//ul[@id='major-stream']/li">
<xsl:template match="/">
<rss version="2.0">
+ <xsl:attribute name="xml:base"><xsl:value-of select="$BaseURL" /></xsl:attribute>
<channel>
<generator>Tweeper</generator>
<title>
<!--
Stylesheet to convert Twitter user timelines to RSS.
- Copyright (C) 2013 Antonio Ospite <ospite@studenti.unina.it>
+ Copyright (C) 2013-2014 Antonio Ospite <ao2@ao2.it>
This file is part of tweeper.
<xsl:value-of disable-output-escaping="yes" select="php:function('Tweeper::generate_enclosure', string(./@data-expanded-url))"/>
</xsl:template>
- <xsl:variable name="screen-name" select="//div[@class='profile-card-inner']/@data-screen-name"/>
+ <xsl:variable name="screen-name" select="//div[@class='user-actions btn-group not-following ']/@data-screen-name"/>
- <xsl:template match="//div[@id='timeline']//ol[@id='stream-items-id']//li[@data-item-type='tweet']">
- <xsl:variable name="user-name" select=".//span[@class='username js-action-profile-name']/b"/>
- <xsl:variable name="tweet-text" select=".//p[@class='js-tweet-text tweet-text']"/>
+ <xsl:template match="//*[@data-item-type='tweet']">
+ <xsl:variable name="user-name" select=".//div[contains(@class, 'js-stream-tweet')]/@data-screen-name"/>
+ <xsl:variable name="tweet-text" select=".//p[contains(@class, 'js-tweet-text')]"/>
<item>
<title>
<xsl:value-of select="concat($user-name, ': ', $tweet-text)"/>
</title>
<link>
- <xsl:value-of select="concat($twitterBaseURL, .//a[@class='details with-icn js-details']/@href)"/>
+ <xsl:value-of select="concat($twitterBaseURL, .//a[contains(@class, 'js-permalink')]/@href)"/>
</link>
<pubDate>
- <xsl:value-of select="php:functionString('Tweeper::epoch_to_gmdate', .//small[@class='time']//span/@data-time)"/>
+ <xsl:variable name="timestamp" select=".//span[contains(@class, 'js-short-timestamp')]/@data-time"/>
+ <xsl:value-of select="php:functionString('Tweeper::epoch_to_gmdate', number($timestamp))"/>
</pubDate>
<description>
<xsl:value-of select="concat($user-name, ': ')"/>
<xsl:template match="/">
<rss version="2.0">
+ <xsl:attribute name="xml:base"><xsl:value-of select="$twitterBaseURL" /></xsl:attribute>
<channel>
<generator>Tweeper</generator>
<title>
</description>
<image>
<url>
- <xsl:value-of select="//a[@class='profile-picture media-thumbnail']/@href"/>
+ <xsl:value-of select="//a[contains(@class, 'profile-picture media-thumbnail')]/@href"/>
</url>
</image>
- <xsl:apply-templates select="//div[@id='timeline']//ol[@id='stream-items-id']//li[@data-item-type='tweet']"/>
+ <xsl:apply-templates select="//*[@data-item-type='tweet']"/>
</channel>
</rss>
</xsl:template>
tweeper can be used as:
1. a command line tool;
-2. as a filter for feed readers;
-3. as a web based tool when PHP support is available in the web server.
+2. a filter for feed readers;
+3. a web-based tool when PHP support is available in the web server.
OPTIONS
liferea-add-feed "|tweeper http://twitter.com/NSAcareers"
+Using tweeper via the web (the symlink needs to be created only once):
+
+ sudo ln -s /usr/share/php/tweeper/tweeper.php /var/www
+ xdg-open "http://localhost/tweeper.php?src_url=http://twitter.com/NSAcareers"
+
+
+NOTES
+-----
+
+In order to use tweeper through a symlink with the Apache 'userdir' module, the
+'SymLinksIfOwnerMatch' option must be replaced by 'FollowSymLinks' in
+/etc/apache2/mods-enabled/userdir.conf
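+
+A minimal sketch of the edited stanza, assuming the stock Debian layout of
+userdir.conf (the directory path and the other options listed here are
+assumptions and may differ on your system):
+
+  <Directory /home/*/public_html>
+      Options MultiViews Indexes FollowSymLinks IncludesNoExec
+  </Directory>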
+
EXIT STATUS
-----------
COPYING
-------
-Copyright \(C) 2013 Antonio Ospite <ospite@studenti.unina.it>
+Copyright \(C) 2013-2014 Antonio Ospite <ao2@ao2.it>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
/*
* tweeper - a Twitter to RSS web scraper
*
- * Copyright (C) 2013 Antonio Ospite <ospite@studenti.unina.it>
+ * Copyright (C) 2013-2014 Antonio Ospite <ao2@ao2.it>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
"audio/aac",
"audio/mp4",
"audio/mpeg",
+ "audio/ogg",
"audio/vorbis",
"audio/wav",
"audio/webm",