Add the helper script tests/fetch_facebook_page.sh
author Antonio Ospite <ao2@ao2.it>
Thu, 9 Feb 2017 17:15:54 +0000 (18:15 +0100)
committer Antonio Ospite <ao2@ao2.it>
Thu, 9 Feb 2017 17:24:06 +0000 (18:24 +0100)
The script retrieves the actual HTML of a public page on
facebook.com, retrying whenever the response is a CAPTCHA page.

This makes it possible to keep a local copy of the page to test tweeper on.

tests/fetch_facebook_page.sh [new file with mode: 0755]

diff --git a/tests/fetch_facebook_page.sh b/tests/fetch_facebook_page.sh
new file mode 100755 (executable)
index 0000000..f25966e
--- /dev/null
+++ b/tests/fetch_facebook_page.sh
@@ -0,0 +1,20 @@
+#!/bin/sh
+#
+# Facebook requires a CAPTCHA most of the time, so keep fetching the URL
+# for as long as needed, until the page is served without a CAPTCHA.
+
+set -e
+
+USER_AGENT="Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20130405 Firefox/22.0";
+
+while true;
+do
+  # Force language to en-us to make sure that the string matching works
+  OUTPUT=$(wget -nv --user-agent="$USER_AGENT" --header='Accept-Language: en-us' -O - -- "$1")
+  if ! printf '%s\n' "$OUTPUT" | grep -q "Security Check Required";
+  then
+    echo "$OUTPUT" > facebook.html
+    break
+  fi
+  sleep 5
+done
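
The retry loop above can also be sketched as two small functions, which makes the CAPTCHA check testable without network access. This is only a sketch and not part of the commit: the function names is_captcha_page and fetch_until_clean, the retry limit, and the configurable delay are assumptions introduced here for illustration. The marker string "Security Check Required" is the one the script matches on.

```shell
#!/bin/sh
# Sketch only: function names and the retry limit are hypothetical,
# they do not appear in tests/fetch_facebook_page.sh.

# Succeed (exit 0) when the text on stdin contains the CAPTCHA marker.
is_captcha_page() {
  grep -q "Security Check Required"
}

# Run a fetch command until its output is not a CAPTCHA page.
# usage: fetch_until_clean MAX_TRIES DELAY COMMAND [ARGS...]
# Prints the clean page on stdout; returns 1 if the command fails or
# every attempt returned a CAPTCHA page.
# Note: POSIX sh has no "local", so these variables are global.
fetch_until_clean() {
  max_tries=$1
  delay=$2
  shift 2
  tries=0
  while [ "$tries" -lt "$max_tries" ]; do
    tries=$((tries + 1))
    output=$("$@") || return 1
    if ! printf '%s\n' "$output" | is_captcha_page; then
      printf '%s\n' "$output"
      return 0
    fi
    # Wait before the next attempt, but not after the last one.
    if [ "$tries" -lt "$max_tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

With these in place, the body of the script could be reduced to a single call such as fetch_until_clean 10 5 wget -nv --user-agent="$USER_AGENT" --header='Accept-Language: en-us' -O - -- "$1" > facebook.html, where 10 attempts and a 5 second delay are arbitrary illustrative values. Unlike the original loop, this variant gives up after a fixed number of attempts instead of retrying forever.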