Scraping Source in Safari

Here is an AppleScript solution for grabbing the HTML source of a page online. This is particularly handy if you are trying to grab the code from a page that you need to log in to, since Safari is already holding your session. I am sure there is a much better solution out there, but this one works fine for me.

CODE:
-- Define where to save the source and which URL to load
set pageFile to "/Users/yourUserNameHere/Desktop/safariSource.html"
set pageUrl to "http://www.plasticstare.com/"

-- Open the page in a new Safari document
tell application "Safari"
   activate
   make new document at end of documents
   set URL of document 1 to pageUrl
end tell

-- Check if the page has loaded: poll Safari's status text until it
-- no longer begins with "Contacting" or "Loading"
-- (assumes the status bar is shown and UI scripting access is enabled)
repeat
   delay 0.5
   tell application "System Events" to tell application process "Safari"
      set statusText to (name of static text 1 of group 1 of window 1) as text
   end tell
   if not (statusText begins with "Contacting" or statusText begins with "Loading") then exit repeat
end repeat

-- Grab the page source from Safari
tell application "Safari"
   set siteSource to the source of document 1 as text
end tell

-- Write the source to the file, replacing any previous contents
-- (done outside the Safari tell block so the file commands are not sent to Safari)
set theFile to open for access (POSIX file pageFile) with write permission
set eof of theFile to 0
write siteSource to theFile
close access theFile
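
The status-text check above depends on how Safari lays out its window, so it may stop working if that layout changes. As a rough alternative sketch, Safari's `do JavaScript` command can ask the page itself whether it has finished loading. The handler name `waitForPageLoad` and its `timeoutSeconds` parameter are made up for illustration, and this assumes "Allow JavaScript from Apple Events" is turned on in Safari's Develop menu:

```applescript
-- Hypothetical helper (not part of the original script): polls the front
-- Safari document until document.readyState is "complete".
-- Assumes "Allow JavaScript from Apple Events" is enabled in Safari.
-- Returns true once the page reports it is loaded, false on timeout.
on waitForPageLoad(timeoutSeconds)
   set elapsed to 0
   repeat while elapsed < timeoutSeconds
      delay 0.5
      set elapsed to elapsed + 0.5
      tell application "Safari"
         set readyState to do JavaScript "document.readyState" in document 1
      end tell
      if readyState is "complete" then return true
   end repeat
   return false
end waitForPageLoad
```

To try it, call `waitForPageLoad(30)` right after setting the URL, in place of the repeat loop, and only read the source if it returns true. Adding the timeout is my own design choice here so that a page that never finishes loading cannot hang the script forever.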
