Jun 29 2009

Fun with other people's websites (aka Baconification of the web)

Posted by Joe Steinbring at 12:01 AM
- Categories: ColdFusion

I recently needed to fetch webpages that are not on my server and redisplay them on one of my sites, on demand. With cfhttp, that's an easy task, but if the site you're redisplaying uses relative links, it won't look right: the browser resolves those links against your domain instead of the original one, so images, stylesheets, and links break. So, how do you clean up the page before redisplaying it?

The way I handled it was to use a series of replaces. You can check out the full CFC here:

 

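<cffunction name="domainfromurl" access="public" returntype="string">
   <cfargument name="url" type="string" required="yes">
   <!--- var-scope the locals so concurrent requests cannot overwrite each other --->
   <cfset var start = find("//", arguments.url)>
   <cfset var slashPos = 0>
   <!--- The host starts right after "//", or at position 1 when there is no scheme --->
   <cfif start gt 0>
      <cfset start = start + 2>
   <cfelse>
      <cfset start = 1>
   </cfif>
   <cfset slashPos = find("/", arguments.url, start)>
   <!--- With no slash after the host, the rest of the string is the host --->
   <cfif slashPos eq 0>
      <cfset slashPos = len(arguments.url) + 1>
   </cfif>
   <!--- Nothing left after the scheme means there is no host to return --->
   <cfif slashPos lte start>
      <cfreturn "">
   </cfif>
   <cfreturn mid(arguments.url, start, slashPos - start)>
</cffunction>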

<cffunction name="getHTML" access="public" returntype="string">
   <cfargument name="url" type="string" required="yes">
   <!--- Keep the response in a local variable instead of the shared cfhttp variable --->
   <cfset var response = "">
   <cfhttp url="#arguments.url#" timeout="5" result="response" useragent="proxy.cfc: A proxy server CFC by Joe Steinbring (http://steinbring.net)">
   <cfreturn response.FileContent>
</cffunction>

<cffunction name="fixHTML" access="public" returntype="string">
<cfargument name="html" type="string" required="yes">
<cfargument name="url" type="string" required="yes">
<cfargument name="proxyify" type="string" required="yes">

<cfinvoke
component = "proxy"
method = "domainfromurl"
returnVariable = "variables.domain">

<cfinvokeargument name="url" value="#arguments.url#">
</cfinvoke>

<!---
When "proxyify" is anything other than 'no', links to pages on the original
server are rewritten to point back at the proxy, with the value of "proxyify"
used as the name of the URL parameter that carries the target address.
--->


<cfif #arguments.proxyify# neq 'no'>

<cfset replacewith = [
'a href="?#arguments.proxyify#=http',
'a href="?#arguments.proxyify#=http://#variables.domain#/',
'a onClick="javascript:" href="?#arguments.proxyify#='
] />

<cfset whattolookfor = [
'a href="http',
'a href="/',
'a onClick="javascript:" href="'
] />


<cfloop from="1" to="#arraylen(variables.whattolookfor)#" index="loop_index">
<cfset arguments.html = replacenocase(arguments.html, variables.whattolookfor[loop_index], variables.replacewith[loop_index], 'all')>
</cfloop>

</cfif>


<cfset replacewith = [
'src="http://#variables.domain#/',
'src="http://#variables.domain#/images/',
'href="http://#variables.domain#/',
'href="http://#variables.domain#/css/',
'action="http://#variables.domain#/',
': url(http://#variables.domain#/',
'@import "http://#variables.domain#/',
' href="http://#variables.domain#/style.css',
'img src="http://#variables.domain#/logo.'
] />

<cfset whattolookfor = [
'src="/',
'src="images/',
'href="/',
'href="css/',
'action="/',
': url(/',
'@import "/',
' href="style.css',
'img src="logo.'
] />


<cfloop from="1" to="#arraylen(variables.whattolookfor)#" index="loop_index">
<cfset arguments.html = replacenocase(arguments.html, variables.whattolookfor[loop_index], variables.replacewith[loop_index], 'all')>
</cfloop>

<cfreturn html>
</cffunction>
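
To show how the pieces fit together, here is a minimal sketch of a calling page. It assumes the three functions above live in proxy.cfc (the name that the cfinvoke call and the user agent string suggest) and that the page sits in the same directory. The URL parameter name "page" and the example.com default are placeholders for illustration, not part of the original CFC.

```cfml
<!--- Hypothetical calling page, e.g. index.cfm, in the same directory as proxy.cfc --->
<!--- "page" is an assumed URL parameter name; example.com is a placeholder default --->
<cfparam name="url.page" default="http://www.example.com/">

<cfset proxy = createObject("component", "proxy")>

<!--- Fetch the remote page, then rewrite its relative references --->
<cfset rawHTML = proxy.getHTML(url.page)>
<!--- Passing "page" as proxyify makes rewritten links come back as ?page=... --->
<cfset cleanHTML = proxy.fixHTML(rawHTML, url.page, "page")>

<cfoutput>#cleanHTML#</cfoutput>
```

Passing 'no' as the third argument should leave the page's links pointing at the original site. A page like this fetches whatever address it is handed, so it is probably worth restricting which hosts it accepts before putting it on a public server.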

 

 

So, you might be wondering what practical applications there are for this. Well, I came up with a good one. I wrote a utility that takes the HTML from a site of your choice and redisplays it with random words replaced with the word 'bacon'. Think of it like Mad Libs, but with just bacon.

Original Site

Baconified Site
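
The bacon replacement described above could be sketched roughly like this. This is not the code behind the original utility, only one way it might be done: the function name baconify and the odds argument are made up for illustration. The idea is to split the page into tags, entities, words, and everything else, then swap a word for 'bacon' on roughly one roll in every odds, leaving tags and the contents of script and style blocks alone.

```cfml
<!--- Hypothetical helper; the name and the "odds" argument are assumptions --->
<cffunction name="baconify" access="public" returntype="string" output="false">
   <cfargument name="html" type="string" required="yes">
   <!--- On average, one word in every "odds" words is swapped --->
   <cfargument name="odds" type="numeric" required="no" default="10">

   <!--- Tokens, in order: tags, entities, words, runs of other text, stray < or & --->
   <cfset var tokens = reMatch("<[^>]*>|&[^\s;<]+;|[A-Za-z]+|[^<&A-Za-z]+|[<&]", arguments.html)>
   <cfset var result = "">
   <cfset var skip = false>
   <cfset var i = 0>

   <cfloop from="1" to="#arrayLen(tokens)#" index="i">
      <!--- Leave the contents of script and style blocks untouched --->
      <cfif reFindNoCase("^<(script|style)[\s>]", tokens[i])>
         <cfset skip = true>
      <cfelseif reFindNoCase("^</(script|style)[\s>]", tokens[i])>
         <cfset skip = false>
      </cfif>

      <!--- Only plain words are candidates; tags and entities never match this test --->
      <cfif not skip and reFind("^[A-Za-z]+$", tokens[i]) and randRange(1, arguments.odds) eq 1>
         <cfset result = result & "bacon">
      <cfelse>
         <cfset result = result & tokens[i]>
      </cfif>
   </cfloop>

   <cfreturn result>
</cffunction>
```

Calling it with odds set to 1 replaces every word, which is a handy way to check that the markup survives. Regex-based splitting like this is only a sketch: unusual markup, such as a bare < inside a script block, can confuse it, and a real HTML parser would be sturdier.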

