Wednesday, November 29, 2006

Code for Extracting the HTML Content of a Specific Website

The code below fetches a page with Ruby's open-uri library, prints its HTTP meta information, and then prints the HTML content line by line.

To try it, just copy and paste the code below.

require 'open-uri'
require 'pp'

# Ruby 2.5+ provides URI.open; a bare open() no longer fetches URLs as of Ruby 3.0
URI.open('http://www.chennairails.blogspot.com') do |a|
  # hash with meta information
  pp a.meta

  # individual header values
  puts "Content-Type: " + a.content_type # meta content type
  puts "Last modified: " + a.last_modified.to_s # empty if the header is missing

  no = 1
  # print the first thousand lines
  a.each_line do |line|
    print "#{no}: #{line}"
    no += 1
    break if no > 1000 # maximum number of lines
  end
end
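
The line-numbering part of the snippet above can also be pulled into a small helper so it can be tried without a network connection. This is only a minimal sketch: the names numbered_lines and fetch_numbered are hypothetical (they are not part of open-uri), and it assumes Ruby 2.5 or later so that URI.open is available.

```ruby
require 'open-uri'
require 'stringio'

# Hypothetical helper: returns up to `limit` lines from any IO-like object,
# each prefixed with its 1-based line number.
def numbered_lines(io, limit = 1000)
  io.each_line.first(limit).each_with_index.map do |line, index|
    "#{index + 1}: #{line}"
  end
end

# Hypothetical wrapper: fetches `url` with open-uri and numbers its lines.
# Defined here but not called, so the example runs offline.
def fetch_numbered(url, limit = 1000)
  URI.open(url) { |page| numbered_lines(page, limit) }
end

# Offline demonstration using an in-memory string instead of a real page.
sample = StringIO.new("<html>\n<body>\n</body>\n</html>\n")
puts numbered_lines(sample, 2)
```

With a real site, something like fetch_numbered('http://www.chennairails.blogspot.com', 1000) should return the same numbered lines that the original loop prints.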

source: code snippets

Enjoy rails!

1 comment:

Ravi said...

http://www.juretta.com/log/2006/08/13/ruby_net_http_and_open-uri/

Seems very similar (identical, if I may say) to the code on this page, even the comments, doesn't it? How about linking?
