Making GET Requests with Ruby - 技术博客集

Programming / posted by houtizong 3 years ago 67

To make a network request in Ruby you need 'net/http'. The program below fetches the response to a request for a URL.

In fact, the simplest approach is:

    >> require "open-uri"
    >> open("http://www.cnblog.org/blog/atom.xml")

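However, this approach is too bare-bones: in the Ruby 1.8 environment used here, it offers no way to set a timeout. When the server does not respond, the call blocks until the default timeout is reached, and that default is very long.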

    >> open("http://www.cnblog.org/blog/atom.xml")
    Errno::ETIMEDOUT: Connection timed out - connect(2)
            from /usr/local/bin/rubyee/lib/ruby/1.8/net/http.rb:560:in `initialize'
            from /usr/local/bin/rubyee/lib/ruby/1.8/net/http.rb:560:in `open'
            from /usr/local/bin/rubyee/lib/ruby/1.8/net/http.rb:560:in `connect'
            from /usr/local/bin/rubyee/lib/ruby/1.8/timeout.rb:53:in `timeout'
            from /usr/local/bin/rubyee/lib/ruby/1.8/timeout.rb:93:in `timeout'
            from /usr/local/bin/rubyee/lib/ruby/1.8/net/http.rb:560:in `connect'
            from /usr/local/bin/rubyee/lib/ruby/1.8/net/http.rb:553:in `do_start'
            from /usr/local/bin/rubyee/lib/ruby/1.8/net/http.rb:542:in `start'
            from /usr/local/bin/rubyee/lib/ruby/1.8/open-uri.rb:242:in `open_http'
            from /usr/local/bin/rubyee/lib/ruby/1.8/open-uri.rb:616:in `buffer_open'
            from /usr/local/bin/rubyee/lib/ruby/1.8/open-uri.rb:164:in `open_loop'
            from /usr/local/bin/rubyee/lib/ruby/1.8/open-uri.rb:162:in `catch'
            from /usr/local/bin/rubyee/lib/ruby/1.8/open-uri.rb:162:in `open_loop'
            from /usr/local/bin/rubyee/lib/ruby/1.8/open-uri.rb:132:in `open_uri'
            from /usr/local/bin/rubyee/lib/ruby/1.8/open-uri.rb:518:in `open'
            from /usr/local/bin/rubyee/lib/ruby/1.8/open-uri.rb:30:in `open'
            from (irb):6
    >>
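The irb session above ran on Ruby 1.8. As a point of comparison, recent Ruby versions let open-uri take `:open_timeout` and `:read_timeout` options through `URI.open`. The sketch below assumes such a version; the helper name `fetch_with_timeouts` and its default values are illustrative and do not come from the original post.

```ruby
require 'open-uri'
require 'net/http' # loaded explicitly so the Net::*Timeout constants are defined

# Hypothetical helper (not from the original post): fetch a URL with
# explicit timeouts, given in seconds. Returns the body as a String,
# or nil when the connection or a read times out.
def fetch_with_timeouts(url, open_timeout: 5, read_timeout: 10)
  URI.open(url, open_timeout: open_timeout, read_timeout: read_timeout) do |io|
    io.read
  end
rescue Net::OpenTimeout, Net::ReadTimeout => e
  warn "request timed out: #{e.class}"
  nil
end
```

A call might look like `fetch_with_timeouts("http://www.cnblog.org/blog/atom.xml", open_timeout: 5, read_timeout: 5)`. Other failures, such as HTTP error statuses, still raise and are left to the caller.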

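To be safe, whenever timeout handling or other settings matter, it is better to use Net::HTTP.
Besides the timeout, it also lets you set other request parameters, such as the user-agent.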

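The user-agent is quite a useful parameter. Earlier, when I experimented with 163.com without setting it, the request kept getting redirected because the site treated it as coming from a mobile client.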

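    require 'net/http'
    require 'uri'

    class HandleGetRequest
      # Send a GET request to url. Returns the Net::HTTPResponse,
      # or nil if the request fails.
      def self.get_response(url)
        url_str = URI.parse(url)
        site = Net::HTTP.new(url_str.host, url_str.port)
        site.open_timeout = 20 # seconds allowed for opening the connection
        site.read_timeout = 20 # seconds allowed for each read from the socket
        # Plain-Ruby check in place of blank?, which needs ActiveSupport
        query = url_str.query
        path = (query.nil? || query.empty?) ? url_str.path : url_str.path + "?" + query
        site.get2(path, { 'accept' => 'text/html', 'user-agent' => 'Mozilla/5.0' })
      rescue StandardError, Timeout::Error => ex
        p ex
        nil
      end
    end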

Requesting a normal URL:

    >> HandleGetRequest.get_response("http://www.iteye.com/topic/431217")
    => #<Net::HTTPOK 200 OK readbody=true>

If the path after the host is empty, mind the trailing slash:

    >> HandleGetRequest.get_response("http://www.google.com.hk")
    #<ArgumentError: HTTP request path is empty>
    => nil
    >> HandleGetRequest.get_response("http://www.google.com.hk/")
    => #<Net::HTTPOK 200 OK readbody=true>
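One way to avoid the empty-path error without relying on the caller to add the slash is `URI::HTTP#request_uri`, which returns the path plus any query string and falls back to "/" when the path is empty. A minimal sketch follows; the helper name `request_path` is made up for illustration, and it assumes the URL is an http or https URL, since other URI classes do not define `request_uri`.

```ruby
require 'uri'

# Hypothetical helper: the path (plus query string) to send in the
# request line. Assumes an http/https URL; request_uri is not defined
# for other URI schemes.
def request_path(url)
  URI.parse(url).request_uri
end

puts request_path("http://www.google.com.hk")          # prints "/"
puts request_path("http://www.iteye.com/topic/431217") # prints "/topic/431217"
```

Passing this value to `get2` instead of the hand-built path would make both forms of the Google URL above behave the same way.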

Requesting a URL that times out (it timed out when tested on my machine) raises an exception once the configured time has elapsed:

    >> HandleGetRequest.get_response("http://www.cnblog.org/blog/atom.xml")
    #<Timeout::Error: execution expired>
    Timeout::Error: execution expired
            from /usr/local/bin/rubyee/lib/ruby/1.8/timeout.rb:60:in `open'
            from /usr/local/bin/rubyee/lib/ruby/1.8/net/http.rb:560:in `connect'
            from /usr/local/bin/rubyee/lib/ruby/1.8/net/http.rb:560:in `connect'
            from /usr/local/bin/rubyee/lib/ruby/1.8/net/http.rb:553:in `do_start'
            from /usr/local/bin/rubyee/lib/ruby/1.8/net/http.rb:542:in `start'
            from /usr/local/bin/rubyee/lib/ruby/1.8/net/http.rb:1035:in `request'
            from /usr/local/bin/rubyee/lib/ruby/1.8/net/http.rb:948:in `get2'
            from /home/chengliwen/chengliwen/deploy/pin-macro-tmp/lib/handle_get_request.rb:30:in `get_response'
            from (irb):1
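The trace above comes from Ruby 1.8, where the timeout surfaces as a plain Timeout::Error. On recent Ruby versions Net::HTTP raises the more specific Net::OpenTimeout and Net::ReadTimeout (both subclasses of Timeout::Error), so the two cases can be told apart. The following is only a sketch under that assumption: the helper name `get_with_timeouts`, the default values, and the choice to return nil on timeout are illustrative, not part of the original class.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: GET a URL with explicit timeouts (in seconds).
# Returns the Net::HTTPResponse, or nil if the request timed out.
def get_with_timeouts(url, open_timeout: 20, read_timeout: 20)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == 'https', # enable TLS for https URLs
                  open_timeout: open_timeout,
                  read_timeout: read_timeout) do |http|
    # request_uri is the path plus query, or "/" when the path is empty
    http.get(uri.request_uri, 'User-Agent' => 'Mozilla/5.0')
  end
rescue Net::OpenTimeout => e
  warn "could not connect in time: #{e.message}"
  nil
rescue Net::ReadTimeout => e
  warn "connected, but the server was too slow to answer: #{e.message}"
  nil
end
```

Rescuing only the timeout classes keeps other errors, such as a refused connection or a malformed URL, visible to the caller instead of silently turning them into nil.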

After that, you can process the response body according to the response status.
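As a rough illustration of that last step, the response classes in Net::HTTP can be matched with a `case` statement. The helper name `describe_response` and the symbols it returns are invented for this sketch; the nil branch covers a helper like `get_response` above printing the exception and returning nothing useful.

```ruby
require 'net/http'

# Hypothetical helper: classify a Net::HTTPResponse (or nil) into a
# [status, detail] pair.
def describe_response(response)
  case response
  when nil
    [:failed, nil]                       # the request itself failed
  when Net::HTTPSuccess
    [:ok, response.body]                 # 2xx: the body is usable
  when Net::HTTPRedirection
    [:redirect, response['location']]    # 3xx: where the server points to
  else
    [:error, response.code]              # anything else: keep the status code
  end
end
```

For example, `describe_response(HandleGetRequest.get_response("http://www.iteye.com/topic/431217"))` would give `:ok` together with the page body when the request succeeds, and a redirect like the one described for 163.com would show up as `:redirect` with the target URL.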


