Selenium+PhantomJS (Part 4: Simulating a Weibo Login)

2016/10/17

Import the selenium package and create a webdriver object:

import sys
import time
from selenium import webdriver
sel = webdriver.Chrome()

Open the target URL and wait for the response:

loginurl = 'http://weibo.com/'
# open the login page
sel.get(loginurl)
# fixed wait to give the page time to load
time.sleep(10)
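
The fixed `time.sleep(10)` above always waits the full ten seconds, even when the page is ready sooner. As a rough alternative, a small polling helper can return as soon as a condition holds. The sketch below has no dependencies; `wait_until` and its parameters are hypothetical names, not part of Selenium (Selenium's own `WebDriverWait` in `selenium.webdriver.support.ui` serves a similar purpose).

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or the timeout expires.

    Returns the truthy value, or None if the timeout was reached.
    `wait_until` is a hypothetical helper, not a Selenium API.
    """
    deadline = time.time() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.time() >= deadline:
            return None
        time.sleep(interval)
```

With the driver from this article, the condition could be any callable that checks the page, for example one that returns True once the element with id `pl_login_form` can be found.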

Locate the login form fields by XPath, fill in the account name and password, and simulate a click on the login button:

# type the username
try:
    sel.find_element_by_xpath("//div[@id='pl_login_form']/div/div[2]/div[1]/div[1]/input").send_keys('yourusername')
    print('user success!')
except Exception:
    print('user error!')
time.sleep(1)
# type the password
try:
    sel.find_element_by_xpath("//div[@id='pl_login_form']/div/div[2]/div[2]/div[1]/input").send_keys('yourPW')
    print('pw success!')
except Exception:
    print('pw error!')
time.sleep(1)
# click the login button
try:
    sel.find_element_by_xpath("//div[@id='pl_login_form']/div/div[2]/div[6]/a").click()
    print('click success!')
except Exception:
    print('click error!')
time.sleep(3)
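
The three try/except blocks above share one pattern: locate an element by XPath, act on it, and report the outcome. A minimal sketch of factoring that pattern out follows. `act_on` is a hypothetical helper name, and it assumes the driver exposes `find_element_by_xpath` as in the code above.

```python
def act_on(driver, xpath, label, text=None):
    """Find the element at `xpath`; type `text` into it, or click it if text is None.

    Returns True on success and False on any failure. `act_on` is a
    hypothetical helper; it assumes `driver.find_element_by_xpath` exists,
    as in the article's code.
    """
    try:
        element = driver.find_element_by_xpath(xpath)
        if text is None:
            element.click()
        else:
            element.send_keys(text)
    except Exception:
        print('%s error!' % label)
        return False
    print('%s success!' % label)
    return True
```

A call such as `act_on(sel, "//div[@id='pl_login_form']/div/div[2]/div[1]/div[1]/input", 'user', 'yourusername')` would then stand in for the first block, and the return value lets the script stop early when a field cannot be found.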

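Check whether the login succeeded. If the current URL has changed, the login is considered successful; while it is still the login URL, read a verification code from standard input, enter it, and click login again: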

curpage_url = sel.current_url
print(curpage_url)
while curpage_url == loginurl:
    print('please input the verify code:')
    # strip the trailing newline so it is not typed into the field
    verifycode = sys.stdin.readline().strip()
    sel.find_element_by_xpath("//div[@id='pl_login_form']/div/div[2]/div[3]/div[1]/input").send_keys(verifycode)
    try:
        sel.find_element_by_xpath("//div[@id='pl_login_form']/div/div[2]/div[6]/a").click()
        print('click success!')
    except Exception:
        print('click error!')
    time.sleep(3)
    curpage_url = sel.current_url
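
The loop above treats an exact string match between `curpage_url` and `loginurl` as "still on the login page". Cosmetic differences such as a trailing slash or an added query string could throw that comparison off. Below is a hedged sketch of a more tolerant check. `still_on_login_page` is a hypothetical name, and comparing only host and path is an assumption about what counts as the same page, not something the original article specifies.

```python
try:
    from urllib.parse import urlsplit   # Python 3
except ImportError:
    from urlparse import urlsplit       # Python 2


def still_on_login_page(login_url, current_url):
    """Return True if both URLs share the same host and path.

    Scheme, query string, fragment, and trailing slashes are ignored.
    This is an illustrative assumption, not the article's own test.
    """
    def key(url):
        parts = urlsplit(url)
        return parts.netloc.lower(), parts.path.rstrip('/')
    return key(login_url) == key(current_url)
```

The loop condition could then read `while still_on_login_page(loginurl, curpage_url):`.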

Use the driver object's get_cookies() method to obtain the session cookies for the site currently being visited:

# get the session cookies as "name=value" pairs
cookie = [item["name"] + "=" + item["value"] for item in sel.get_cookies()]
# join them into a single Cookie header value
cookiestr = '; '.join(cookie)
print(cookiestr)
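
`get_cookies()` returns a list of dictionaries, each carrying at least a `name` and a `value` key. The sketch below wraps the conversion in a function and also builds a plain name-to-value dict, the form accepted by the `cookies=` argument of `requests`. The function names are hypothetical and the sample cookie names and values are made up for illustration.

```python
def cookie_header(cookies):
    """Join Selenium-style cookie dicts into a 'name=value; name=value' string."""
    return '; '.join('%s=%s' % (c['name'], c['value']) for c in cookies)


def cookie_dict(cookies):
    """Map cookie names to values, e.g. for requests.get(url, cookies=...)."""
    return dict((c['name'], c['value']) for c in cookies)


# Made-up sample in the shape returned by sel.get_cookies().
sample = [
    {'name': 'sid', 'value': 'abc', 'domain': '.example.com'},
    {'name': 'token', 'value': 'xyz', 'domain': '.example.com'},
]
header = cookie_header(sample)
```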

Once the cookie has been obtained, the site can be accessed through urllib2, scrapy, requests, and similar libraries:

import urllib2

print('%%%using the urllib2 !!')
homeurl = sel.current_url
print('homeurl: %s' % homeurl)
# send the browser session's cookies along with the request
headers = {'Cookie': cookiestr}
req = urllib2.Request(homeurl, headers=headers)
try:
    response = urllib2.urlopen(req)
    text = response.read()
    # save the fetched page to a local file
    with open('homepage', 'w') as fd:
        fd.write(text)
    print('###get home page html success!!')
except Exception:
    print('### get home page html error!!')
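
`urllib2` exists only on Python 2; on Python 3 the same `Request` class lives in `urllib.request`. As a sketch, the request can be built so that it imports whichever module is available. `build_request` is a hypothetical helper and the cookie string is a placeholder. Nothing is sent over the network until the request is passed to `urlopen`.

```python
try:
    from urllib.request import Request   # Python 3
except ImportError:
    from urllib2 import Request          # Python 2


def build_request(url, cookie_string):
    """Return a Request for `url` that carries the session cookie header.

    `build_request` is a hypothetical helper name.
    """
    return Request(url, headers={'Cookie': cookie_string})


# Placeholder cookie string; in the article this would be `cookiestr`.
req = build_request('http://weibo.com/', 'sid=abc; token=xyz')
```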

Reposted from: http://blog.csdn.net/warrior_zhang/article/details/50198699
