Python Selenium: Accessing the HTML Source

lastcode 2023. 7. 21. 21:39

How can I get the HTML source into a variable using the Selenium module with Python?

I wanted to do something like this:

from selenium import webdriver

browser = webdriver.Firefox()
browser.get("http://example.com")
if "whatever" in html_source:  # html_source is the part I don't know how to get
    pass  # Do something
else:
    pass  # Do something else

How can I do this? I don't know how to access the HTML source.

You need to access the page_source property:

from selenium import webdriver

browser = webdriver.Firefox()
browser.get("http://example.com")

html_source = browser.page_source
if "whatever" in html_source:
    pass  # do something
else:
    pass  # do something else
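
The check above can be wrapped in a small helper. This is a minimal sketch rather than part of the original answer: the names `page_contains` and `FakeDriver` are hypothetical, and the helper only assumes that the object passed in exposes a `page_source` attribute, as Selenium's WebDriver does. The fake driver is there so the example runs without a browser.

```python
def page_contains(driver, text):
    """Return True if `text` occurs in the driver's current page source.

    `driver` is any object with a `page_source` attribute, for example a
    Selenium WebDriver instance.
    """
    html_source = driver.page_source  # read the source once into a variable
    return text in html_source


class FakeDriver:
    """Hypothetical stand-in so the sketch runs without a browser."""
    page_source = "<html><body><p>whatever</p></body></html>"


print(page_contains(FakeDriver(), "whatever"))  # True
print(page_contains(FakeDriver(), "missing"))   # False
```

With a real browser you would pass the `browser` object from the snippet above, e.g. `page_contains(browser, "whatever")`.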

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://example.com")  # load a page first; a fresh browser has an empty body
html_source_code = driver.execute_script("return document.body.innerHTML;")
html_soup: BeautifulSoup = BeautifulSoup(html_source_code, 'html.parser')

Now you can apply BeautifulSoup functions to extract data...
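
As an illustration of the kind of extraction meant here, the sketch below collects link targets from an HTML string. To keep it runnable without a browser or third-party packages, it uses the standard library's `html.parser` instead of BeautifulSoup; the sample HTML and the `LinkCollector` class are made up for the example. With BeautifulSoup the equivalent would be roughly `[a["href"] for a in html_soup.find_all("a", href=True)]`.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


# Sample markup standing in for html_source_code from the snippet above.
sample_html = '<div><a href="/first">One</a><a href="/second">Two</a><a>No link</a></div>'

collector = LinkCollector()
collector.feed(sample_html)
print(collector.links)  # ['/first', '/second']
```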

driver.page_source gives you the page source code, so you can check whether a given text is present in it.

from selenium import webdriver
driver = webdriver.Firefox()
driver.get("some url")
if "your text here" in driver.page_source:
    print('Found it!')
else:
    print('Did not find it.')

To store the page source in a variable, add the following line after driver.get:

var_pgsource = driver.page_source

Then change the if condition to:

if "your text here" in var_pgsource:
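
One reason to keep the source in a variable: `page_source` is a property, and each access asks the browser for the source again. The sketch below illustrates this with a hypothetical `CountingDriver` stand-in that counts how often the property is read; it is not a Selenium class.

```python
class CountingDriver:
    """Hypothetical stand-in for a WebDriver that counts page_source reads."""

    def __init__(self, html):
        self._html = html
        self.reads = 0

    @property
    def page_source(self):
        self.reads += 1  # a real driver would query the browser here
        return self._html


driver = CountingDriver("<p>first text</p><p>second text</p>")

# Read the source once, then run several checks against the stored string.
var_pgsource = driver.page_source
found = [text for text in ("first text", "second text", "third text")
         if text in var_pgsource]

print(found)         # ['first text', 'second text']
print(driver.reads)  # 1
```

The stored string is a snapshot: if the page changes afterwards, read `driver.page_source` again.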

With Selenium2Library you can use get_source():

import Selenium2Library
s = Selenium2Library.Selenium2Library()
s.open_browser("localhost:7080", "firefox")
source = s.get_source()

The page source gives you the complete HTML code.
So first decide which block of code or tag you need to retrieve data from, or which element you need to click.

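from selenium.webdriver.common.by import By

# find_elements_by_name was removed in Selenium 4; pass a By strategy instead.
options = driver.find_elements(By.NAME, "XXX")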
for option in options:
    if option.text == "XXXXXX":
        print(option.text)
        option.click()

You can find elements by name, XPath, id, link text and CSS selector.
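
In Selenium 4 these strategies are selected by passing a locator strategy and a value to `find_elements`. The sketch below is an illustration rather than part of the original answer, and it generalises the loop above. `click_first_match`, `FakeElement` and `FakeDriver` are hypothetical names; the fakes exist only so the example runs without a browser. The strategy strings shown are the values of Selenium's `By.NAME`, `By.XPATH`, `By.ID`, `By.LINK_TEXT` and `By.CSS_SELECTOR` constants.

```python
# Locator strategies; these strings match Selenium's By.* constants.
STRATEGIES = ("name", "xpath", "id", "link text", "css selector")


def click_first_match(driver, by, value, wanted_text):
    """Click the first element found via (by, value) whose text equals wanted_text.

    Returns True if an element was clicked, False otherwise.
    """
    if by not in STRATEGIES:
        raise ValueError(f"unsupported locator strategy: {by!r}")
    for element in driver.find_elements(by, value):
        if element.text == wanted_text:
            element.click()
            return True
    return False


class FakeElement:
    """Hypothetical element exposing only .text and .click()."""

    def __init__(self, text):
        self.text = text
        self.clicked = False

    def click(self):
        self.clicked = True


class FakeDriver:
    """Hypothetical driver that returns canned elements for any locator."""

    def __init__(self, elements):
        self._elements = elements

    def find_elements(self, by, value):
        return list(self._elements)


options = [FakeElement("A"), FakeElement("XXXXXX"), FakeElement("B")]
driver = FakeDriver(options)
print(click_first_match(driver, "name", "XXX", "XXXXXX"))  # True
print([o.clicked for o in options])                        # [False, True, False]
```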

You can simply use the WebDriver object and access the page source through its page_source property.

Try this code snippet :-)

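from selenium import webdriver
from selenium.webdriver.firefox.service import Service

# In Selenium 4 the geckodriver path is passed via a Service object.
driver = webdriver.Firefox(service=Service(executable_path='path/to/executable'))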
driver.get('https://some-domain.com')
source = driver.page_source
if 'stuff' in source:
    print('found...')
else:
    print('not in source...')

To answer your question about getting the URL to use with urllib, just execute this JavaScript code:

url = browser.execute_script("return window.location.href;")  # .href gives the URL as a string

I recommend getting the source with urllib and, if you are going to parse it, using something like Beautiful Soup.

import urllib.request  # Python 3; in Python 2 this was urllib.urlopen

url = urllib.request.urlopen("http://example.com")  # Open the URL.
content = url.readlines()  # Read the source into a list of byte strings, one per line.
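
A slightly fuller sketch of the urllib route for Python 3 follows. The `fetch_source` helper is a hypothetical name, and the example uses a `data:` URL so that it runs without network access; an http(s) URL can be passed instead. Note that urllib returns the HTML as the server sent it, without running any JavaScript, so the result can differ from Selenium's `page_source`.

```python
import urllib.request


def fetch_source(url, default_encoding="utf-8"):
    """Fetch `url` and return the response body decoded to a string."""
    with urllib.request.urlopen(url) as response:
        raw = response.read()
        # Use the charset from the Content-Type header when one is given.
        encoding = response.headers.get_content_charset() or default_encoding
    return raw.decode(encoding)


# A data: URL keeps the example offline; replace it with a real page URL.
content = fetch_source("data:text/html;charset=utf-8,%3Cp%3Ewhatever%3C%2Fp%3E")
print(content)                # <p>whatever</p>
print("whatever" in content)  # True
```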

Reference URL: https://stackoverflow.com/questions/7861775/python-selenium-accessing-html-source
