
[Python / Mac] Python Crawling Example - VIBE Music Chart TOP 100 (BeautifulSoup, Selenium)

by hk713 2022. 3. 17.

⚠️ Built on an M1 Mac (MacBook) using Anaconda.

 

The title says Python crawling, but strictly speaking this is web scraping.

The code scrapes the song titles and artist names of the TOP 100 chart from a music site and saves them as a txt file.

 

Most sites build their content with JavaScript, so fetching data from such dynamic pages requires Selenium. (Selenium has to be installed with pip before use!)

pip install selenium
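
To see why a plain HTTP fetch is not enough for such pages, here is a minimal sketch using only the standard library. The two HTML strings are invented stand-ins, not VIBE's actual markup: one is what a server might send before any JavaScript runs, the other what the page might look like once the browser has rendered it. `ClassCounter` and `count_class` are hypothetical helpers that count elements carrying a given class.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical response body before JavaScript has run: only an empty mount point.
static_html = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'

# Hypothetical page after JavaScript has filled in the rows.
rendered_html = (
    '<html><body><div id="app">'
    '<span class="inner_cell"><a>Song A</a></span>'
    '<span class="inner_cell"><a>Song B</a></span>'
    '</div></body></html>'
)

print(count_class(static_html, "inner_cell"))    # 0
print(count_class(rendered_html, "inner_cell"))  # 2
```

A parser given only the first string finds nothing to extract, which is the situation a browser driven by Selenium is meant to avoid.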

 

Next, I installed ChromeDriver.

https://chromedriver.chromium.org/downloads

 


From the site above, I downloaded the driver that matches my Chrome version and put it in the same folder as the code file.
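
Matching the driver to the browser comes down to comparing major version numbers. The sketch below is one way to check that, assuming the version strings look like `ChromeDriver 99.0.4844.51` and `Google Chrome 99.0.4844.83`. The Chrome version numbers and both helper names are made up for illustration.

```python
import re


def major_version(version_text):
    """Return the major number of the first four-part version in the text, or None."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_text)
    return int(match.group(1)) if match else None


def versions_match(chrome_text, driver_text):
    """True when both strings contain a version and the major numbers agree."""
    chrome_major = major_version(chrome_text)
    driver_major = major_version(driver_text)
    return chrome_major is not None and chrome_major == driver_major


# Example strings in the shape the two programs are assumed to print.
print(versions_match("Google Chrome 99.0.4844.83", "ChromeDriver 99.0.4844.51"))   # True
print(versions_match("Google Chrome 100.0.4896.60", "ChromeDriver 99.0.4844.51"))  # False
```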

One thing to watch out for: the first time you use a freshly installed ChromeDriver, the following error occurs.

WebDriverException: Message: Service ./chromedriver unexpectedly exited. Status code was: -9

If that happens, open System Preferences > Security & Privacy to resolve it.

Click "Allow Anyway" and run the code again.

A warning dialog then appears; click "Open" there and the driver works normally.
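
As an alternative to clicking through System Preferences, a commonly used approach is to clear the quarantine attribute that macOS attaches to downloaded files. The sketch below assumes that attribute is what blocks the driver here, which this post does not confirm. `quarantine_command` and `clear_quarantine` are hypothetical helper names, and running the command only makes sense on macOS.

```python
import subprocess


def quarantine_command(driver_path):
    """Build the xattr call that deletes the com.apple.quarantine attribute."""
    return ["xattr", "-d", "com.apple.quarantine", driver_path]


def clear_quarantine(driver_path):
    """Run the command; returns True if xattr exited with status 0 (macOS only)."""
    result = subprocess.run(
        quarantine_command(driver_path), capture_output=True, text=True
    )
    return result.returncode == 0


# Only builds and shows the command; call clear_quarantine() on a Mac to run it.
print(quarantine_command("./chromedriver"))
```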


Complete code

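from datetime import datetime

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# VIBE chart page URL
url = "https://vibe.naver.com/chart/total"

# Open the web page (Selenium 4 takes the driver path through a Service object)
driver = webdriver.Chrome(service=Service("./chromedriver"))
try:
    driver.get(url)
    # implicitly_wait only affects element lookups, not page_source,
    # so wait explicitly (up to 5 seconds) until a chart row is present
    WebDriverWait(driver, 5).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "span.inner_cell"))
    )

    # Get the HTML source
    html = driver.page_source
finally:
    # Close the browser, even if loading the page failed
    driver.quit()

soup = BeautifulSoup(html, "html.parser")

# Get the song titles
titles = []
for span in soup.find_all("span", class_="inner_cell"):
    title = span.select_one("span > a").get_text()
    titles.append(title)

# Get the artists
artists = []
for div in soup.find_all("div", class_="artist_sub"):
    artist = div.select_one("div > span > span > a > span").get_text()
    artists.append(artist)

# Header line with today's date
head = datetime.today().strftime("VIBE Chart TOP 100 for %Y-%m-%d") + "\n"

# Open the file in append mode; the with block closes it automatically
with open("VIBE_TOP100_List.txt", "a", encoding="utf-8") as vibe_file:
    # Write the file and print the same lines
    vibe_file.write(head)
    print(head)

    # zip stops at the shorter list, so fewer than 100 rows does not raise IndexError
    for rank, (title, artist) in enumerate(zip(titles[:100], artists[:100]), start=1):
        line = f"No. {rank}\t{title} - {artist}\n"
        vibe_file.write(line)
        print(line)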
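
The file-writing step pairs each title with its artist by position. Below is a minimal sketch of that pairing as a standalone function, which is easier to check without opening a browser. `format_chart` is a hypothetical helper and the sample data is made up. It stops at the shorter of the two lists, so a page that yields fewer than 100 rows produces fewer lines rather than an error.

```python
def format_chart(titles, artists, limit=100):
    """Pair titles with artists and return numbered lines, at most `limit` of them."""
    lines = []
    for rank, (title, artist) in enumerate(zip(titles[:limit], artists[:limit]), start=1):
        lines.append(f"{rank}\t{title} - {artist}")
    return lines


# Made-up sample data; the real lists come from the scraping step.
sample_titles = ["Song A", "Song B", "Song C"]
sample_artists = ["Artist A", "Artist B"]

for line in format_chart(sample_titles, sample_artists):
    print(line)
```

With the sample data, the third title has no matching artist and is left out.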

Execution result

This is the VIBE music chart Top 100 page.

Running the code in Jupyter notebook loads the chart successfully.

The folder containing the crawling code now also holds the top 100 chart as a text file 😊

(I had not captured the result screen, so I re-ran the code and added it - 22.03.18)

 
