How to extract text from different IDs with Beautiful Soup.

I want to extract text by id, but each id has a different value. Take a look:

'div', id='statement80863', then 'div', id='statement26092', and so on.

CODE

import requests
from bs4 import BeautifulSoup
import re

limit = 100

url = f'https://www.counselingcalifornia.com/cc/cgi-bin/utilities.dll/customlist?FIRSTNAME=~&LASTNAME=~&ZIP=&DONORCLASSSTT=&_MULTIPLE_INSURANCE=&HASPHOTOFLG=&_MULTIPLE_EMPHASIS=&ETHNIC=&_MULTIPLE_LANGUAGE=ENG&QNAME=THERAPISTLIST&WMT=NONE&WNR=NONE&WHP=therapistHeader.htm&WBP=therapistList.htm&RANGE=1%2F{limit}&SORT=LASTNAME'
headers = {'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Mobile Safari/537.36'}

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')
rows = soup.find_all('div', {'class':'row'})

for row in rows:
    des=row.find('div',id='statement80863').text
    print(des)

davy.ai

June 4, 2023 at 4:26 pm

To extract data from divs with different id values, you can loop over each div tag and pick out the ones that have an id attribute. Here is an updated version of the code:

import requests
from bs4 import BeautifulSoup
import re

limite = 100

url = f'https://www.counselingcalifornia.com/cc/cgi-bin/utilities.dll/customlist?FIRSTNAME=~&LASTNAME=~&ZIP=&DONORCLASSSTT=&_MULTIPLE_INSURANCE=&HASPHOTOFLG=&_MULTIPLE_EMPHASIS=&ETHNIC=&_MULTIPLE_LANGUAGE=ENG&QNAME=THERAPISTLIST&WMT=NONE&WNR=NONE&WHP=therapistHeader.htm&WBP=therapistList.htm&RANGE=1%2F{limite}&SORT=LASTNAME'
headers = {'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Mobile Safari/537.36'}

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')
rows = soup.find_all('div', {'class':'row'})

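for row in rows:
    # Look at every div nested inside this row
    for div in row.find_all('div'):
        # Keep only divs that carry an id attribute, whatever its value
        if 'id' in div.attrs:
            des = div.text
            print(des)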

This code iterates over each row and then over every div inside that row. If a div has an id attribute, the code reads the text inside that div and prints it to the console. This way you can extract the text for any id value without hard-coding the value in the script. Note that it matches every div that carries an id, not only those whose id starts with statement.
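If only the statement divs are wanted, one possible refinement is to filter on the id pattern itself. The sketch below is an illustration under assumptions, not a tested scraper for the site: SAMPLE_HTML is made-up markup standing in for the real page (whose structure may differ), extract_statements is a hypothetical helper name, and the pattern assumes every id of interest is the word statement followed by digits. It relies on find_all accepting a compiled regular expression as an attribute filter.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page's structure may differ.
SAMPLE_HTML = """
<div class="row">
  <div id="statement80863">First statement</div>
  <div id="photo1">not a statement</div>
</div>
<div class="row">
  <div id="statement26092">Second statement</div>
</div>
"""


def extract_statements(html):
    """Return {id: text} for every div whose id looks like 'statement<digits>'."""
    soup = BeautifulSoup(html, "html.parser")
    # A compiled regex as the id filter matches any id that fits the pattern,
    # so no individual id value has to be hard-coded.
    divs = soup.find_all("div", id=re.compile(r"^statement\d+$"))
    return {div["id"]: div.get_text(strip=True) for div in divs}


statements = extract_statements(SAMPLE_HTML)
for div_id, text in statements.items():
    print(div_id, text)
```

To try this against the live page, pass response.text from the request above to extract_statements in place of SAMPLE_HTML. A CSS attribute selector such as soup.select('div[id^="statement"]') should select the same prefix, though without the digits-only check.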
