The Beautiful Soup module is used for web scraping in Python. Learn how to use the Beautiful Soup and Requests modules in this tutorial. After watching, you will be able to start scraping the web on your own.
💻Code: https://github.com/vprusso/youtube_tutorials/tree/master/web_scraping_and_automation/beautiful_soup
Tutorial from Vincent Russo of Lucid Programming. Check out his YouTube channel: http://bit.ly/lucidcode
🐦Vincent on Twitter: @captainhamptons
—
Learn to code for free and get a developer job: https://www.freecodecamp.org
Read hundreds of articles on programming: https://medium.freecodecamp.org
And subscribe for new videos on technology every day: https://youtube.com/subscription_center?add_user=freecodecamp
Hi everyone, LucidProgramming here, the creator of this video series. If you enjoyed the video, I'm really thrilled to hear that. If you like this type of content, head on over to my channel and consider subscribing to stay up to date with similar videos. If you have any requests or recommendations, I take them to heart and am constantly trying to improve my channel's content. Thanks again, and happy coding!
Thanks for the video. Could you explain how to scrape within the parents or siblings of a tag?
this is an awesome tutorial, thank you! very clear and concise, but with lots of content!
How can we fetch specific tags, for example an address, from multiple files I have in .txt format?
Please provide a solution ….
For one file I can fetch it, but how can I fetch from multiple files?
from bs4 import BeautifulSoup

addresses = []
with open("/rawhtml/greerwilsonchapel.com_executives_contact_us.txt") as fp:
    soup = BeautifulSoup(fp, "html.parser")  # pass a parser explicitly
    address = soup.find('div', class_="locator-titles").get_text().rstrip('\n').split('\n')
    addresses.append(address)
print(addresses)
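A sketch of one way to handle multiple files, reusing the snippet above: collect the paths with glob and loop over them. The /rawhtml/ folder and the "locator-titles" class follow the single-file example; adjust both to your own setup.

```python
import glob
from bs4 import BeautifulSoup

def extract_addresses(paths):
    """Collect the locator-titles text from each HTML file in paths."""
    addresses = []
    for path in paths:
        with open(path) as fp:
            soup = BeautifulSoup(fp, "html.parser")
            div = soup.find("div", class_="locator-titles")
            if div is not None:  # skip files that lack the target div
                addresses.append(div.get_text().rstrip("\n").split("\n"))
    return addresses

# Gather every .txt file in the folder and process them all:
addresses = extract_addresses(glob.glob("/rawhtml/*.txt"))
print(addresses)
```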
Someone here has a big time cold, huh?
Thanks, it really helped me understand the uses of Beautiful Soup's functions.
great video!!
If you are getting an error on 32-bit Windows, use html.parser instead of lxml.
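For reference, switching parsers is a one-argument change. "html.parser" ships with the Python standard library, so it needs no extra install:

```python
from bs4 import BeautifulSoup

html = "<h2><a href='/about'>About</a></h2>"

# html.parser is built into Python, so it works even where
# lxml will not install (e.g. some 32-bit Windows setups).
soup = BeautifulSoup(html, "html.parser")
print(soup.find("a").attrs["href"])  # /about
```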
Guys, I have a problem with this and would appreciate any help:
Whenever I attempt to run this program I get an error. I have followed all the steps, typed everything correctly, installed it into the same folder as my program and Python, and installed all the other software mentioned in the comments (lxml, the parser), but I still get this error, and it is getting very annoying now. The error is "ModuleNotFoundError: No module named 'requests'".
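That error usually means the requests package is not installed for the interpreter you are running the script with; a typical fix (assuming pip points at that same Python) is:

```shell
# Install requests for the Python you run the script with:
pip install requests
# If "pip" belongs to a different interpreter, be explicit:
python -m pip install requests
```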
Great tutorial. Very clear and concise
Boom!
At 16:45 you find the a tags inside the h2 tags! That's what I needed.
Very happy! Thank you!
If anyone is getting an lxml error while running the code, just do:
pip install lxml
I am only 11 minutes in, and this is one of the clearest explanations of the data extraction syntax that I have come across so far.
For anyone getting the
AttributeError: 'NoneType' object has no attribute 'attrs'
error: the list returns two items that are empty for some reason. Use
if a_tag is not None:
before you append to urls.
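Putting that guard together with the loop from the video. The HTML string below is a small inline stand-in (the real page's h2 tags are what produce the empty items):

```python
from bs4 import BeautifulSoup

# Inline stand-in: one h2 with a link and one without, which
# reproduces the NoneType error when the guard is missing.
html = "<h2><a href='/briefing-room'>Briefing Room</a></h2><h2>No link</h2>"
soup = BeautifulSoup(html, "html.parser")

urls = []
for h2_tag in soup.find_all("h2"):
    a_tag = h2_tag.find("a")
    if a_tag is not None:  # guard against h2 tags with no <a> child
        urls.append(a_tag.attrs["href"])
print(urls)  # ['/briefing-room']
```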
I would like to say that my first computer was an Osborne 8088 (circa 1981). I began programming in the mid-1980s in dBase, Pascal, and C as a hobby. In the 1980s, we would log into bulletin boards, which were someone's computer with games and other things to trade and share. This was my internet when I was a kid. I just want to say thank you, freeCodeCamp, for bringing the joy back into computer programming for me with such rich and insightful content.
nice
What editor is this? Makes moving around seem so easy
very helpful but I kept falling asleep through it…
I used to be a chef that makes delicious soup, after seeing this I became a programmer that makes beautiful soups.
For some reason, it says ModuleNotFoundError when I try to type in "from bs4 import BeautifulSoup"
He sounds like someone who caught a cold; btw, it was very helpful.
3:43, 4:31, 5:18, 7:51
Amazing vid!
At 18:18, running the program I get the error below, please help:
line 13, in <module>
urls.append(a_tag.attrs['href'])
AttributeError: 'NoneType' object has no attribute 'attrs'
this dude lowkey sounds like charlie/penguinz0
Apparently on the White House website, the first <a> in an <h2> is None, so to avoid the error, before appending a_tag to urls, make sure a_tag is not None:
…
a_tag = …
if a_tag is not None:
    urls.append(a_tag.attrs['href'])
Searching with the "div" tag doesn't work with link.text. I don't think it recognizes a div as text. What do I need to do instead to find keywords?
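One possible reading of the question, sketched below: a <div> does carry text, but only through get_text() (or .text), which joins the strings of all its children, and you can then search that flattened string for keywords:

```python
from bs4 import BeautifulSoup

html = """
<div><a href='/about'>About us</a></div>
<div><p>Contact page</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

# get_text() flattens all nested strings inside each div,
# so keyword searches work on the combined text.
matches = [div for div in soup.find_all("div")
           if "about" in div.get_text().lower()]
print(len(matches))  # 1
```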
How is this done on mac?
It doesn't return any links containing 'about' even with the correct code… any ideas?
If you are a Mac user, use pip3 instead of pip.
urls.append(a_tag.attrs['href'])
AttributeError: 'NoneType' object has no attribute 'attrs'
It is giving me this error while appending to my list. Can anyone help???
Doesn't work
I am getting "AttributeError: 'NoneType' object has no attribute 'attrs'" on urls.append(a_tag.attrs['href'])
This has been helpful
Super helpful that unlike other videos, goes into depth how to actually extract different things. Thank you.
Hi, thanks for the tutorial. I am new to Python and almost totally ignorant. I downloaded Python 3.9.5 and have been trying to use IDLE for web scraping for a long while. You have this bar down there which starts with "NORMAL" and where you enter soup_and_requests.py. I don't have that part. Since I am totally ignorant, I do not know how to get that bar at the very bottom. I would be so grateful if you could answer my stupid question. Thanks so much.
I literally fall at the first hurdle with all of these tutorials:
zsh: command not found: pip
for h2_tag in soup.find_all("h2"):
    a_tag = h2_tag.find("a")
    if a_tag is None:
        urls.append("Empty")
        continue
    urls.append(a_tag.attrs['href'])
How do I create a scraper to track products in a specific Amazon category? Can someone help?
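A rough sketch of the parsing half, with heavy caveats: the "product", "title", and "price" class names below are placeholders, not Amazon's real markup (inspect the page and substitute what it actually uses), and Amazon actively blocks automated requests, so in practice an official API or a headless browser may be needed.

```python
from bs4 import BeautifulSoup

def parse_products(html):
    """Extract (title, price) pairs from a category page.

    The class names here are placeholders -- replace them
    with the selectors the real site actually uses.
    """
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for item in soup.find_all("div", class_="product"):
        title = item.find("span", class_="title")
        price = item.find("span", class_="price")
        if title is not None and price is not None:
            products.append((title.get_text(strip=True),
                             price.get_text(strip=True)))
    return products

# Fetching is left out above; with requests it would look like:
#   headers = {"User-Agent": "Mozilla/5.0"}  # many sites reject the default UA
#   html = requests.get(category_url, headers=headers).text
#   print(parse_products(html))
```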
This is how The Matrix do code.
Is scraping WhatsApp in India illegal???
I really liked the GitHub notes, please make more vids like that.
you just needed Jupyter to make your life easier with this demo 🙂