How to get the contents under a particular column in a table from Wikipedia using soup & python

asked by bsy

I need to get the href links under a particular column of a table on Wikipedia. The page is "". On this page there are several tables with class "wikitable". For each row, I need the link under the Title column, and I would like the links copied into an Excel sheet.

I do not know the exact code for searching under a particular column, but I got this far and I am getting a "'NoneType' object is not callable" error. I am using bs4. I wanted to extract at least some part of the table so I could work out how to narrow down to the href links under the Title column, but I end up with this error. The code is as follows:

from urllib.request import urlopen
from bs4 import BeautifulSoup
soup = BeautifulSoup(urlopen('').read(), 'html.parser')
for row in soup('table', {'class': 'wikitable'})[1].tbody('tr'):
    tds = row('td')
    print(tds[0].string, tds[1].string)

Any guidance would be appreciated.



answered 3 years ago bsy #1

Figured out that the NoneType error was related to the table filtering. Corrected code is below:

import urllib2
from bs4 import BeautifulSoup, SoupStrainer

content = urllib2.urlopen("").read()
# Parse only the wikitable elements instead of the whole page.
filter_tag = SoupStrainer("table", {"class": "wikitable"})
soup = BeautifulSoup(content, "html.parser", parse_only=filter_tag)

# The target cells are centre-aligned, so filter on the align attribute.
for sp in soup.find_all(align="center"):
    a_tag = sp('a')
    if a_tag:
        print a_tag[0].get("href")
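For the original goal — the href under the Title column of each row, saved in a form Excel can open — a more direct approach is to find the Title column's index from the header row and write the links out as CSV. A Python 3 sketch (the inline markup below stands in for the elided page URL, and the "Title" header name is an assumption about the real tables):

```python
import csv
from bs4 import BeautifulSoup

# Sample markup standing in for the elided Wikipedia page.
html = """
<table class="wikitable">
  <tr><th>Year</th><th>Title</th></tr>
  <tr><td>1994</td><td><a href="/wiki/Foo">Foo</a></td></tr>
  <tr><td>1996</td><td><a href="/wiki/Bar">Bar</a></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
links = []
for table in soup.find_all("table", class_="wikitable"):
    # Work out which column is "Title" from the header row.
    headers = [th.get_text(strip=True) for th in table.find("tr").find_all("th")]
    if "Title" not in headers:
        continue
    col = headers.index("Title")
    # Walk the data rows and pull the link from the Title cell.
    for row in table.find_all("tr")[1:]:
        tds = row.find_all("td")
        a = tds[col].find("a") if len(tds) > col else None
        if a and a.get("href"):
            links.append((a.get_text(strip=True), a["href"]))

# CSV opens directly in Excel.
with open("titles.csv", "w", newline="") as f:
    csv.writer(f).writerows([("Title", "Link")] + links)

print(links)  # [('Foo', '/wiki/Foo'), ('Bar', '/wiki/Bar')]
```

Indexing by header name rather than hard-coding a cell position keeps the code working even if the column order differs between the page's tables.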
