Parsing HTML Tables

developerkushal · March 13, 2020, 9:43am

Can anyone help me resolve the issue of extra "\n"s being printed as output?

S18CRX0120 · March 13, 2020, 10:27am

Hey @developerkushal, you need to remove them from each column individually by iterating over the columns; there’s no direct way to do it.

Hope this cleared your doubt.
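
For anyone reading along, here is a rough sketch of that per-column cleanup. It assumes the cell text has already been pulled out of the table into lists of strings; the sample data and the `clean_row` helper are made up for illustration and are not from the tutorial.

```python
# Hypothetical cell text, as it might look after reading each table cell.
raw_rows = [
    ["1\n", "Alice\n", "42\n"],
    ["2\n", "Bob\n", "37\n"],
]


def clean_row(cells):
    """Return a new list with leading/trailing whitespace removed from each cell."""
    return [cell.strip() for cell in cells]


# Iterate over every row, cleaning each column value individually.
cleaned_rows = [clean_row(row) for row in raw_rows]
print(cleaned_rows)
```

`str.strip()` removes whitespace on both sides, including "\n", so it may be more than you want if leading spaces matter in your data.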

developerkushal · March 17, 2020, 6:51am

But the question is why did the "\n"s not show up when Prateek Bhaiya did it in the tutorial?

S18CRX0120 · March 17, 2020, 10:44am

I think you need to use “if idx==0 or idx==3:” instead of “if idx!=0 and idx!=3:”.

developerkushal · March 17, 2020, 10:47am

Sorry, it’s wrong. Try again.

S18CRX0120 · March 17, 2020, 10:49am

Upload your .ipynb notebook to Google Drive and share the link here.

developerkushal · March 17, 2020, 10:57am

Here you go:

https://drive.google.com/file/d/1raTajQuTplExs5F3jXTSeV31560iZwe7/view?usp=sharing

S18CRX0120 · March 17, 2020, 12:25pm

Use this code:

table_rows = []
for row in rows_data:
    current_row = []
    # Collect every <td> cell in this row
    row_data = row.findAll('td', {})
    for idx, data in enumerate(row_data):
        # Drop the last character of the cell text (the trailing "\n")
        current_row.append(data.text[:-1])
    print(current_row)
    table_rows.append(current_row)

It will print the desired results.
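
One caveat about `data.text[:-1]`: it removes the last character whatever it is, so it only works when every cell really ends in exactly one "\n". The sketch below compares it with `rstrip("\n")` on made-up strings; the helper names and sample values are hypothetical, just to show the difference.

```python
def drop_last_char(text):
    """Slice off the final character, as data.text[:-1] does."""
    return text[:-1]


def drop_trailing_newlines(text):
    """Remove only trailing newline characters, however many there are."""
    return text.rstrip("\n")


# Made-up cell values: one newline, no newline, two newlines.
samples = ["Rank\n", "Name", "Score\n\n"]

sliced = [drop_last_char(s) for s in samples]
stripped = [drop_trailing_newlines(s) for s in samples]

print(sliced)    # the cell without a newline loses a real character
print(stripped)  # only the newlines are removed
```

If some cells in your table do not end with a newline, swapping `[:-1]` for `.rstrip("\n")` (or `.strip()`) in the loop should be the safer choice.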

developerkushal · March 17, 2020, 12:36pm

Thank you so much for helping me out!