Reading Financial Statements into Python Pandas – Episode 4

How to read financial statements (here, the income statement) from SimFin data into Python pandas. Episode 4 of the Value Investing Stock Analysis series.



15 thoughts on “Reading Financial Statements into Python Pandas – Episode 4”
  1. Got a heads up from Thomas Flassbeck (founder) of SimFin. He wrote in his email: "You are right with the interaction of the downloaded files and pandas – I changed the format from .xls to .xlsx now, I think it works now. The only thing I had to install additionally was XLRD (conda install xlrd) in order for Pandas to load the file." Thanks Thomas!

  2. The technical issues you dealt with are a goldmine. It is really helpful to see you work through each issue and resolve it. Tons of respect for your Python skill. This episode really taught me a lot about cleaning data in Jupyter notebooks.

  3. Thanks for your contribution to my learning, Taewoo. All of the code examples I have found that analyse stocks seem to look at just one or perhaps several predetermined companies. I am trying to learn how to build my own screener for all stocks. Do you know where to get a CSV file of, for example, all of the S&P 500 stocks?



  4. Thank you! This is very helpful. Does SimFin have an API that lets you download this information directly into Python, rather than going through Excel first? An API would be very useful for downloading stock information in bulk.

  5. Hi Taewoo! Thank you so much for this video. You are truly great.

    I am having trouble around 7:30 of the video, with loading the Excel file into the Python notebook. I can't seem to figure it out. Whenever I run the code
    df = pd.read_excel("NVDA_Q.xlsx")
    df.dropna(how='all', inplace=True)
    it tells me "FileNotFoundError: [Errno 2] No such file or directory: 'NVDA_Q.xlsx'".
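
A minimal sketch for tracking down this kind of FileNotFoundError, assuming the cause is simply that the workbook is not in the notebook's working directory (a relative name such as "NVDA_Q.xlsx" is resolved against that directory). `find_workbook` is a hypothetical helper written for illustration, not part of pandas or SimFin:

```python
from pathlib import Path


def find_workbook(filename, folder=None):
    """Return the absolute path to `filename`, or raise with a helpful message.

    Hypothetical helper: `folder` defaults to the current working directory,
    which is where a bare relative filename is looked up.
    """
    folder = Path(folder) if folder is not None else Path.cwd()
    candidate = folder / filename
    if candidate.is_file():
        return candidate.resolve()
    # List the Excel files that *are* in the folder so typos stand out.
    nearby = sorted(p.name for p in folder.glob("*.xls*"))
    raise FileNotFoundError(
        f"{filename!r} not found in {folder.resolve()}. "
        f"Excel files found there: {nearby or 'none'}"
    )
```

The returned path can then be passed straight to pandas, for example `df = pd.read_excel(find_workbook("NVDA_Q.xlsx"))`. If the file lives elsewhere (say, the Downloads folder), either move it next to the notebook or pass that folder as the second argument.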

  6. Hello, can you please explain why you wrote "Data provided by SimFin"? It is a string, right? Can we write anything in its place, like "xyz"?

    df.loc[df["Data provided by SimFin"] == 'Balance Sheet'].index[0]
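
To illustrate the question above: the string inside `df[...]` is a column label, so it has to match a name that actually exists in `df.columns`. An arbitrary string such as "xyz" would raise a KeyError. The sketch below uses a small made-up DataFrame as a stand-in for the SimFin sheet. It assumes, as the expression in the comment suggests, that the first column ends up labelled "Data provided by SimFin" when the file is loaded. The second column name and all values are invented for the example.

```python
import pandas as pd

# Toy stand-in for the loaded sheet; names other than the first label
# and all numbers are made up for illustration.
df = pd.DataFrame({
    "Data provided by SimFin": [
        "Profit & Loss statement", "Revenues", "Balance Sheet", "Total Assets",
    ],
    "Unnamed: 1": [None, 100.0, None, 500.0],
})

# The label is whatever pandas stored as the column name, so it can be
# read from df.columns instead of being typed by hand.
label = df.columns[0]

# Boolean mask -> matching rows -> index of the first match.
balance_sheet_row = df.loc[df[label] == "Balance Sheet"].index[0]

# A label that is not among the columns cannot be used for selection.
xyz_is_a_column = "xyz" in df.columns
```

Here `balance_sheet_row` is the index of the row whose first cell reads "Balance Sheet", which is useful for slicing the sheet into its separate statements.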

  7. Can you please make a video on how to read the daily P/L statement we receive from a broker and enter it automatically into an Excel sheet? This would help millions of traders, since everyone keeps a journal and it consumes most of our time.

  8. I resolved my "KeyError" by opening/saving with Excel. It seems to work if you don't have access to a local install.

    As for the mysterious extra row Dutch mentioned, which seems to magically disappear between 15:25 and 15:27, it'd be nice if you could address how you get rid of it. The method mentioned by Dutch works, but if you run all cells repeatedly, it will keep trimming rows and break the idempotence of the notebook.

    It appears that in the notebook in your GitHub repo this issue was never actually addressed. It has both problems I ran into: a duplicate "in million USD", as well as a second header row titled "2", which contains the quarters.
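
One way to keep such a cleanup idempotent, as a sketch only: drop rows by what they contain rather than by position, and return a new DataFrame instead of modifying in place. The toy data below assumes the stray row is an exact copy of the column labels. The real SimFin layout may differ, and `drop_repeated_header` is a hypothetical helper, not the method used in the video.

```python
import pandas as pd


def drop_repeated_header(frame):
    """Return a copy of `frame` without rows that merely repeat the column labels."""
    labels = [str(c) for c in frame.columns]
    # True for each row whose cells, read as text, equal the column labels.
    is_header_copy = frame.apply(
        lambda row: [str(v) for v in row] == labels, axis=1
    )
    return frame.loc[~is_header_copy].reset_index(drop=True)


# Made-up example: the first data row duplicates the header.
raw = pd.DataFrame({
    "in million USD": ["in million USD", "Revenues", "Net Income"],
    "Q1": ["Q1", 100, 20],
    "Q2": ["Q2", 110, 25],
})

clean = drop_repeated_header(raw)
clean_again = drop_repeated_header(clean)  # a second run removes nothing more
```

Because the filter is a condition on row content, re-running the cell leaves an already clean table unchanged, unlike slicing off the first row each time.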

  9. Thanks for the video. One question: in Out[22], around 15:25, the column headers are repeated as row two of your data. Cut to about 15:27 and Out[27], and that row has been removed. Would you mind touching on where and how you changed your code to stop that first row from repeating? Thanks!
