Sometimes you want to store your model output in a dataframe, for example when running grouped regressions, so that you have multiple regression outputs to review and sort.
But while using pmdarima’s auto_arima() function, I was surprised to learn that the fitted model has no summary2() like the one you’ll find in statsmodels. Instead you get that same retro block of text that has long been familiar to users of Python’s statsmodels and R’s glm():
data:image/s3,"s3://crabby-images/4e251/4e251b0869be1df17e1cdff8b77a1e13f4060620" alt=""
To be sure, you can extract some of this data through the summary’s .tables property, which is a list of tables. For example, .tables[1] returns
data:image/s3,"s3://crabby-images/11f81/11f81b181b8bc548e31ba13d4803f0d8cef50e1a" alt=""
which looks great! You can pass this to pd.DataFrame() and get a dataframe (albeit one with integer column names instead of the table’s headers, so you can’t reference the columns by name).
But… the individual cell values in that dataframe are of type statsmodels.iolib.table.Cell
data:image/s3,"s3://crabby-images/3fdd9/3fdd9176e797dc207d19a61fb6e2303eedfd178d" alt=""
which, in my experience, can’t be cast to a usable type, assigned as a series in a new dataframe, or really used for anything at all.
Fortunately, the Cell object has a .data attribute that holds the underlying value, so you can use it to extract the individual items.
TL;DR: to save you the same headache, here’s a function to extract the cell values from the Summary object of a pmdarima auto_arima() model:
data:image/s3,"s3://crabby-images/7d899/7d899a93727756dee803b6fbf7e53272489e757c" alt=""
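In text form, a helper along these lines might look like the sketch below. It is not necessarily identical to the function in the screenshot: the names table_to_rows and table_to_frame are made up for illustration, and it assumes only what was described above, namely that a table iterates over rows, each row iterates over cells, and each cell keeps its raw value in .data. It also assumes the first row of the table is the header, which seems to hold for the coefficient table in .tables[1] but may not for the others.

```python
def table_to_rows(table):
    """Unwrap a SimpleTable-like object into a plain list of lists.

    Assumes `table` iterates over rows and each row iterates over cells
    that expose their raw value as `.data`.
    """
    rows = []
    for row in table:
        values = []
        for cell in row:
            value = cell.data
            # Strip padding from string values; leave other types untouched.
            values.append(value.strip() if isinstance(value, str) else value)
        rows.append(values)
    return rows


def table_to_frame(table):
    """Build a DataFrame, treating the first row as the header (an assumption)."""
    import pandas as pd  # imported here so table_to_rows works without pandas

    rows = table_to_rows(table)
    return pd.DataFrame(rows[1:], columns=rows[0])


# Hypothetical usage, with `model` being a fitted auto_arima() result:
# coef_df = table_to_frame(model.summary().tables[1])
```

The extracted values may well come back as formatted strings rather than floats, so you might still need something like pd.to_numeric() on the numeric columns before sorting or filtering.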
Have a nice day.