Excel Pandas LATEX PDF
2014-10-28
###CSV data into pandas and LATEX then into PDF and then print!###
- To import csv file with date into pandas data frame. Pandas can parse the date column, but it won’t be correct. The date column formate has top be mentioned and parsed for this, like below based on [this], then this can be imported into pandas data frame indicating the date parser.
dateparse = lambda x: pd.datetime.strptime(x, ‘%d/%m/%y’) pro=pd.readcsv(‘PROJECTEXPNDTR1.csv’,parsedates=[1],date_parser=dateparse)
dateparse = lambda x: pd.datetime.strptime(x, ‘%m/%d/%y’) pro=pd.readcsv(‘PROJECTEXPNDTR1.csv’,parsedates=[1],date_parser=dateparse)
- To concat two pandas data frame as row wise
pro0=pandas.concat([pro,pro1],axis=0)
- To fill empty cells in pandas data frame with zero values.
pro2=pro0.fillna(0)
- To drop unwanted columns
pro3=pro2.drop([‘Total’,‘Unnamed: 10’],axis=1)
- To create new column with values summed over different columns
pro3[‘Total’]=pro3[‘Permanent Equipment’]+pro3[‘Consumables’]+pro3[‘Travel ‘]+pro3[‘Remote Deployment’]+pro3[‘Site Preparation’]+pro3[‘Other Project Works’]
- To sort the dataframe with date wise
pro4=pro3.sort([‘Date’])
- To set the date column as index and remove unwanted seconds values in date column
pro5=pro4.set_index([‘Date’])
- To give s.no column to data frames first has to create a numpy array with the length of data frames as
import numpy count = numpy.arange(1,149,1)
then attached this array into dataframe as s.no
pro6[’S.no’] = count
- To reorder the dataframe and kept s.no column as the first column based on this
cols = list(pro6.columns.values)
pro6=pro6[[’S.no’,‘Particulars’, ‘Permanent Equipment’, ‘Consumables’, ‘Travel ‘, ‘Remote Deployment’, ‘Site Preparation’, ‘Other Project Works’, ‘Total’]]
- To get a total row for values, there is a way with specifying margin in the pandas pivot table, but it is giving the error. Based on this done that sum all the row in data frame by
sum_row = {col: pro6[col].sum() for col in pro6}
convert the resultant dict into data frame
sumdf = pd.DataFrame(sumrow, index=[“Total”])
drop unwanted columns from the data frame
sumrow1=sumdf.drop([‘Particulars’,’S.no’],axis=1)
Then append the data frame with old data frame
pro7=pro6.append(sum_row1)
reorder the datafarme
pro8=pro7[[’S.no’,‘Date’,‘Particulars’, ‘Permanent Equipment’, ‘Consumables’, ‘Travel ‘, ‘Remote Deployment’, ‘Site Preparation’, ‘Other Project Works’, ‘Total’]]
- To convert the pandas dataframe into latex table formate based on earlier not on this topic
with open(“pro_expnd.tex”, “w”) as f: f.write(”\begin{tabular}{” + “ | “.join([“l”] * len(pro8.columns)) + “}\n”) columnLabels = [”\textbf{%s}” % label for label in pro8.columns] f.write (“%s\\\hline\n” % “ & “.join(columnLabels)) for i, row in pro8.iterrows(): f.write(” & “.join([str(x) for x in row.values]) + “ \\\n”) f.write(”\end{tabular}“) f.close()
- To convert this table into pdf as latex documnet,
\documentclass{article}
\usepackage[a4paper,margin=1in,landscape]{geometry}
\usepackage{longtable}
\usepackage{tabulary}
\usepackage{blindtext}
\begin{document}
%% for adjust columns size\begin{longtable}{l | c | p{5cm} | p{2cm}| p{2cm} |p{2cm}| p{2cm}| p{2cm} |p{2cm} | p{2cm} | p{2cm}}%%
%%%python gernated tex file content column heading%%
\endhead
%%%python gernated tex file content rows%%
\end{longtable}
\end{document}
- The final table doesn’t contain Date column, it is due to it is an index in pandas data frame