Timeseries Pandas
2014-10-28
###Time series data plot and table creation with pandas and LaTeX###
- To import the CSV file into pandas:
import pandas as pd
df = pd.read_csv('/home/swl-sacon-dst/Documents/GISE_2013/LAB/Aerocet_DATA/TDM/TDM_MASS_20102014_171059-073359.csv')
- To specify a datetime index in the DataFrame:
df = df.set_index(pd.DatetimeIndex(df['Time']))
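As a minimal sketch of the indexing step, the snippet below reads a small in-memory CSV and sets the Time column as a DatetimeIndex. The three rows and their values are made up for illustration and are not taken from the Aerocet file; the sketch also assumes the timestamps are in a format pandas can parse without an explicit format string.

```python
import io
import pandas as pd

# Hypothetical sample rows standing in for the real CSV file
csv_text = """Time,PM2.5(ug/m3),PM10(ug/m3)
2014-10-20 17:10:00,12.5,30.1
2014-10-20 17:11:00,13.0,31.4
2014-10-20 17:12:00,12.8,29.9
"""

df = pd.read_csv(io.StringIO(csv_text))
# Same call as in the note: parse the Time strings into a DatetimeIndex
df = df.set_index(pd.DatetimeIndex(df['Time']))

print(type(df.index).__name__)
print(df.index[0])
```

Note that Time stays in the frame as an ordinary text column after this call. Passing parse_dates=['Time'] and index_col='Time' to pd.read_csv should give the same index in one step while removing Time from the columns.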
- To resample the 1-minute data into 15-minute bins by averaging:
bars = df.resample('15min').mean(numeric_only=True)
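Here the default method in older pandas versions was the mean; recent versions return a resampler object, so the aggregation is named explicitly with .mean().
- To select specific columns in pandas: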
df1=df[['Time','PM2.5(ug/m3)','PM10(ug/m3)','TSP(ug/m3)','AT(C)','RH(%)']]
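The resampling and column-selection steps can be tried on synthetic data. The sketch below uses 30 made-up one-minute readings (the values 0 to 29, chosen so the bin means are easy to check by hand) rather than the real measurements.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 30 one-minute readings with values 0, 1, ..., 29
index = pd.date_range('2014-10-20 17:00', periods=30, freq='min')
df = pd.DataFrame({'PM2.5(ug/m3)': np.arange(30, dtype=float),
                   'PM10(ug/m3)': np.arange(30, dtype=float) * 2},
                  index=index)

# Average every 15-minute bin; the aggregation is named explicitly
bars = df.resample('15min').mean()

# Keep only the column of interest (double brackets return a DataFrame)
pm25 = bars[['PM2.5(ug/m3)']]
print(pm25)
```

The 30 rows collapse into two bins, labelled 17:00 and 17:15, each holding the mean of its 15 readings.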
- To plot a column of the DataFrame:
import matplotlib.pyplot as plt
bars['PM2.5(ug/m3)'].plot(marker='v')
To show the legend, call plt.legend(); then display the plot with pylab.show() (after import pylab) or, equivalently, plt.show().
Two methods can be followed. The first uses matplotlib directly:
import matplotlib.pyplot as plt
# db is a DataFrame with a DatetimeIndex, e.g. the resampled data from above
plt.plot(db.index, db['PM2.5(ug/m3)'], 'r-')
plt.show()
The second uses the pandas plot functionality and needs pylab to display the figure:
import pylab
db.plot(style=['o','rx'])
# Add the legend before showing the figure
pylab.legend()
pylab.show()
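Both plotting methods can be compared side by side in a script that runs without a display. The sketch below is an illustration under assumptions: the data are made up, the Agg backend is selected so nothing needs a screen, and the figure is written to an in-memory buffer instead of being shown.

```python
import io
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical 15-minute series standing in for the resampled data
index = pd.date_range('2014-10-20 17:00', periods=8, freq='15min')
bars = pd.DataFrame({'PM2.5(ug/m3)': np.linspace(10, 17, 8),
                     'PM10(ug/m3)': np.linspace(25, 39, 8)}, index=index)

fig, (ax1, ax2) = plt.subplots(2, 1)

# Method 1: matplotlib directly, passing the DatetimeIndex as x
ax1.plot(bars.index, bars['PM2.5(ug/m3)'], 'r-', label='PM2.5(ug/m3)')
ax1.legend()

# Method 2: pandas plotting, drawn onto its own axes
bars['PM10(ug/m3)'].plot(ax=ax2, style='bx', label='PM10(ug/m3)')
ax2.legend()

fig.autofmt_xdate()
buf = io.BytesIO()
fig.savefig(buf, format='png')  # render to memory instead of calling show()
```

Keeping the two methods on separate axes avoids mixing matplotlib's and pandas' date handling on one x-axis.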
- To round the resampled data to two decimals:
data11=np.round(data1, decimals=2)
Putting these steps together:
import pandas as pd
import numpy as np
df = pd.read_csv('/home/swl-sacon-dst/Documents/GISE_2013/LAB/Aerocet_DATA/TDM/TDM_MASS_20102014_171059-073359.csv')
df = df.set_index(pd.DatetimeIndex(df['Time']))
df1 = df[['Time','PM2.5(ug/m3)','PM10(ug/m3)','TSP(ug/m3)','AT(C)','RH(%)']]
df1 = df1.set_index(pd.DatetimeIndex(df['Time']))
bars = df1.resample('15min').mean(numeric_only=True)
data2 = np.round(bars, decimals=2)
####To print a table from the DataFrame####
1. Based on the earlier note on converting a pandas DataFrame into LaTeX, the following script was used to generate the TeX from the DataFrame:
import pandas as pd
import numpy as np
df = pd.read_csv('TDM_MASS_20102014_171059-073359.csv')
df = df.set_index(pd.DatetimeIndex(df['Time']))
df1 = df[['Time','PM2.5(ug/m3)','PM10(ug/m3)','TSP(ug/m3)']]
# Average into 15-minute bins; numeric_only drops the text 'Time' column
data1 = df1.resample('15min').mean(numeric_only=True)
data11 = np.round(data1, decimals=2)
# Round-trip through CSV so the time index comes back as an ordinary column
data11.to_csv('TDM_15min.csv')
data2 = pd.read_csv('TDM_15min.csv')
with open("my_table.tex", "w") as f:
    f.write("\\begin{tabular}{" + " | ".join(["l"] * len(data2.columns)) + "}\n")
    columnLabels = ["\\textbf{%s}" % label for label in data2.columns]
    f.write("%s\\\\\\hline\n" % " & ".join(columnLabels))
    for i, row in data2.iterrows():
        f.write(" & ".join([str(x) for x in row.values]) + " \\\\\n")
    f.write("\\end{tabular}")
# The with block closes the file, so no explicit f.close() is needed
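One thing to watch when writing the table by hand is that column names such as RH(%) contain characters LaTeX treats specially (an unescaped % starts a comment). The sketch below builds the same kind of tabular as a string and escapes a few such characters. The helper names latex_escape and to_tabular are hypothetical, the escape list is not exhaustive, and the two-row DataFrame is made up for illustration.

```python
import pandas as pd

def latex_escape(text):
    """Escape a few characters LaTeX treats specially (not an exhaustive list)."""
    for char in ('%', '_', '&', '#'):
        text = text.replace(char, '\\' + char)
    return text

def to_tabular(frame):
    """Return a LaTeX tabular environment for a DataFrame, one line per row."""
    lines = ['\\begin{tabular}{' + ' | '.join(['l'] * len(frame.columns)) + '}']
    header = ['\\textbf{%s}' % latex_escape(str(c)) for c in frame.columns]
    lines.append(' & '.join(header) + ' \\\\ \\hline')
    for _, row in frame.iterrows():
        lines.append(' & '.join(latex_escape(str(x)) for x in row.values) + ' \\\\')
    lines.append('\\end{tabular}')
    return '\n'.join(lines)

# Hypothetical two-row table
data2 = pd.DataFrame({'Time': ['2014-10-20 17:00:00', '2014-10-20 17:15:00'],
                      'RH(%)': [61.25, 60.8]})
table = to_tabular(data2)
print(table)
```

The returned string can be written to my_table.tex in the same way as above. pandas also provides DataFrame.to_latex(), which may be simpler when the default layout is acceptable.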