EmisInv REAS Pandas
2014-09-14
###Working on REAS emission inventory to feed into WRF-CHEM###
1. REAS is an emission inventory for Asia. It provides a separate file for each pollutant and each emission source category. For example, for the year 2008 the pollutant SO2 has 9 data files, one per source type (Aviation, Domestic, Industry, etc.).
1. The files are space-separated text files with fields (columns) for longitude, latitude and the monthly emission values.
1. I used the Python library pandas to import the data into Python and to combine the source category files (9 files) into one single file.
1. To import the REAS 2.1 files into Python as pandas data frames:
import pandas as pd
a1 = pd.read_csv('REASv2.1_BC__AVIATION_2008_0.25x0.25',skiprows=10,delim_whitespace=True,header=None)
# to drop the first empty column in dataframe
a1f=a1.drop(0,axis=1)
# to name the columns of data frame
a1f.columns=['lon','lat','m1','m2','m3','m4','m5','m6','m7','m8','m9','m10','m11','m12']
a2 = pd.read_csv('REASv2.1_BC__DOMESTIC_2008_0.25x0.25',skiprows=10,delim_whitespace=True,header=None)
a2f=a2.drop(0,axis=1)
a2f.columns=['lon','lat','m1','m2','m3','m4','m5','m6','m7','m8','m9','m10','m11','m12']
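The read, drop and rename steps above repeat for every file, so they could be wrapped in a small helper. This is only a sketch: the name `read_reas` is hypothetical, and it assumes each file has 10 header lines followed by rows of longitude, latitude and 12 monthly values, as described above. It uses `sep=r'\s+'`, which pandas documents as equivalent to `delim_whitespace=True`, and keeps the last 14 columns so that it works whether or not the parser produced an empty leading column.

```python
import pandas as pd

# Column names used in this post: lon, lat, then one column per month.
COLUMNS = ['lon', 'lat'] + ['m%d' % i for i in range(1, 13)]


def read_reas(path, skiprows=10):
    """Read one gridded REAS text file into a DataFrame with named columns.

    Hypothetical helper; the 10-line header is an assumption taken from
    the read_csv calls above.
    """
    df = pd.read_csv(path, skiprows=skiprows, sep=r'\s+', header=None)
    # Keep the last 14 columns (lon, lat, 12 months). If an empty leading
    # column was parsed, this drops it; otherwise nothing is lost.
    df = df.iloc[:, -len(COLUMNS):].copy()
    df.columns = COLUMNS
    return df
```

With this helper, `a1f = read_reas('REASv2.1_BC__AVIATION_2008_0.25x0.25')` would replace the three separate steps.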
# to compare the two dataframes for equality
(a1f==a2f).all()
# this ended in an error because the two dataframes must have the same size and labels:
# "Can only compare identically-labeled DataFrame objects"
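The `==` operator only works when both frames carry the same index and column labels, which is why frames with different row counts raise this error. A minimal sketch of a check that does not raise, using small made-up frames (`left`, `right`) in place of the REAS data: `DataFrame.equals` simply returns `False` when shapes or values differ.

```python
import pandas as pd

# Two made-up frames with different numbers of rows, standing in for
# the aviation and domestic data frames.
left = pd.DataFrame({'lon': [60.125, 60.375],
                     'lat': [10.125, 10.125],
                     'm1': [1.0, 2.0]})
right = pd.DataFrame({'lon': [60.125],
                      'lat': [10.125],
                      'm1': [1.0]})

try:
    (left == right).all()
    raised = False
except ValueError as err:
    # pandas refuses to compare frames whose labels differ
    raised = True
    print(err)

# equals() compares shape and values and does not raise on a size mismatch
same = left.equals(right)
print(left.shape, right.shape, same)
```

Checking `shape` first is a cheap way to see that the two source categories cover different sets of grid cells.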
a2 = pd.read_csv('REASv2.1_BC__INDUSTRY_2008_0.25x0.25',skiprows=10,delim_whitespace=True,header=None)
REASv2.1_BC__INTNNV_2008_0.25x0.25
REASv2.1_BC__OTHER_TRANSPORT_2008_0.25x0.25
REASv2.1_BC__POWER_PLANTS_NON-POINT_2008_0.25x0.25
REASv2.1_BC__POWER_PLANTS_NON-POINT_JPN_2008_0.25x0.25
REASv2.1_BC__POWER_PLANTS_POINT_2008
REASv2.1_BC__ROAD_TRANSPORT_2008_0.25x0.25
import os
# list every file in the BC 2008 directory
fl=os.listdir("/home/swl-sacon-dst/Documents/GISE2013/LAB/WRF-chem/Data/REAS/BC/2008/")
df1=a1f.set_index('lon')
df2=a2f.set_index('lon')
# print one read_csv statement per file
for fname in fl:
    print("a2 = pd.read_csv('%s',skiprows=10,delim_whitespace=True,header=None)" % (fname))
# compare two dataframes after filling NaN and sorting the columns by label
def equal(df1, df2):
    return df1.fillna(1).sort_index(axis=1).eq(df2.fillna(1).sort_index(axis=1)).all().all()
equal(df1, df2)
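Instead of printing one `read_csv` statement per file, the same loop could load every file straight into a dictionary. This is a sketch under assumptions: the helper name `load_directory` and its `suffix` parameter are hypothetical, and it assumes every gridded file in the directory shares the 10-line header used above. Filtering on the `0.25x0.25` suffix leaves out files that are not on the grid.

```python
import os
import pandas as pd


def load_directory(folder, suffix='0.25x0.25', skiprows=10):
    """Return {file name: DataFrame} for each file in folder ending with suffix.

    Hypothetical helper; suffix and skiprows defaults are assumptions
    based on the file names and read_csv calls in this post.
    """
    frames = {}
    for name in sorted(os.listdir(folder)):
        if not name.endswith(suffix):
            continue  # skip files that do not match the gridded naming
        path = os.path.join(folder, name)
        frames[name] = pd.read_csv(path, skiprows=skiprows,
                                   sep=r'\s+', header=None)
    return frames
```

Keying the dictionary by file name keeps the source category visible when the frames are compared or combined later.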
- For a given species, the files have a different number of rows for each source category. The rows of each category have to be matched against every other category before the values can be added. This looked like a huge task, and it was hard to work out the routine needed for it.
- Being unable to visualize the data in Python, and having to fall back on QGIS, made it harder to proceed with the above routine. This led me to search for a NetCDF version of the REAS data. I found one, which removed the need for the huge task above. Moreover, this source has the Asian emission inventory for the year 2010.
- Related to this, I found that NetCDF files can be accessed without the osgeo library. Based on this
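For reference, the row-matching problem described in the first point above can be handled in pandas without writing the matching routine by hand. This is not what was done here (the NetCDF route was taken instead); it is a sketch with made-up values, and the function name `combine_sources` is hypothetical. It assumes coordinates are written identically in every file, so that equal grid cells have exactly equal `lon` and `lat` values.

```python
import pandas as pd


def combine_sources(frames, keys=('lon', 'lat')):
    """Sum emissions over all source categories for each grid cell.

    Frames may have different numbers of rows; a cell missing from one
    category simply contributes nothing to the sum.
    """
    stacked = pd.concat(frames, ignore_index=True)
    return stacked.groupby(list(keys), as_index=False).sum()


# Made-up data standing in for two source categories of different length.
aviation = pd.DataFrame({'lon': [60.125, 60.375],
                         'lat': [10.125, 10.125],
                         'm1': [1.0, 2.0]})
domestic = pd.DataFrame({'lon': [60.375, 60.625, 60.875],
                         'lat': [10.125, 10.125, 10.125],
                         'm1': [10.0, 20.0, 30.0]})

total = combine_sources([aviation, domestic])
print(total)
```

Stacking and grouping avoids any explicit alignment step: only the cell at `lon` 60.375 appears in both frames, so only that cell is summed across categories.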