IStSOS Data formating using Python
2014-05-10
Use Python with SQLite, istsos data formate and timeseries subsampling (downscaling).
With sqlite to convert a list, for example cur.fetch from database like sqlite,
['2014-04-30T10:25,2797,147', '2014-04-30T10:27,2639,174', '2014-04-30T10:29,2645,158', '2014-04-30T10:31,2676,149']
Use print “\n”.join(b) based on this gives
"2014-04-30T10:25,2797,147 2014-04-30T10:27,2639,174 2014-04-30T10:29,2645,158 2014-04-30T10:31,2676,149"
To remove double quotes from above to write into a.DAT, tried almost two hours then find out that the used method will not do this. The full code is as follows with uncommented lines showing failed attempts. Double # are comments
import csv import sqlite3 as lite #import string #identity = string.maketrans() #import simplejson #import os outfile_path='GVR.DAT' #text_file = open("GVR.DAT", "w") #writer = csv.writer(open(outfile_path, 'w'), quotechar='"',quoting=csv.QUOTE_NONE) ##it gives extra hand for defining the csv output, sticked much in this but not given the results #writer = csv.writer(open(outfile_path, 'w'),dialect='excel') #lineterminator='\n') #headers=['urn:ogc:def:parameter:x-istsos:1.0:time:iso8601','urn:ogc:def:parameter:x-istsos:1.0:cbe:dylos:np05mg','urn:ogc:def:parameter:x-istsos:1.0:cbe:dylos:np25mg'] #text_file.write('urn:ogc:def:parameter:x-istsos:1.0:time:iso8601,urn:ogc:def:parameter:x-istsos:1.0:cbe:dylos:np05mg,urn:ogc:def:parameter:x#-istsos:1.0:cbe:dylos:np25mg') #writer.writerow(headers) #text_file.close ##follwoing comments are for accessing sqlite con = lite.connect('GVR.db') cur = con.cursor() cur.execute("SELECT * FROM data") a= [str(i[1]) for i in cur.fetchall()] ##without maping it produce file with alternative line kept empty b = map(str.strip,a) #c= str(b).strip("[']") #e= [str(i) for i in b] #d="\n".join(b) #print d #print d #e = map(str.strip,b) #e=d.translate(None, '"') #e = d.translate(identity, '",') #e = d[1:-1] #print "[%s]" % (','.join(e)) #f = map(lambda x: x.strip('"'), d) #writer.writerow([d]) #for item in e: # writer.writerow([item,].replace'"') #csvwriter.writerow(JD.split()) #e='map' #text_file = open("GVR.DAT", "w") #text_file.write(e) #text_file.close #writer = csv.writer(open("GVR.DAT", "wb"), quoting=csv.QUOTE_NONE) #reader = csv.reader(open("GVR.DAT", "rb"), skipinitialspace=True) #writer.writerows(reader) #for val in e: # writer.writerow([val]) #writer.writerow(d) f = open('output.txt', 'w') f.write('urn:ogc:def:parameter:x-istsos:1.0:time:iso8601,urn:ogc:def:parameter:x-istsos:1.0:cbe:dylos:np05mg,urn:ogc:def:parameter:x-istsos:1.0:cbe:dylos:np25mg'+"\n") ##this simple comments alveate two double quote removal for two hours f.write("\n".join(a)) #simplejson.dump(d, f) f.close()
Write CSV method is not at all working, tried to write using file method and worked fine.
Istsos date formate
- The Istsos requires to date and time in this ( 2014-04-30T11:25:00.000000+0530) format; the data from Dylos is cut shorten to reduce size. So to edit the column it has to convert into a data frame in python.
Pandas data frame conversion based on this
col_names=["date","np05","np25"] data = pa.read_csv("GVR.DAT", header=None,skiprows=1, parse_dates="date",names=col_names) Read the DAT file created in the earlier step. data["date"]=pa.to_datetime(data["date"]) data["date"] = data['date'].apply(lambda x: x.strftime("%Y-%m-%dT%H:%M:%S.000000+0530")) calling specific date column, then converting into desired date formate.
Time scaling
Based on this and this has to reindex the dataframe into datetimeindex as follows
data2 = data.set_index(pa.DatetimeIndex(data['urn:ogc:def:parameter:x-istsos:1.0:time:iso8601']))
then using this index to resample data frame for every fifteen minutes last data, the default resampling is ‘mean.’
data3=data2.resample('15Min', how='last')
Then set the index to the date columns. Otherwise, it writes one extra column into a data file
data4=data3.set_index('urn:ogc:def:parameter:x-istsos:1.0:time:iso8601')
then to make a file out of this data frame
data4.to_csv('GVR1.DAT', sep=',', encoding='utf-8')