Sunday, January 20, 2013

Project 1: Home Energy Monitor with Arduino, Part 3/3: Data

In the two previous articles, a sensor box and related software were developed for the Arduino microcontroller board, providing measurements of home energy usage by observing the rate of the energy meter's LED blinks (digital meter) or disk revolutions (analog meter). In this article, we'll focus on the data itself and learn how to use a PC for logging sensor streams to local files, how to create nice plots and statistics from sensor data captures, and how to upload data to the cloud (Cosm) for everyone to see. In the spirit of openness and open source, the PC software will be written in the Python programming language on the Ubuntu Linux operating system.


LEVEL: Beginner




Motivation


So far we have developed a sensor box and sensor sampling software for Arduino, using a home energy meter as a data source. Hence, we have a setup that provides measurements of current energy consumption once per second, ready to be plugged into a USB port of a PC. However, all this will be of little use without a piece of software that stores the data stream into a file. And again, these data files are pretty pointless without a means for visualizing their content as graphs, or scripts that calculate interesting statistics from the tens of thousands of individual samples.

We must start by making a few decisions. The first thing is to decide which programming language to use on the PC side. For hobby projects - and why not for professional work as well - I recommend the Python programming language, because it is pretty easy to learn, comes with all kinds of great libraries, and in general allows you to do wonders with just a few lines of code. Over the years I have learned to value that last part, especially in prototyping and hobby projects: the fewer lines of code you have, the easier it is to comprehend and memorize the whole thing, which at least makes me much more productive.

Python runs on many different platforms, so you can choose any one that you like - probably Windows, Mac or Linux. Personally I prefer open source solutions, especially in home tinkering projects, so I choose Linux. Since this article is written with beginners in mind, we'll go with the Ubuntu flavor. However, to stay focused on sensors, I won't go through setting up the operating system or installing Python on it - there are plenty of resources available for both, and you know how to use Google. Besides, in Ubuntu it's already set up - just type python in a console to start the interactive shell. If you prefer to use Windows or Mac, that won't be a problem as Python is easy to install.

We are fortunate in that truly excellent tools are available for Python free of charge as add-on libraries: via pySerial we get access to the computer's serial ports for reading the data stream from Arduino. With the aid of the Matplotlib library, we can create incredibly useful plots with just a few lines of code. Statistics and general number crunching will be handled with ease via the Numpy library. These tools have vast feature sets; we will barely scratch the surface. However, you should find that they are also rather simple to use, and you can easily expand the examples given here to create fancier stuff on your own - which is the goal of this article series. Now, let's get going.



Printing the Data Stream


What we need first is access to the serial data sent from Arduino to the PC. In Linux 'everything is a file', including serial ports - so basically you can read Arduino's data stream from a special file that is in fact mapped to a serial port. However, a more generic solution is to install the pySerial library, which works well on all common operating systems. For example, in Ubuntu Linux you can install it with this command:
sudo apt-get install python-serial

Next, start your favorite code editor (I use Eclipse with PyDev). Create a new file called Printer.py, so that we can start writing code. Then type or copy these lines:

import serial

serial_port = serial.Serial('/dev/ttyUSB0', 9600)
while 1:
    print serial_port.readline().strip()
serial_port.close()

Here we import the serial library and open the serial port that the Arduino is connected to. In Ubuntu it's probably /dev/ttyUSB0, while in Windows it could be something like COM1, and on a Mac you need to look for /dev/tty.usbserial. Check, e.g. from the Arduino IDE, which serial port the Arduino board is connected to, and use that name. Also, we must use the same baud rate here that we selected in the Arduino code in the previous article: 9600 bps. Otherwise the serial data will get corrupted. Next we have a simple eternal loop, which prints the lines read from the serial port - after stripping the line end character, as Python's print command will add one anyway.

Save the file, and in case you're on Linux add execution rights to it:
chmod +x Printer.py

Then run it:
python Printer.py

Now we have re-implemented the same kind of console view that the Arduino IDE provides, using just a few lines of Python code! You can stop it by pressing CTRL+C.
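
By the way, if you are not sure which port name to use on your system, pySerial itself can list the serial ports it sees. This is just a small sketch, assuming a reasonably recent pySerial version where the serial.tools.list_ports helper is included:

import serial.tools.list_ports

# Print the device name and a short description for each detected serial port.
for port_info in serial.tools.list_ports.comports():
    print port_info[0], '-', port_info[1]

Run it once with the Arduino plugged in and once without; the port that appears or disappears is the one you want.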


Logging into a File


In order to save the data stream into a file, create a new code file called Logger.py. We'll use the code above as a basis but add a few more lines to open a file and direct the stream there. I've chosen the CSV file format (comma-separated values), which is common for storing measured values; you can open these files for example in Microsoft Excel or LibreOffice Calc. If you run the following code, you should get a file energy.csv that contains the measured number of blinks per second, one value per line. The file is updated after each sample received from Arduino.

import serial

serial_port = serial.Serial('/dev/ttyUSB0', 9600)
data_file = open("energy.csv", "w")
while 1:
    data_file.write(serial_port.readline())
    data_file.flush()
data_file.close()
serial_port.close()

Save the file, and in case you're on Linux add execution rights to it:
chmod +x Logger.py

Then run it:
python Logger.py

However, in this form the data is not yet very useful. It'd be a good idea to at least 1) support giving the target folder as a parameter, 2) add a timestamp to each measurement, so that we know when these LED blinks were measured, and 3) start a new file, for example, every midnight. This way we keep the file sizes reasonably small, avoid writing a long timestamp for each measurement (as the date can be seen from the file name), and don't need to mix any metadata fields into the file. Here's an improved version with many tweaks:

import serial, datetime, os, sys

serial_port = serial.Serial('/dev/ttyUSB0', 9600, timeout=10)
log_path = os.path.abspath(sys.argv[1]) if len(sys.argv) > 1 else '.'
if not os.path.exists(log_path): os.makedirs(log_path)
data_file = None; last = None

while 1:
    data = serial_port.readline()
    now = datetime.datetime.now()
    
    if last is None or now.day != last.day:
        if data_file is not None:
            data_file.flush(); data_file.close()
        file_name = os.path.join(log_path, now.strftime('%Y%m%d_energy.csv'))
        data_file = open(file_name, 'a', 1024)
    
    data_file.write(now.strftime('%H%M%S,') + data)
    last = now

First, a 10 second timeout is set for the serial port, so that reads won't block forever in case the Arduino is not connected or stops sending. Next, we check if any command line parameters were given, and if so, use the first argument as the target folder. We also automatically create any missing folders. If no parameters are given, the current folder is used for output. We fetch the current time for each sample using Python's datetime object. We also store it, so that we can detect the moment when the day changes, and then flush and close the current file and open another one for the new day that just started. The date is stored only in the file name, while the clock time is written for each sample, using a comma ',' as a separator between the time and sample columns. The file is now opened in append mode, so that we can continue writing data to the same file in case logging stops for a while because of some incident. Flushing after each write has also been removed, and a buffer size for automatic flushing added (i.e. data is written to the file whenever there is at least 1024 bytes of it waiting).

So there it is, a nice logger with lots of features, written in Python in just 15 lines of code! Notice that we could optimize the timestamps away and save disk space by including the start time in the file name, as each new line in the file corresponds to "one second later", i.e. the line number equals the elapsed time. However, then we couldn't notice if logging restarts, i.e. if there are occasional gaps in the data. A compromise would be to use the number of seconds elapsed from the file start time (mentioned in the file name) as a running timestamp. But disk space is cheap, and a clearly expressed timestamp makes viewing and processing the data easier.
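
Since we decided to keep the explicit timestamps, spotting such gaps is easy to automate. Here is a minimal sketch of a gap checker, assuming the HHMMSS,count line format written by the logger above (the file name given on the command line is just an example):

import sys, datetime

previous = None
for line in open(sys.argv[1]):  # e.g. 20121216_energy.csv
    stamp = line.split(',')[0]
    current = datetime.datetime.strptime(stamp, '%H%M%S')
    if previous is not None and (current - previous).seconds > 1:
        print 'Gap of %d seconds before %s' % ((current - previous).seconds, stamp)
    previous = current

If the script prints nothing, the file has no missing seconds.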


Autostart


Since we typically want to continue logging 24/7/365, it's a good idea to make sure that the script is automatically started after boot. Let's do that on Ubuntu Linux. First, put this as the first line of the script:

#!/usr/bin/python

For a Linux system, this line means that the script should be run with Python, and now you can start it simply by calling ./Logger.py instead of python Logger.py.

Next, copy it to /usr/local/bin and make sure that the execute permission is set:
sudo cp Logger.py /usr/local/bin/EnergyLogger.py
sudo chmod +x /usr/local/bin/EnergyLogger.py
Let's make it start at boot. Open the startup file for editing:
sudo nano /etc/rc.local
Then, just before the last line that says exit 0 add this text:
/usr/local/bin/EnergyLogger.py /home/tapani/energylogs &
Here we give a target directory for the log files: /home/tapani/energylogs. Of course, you can pick any target folder you want. If none is given, logs will be written to the /usr/local/bin folder! Notice that the last character '&' is important, as the process must be left running in the background. Save the file by pressing CTRL+O and exit with CTRL+X. If you now restart the computer, the logger should start automatically. You can test it with the command below, which should print two lines if it is running.
ps aux | grep EnergyLogger.py


Watchdog


What if something bad happens and logging stops? Wouldn't it be great if the system could notify you about this? Wouldn't it be even better if it could also automatically restart logging? That's pretty simple to do. Create a new file called EnergyLogger_watchdog.sh in the folder /usr/local/bin, and write these lines into it:

#!/bin/bash

until /usr/local/bin/EnergyLogger.py /home/tapani/energylogs; do
    EXIT_CODE=$?  # capture the exit code before any other command resets $?
    EMAIL="/tmp/energylogger.txt"
    echo "EnergyLogger crashed with exit code $EXIT_CODE. Restarting..." > $EMAIL
    sendEmail -f energylogger@sensorbay.com -t your@email.com \
    -u "EnergyLogger restarted" -s your.smtp.server <$EMAIL -o tls=no
    sleep 1
done

Set execution rights:
sudo chmod +x /usr/local/bin/EnergyLogger_watchdog.sh
You can run it like this:
sudo /usr/local/bin/EnergyLogger_watchdog.sh &
When run, it will first start the EnergyLogger.py process and then wait until that process stops. If it stops prematurely, it will be restarted automatically and you will be notified by email. Notice that you should set your own target email address and your provider's SMTP server. In case you don't have sendEmail installed, install it with sudo apt-get install sendemail.

To test this, type
ps aux | grep EnergyLogger
You should get something like this:


root      2618  0.0  0.2   2116  1040 pts/1    S    14:55   0:00 /bin/bash ./EnergyLogger_watchdog.sh
root      2662  0.0  0.8   7664  4012 pts/1    S    14:57   0:00 /usr/bin/python /usr/local/bin/EnergyLogger.py /home/tapani/energylogs
root      2771  0.0  0.1   1536   612 pts/1    S+   15:06   0:00 grep EnergyLogger


Observe the PID number of the .py process, here 2662. Then kill the process, so that the watchdog restarts it and sends an email notification:
sudo kill 2662
One more thing. If you want to autostart this watchdog version instead:
sudo nano /etc/rc.local
Then, just before the last line that says exit 0 put this text instead of the one we added earlier:
/usr/local/bin/EnergyLogger_watchdog.sh &


Calculating Statistics


After the logger has been running for a few days and several data files have appeared on your hard disk, you'll probably start wondering what to do with them. In order to analyze and understand this numerical data, we need to be able to calculate some statistics that summarize interesting energy usage information in a nutshell. While it is perfectly OK to open a CSV file in a spreadsheet application and do the calculations there, it is much more convenient to process multiple files with a short, to-the-point script. To help with this task, we'll use the Numpy library. You can install it as follows:
sudo apt-get install python-numpy
Now, create a new file called Statistics.py. Then type this as a starter:

import sys, numpy

kW_factor = 10000 # LED pulses per kWh

data = numpy.loadtxt(open(sys.argv[1], "rb"), delimiter=",")

print 'Total samples/d:', data.shape[0]
print 'Total pulses/d:', numpy.sum(data[:,1])
print 'Total kWh/d:', numpy.sum(data[:,1]) / kW_factor

Here we define the number of LED pulses per kWh of energy, taken from the text printed on the energy meter. Then we load the data file whose name is given as a command line argument, assuming the CSV (comma-separated values) format. It is now rather simple to print some basic statistics. Numpy stores the data in an array - you can think of it as an Excel sheet with timestamps in the first column (index=0) and data samples in the second (index=1). The rows are the lines of the data file (index=0...n). For example, to sum all the values in the second column, i.e. the LED blinks, we first select all rows from that column only by saying data[:,1] and then pass the result to the numpy.sum() method, which returns the sum.

Save the file, and in case you're on Linux add execution rights to it:
chmod +x Statistics.py
Then run it:
python Statistics.py 20121216_energy.csv
As a result, you should get something like this:

Total samples/d: 84359
Total pulses/d: 1126855.0
Total kWh/d: 112.6855

Notice that a single day has 60*60*24=86400 seconds, and we've got only 84359 samples. Hence, we're missing 2041 samples, or about half an hour of data. Since the first timestamp is 00:00:00 and the last one is 23:59:59, the logger has captured the full 24 hours of the day, so as a consequence, some samples must have been lost during logging. We'll come back to this later.

The next value tells us that over a million LED pulses have been captured. Nice, but trivial. The next value is more interesting: 112.7 kWh of energy has been consumed. I compared this value to the statistics given by my electricity company, which tell that the meter in reality registered 112.5 kWh of energy that same day. Hence, the sensor box has measured 112.7/112.5*100%=100.2% of the actual LED pulses, which is a pretty accurate result. The main reason for the difference is a difference in timekeeping, i.e. the PC clock vs. the meter clock, not the lost samples. How do I know? I checked the statistics from the two previous days:

  • 101.0 kWh measured, in reality 101.0 kWh (samples: 84360)
  • 94.6 kWh measured, in reality 94.6 kWh (samples: 84361)

We can conclude that the actual measurement is very accurate, i.e. each data file contains very closely the LED pulse count that it should contain, but the number of samples, i.e. lines in a log file, is consistently off from the theoretical value. Such errors can be spotted in the files; they look like this:


000057,16
000059,15


It seems that one second has been lost. Notice that the latter value (15 pulses) is close to the first value (16 pulses), not, for example, roughly double it.

It is rather simple to see the reason for this error: we don't have a real-time clock on the Arduino, and we don't use a timer interrupt for timekeeping - instead, we simply wait 5 ms in a loop that contains other commands as well, and haven't compensated for the time required to run those extra commands! Hence, each loop takes slightly more than 5 ms to complete. Yet we simply count 200 loops (aiming for a 200 Hz sampling rate) before we send the LED pulse count to the PC, instead of actually checking the elapsed time. Now we pay for this laziness: our "seconds" are actually slightly too long, and thus we get only about 84360 of them per day instead of 86400. However, our Arduino code is smart enough not to lose any LED pulses, so the collected count is right. If you want to fix this, here are a few options for changing the Arduino code:

  • Reduce the sleeping time in the loop. You have to use the delayMicroseconds() function, as delay() has 1 ms steps. Proper length: 84360/86400*5ms=4.882ms.
  • Instead of counting 200 samples, check Arduino's running time to find out when 1 second has elapsed. The function millis() returns the time since the Arduino started running. Notice that this counter overflows after about 50 days!
  • Use a timer interrupt to trigger sending data to the PC. This is an advanced method.

You can calculate many more statistics with Numpy; see the official tutorial. For example (a short sketch follows the list):

  1. Calculate average pulse count per day
  2. Calculate standard deviation from daily pulse count
  3. Calculate minimum and maximum pulse counts and energy consumption
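
Here is a minimal sketch of those three items. It sums each daily file given on the command line and then computes the statistics over the daily totals; the script and file names are just examples:

import sys, numpy

kW_factor = 10000 # LED pulses per kWh, as printed on the meter

# Usage example: python DailyStats.py 20121214_energy.csv 20121215_energy.csv ...
daily_totals = []
for file_name in sys.argv[1:]:
    data = numpy.loadtxt(open(file_name, "rb"), delimiter=",")
    daily_totals.append(numpy.sum(data[:,1])) # total LED pulses of that day
daily_totals = numpy.array(daily_totals)

print 'Average pulses/day:', numpy.mean(daily_totals)
print 'Std deviation:', numpy.std(daily_totals)
print 'Min consumption (kWh/day):', numpy.min(daily_totals) / kW_factor
print 'Max consumption (kWh/day):', numpy.max(daily_totals) / kW_factor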

However, statistical numbers quickly become rather uninteresting to observe. Next, we'll learn how to visualize the data.


Plotting Graphs


A graph well done tells more than 10 million data samples! In order to make plotting simple, we'll utilize the wonderful Matplotlib library. Install it as follows:
sudo apt-get install python-matplotlib
Then, create a new file called Plotter.py, and write the following lines:

import sys, numpy, pylab

data = numpy.loadtxt(open(sys.argv[1], "rb"), delimiter=",")
pylab.plot(data[:,1])
pylab.show()

Here we load in a data file using Numpy, then select the LED blink column and create a plot using Matplotlib's PyLab interface. Finally, we request the plot to be shown (drawn), which makes a new window appear on screen.

Save the file, and in case you're on Linux add execution rights to it:
chmod +x Plotter.py
Then run it:
python Plotter.py 20121216_energy.csv
As a result, you should see a window like this on your screen:


Simple plot drawn from logged data, representing LED blinks per second over 24h period.

Take care to notice the toolbar in the lower left corner! Unlike a spreadsheet application's plots, here you can freely pan and zoom the image, browse back and forth like in a web browser, and even save a .PNG file of the visible part of the plot. All this with only a couple of lines of code is pretty amazing! Here's another image that I created using nothing but that toolbar:

Detailed view, created using the plot toolbar.

The images above were simple to create, but the signals are difficult to read because of the great amount of oscillation. In order to make the plots more informative, let's do some filtering with a moving average. Averaging is useful here for two reasons:

First, by its nature, a blinking LED provides a discrete, quantized value. As you can see from the image above, it is very common that the LED blink count per second oscillates between two values from second to second, for example between 8 and 9 blinks/s. This means that the electricity consumption is actually somewhere between these values. Because of the way we measure it, we cannot get 8.5 blinks per second, or 8.32, or 8.877... you get the picture. To reveal the actual value, we can use a moving average filter with a short window, such as 20-100 samples.

Second, if we look at the big picture, i.e. the whole day's energy consumption, there happens to be a lot of fluctuation because it is winter and the thermostats of the electric heating system in my house frequently switch high loads on and off. The actual energy consumption trend becomes hidden behind all this switching. To reveal it, we can use a moving average with a pretty long window, say 1000 or even 10000 samples.

import sys, numpy, pylab

data = numpy.loadtxt(open(sys.argv[1], "rb"), delimiter=",")

WINDOW = 10000 # Averaging window length in samples
avg_data = numpy.convolve(data[:,1], numpy.ones(WINDOW)/WINDOW)

pylab.plot(avg_data)
pylab.show()

To calculate the moving average, we use a mathematical concept called convolution. It mixes our LED blink values with another array of values, which is called a window. The values in the window are called weights. In the case of calculating an average, the weights are all the same. In short, convolution here means that each of the LED blink values will be re-calculated as an average of its nearest neighbors just before it (or perhaps from both sides). The number of neighbors to be used is given as the window length, and that controls how smooth a signal we want to get. Not smooth enough? Increase the window length! Notice that a window of 10000 means that an average of 10000 values will be calculated for each of the ~84000 samples in our array - that's a lot of calculations! Yet, on a modern PC it runs in under a second.
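
If the convolution idea feels abstract, here is a tiny toy example (made-up numbers, not energy data) that you can paste into the interactive Python shell. With four equal weights, each fully-overlapping output value is simply the mean of four neighboring inputs, while the smaller values at both ends come from the window only partially overlapping the data:

import numpy

values = numpy.array([2, 2, 6, 6, 2, 2], dtype=float)
window = numpy.ones(4) / 4  # four equal weights of 0.25

# -> [0.5, 1.0, 2.5, 4.0, 4.0, 4.0, 2.5, 1.0, 0.5]
print numpy.convolve(values, window)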

Signal smoothed with 10000 point moving average.
Now, this is much easier to read, but it doesn't seem right... look at the ends of the signal! In the beginning it takes a long time to rise up, and at the end it is just the opposite. Why? The filter buffer needs to be full of data, otherwise its output will be invalid. At the beginning we are still filling the buffer, at the end we are emptying it.

A simple fix is to give one additional parameter to the convolution method: 'valid'. However, this will cut away both parts where the buffer is not full: we get rid of the false tail, but we also lose the beginning of the day! A smarter method is to first prepend the first data value repeated (window length - 1) times, then filter the signal in 'valid' mode, which trims away the transient ends. Here's the code and a new plot:

import sys, numpy, pylab

data = numpy.loadtxt(open(sys.argv[1], "rb"), delimiter=",")

WINDOW = 10000 # Averaging window length in samples
extended_data = numpy.hstack([[data[0,1]]*(WINDOW-1), data[:,1]])
avg_data = numpy.convolve(extended_data, numpy.ones(WINDOW)/WINDOW, 'valid')

pylab.plot(avg_data)
pylab.show()

Signal smoothed with a 10000 point average filter, with fixed beginning and end.

This looks much better! If we call PyLab's plot() method twice, once for the original and once for the filtered data, we get another plot where both signals are shown simultaneously. Now you can see that the averaged signal is actually a pretty good representation of the overall energy consumption. The only remaining problem is that the filtered signal has a pretty slow response time. For example, the big drop in consumption at about 11000 seconds takes - yes, 10000 seconds, i.e. the length of the filter - to fully appear. 10000 seconds is nearly 3 hours. To overcome this problem, we could use smarter methods such as a properly designed FIR filter, but designing one is an advanced technique that I will rather discuss in a separate article. This time we simply leave it up to you to adjust a suitable window length - try values between 1000 and 10000 until you get a nice trend line.
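
In code, the two-signal plot simply means calling plot() twice before show(), continuing from the previous script (colors chosen to match the image below):

pylab.plot(data[:,1], color='g')  # original, noisy signal
pylab.plot(avg_data, color='b')   # moving average trend line
pylab.show()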

Original signal (green) and trend line (blue) plotted in the same image.

To finalize our plot, let's make some minor adjustments:
  1. Y-axis should represent power in kW instead of raw LED blinks
  2. X-axis should represent time as HH:MM:SS format instead of samples
  3. Add title and labels for X,Y axis
  4. Add legend that explains the difference between the two signals
  5. Add grid for making comparisons easier

import sys, numpy, pylab, time, datetime

data = numpy.loadtxt(open(sys.argv[1], "rb"), delimiter=",")

kW_factor = 10000 # LED pulses per kWh of energy consumed
kW = data[:,1] * (3600.0 / kW_factor) # convert blinks/s to average power in kW

WINDOW = 3000 # Averaging window length in samples
extended_data = numpy.hstack([[kW[0]] * (WINDOW-1), kW])
avg_kW = numpy.convolve(extended_data, numpy.ones(WINDOW)/WINDOW, 'valid')

date = sys.argv[1].split('_')[0]
timeline = []
for t in data[:,0]:
    time_string = date + '%06d' % t
    time_struct = time.strptime(time_string, '%Y%m%d%H%M%S')
    dt_object = datetime.datetime(*time_struct[:6])
    timeline.append(dt_object)

pylab.title("Electricity consumption %s" % date)
pylab.ylabel("Energy (kWh)")
pylab.xlabel("Time (hh:mm:ss)")
pylab.xticks(rotation=30)
pylab.subplots_adjust(bottom=0.15)
pylab.plot(timeline, kW, color='g', label='Momentary')
pylab.plot(timeline, avg_kW, color='b', linewidth=2.0, label='Trend')
pylab.legend()
pylab.grid(True)
pylab.show()

Here we take the LED blink column and scale it by the number of LED pulses per kWh, which converts blinks per second into average power in kW. Then we do the averaging trick to create a smooth trend line as another signal. After parsing the date from the log file name, we convert the timestamp column. Numpy has read it in as numerical floating point values, so we first create a string that contains both date and time, then parse it into a Python time struct, and convert that into a Python datetime object. The point is that Matplotlib can handle datetime objects directly when they are given as the X axis of a plot. For example, if you zoom the graph in or out, the X axis labels follow correctly. The time parsing here is done inefficiently on purpose; I wanted to show the steps clearly. Finally, a number of simple adjustments are made to set the title, colors, legend, grid, etc.
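
For reference, the same timeline can be built more compactly with datetime.strptime; this is just an equivalent sketch of the loop above:

timeline = [datetime.datetime.strptime(date + '%06d' % t, '%Y%m%d%H%M%S')
            for t in data[:,0]]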

Final plot from a single day's energy consumption.

Above is the final plot of one day's electricity consumption. In case you're wondering what's going on, here are a couple of explanations:

  • My water boiler turns on automatically every night at about 22:30 (10:30 PM). Based on the image, it seems to take about 3 kW of power, and this time it turned off at about 3:30 (AM). Thus, it consumed roughly 3 kW * 5 h = 15 kWh of energy and cost me 0.11 €/kWh * 15 kWh = 1.65 € that day - or about 600 € per year!
  • Another interesting thing is that two bedrooms in our house have electric floor heating, but it is off at night. You can easily see how those long spikes disappear at midnight and start again at 06:00 (AM). 
  • Below is a zoomed-in view of one such spike. It seems that first one bedroom thermostat turns on, and soon the other one as well. Then one drops off, then the other. The floor heating uses self-adjusting cable, which initially has low resistance when turned on; as the cable warms up, its resistance gradually increases and hence the current (and power) gradually decreases, which you can easily see in the image.


Zoomed-in view, created using Matplotlib's default toolbar.



By the way, this is exactly the reason to start the energy measurement project in the first place: you don't get such detailed data from your electricity company, but it is available from the energy meter if you have a device that can capture it! 

And why do you need this information? Well, if you have enough resolution, you can analyze the power consumption (and cost) of individual devices, provided that they consume enough power to be distinctive (and if they don't, you don't need to worry much about their cost either). Go make some tests: turn a device on or off, and see what happens in the plot.

Knowing what the energy hogs are allows you to change your behavior as an energy consumer in a way that is actually relevant, i.e. the energy and cost savings become real and you can measure them. Despite the usual propaganda, you really can't save the world just by changing a couple of light bulbs to energy savers. But if you lower the room temperature and take shorter showers, you might start to see some effect in your monthly electricity bill.

Here's a couple of ideas:
  • If you heat your house with electricity, you might want to add outside and inside temperatures to the plot so that you can observe the correlation.
  • If you have a heat pump, solar power system, or something similar and want to measure its effect, make a test: compare two otherwise similar days, with the device ON one day and OFF the other. What's the difference in the plots?
  • Heating by burning wood? Make yourself comfortable in front of the fireplace and read a book. Then go check from the plot how much electricity consumption dropped and for how long.


Uploading Sensor Data Streams to the Cloud


Manually creating plots from log files is not much fun in the long run. You could automate it, for example so that you get a nice plot as an email attachment every night. But there's another option: you can continuously upload your sensor data to the cloud, and then check up-to-date plots e.g. on your smartphone or tablet.
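
For the nightly-email idea, the key change is to have the plotting script write an image file instead of opening a window, and then run it e.g. from cron. A minimal sketch, with an example file name - just replace the pylab.show() call at the end of the plotting script with:

pylab.savefig('/home/tapani/energylogs/latest_energy.png', dpi=100) # write a PNG instead of opening a window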

Why would you want to upload the data to a cloud service? Mostly because it is easy and convenient to access your sensor streams this way. But there's more to it. Let's assume you're on holiday. While sunbathing in the heat of Thailand, you read the news on your cellphone and learn that the temperature has fallen to -35 degrees Celsius back in Finland. You become a little uneasy and start worrying about your house... Then you remember your energy meter setup, and simply click a bookmark to open a web page that shows whether the energy consumption at home is normal or not. That's pretty cool, isn't it? Or you can do it the other way around: install a similar setup at your lakeside cabin, and check its energy consumption status from home.

Now, let's push the energy meter stream to the cloud.


www.cosm.com is a cloud for your sensor data.

We will use a cloud service that is specialized in sensor streams: Cosm. First, we need to set up a Cosm account and prepare it for receiving our sensor data:
  1. Open a web browser and go to http://www.cosm.com
  2. Create a (free) account (and activate it via confirmation email)
  3. Add a device/feed: choose Arduino as feed type, title = Power Consumption, tags = Energy
  4. Observe the Feed ID number (copy-paste it somewhere)
  5. From the given Arduino example code, find your Cosm API key (copy-paste it somewhere)
  6. In Cosm's console view, click +DataStream button and add basic details, for example ID=Home Energy Meter, tags=energy, units=kW. Remember to save changes.
Your Cosm account is now ready.

Next, we will install software tools for accessing Cosm streams. For Python there's a library for this: python-eeml. It makes communicating with Cosm a piece of cake, and installing it takes just a few steps. In a console, run these commands to download and install it:
wget https://github.com/petervizi/python-eeml/archive/master.zip
sudo apt-get install unzip python-setuptools python-lxml
unzip master.zip
cd python-eeml-master
sudo python setup.py install

Now we can modify the logger software to push data to Cosm stream. Here's the code:

#!/usr/bin/python

import serial, datetime, os, sys, eeml

serial_port = serial.Serial('/dev/ttyUSB0', 9600)
log_path = os.path.abspath(sys.argv[1]) if len(sys.argv) > 1 else '.'
if not os.path.exists(log_path): os.makedirs(log_path)
data_file = None; last = None

COSM_API_KEY = "J4fmHG_ZDMhP43aXGM3V..."
COSM_FEED_ID = 94...
COSM_STREAM_ID = "EnergyMeter"
COSM_API_URL = '/v2/feeds/%s.xml' % str(COSM_FEED_ID)
COSM_INTERVAL = 10 # Seconds to wait before next push to COSM
BLINKS_IN_KWH = 10000 # LED pulses per kWh of energy consumed
SECONDS_IN_HOUR = 3600.0
WATTS_IN_KWH = 1000
BLINKS_TO_WATTS = SECONDS_IN_HOUR / BLINKS_IN_KWH / COSM_INTERVAL * WATTS_IN_KWH
timer = 0
cumulative = 0

while 1:
    data = serial_port.readline()
    now = datetime.datetime.now()
    
    if last is None or now.day != last.day:
        if data_file is not None: 
            data_file.flush(); data_file.close()
        file_name = os.path.join(log_path, now.strftime('%Y%m%d_energy.csv'))
        data_file = open(file_name, 'a', 1024)
    
    data_file.write(now.strftime('%H%M%S,') + data)
    last = now

    timer += 1
    try:
        cumulative += int(data)
        if timer >= COSM_INTERVAL:
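            # Note: newer versions of python-eeml renamed this class,
            # so you may need eeml.Cosm(...) instead of eeml.Pachube(...).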
            cosm = eeml.Pachube(COSM_API_URL, COSM_API_KEY)
            cosm.update([eeml.Data(COSM_STREAM_ID, 
                                   str(cumulative * BLINKS_TO_WATTS), 
                                   unit=eeml.Watt())])
            cosm.put()
            #print (cosm.geteeml())
            timer = 0
            cumulative = 0
    except:
        pass


Most of this should be familiar to you already, but there are two new sections:

  • The set of constants before the eternal while loop defines the Cosm account details. I have removed part of the API key and Feed ID for obvious reasons - use the values from your own account. There's also a conversion factor from LED blinks to watts (a worked example follows this list).
  • The lower half of the while loop, where LED blinks are accumulated until it is time to push them to Cosm, which is then handled using the eeml library.
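
To see where the BLINKS_TO_WATTS constant comes from: one LED blink corresponds to 1/10000 kWh = 0.1 Wh = 360 J of energy. If that energy was consumed during the 10 second Cosm push interval, the average power was 360 J / 10 s = 36 W. The code expresses the same thing as 3600 / 10000 / 10 * 1000 = 36, so multiplying the accumulated blink count by it gives the average power in watts over the interval.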

Now you can kill the logger and restart it (check the watchdog chapter above if you don't remember how). Soon data should start being pushed to your Cosm account, and eventually a graph like this will be available on your Cosm account page:

Energy meter data pushed to cloud and viewed with Android tablet's web browser.

I use this method all the time for viewing data from my sensors, whether I'm at home or on the road - it's just so convenient. But I also keep the log files on my hard drive, and the visualization and statistics scripts at hand, and I totally recommend that you do the same. When I see something odd or interesting in the cloud stream that I check frequently, I need to be able to analyze the data hands-on without limitations - and this is why you should keep your own copy of the data and the tools for it!

Notice that with a single account you can add multiple streams of different types to Cosm. For example, you could add a trend line of energy consumption as a separate stream. Or add outdoor/indoor temperature sensors, a water consumption feed, a broadband byte count stream... you get the picture.

It is also possible to get a live view that can be embedded on your own web page using a stream's Graph Builder tool. Below is the image link produced by the Graph Builder tool, followed by the actual image rendered from the stream data:

https://api.cosm.com/v2/feeds/94462/datastreams/EnergyMeter.png?width=730&height=250&colour=%23f15a24&duration=1week&legend=Watts&title=Home%20Energy%20Consumption&show_axis_labels=true&detailed_grid=true&scale=auto&timezone=Helsinki




Summary


In this article, I have presented methods for accessing Arduino sensor streams from PC software, printing the stream on screen, logging it to a file with autostart and email alerts, calculating statistics from log files, plotting interactive graphs from log files, and finally pushing the data to a cloud service and embedding a live data stream from the cloud into a web page.

While an energy meter has been used as a data source, the methods used in the article should be generally useful and easy to apply to other sensor streams.

This concludes the article trilogy about the home energy meter, which became a much longer journey than I initially planned! If you have benefited from these articles or created something similar, please let me and other readers know, e.g. by commenting on this article.

By the way, the next article for SensorBay.com is already in the works, so stay tuned for more!

2 comments:

  1. Very good and detailed information.

    I had a hard time getting the upload to Cosm working, and then I found out that the command is "cosm = eeml.Cosm(COSM_API_URL, COSM_API_KEY)" instead of "cosm = eeml.Pachube(COSM_API_URL, COSM_API_KEY)" in the newer eeml package.

    Regards
    Johan

  2. I really like this tutorial. If you had to rewrite this today, what alternative would you use? I believe Cosm is no longer a free service.
