Too Many Useless Data Points = Inefficient Solar Farm Operation

By: Adam Baker

Two is Better Than One. Combine the Power of Historians and SQL Databases to Run Your Utility-Scale Solar Plant More Efficiently.

Data is continuously produced by various devices around a utility-scale solar plant. Devices like inverters, meters, and meteorological stations are the basics, but others may exist as well. In order to understand how the plant is running, operators must be able to log site data to compare what happened yesterday, last month, and last year.

Once collected, there’s an immense amount of data in a solar plant, and finding the correct information that helps you operate the plant well is not for the faint of heart.

Hoarding too much data is like hoarding junk. Utterly useless and unsightly.

There are two big-picture ways of collecting and saving data. The easy, efficient, but (relatively) expensive way is using a historian, and the difficult, inefficient, and relatively inexpensive way is using a SQL database.

Using a historian to collect and save data is not a new concept. Every power utility uses historians to keep track of generation, transmission, and distribution, but historians are not pervasive in the utility-scale solar industry, particularly in smaller projects under 100 MW.

If you have a database paired with a good historian, there’s almost no such thing as too much data. However, if you’re logging data in a database without a historian, you’re probably drowning in data that isn’t helping you run your plant more efficiently.

SEE ALSO: PV Voltage Control Makes PV Projects Viable Again


Using a Database at a Utility-Scale Solar Site

Many utility-scale plants use some flavor of SQL database to log and store plant data. Databases are fantastic at writing and storing tens of thousands of values, then spitting out reports.

The problem is that databases have to be fed by polling at regular intervals, which means they end up holding a giant list of data that is difficult to sort through when you really need something.

The typical record might look like this:

“time/date”  |  “field”  | “value”

At 11:00:00 this morning, inverter11_kW_output was 600.

Five minutes later (or however long you designate between queries), your database ends up with the same pull: “time/date”, “field” and “value” for each value you’re tracking. At a 100MW site, you might be writing tens of thousands of values to catalog all the facets of your plant.

If you pull 10,000 values at a resolution of every 5 minutes, in a single 24-hour period you’ll have over 2.8 million lines of data. If you pull values every second, you’re looking at 864 million lines per day.
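
The arithmetic behind those row counts can be checked with a few lines of Python. This is only a sketch; the function name rows_per_day is illustrative and not part of any product.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def rows_per_day(num_points: int, interval_seconds: int) -> int:
    """Rows written per day when every point is logged on a fixed interval."""
    samples_per_point = SECONDS_PER_DAY // interval_seconds
    return num_points * samples_per_point


# 10,000 points polled every 5 minutes, then every second
print(f"{rows_per_day(10_000, 5 * 60):,}")  # 2,880,000
print(f"{rows_per_day(10_000, 1):,}")       # 864,000,000
```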

Even with all that data, a lot of crucial data is easily missed. If you ever need to extract usable data from a database, you’ll find it’s extremely difficult to find a piece of data that occurred at a specific time.

For example, if you log data at the current industry resolution of 15 minutes (1:00, 1:15, 1:30) but want to know what happened at 1:05, the database won’t know. You’ll have to look at the data from 1:00 and 1:15 and piece it together yourself. A lot of vital data can happen in the space of 15 minutes, and if you ever have a crucial event, you might be left in the dark.

This database problem isn’t necessarily solved by pulling at higher resolutions. The higher the resolution, the more data is written. The more data you have to manage, the longer your reports take.


Pros of database reporting:

  • Made to handle lots of data
  • Total cost is virtually free. At a 1-second resolution, 20 years of storage might cost you $120 (1 terabyte), and the SQL database itself might only be a few hundred dollars
  • Many reporting tools to choose from


Cons of database reporting:

  • More difficult to extract information
  • Ad hoc reporting is slow. (If you have a problem happening right now and you query the database to show the last time the same conditions happened within the last year, be prepared to wait.)
  • High IT burden
  • Loss of data during communication outages


SEE ALSO: Don’t Fear Industrial Wireless Integration in Solar Environments


Using a Historian at a Utility-Scale Solar Site

I like to think of historians as databases with intelligence. Historians are widely used in power and manufacturing industries, but haven’t yet been widely accepted in utility-scale solar, mostly due to their cost.

Historians are usually configured to ‘log on change’, where data is only written when the value changes by an amount set in configuration. You can be pretty confident that frequency is going to be pretty close to 60 Hz, but it’s notable when frequency varies by 0.1 Hz. Forget logging when values bounce between 59.999 and 60.001, as this is all ‘normal operation’. However, when frequency drops to 59.9, you may want to pay attention. Historians work best when logging changes bigger than a set percentage or step value.
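
A minimal sketch of the log-on-change idea, using the frequency example. The function name log_on_change and the 0.05 Hz deadband are assumptions chosen for illustration; a real historian has its own configuration settings for this.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[float, float]  # (timestamp_seconds, value)


def log_on_change(samples: Iterable[Sample], deadband: float) -> List[Sample]:
    """Keep a sample only if it differs from the last logged value by at least the deadband."""
    logged: List[Sample] = []
    for timestamp, value in samples:
        # Always log the first sample; after that, compare against the last logged value.
        if not logged or abs(value - logged[-1][1]) >= deadband:
            logged.append((timestamp, value))
    return logged


# Hypothetical grid frequency readings, one per second
frequency_hz = [
    (0, 60.000), (1, 59.999), (2, 60.001),
    (3, 59.900), (4, 59.901), (5, 60.000),
]

kept = log_on_change(frequency_hz, deadband=0.05)
print(kept)  # [(0, 60.0), (3, 59.9), (5, 60.0)]
```

Six readings shrink to three, and the dip to 59.9 Hz is still captured along with the time it happened.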

Ultimately, less data is collected but you don’t give up efficiency. After all, in order to get the same data via database reporting, you’d have to log every value every minute.

If you query a historian about a certain value at a point in time, the historian interpolates linearly between the logged data points and tells you the most likely value, even if it only changed by a tiny bit. If the historian logs an extended period of steady change, it will analyze the rate of change, keep the first and last point, and throw out the rest to save on space. This is especially useful during sunrise when production increases at a relatively linear rate.
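
Both behaviors can be sketched in Python. This is a simplified illustration under stated assumptions, not how any particular historian is implemented; commercial products use their own compression algorithms. The names value_at and drop_collinear and the sample numbers are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

Point = Tuple[float, float]  # (timestamp_seconds, value)


def value_at(points: List[Point], t: float) -> float:
    """Estimate the value at time t by linear interpolation between logged points.

    Assumes points are sorted by timestamp. Outside the logged range,
    the nearest logged value is returned.
    """
    if not points:
        raise ValueError("no logged points")
    times = [p[0] for p in points]
    if t <= times[0]:
        return points[0][1]
    if t >= times[-1]:
        return points[-1][1]
    i = bisect_right(times, t)  # index of the first point strictly after t
    (t0, v0), (t1, v1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def drop_collinear(points: List[Point], tolerance: float) -> List[Point]:
    """Drop interior points that a straight line would reproduce within the tolerance."""
    if len(points) <= 2:
        return list(points)
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        t, v = points[i]
        # Would interpolating from the last kept point to the next point recover this one?
        predicted = value_at([kept[-1], points[i + 1]], t)
        if abs(v - predicted) > tolerance:
            kept.append(points[i])
    kept.append(points[-1])
    return kept


# Hypothetical sunrise ramp in kW, one reading per minute, then a plateau
ramp = [(0, 0.0), (60, 10.0), (120, 20.0), (180, 30.0), (240, 30.0)]
compressed = drop_collinear(ramp, tolerance=0.5)
print(compressed)                # [(0, 0.0), (180, 30.0), (240, 30.0)]
print(value_at(compressed, 90))  # 15.0
```

The two points in the middle of the ramp are discarded, yet a query for the value at 90 seconds still returns the value the ramp implies.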

Historians collect data at high frequency, which means you can see what changed just before an event, or track trends. For example, what does an 80° day in June with good irradiance in 2016 look like when compared to the same type of day in 2015? How much energy did you produce? Is the plant on track? Do you have maintenance issues?

In addition, you’re more likely to retain data in a communications outage if you use a historian. If the historian server is ever offline, data collected at the site is buffered locally until the server comes back online. Further, the historian manages active and archived data in the background. Although only a month of data may sit in active memory, a query for last year’s data is retrieved from the archive transparently to the user. There may be a slightly longer delay to open the archive, but it will certainly be faster than a SQL database with all records in the DB active.
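
The buffering behavior during an outage can be sketched as a simple store-and-forward queue. This is a toy illustration only; the class names StoreAndForward and FakeServer, the tag name, and the buffer size are all hypothetical and do not describe any vendor's collector.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Sample = Tuple[float, str, float]  # (timestamp, tag, value)


class StoreAndForward:
    """Hold samples locally while the server is unreachable, then send them in order."""

    def __init__(self, send: Callable[[Sample], bool], max_buffered: int = 100_000) -> None:
        self._send = send
        # With maxlen set, the oldest samples are discarded if the buffer fills up.
        self._pending: Deque[Sample] = deque(maxlen=max_buffered)

    def record(self, sample: Sample) -> None:
        self._pending.append(sample)
        self.flush()

    def flush(self) -> None:
        while self._pending:
            if not self._send(self._pending[0]):
                return  # server still unreachable; keep the sample for later
            self._pending.popleft()

    @property
    def pending(self) -> int:
        return len(self._pending)


class FakeServer:
    """Stand-in for the historian server so the sketch can run on its own."""

    def __init__(self) -> None:
        self.online = True
        self.received: List[Sample] = []

    def send(self, sample: Sample) -> bool:
        if self.online:
            self.received.append(sample)
        return self.online


server = FakeServer()
collector = StoreAndForward(server.send)

server.online = False                          # communications outage begins
collector.record((0.0, "inverter11_kW_output", 600.0))
collector.record((1.0, "inverter11_kW_output", 598.0))
print(collector.pending)                       # 2

server.online = True                           # link restored
collector.record((2.0, "inverter11_kW_output", 601.0))
print(collector.pending, len(server.received)) # 0 3
```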


Pros of historian reporting:

  • Good at managing lots of data
  • Efficient at collection and reporting, even when offline
  • Low IT support burden (you don’t have to be an admin to back up or restore data)


Cons of historian reporting:

  • Costs vary based on size, but on average expect to pay $10 total per point for software and configuration. If you own a large site, a historian logging each and every data point could cost you tens of thousands.


Solar Reporting Best Practice: Use Both

Data is not the same as information. One of the biggest problems with collecting solar plant data is there is just too much of it, and most points don’t provide valuable plant operation information. Yes, tens of thousands of data points are nice to have for maintenance purposes, but most data you collect will not help answer the question: “How can I ensure my plant is at 100% production?”

It seems you’re forced to choose between critical data that’s easily accessible, or a giant list of mostly useless data. I would argue: why can’t you have both?

Use a historian to track and store critical data, and collect “nice to have data” in a SQL database. Reduce costs by only tracking the data in your historian that really helps maintain your plant on a day-to-day basis.
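
To get a rough feel for the savings, the $10 per point estimate from the historian cons list can be applied to a full point list and to a critical subset. The point counts below are hypothetical and chosen only for illustration; actual pricing varies by vendor and site size.

```python
def historian_cost(num_points: int, cost_per_point: float = 10.0) -> float:
    """Rough software and configuration cost, using the article's per-point estimate."""
    return num_points * cost_per_point


total_points = 10_000    # hypothetical: every point on the site
critical_points = 1_000  # hypothetical: points that drive day-to-day operation

everything = historian_cost(total_points)
critical_only = historian_cost(critical_points)

print(f"Historian for every point:      ${everything:,.0f}")     # $100,000
print(f"Historian for critical points:  ${critical_only:,.0f}")  # $10,000
print(f"Points left to the SQL database: {total_points - critical_points:,}")  # 9,000
```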

For example, solar tracker systems have thousands of points per MW, but there are only two that really matter: commanded position, and actual position. Set up the historian to track those points so you can see how well you follow the sun, and configure the database to track other important data that tends to change slowly over time (like actuator motor current to detect binds, or lubrication needs of bearings).
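
With commanded and actual position in the historian, checking how well a tracker follows the sun comes down to comparing the two. The sketch below assumes positions in degrees and a 2-degree alarm limit; the function name, the limit, and the sample readings are hypothetical.

```python
from typing import List, Tuple

# (timestamp, commanded_position_deg, actual_position_deg)
TrackerSample = Tuple[str, float, float]


def tracking_errors(samples: List[TrackerSample], max_error_deg: float) -> List[Tuple[str, float]]:
    """Return (timestamp, error) for samples where the tracker is off by more than the limit."""
    flagged = []
    for timestamp, commanded_deg, actual_deg in samples:
        error = actual_deg - commanded_deg
        if abs(error) > max_error_deg:
            flagged.append((timestamp, error))
    return flagged


# Hypothetical readings from one tracker row over a day
readings = [
    ("09:00", -45.0, -45.5),
    ("12:00", 0.0, 0.2),
    ("15:00", 30.0, 24.0),  # lagging well behind the commanded angle
]

print(tracking_errors(readings, max_error_deg=2.0))  # [('15:00', -6.0)]
```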


Adam Baker is Senior Sales Executive at Affinity Energy with responsibility for providing subject matter expertise in utility-scale solar plant controls, instrumentation, and data acquisition. With 23 years of experience in automation and control, Adam’s previous companies include Rockwell Automation (Allen-Bradley), First Solar, DEPCOM Power, and GE Fanuc Automation.

Adam was instrumental in the development and deployment of three of the largest PV solar power plants in the United States, including 550 MW Topaz Solar in California, 290 MW Agua Caliente Solar in Arizona, and 550 MW Desert Sunlight in the Mojave Desert.

After a 6-year stint in controls design and architecture for the PV solar market, Adam joined Affinity Energy in 2016 and returned to sales leadership, where he has spent most of his career. Adam has a B.S. in Electrical Engineering from the University of Massachusetts, and has been active in environmental and good food movements for several years.