Degree Days

Weather Data for Energy Professionals

Integrating with the Degree API

This page is for anyone planning to integrate degree days into their software system. It is moderately technical, but you do not need to be a software developer to read it. It offers a fairly high-level overview of the Degree API and the various common approaches to integrating with it.

If you already have compatible software that needs API access keys to unlock its degree-day-based functionality, please sign up for an API account here and enter the API keys into your software. You should not need this integration guide unless you are actually building the software yourself.

If you are a software developer you will find technical details and code samples in the language-specific guides listed under "Programming language options" below. But we recommend that you read this page at some point as the higher-level concepts it describes should help you decide on the best approach for your integration. We have advised a lot of companies that have integrated with our API, and the same patterns recur again and again.

Programming language options

You can use the API with any programming language, but it is currently easiest to use with Java (and other JVM languages like Scala, Clojure, Jython, and JRuby), .NET (e.g. C# and VB.NET), and Python, as we have developed robust client libraries for those platforms:

Client libraries and quick-start guides: Java, .NET, and Python.

There's also some sample PHP code and sample Ruby code that will help you get up and running fairly quickly, albeit not as quickly as with the full client libraries above.

All the client libraries run on top of an XML API, which can be accessed using any programming language. The XML API is robust and stable, but it is not as easy to work with, because you have to navigate the security scheme, generate the XML requests, and parse the XML responses yourself. We suggest that you use one of the client libraries above if you can, as they are high-performance, full-featured implementations that are very easy to use and should have you fetching data within minutes.

The links above cover the technical details. The rest of this page gives a higher-level overview of the API and the common approaches to integrating with it.

Degree API basics

The Degree API lets you get all the data you can get through the website, but automated, faster, and in much larger quantities.

You send a request to the API specifying what data you want, and you get a response back containing the data you specified.

A request is for one specific location. So, to fetch data for 1,000 locations, your software would make 1,000 requests.

A location can be a weather station ID or a geographic location: a postal/zip code or a longitude/latitude position. The API has a sophisticated system for figuring out which weather station can best represent a specific geographic location over any particular time period. It is more complicated than just picking the closest station. Data quality and data coverage (how much data history is available) vary from station to station, and the API takes all this into account when choosing which weather station to use for any particular request.

Each request takes a certain number of request units. A big request (like fetching 10 years' worth of data) takes more request units than a small request (like fetching data for just the last day, week, or month). API accounts have an hourly rate limit which determines how many request units they can handle in any given hour. When you sign up for an API account you should choose a plan that will accommodate the volume of data you intend to fetch out of the API. The Pricing & Sign-Up page has a request-units estimator that can help you choose.
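
As a rough illustration of planning around an hourly rate limit, the sketch below groups a queue of requests into batches that each fit within one hour's allowance. It is only a sketch: plan_hourly_batches is a hypothetical helper, the per-request costs and the limit of 100 units are invented numbers, and it assumes the allowance can be treated as a simple per-hour budget.

```python
def plan_hourly_batches(request_costs, hourly_limit):
    """Greedily split request costs (in request units) into hourly batches.

    Both arguments are assumptions for illustration: real costs depend on
    what each request asks for, and the real limit depends on your plan.
    """
    batches, current, used = [], [], 0
    for cost in request_costs:
        if cost > hourly_limit:
            raise ValueError("a single request exceeds the hourly limit")
        if used + cost > hourly_limit:
            # This request would overflow the current hour, so start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(cost)
        used += cost
    if current:
        batches.append(current)
    return batches


# Three large backfill requests and two small updates, limit of 100 units/hour.
batches = plan_hourly_batches([40, 40, 40, 1, 1], hourly_limit=100)
print(batches)
```

A job scheduler could then send one batch per hour. The greedy grouping keeps requests in their original order, which is simple but not necessarily the tightest packing.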

A LocationDataRequest is the type of request you use to get degree-day data. You specify a location (as a weather station ID, zip/postal code, or longitude/latitude position), and the data you want for it (e.g. HDD, CDD, daily, weekly, monthly, average, what base temperatures, covering what period of time etc.), and the API will send back the data you requested in the response.

A LocationDataRequest always takes at least 1 request unit, but if you fetch a lot of data in one request, it can take many more. Generalist weather-data APIs often only allow you to fetch data one day at a time, forcing you to make hundreds of thousands of requests to assemble a relatively small amount of data. But the Degree API will let you get many years' worth of data in a single request. The only catch is that big LocationDataRequests take lots of request units. But it is always more efficient to fetch the data you need in as few requests as possible.

A LocationInfoRequest is specified in almost exactly the same way as a LocationDataRequest. You specify the data you want and the location you want it from (typically a geographic location). But you don't get any degree-day data back in the response, just information about the location. Most importantly this will include the ID of the weather station that the API would use to generate the data you specified. So you can then use a LocationDataRequest to fetch the data you want directly from that weather station. Or, if you already have that data in your database, you can get it straight from there.

A LocationInfoRequest only ever takes 1 request unit. So you can use it to efficiently map geographic locations (e.g. a big list of zip/postal codes or longitude/latitude positions) to weather station IDs. This enables you to save time and request units when assembling data for thousands of building locations, many of which would map to the same weather stations.

Accessing data on demand

The simplest integration option is often to pull data from the API as and when it's needed. For example, consider:

For systems like these it often makes sense to fetch degree-day data from the API on demand. If you request data from the geographic location of the target building (by specifying a postal/zip code or longitude/latitude position), the API will automatically generate the data you requested using the weather station that's best placed to supply it (considering data quality and coverage as well as distance from the target location).

This on-demand approach is particularly likely to make sense if:

Building and maintaining a local database of degree days

When the locations of interest remain fairly constant it can often make sense to build and maintain your own local database of degree days. This is a common pattern for:

Backfill and updates

Building and maintaining a database of degree days will typically involve:

The initial backfill is a one-off job, so it is unlikely to be a huge problem if it takes a while or is a little inefficient. You should consider the efficiency of the backfill for new locations as they are added over time, but it is unlikely to be a critical factor unless you expect to add a lot of them. The most important thing is usually to ensure that regular updates can be performed efficiently.

Approach 1: Using weather station IDs only (not usually recommended)

With this approach you would fetch data by weather station ID (using a LocationDataRequest for each ID) for your initial backfill and for all regular updates.

This approach is ideal if it is weather station IDs that you ultimately want data for. But it is less ideal if your locations of interest are really geographic locations like postal/zip codes or longitude/latitude positions of real-world buildings. Rather than figuring out your own scheme for assigning a weather station ID to each geographic location, it usually makes sense to use the sophisticated system that the API provides specifically for this purpose. Good station selection is more complicated than it seems on the surface. Please see Approaches 2 and 3 below for more.

Approach 2: Using geographic locations only (simple, but not very scalable)

A simple approach to building and maintaining a degree-day database is to fetch data by geographic location (e.g. postal/zip code or longitude/latitude position) for both the backfill and for regular updates.

The API is very good at handling geographic locations - give it a postal/zip code or longitude/latitude position, and tell it what data you want, and it will choose the best weather station automatically. Data quality varies considerably (both between stations and over time), and some stations have been recording temperatures for longer than others, but the API takes all of this into account, choosing which weather station to use based on data quality and coverage as well as distance from the target location.

The main problem with this approach is that, although it is OK for a relatively small number of locations (e.g. up to 1,000 or so), it gets less and less efficient as the number of locations grows. The more locations you are dealing with, the more likely it is that many of them will share the same weather stations. So you will probably end up fetching the same data multiple times in your initial backfill and in each of your regular updates. This will slow your system down and you may need a higher-level API account than you would need if you used a more efficient approach.

There can be a consistency issue too: the updates (recent data only) for a geographic location might sometimes use a different weather station to the one used for the initial backfill (longer data history). For example, you could backfill a location with a long data history going back 15 years, and the API might choose a weather station 9 miles away that could provide that long data history. But then for an update you might only be fetching a month's worth of data so the API might instead choose a newer station that is only 2 miles away. In reality this issue doesn't occur all that often, as the API does tend to favour stations with a good data history, even for requests that only want a month's worth of data. You could also get around it by fetching updates by station ID (always returned with the degree-day data generated for a geographic location), but, if you were doing that, you might as well use proper two-stage data fetching instead (see Approach 3 below).

On the plus side, this simple approach is very easy to design and program. Each building location has its own set of data - this is simple even if there is duplication. You also don't need to worry about what happens if a weather station stops working - the API will just automatically choose another station to replace it. Despite its limitations, this is not a bad way to integrate with the API if you want to get something working quickly and you aren't dealing with thousands of locations.

Approach 3: Two-stage data fetching (mapping geographic locations to station IDs for better scalability)

The more locations you have, the more likely it is that multiple locations will share the same weather station. Fetching data (using LocationDataRequest) can take a lot of request units, so it is inefficient to inadvertently fetch the same data multiple times for nearby locations that share a weather station. If you are dealing with more than a thousand or so locations, or you know that many of your locations are close to each other, you should think about using two-stage data fetching for better scalability:

  1. Use LocationInfoRequest to map each of your geographic locations (e.g. building locations) to a weather station ID, before actually fetching any data. Store a station ID with each of your geographic locations.
  2. Use LocationDataRequest to fetch data for each of the weather station IDs from step 1. You would use the station IDs both for the initial backfill and for later updates. To avoid duplication, store the data separately from the geographic locations, using the station ID as primary key.
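
The two steps above can be sketched roughly as follows. This is a minimal illustration, not client-library code: fake_info_request and fake_data_request are hypothetical stand-ins for a LocationInfoRequest and a LocationDataRequest, and the site names, station IDs, and data values are all invented placeholders.

```python
def map_locations_to_stations(locations, lookup_station_id):
    """Stage 1: one cheap lookup per geographic location."""
    return {location: lookup_station_id(location) for location in locations}


def fetch_data_by_station(location_to_station, fetch_station_data):
    """Stage 2: one data fetch per distinct station, keyed by station ID."""
    station_ids = sorted(set(location_to_station.values()))
    return {station_id: fetch_station_data(station_id) for station_id in station_ids}


# Hypothetical stand-ins for the two request types; a real system would
# call the API (for example via a client library) here instead.
FAKE_MAPPING = {"site-1": "STATION_A", "site-2": "STATION_A", "site-3": "STATION_B"}
data_fetches = []


def fake_info_request(location):
    return FAKE_MAPPING[location]


def fake_data_request(station_id):
    data_fetches.append(station_id)
    return {"2024-01": 0.0}  # placeholder value, not real degree days


location_to_station = map_locations_to_stations(FAKE_MAPPING.keys(), fake_info_request)
data_by_station = fetch_data_by_station(location_to_station, fake_data_request)

# A building's data is found via the station ID stored against it.
site_2_data = data_by_station[location_to_station["site-2"]]
print(len(FAKE_MAPPING), "locations,", len(data_fetches), "data fetches")
```

Here three locations need only two data fetches, because two of them share a station. Keeping the location-to-station mapping separate from the station data mirrors the storage layout suggested in step 2.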

10,000 building locations will typically map to under 2,000 weather stations, or fewer if the buildings are concentrated in certain countries or regions. You only need to do this mapping once when you backfill (for which performance is unlikely to be critical), and with LocationInfoRequest it will only take one request unit for each building location. You will save request units in your backfill and in each of your updates, as you will only need to make a heavyweight LocationDataRequest for each of your weather stations (e.g. 2,000) rather than for each of your geographic locations (e.g. 10,000).
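
To put rough numbers on that saving, the sketch below compares the request units needed for a backfill when fetching by geographic location against two-stage fetching. The figure of 20 request units per data request is purely an assumption for illustration; the real cost depends on how much data each request asks for.

```python
def backfill_request_units(num_locations, num_stations, units_per_data_request):
    """Compare total request units for a backfill under the two approaches."""
    # Fetching by geographic location: one data request per location.
    by_location = num_locations * units_per_data_request
    # Two-stage: 1 unit per LocationInfoRequest, then one data request per station.
    two_stage = num_locations * 1 + num_stations * units_per_data_request
    return by_location, two_stage


# 10,000 locations mapping to 2,000 stations, assuming 20 units per data request.
by_location, two_stage = backfill_request_units(10_000, 2_000, units_per_data_request=20)
print(by_location, two_stage)
```

Under these assumed numbers the backfill takes 200,000 request units by location and 50,000 with two-stage fetching. Regular updates need no mapping step, so they scale with the number of stations rather than the number of locations.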

The more locations you have, the greater the advantage of this approach.

A few more tips:

You will find more detail on LocationInfoRequest in the programming-language-specific documentation on this website. But for now the main thing is just to understand the general approach of two-stage data fetching and determine whether it makes sense for your application.

Tips for a robust system

We have designed the API with careful consideration of the edge cases and things that could go wrong, with the aim of shielding you from as much of the underlying complexity as possible. But there are still some particular things you might want to plan for if you are aiming to build a robust system:

Preparing for stations with shorter data histories than you might want

At present it is not possible to fetch any data from before the year 2000. We calculate degree days accurately using detailed temperature readings taken throughout each day (typically hourly or more frequently than that), and continuous records from the last century are patchy and difficult to get hold of. This is the only real disadvantage of the accurate calculation method that we use.

Also, many of the stations in our system were set up more recently than 2000. And some stations have had measurement or reporting problems that have caused us to discard some of their earlier data. So the length of data history available varies from weather station to weather station.

If you specify a geographic location (a postal/zip code or longitude/latitude position) and leave the API to choose the weather station automatically, it is always best to specify the longest data history that you will ultimately want, as that may affect which weather station the API selects to satisfy your request. This is the case whether you are using LocationDataRequest to actually fetch the data, or LocationInfoRequest to just get the ID of the weather station that the API would use to satisfy your data specification.

If you request more data than the API can supply for your specified location (whether it's a weather station ID or a geographic location), the API will, by default, return what it can from within the range you requested. Except in the unlikely event of your geographic location having no active stations near it, recent data should always be available ("recent" meaning to within around 10 days of yesterday in the location's local time zone, and usually up to and including yesterday). And the API will never return data with gaps in it. But there are limits on how far back in time you can go, so you might find data missing from the start of your requested range. If you'd rather receive an error than a partial set of data, you can specify a minimum required range in your request.
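
If you accept partial data rather than specifying a minimum required range, you may still want your own code to notice how much is missing from the start of the range. The sketch below is one way to do that on the client side. The function names and the dates are hypothetical, and it assumes you can read the first day covered from whatever the API returned.

```python
from datetime import date


def days_missing_at_start(requested_first_day, returned_first_day):
    """Days that were requested but are absent from the start of the returned data."""
    return max(0, (returned_first_day - requested_first_day).days)


def is_acceptable(requested_first_day, returned_first_day, max_missing_days):
    """Decide locally whether a partial data set is good enough to store."""
    return days_missing_at_start(requested_first_day, returned_first_day) <= max_missing_days


# Invented example: data requested from 2005 but only available from 2008.
missing = days_missing_at_start(date(2005, 1, 1), date(2008, 1, 1))
acceptable = is_acceptable(date(2005, 1, 1), date(2008, 1, 1), max_missing_days=365)
print(missing, acceptable)
```

Because the API never returns data with gaps, comparing the first day requested with the first day returned is enough to measure the shortfall.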

Dealing with stations going down (if you are fetching data by weather station ID)

Although a station might be working well today, there is no guarantee that it will still be working well next month or next year. Unfortunately not even the best "airport" stations managed by organizations such as NOAA or the UK Met Office are exempt from reliability problems.

If you're only storing data from a handful of locations, it might not be worth worrying about the possibility of one of your stations going down. But, if you're storing data from hundreds or thousands of locations, it's likely that you'll run into station downtime at some point, so you might want your system to be prepared for it. We've designed our system to make it as easy as possible for you to handle station downtime in a robust manner.

Small patches of downtime are automatically filled with estimated data. But our system will only do this when it has temperature readings on both sides of the gap. So, if a station goes down for a while, its most recent data won't be available until it comes back up again. Bear this in mind if you're fetching updates at the start of each day/week/month: if a station went down towards the end of the last day/week/month, it will need to come back up again before our system can patch the gap with estimated data and supply a value for that last day/week/month.

If you need the latest data for a given location, but the station you have been using does not yet have it, you can put in a request for the missing data from the underlying geographic location of interest. Specify a minimum-required range that includes the latest data, and, provided you haven't got your timezones mixed up, the API will hopefully find a stand-in station that can supply the data you need. If you store the stand-in data, you could replace it later if/when your original station recovers (if you think it's important to use a consistent source).

A long period of downtime will result in a station being labelled "inactive". This happens if a station doesn't report any usable temperature readings for around 10 days or more (10 being an approximate number that is subject to change).

If you try to request data from an inactive station, you'll get a LocationNotSupported failure. (In .NET or Java this will appear as a LocationException, in Python a LocationError.) This is an indication that you should find an alternative station to use as a replacement. Typically this would involve you making another request for data from the geographic location that you ultimately want the data for (e.g. the location of the target building) so that the API can automatically choose a replacement station for you.
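
The replacement logic described above might look something like the sketch below. Everything in it is a stand-in: StationInactive is a hypothetical exception used in place of the client library's own failure type, the two fake fetch functions take the place of real API calls, and the station IDs and values are invented.

```python
class StationInactive(Exception):
    """Hypothetical stand-in for a LocationNotSupported failure."""


def fetch_with_fallback(station_id, geo_location, fetch_by_station, fetch_by_location):
    """Return (station_id_used, data), switching station if the stored one is inactive."""
    try:
        return station_id, fetch_by_station(station_id)
    except StationInactive:
        # Let the API pick a replacement for the underlying geographic location.
        return fetch_by_location(geo_location)


# Hypothetical stand-ins for real API calls.
def fake_fetch_by_station(station_id):
    if station_id == "STATION_OLD":
        raise StationInactive(station_id)
    return {"2024-01": 0.0}  # placeholder value, not real degree days


def fake_fetch_by_location(geo_location):
    # A request by geographic location also reports which station was used.
    return "STATION_NEW", {"2024-01": 0.0}


used_id, data = fetch_with_fallback(
    "STATION_OLD", "site-1", fake_fetch_by_station, fake_fetch_by_location
)
print(used_id)
```

The caller would then store the returned station ID against the building location, so that later updates go to the replacement station.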

Dealing with occasional volatility in recent data

There can occasionally be some volatility in the most recent data. Sometimes a station's automated reporting system goes down, then comes back up again, leaving a gap in the reported data. As explained above, our system will plug the gap with estimates, but occasionally the station will recover the missing data and report it several days later, enabling our system to calculate the degree days more accurately.

It's unusual for this sort of thing to happen, but it can happen occasionally. Such volatility would generally only affect the latest 10 or so days of data, so it's easy to counteract by fetching a little more data than you need each time you update your database. For example, instead of fetching the latest day, fetch the latest 30 days; instead of fetching the latest week, fetch the latest 4 weeks; instead of fetching the latest month, fetch the latest 2 months. Overwrite any previously-stored values with the most-recently fetched values. The vast majority of the time it will make no difference (and when it does the difference will almost always be small), but this is a good approach to maintaining data quality that doesn't usually add much complexity.
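
The overwrite step can be sketched as a simple merge of freshly fetched values into stored ones. This is an illustration only: apply_update is a hypothetical helper, a plain dictionary stands in for your database, and the dates and values are invented.

```python
def apply_update(stored, fetched):
    """Overwrite stored values with freshly fetched ones.

    Returns the sorted list of dates whose previously stored value changed.
    """
    changed = sorted(
        day for day, value in fetched.items()
        if day in stored and stored[day] != value
    )
    stored.update(fetched)
    return changed


# Invented values: the overlapping day was revised after the last update.
stored = {"2024-03-01": 10.2, "2024-03-02": 11.0}
fetched = {"2024-03-02": 11.4, "2024-03-03": 9.8}
changed = apply_update(stored, fetched)
print(changed, stored)
```

Returning the list of revised dates is optional, but it makes it easy to log how often recent values actually change.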

Account management for installable software

To use the API you need API access keys. These exist so you can ensure that your API account is only used by the people and software systems that you've authorized to use it.

If you're making an internal system, or a public-facing web application that is hosted on servers you control, you will probably only need one API account and one set of access keys. You can securely embed them into your application without much risk of them escaping. That's the simple case.

Don't embed your API access keys into an application that customers install themselves

If you are making installable software (a desktop, mobile, or server application that your users install themselves), it would be unwise to embed your API access keys into that application. If those access keys escape, then anyone who gets hold of them will be able to use up the data-generation capacity that you have reserved for your application. And you won't be able to replace the compromised access keys without redistributing a new version of your application.

Let your customers enter their own API access keys into your software

One good option is to leave it to your customers to get their own API accounts and enter their access keys into their installations of your software to unlock the degree-day-based functionality. Just point them to our sign-up page and tell them where to enter their API access keys once they've subscribed. We have designed our account plans specifically to support this model.

This is a particularly good approach if not all of your customers want the degree-day-based functionality - you can leave it as an optional feature that they can unlock if they want it, or ignore if they don't. It's also a good option if you expect some of your customers to make heavy use of the API (e.g. if you're making software for large multi-site organizations).

Build an intermediate server

Another option is to build an intermediate server. The customer's installed application would request data from your intermediate server, which would fetch the data from our API (using access keys that are stored securely on your server only) and pass it back down to the customer's application. This approach is more complicated but it gives you tighter control over the licensing side of things.

Running things through an intermediate server makes sense if you're building an installable application for the consumer market, as, although our low-end accounts are attractively priced for businesses, they're not really priced for consumers. If you're selling a $2.99 iPhone app for residential energy tracking, your non-business customers are unlikely to want to pay for an API subscription from us.

Also, with a mass-market consumer application you will find, at scale, that many of your customers will share the same weather stations (as groups of them will live in the same neighborhoods). By routing everything through a server that you control you can cache data locally, reduce your API access by generating data for each station only once, and pass on the cost savings to your customers. With a few thousand customer locations this sort of caching may be more development effort than it's worth for the limited data-reuse it would make possible at that scale, but with hundreds of thousands or millions of customer locations it could certainly be worthwhile for consumer applications where keeping costs down is a priority.
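
As a rough sketch of that kind of caching, the class below fetches data for each station at most once per day and serves every later request for that station from memory. StationCache and fake_fetch are hypothetical names, the once-per-day refresh rule is an assumption chosen for simplicity, and a real server would need persistent storage and error handling.

```python
from datetime import date


class StationCache:
    """Cache station data in memory, refetching at most once per day per station."""

    def __init__(self, fetch_station_data):
        self._fetch = fetch_station_data
        self._entries = {}  # station_id -> (day fetched, data)

    def get(self, station_id, today):
        entry = self._entries.get(station_id)
        if entry is None or entry[0] != today:
            # Nothing cached for today yet, so fetch once and remember it.
            entry = (today, self._fetch(station_id))
            self._entries[station_id] = entry
        return entry[1]


# Hypothetical stand-in for a real API call made with server-side access keys.
api_calls = []


def fake_fetch(station_id):
    api_calls.append(station_id)
    return {"2024-01": 0.0}  # placeholder value, not real degree days


cache = StationCache(fake_fetch)
day_one = date(2024, 3, 1)
for _ in range(3):  # three customers whose homes share one station
    cache.get("STATION_A", day_one)
cache.get("STATION_A", date(2024, 3, 2))  # a new day triggers one refresh
print(len(api_calls))
```

In this example four customer requests result in two API calls, one for each day.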

Enough of the theory...

If you're a programmer, we suggest you take a look at the Java quick-start guide, the .NET quick-start guide, the Python quick-start guide, the PHP sample code, the Ruby sample code, or the XML API docs. With the Java, .NET, and Python client libraries you can literally be fetching data from the API within the next few minutes. How best to integrate with the API will probably become a lot clearer once you're familiar with the code itself.

© 2018 BizEE Software Limited