
Solar Inverter Connectivity Problem Reveals Weak Troubleshooting

Oct 21, 2024

I like to think I’m decent at troubleshooting, but a recent failure reminded me how important it can be to break recalcitrant problems into their constituent parts. No one will encounter this precise issue, so take this story as a general lesson.

In short, I interpreted a communications failure from our solar panel inverters as a network problem, but it turned out to be a power issue associated with a feature in an uninterruptible power supply (UPS). Embarrassingly, I didn’t figure this out until the solar repair tech came to take a look and walked me through the parts of the system.

In 2015, we installed 18.9 kW of solar panels to provide nearly all the electricity our house uses, including geothermal-based heating and cooling and charging one of our two cars, a 2015 Nissan Leaf. (“Nearly all” because we had to anticipate future usage when sizing the system, and we guessed slightly low. NYSEG usually charges us a few hundred dollars in late winter after we’ve used up our generation credit from the summer.)

Overall, the solar panels have worked well, feeding into three SMA Sunny Boy inverters (one for each array of panels) that distribute power to the house and send the excess to the grid for our neighbors. The inverters publish generation statistics to the Web-based Sunny Portal, which emails me the totals daily. I don’t care about the exact numbers, but those email messages alert me when something has gone wrong. We’ve had an inverter fail, the connection with the portal occasionally drops, and I once forgot to turn on the main disconnect switch after some troubleshooting.

We didn’t install any battery-based power storage that could power the house in the event of an outage, partly because when we designed the system, power outages had been infrequent and brief. We did opt to put a standard electrical outlet on each inverter so we could tap into whatever power was being generated during an outage for specific uses. In the winter, that would be sufficient to charge our phones (100–300 watts of output, if the panels aren’t covered with snow, with chargers drawing 5–60 watts); in the summer, it’s more than enough to power our garage chest freezer (3000–5000 watts of output, with the freezer drawing only 150 watts after a startup spike of maybe 300 watts).
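For what it’s worth, the mental math involved is just subtraction. Here’s a minimal Python sketch of that sanity check, using the rough wattage ranges above; the figures are ballpark estimates from this article, not measurements.

    # Back-of-the-envelope check: will a load run comfortably from an inverter outlet?
    # The wattages are the rough ranges mentioned above, not measured values.

    def fits(worst_case_generation_w: float, peak_load_w: float) -> str:
        """Compare worst-case generation against a load's peak draw."""
        margin = worst_case_generation_w - peak_load_w
        if margin >= 0:
            return f"{margin:.0f} W of headroom"
        return f"short by {-margin:.0f} W"

    # Winter: 100 W minimum output vs. a 60 W phone charger.
    print("Phone charging in winter:", fits(100, 60))

    # Summer: 3000 W minimum output vs. the freezer's ~300 W startup spike.
    print("Chest freezer in summer:", fits(3000, 300))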

From July 10 through July 16, we lost power three times. The first happened during a tornado warning but was actually caused by the local highway department repaving our road, raising it enough that a passing truck caught and broke the overhead electric wires leading to our house. Oops! The remaining two outages stemmed from the severe storms that have become commonplace with climate change. I grew up on a farm 12 miles away, watching the summer weather intently to make hay with my father, and today’s weather patterns are vastly more volatile, judging by the string of alerts that rolled in just while I was writing today. Shortly after I finished writing this article, we lost power again. It’s getting tedious.

After a previous power outage this spring (we’ve had six so far in 2024, a new record), the SMA inverters lost connectivity, and I had to perform a reset sequence on them to get them to talk to the portal again. Fortunately, Halco, the company that installed them, was happy to tell me via email what to try, so I didn’t have to wait for a tech just to flip three switches and circuit breakers in the correct order.

The inverters lost connectivity again after the first of the recent power outages on July 10. I assumed—bad idea!—it was the same problem, so I ran through the reset sequence. But we lost power again on July 11, so I didn’t know if my reset had worked. I remembered to run the reset again a day or so later, but the inverters didn’t appear on the portal, and I got caught up in weekend activities and failed to check again until July 15. I never saw whether that reset worked because we lost power again that day, our longest outage ever at 29 hours.

During that last outage, I used an inverter outlet to power our chest freezer for the first time, so I was fussing with the inverters for much of the day. When disconnected from the grid, they report only the draw, not production, and it took me a while to figure this out and prove to myself that they were producing enough that I didn’t have to worry about browning out the freezer. I flirted with connecting the freezer to my iMac’s unused UPS to eliminate the possibility of a cloud or darkness dropping the power generation to a dangerously low level, but the freezer wouldn’t run from the UPS for reasons I still don’t understand. Perhaps the UPS could sense that the freezer’s power draw was unusual, even though it should have been able to handle 150 watts of draw.

Once the power came up for good, I ran through the reset sequence again on July 17, to no avail. I tried it a few more times before contacting Halco. They agreed that I’d tried all the basics and promised to send a tech out. Luckily, he had some time a few days later, well before the scheduled appointment, and he quickly revealed something I hadn’t known—that you can knock on the front of the inverters in a certain way to get them to cycle through status messages, one of which showed that they had IP addresses and thought they were communicating correctly. Yes, part of the troubleshooting process was to smack the case, just as we occasionally had to do to early computers.

“It must be something inside,” he declared, so we went in to look at my networking gear. I showed him where the Ethernet cable from the inverters plugged into an Ethernet switch… and why wasn’t that switch showing any activity lights? “Maybe it got fried in the outage,” he suggested, but I pointed out that it was plugged into a UPS. We traced its power cable down to the UPS, where it was plugged in next to the hub for my temperature monitoring system (see “Wireless Sensor Tags Protect Against Freezer Failure,” 25 July 2018), whose activity lights were equally dark. The UPS itself was clearly working, since the Eero base station that manages my network was online, and the device that lets me manage my Honeywell thermostat from an iPhone app still had power.

We pulled the UPS out from its dim corner and examined it. That was when I realized what had happened. This particular APC UPS has a “master” outlet and several “controlled” outlets. When the device plugged into the master outlet goes into sleep or standby mode, or is switched off, the UPS cuts power to the controlled outlets to save energy. However, this mode isn’t enabled on this UPS by default—there’s a physical Master Enable button you press to turn it on. The Eero Pro, being the most important of the networking devices, was plugged into the master outlet, so why didn’t that keep the controlled outlets powered up? And what changed during the outage such that the controlled outlets became disabled?

Each time the power went off, I turned the APC UPS off to silence its annoying alert beep. Because it’s in a dark corner and I couldn’t turn on a light, I had to feel around for the power button somewhat blindly. (This button is illuminated, but it’s common for a power switch to turn on a separate status light.) During one of the first two outages, I must have inadvertently pressed the Master Enable button before finding the actual power button. The Eero Pro consumes only about 5 watts, presumably below the threshold the APC UPS requires on the master outlet before it will power the controlled outlets. That makes sense: if I had a Mac plugged into the master outlet and it went to sleep and dropped to 5 watts, there would be no need to power peripherals that are useful only when the Mac is in use.
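Put another way, the feature amounts to a simple threshold comparison. Here’s a toy Python model of the behavior as I understand it; the cutoff value is a made-up placeholder, not a specification I looked up.

    # Toy model of the APC master/controlled outlet behavior as I understand it.
    # THRESHOLD_WATTS is a placeholder guess, not a published specification.
    THRESHOLD_WATTS = 15

    def controlled_outlets_powered(master_enabled: bool, master_draw_watts: float) -> bool:
        """With Master Enable off, the controlled outlets always get power.
        With it on, they get power only while the master device draws enough."""
        if not master_enabled:
            return True
        return master_draw_watts >= THRESHOLD_WATTS

    # My situation: feature accidentally enabled, Eero Pro drawing about 5 W on the master outlet.
    print(controlled_outlets_powered(master_enabled=True, master_draw_watts=5))   # False: switch and hub go dark
    print(controlled_outlets_powered(master_enabled=False, master_draw_watts=5))  # True: everything stays powered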

Once I took the APC UPS out of master/controlled mode, the Ethernet switch and temperature monitoring hub instantly powered up and started working, and the SMA inverters immediately connected to their portal. Problem solved!

Why didn’t I figure this out on my own sooner? In my defense, my Wi-Fi network was functional, and because my iMac uses both Ethernet and Wi-Fi, I didn’t notice that the powered-down Ethernet switch was preventing Ethernet from working in my office. In retrospect, Tonya and I did notice some slight network issues that were probably related to the extender Eero in my office being forced to fall back to wireless backhaul instead of Ethernet.

Plus, two of the four networking devices in the collection were working, so some LEDs were on back there. I had even unplugged and replugged the Ethernet cable from the inverters, so I had looked right at the Ethernet switch without realizing it was powered down. I often go months without a notification from the temperature monitoring system, so it wasn’t surprising not to have heard from those sensors during all this. And I very seldom need to control the Honeywell thermostat from my iPhone, so I didn’t notice that it was offline (its Ethernet cable runs to the powered-down switch, even though the device itself had power).

In other words, my troubleshooting was sloppy. Because I had solved the inverter connectivity problem in one way previously, I didn’t look beyond that solution to follow the connectivity path upstream. Once the Halco tech made me do that, the problem quickly became apparent.
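Had I forced myself to walk that chain earlier, the checklist might have looked something like this small Python sketch, which tests each link in turn. The names and addresses are placeholders rather than my actual network, and some devices won’t answer pings at all (an unmanaged Ethernet switch has no address of its own), but everything behind a dead link goes silent at once, which is exactly the pattern I should have noticed.

    # Walk the connectivity chain from the inverters to the portal and report
    # what answers. Hosts and addresses are placeholders, not my real network.
    # Some hosts, especially remote web servers, may legitimately ignore pings.
    import subprocess

    CHAIN = [
        ("SMA inverter", "192.168.4.50"),
        ("Office Eero (wired through the Ethernet switch)", "192.168.4.2"),
        ("Eero base station", "192.168.4.1"),
        ("Sunny Portal", "www.sunnyportal.com"),
    ]

    def reachable(host: str) -> bool:
        """Send a single ping (macOS/Linux syntax) and report whether anything answered."""
        try:
            result = subprocess.run(["ping", "-c", "1", host], capture_output=True, timeout=5)
        except subprocess.TimeoutExpired:
            return False
        return result.returncode == 0

    for name, host in CHAIN:
        status = "ok" if reachable(host) else "NO RESPONSE"
        print(f"{name:50s} {host:22s} {status}")

Even a crude check like this would have shown immediately that nothing wired through that switch was answering, pointing the finger at the switch (or its power) rather than the inverters.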

And that, dear readers, is the moral of the story. If your first attempt at solving a technical problem fails, break the problem into parts and verify that each of those parts is functional on its own. Every problem will be different, but by starting at one end and eliminating every possible variable, you should be able to discern the cause.