Debugging
Errata: Misspelled a word last time. There are no such things as "simulatories." It should have read "simulators." Sorry about that.
Anyway, it's Monday again and I already debugged an issue with a recent test. The spacecraft and payload (instrument) teams will be running checkouts on the various instruments starting today. One will be on the Thermal and Evolved Gas Analyzer (TEGA).
The tests we ran last week on the simulator had some issues. Basically, the test sequence sets some parameters that the hardware uses, but setting those same parameters in the simulator involves a little more work. I had to compare the various runs, and that's how I discovered the cause of the issue and the solution to it. This is why we run tests on the sim and Engineering Models (EMs) before running the sequences on the Flight Models (FMs).
For every single test we run, there are dozens of documents produced during the processing of the data. Examples are log files of every command or telemetry check, a log of every Event Report (what we refer to as an EVR), an as-run copy of the test, and many more. These are just text files, often in spreadsheet format. They include the date and time of each event and are ordered chronologically. I know, based on how the script was written, in what order things should occur and what responses to expect. I know which telemetry values should change and how, and I often know how long it should take for that to happen. By comparing the order and values between what is expected (in the script) and what actually happens (in the logs), I can begin to debug problems. I also compare data between different runs, i.e., the same test at different times or in different venues, like the hardware or the sims. Sometimes the same process is done in two completely different tests, and I can still compare the responses, as they should be the same.
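As a rough illustration of that expected-versus-actual comparison, here is a minimal Python sketch. It is not the actual tooling used on the project; the comma-separated log layout, the date, and names such as HEATER_ON are all invented for the example.

```python
import csv
import io
from datetime import datetime

# Hypothetical log excerpt: timestamp, entry type, entry name.
# The layout and every name in it are assumptions made for illustration.
SAMPLE_LOG = """\
2000-01-01T14:00:00,CMD,HEATER_ON
2000-01-01T14:00:01,EVR,HEATER_CMD_ACCEPTED
2000-01-01T14:00:04,TLM,HEATER_STATE=ON
"""

# What the (hypothetical) test script says should happen, in order.
EXPECTED = [
    ("CMD", "HEATER_ON"),
    ("EVR", "HEATER_CMD_ACCEPTED"),
    ("TLM", "HEATER_STATE=ON"),
]


def parse_log(text):
    """Return a list of (timestamp, kind, name) tuples in file order."""
    rows = []
    for stamp, kind, name in csv.reader(io.StringIO(text)):
        rows.append((datetime.fromisoformat(stamp), kind, name))
    return rows


def first_mismatch(expected, actual):
    """Return the index where the log first departs from the script, or None."""
    for i, (want, got) in enumerate(zip(expected, actual)):
        if want != got:
            return i
    # Every shared entry matched, so any difference is a missing or extra entry.
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    return None


events = parse_log(SAMPLE_LOG)
actual = [(kind, name) for _, kind, name in events]
print(first_mismatch(EXPECTED, actual))
```

The index that comes back points at the first entry worth looking at by hand; a result of None means the logged order matched the script.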
By "telemetry" I mean the data that we monitor. We can monitor things like ON/OFF states, temperatures, or positions, plus hundreds of others. We use this data to track how the system is responding to commands and to make sure the spacecraft or instrument is in good condition. There is also a lot of telemetry that is used when things are failing. We use it to debug what went wrong, or right, as there are safeguards in the software to prevent damage from erroneous commands.
From all this data, I can compare large swaths of code, or I can monitor response time to less than a second. It can get quite tedious, and confusing, too. Looking at the same snapshot from two different runs and keeping them separate can be quite challenging. This is my job. And I'm good at it.
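To give a feel for the response-time side of this, here is another small Python sketch. Again, it is only an illustration under assumptions: the two runs, their timestamps, and the names HEATER_ON and HEATER_STATE=ON are made up, and real logs would need parsing first.

```python
from datetime import datetime, timedelta


def response_time(events, command, telemetry):
    """Seconds from the command to the first matching telemetry entry after it."""
    sent = None
    for stamp, name in events:
        if name == command and sent is None:
            sent = stamp
        elif name == telemetry and sent is not None:
            return (stamp - sent).total_seconds()
    return None  # command never sent, or telemetry never changed


# Two hypothetical runs of the same test in different venues.
hw_start = datetime(2000, 1, 1, 14, 0, 0)
sim_start = datetime(2000, 1, 2, 9, 30, 0)

HARDWARE_RUN = [
    (hw_start, "HEATER_ON"),
    (hw_start + timedelta(milliseconds=3250), "HEATER_STATE=ON"),
]
SIM_RUN = [
    (sim_start, "HEATER_ON"),
    (sim_start + timedelta(milliseconds=3900), "HEATER_STATE=ON"),
]

hw = response_time(HARDWARE_RUN, "HEATER_ON", "HEATER_STATE=ON")
sim = response_time(SIM_RUN, "HEATER_ON", "HEATER_STATE=ON")
print(f"hardware: {hw:.3f} s, sim: {sim:.3f} s, difference: {sim - hw:.3f} s")
```

Measuring each run from its own command time is what keeps the two runs separate: the wall-clock times differ, but the intervals can be compared directly.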
Posted at 02:32 PM