Everything Else

Event, Acquired

25 April 2014 —Take a deep breath and absorb these three words: All software crashes. This happens because all computers and the software that runs on them are made by humans and humans are imperfect. Overcoming our limitations long enough to make something that is almost perfect is an incredibly difficult task. Yet on top of that, we always burden our imperfect selves with the requirement that we add new features and capabilities that new hardware and operating system (OS) software might afford us.

Throughout the relatively brief history of computing, three trends constantly overlap and propel each other. These are the development of hardware, the development of OS software and the development of application software (like Helix). These three development cycles are rarely in sync, somewhat unpredictable, but continually advancing. The cycles never stop, because, as the foregoing discussion of human imperfection implies, they never totally succeed. But every once in a while, a milestone is reached, ‘the stars align’ and a period of relative stability begins.

Today is one of those days. Today, at last, we introduce Helix 6.2.3, the culmination of work on Ganymede, the code name of the project that brought Mavericks compatibility to Helix. Helix 6.2.3 contains more than 50 improvements across the entire product line. Nearly one third of them address crashes that were reported in earlier versions. The complete list of bug fixes and other improvements is found on the Helix 6.2.3 Release Notes page. If you’ve already upgraded to Helix 6.2, you’ll want to download today’s free maintenance update right away. We’re confident that it will dramatically enhance your Helix experience. If you have yet to make the leap from a version prior to 6.2, there’s no time like now to make that move. We fully expect this will be the last of the Helix 6.2.x updates.

A crashing bore

Helix 6.2.2 got us to Mavericks in fairly short order and, for a brief moment, Helix, always the straggler, actually led the pack as the only Macintosh database application product that was ready, beating even FileMaker to the punch. Helix RADE and Helix Engine thrived, but it quickly became apparent that Client/Server was not thriving; some users even began reporting irritating Client and Server crashes. Few things can be as infuriating as looking up from your keyboard to find that all the work you’ve been doing the past few seconds, minutes, hours or longer has disappeared because your program has crashed. Word crashes. Photoshop crashes. Safari crashes. macOS crashes. Even iPhones and iPads crash. In spite of how much devotion there is in the Helix community, its users have had to endure these indignities like users of any other program or device.

Shortly after the release of 6.2.2 in January, we solved a number of the problems that were driving users to distraction, including one that prevented users from logging back into a collection after a Client crash, but it soon became apparent that the irritating Client and Server crashes that were being reported were actually all stemming from a single source. Armed with this knowledge, we deemed this insidious little viper, which has come to be known as the ‘AcquireEventFromQueue’ crash (AEFQ for short), potent enough to defer the release of 6.2.3 until we got it under control.

Early in each development cycle, software crashes more frequently than it runs. Creating great software always, at some point, calls for minimizing the occurrence of crashes. The elimination of every crash is not possible. It’s a fantasy. Anyone who tells you different does not understand software. Once again, all software crashes.

AEFQ was not an issue in either RADE or Engine; it manifested itself only in Helix Server and Helix Client, which means that when it happened, more users were affected. It’s one thing to occasionally frustrate and infuriate a single person, and we certainly don’t take such disruptions lightly. But it’s quite another to do it to an entire company full of hardworking people. Data security has been a hallmark of Helix for years, but despite the fact that no data is lost during a Server crash, the loss of confidence caused by the crash itself always overshadows that fact.

When Helix moved to macOS, we gained access to better diagnostic tools. Since our first macOS product hit the street, this has enabled us to solve not only most of the new problems we encountered, but many old — indeed ancient — ones that had long evaded isolation as well.

AEFQ has been with us since we first came to macOS, and these new tools informed us from the start that it wasn’t our code that was crashing. We were asking macOS to do something for us, and macOS was shutting us down, rather than giving us the feedback we had come to expect. The only way we could solve this problem was to find a way to reproduce it reliably.

While every crash is an annoyance, recurring crashes are the most annoying. But recurring crashes are not necessarily reproducible crashes, which are the “holy grail” of the debugging process. A reproducible crash is the key that unlocks the conundrum, as it allows us to study the problem again and again from as many angles as it takes.

Beating AEFQ and moving on

As we said earlier, we were committed to fixing AEFQ before releasing 6.2.3, so we started throwing every resource at our disposal at it. Finally, our persistence paid off, the long-sought breakthrough was found, and we went from having test sites that crashed multiple times per hour to not a single report of an AEFQ crash since the fix was put in place.

Every battle we have fought on this journey has gone pretty much the same way: discovery, frustration, diligence and salvation. The hard work has consistently paid off, as it did once again.

Now, nearly a decade has passed since we first gathered around the digital campfire to figure out how on earth this far-flung but determined crew was going to get Helix from where it was all the way to macOS.

While we could accurately predict very little of what lay between us and our goal, we clearly saw bright, gleaming beacons in the distance, compelling us forward, shaping a clear vision of what had to be done in order to be able to say “We did it right” when we finally reached our destination.

From the outset, we dared not contemplate what might lie beyond the peak our predecessors fought so diligently to avoid climbing. What would we do once the playing field was finally level? How could we best prepare Helix for a future beyond our own temporal existences? Should we set aside any wild dreams of the future, put our heads down and play a marathon round of catch-up? Or simply give up, and go home?

We chose to stay and fight, especially for our devoted supporters who have stuck with Helix through this turbulent phase of its life. And now, with the AEFQ crisis behind us, forward motion resumes.

Next week we will test the first alpha build of Callisto, the next generation of Helix. Our self-imposed demand for backwards compatibility — Helix 6.2.3 remains compatible back to Helix 6.0 — prohibited us from making the structural modifications required to build the exciting new features you have been asking for. Now we are free of that restriction, and we have big plans.
