Dear SPSS: We Have To Break Up

Dear SPSS (or PASW or whatever you call yourself these days),

It’s not working out. For the past few years I’ve tried to pretend everything was all right–and even before that I wasn’t completely satisfied, but I never really expected to be, because there’s no such thing as a perfect statistics application, right? So who’s to say what’s “good enough” in this crazy, mixed-up world? Maybe my standards are too high if I’m not content with your admittedly vast array of analytical features. I guess what I’m trying to say is, It’s not you; it’s me.

I never expected you to be perfect, or to do absolutely everything–I was comfortable processing my data with Excel or Notepad++, doing power analysis with G*Power, and structural equation modeling with Mplus, for instance. I never expected you to be easy to understand, though you weren’t that difficult to get to know, as long as I didn’t try to get friendly with the unnecessarily complex and occasionally baffling inner you–you know, your syntax. No, it wasn’t those things, though I admit I occasionally dreamed of something better. But to say this isn’t about my expectations would be a lie.

When I met you, I didn’t know what to expect, or what was reasonable to expect. Everyone said you were the best match for me, and I didn’t question that. As time went on, though, I learned what might be possible, I talked to other researchers about their statistics programs, and I started to wonder. Some of your oddities started to annoy me. I remember wishing you understood trimmed means. I wished you had better generalized linear modeling capabilities. I wished you could transform a freakin’ variable without going through the uber-clunky “transform” or “recode” dialogues. For that matter, I started to wonder why I had to point and click half a dozen times to do simple things, like producing a histogram. And why medians aren’t included by default in “descriptive statistics.” For that matter, why can’t I change the defaults? Sure, your mysterious, quirky syntax was the answer to many of these questions, but what kind of answer was that? Complexity for complexity’s sake; skills completely useless for anyone but you. It was programming in the least exciting way: open a file, write the code, run the code, check the output window, debug the code, repeat. Call me impatient, but that is a slow, unrewarding, and occasionally painful process. After a few years of it, I found reasons to avoid anything involving your syntax. I’d claim I didn’t really want those results, anyway. I’d say I could probably do that with Excel. I’d say I was too busy. I’d say I had a headache.

There are some things I know you’re capable of, but I’ve never done with you: the “extras.” When I’ve suggested trying certain things–pretty normal things, no matter what you pretend–like SEM, replacing missing data (and no, I don’t consider mean replacement or single regression imputation to be particularly exotic), advanced modeling, or exact tests, let alone anything really freaky like map graphics, text analysis, or classification trees–you act all shocked, like you’ve never even heard of those things, or like no self-respecting application would do those things; those kinds of things are only for a few odd researchers working in esoteric fields. Oh, but if I really want to do them, you might be able to oblige me… it’ll cost extra, though.

Another thing: lately you haven’t really been available for me. Back in the day, we’d do analyses anytime, anywhere I wanted. Well, I still want that. I need that. You suggest my needs are unrealistic, but I suspect a lot of researchers can relate: sometimes I need analysis in the middle of the night, sometimes on the weekends, and sometimes even in Canada. You seem to have walled yourself off from me, though. You used to give me what I needed when I needed it, but now your licensing agreement says I only get analysis when I’m physically on campus. Sure, you have a “commuter license” option, generously allowing me to do my research for as much as two weeks at a time from off campus (it used to be a month…), but that really doesn’t help me over a Christmas “break” that lasts twice as long, and it didn’t help me during those three-month summers living with my wife in Ontario while I tried to catch up on all the research I had neglected through the teaching year. Before you say it, yes, I know I could have your, ah, services more frequently… for a rather stiff price. Duly noted.

And how about your price? Sweetie, you are expensive. I don’t know if there’s some universal value for what you provide, but your price sure seems steep. I’m glad my job pays for you because, honey, I certainly couldn’t afford you even for one year, and your fees need to be paid annually. They say money can’t buy certain things, and it’s apparently true. Buying just isn’t an option for your services; only leasing.

I didn’t want to bring this up, but since we’re on the topic, I might as well: you’ve changed since I met you. Quite a lot. I know everyone has to adapt in this fast-paced data analytical world, but in addition to substantive changes in your procedures, you seem to always include a few cosmetic (no problem) and arbitrary changes to your UI (problem). I don’t tend to notice these until right when I’m preparing to teach a class, or when I need a graph or analysis for a conference presentation in four hours. People in neighboring offices have heard me say, “Why the hell did they put that there in this version?!” at high volume. Overall, I have to admit that you are getting better–in some ways–with time, but you’re always a follower, always a few years behind the trends. And then there’s the weight.

You’re constantly reinventing yourself, it seems (except that core, deep version of you, which never seems to get an update), and one of the most frustrating aspects of this is that you get heavier each time. I remember when we first got together… was it 1995? You fit on a few 3.5″ floppies. Now, I’m informed by my university IT department that I have no choice but to load all 1.3 gigabytes of you onto my poor little laptop. It’s become un-economical, they say, to support previous versions of you. And I don’t disagree; I’ve had some experience with your economy. I wish I weren’t so shallow, and I know I’ve gained some fat around the middle, too–all that sedentary desk jockeying–but your weight really does bother me.

So, I think we need to break up. It’s just not there for me anymore. I could say the magic was gone or I’d grown into a different person or something, but really it’s more fundamental: you don’t do what I need you to do when I need it or at a price I can stomach. It’s over, SPSS. I’m sorry. You might think this is because I found someone else, and I have to admit you’re right. I’m sure you noticed when I brought home product literature for SAS and Minitab from conferences, and, seriously, what red-blooded researcher wouldn’t do a double-take when Stata was in the room? Yes, there’s someone else, but it’s not what you think. Well, depending on your views, maybe it’s much worse. It’s R.

I know R is younger than you–a lot younger–but try not to feel bad about that. Age has nothing to do with it. Almost nothing, anyway. And R is responsive: I type something, R does it. If I want to see a histogram of x, I don’t point and click a bunch of times; I type hist(x). If I want to see what that would look like, log-transformed and divided by its own median, I don’t have to use the transform dialogue or use the COMPUTE command with an EXECUTE afterward, remembering the period at the end of each command, then run the file and check the syntax… only to realize that you can’t recurse like that… and when I get the recursion figured out and a new variable created (which I’ll have to go delete after I realize that I didn’t really want that particular transformation), then I’d have to do all the pointing and clicking (or more syntax running) to see the histogram. No, I just type hist(log(x)/median(x)). Oh, and by the way, sweetie, that period is for this blog post only; R doesn’t need your superfluous periods at the ends of lines; it also doesn’t need punch cards. Was that a cheap shot? Yes. But I said it, and I’m not taking it back.
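In case you want to see what you’re missing, here is roughly how that looks at the R prompt. It’s only a sketch: the variable x is made-up data I generate on the spot, and “divided by its own median” could be read more than one way, so I show two readings rather than claim one of them is the right one.

```r
# Illustrative data only: a made-up, right-skewed, strictly positive variable.
set.seed(1)
x <- rlnorm(200, meanlog = 3, sdlog = 0.8)

hist(x)                    # the raw distribution
hist(log(x))               # log-transformed
hist(log(x) / median(x))   # log-transformed, then divided by the median of x

# Another reading of the same sentence: scale by the median first, then log.
hist(log(x / median(x)))
```

No new variable gets created along the way, so there is nothing to go back and delete afterward.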

I know right now you’re thinking, You call my syntax complicated, then you shack up with R? Yeah, R has a learning curve; no denying it. But it’s not meaningless complexity. It’s not the complications of game-playing or anachronistic devotion to 1970s programming standards; it’s the complexity of a fundamentally good application. It takes a while to learn, and I’m sure I’ll be learning about my new partner for many years, but everything I learn means something, you know?

And R is free. And R is available whenever I have needs. R doesn’t have your air of (ahem) tradition, or SAS’s gloss-black BMW SUV thing going on, or Stata’s almost-European glamor and stylishness; but R has class–even if some people can’t see it, distracted by the punky haircut and open-source attitude.

When a problem comes along, R doesn’t waste time complaining about how hard it is or upselling me for a few hundred bucks more; R rolls up the ol’ sleeves and gets to work. R will do what’s gotta be done, as long as I’m there doing my part. R feels more like a partner than either a servant or a master. R can process my data six ways from Tuesday, extract meaning from open-ended interview transcripts, and do any kind of modeling or simulation I want. R has basic descriptives (and inferentials) at the touch of a few keys. Hell, R can even do crazy things, like animation and ganking my data directly from online sources. If R can’t do something, it’s a sure bet it can be made to do it with some research and coding. Wanna know what I did with R last weekend? I simulated 10,000 different random responding patterns for my questionnaire data, computed each actual participant’s maximum similarity to any of those 10,000 random draws, and then ran correlations between that max similarity and other variables in my dataset. It took about 30 lines of code. I didn’t think about you even once.
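For the curious, a weekend like that might look something like the sketch below. It is not my actual 30 lines. The data are invented, the sizes (50 participants, 20 items, a 1-to-5 response scale) are placeholders, and I’m assuming “similarity” means the Pearson correlation between two response profiles, which is only one of several reasonable choices.

```r
set.seed(42)

# Hypothetical questionnaire: 50 participants, 20 items, answers from 1 to 5.
n_people <- 50
n_items  <- 20
n_sims   <- 10000

# Stand-ins for real data: observed responses and one other variable.
responses <- matrix(sample(1:5, n_people * n_items, replace = TRUE),
                    nrow = n_people)
other_var <- rnorm(n_people)

# Each row is one simulated random responder.
random_patterns <- matrix(sample(1:5, n_sims * n_items, replace = TRUE),
                          nrow = n_sims)

# cor() correlates columns, so transpose to put one profile per column.
# The result has one row per participant and one column per random draw.
similarity <- cor(t(responses), t(random_patterns))

# For each participant, keep the highest similarity to any random draw.
# na.rm guards against a constant (zero-variance) simulated profile.
max_similarity <- apply(similarity, 1, max, na.rm = TRUE)

# Relate that score to another variable in the dataset.
cor(max_similarity, other_var)
```

With real data, responses and other_var would come from the dataset instead of being generated, and the similarity measure could be swapped for whatever suits the questionnaire.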

Sure, R can be confusing–like I said, R is complex, but with the kind of complexity that comes from just the right kind of simplicity applied to complicated problems, like a good programming language, which is what R is, after all. After the initial learning period, every new thing I learn about R helps me solve a dozen problems later on, it seems. I have to ask lots of questions, of course; sometimes I get tough-love answers like, “write a function yourself,” “RTFM,” or “Why would you want to do that?” And when I buckle down and deal with those things, even if that takes hours or days, I end up a better researcher. Months later, when someone asks me how to do something, I can often sit down at my desk and, in a few minutes, do something that SPSS-only colleagues are amazed by. But sometimes the answer to my question is, “Here is some code I wrote for you, with an explanation of why it works,” or (even better), “Here is a small thing you don’t understand about this function that will make this problem much easier.”

So, I guess that’s it. I’m with R, now. We’re happy. I won’t forget the good times you and I had–I still have the posters and abstracts–and I’ll recommend you to many of my students… with a few caveats. We’ll see each other at collaborative research projects and maybe in some classes, and I’ll be polite when people ask why we’re not together anymore. But it’s over. Don’t pretend to cry. You’ll bounce back. Actually, with millions of users and a healthy profit margin, I don’t think you’ll even notice I’m gone.

On second thought, there are a couple of things you do a little more easily than R, and sometimes you’re just a little more accessible, or easier to explain to my research assistants, so I hope you don’t mind if I drop by every few weeks for a quickie.

