Out With the Old

Happy New Year, Stats-Heads! I hope you all had a wonderful and safe holiday season. I, for one, am glad that the hiatus is back off again (to paraphrase the Beastie Boys. Hint: for a good time, go here – http://en.wikipedia.org/wiki/Oscilloscope_Laboratories).

Sorry to disappoint, but this week’s blog isn’t so much about statistics or programming as it is about my hair. I had been growing my hair to give to a wig-making not-for-profit, and finally just couldn’t stand it any more.

Soooo…. (drumroll)… I chopped off almost nine (NINE) inches and sent my hair off to this wonderful organization benefiting children. See my ‘What I did’ section for details.

Looks are (almost) Everything…

Hey There Smarties! I hope you all had a wonderful Thanksgiving and have spent some time appreciating all the great people in your life. I’m sorry that I missed you all last week, but it was a short week and – well, who am I kidding? I sat around obsessing over one of my favorite YouTube channels: Lifehaker. Science Tuesday makes it so hard to get my blog posts in on time. I think I’m going to try this little trick on my sister this holiday season: http://www.youtube.com/watch?v=6gNcd1QTfNY

Now, though, it’s time to get down to business. This week, I couldn’t think of a topic, so I took it to the streets. After asking a lot of people what I should write about, my friend Chris came up with a great and doable topic for this week.

We’re going to talk about making nice-looking (APA or FDA formatted) tables in SAS, using our old frenemy Proc Report.

Basics II

Welcome back, Mathletic Supporters! After a splendid long weekend with friends in New York, and then at Ladies Rock Camp (http://girlsrockri.org/, or more generally: http://girlsrockcampalliance.org/ – but more about these next week), I am reminded again of how important the basics are: good friends, good music, and – uh – good foundations in statistical modeling. Last week, we talked about two of the assumptions of linear (OLS) regression. This week, let’s pick that conversation back up with the other two main assumptions.

Back to Basics

Howdy brainiacs! You’re not deluded. This week’s post is indeed two days late. I have a great excuse, though: I was away visiting old college buddies this past weekend. Great time, but I am jet-lagged.

Let me say, if you are ever in the mood for great food, lovely people, and scenic – well, scenery, get yourself to Portland, Oregon. I may have to compile a list of the nifty places that we hit up, so you can add them to your tourist lists.

In the meantime, seeing great old friends this weekend made me think about getting back to basics. For instance, we all run multivariate regressions (don’t we?), but how often do we really check our underlying assumptions? This week, let’s start with linear regression, and build up in the next few weeks to other types of regressions.

Text Parsing and Regular Expression Matching

So I had these grand plans of writing a brilliant blog post this week about triangulation in general and then talking about text parsing specifically. The “What I Did” section was going to talk about using Python to parse text, and then the “How I Did It” section was going to show you how to do a call-out from R to Python and then pull the data back into R for analysis…

…But, we’re getting close to Halloween and that meant a party-grrl weekend for me. Honestly, the only time I got on my computer was to watch the Michelle Phan make-up tutorials on YouTube (The Vampire one was my favorite! https://www.youtube.com/watch?v=qg-2rDnWCJA , http://www.emcosmetics.com).

So, this week, I am going to give you a really quick and short how-to on Regular Expression Matching and replacement, then show you the commands in SAS, Stata, and R.
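To give a taste of what matching and replacement look like before getting to the SAS, Stata, and R commands, here is a minimal sketch in Python (the language I had originally planned to use for parsing). The sample records and the phone-number pattern are made up for illustration; they are assumptions, not data from a real project.

```python
import re

# Hypothetical sample records; the data and the pattern are illustrative only.
records = ["Call 401-555-0134 today", "No phone listed", "Fax: 401.555.0199"]

# Three digits, a hyphen or dot, three digits, a hyphen or dot, four digits.
# The parentheses capture each chunk so it can be reused in the replacement.
phone = re.compile(r"(\d{3})[-.](\d{3})[-.](\d{4})")

# Matching: keep only the records that contain the pattern somewhere.
matches = [r for r in records if phone.search(r)]

# Replacement: rewrite every match into one consistent format,
# referring to the captured groups as \1, \2, and \3.
cleaned = [phone.sub(r"(\1) \2-\3", r) for r in records]

print(matches)
print(cleaned)
```

The same two ideas, a pattern that finds text and a template that rewrites it, carry over to the other packages; only the function names change.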

Proc SQL

What is on my mind today is SAS’ Proc SQL. Since SAS is most data analysts’ go-to for parsing large data sets, people ask me all the time how they can get a small subset of data with just the variables they want in the order that they want them.

If you don’t have a ridiculous number of variables, I usually tell folks to just subset using the drop, keep, and where statements in the Data Step (see the ‘how I did it’ page for an example).

Sometimes, though, you need to make a nice clean analysis data set to preserve and share (see my upcoming post on collaboration). Here’s where I use Proc SQL.
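The core idea is a single SQL query that names the variables you want, in the order you want them, with a condition that picks the rows. As a rough sketch of that idea (not SAS code), here it is in Python with the built-in sqlite3 module. The table `visits`, its columns, and the values are all hypothetical. In SAS, a query of this shape would sit between `proc sql;` and `quit;`, usually after a `create table ... as` clause.

```python
import sqlite3

# Hypothetical in-memory table standing in for a larger source data set.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (id INTEGER, age INTEGER, site TEXT, score REAL, notes TEXT)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?, ?)",
    [
        (1, 34, "A", 7.5, "x"),
        (2, 51, "B", 6.0, "y"),
        (3, 29, "A", 8.2, "z"),
    ],
)

# SELECT lists only the wanted variables, in the wanted order;
# WHERE keeps only the wanted rows.
cursor = conn.execute(
    "SELECT site, id, score FROM visits WHERE site = 'A' ORDER BY id"
)
columns = [d[0] for d in cursor.description]  # column names, in output order
rows = cursor.fetchall()
conn.close()

print(columns)
print(rows)
```

Note that the output columns come back in the order listed in the SELECT clause, not the order in which they were stored.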

By the way, if you are a SAS user and haven’t already checked them out, the SAS Users Group white papers are great resources for your programming. Here’s their very informative white paper that is an intro to Proc SQL.

p070-27.pdf

On my mind

Welcome to my new blog! In upcoming weeks, I hope to post my new thoughts on statistics, some quick techniques, and – the best part – my favorite outfit of the week.

I’ll also send some links to my favorite methods sites/books/blogs – as well as a few of my favorite style sites.

If you have any topics that you would like to see addressed, give me a holler. I’d love to start a discussion forum.