For nearly two years, I've been trying to branch out and add another programming language to my brain.  I read and blogged about Seven Languages in Seven Weeks, by Bruce Tate, an excellent book that I blasted through in seven days to save a little time.  If you read my blog, you'll know that I finally settled on Haskell, started posting about my experience as an object-oriented programmer writing in a functional language, and then things kind of fizzled out.

I really like Haskell.  However, I think I'm one of those people who learn better under pressure.  Since I didn't have a job requirement to learn Haskell or an otherwise motivating situation, I never really got into it.  I still plan to, some day.

But I have finally picked the "new" language I want to learn, and that is R (I say "new" because, of course, R is not a new language).  I had a number of reasons for choosing it:
  • Big Data is all the buzzword-rage right now, and R figures prominently in many big-data scenarios.
  • I'm taking MOOCs at Coursera, and the ones I'm taking use R as the programming platform, ensuring that I must have more than a superficial understanding of the language.  I had actually looked at R once before and never stuck with it, for the same reason I did not stick with Haskell -- no looming deadlines!
  • As I learn more about R, I become more impressed by how handily it performs tasks that require a lot of boilerplate code in any other language I've used, so that experience provides me more motivation to keep learning.
  • I am currently working at a bank, and I'm already starting to use R not only to greatly speed up some tasks that I need to perform, but also to perform analyses that would have required so much Java code that they would have gone on the "back burner."
I'm also happy to report there has been some convergence, for me, among big data, R, Haskell and my recent exposure to functional programming.  R is an interesting language.  I don't have an especially formal computer-science background (instead, I'm from physics, math, and electrical engineering), so I probably would not be the best person to articulate how R checks (and does not check) boxes for functional and object-oriented languages.  But all that Haskell investigation helped a lot when I started learning MapReduce, and seeing functional features in R that also fit well into the MapReduce paradigm makes me feel, as all curious types should, that the investigation was worthwhile.

I'll still blog about Java occasionally, but my posts for the near future will be focused on my self-training to fill in gaps in my skill set related to big data.  I have started a new blog on this topic, called Data Scientist in Training.  If you read me on DZone, you don't have to do much to find me, as my posts from both blogs will continue to find their way to DZone (the big-data posts go to a microzone called Big Data/BI Zone).  If you read me directly on Blogger, then please bookmark the link above if you're interested in what I'm doing.  At the least, please take a look at my Welcome! post, where I explain my path and reference some resources that may be useful if you want to learn more about big data, too.

My posts about R on Data Scientist in Training will not explicitly say anything in the title like "Java developer struggles with R data frames", but it will still be obvious that my approach to R is that of a developer who has used Java for about 90% of his coding for the last 15 years.  If you're a Java developer and are learning R, I hope there will be some content there of special use to you.  As I've searched online while learning R, I've noticed helpful responders trying to explain how to move from the "use a for-loop to iterate and then build your model in rows" approach to "use a mapping function to create your new column of data, then add it to your data frame".  (In fact, this brings up another feature I like about R -- its data frames remind me of tables in the column-oriented databases used extensively in big data.)  I'm going to blog in near-real-time so I don't forget the dead ends I encountered as I was trying to map Java onto R, and that perspective is the one I think will be most helpful to fellow Java/OO developers.
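
For Java readers, the shift from "loop over rows" to "map over a column" can be sketched in Java itself rather than in R.  This is only an illustration under stated assumptions: the class name, the method names and the mortgage-term column are hypothetical, and the Map below is a toy stand-in for a data frame, not an equivalent of one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ColumnMapping {

    // Loop habit: visit each row index and append one derived value at a time.
    public static List<Integer> yearsWithLoop(List<Integer> termMonths) {
        List<Integer> years = new ArrayList<>();
        for (int i = 0; i < termMonths.size(); i++) {
            years.add(termMonths.get(i) / 12);
        }
        return years;
    }

    // Mapping habit: apply one function to the whole column and collect the new column.
    public static List<Integer> yearsWithMap(List<Integer> termMonths) {
        return termMonths.stream()
                .map(months -> months / 12)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Toy "data frame" (hypothetical): column name -> column values, in insertion order.
        Map<String, List<Integer>> frame = new LinkedHashMap<>();
        frame.put("termMonths", Arrays.asList(360, 180, 120));

        // Derive the new column in one step, then attach it to the frame.
        frame.put("termYears", yearsWithMap(frame.get("termMonths")));

        System.out.println(frame);
    }
}
```

Both methods produce the same column; the difference is that the second describes the transformation once for the whole column, which is much closer to how the R answers I mentioned are phrased.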

There are a few posts on Data Scientist in Training already.  The next one will be specifically about R -- I hope you check it out when it arrives!

I've been experimenting with using Pig on some Fannie-Mae MBS data lately.  While I don't mind writing MapReduce programs to process data (especially the fairly simple tasks I'm doing now), I really do appreciate the "magic" Pig does under the blanket, you might say.
I've recently been writing JMS clients for an application I'm building and keep finding myself having to re-learn some basic configuration.
A few years ago, I posted a how-to on Java-SE-based Web Services. More recently, I've become interested in asynchronous web-service invocation, and, as it turns out, Java SE supports that, too. This post, then, is the asynchronous version of that older post.
In an earlier post, we stepped through the building of an asynchronous web service, deployed in Java SE. I saved my comments for this post to keep things a little cleaner.
As I have mentioned in earlier posts, I am using the Java Debug Interface (JDI) to create a Java process-monitoring tool.
In my Part-1 post on this topic, we actually did all the I/O I'm going to do here. We lazily read in the entire sample data file, a file containing data describing events generated by a process monitor. My next goal was to re-hydrate my Events from the Strings serialized to the file.
I'm about halfway through Real World Haskell, and I've spent a week trying to decide when to write this post. As the authors point out, Haskell I/O is easy to work with.
For the last few weeks, I have been building a Java process monitoring tool based on the Java Debug Interface. Although I've done much of this work before, it has been a few years, and so now I'm retracing my steps.
After my last post scrolled off the bottom of the page, I realized I missed a couple of opportunities: one related to some additional code optimization, and one related to the topic of lazy (or nonstrict) evaluation.

First, let me review what I was doing.
Today I'm going to process a set of structured data using Haskell, tainted by years of Smalltalk, C++, Java and C# experience.
I've been working on a JDI (Java Debug Interface) project lately and have been posting helpful tips as I go along. It has been a few years since I've worked with this API, but although I know there have been a few enhancements, the API is quite consistent with what I remember.
Today I'm looking at Haskell type definition and the use of pattern-matching in functions. Pattern-matching is much more an integral feature of FP, as opposed to OO. But first...
I'm learning Haskell by following O'Sullivan, Goerzen and Stewart's Real World Haskell. I've been writing object-oriented code for well over half my career as a developer, and there are things about functional programming that really stand out to me specifically because of my OO background.
For the last year or so, I've been trying to come up to speed on functional programming, studying bits and pieces here and there. One interesting source was Bruce Tate's Seven Languages in Seven Weeks, which included a number of FP languages.
If you've looked at my recent posts, you know I'm working on a plugin for VisualVM, a very useful tool supplied with the JDK. In one example, I showed how to attach to a waiting Java application using a socket-based AttachingConnector.
Say you've got a good-sized chunk of code, in production, that doesn't always act as expected but it does so often enough that everyone's willing to keep using it (including your customers).
As I mentioned in an earlier post, I noticed recently that the JDK utility VisualVM is extensible, and it was my goal to create a useful extension.
I'm not sure when this happened, but at some point the JDK-included VisualVM utility became extensible. This is really great news for me. Writing a profiler is a lot of work (although admittedly very interesting and fun).
We use DWR (http://www.directwebremoting.org) a lot where I'm working now. It's a Java library used to integrate JavaScript-based web development with Java middleware.
I've reached the 7th and final language of Bruce Tate's Seven Languages in Seven Weeks. While some of the previous languages were functional with some imperative support, Haskell is a purely functional, and statically typed, language.
Today I'm reviewing the discussion of Clojure from Bruce Tate's Seven Languages in Seven Weeks. Clojure is Lisp on the Java virtual machine. Lisp is another language that, despite being around a long time, I have yet to investigate, so this is another new experience.
If you are just dropping in on me, I'm reviewing Bruce Tate's Seven Languages in Seven Weeks, with the slightly lazy (or aggressive, depending on your view) twist of reviewing one language per day.
Scala is the 4th language in Bruce Tate's Seven Languages in Seven Weeks. It is the only language in this book with which I am already familiar, although I've only been learning it for a month or so.
Prolog is the 3rd language covered in Bruce Tate's Seven Languages in Seven Weeks, and is a declarative, rather than imperative, language. Prolog is not new, of course (1972), but I have to admit this is the first time I've taken a look at it.

Like Bruce, I used GNU Prolog (1.3.1).
Io is the 2nd language in Bruce Tate's Seven Languages in Seven Weeks. Io is a prototype-based language, where most of the mass exists in the libraries. The syntax itself is refreshingly simple, and he is not exaggerating to say you can grasp it in about 15 minutes.
When I decided to blow through Seven Languages in Seven Weeks in only 7 days, I had yet to read even the introduction to the book.
I recently started learning Scala (you know when the New York Times refers to Java as an "older" language, it's time to update!). As I've started trying to shift my thinking from object orientation to functional programming, I remembered the book Seven Languages in Seven Weeks by Bruce Tate.
Recently I had a repeat of a problem I was unable to solve the first time I encountered it.
Hello, everyone:

I've posted a number of entries in the last year about profiling Java applications. Some of this effort went into a standalone Java profiler with a Swing interface, called the MonkeyWrench.
DTrace, a dynamic tracing framework on Solaris (see dtrace(1m)), is a valuable and extremely easy-to-use tool if you find yourself analyzing Java performance issues on Solaris.
My entries here are usually about Java, but today I want to call attention to the 1.0 release of FlexMonkey, an Adobe AIR app for testing Flex and Adobe AIR applications (I work for Gorilla Logic, the creator of FlexMonkey).
In recent posts, I've investigated using the java.lang.instrument package to instrument Java classes for a simple profiler.
I've been building a Java profiler lately, partly to address what I think are shortcomings in the current set of free/shareware profilers, and also to enjoy the flexibility of obtaining only the profiling information I need.
Lately I've been looking at profilers and wondering how easy it would be to write one, mostly to be able to provide a little more sophisticated guidance to performance investigations.
About Me
I'm a software architect/consultant in Boulder, Colorado.