Monday, December 11, 2006
Infallibility of Graphs
My job description is perilously close to having "statistician" added to the requirements. For nearly the last two months I have felt like the primary function of my job has become in-depth data analysis. This is strikingly similar to the research post I held with my dark oppressor Dr. Xue at ASU. I essentially write programs that gather data. I then write more complicated programs that analyze the data the first set of silly programs produced. What always manages to astonish me is the sheer volume of data a single engineer can produce when he or she is attempting to find something to analyze. It helps to impart a sense of perspective as to why data mining exists as a proper subfield of computer science.
5 machines
1 program
6 samples per minute
43,200 data points from that program per 24-hour period.
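For anyone who wants to check that last number, here is a quick Python sketch of the arithmetic. It assumes the six samples per minute are taken on each machine, which is the reading that makes the total work out; the names are made up purely for illustration.

```python
# Assumption: each of the 5 machines runs one copy of the program,
# and each copy takes 6 samples per minute around the clock.
MACHINES = 5
SAMPLES_PER_MINUTE = 6
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes in a 24-hour period


def points_per_day(machines, samples_per_minute):
    """Total data points produced across all machines in one day."""
    return machines * samples_per_minute * MINUTES_PER_DAY


print(points_per_day(MACHINES, SAMPLES_PER_MINUTE))  # 43200
```

At that rate a single program crosses a hundred thousand rows in under three days.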
My personal datamart here at work is currently a MySQL database with several hundred thousand rows generated exclusively by my tiny little apps. I have written a whole array of programs that exist solely to filter this junk and make some of it meaningful.
Without geeking out about the problem I have been attempting to solve, I will simply say that it has proven to be beyond a bitch to track down. Hence the reliance on statistics and pretty graphs to illuminate it. This last week was a major breakthrough with this specimen:
Hopefully no one needs a degree in statistics to realize this graph has some issues in the same sense "global warming" has some issues. Five separate computers, with identical software, identical hardware, and dissimilar traffic loads, all experienced an identical perceived drop in incoming packet data. It is identical to the real-world problem, and it has only been reproduced in captivity once.
All I really know at this point is that I am absolutely thrilled that I have something I can wave in front of management as if to say "Look, see! This is why you pay me my salary!" It also makes me think of the term no-brainer and why I am going to bludgeon someone should it be uttered in my presence. Specifically because I recall someone once saying that the risk of presenting irrefutable evidence related to a previously unknown problem is that someone above you will say "obviously that is a problem! Look at that graph, it's a no-brainer," without giving a second thought to all the research and work that went into simply producing the damn thing. One battle at a time though... Right now I am still enjoying the breathing room this graph has bought me.
---
In other news, last week I sent an email message to Dr. Sen asking if the department needed any TAs for the low-numbered classes in the CS department. I have yet to receive any kind of response, but I cannot help but feel the answer would be "yes," since so few grad students actually want to TA those courses. I would liken it to a bunch of English graduate students begging to TA or teach English 101. The difference is that a lot of computer science graduate students actually suck at programming, whereas I sincerely hope a bunch of English graduate students can all write decently. Anyhow, it is essentially my clever plan to reintroduce myself to academia slowly to see if I want to pursue more classes. I feel myself becoming increasingly jaded toward the process of enrollment and vastly more critical of the entire academic institution day by day, not to say that corporate America is any better. At this exact moment in time both paths sound like folly.
Comments:
i have good solid evidence that there are lots of humanities grad students (in relatively prestigious programs) who can not write, at least not in standard english...alas. grad school admission standards are great at selecting people who jump through hoops well but not much else.