Question: What does my sister Rachael have in common with Ulysses and a pair of Houston DJs that I, alas, do not?
Answer: AOL Search.
I assume most readers will have heard about AOL releasing what 650,000 subscribers searched for over a three-month period. Although
the data was ostensibly "anonymized," just being able to
correlate a user's searches can lead to that user being identified, or
at the very least to a pretty clear picture of who and where they are.
Although AOL eventually pulled the data from their
research.aol.com website, it lives on, mirrored on many sites
around the world.
I downloaded the data and this evening finally got around to looking
through it (briefly). I decided to see how many people had searched for
anything that led them here or to planet.cleverly.com.
AOL broke the data up into ten individual files, each somewhere between
212 and 228 megabytes (uncompressed). Using standard Unix utilities I
ran grep -li cleverly.com * and was surprised to see that
9 out of 10 of the files matched!
Upon closer examination, however, people were searching for either
rachael.cleverly.com (my sister's old website) or
stevensandcleverly.com (Stevens & Cleverley, Houston DJs on KRTS 97.5 FM, which may now be defunct). Nobody was searching for blog.cleverly.com, planet.cleverly.com or even my old michael.cleverly.com website.
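The "closer examination" step can be sketched in a few lines of Python. This is only an illustration, not the commands actually used above: the function name tally_hosts is hypothetical, the sample lines are invented, and the tab-separated layout (user ID, then query) is an assumption about the data files.

```python
import re
from collections import Counter

# Hostname-like strings ending in "cleverly.com", case-insensitive like grep -i.
# The dot is escaped here; in the grep pattern an unescaped "." matches any character.
HOST_RE = re.compile(r"[a-z0-9.-]*cleverly\.com", re.IGNORECASE)

def tally_hosts(lines):
    """Count how often each distinct matching hostname appears (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        for host in HOST_RE.findall(line):
            counts[host.lower()] += 1
    return counts

# Invented sample lines; the real files are assumed to be tab-separated.
sample = [
    "101\trachael.cleverly.com\t2006-03-06 17:00:00",
    "102\twww.stevensandcleverly.com\t2006-03-20 23:43:00",
    "103\tRachael.Cleverly.com essays\t2006-04-01 10:00:00",
    "104\tweather houston\t2006-04-02 09:00:00",
]
hosts = tally_hosts(sample)
for host, n in hosts.most_common():
    print(n, host)
```

Against the real files, the same function could be fed each open file object in turn, since it only needs an iterable of lines.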
So Rachael's more popular among the Internet masses than I am. Or at
least people want to read her old college essays more than mine. Either
way I'm OK with that. :-)
Come to think of it I've never actually read Ulysses and
I don't think I ever wrote an essay analysing a poem in college. So no
wonder nobody is looking for me...
So what kind of profile can we glean from Rachael's anonymous
homework stalkers? Let's see...
#10,536,410
Our first mystery user, let's call her Alice, looks to be a student from
The University of Alabama in Huntsville.
Alice went looking for Rachael's website on March 6, 2006 at 5pm. She
appears to regularly need to use a search engine to search for websites
that she could just go to directly if she knew how to use her browser's
address bar. Oh, and type better.
#8,516,760
Our second user, let's call him Bob, appears to be a student
at Texas State. He searched for
Rachael's old website on March 20th at 11:43 pm. Some of his other
search highlights:
#12,569,041
Our third user, let's call her Carol, was the busiest little searcher of
the bunch. She performed 119 separate searches. Her prime interests seem
to revolve around:
#2,799,138
Let's pretend our final user is Dave. In addition to searching for
Rachael's essays on poems he mainly seemed to be wanting information on
different colleges. Perhaps a high school junior getting ready to start
applying to colleges?
As for who these aspiring fans of Rachael really were, they probably
weren't really Alice, Bob, Carol and Dave. They are just the
prototypical example characters in discussions on cryptography.
— Michael A. Cleverly
Thursday, August 10, 2006 at 22:34
Last year I wrote about paying more
for the pleasure of lots of layovers—how for an extra $111 I could
visit airports in Missouri & Ohio on my way from Salt Lake to Portland,
Oregon for the 12th Annual Tcl/Tk
conference.
Well, it is time to start thinking about the
13th Annual Tcl/Tk
conference... this year the conference is being held in Naperville,
Illinois (near Chicago). Unlike last year, when I had to pay my own way,
this year my employer is paying for it (though I'd have gone anyway
if they hadn't).
This afternoon I looked up what it would cost to travel by train.
(Salt Lake is a major stop on the California Zephyr route between Chicago and San Francisco.)
Much to my surprise, coach tickets were actually $10 less each way than
what I could find for flights into Chicago's Midway airport: $120 vs.
$130.
Amtrak's website, just like Delta's last year, seems to be programmed
to really go the extra mile and give you every last possible itinerary
option. For an extra $148 ($268 total) I could return home from
Naperville to Salt Lake the roundabout way:
- Saturday 9:38 AM leave Naperville on the Illinois Zephyr
- Arrive in Chicago, Saturday 10:30 AM
- Saturday 2:15 PM leave Chicago on the Empire Builder
- Arrive in Portland, Monday 10:25 AM
- Monday 2:25 PM leave Portland on the Coast Starlight
- Arrive in Sacramento, Tuesday 6:15 AM
- Tuesday 11:14 AM leave Sacramento on the California Zephyr
- Arrive in Salt Lake City, Wednesday 3:15 AM
Travelling this way would only take the better part of five days (longer
than the conference itself) to get home!
I'm attending the conference with two co-workers who would
apparently rather face the indignities of airport security and several hours
cramped with no leg room than view the scenic beauty of the American Midwest,
if it means cutting some twenty-nine-odd hours off the trip.
As for me, I have only vague memories of traveling by train from
Salt Lake to Los Angeles as a young child. The Zephyr has a
certain romantic appeal to it. If I don't "seize the day," so to speak,
will I ever get around to it otherwise?
Something worth thinking about for a few days before booking airfare
I think...
— Michael A. Cleverly
Monday, August 14, 2006 at 19:45
I'm reading Henry Petroski's Success through Failure: The Paradox of Design (and quite enjoying it).
While illustrating that "the connection between intention and result,
between cause and effect, is not always what it seems" Petroski sheds light on
a great mystery I've wondered about before: does pushing the crosswalk
button actually accomplish anything?
Blaming an unfortunate occurrence on bad design may make for a convincing
damage claim, or even a successful lawsuit, but the connection between
intention and result, between cause and effect, is not always what it seems.
Over three thousand intersections in New York City have signs instructing
pedestrians, "To Cross Street / Push Button / Wait for Walk Signal." A good
deal of time often elapses between pushing the button and getting the
go-ahead, but conscientious citizens obediently wait. They presume, one
presumes, that a delay is part of the system's design. It may be a "bad
design," but the light does change—eventually.
New York intersections began to be fitted with these "semi-actuated
signals" around 1964. They were the "brainstorm of the legendary traffic
commissioner, Henry Barnes, the inventor of the 'Barnes Dance,' the
traffic system that stops all vehicles in the intersection and allows
pedestrians to cross in every direction at the same time." Walk buttons were
installed mostly where a minor street intersected a major one, along which
traffic would be stopped only if a pavement sensor detected a vehicle waiting
to enter from the minor street or if someone pushed the button, causing the
light to change ninety seconds hence. With increased traffic (by 1975, about
750,000 vehicles were entering Manhattan daily), the signals were being tripped
frequently by minor-street traffic. The walk buttons hardly seemed necessary,
and pushing them interfered with the coordination of newly installed
computer-controlled traffic lights among many thoroughfares. Consequently,
most of the devices were deactivated by the late 1980s, but the buttons
themselves and the signs bearing the instructions for their use remained in
place. Evidently there was never any official announcement about the status
of the "mechanical placebos."
Which doesn't necessarily mean the buttons are placebos anywhere other
than New York, but it does make one wonder...
— Michael A. Cleverly
Monday, August 14, 2006 at 22:08
Paul Boutin at Slate took AOL's
published search data and used a commercial software package
to analyze what people searched for. His conclusion:
AOL's data leak reveals the seven ways people search the web.
Briefly, his seven classifications are:
- The Pornhound
- The Manhunter
- The Shopper
- The Obsessive
- The Omnivore
- The Newbie
- The Basket Case
Unfortunately the article doesn't give us percentage breakdowns for
the relative population size of each of these seven groups. (For
the record I believe I'd be an Omnivore, though I'd
never used AOL's search prior to their releasing this data.)
Nor does the article indicate whether each person is strictly limited
to being placed in a single group, or whether one person might be classified
as both a Newbie and a Basket Case at the same
time.
I suspect people can belong to multiple classifications since an
illustrative characteristic of being a Newbie is one
"who confused AOL's search box with its browser address window."
I wrote a short Tcl script to count the number of unique users who
had at least one search matching the following regular expression:
{^[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,6}$}
I found that over 78.6% of AOL users had searched for what appears
to be a domain name instead of using their browser's address bar directly.
(516,882 out of 657,426 to be precise.)
Maybe 21.4% of AOL's customers really have taken the training wheels off?
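The original Tcl script is not shown, so here is a rough Python sketch of the same kind of count, using the regular expression quoted above. The function name count_domain_searchers, the invented sample lines, the tab-separated layout (user ID first, query second) and the "AnonID" header check are all assumptions made for illustration.

```python
import re

# The regular expression quoted in the post, unchanged.
DOMAIN_RE = re.compile(r"^[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,6}$")

def count_domain_searchers(lines):
    """Return (users with at least one domain-like query, all users seen)."""
    all_users = set()
    domain_users = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines and an assumed header row.
        if len(fields) < 2 or fields[0] == "AnonID":
            continue
        user, query = fields[0], fields[1]
        all_users.add(user)
        if DOMAIN_RE.match(query):
            domain_users.add(user)
    return len(domain_users), len(all_users)

# Invented sample data: three users, two of whom searched for a domain name.
sample = [
    "AnonID\tQuery\tQueryTime",
    "1\tgoogle.com\t2006-03-01 10:00:00",
    "1\tweather\t2006-03-01 10:05:00",
    "2\tcheap flights\t2006-03-02 11:00:00",
    "3\twww.ebay.com\t2006-03-03 12:00:00",
    "3\twww.ebay.com\t2006-03-04 12:00:00",
]
matched, total = count_domain_searchers(sample)
print(matched, "of", total, "sample users searched for a domain name")

# The share reported in the post, recomputed from its own figures.
share = 100.0 * 516882 / 657426
print(round(share, 1), "percent")
```

Using sets means a user who searches for a domain name many times is still counted once, which matches the "unique users" wording above.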
— Michael A. Cleverly
Tuesday, August 15, 2006 at 20:37
Via the Language Log comes a question worthy of Sunday
dinner conversation: how would you complete each of the following sentences?
- The poll shows that a majority of people ___ against the war.
- The poll shows that a majority of people is against the war.
- The poll shows that a majority of people are against the war.
- The poll shows that a minority of people ___ against the war.
- The poll shows that a minority of people is against the war.
- The poll shows that a minority of people are against the war.
Readers were invited to participate in an online poll. This week the
results are in and I'm happy to report that I is not in the
minority... ;-)
How did I answer?
I chose "are" to complete the first sentence without hesitation.
Both readings, "(a majority of) people are"
and "a majority (of people) are," work for me.
Initially I wavered slightly in my commitment to "are" for the
second sentence, but in the end decided that I liked it better as
"(a minority of) people are" instead of
"a minority (of people) is"
(tolerable but rough sounding).
If you are curious about other people's justifications, be sure to check out
both comment threads. Apparently Microsoft Word would tell me I'm wrong (in both cases).
— Michael A. Cleverly
Wednesday, August 16, 2006 at 19:45