Showing posts with label blogsci. Show all posts

06 October 2008

miscellaneous topics




i'm caught between two platforms...

robotwisdom.com will blink out again
at the end of this month
unless another volunteer host turns up

i was appalled to learn that google
declines to index pages with more than 100 links
assuming they're spam
(i think pages with <100 are lazy)


google-reader-shared

has killed linkblogging for me
at least temporarily

i've fallen in love with reCAPTCHA

the little boost of knowing
i'm proofing random texts
makes that burden a joy

i tried the new online Monopoly
and find it pointless
except the perspective it offers
on googlemaps AI

if it implemented a map-recaptcha
i'd love it
to fill out the googlemaps model
with geographic intelligence

(Monopoly relaunches this week
if you want to try it out

about the only strategy
is sabotaging neighbors
or not

rural areas have few/no neighbors)


it may be a dark age for linkblogging
but it's interesting times for web 2.0

with tiny tweaks from myspace to facebook to twitter
bringing revolutionary vibe-shifts

which have many evolutions to go

it has to go back to
usenet's decentralised distribution




and i've got an idea
for a new antimath symbol
probably just "+"
that represents
a hypothetical absolute moral ideal

snubbed by almost all
subbing half-measures

pursued tirelessly by one or two







.

Web3.0 trends




there's some webwide design principles
emerging
that i'm not sure many people
can see yet

wikipedia needs a page for every product
for every person
for every event

(it should supplant the evil classmates.com
as a gatheringplace for nostalgists)

and it should aggregate ALL links
on every topic
sorting them and rating them

so when your vacuumcleaner breaks
you go to its wikipedia page
(which you bookmarked when you bought it
like all your possessions
in an intelligent database)
and find sorted links
to all pages about that product
including diy repairs

this will put google websearch out of business

its 'best-first' algorithms can't compete




the concept of bookmarking
'to read later'
needs to be refined

we experience a continuous stream
of links we find more or less tempting
and we need to sort and rate these
so that the next-most-attractive
is always a click (or two) away
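the sort-and-rate idea above is essentially a priority queue of saved links; a minimal sketch, assuming a single numeric 'temptation' score (all names here are invented):

```python
import heapq

class ReadLaterQueue:
    """keeps saved links sorted so the most tempting is always next"""

    def __init__(self):
        self._heap = []  # (negated score, url) pairs; heapq is a min-heap

    def save(self, url, temptation):
        # higher temptation = sooner; negate for min-heap ordering
        heapq.heappush(self._heap, (-temptation, url))

    def next_link(self):
        # pop the currently most-attractive saved link
        return heapq.heappop(self._heap)[1]

q = ReadLaterQueue()
q.save("http://example.com/ok", 2)
q.save("http://example.com/great", 9)
q.save("http://example.com/fine", 5)
```

the real refinement would be re-scoring items as tastes drift, but the queue shape is the core of it.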




everything on the web
should be striving to connect individuals
who share common tastes and interests
(eHarmony writ large)

everyone's stream of rating/judgments
automatically compared
with similar-others queued
for speed 'dating'







.

Birth of blogging




i may or may not use this post
to aggregate debate
about the origins of blogging

my word 'weblog' was intended to summarise
some simple basic design features
that hadn't yet been recognised as such

- the best web links
- of every kind
- briefly described
- added as discovered
- top-down
- keep the page short/quick
- link direct to the juicy stuff
- everybody shares their tastes

existing models all lacked some of these
(justin hall's was long/slow
winer kept rearranging the day's items)

i avoided 'news' at first
looking far and wide for good content
most others hadn't yet discovered

it was a (small) catastrophe
when blogger.com co-opted the word
for online diaries

few people maintain proper weblogs

and i'm still waiting
for a decent online web-logging site
(delicious is closest)

twitter isn't a good fit
urls are masked
archives are clumsy

the explosion of options
has become frustrating

'onebit blogging' a la
hypemachine and flickr
seem hopeful, in their way

[mefi thread]
[Ammann]
[Rosenberg]
[Winer]






.

What flickr/Twitter could learn from HypeMachine




others' faves should be 1stclass streams
with a variety of views

who-subscribes-to-whom should be public

more-than-onebit blogs
should be supported too
(eg featuring a content-creator a day
giving background on them)

merge subscriptions to new content
and subscriptions to others' faves
into one stream

prominently feature the count of faves
for each content-item
and conceal/reveal the 1st dozen or so faver-names
within the content-stream

clicking on a faver-name
shows their faves stream
and offers their subscriptions-list
and a single click adds/subscribes-to them




a few things hypem should add:

visible indication of which tunes
you've already tried

onsite user blogs

suggestions of users/blogs/bands/tunes
you'd likely like






.

Appreciating 'Hype Machine'




i imagine that the hype machine
chose their awful name as camouflage
for their true purpose
of free-mp3-blog-aggregator

but as these things go
they seem to be state-of-the-art

the design-theme is giddy redundancy
everything is one-step-or-less away
and infinite reversibility
(try anything, no commitment)




it took me embarrassingly long to figure out
how to play the tunes
and find the hidden mp3s
(click rightpointing triangle to play
(player at bottom even lets you change pages
without stopping the current song)

open 'read full post' link in new tab
to find mp3 link)




finding new music

start with a song you love
that you think defines your special taste
search for it, 'love'-heart it
go thru everyone else who's hearted it
and sample their recent faves
'follow' them if there's any overlap
(watch which blogs they find good tunes on
and 'follow' these too
or sample them as radio stations)
(following individuals with similar taste
is the general solution to
the longtail fail)

search for favorite artists
and 'follow' them
(it seems you can't yet backtrace
who else follows an artist?
or find generalised most-similar recommendations)




needs:

5star ratings scale

best-first priority-queue







.

Calculate your blogscore

[i want to tweak this until it accurately reflects mature blogging habits-- eg losing points for deleting unread posts...]

[i think it needs to be broken down into categories: pix, music, vids, tweets, news? humor? gossip?]

[i'm a newbie myself in some of these areas]

[some domains have better filtering tools available]

[you should score higher for finding meta-sources than straight sources]

[there may be objective network-math definitions possible: mathematically optimal coverage]

[each domain has a global average new-content rate;
each user has a time/value tradeoff curve for that content;
each filter delivers some average value per time]
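a toy version of that arithmetic, with made-up numbers; the point is just that a dense meta-source beats the raw firehose on value-per-minute:

```python
def filter_value_per_minute(items_per_minute, hit_rate, value_per_hit, seconds_per_item):
    """expected value per minute of reading a filtered stream

    items_per_minute: how many items the filter passes through
    hit_rate: fraction of passed items the user actually values
    value_per_hit: arbitrary utility units per valued item
    seconds_per_item: time cost of glancing at each item
    """
    minutes_spent = items_per_minute * seconds_per_item / 60.0
    value_gained = items_per_minute * hit_rate * value_per_hit
    return value_gained / minutes_spent if minutes_spent else 0.0

# an unfiltered firehose vs a good meta-source (numbers invented)
raw = filter_value_per_minute(items_per_minute=4000, hit_rate=0.001,
                              value_per_hit=1.0, seconds_per_item=2)
meta = filter_value_per_minute(items_per_minute=10, hit_rate=0.5,
                               value_per_hit=1.0, seconds_per_item=2)
```

a blogscore could then reward whoever's subscription-mix maximises this number per domain.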


# of rss subscriptions

# of flickr subscriptions
# of flickr-faves subscriptions

# of mp3-blog subscriptions

# of twitter-faves subscriptions

# of youtube-faves subscriptions

One-bit nanoblogs




flickr and twitter
support near-identical
onebit 'nanoblogs' of faves

subscribeable via rss

hackable via api

hacking sequence:

download own faves' ids (count them)
save in a database
sort by author
rank by authorcount

for each author
download itemcount (save in db)
re-rank by fave frequency

detect densest faved authors
not already subscribed-to

download ids of faves of densest author
(save in db)
compare to own faves

calculate degree of taste-overlap
(twitter is different from flickr
because older tweets are rarely faved
so overlap won't extend back
past the more recent of 2 join-dates)

gradually extend database of others' faves

try to detect
still-unseen items
frequently faved by similar favers
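the sequence above, sketched in python with the API downloads and the database stubbed out as plain lists and sets (all data here is invented):

```python
from collections import Counter

def rank_authors(own_faves):
    """rank the authors you fave most densely (the first steps above)

    own_faves: list of (item_id, author) pairs, as a real script
    would download them via the flickr/twitter API and save in a db
    """
    return Counter(author for _item, author in own_faves).most_common()

def taste_overlap(own_faves, their_faves):
    """fraction of another user's faves that you've faved too"""
    own_ids = {item for item, _author in own_faves}
    their_ids = {item for item, _author in their_faves}
    shared = own_ids & their_ids
    return len(shared) / len(their_ids) if their_ids else 0.0

# toy data standing in for downloaded fave-streams
mine = [(1, "ana"), (2, "ana"), (3, "bob"), (4, "ana"), (5, "cyn")]
theirs = [(2, "ana"), (4, "ana"), (9, "dee")]
```

extending the database of others' faves then just means re-running taste_overlap against each densely-faved author not already subscribed-to.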






.

Machine-optimised 'explore'




each time we rate
any item of content
our software ought to adjust
its predictions about
other content's ratings

unseen content
by the same author
should be rated similarly

other raters
who rated that content similarly
should be rated similar
(ditto for un-similar)

unseen content
rated highly by
these similar raters
should be 'rated up'
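a minimal sketch of that update rule, assuming 1-to-5 numeric ratings and similarity measured as agreement on shared items (not any site's real recommender):

```python
def predict_unseen(my_ratings, others_ratings):
    """predict my rating of unseen items from raters similar to me

    my_ratings: dict of item -> rating (1..5)
    others_ratings: list of such dicts, one per other rater
    """
    predictions = {}
    for other in others_ratings:
        shared = my_ratings.keys() & other.keys()
        if not shared:
            continue
        disagreement = sum(abs(my_ratings[i] - other[i]) for i in shared) / len(shared)
        similarity = max(0.0, 1 - disagreement / 4)  # ratings span 4 points
        for item, rating in other.items():
            if item not in my_ratings:
                total, weight = predictions.get(item, (0.0, 0.0))
                # weight each rating by the rater's similarity to me
                predictions[item] = (total + similarity * rating, weight + similarity)
    return {item: s / w for item, (s, w) in predictions.items() if w}

me = {"a": 5, "b": 1}
others = [{"a": 5, "b": 1, "c": 4},   # agrees with me exactly
          {"a": 1, "b": 5, "c": 1}]   # my opposite, so ignored
```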




every moment is a choicepoint
(which next?)
(which to queue for later?)

nextbestunseen by sameauthor

bestunseen by similarraters

bestunseen overall

newifany by knownfaveauthors

newifany by knownfaveraters

faves of newauthor/rater

subscribe to authorstream/raterfaves







.

Web2.0 database theory




at the heart of the classic web2.0 site
is a database of
users U
times
content-items C
U*C

each user u
expects to rate
each content-item c

and these ratings
guide
what sequence of content-items s(C)
is offered to any user
(aka recommendations)

eg flickr
tracks faves via pink stars
and creates from these
a single stream (explore)
offered to all users
but also allows subscriptions
to others' newly-added pix or faves
(not trimming redundancies)
and to tags and thematic groups
(as designated by the photographer)

or eg twitter
tracks faves via gold stars
but leaves it to Favrd
to collate these into a stream
also allowing subscriptions
to others' newly-added tweets or faves
(ignoring redundancies)
and to topical hashtags

or eg youtube
supports faving and rating
and makes recommendations
(though i doubt many people
bother with these?)




if the fundamental web2.0 goal is
helping users find the best content-items
the most important omission
from these sites is
redundancy-trimming:

once you've seen/judged an item
you won't need to see it again
so it can be silently trimmed
from offered streams

(one exception being
streams of favorites
from users you're interested to know
so you want to be told
if they-liked-that-too)
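redundancy-trimming is basically a persistent seen-set, with the favorites-stream exception flagged instead of hidden; a sketch:

```python
def trim_stream(stream, seen, fave_stream=False):
    """drop already-seen items from an offered stream

    for a friend's favorites stream we keep the overlap instead,
    flagging it rather than silently trimming it
    """
    out = []
    for item in stream:
        if item not in seen:
            out.append(item)
            seen.add(item)  # the seen-set should persist forever
        elif fave_stream:
            out.append(f"{item} (they liked that too)")
    return out

fresh = trim_stream(["pic1", "pic2"], {"pic1"})  # -> ["pic2"]
```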




another useful database is C*C similarity

people who like c1
usually like c2 too

possibly decomposable into
thematic dimensions T:
easy-difficult
natural-artificial
common-rare
emotional-dry
sexual-chaste






.

Filtermath 2




web2.0 sites' target
should be to offer each visitor
a stream with all interesting content
(most interesting first)

they need to track (forever) what content
each visitor has already seen

they need to model visitor tastes/preferences
to anticipate content-ratings
matching visitors with similar tastes

discovering the dimensions of taste
identifying content-prototypes
that can quickly map visitor preferences

ideally recognising five ranks:
A B C D F
where Fs offend you
Ds turn you off
As you want to share

with an 'emulate' mode
when you find a visitor with similar tastes
that lets you zip thru their As and Bs
inheriting their ratings as default
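'emulate' mode is just inheriting another visitor's grades as defaults and overriding where you differ; a sketch (grades and item names invented):

```python
def emulate(their_grades, my_overrides):
    """inherit a similar visitor's A-F ratings as defaults"""
    merged = dict(their_grades)
    merged.update(my_overrides)
    return merged

def zip_through(their_grades):
    """the stream to zip thru: their As and Bs, best first"""
    return sorted((i for i, g in their_grades.items() if g in "AB"),
                  key=lambda i: their_grades[i])

def share_queue(grades):
    """items graded A are the ones you want to share"""
    return sorted(item for item, g in grades.items() if g == "A")
```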





.

Filtermath




i think my complaint
about blogging's dark age
can be boiled down to a handful
of formulae and statistics

eg for flickr (simplest case)
c4000 new pix per minute
maybe 0.1% of which (4/min) i'd fave
if i had a way to find them

the full database of a billion pix
waiting to be filtered
a checklist that should always remember
which you've already viewed

i subscribe to about 250 photographers' uploads
and 50 more users' favorites
plus a single tag ('nude')
all via rss, which lets me quickly scan
mediumsized copies
of hundreds of pix per day

maybe 10% of which i fave
creating a veryhighquality stream
that should be subscribed by thousands
(current subscriptions unknown/zero?)

but missing at least 1000 potential faves/day
because i can't find them

simple strategy:
ruby/python/perl script
builds mirror of personal faves
and for each of these faves
the list of all flickrers who've faved it

and then calculates which flickrers
have faves that overlap mine most
(without excessive burden of
boring/offensive faves)
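that script might start like this, with the API mirror stubbed as dicts; the discount for huge undiscriminating fave-sets stands in for the missing winnow-rate stat (all data invented):

```python
def rank_flickrers(my_faves, their_faves):
    """score flickrers by fave-overlap with me

    my_faves: set of pic ids i've faved
    their_faves: flickrer -> set of pic ids they've faved
    (both would be mirrored locally via the flickr API)
    """
    scores = []
    for flickrer, faves in their_faves.items():
        shared = len(my_faves & faves)
        if shared:
            # overlap, discounted by the burden of wading thru the rest
            scores.append((flickrer, shared / len(faves)))
    return sorted(scores, key=lambda kv: -kv[1])

mine = {"p1", "p2", "p3", "p4"}
theirs = {"ana": {"p1", "p2", "p3"},                      # small, dense overlap
          "bob": set(f"p{i}" for i in range(100))}        # indiscriminate faver
```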

(flickr needs a stat
hinting how many pix each favoriter winnows
on average)

when you discover promising new fave-sets
you need to queue/ration
the process of sifting them
(the best ones might better
be added en masse
with exceptions deleted)

flickr should recommend photographers and faves
based on your known preferences

should always offer a stream
of likeliest-to-fave's
it knows you haven't seen

algorithmically tweaking its choices
as you surf






.

Blogging's new dark age




i'm feelin like it's
temporarily
a new dark age for blogging

millions of tiny twinkling lights
of valuable content that aren't getting
sorted and focused and concentrated
(which is what blogging is
or should be
all about)

the top issues are too difficult
(credit default swaps, jewish exceptionalism)

the content is too plentiful
(flickr, youtube, blogger, twitter)

the technology is a full tech-generation behind
(favoriting + delicious + tumblr = ???)

the cultural leaders are
for the moment
inarticulate
and ethically compromised
(obama's a washout)

pop music is shapeless and forgettable

games are variations on mediocre themes




i try swimming faster and faster
with less and less to show for it




i want to keep up with
innovations in games
without kotaku's 40 posts/day

with movie/celeb gossip
about my faves
without a trillion realityteevee nonentities

i want easy identification
of a community of like-minded favoriters
on flickr/twitter/youtube/etc
with their votes collated to produce
a custom ranking of recent content
as shallow or deep as i have time for

i want to be able to pass on my favorites
with a minimal twitch
(not my current
blogger + delicious + twitter + tumblr)




i want to feel, again
that we're building something together

that our priorities are obvious
and shared

that tech is unifying and simplifying
rather than the opposite






.



when i started my blog in 1997
keeping up with content, worldwide
wasn't yet a fulltime job




more or less everybody is
more or less potentially a
more or less fulltime
photographerphilosopherdiaristbloggercritic

generating barely-countable content-items
on mostly predictable hostsites

and trying, more or less
to discover and share
the best content created by others
commenting, critiquing and responding

without stooping to popular-mediocrity contests

struggling against ever-worsening odds




the browser for this future
needs to track all content authors
across all sites

remembering what's been seen
what's been recommended

streamlining the interface mechanics
so that a bottomless best-first queue
offers a customised range of content categories
with duplicates filtered/combined

new content from favorite authors
new recommendations from favorite critics
unplumbed favorites' archives

(inbox zero is passé)





.

Calculating interface 'friction'




the juxtaposition of
john gruber's concept of interface friction
with david li's copula formula

has refocused an old intuition
about the science of human factors:

that someday we should have a formula
for rating the 'friction' in an interface

  • how many clicks
    for each important action

  • how much mental strain
    to refocus the brain
    (like shifting contexts)

  • how much unintuitive complexity
    to master

  • how much opportunity
    to be publicly humiliated



web 2.0 has made considerable progress
on the first two of these

but achieving sufficient mastery
to avoid self-embarrassment
is a gigantic, growing problem





.

Understanding Twitter

reddit doesn't understand twitter!?!
(i'm surprised)

  • it's the algonquin roundtable redux:
    a universal competition
    to say memorable things
    in 140 characters or less

    the more you're Favorite-d, the better you're doing

    if you're not finding
    a dozen tweets a day
    worth Favoriting...
    you're doing it wrong

    look at others' Favorites
    to find the wits you like best

    and doublecheck their own Favorites
    so you can subscribe (by rss)
    to those who'll broaden your scope

  • bookmark twitter searches

    watch for interesting hashtags
    or mentions of interesting people

    (these are also rss-able)

    get comfortable with searching on @-names
    to check others' replies
    before replying yourself

  • keep reloading hashtag-searches
    during interesting events
    (#oscars #sotu #superbowl #ted)
    or favorite teevee shows (#house)
    to sample an infinite range of reactions
    and occasional insights

    temporarily follow
    the best of these

  • add friends and family
    (and celebs you wish were f&f)
    not expecting algonquin wit
    but grateful for random
    keeping-in-touch tweets

  • ask/answer questions of the #lazyweb

    monitor memes

    announce time-critical tidbits



twitter is like usenet
where you subscribe to
people not topics

fastfastfast

spamfree by design

intolerant of prolixity

easily capturing stray thoughts
and trying them out
on a mostly-forgiving audience





.

Twitter's two cultures




as of 6:45am on 18 Feb
my web3.0 tweet of 9pm last night

has been retweeted 50 times
mostly by ClayShirky's followers

but Favrd only once

dramatising the cultural divide
between the 'Retweeters'
whose Favorites folders are mostly empty

and the 'Favrders'
who add a dozen new Favorites a day
but frown on retweeting
(because it doesn't raise their Favrd score)

(Favrd currently reposts about 300 posts per day
based on 'gold stars'
from their registered users' Twitter accounts)

the uncrowned king of the Favrders
was MerlinMann/hotdogsladies
who shockingly retired from Twitter in December
because it was taking too much time

addendum 19Feb: Favotter shows 4 Favorite-ers

addendum 22Feb: [debate]






.

Web 2020

what i'd like/expect to see:

  • html replaced by simpler e-markup like _italics_

  • links remember their chain of 'via's

  • commenting is globally unified-- you don't have to remember where you've left comments

  • article/page/post/bookmark distinction erased, just views of data

  • multiple links to same content automatically collated

  • automatic comparison of tastes/favorites to recommend likely similar author/bloggers

.

Web3.0 = favorites-matching




there's a simple shared solution
to the problem of
optimising one's datastream

(best pix best vids best tunes
best science best news best gossip
etc)

a web app needs to
monitor your preferences
and match them to the closest
other people's prefs
in each domain

this is not at all like
aggregating scores for the 'best' posts
which results in mediocrity rising

instead it's like creating
networks of individuals
with partly-shared tastes
tuned on a person-by-person basis

so twitter needs to compare
everyone's favorites
and tell me whose tastes
are most similar to mine
so i can subscribe (or not)
to their favorites feeds

and ditto for flickr
and imeem and youtube
and googlereader
and slashdotdiggredditmefi
etc etc etc






.

Twitter suggestions

twitter currently uses decimal numbering
of individual tweets:
http://twitter.com/robotwisdom/status/1139780124
(10 digits = one billion+ sold)

but by switching to alphanumeric numbering
(like TinyURL.com's)
a billion tweets would require only
6 characters ((26+26+10)^6 = 56,800,235,584)

so links to specific tweets
could be embedded concisely:
eg "//1Qa4rF"
unpacked by tweet-readers to (eg)
http://twitter.com/posts/1Qa4rF
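the arithmetic checks out: 62^6 = 56,800,235,584. a base-62 codec like TinyURL's might look like this (the alphabet ordering is my assumption):

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def encode62(n):
    """pack a decimal tweet id into a short base-62 string"""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))

def decode62(s):
    """unpack a base-62 string back to the decimal id"""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

a tweet-reader would spot the "//" prefix, decode62 the six characters, and expand to the full status url.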

while your reader could/should also unpack
mentions of '@username' into two links
one (as now, the username)
a link to that user's aggregated tweets

but the other (presumably just the '@'?)
to a search for all tweets mentioning '@username'
(mostly replies)

and a similar @-search should be offered
(somewhere in each tweet)
for replies to the tweeter herself

nb four compact links:


cshirky: @robotwisdom No, I don't have an active blog. I seem to have two native formats at the moment -- 140 chars and 250,000 chars. :( 7:11 PM Jan 5th from TweetDeck @







.

Twitter vs del.icio.us




wouldn't del.icio.us
have 90% of twitter's functionality
if they just exposed (as URLs)
their internal IDs
for individual posts
(so people could link eg
my specific comments
on any given bookmark)

and also then allowed
posts with no associated bookmark/links
(so i could post about nothing)?





.
 