
06 October 2008

Webservice metadata

flickr: pix, tags, sets, groups, comments, rss (small), flash-slideshow (w/#), faves (no sort, no tags, no sets, no ratings), faves-rss, faves-slideshow (w/#); rss/slideshows for artist w/tag, for set, for group; no aggregate feeds; privacy settings?; 3rdparty (flickriver)

twitter: tweets, subscriptions (no privacy re who), one aggregate feed per user, faves (no sort, no tags, no ratings)

imeem:

last.fm:

postsecret:

overheard:

blogger:

tumblr:

28 November 2007

Autobiography 2.0: a 'person' microformat experiment on del.icio.us

the semantic web movement is 8 years old
and has delivered little more than
a microformat for business cards [Wiki]

and nothing to rank with
roget or lenat

so i'd like to propose
using del.icio.us for prototyping
much more ambitious
microformat-like linksets

starting eg with
prototype person-profiles
[Jorn Barger]

replacing pagetitles with microformat fieldnames

using Wikipedia placeholders
if the fieldvalue isn't already a url

using del.icio.us tags
to sort fieldtypes

other possible prototypes:
website-profile microformat
webpage-profile microformat
blog-profile microformat
video-profile microformat
song-profile microformat
porno-profile microformat

(each of these should be
summarizable
via heraldic barcodes)

25 June 2007

Action hypertext and the beehive paradigm

while compiling my neologisms list
i occasionally came across ideas
that had completely faded from my thinking

one of which
'action hypertext'
i'm happy to resuscitate

the w3c paradigm for the semantic web
is approximately beehive-like
every webpage a hierarchy of containers
to be filled in by worker-drones
presumed to know in advance
exactly what they want to say
and how it fits into the rest of the web/site

but action hypertext is the opposite of this
each author an indeterminate jackson-pollock splatterer

each new webpage starting from scratch
with no clear idea where it's coming from
or where it's going
rather
freeform blocks of text
with freeform styles of separation between them

textblocks accumulating
more or less intentionally
in the course of never-finished web exploration

visits to un/familiar sites
search patterns un/successful
articles read skimmed skipped bookmarked
links saved shared discussed updated

so each of these web-actions
can be seen as an item in an ongoing stream
that should be archived
and re-traceable

and all the usual choices
for dealing with the results of these web-actions:

forget it, shred it
save it, share it, debate it
add it to your to-read list
subscribe to its feed
set a tickler-alarm for it
announce it to the world
link it from related pages of your own
etc etc etc

all these usual choices
should be offered in a standard
dispose-palette
that action-authors habitually consider
when dismissing any action-result

03 April 2007

Discourse as lists

i never cared for
sgml/xml/xhtml's insistence on
text as strict hierarchies
of containers

especially when
entirely notional
paragraphs
are granted positions of
(undeserved) honor
in those hierarchies

but lately i've been pondering
the ordering principles
(alphabetical chronological best-first etc)
by which we sequence
textbites

broadly generalising the narrow concept
of list
until suddenly the prospect
of text
discourse
as hierarchies of
lists within lists

has become attractive

formatting styles, of course
are vastly irrelevant here
(a list can be just
two words
unpunctuated)

the emphasis should instead be on
ordering principles
and set-membership principles

lists offering choices
lists exhausting domains
lists sampling subsets
lists specifying procedural steps
lists structuring arguments

lists arbitrarily
prefixing and postfixing
auxiliary text materials
like headers and footers
and sidebars

22 March 2007

Web3.0: Ego abhors a vacuum

the challenge of meditation
is not to think

the mind normally filled
with neurotic habits

and the challenge of artistic creativity
is to subvert cliche
(vs nabokov's gap-fillers)

and the scandal of philosophy
is the sophomoric sophistries
like free will or epistemological solipsism
tying innocent minds in useless knots

and usenet newsgroups often used to
suffer from negative intelligence
flamewars displacing constructive discussion
like coinage under gresham's law

or kneejerk techno-utopianism
urging (eg) educators
to squander their budgets on digital multimedia
when the real problem as always
is rewarding the best teachers

so the semantic web movement
from its earliest days
jumped the track
seduced by the shiny
techno quick fix
of xml
and its pie-in-the-sky promises
about embedded metadata

delivering only a flatline
adoption curve
while (eg) wikipedia
keeps rocketing up exponentially

and while journalistic coverage
of 'web 3.0' increases
the hype sounds more and more
hollow and anemic
rdf topicmaps microformats blahblahblah

endless jargonmania
but no 'there' there
never jam today

so i'll suggest again
that metadata belongs in the file header
(not in embedded markup)

and that wikipedia article titles
have solved one semantic problem
by automatically redirecting visitors
when an article title has changed
(i wish blogger did this)

so webpages can begin painlessly
categorizing themselves in their headers
with the closest wikipedia article title/s

and google should index these
with a special keyword
eg "wikiarticle:Ajax_(programming)"

but even more
we can categorise filetypes
"filetype:timeline"

and even more, i now propose
an ontology wiki
eg "ontoclass:business/merger"

that encourages experiment
and redirects changed classnames automatically
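
(a minimal python sketch of how an indexer might read such header-only metadata, using the standard library's html.parser. the meta names 'wikiarticle', 'filetype' and 'ontoclass' are assumptions taken from the keywords proposed above, not an existing standard, and the sample page and the class name HeaderMetadata are invented for illustration)

```python
from html.parser import HTMLParser

# assumed key names, borrowed from the keywords proposed in the post
PROPOSED_KEYS = {"wikiarticle", "filetype", "ontoclass"}


class HeaderMetadata(HTMLParser):
    """Collect the proposed keys from <meta> tags inside <head> only."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            a = dict(attrs)
            if a.get("name") in PROPOSED_KEYS:
                self.found.setdefault(a["name"], []).append(a.get("content", ""))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


# invented sample page; the values reuse the post's own examples
page = """<html><head>
<title>ajax timeline</title>
<meta name="wikiarticle" content="Ajax_(programming)">
<meta name="filetype" content="timeline">
<meta name="ontoclass" content="business/merger">
</head><body><p>no embedded markup needed here</p></body></html>"""

parser = HeaderMetadata()
parser.feed(page)

# search-engine style keywords, one per header entry
index_terms = [f"{key}:{value}" for key, values in parser.found.items() for value in values]
```

(the body of the page is never inspected, which is the point: everything the indexer needs sits in the header)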

and that uses storycycles
as the macroformat

allowing description of
any random news story
any random wikipedia article
any random web resource

by sketching a universal
usual story

of everyperson (eg their wikipedia biographies)
of everyproduct (eg their history and web resources)
of everyprogram (eg the software lifecycle)
of everyenterprise (eg their corporate history)
of everyfile (its creation and dissemination)
of everyword (its etymology)
of everyidea (its genesis and evolution)
of everyspecies (its selection pressures)

so that the metadata
for any given object
can emphasize precisely
those 'slots' in the usual storycycle
where the object's particular history
deviates from the usual expectations
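
(one way to picture this, as a rough python sketch: treat the usual story as a table of default slots and store only the differences. the slot names, the 'usual' values and the sample program are all hypothetical, made up for illustration)

```python
# hypothetical 'usual' software storycycle: slot -> usual expectation
USUAL_SOFTWARE_STORY = {
    "origin": "written to scratch the author's own itch",
    "release": "announced as version 1.0",
    "growth": "features added on user request",
    "decline": "superseded by a newer competitor",
}


def deviations(usual, actual):
    """Keep only the slots where the object's history departs from the usual story."""
    return {slot: value for slot, value in actual.items() if usual.get(slot) != value}


# an invented program whose history is ordinary except for one slot
some_program = dict(USUAL_SOFTWARE_STORY, release="never formally released")

metadata = deviations(USUAL_SOFTWARE_STORY, some_program)
```

(here the metadata shrinks to the single 'release' slot, and everything else is implied by the shared storycycle)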

these storycycles being compiled
incrementally, wiki-fashion
by storytellers who understand
the usual range of details
(a storycycle storycycle)

with an ongoing appreciation of the impermanence
of particular wording
or particular orderings

software storycycle
biography storycycle
business storycycle
hardware storycycle

04 January 2007

XML is not for documents

i've been reading
goldfarb's xml handbook
(4th edition, 2002)

and i'm appalled to see
that even goldfarb
makes device-independent display
a cornerstone of his rationalisations

which i still find
disingenuously inside-out
so i'll retell the xml-story here
outside out, imho:

every computer user
maintains dozens of databases
(buddy lists, email archives, etc)
and sometimes chooses to share
some of that data
with others

or may want to merge
others' data
into their own

databases traditionally demand
rectangular arrays of labeled rows and columns

and the traditional way to share data
was to declare in advance
(as metadata)
how many rows and how many columns
the message contains
and what each row and column means

and then send all the cells in sequence
separated by tabs or commas

xml instead
uses labeled tags as separators
eliminating the requirement for rectangular arrays
(a 'row' can contain
any number of cells
of any type)

and permitting some of the required metadata
to be included in the message itself

thus inching towards
full automation
of data sharing
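
(a minimal python sketch of the contrast above, standard library only. the buddy-list data, the tag names 'buddies', 'buddy', 'name', 'email', 'birthday' and the function read_buddies are all invented for illustration)

```python
import csv
import io
import xml.etree.ElementTree as ET

# traditional sharing: column names declared up front, then cells in sequence
tsv_text = "name\temail\nalice\talice@example.org\nbob\tbob@example.org\n"
rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))

# xml sharing: every cell carries its own label, so a 'row' may be ragged
xml_text = """
<buddies>
  <buddy><name>alice</name><email>alice@example.org</email></buddy>
  <buddy><name>bob</name><email>bob@example.org</email><email>bob@example.net</email><birthday>1970-01-01</birthday></buddy>
</buddies>
"""


def read_buddies(text):
    """Return one list of (label, value) pairs per buddy element."""
    root = ET.fromstring(text)
    return [[(cell.tag, cell.text) for cell in buddy] for buddy in root]


buddies = read_buddies(xml_text)
```

(the tab-separated version cannot hold bob's second email or his birthday without redeclaring the columns for everyone. the xml version just carries the extra labeled cells)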

since human-to-human messages
are conventionally
prose documents (not database dumps)
yet often contain
the identical text-strings
we'd want in the databases
(calendar events, book citations)

the possibility was mooted
of embedding tags
within ordinary prose formatted as xml
so that comparably automatic merging
of these prose substrings
into the relevant databases
might be possible
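
(a small python sketch of that idea, using xml.etree.ElementTree. the message, the tag names 'note', 'event' and 'book' and their attributes are hypothetical, not from any real vocabulary)

```python
import xml.etree.ElementTree as ET

# invented message: ordinary prose with two tagged substrings
message = (
    "<note>let's meet for "
    "<event date='2007-01-12' time='12:30'>lunch at the diner</event>"
    " and i'll bring <book author='goldfarb'>the xml handbook</book>.</note>"
)

root = ET.fromstring(message)

# what the human reads: the prose with all tags ignored
plain_text = "".join(root.itertext())

# what the machine merges: one record per tagged substring
calendar = [dict(e.attrib, text=e.text) for e in root.iter("event")]
citations = [dict(b.attrib, text=b.text) for b in root.iter("book")]
```

(the same string serves both audiences, but only because the author took the trouble to add the tags)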

in effect
document authors would now be addressing
not just humans
but also their machines
and tags would be added
(and phrasings tweaked)
so the machines could understand too

(if AI were far enough advanced
this would be unnecessary
because the machines could equally understand
the raw, untagged prose)

now some prose substrings
like titles and subheadings
will have characteristic formatting
(larger/smaller, bold, italic)
chosen by the author

and the designers of xml
(and before xml, goldfarb's 1969-1986 sgml)
suggested that every such formatting change
should be signalled with an xml tag
even if their likelihood
of ever being useful in data exchange
was doubtful
(simple italics, paragraph breaks)

somehow, horribly
this
gamble
was elevated to the status of revelation
(call it goldfarb's conjecture)

and the insupportable additional claim was made
that 'cleansing' documents
of all formatting markup
and replacing it with 'structural' tags
would be a significant advance
towards an entirely different goal:
creating a single master file
that can be efficiently viewed
on any random device
(big monitor, small monitor,
printer, speech-synthesizer)

with the reductio-ad-absurdum
that 'EM' and 'STRONG'
were better than 'I' or 'B'
(for italic and bold)
because speech synthesizers would find
'I' more baffling than 'EM'

and taken now to the absurder extreme
that before
any
device
can render an xml document
it has to refer to a
custom-built stylesheet
that translates the xml tags
back into comprehensible styling instructions

thus making more work
both for the machine
and for the author

if the goal is device-independence
what's needed instead
is a vast shared 'namespace'
of document structures
each with a pre-agreed default rendering
on every class of device

that authors can override
(if they care to be bothered)
but that default to
best-possible-under-the-circs
styling

these structures to include
plain old styles like italic
for unregenerate reactionaries (like me)

and the fascinating web-page structures
xmlers have never gotten around to
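
(a rough python sketch of such a shared namespace, assuming it could be modelled as a lookup table. the structure names, device classes and rendering descriptions are invented placeholders, and no such registry exists as far as this sketch knows)

```python
# hypothetical shared registry: structure name -> default rendering per device class
DEFAULT_RENDERINGS = {
    "italic": {
        "screen": "italic type",
        "printer": "italic type",
        "speech": "slight stress",
    },
    "title": {
        "screen": "large bold type",
        "printer": "large bold type",
        "speech": "pause, then slower voice",
    },
}


def rendering(structure, device, overrides=None):
    """Pick the author's override if present, else the shared default."""
    overrides = overrides or {}
    if (structure, device) in overrides:
        return overrides[(structure, device)]
    return DEFAULT_RENDERINGS[structure][device]


# an author who bothers to override exactly one case
author_overrides = {("title", "screen"): "small caps"}
```

(an author who supplies no overrides still gets a sensible rendering everywhere, with no custom stylesheet to write)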

but the ideal of mixing these
'style structures'
with semantic tags
seems unpromising to me

(if you mention the same person
ten times in an article
do you embed that person's metadata
ten separate times?)

i'd rather see
machine-readable footnotes
than monstrous human-machine hybrids
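
(a toy python sketch of what a machine-readable footnote could look like, under the assumption that it is just a record kept outside the prose. the sample text, the field names and the function covered_mentions are all made up for illustration)

```python
# untagged prose, exactly as the human reader sees it
article = "joyce wrote ulysses. joyce revised it for years. readers still argue about joyce."

# hypothetical footnote block: metadata stated once per entity, outside the prose
footnotes = [
    {"mention": "joyce", "type": "person", "wikiarticle": "James_Joyce"},
    {"mention": "ulysses", "type": "book", "wikiarticle": "Ulysses_(novel)"},
]


def covered_mentions(text, notes):
    """Count how many prose mentions each single footnote accounts for."""
    return {note["wikiarticle"]: text.count(note["mention"]) for note in notes}


coverage = covered_mentions(article, footnotes)
```

(one footnote covers every mention of the person, however many times the name recurs, and the prose itself stays untouched)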

xml was never a good fit for documents

Robot Wisdom auxiliary