
February 22, 2023

Section 230: The Internet, Categorization, and Rorty

Topic: The Supreme Court is hearing a case about whether section 230 exempts Google from responsibility for what it algorithmically recommends.

A thought from one of my favorite philosophers, Richard Rorty:

Photo of Richard Rorty

Rortiana, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0,
via Wikimedia Commons

“…revolutionary achievements in the arts, in the sciences, and the moral and political thought typically occur when somebody realizes that two or more of our vocabularies are interfering with each other, and proceeds to invent a new vocabulary to replace both… The gradual trial and error creation of a new, third, vocabulary… is not a discovery about how old vocabularies fit together… Such creations are not the result of successfully fitting together pieces of a puzzle. They are not discoveries of a reality behind the appearances, of an undistorted view of the whole picture with which to replace myopic views of its parts. The proper analogy is with the invention of new tools to take the place of old tools.” (Richard Rorty, Contingency, Irony, and Solidarity, 1989, p. 12)

We’ve had to invent a new vocabulary to talk about these things: Is a website like a place or a magazine?  Are text messages like phone calls or telegrams? Are personal bloggers journalists [a question from 2004]?

The issue isn’t which Procrustean bed we want to force these new entries into, but what sort of world we want to live in with them. It’s about values, not principles or definitions. IMUOETC (“In my uncertain opinion expressed too confidently”)

A person being stretched to fit into Procrustes' bed. Tall people had their legs shortened by axe to get them to fit.

Procrustes’ bed. And this is better than being too tall for it :(

A point all of this misses: “Yo, we’re talking about law here, which by its nature requires us to bring entities and events under established categories.”


Categories: cluetrain, culture, everyday chaos, everythingIsMiscellaneous, internet, law, philosophy Tagged with: 230 • digital culture • internet • law • media • social media • supreme court Date: February 22nd, 2023 dw


January 31, 2022

Meaning at the joints

Notes for a post:

Plato said (Phaedrus, 265e) that we should “carve nature at its joints,” which assumes of course that nature has joints, i.e., that it comes divided in natural and (for the Greeks) rational ways. (“Rational” here means something like in ways that we can discover, and that divide up the things neatly, without overlap.)

For Aristotle, at least in the natural world those joints consist of the categories that make a thing what it is, and that make things knowable as those things.

To know a thing was to see how it’s different from other things, particularly (as per Aristotle) from other things that it shares important similarities with: humans are the rational animals because we share essential properties with other animals, but are different from them in our rationality.

The overall order of the universe was knowable and formed a hierarchy (e.g. beings -> animals -> vertebrates -> upright -> rational) that makes the differences essential. It’s also quite efficient since anything clustered under a concept, no matter how many levels down, inherits the properties of the higher level concepts.
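The efficiency of that kind of hierarchy can be sketched in a few lines. This is only a toy illustration: the chain of concepts comes from the example above, but the property strings and the names TAXONOMY and inherited_properties are made up for the sketch.

```python
# Hypothetical toy taxonomy: each concept names its parent and the
# properties it adds. The property strings are illustrative only.
TAXONOMY = {
    "being": (None, {"exists"}),
    "animal": ("being", {"moves itself"}),
    "vertebrate": ("animal", {"has a backbone"}),
    "upright": ("vertebrate", {"walks upright"}),
    "rational": ("upright", {"reasons"}),
}


def inherited_properties(concept):
    """Collect a concept's own properties plus those of every ancestor."""
    properties = set()
    while concept is not None:
        parent, own = TAXONOMY[concept]
        properties |= own
        concept = parent
    return properties
```

Each property is stated once, at the level where it first applies, and everything further down the chain picks it up by walking toward the root.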

We no longer believe that there is a perfect, economical order of things. We want to be able to categorize under many categories, to draw as many similarities and differences as we need for our current project. We see this in our general preference for search over browsing through hierarchies, the continued use of tags as a way of cutting across categories, and in the rise of knowledge graphs and high-dimensional language models that connect everything every way they can even if the connections are very weak.

Why do we care about weak connections? 1. Because they are still connections. 2. The Internet’s economy of abundance has disinclined us to throw out any information. 3. Our new technologies (esp. machine learning) can make hay (and sometimes errors) out of rich combinations of connections including those that are weak.

If Plato believed that to understand the world we need to divide it properly — carve it at its joints — knowledge graphs and machine learning assume that knowledge consists of joining things as many different ways as we can.


Categories: abundance, big data, everyday chaos, everythingIsMiscellaneous, machine learning, philosophy, taxonomy, too big to know Tagged with: ai • categories • everythingIsMiscellaneous • machine learning • meaning • miscellaneous • philosophy • taxonomies Date: January 31st, 2022 dw

3 Comments »

November 18, 2019

The inefficiency of order. The efficiency (and beauty) of chaos

In “The Efficiency-Destroying Magic of Tidying Up”, Florent Crivello (twitter: @Altimor) has written a clear, convincing, and compact critique of the assumption that efficient organization is tidy, neat, and rectilinear. “[E]fficiency tends to look messy, and good looks tend to be inefficient.”

This is because complex systems — like laws, cities, or corporate processes — are the products of a thousand factors, each pulling in a different direction. And even if each factor is tidy taken separately, things quickly get messy when they all merge together

He applies this to management organization, the tool sets used by collaborators, city planning, science fiction visions of the future, parenting, and pizzas.

Not to mention that in one brief, beautiful essay he unites the themes of two of my books: Everything Is Miscellaneous and my new Everyday Chaos.


Categories: everyday chaos, everythingIsMiscellaneous, machine learning Tagged with: chaos • everydaychaos • everythingismisc • neatness Date: November 18th, 2019 dw


August 8, 2017

Messy meaning

Steve Thomas [twitter: @stevelibrarian] of the Circulating Ideas podcast interviews me about the messiness of meaning, library innovation, and educating against fake news.

You can listen to it here.


Categories: dpla, everythingIsMiscellaneous, libraries, philosophy Tagged with: 2b2k • everythingismisc • libraries • podcasts Date: August 8th, 2017 dw


August 14, 2016

Coinstar's list of unacceptable items seems to have been written by Tim Burton

Coinstar makes vending machines into which you drop coins and from which you get bills or gift cards. Its list of unacceptable items is quite odd, presumably intentionally.

unacceptable items

I’d think that this is based on things people have actually tried to shove into Coinstar slots, except I don’t see “fishing line with gum at its end” or “your dick” on the list.

(Tip o’ the hat to my brother Andy who definitely was not trying to “redeem” 70,000 #6 steel washers.)


Categories: everythingIsMiscellaneous, humor Tagged with: eim • humor • lists Date: August 14th, 2016 dw

1 Comment »

October 7, 2015

[liveblog] The future of libraries

I’m at a Hubweek event called “Libraries: The Next Generation.” It’s a panel hosted by the Berkman Center with Dan Cohen, the executive director of the DPLA; Andromeda Yelton, a developer who has done work with libraries; and Jeffrey Schnapp of metaLab.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.


Sue Kriegsman of the Center introduces the session by explaining Berkman’s interest in libraries. “We have libraries lurking in every corner…which is fabulous.” Also, Berkman incubated the DPLA. And it has other projects underway.


Dan Cohen speaks first. He says if he were to give a State of the Union Address about libraries, he’d say: “They are as beloved as ever and stand at the center of communities” here and around the world. He cites a recent Pew survey about perspectives on libraries: libraries have the highest approval rating of all American institutions. But, he warns, that’s fragile. There are many pressures, and libraries are chronically under-funded, which is hard to understand given how beloved they are.


First among the pressures on libraries: the move from print. E-book adoption hasn’t stalled, although the purchase of e-books from the Big Five publishers compared to print has slowed. But Overdrive is lending lots of ebooks. Amazon has 65% of the ebook market, “a scary number,” Dan says. In the Pew survey a couple of weeks ago, 35% said that libraries ought to spend more on ebooks even at the expense of physical books. But 20% thought the opposite. That makes it hard to be the director of a public library.


If you look at the ebook market, there’s more reading going on at places like the DPLA. (He mentions the StackLife browser they use, that came out of the Harvard Library Innovation Lab that I used to co-direct.) Many of the ebooks are being provided straight to a platform (mainly Amazon) by the authors.


There are lots of jobs public libraries do that are unrelated to books. E.g., the Boston Public Library is heavily used by the homeless population.


The way forward? Dan stresses working together, collaboration. “DPLA is as much a social, collaborative project as it is a technical project.” It is run by a community that has gotten together to run a common platform.


And digital is important. We don’t want to leave it to Jeff Bezos who “wants to drop anything on you that you want, by drone, in an hour.”


Andromeda: She says she’s going to talk about “libraries beyond Thunderdome,” echoing a phrase from Sue Kriegsman’s opening comments. “My real concern is with the skills of the people surrounding our crashed Boeing.” Libraries need better skills to evaluate and build the software they need. She gives some examples of places where we see a tension between library values and code.


1. The tension between access and privacy. Physical books leave no traces. With ebooks the reading is generally tracked. Overdrive did a deal so that library patrons who access ebooks get notices from Amazon when their loan period is almost up. Adobe does rights management, with reports coming page by page about what people are reading. “Unencrypted over the Internet,” she adds. “You need a fair bit of technical knowledge to see that this is happening,” she says. “It doesn’t have to be this way.” “It’s the DRM and the technology that have these privacy issues built in.”


She points to the NYPL Library Simplified program that makes it far easier for non-techie users. It includes access to Project Gutenberg. Libraries have an incentive to build open architectures that support privacy. But they need the funding and the technical resources.


She cites the Library Freedom Project that teaches librarians about anti-surveillance technologies. They let library users browse the Internet through TOR, preventing (or at least greatly inhibiting) tracking. They set up the first library TOR node in New Hampshire. Homeland Security quickly suggested that they stop. But there was picketing against this, and the library turned it back on. “That makes me happy.”


2. Metadata. She has us do an image search for “beautiful woman” at Google. They’re basically all white. Metadata is sometimes political. She goes through the 200s of the Dewey Decimal system: 90% Christian. “This isn’t representative of human knowledge. It’s representative of what Melvil Dewey thought maps to human knowledge.” Libraries make certain viewpoints more computationally accessible than others. Our ability to write new apps is only as good as the metadata under them. “As we go on to a more computational library world, which is awesome, we’re going to fossilize all these old prejudices. That’s my fear.”


“My hope is that we’ll have the support, conviction and empathy to write software, and to demand software, that makes our libraries better, and more fair.”


Jeffrey: He says his peculiar interest is in how we use space to build libraries as architectures of knowledge. “Libraries are one of our most ancient institutions.” “Libraries have constantly undergone change,” from mausoleums, to cloisters, to warehouses, places of curatorial practice, and civic spaces. “The legacy of that history…has traces of all of those historical identities.” We’ve always faced the question “What is a library?” What are its services? How does it serve its customers? Architects and designers have responded to this, assuming a set of social needs, opportunities, fantasies, and the practices by which knowledge is created, refined, shared. “These are all abiding questions.”


Contemporary architects and designers are often excited by library projects because it crystallizes one of the most central questions of the day: “How do you weave together information and space?” We’re often not very good at that. The default for libraries has been: build a black box.


We have tended to associate libraries with collections. “If you ask what is a library?, the first answer you get is: a collection.” But libraries have also always been about the making of connections, i.e., how the collections are brought alive. E.g., the Alexandrian Library was a performance space. “What does this connection space look like today?” In his book with Matthew Battles, they argue that while we’ve thought of libraries as being a single institution, in fact today there are now many different types of libraries. E.g., the research library as an information space seems to be collapsing; the researchers don’t need reading rooms, etc. But civic libraries are expanding their physical practices.


We need to be talking about many different types of libraries, each with their own services and needs. The Library as an institution is on the wane. We need to proliferate and multiply the libraries to serve their communities and to take advantage of the new tools and services. “We need spaces for learning,” but the stack is just one model.

Discussion


Dan: Mike O’Malley says that our image of reading is in a salon with a glass of port, but in grad school we’re taught to read a book the way a sous chef guts a fish. A study says that of academic ebooks, 75% of scholars read less than 50 pages of them. [I may have gotten that slightly wrong. Sorry.] Assuming a proliferation of forms, what can we do to address them?


Jeffrey: The presuppositions about how we package knowledge are all up for grabs now. There’s a vast proliferation of channels. “And that’s a design opportunity.” How can we create audiences that would never have been part of the traditional distribution models? “I’m really excited about getting scholars and creative practitioners involved in short-form knowledge and the spectrum of ways you can intersect” the different ways we use these different forms. “That includes print.” There’s “an extraordinary explosion of innovation around print.”


Andromeda: “Reading is a shorthand. Library is really about transforming people and one another by providing access to information.” Reading is not the only way of doing this. E.g., in maker spaces people learn by using their hands. “How can you support reading as a mode of knowledge construction?” Ten years ago she toured Olin College library, which was just starting. The library had chairs and whiteboards on castors. “This is how engineers think”: they want to be able to configure a space on the fly, and have toys for fidgeting. E.g., her eight year old has to be standing and moving if she’s asked a hard question. “We need to think of reading as something broader than dealing with a text in front of you.”


Jeffrey: The DPLA has a location in the name: America. The French National Library wants to collect “the French Internet.” But what does that mean? The Net seems to be beyond locality. What role does place play?


Dan: From the beginning we’ve partnered with Europeana. We reused Europeana’s metadata standard, enabling us to share items. E.g., Europeana’s 100th anniversary of the Great War web site was able to seamlessly pull in content from the DPLA via our API, and from other countries. “The DPLA has materials in over 400 languages,” and actively partners with other international libraries.


Dan points to Amy Ryan (the DPLA chairperson, who is in the audience) and points to the construction of glass walls to see into the Boston Public Library. This increases “permeability.” When she was head of the BPL, she lowered the stacks on the second floor so now you can see across the entire floor. Permeability “is a very smart architecture” for both physical and digital spaces.


Jeff: Rendering visible a lot of the invisible stuff that libraries do is “super-rich,” assuming the privacy concerns are addressed.


Andromeda: Is there scope in the DPLA metadata for users to address the inevitable imbalances in the metadata?


Dan: We collect data from 1,600 different sources. We normalize the data, which is essential if you want to enable it for collaboration. Our Metadata Application Profile v. 4 adds a field for annotation. Because we’re only a dozen people, we haven’t created a crowd-sourcing tool, but all our data is CC0 (public domain) so anyone who wants to can create a tool for metadata enhancement. If people do enhance it, though, we’ll have to figure out if we import that data into the DPLA.


Jeffrey: The politics of metadata and taxonomy has a long history. The Enlightenment fantasy is for a universal metadata school. What does the future look like on this issue?


Andromeda: You can have extremely crowdsourced metadata, but then you’re subject to astroturfing and popularity boosting results for bad reasons. There isn’t a great solution except insofar as you provide frameworks for data that enable many points of view and actively solicit people to express themselves. But I don’t have a solution.


Dan: E.g., at DPLA there are lots of ways of entering dates. We don’t want to force a scheme down anyone’s throat. But the tension between crowdsourced and more professional curation is real. The Indianapolis Museum of Art allowed freeform tagging and compared the crowdsourced tags vs. professional. Crowdsourced: “sea” and “orange” were big, which curators generally don’t use.


Q&A


Q: People structure knowledge differently. My son has ADHD. Or Nepal, where I visited recently.


A: Dan: It’s great that the digital can be reformatted for devices but also for other cultural views. “That’s one of the miraculous things about the digital.” E.g., digital book shelves like StackLife can reorder themselves depending on the query.


Jeff: Yes, these differences can be profound. “Designing for that is a challenge but really exciting.”


Andromeda: This is why it’s so important to talk with lots of people and to enable them to collaborate.


me: Linked data seems to resolve some of these problems with metadata.


Dan: Linked Data provides a common reference for entities. Allows harmonizing data. The DPLA has a slot for such IDs (which are URIs). We’re getting there, but it’s not our immediate priority. [Blogger’s prerogative: Having many references for an item linked via “sameAs” relationships can help get past the prejudice that can manifest itself when there’s a single canonical reference link. But mainly I mean that because Linked Data doesn’t have a single record for each item, new relationships can be added relatively easily.]
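A rough sketch of the “sameAs” point, for what it’s worth: identifiers from different sources can be grouped by following the links between them, so no single source has to hold the canonical record. Everything here is hypothetical, including the example.org URIs and the same_as_clusters helper. Real Linked Data tooling works on RDF, which this does not attempt to model.

```python
from collections import defaultdict


def same_as_clusters(same_as_pairs):
    """Group identifiers that are connected, directly or not, by sameAs links."""
    neighbors = defaultdict(set)
    for a, b in same_as_pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Walk outward from this identifier until no new ones turn up.
        cluster, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in cluster:
                continue
            cluster.add(node)
            stack.extend(neighbors[node] - cluster)
        seen |= cluster
        clusters.append(cluster)
    return clusters


# Made-up URIs from three imaginary sources describing two items.
PAIRS = [
    ("http://example.org/a/moby-dick", "http://example.org/b/item-42"),
    ("http://example.org/b/item-42", "http://example.org/c/whale-book"),
    ("http://example.org/a/walden", "http://example.org/c/pond-book"),
]
```

Adding a new relationship is just one more pair in the list; nothing already recorded has to be rewritten.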


Q: How do business and industry influence libraries? E.g., Google has images for every place in the world. They have scanned books. “I can see a triangulation happening.” Virtual libraries? Virtual spaces?


Andromeda: (1) Virtual tech is written outside of libraries, almost entirely. So it depends on what libraries are able to demand and influence. (2) Commercial tech sets expectations for what users’ experiences should be like, which libraries may not be able to support. (3) People say “Why do we need libraries? It’s all online and I can pay for it.” No, it’s not, and no, not everyone can. Libraries should up their tech game, but there’s an existential threat.


Jeffrey: People use other spaces to connect to knowledge, e.g. coffee houses, which are now being incorporated into libraries. Some people are anxious about that loss of boundary. Being able to eat, drink, and talk is a strong “vision statement” but for some it breaks down the world of contemplative knowledge they want from a library.


Q: The National Science and Technology Library in China last week said they have the right to preserve all electronic resources. How can we do that?


Dan: Libraries have long been sites for preservation. In the 21st century we’re so focused on getting access now now now, we lose sight that we may be buying into commercial systems that may not be able to preserve this. This is the main problem with DRM. Libraries are in the forever business, but we don’t know where Amazon will be. We don’t know if we’ll be able to read today’s books on tomorrow’s devices. E.g., I had a subscription to Oyster ebook service, but they just went out of business. There go all my books. Open Access advocacy is going to play a critical role. Sure, Google is a $300B business and they’ll stick around, but they drop services. They don’t have a commitment like libraries and nonprofits and universities do to being in the forever business.


Jeff: It’s a huge question. It’s really important to remember that the oldest digital documents we have are 50 yrs old which isn’t even a drop in the bucket. There’s far from universal agreement about the preservation formats. Old web sites, old projects, chunks of knowledge, of mine have disappeared. What does it mean to preserve a virtual world? We need open standards, and practices [missed the word] “Digital stuff is inherently fragile.”


Andromeda: There are some good things going on in this space. The Rapid Response Social Media project is archiving (e.g., #Ferguson). Preserving software is hard: you need the software system, the hardware, etc.


Q: Disintermediation has stripped out too much value. What are your thoughts on the future of curation?


Jeffrey: There’s a high level of anxiety in the librarian community about their future roles. But I think their role comes away as reinforced. It requires new skills, though.


Andromeda: In one pottery class the assignment was to make one pot. In another, it was to make 50 pots. The best pots came out of the latter. When lots of people can author lots of stuff, it’s great. That makes curation all the more critical.


Dan: The DPLA has a Curation Core: librarians helping us organize our ebook collection for kids, which we’re about to launch with President Obama. Also: Given the growth in authorship, yes, a lot of it is Sexy Vampires, but even with that aside, we’ll need librarians to sort through that.


Q: How will Digital Rights Management and copyright issues affect ebooks and libraries? How do you negotiate that or reform that?


Dan: It’s hard to accession a lot of things now. For many ebooks there’s no way to extract them from their DRM and they won’t move into the public domain for well over 100 years. To preserve things like that you have to break the law — some scholars have asked the Library of Congress for exemptions to the DMCA to archive films before they decay.


Q: Lightning round: How do you get people and the culture engaged with public libraries?


Andromeda: Ask yourself: Who’s not here?


Jeffrey: Politicians.


Dan: Evangelism


Categories: everythingIsMiscellaneous, libraries, liveblog Tagged with: 2b2k • libraries Date: October 7th, 2015 dw


August 1, 2015

Restoring the Network of Bloggers

It’s good to have Hoder (Hossein Derakhshan) back. After spending six years in an Iranian jail, his voice is stronger than ever. The changes he sees in the Web he loves are distressingly real.

Hoder was in the cohort of early bloggers who believed that blogs were how people were going to find their voices and themselves on the Web. (I tried to capture some of that feeling in a post a year and a half ago.) Instead, in his great piece in Medium he describes what the Web looks like to someone extremely off-line for six years: endless streams of commercial content.

Some of the decline of blogging was inevitable. This was made apparent by Clay Shirky’s seminal post that showed that the scaling of blogs was causing them to follow a power law distribution: a small head followed by a very long tail.

Blogs could never do what I, and others, hoped they would. When the Web started to become a thing, it was generally assumed that everyone would have a home page that would be their virtual presence on the Internet. But home pages were hard to create back then: you had to know HTML, you had to find a host, you had to be so comfortable with FTP that you’d use it as a verb. Blogs, on the other hand, were incredibly easy. You went to one of the blogging platforms, got yourself a free blog site, and typed into a box. In fact, blogging was so easy that you were expected to do it every day.

And there’s the rub. The early blogging enthusiasts were people who had the time, skill, and desire to write every day. For most people, that hurdle is higher than learning how to FTP. So, blogging did not become everyone’s virtual presence on the Web. Facebook did. Facebook isn’t for writers. Facebook is for people who have friends. That was a better idea.

But bloggers still exist. Some of the early cohort have stopped, or blog infrequently, or have moved to other platforms. Many blogs now exist as part of broader sites. The term itself is frequently applied to professionals writing what we used to call “columns,” which is a shame since part of the importance of blogging was that it was a way for amateurs to have a voice.

That last value is worth preserving. It’d be good to boost the presence of local, individual, independent bloggers.

So, support your local independent blogger! Read what she writes! Link to it! Blog in response to it!

But, I wonder if a little social tech might also help. What follows is a half-baked idea. I think of it as BOAB: Blogger of a Blogger.

Yeah, it’s a dumb name, and I’m not seriously proposing it. It’s an homage to Libby Miller [twitter:LibbyMiller] and Dan Brickley’s [twitter:danbri] FOAF (Friend of a Friend) idea, which was both brilliant and well-named. While social networking sites like Facebook maintain a centralized, closed network of people, FOAF enables open, decentralized social networks to emerge. Anyone who wants to participate creates a FOAF file and hosts it on her site. Your FOAF file lists who you consider to be in your social network: your friends, family, colleagues, acquaintances, etc. It can also contain other information, such as your interests. Because FOAF files are typically open, they can be read by any application that wants to provide social networking services. For example, an app could see that Libby’s FOAF file lists Dan as a friend, and that Dan’s lists Libby, Carla and Pete. And now we’re off and running in building a social network in which each person owns her own information in a literal and straightforward sense. (I know I haven’t done justice to FOAF, but I hope I haven’t been inaccurate in describing it.)
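Here is a minimal sketch of what such an app might do with the Libby and Dan example. It makes big assumptions: real FOAF files are RDF documents fetched from each person’s own site, and the plain dictionaries below only stand in for them. The names FOAF_FILES, build_network, and friends_of_friends are invented for the illustration.

```python
# Stand-ins for FOAF files hosted on each person's own site.
# Real FOAF is RDF; plain dicts are used here only for illustration.
FOAF_FILES = {
    "Libby": {"knows": ["Dan"]},
    "Dan": {"knows": ["Libby", "Carla", "Pete"]},
}


def build_network(foaf_files):
    """Merge separately hosted files into one directed 'knows' graph."""
    graph = {}
    for person, profile in foaf_files.items():
        friends = profile.get("knows", [])
        graph.setdefault(person, set()).update(friends)
        # People who are only mentioned still get a node in the graph.
        for friend in friends:
            graph.setdefault(friend, set())
    return graph


def friends_of_friends(graph, person):
    """People two 'knows' hops away, excluding the person and direct friends."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {person}
```

With the two files above, friends_of_friends(build_network(FOAF_FILES), "Libby") comes out as Carla and Pete, even though Libby’s own file never mentions them. The network exists only as the sum of files that each person controls.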

BOAB would do the same, except it would declare which bloggers I read and recommend, just as the old “blogrolls” did. This would make it easier for blogging aggregators to gather and present networks of bloggers. Add in some tags and now we can browse networks based on topics.
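To make the aggregator idea a bit more concrete, here is a small sketch under heavy assumptions. It supposes each blog’s recommendations have already been read into simple records; the blog names, the tags, and the recommendations_by_topic function are all made up, and no actual markup format is implied.

```python
from collections import defaultdict

# Hypothetical blogroll records, as an aggregator might hold them after
# reading each blog's recommendations. Blog names and tags are made up.
BLOGROLLS = {
    "alice.example": [
        ("bob.example", {"libraries"}),
        ("carol.example", {"law", "internet"}),
    ],
    "bob.example": [
        ("carol.example", {"internet"}),
    ],
}


def recommendations_by_topic(blogrolls):
    """Map each tag to the blogs recommended under it and who recommends them."""
    by_topic = defaultdict(lambda: defaultdict(set))
    for recommender, entries in blogrolls.items():
        for recommended, tags in entries:
            for tag in tags:
                by_topic[tag][recommended].add(recommender)
    return by_topic
```

Looking up a tag such as "internet" then gives the blogs recommended on that topic, along with the bloggers vouching for each one, which is roughly what browsing a network by topic would need.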

In the modern age, we’d probably want to embed BOAB information in the HTML of a blog rather than in a separate file hidden from human view, although I don’t know what the best practice would be. Maybe both. Anyway, I presume that the information embedded in HTML would be similar to what Schema.org does: information about what a page talks about is inserted into the HTML tags using a specified vocabulary. The great advantage of Schema.org is that the major search engines recognize and understand its markup, which means the search engines would be in a position to discover the initial blog networks.

In fact, Schema.org has a blog specification already. I don’t see anything like markup for a blogroll, but I’m not very good at reading specifications. In any case, how hard could it be to extend that specification? Mark a link as being to a blogroll pal, and optionally supply some topics? (Dan Brickley works on Schema.org.)

So, imagine a BOAB widget that any blogger can easily populate with links to her favorite blog sites. The widget can then be easily inserted into her blog. Hidden from the users in this widget is the appropriate Schema.org markup. Not only could the search engines then see the blogger network, so could anyone who wanted to write an app or a service.

I have 0.02 confidence that I’m getting the tech right here. But enhancing blogrolls so that they are programmatically accessible seems to me to be a good idea. So good that I have 0.98 confidence that it’s already been done, probably 10+ years ago, and probably by Dave Winer :)


Ironically, I cannot find Hoder’s personal site; www.hoder.com is down, at least at the moment.

More shamefully than ironically, I haven’t updated this blog’s blogroll in many years.


My recent piece in The Atlantic about whether the Web has been irremediably paved touches on some of the same issues as Hoder’s piece.


Categories: blogs, everythingIsMiscellaneous, free-making software, social media, tech, too big to know Tagged with: 2b2k • blogging • everythingismisc • hoder • microformats • old days • schema.org • social networking Date: August 1st, 2015 dw

10 Comments »

June 1, 2015

[misc][liveblog] Alex Wright: The secret history of hypertext

I’m in Oslo for Kunnskapsorganisasjonsdagene, which my dear friend Google Translate tells me is Knowledge Organization Days. I have been in Oslo a few times before — yes, once in winter, which was as cold as Boston but far more usable — and am always re-delighted by it.

Alex Wright is keynoting this morning. The last time I saw him was … in Oslo. So apparently Fate has chosen this city as our Kismet. Also coincidence. Nevertheless, I always enjoy talking with Alex, as we did last night, because he is always thinking about, and doing, interesting things. He’s currently at Etsy, which is a fascinating and inspiring place to work, and is a professor of interaction design. He continues to think about the possibilities for design and organization that led him to write about Paul Otlet who created what Alex has called an “analog search engine”: a catalog of facts expressed in millions of index cards.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Alex begins by telling us that he began as a librarian, working as a cataloguer for six years. He has a library degree. As he works in the Net, he finds himself always drawn back to libraries. The Net’s fascination with the new brings technologists to look into the future rather than to history. Alex asks, “How do we understand the evolution of the Web and the Net in an historical context?” We tend to think of the history of the Net in terms of computer science. But that’s only part of the story.

A big part of the story takes us into the history of libraries, especially in Europe. He begins his history of hypertext with the 16th century Swiss naturalist Conrad Gessner who created a “universal bibliography” by writing each entry on a slip of paper. Leibniz used the same technique, writing notes on slips of paper and putting them in an index cabinet he had built to order.

In the 18th century, the French started using playing cards to record information. At the beginning of the 19th, the Jacquard loom used cards to guide weaving patterns, inspiring Charles Babbage to create what many [but not me] consider to be the first computer.

In 1836, Isaac Adams created the steam-powered printing press. This, along with economic and social changes, enabled the mass production of books, newspapers, and magazines. “This is when the information explosion truly started.”

To make sense of this, cataloging systems were invented. They were viewed as regimented systems that could bring efficiencies … a very industrial concept, Alex says.

“The mid-19th century was also a period of networking”: telegraph systems, telephones, internationally integrated postal systems. “Goods, people, and ideas were flowing across national borders in a way they never had before.” International journals. International political movements, such as Marxism. International congresses (conferences). People were optimistic about new political structures emerging.

Alex lists tech from the time that spread information: a daily reading of the news over copper wires, pneumatic tubes under cities (he references Molly Wright Steenson’s great work on this), etc.

Alex now tells us about Paul Otlet, a Belgian who at the age of 15 started designing his own cataloging system. He and a partner, Henri La Fontaine, started creating bibliographies of disciplines, starting with the law. Then they began a project to create a universal bibliography.

Otlet thought libraries were focused on the wrong problem. Getting readers to the right book isn’t enough. People also need access to the information in the books. At the 1900 [?] world’s fair in Paris, Otlet and La Fontaine demonstrated their new system. They wanted to provide a universal language for expressing the connections among topics. It was not a top-down system like Dewey’s.

Within a few years, with a small staff (mainly women) they had 15 million cards in their catalog. You could buy a copy of the catalog. You could send a query by telegraphy, and get a response telegraphed back to you, for a fee.

Otlet saw this in a bigger context. He and La Fontaine created the Union of International Associations, an association of associations, as the governing body for the universal classification system. The various associations would be responsible for their discipline’s information.

Otlet met a Scotsman named Patrick Geddes who worked against specialization and the fracturing of academic disciplines. He created a camera obscura in Edinburgh so that people could see all of the city, from the royal areas to the slums, all at once. He wanted to stitch all this information together in a way that would have a social effect. [I’ve been there as a tourist and had no idea!] He also used visual forms to show the connections between topics.

Geddes created a museum, the Palais Mondial, that was organized like hypertext, bringing together topics in visually rich, engaging displays. The displays are forerunners of today’s tablet-based displays.

Another collaborator, Hendrik Christian Andersen, wanted to create a world city. He went deep into designing it. He and Otlet looked into getting land in Belgium for this. World War I put a crimp in the idea of the world joining in peace. Otlet and Andersen were early supporters of the idea of a League of Nations.

After the War, Otlet became a progressive activist, including for women’s rights. As his real world projects lost momentum, in the 1930s he turned inward, thinking about the future. How could the new technologies of radio, television, telephone, etc., come together? (Alex shows a minute from the documentary “The Man Who Wanted to Classify the World.”) Otlet imagines a screen and television instead of books. All the books and info are in a separate facility, feeding the screen. “The radiated library and the televised book.” 1934.

So, why has no one ever heard of Otlet? In part because he worked in Belgium in the 1930s. In the 1940s, the Nazis destroyed his work. They replaced his building, destroying 70 tons of materials, with an exhibit of Nazi art.

Although there are similarities to the Web, how Otlet’s system worked was very different. His system was a much more controlled environment, with a classification system, subject experts, etc. … much more a publishing system than a bottom-up system. Linked Data and the Semantic Web are very Otlet-ish ideas. RDF triples and Otlet’s “auxiliary tables” are very similar.

Alex now talks about post-Otlet hypertext pioneers.

H.G. Wells’ “World Brain” essay from 1938. “The whole human memory can be, and probably in a short time will be, made accessible to every individual.” He foresaw a complete and freely available encyclopedia. He and Otlet met at a conference.

Emanuel Goldberg wanted to encode punchcard-style information on microfilm for rapid searching.

Then there’s Vannevar Bush’s Memex that would let users create public trails between documents.

And Licklider’s idea that different types of computers should be able to share information. And Engelbart who in 1968’s “Mother of all Demos” had a functioning hypertext system.

Ted Nelson thought computer scientists were focused on data computation rather than seeing computers as tools of connection. He invented the term “hypertext,” the Xanadu web, and “transclusion” (embedding a doc in another doc). Nelson thought that links always should be two way. Xanadu had “intellectual property” controls built into it.

The Internet is very flat, with no central point of control. It’s self-organizing. Private corporations are much bigger on the Net than Otlet, Engelbart, and Nelson envisioned. “Our access to information is very mediated.” We don’t see the classification system. But at sites like Facebook you see transclusion, two-way linking, identity management — needs that Otlet and others identified. The Semantic Web takes an Otlet-like approach to classification, albeit perhaps by algorithms rather than experts. Likewise, the Google “knowledge vaults” project tries to raise the ranking of results that come from expert sources.

It’s good to look back at ideas that were left by the wayside, he concludes, having just decisively demonstrated the truth of that conclusion :)

Q: Henry James?

A: James had something of a crush on Andersen, but when he saw the plan for the World City he told him that it was a crazy idea.

[Wonderful talk. Read his book.]

Categories: everythingIsMiscellaneous, libraries, too big to know Tagged with: everythingismisc • libraries • liveblog Date: June 1st, 2015 dw

October 8, 2014

A dumb idea for opening up library usage data

A dumb idea, but its dumbness is its virtue.

The idea is that libraries that want to open up data about how relevant items are to their communities could algorithmically assign a number from 1 to 100 to those items. This number would present a very low risk of re-identification, would be easily compared across libraries, and would give local libraries control over how they interpret relevance.
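
As a rough illustration only: the sketch below scores each item by its percentile rank among a library's own usage counts. This is an assumed stand-in, not the method described in the Chronicle post. The function name `relevance_scores` and the input format (a dict of item id to usage count) are hypothetical.

```python
from bisect import bisect_right


def relevance_scores(usage_counts):
    """Map each item's raw usage count to an integer from 1 to 100.

    usage_counts: dict of item id -> local usage count (hypothetical input).
    The score is the item's percentile rank within this one library, so the
    raw counts never need to leave the building. Percentile rank is an
    assumption; a library could substitute any local measure of relevance.
    """
    if not usage_counts:
        return {}
    ordered = sorted(usage_counts.values())
    total = len(ordered)
    scores = {}
    for item, count in usage_counts.items():
        # Fraction of items whose usage is less than or equal to this one's.
        fraction = bisect_right(ordered, count) / total
        # Clamp so that even the least-used item gets at least 1.
        scores[item] = max(1, min(100, round(fraction * 100)))
    return scores
```

Because only the 1-to-100 number is published, two libraries can compare scores for the same item without exposing circulation records, and each library stays free to decide what counts as "usage."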

I explain this idea in a post at The Chronicle of Higher Ed…

Categories: everythingIsMiscellaneous, libraries Tagged with: everythingis • libraries Date: October 8th, 2014 dw

September 25, 2014

BoogyWoogy library browser

Just for fun, over the weekend I wrote a way of visually browsing the almost 13M items in the Harvard Library collection. It’s called the “BoogyWoogy Browser” in honor of Mondrian. Also, it’s silly. (The idea for something like this came out of a conversation with Jeff Goldenson several years ago. In fact, it’s probably his idea.)

screen capture

You enter a search term. It returns 5-10 of the first results of a search on the Library’s catalog, and lays them out in a line of squares. You click on any of the squares and it gets another 5-10 items that are “like” the one you clicked on … but you get to choose one of five different ways items can be alike. At the strictest end, they are other items classified under the same first subject. At the loosest end, the browser takes the first real word of the title and does a simple keyword search on it, so clicking on Fifty Shades of Grey will fetch items that have the word “fifty” in their titles or metadata.
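
The loosest setting is simple enough to sketch. What follows is not the BoogyWoogy code, just a guess at what "first real word" could mean: skip a short list of stop words and return the first word that remains. The stop-word list and the name `first_real_word` are assumptions.

```python
import re

# Assumed stop words; the actual browser may use a different list or rule.
STOP_WORDS = {"a", "an", "the", "of", "and", "in", "on", "to", "for"}


def first_real_word(title):
    """Return the first word of a title that is not a stop word, lowercased.

    Returns None if the title contains no such word.
    """
    for word in re.findall(r"\w+", title.lower()):
        if word not in STOP_WORDS:
            return word
    return None


print(first_real_word("Fifty Shades of Grey"))  # fifty
print(first_real_word("The Art of War"))        # art
```

The returned word would then be passed to the catalog as an ordinary keyword query.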

It’s fragile, lousy code (see for yourself at GitHub), but that’s actually sort of the point. BoogyWoogy is a demo of the sort of thing even a hobbyist like me can write using the Harvard LibraryCloud API. LibraryCloud is an open library platform that makes library metadata available to developers. Although I’ve left the Harvard Library Innovation Lab that spawned this project, I’m still working on it through November as a small but talented and knowledgeable team of developers at the Lab and Harvard Library Technical Services are getting ready for a launch of a beta in a few months. I’ll tell you more about it as the time approaches. For example, we’re hoping to hold a hackathon in November.

Anyway, feel free to give BoogyWoogy a try. And when it breaks, you have no one to blame but me.

Categories: everythingIsMiscellaneous, libraries, tech Tagged with: libraries • serendipity Date: September 25th, 2014 dw


Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
TL;DR: Share this post freely, but attribute it to me (name (David Weinberger) and link to it), and don't use it commercially without my permission.

Joho the Blog uses WordPress blogging software.
Thank you, WordPress!