August 12, 2013
July 6, 2013
A few days ago there was a Twitter back and forth between two people I deeply respect: Dan Brickley [twitter:danbri] and Ed Summers [twitter:edsu]. It started with Ed responding to a tweet about a brief podcast I did with Kevin Ford [twitter:3windmills], who is on the team working on BibFrame:
After a couple of tweets, Dan tweeted the following:
There followed some agreement that it's often helpful to have apps driving the development of standards. (Kevin agrees with this, and points to BibFrame's process.) But, Dan's comment clarified my understanding of why ontologies make me nervous.
Over the past hundred years or so, we've come to a general recognition that all classifications and categorizations are tools, not representations of The Real Order. The periodic table of the elements is a useful way of organizing information, and manifests real relationships among the elements, but it is not the single "real" way the elements are arranged; if you're an economist or an industrialist, a chart that arranges the elements based on where they exist on our planet might be just as valid. Likewise, Linnaeus' classification scheme is useful and manifests some real relationships, but if you're a chef you might have a different way of carving up the animal kingdom. Linnaeus chose to organize species based upon visible differences (which might not be the "essential" differences) so that his scheme would be useful to scientists in the field. Although he was sometimes ambiguous about this, he seems not to have thought that he was discerning God's own order. Since Linnaeus we have become much more explicit in our understanding that how we classify depends on what we're trying to accomplish.
For example, a DTD (document type definition) typically is designed not to capture the eternal essence of some type of document, but to make the document more usable by systems that automate the document's production and processing. For example, an industry might agree on a DTD for parts catalogs that specifies that a parts catalog must have an element called "part" and that a part must have a type, part number, length, height, weight, material, and a description, and optionally can note whether it turns clockwise or counterclockwise. Each of these elements would have a standard name (e.g., "part_number," not "part#"). The result is a document that describes parts in a standard way so that a company can receive descriptions from all of its suppliers and automatically build a database of the parts it uses.
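To make the parts-catalog idea concrete, here is a minimal sketch in Python of the kind of check such an agreement enables. It is not a real DTD and not any industry's actual standard: the element names (including "rotation" for the clockwise/counterclockwise note) and the check_catalog function are made up for illustration, following the fields listed above.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names, based on the fields described in the text.
REQUIRED_FIELDS = ["type", "part_number", "length", "height",
                   "weight", "material", "description"]
OPTIONAL_FIELDS = ["rotation"]  # assumed name for clockwise/counterclockwise

def check_catalog(xml_text):
    """Return a list of problems; an empty list means the catalog conforms."""
    root = ET.fromstring(xml_text)
    problems = []
    parts = root.findall("part")
    if not parts:
        problems.append("catalog has no <part> elements")
    for index, part in enumerate(parts, start=1):
        present = {child.tag for child in part}
        # Every required field must appear under its standard name.
        for name in REQUIRED_FIELDS:
            if name not in present:
                problems.append(f"part {index}: missing <{name}>")
        # Anything outside the agreed vocabulary is flagged.
        allowed = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS)
        for name in sorted(present - allowed):
            problems.append(f"part {index}: unexpected <{name}>")
    return problems

# The second part uses a non-standard name ("partnum") for its part number.
SAMPLE = """<catalog>
  <part>
    <type>bolt</type><part_number>B-100</part_number>
    <length>40</length><height>8</height><weight>12</weight>
    <material>steel</material><description>Hex bolt</description>
    <rotation>clockwise</rotation>
  </part>
  <part>
    <type>nut</type><partnum>N-7</partnum>
    <length>8</length><height>6</height><weight>3</weight>
    <material>brass</material><description>Hex nut</description>
  </part>
</catalog>"""

for problem in check_catalog(SAMPLE):
    print(problem)
```

The first part passes; the second is flagged both for the missing standard name and for the non-standard one. That is the practical payoff of agreeing on names in advance: a buyer can load every supplier's catalog with the same code.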
A DTD therefore is designed with an eye toward what properties are going to be useful. In some industries, it might include a term that captures how shiny the part is, but if it's a DTD for surgical equipment, that may not be relevant enough to include...although "sanitary_packaging" might be. Likewise, how quickly a bolt transfers heat might seem irrelevant, at least until NASA places an order. In this, DTD's are much like forms: You don't put a field for earlobe length in the college application form you're designing.
Ontologies are different. They can try to express the structure of a domain independent of any particular use, so that the widest variety of applications can share data, including apps from domains outside of the one that's been mapped. So, to use Dan's example, your ontology of jobs would note that jobs have employers and workers, that they may have a salary or other form of compensation, that they can be part-time, full-time, seasonal, etc. As an ontology designer, because you're trying to think beyond whatever applications you already can imagine, your aim (often, not always) is to provide the fullest possible set of slots just in case someone sometime needs that info. And you will carefully describe the relationships among the elements so that apps and researchers can use knowledge that is implicit in the model.
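As a rough illustration of what "knowledge that is implicit in the model" can mean, here is a toy sketch in Python that stores a jobs ontology as subject-predicate-object statements. The class and property names are invented for this example and are not drawn from any real ontology or from Dan's; real ontology languages are far richer than this.

```python
# Each statement is a (subject, predicate, object) triple.
# All names here are illustrative assumptions.
TRIPLES = {
    ("Job", "has_property", "employer"),
    ("Job", "has_property", "worker"),
    ("Job", "has_property", "compensation"),
    ("PartTimeJob", "subclass_of", "Job"),
    ("FullTimeJob", "subclass_of", "Job"),
    ("SeasonalJob", "subclass_of", "Job"),
}

def ancestors(cls, triples=TRIPLES):
    """Follow subclass_of links upward and return every class reached."""
    found = []
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "subclass_of" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found

def properties_of(cls, triples=TRIPLES):
    """Properties stated for the class itself plus those inherited from ancestors."""
    classes = [cls] + ancestors(cls, triples)
    return {obj for subject, predicate, obj in triples
            if predicate == "has_property" and subject in classes}

print(sorted(properties_of("SeasonalJob")))
```

Nothing in the data says directly that a seasonal job has an employer; an app derives that from the stated relationships. That is the appeal of the approach, and also where the ambition to capture every relationship begins.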
The line between DTD's and ontologies is fuzzy. Many ontologies are designed with classes of apps in mind, and some DTD's have tried to be hugely general purpose. My discomfort really comes down to a distrust of the concept of "knowledge representation" that underlies some ontologies (especially earlier ones). The complexity of the relationships among parts will always outstrip our attempts to capture and codify those relationships. Further, knowledge cannot be fully represented because it isn't a thing apart from our continuous invention, discovery, and engagement with it.
What it comes down to is that if you talk about ontologies as knowledge representations I'll mutter something under my breath and change the topic.
Categories: misc Tagged with: 2b2k • dtd • everythingismisc • ontologies • sgml
Date: July 6th, 2013 dw
May 28, 2013
This fall I followed the Internet’s instructions on how to cut back the giant shrub of ugliness that’s been occupying the strip that divides our front yard from our neighbor’s. Alas, the Internet lied, and the bush has not sprouted new leaves where I cut it back past its thin margin of green. Oops.
That’s ok because I hate that !@#$@ing shrub and would be happy to be rid of it. Unfortunately, as you can see, it consists of sticks that have buried their gnarled fingers deep into the earth.
Dear Internet, how do I get rid of the thing so that I can plant something more humble and subservient?
May 8, 2013
Keynote presentation software has what seems to be a needless limitation on how large you can scale an object using their animation capabilities: you can take it up to 200% and no larger. A few years ago I poked around in the xml save files and manually increased the scaling on an object to 1000%, and it animated just fine. So I don’t know what was in the designers’ minds when they limited the user interface. Actually, I’m sure they had a good reason, so I already regret the use of the word “dumb” in my headline. A little.
“Dumb” is appropriate, however, for me, given how long it’s taken me to realize a way around the limitation in some circumstances.
Keynote has a really helpful slide transition called “Magic Move.” If you duplicate a slide and move around the objects in the duplicate slide, and resize them, then when you click from the first slide to the second, the objects will smoothly animate into their new positions and sizes. It is occasionally finicky, but when it works, it can save an enormous amount of manual animation. For example, if you have a slide with a square made up of 64 little squares, and you want to animate those little squares flying apart, rather than animating each of their movements, just duplicate the slide and drag the little squares where you want.
So, duh, if you want to animate one of those squares so it grows larger than 200%, just duplicate the slide and enlarge the square to whatever size you want. Apply the “Magic Move” transition to the first slide, and Keynote will do the deed for you.
This doesn’t work for all situations, but in the ones that it works in, it’s very handy. And, yes, I should have realized it a couple of years ago.
April 16, 2013
Please remember that according to the official Rules of Blogging, on the Web we must forgive one another’s bad poetry
April 15, 2013
I’d be blogging more, but I keep writing stuff and then realizing it’s wrong. I’d like to believe that that simply means I’m in a creative period, but it’s far more likely that I’m just wronger than usual, or possibly righter in recognizing my usual level of wrongness.
So, please just stare into this lovely pattern until you’re convinced you just read a highly insightful and 100% correct blog post…
March 28, 2013
Andy Wasklewicz and Jeff Austin from Entwine [twitter:entwinemedia] describe a multi-institutional project to build a platform-agnostic tool for enriching video through note-taking, structured annotations, and sharing. It uses HTML 5, and allows for structured tagging, time-based annotation, and more.
Dan Gillmor is giving a Berkman lunchtime talk about his Permission Taken project. Dan, who has been very influential on my understanding of tech and has become a treasured friend, is going to talk about what we can do to live in an open Internet. He begins by pointing to Jonathan Zittrain’s The Future of the Internet and Rebecca MacKinnon’s Consent of the Networked [two hugely important books].
He says that the intersection of convenience and freedom is narrowing. He goes through a “parade of horribles” [which I cannot keep up with]. He pauses on Loic Le Meur’s [twitter:loic] tweet: “A friend working for Facebook: ‘we’re like electricity.’” If that’s the case, Dan says, we should maybe even think about regulation, although he’s not a big fan of regulation. He goes through a long list of what apps ask permission to do on your mobile. His example is Skype. It’s a long list. Bruce Schneier says when it comes to security, we’re heading toward feudalism. Also, he says, Skype won’t deny it has a backdoor. “You should assume they do,” he says. The lock-in is getting tighter and tighter.
We do this for convenience. “I use a Kindle.” It makes him uncomfortable but it’s so hard to avoid lock-in and privacy risks. The fight against SOPA/PIPA was a good point. “But keep in mind that the copyright cartel is a well-funded smart group of people who never quit.” He says that we certainly need better laws, rules, and policies. “That’s crucial.” But his question this afternoon is what we as individuals can do. Today he’s going to focus on security countermeasures, although they’re not enough. His project, which might become a book, will begin simply, because it’s aimed at the broad swath of people who are not particularly technically literate.
“Full disk encryption should be the default. It’s not. Microsoft charges extra for it. Mac makes it pretty easy. So does Ubuntu.”
Disable intrusive browser extensions.
Root your phone. That’s not perfect. E.g., it makes you vulnerable to some attacks. But the tradeoff is that you now control your phone.
Dan blocks apps from particular permissions. Sometimes that keeps the app from working. “I accept that.” This is a counter to vendors insisting that you give them all the rights.
Use Tor [The Onion Router], even though “I assume some of the exit nodes” are being run by the CIA. Tor, he explains, is a way of browsing the Web with some reasonable likelihood your ISP doesn’t know what you’re actually looking at, and what you’re looking at doesn’t know where you’re coming from. This, he says, is important for whistleblowers, etc.
When loyalty cards came out, he and his friend used to randomly swap them to make the data useless. The last time he got one, he filled in his address as 1600 Pennsylvania Ave., and the guy in the store said, “It’s amazing how many people live there.” If you use a false address with a card, it may not work. If you do it online, you’re committing a felony under the Computer Fraud and Abuse Act. The revisions are going in the wrong direction. “This is terrifying…We have to do something collectively.”
Pick your platform carefully. “I was the biggest Apple person around…I was a Mac bigot for years.” At press events, he’d be the only person (besides John Markoff) to have a Mac. Many things happened, including Apple suing websites wanting to do journalism about Apple. Their “control freakery” and arrogance with the iPhone was worse. “Now that everyone except me at a press event has a Mac, I get worried.” Now the Mac is taking on the restrictions of the iPhone operating system (iOS). “I want to do what I want with my own computer.” All computer makers are moving to devices that you can’t even open. “Everyone wants to be Apple.”
Own your own domain. Why are journalists putting their work on Facebook or other people’s platforms? Because it brings distribution and attention. “We do these things on ‘free’ platforms at their sufferance.” “We all should have a place on the Web that is owned by us,” even if we don’t do most of our work there. Dan is going to require students to get their own domain name.
Dan says his book/project is going to present a gradient of actions. At the further end, there’s Linux. Dan switched last year and has found it almost painless. “No one should have to use the command line if they don’t want to,” and Linux isn’t perfect about that yet. “Even there it’s improving.” He says all the major distributions are pretty. He uses Ubuntu. “Even there there’s some control-freakery going on.” Dan says he tried Linux every year for 10 years, and now he finds it “ready for prime time.” He says some control features being introduced to Windows, for reasonable reasons, are making life harder for Linux users. [I'm not sure what he's referring to.]
Dan says the lockdown is caused by self-interest, not good vs. evil. He hopes that we can start to make the overlap of convenience and freedom larger and larger.
Q: If you should have your own domain, you should also do your own hosting, run your own Apache server, etc.
A: You can’t be independent of all external services unless you really want that. There’s a continuum here. My hosting is done by someone I know personally. We really need systematic and universal encryption in the cloud, so whoever is storing your stuff can’t muck with it unless you give them permission. That raises legal questions for them.
Q: I really like what you’re saying. I’m not a specialist and it sounds like a conversation among a very small number of people who are refined specialists in this area. How do you get this out and more accessible? Could this be included in basic literacy in our public schools? On the other hand, I worry there’s a kind of individualism: You know how to do it, so you get to do it, but the rest don’t. How do we build a default position for people who can’t manage this for themselves?
A: Yes, I worry that this is for geeks. But I’m not aiming this project at geeks. It’s more aimed at my students, who have grown up thinking Facebook is the Internet and that the MacBook Air gives them complete freedom [when in fact it can't be opened and modified]. The early chapters will be on what you can do whatever it is that you use. It won’t solve the problem, but it will help. And then take people up a ramp to get them as far as they’re comfortable doing. In really clear language, I hope. And it’d be a fine idea to make this part of digital literacy education. I’m a huge fan of Codecademy; Douglas Rushkoff wrote a wonderful book called “Program or Be Programmed,” and I think it does help to know some of this. [See Diana Kimball's Berkman Talk on coding as a liberal art.] It’s not going to be in big demand any time soon. But I hope people can see what’s at risk, what they’re losing, and also what they gain by being locked down.
Q: Do you think freedom and convenience will grow further apart? What are the major factors?
A: Overall, the bad direction is still gaining. That’s why I’m doing this. I don’t think people are generally aware of the issues. It’ll help if we can get word out about what’s at risk and what the choices are. If people are aware of the issues and are fine with giving up their freedom, that’s their choice. We’ve been trading convenience for the illusion of security. “We put our hands up in scanners as if we’re being frisked.” There’s more money and power on the control side. Every major institution is aligned on the same side of this: recentralizing the technology that promised radical decentralization. That’s a problem. I’m going to try to convince people to use tech that doesn’t do that, and to push for better policies, but …
Q: What exactly are you concerned about? I feel free to do anything I want on the Internet. Maybe the govt is managing me. Marketers definitely are. I worry about hackers stealing my identity. But what are the risks?
A: “I think a society that is under pervasive surveillance is a deadened society in the long run.” It’s bad for us “in every way that I can imagine” except for the possibility that it can stop a certain amount of crime. “But in dictatorships, the chief criminals are the govt and the police, so it doesn’t solve the problem.” The FBI wants a backdoor into every technology. If they get one, it will be used by bad people. This stuff doesn’t stay secret forever. The more you harden the defenses, the more room there is for really bad actors to get in. Those are some of the main reasons.
Q: How can Tor help whistleblowers? Do you have other advice for journalists?
A: I have a chapter in a book that’s coming out about journalists and closed platforms. Journalists need to learn about security right away because they’re putting the lives of their sources at risk. The Committee to Protect Journalists has done important work on helping journalists understand the risks and mitigate them. It’s a crucial issue that hasn’t gotten enough attention inside the craft. Although I had my PGP signature at the bottom of my column for 6 years, I got 2 emails that used it, one of which said he just wanted to know if it worked. Also, you should be aware that you can’t anticipate every risk. E.g., if the US govt wants to find out what I’m talking about online, they’ll figure out a way to do it. They could break into my house and put up cameras. But like the better deadbolt lock stopping amateur criminals, better security measures will discourage some intrusions. When I do my online banking, I do it from a virtual machine that I use only for that; it has never gone anywhere else on the Internet. I don’t think that’s totally paranoid. There are still risks.
Q: The Supreme Court just affirmed first sale of materials manufactured outside of the US. Late-stage capitalism wants to literally own its markets, offline as well as online. How much of that wider context do you want to get into?
A: If the Court hadn’t affirmed first sale, every media producer would have moved all their production facilities offshore so that we wouldn’t be able to resell it. These days we buy licenses, not goods. Increasingly, physical goods will have software components. That’s an opportunity for the control crowd to keep you from owning anything you buy. In Massachusetts, the car repair shops got a ballot measure saying they get access to the software in cars; that was marvelous. BTW, I’m making common cause with some friends on the Right. Some of the more far-seeing people on the Right are way ahead in thinking about this. E.g., Derek Khanna. I will be an ally of anybody.
Q: [Harry Lewis] Great project. Here’s your problem: What are you worried about? This is a different sort of surveillance society. This is the opposite of the Panopticon where everyone knows they’re being spied upon. People won’t be motivated until there are breaches. The incentive of the surveillors is to do it as unobtrusively as possible. You’ll never know why your life insurance premium is $100 higher than mine. You won’t ever see the data paths that led to that, because the surveillance will be happening at a level that will be completely invisible to the individual. It’ll be hard to wake people up. “A surveillance society is a deadened society” only if people know they’re being surveilled.
A: If they don’t see a consequence, then they won’t act. If the govt a generation ago had told you that you will henceforth carry a tracking device so we can know where you are at any time, there would have been an uproar. But we did it voluntarily [holding up a mobile phone]. The cell tower has to know where you are, but I’d like to find a way to spoof everything else for everyone else. (You should assume your email is being read on your employer’s server, Dan says.)
Q: I worry about creating a privacy of the elite that only a small segment can access. That creates a dangerous scenario. Should there be govt regulations to make sure we’re all operating with the same levels of privacy?
A: It’s an important point. The govt rules won’t be the ones you want. We need to create market-based solutions. Markets work better than advice or edicts.
Q: But hasn’t the market spoken, and it’s the iPhone?
A: The iPhone has important security features. But people aren’t scared enough to create a market.
A: The ACLU should be advised on how to create pamphlets that will reach people.
A: So much of hacker culture and open source culture are based on things being difficult. Many of the privacy tools work but are too hard to use. There is a distinct lack of design, and we don’t see poorly designed things as legitimate. And that’s a fairly easy thing to fix. A: Yes.
Q: Younger people don’t seem to care about privacy. Is there a generational shift?
A: There are two possibilities for the future. My hope is that we’ll all start cutting each other more slack; everyone will recognize that we all did unbelievably stupid, even possibly criminal things, in our 20s. I still do plenty of stupid things. But it worries me that cultures sometimes grow less tolerant. This could be catastrophic, if the country goes toward the Right.
A: Still pretty geeky, but it’s a wonderful start. But many of the tools cost money.
Q: Any thoughts about ways to use govt and corporate interests to promote your goals. E.g., protect the children.
A: I’ll rename this Protect the Children and then everyone will do what I want :) Overall, the problem is that power is shifting, pulling back into the center. This has long term negative consequences. But speculating on what the consequences will be is never as effective as showing what’s going wrong now. I want the power to be distributed. “I’m pretty worried, although I’m a relentless optimist.” “I’m a resister.”
Categories: misc Tagged with: apple • berkman • dan gillmor • linux • privacy
Date: March 28th, 2013 dw
March 10, 2013
First a disclaimer: Facts matter. The world is one way and another. It is entirely possible to be wrong. Not all statements are true. The statement “That is true for you but not for me” is almost always nonsensical. Ok? Can we proceed?
In an argument, facts — or, more precisely, statements that assert facts — usually are presented as stopping points. If it’s a fact, it’s a fact, and there’s no arguing with facts. If we are challenged to back up our facts, we’ll point to the source where we learned the fact. This is a delegation of authority: I don’t know how lung-less grasshoppers breathe, but this biology text — which is perhaps cited by Wikipedia — does. And how did that text learn it? It probably doesn’t tell us. And if it cites another source, I’m probably not going to be able to find it (unless I happen to be at a university with a generous set of journal subscriptions). And there’s a very good chance that ultimately I’m not going to be able to find out how the original source figured it out. Not all facts are opaque in this way, but many are, and we generally don’t mind when they are, since we probably invoked the fact to stop a line of discussion anyway.
So now I have to name-drop a little: This morning at SxSW I spent an hour with Stephen Wolfram, which is a rare treat; he is as completely fascinating as you think he is. He mentioned that a particular Nobelist had recently reluctantly acknowledged that most of the models being proposed these days are algorithmic and computational, just as Wolfram had predicted. Models are at the high end of the knowledge chain. At the lower end, there are facts, and WolframAlpha is about deriving facts algorithmically from a vast store of data. But computers often solve problems in ways completely other than how a human would; Wolfram’s example was differentials. In many of these cases, while a computer programmer might be able to understand the algorithm, no one could reproduce the outcome except by using another computer. So, in a very real sense, these computed facts are opaque not because the sources are untraceable — WolframAlpha curates the data that drive the site — but because they were not derived by a human intelligence.
Thus, we have knowledge without understanding not only at some of the highest levels of human knowledge, but also increasingly at the factual layer.
January 12, 2013
It was with a shock of emotions beyond articulation that I read this morning that Aaron Swartz killed himself yesterday.
I first met Aaron when he was 14 or 15, at a conference where he was being consulted by graybeards with technical questions. I kept in touch, and followed his activities. Aaron was a prodigy not only of technology but of democracy. Every single project he undertook aimed at improving the public sphere — more open, with lower barriers, richer connections, better information, and less corruption. He wanted the public sphere to be more of us and be more ours.
I was so looking forward to watching him continue to grow, invent, and contribute. I admired him, and I enjoyed his company, and I didn’t ever want to have to use the past tense in talking about him. The future was so much more appropriate.
Cory Doctorow writes movingly and clearly about Aaron here.
I am so sorry for his family, for his friends, for all of us who knew him, and for those who did not have that chance.
Here’s something I just posted at Reddit:
Aaron was a hero of the Internet.
Everything he did in his way too short life was aimed at making the connected world more open, with lower barriers, richer connections, more knowledge, more sharing, and less corruption. Consider Aaron’s work on standards for sharing ideas, his commitment to progressive and bottom-up politics, his efforts to provide free access to public domain court records, his work against corruption in politics, his contribution to the struggle against SOPA, the app he wrote for making it easier to create blogs and wikis (acquired by Reddit), his commitment to open information. And more.