
February 1, 2018

Can AI predict the odds on you leaving the hospital vertically?

A new research paper, published Jan. 24 with 34 co-authors and not peer-reviewed, claims better accuracy than existing software at predicting outcomes like whether a patient will die in the hospital, be discharged and readmitted, and their final diagnosis. To conduct the study, Google obtained de-identified data of 216,221 adults, with more than 46 billion data points between them. The data span 11 combined years at two hospitals…

That’s from an article in Quartz by Dave Gershgorn (Jan. 27, 2018), based on the original article by Google researchers posted at Arxiv.org.

…Google claims vast improvements over traditional models used today for predicting medical outcomes. Its biggest claim is the ability to predict patient deaths 24-48 hours before current methods, which could allow time for doctors to administer life-saving procedures.

Dave points to one of the biggest obstacles to this sort of computing: the data are in such different formats, from hand-written notes to the various form-based data that’s collected. It’s all about the magic of interoperability … and the frustration when data (and services and ideas and language) can’t easily work together. Then there’s what Paul Edwards, in his great book A Vast Machine calls “data friction”: “…the costs in time, energy, and attention required simply to collect, check, store, move, receive, and access data.” (p. 84)

On the other hand, machine learning can sometimes get past the incompatible expression of data in a way that’s so brutal that it’s elegant. One of the earlier breakthroughs in machine learning came in the 1990s when IBM analyzed the English and French versions of Hansard, the bi-lingual transcripts of the Canadian Parliament. Without the machines knowing the first thing about either language, the system produced more accurate results than software that was fed rules of grammar, bilingual dictionaries, etc.
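
To get a feel for the idea, here’s a toy sketch in Python (not IBM’s actual models, and the sentence pairs are invented) showing how far you can get with nothing but co-occurrence counts across aligned sentences:

```python
from collections import defaultdict

# Toy illustration: no grammar rules, no dictionary, just counting which
# words co-occur across aligned sentence pairs. (IBM's real models were far
# more sophisticated; the "corpus" below is invented.)

aligned_pairs = [
    ("the house is red", "la maison est rouge"),
    ("the car is red", "la voiture est rouge"),
    ("the house is big", "la maison est grande"),
]

counts = defaultdict(lambda: defaultdict(int))
for en_sent, fr_sent in aligned_pairs:
    for en_word in en_sent.split():
        for fr_word in fr_sent.split():
            counts[en_word][fr_word] += 1

# For each English word, guess the French word it co-occurs with most often.
for en_word, fr_counts in counts.items():
    print(en_word, "->", max(fr_counts, key=fr_counts.get))
```

With a corpus this tiny the guesses are noisy, of course; the point is that scale, not linguistic knowledge, does the work.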

Indeed, the abstract of the Google paper says “Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient’s record. We propose a representation of patients’ entire, raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format.” It continues: “We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization.”
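
As a rough illustration of what a representation of raw EHR records might look like, here’s a heavily simplified sketch (not Google’s pipeline; the records below are invented) that flattens FHIR-style resources into one time-ordered event sequence per patient:

```python
from datetime import datetime

# Heavily simplified sketch: flatten FHIR-style Observation resources into a
# time-ordered event sequence, the kind of input a sequence model could
# consume. The records are invented; real FHIR resources carry far more
# structure than this.

raw_records = [
    {"resourceType": "Observation",
     "effectiveDateTime": "2017-03-02T08:00:00",
     "code": {"text": "heart rate"},
     "valueQuantity": {"value": 88, "unit": "/min"}},
    {"resourceType": "Observation",
     "effectiveDateTime": "2017-03-01T22:15:00",
     "code": {"text": "systolic blood pressure"},
     "valueQuantity": {"value": 141, "unit": "mmHg"}},
]

def to_event(resource):
    return (datetime.fromisoformat(resource["effectiveDateTime"]),
            resource["code"]["text"],
            resource["valueQuantity"]["value"],
            resource["valueQuantity"]["unit"])

for when, what, value, unit in sorted(to_event(r) for r in raw_records):
    print(when.isoformat(), what, value, unit)
```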

The paper also says that their approach affords clinicians “some transparency into the predictions.” Some transparency is definitely better than none. But, as I’ve argued elsewhere, in many instances there may be tools other than transparency that can give us some assurance that AI’s outcomes accord with our aims and our principles of fairness.


I found this article by clicking on Dave Gershgorn’s byline on a brief article about the Wired version of the paper of mine I referenced in the previous paragraph. He does a great job explaining it. And, believe me, it’s hard to get a writer — well, me, anyway — to acknowledge that without having to insert even one caveat. Thanks, Dave!


Categories: ai, interop Tagged with: 2b2k • data • explanations • interop • machine learning Date: February 1st, 2018 dw


October 28, 2017

Making medical devices interoperable

The screen next to a patient’s hospital bed that displays the heart rate, oxygen level, and other moving charts is the definition of a dumb display. How dumb is it, you ask? If the clip on a patient’s finger falls off, the display thinks the patient is no longer breathing and will sound an alarm…even though it’s displaying outputs from other sensors that show that, no, the patient isn’t about to die.

The problem, as explained by David Arney at an open house for MD PnP, is that medical devices do not share their data in open ways. That is, they don’t interoperate. MD PnP wants to fix that.

The small group was founded in 2004 as part of MIT’s CIMIT (Consortia for Improving Medicine with Innovation and Technology). Funded by grants, including from the NIH and CRICO Insurance, it currently has 6-8 people working on ways to improve health care by getting machines talking with one another.

The one aspect of hospital devices that manufacturers have generally agreed on is that they connect via serial ports. The FDA encourages this, at least in part because serial ports are electrically safe. So, David pointed to a small connector box with serial ports in and out and a small computer in between. The computer converts the incoming information into an open industry standard (ISO 11073). And now the devices can play together. (The “PnP” in the group’s name stands for “plug ‘n’ play,” as we used to say in the personal computing world.)
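
Conceptually, the connector’s job looks something like this sketch. The vendor line formats and field names here are invented for illustration; the real ISO/IEEE 11073 nomenclature is far richer:

```python
# Two imaginary devices that speak different dialects over their serial
# ports, normalized into one shared vocabulary.

def parse_vendor_a(line):
    # hypothetical format: "HR=72;SPO2=97"
    return {key.lower(): float(value) for key, value in
            (field.split("=") for field in line.strip().split(";"))}

def parse_vendor_b(line):
    # hypothetical format: "72,97" (heart rate, oxygen saturation)
    hr, spo2 = line.strip().split(",")
    return {"hr": float(hr), "spo2": float(spo2)}

PARSERS = {"vendor_a": parse_vendor_a, "vendor_b": parse_vendor_b}

def normalize(device, line):
    reading = PARSERS[device](line)
    # one common structure, whatever the device spoke on its serial port
    return {"heart_rate_bpm": reading.get("hr"),
            "spo2_percent": reading.get("spo2")}

print(normalize("vendor_a", "HR=72;SPO2=97"))
print(normalize("vendor_b", "71,96"))
```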

David then demonstrated what can be done once the data from multiple devices interoperate.

  • You can put some logic behind the multiple signals so that a patient’s actual condition can be assessed far more accurately: no more sirens when an oxygen sensor falls off a finger. (There’s a sketch of this after the list.)

  • You can create displays that are more informative and easier to read — and easier to spot anomalies on — than the standard bedside monitor.

  • You can transform data into other standards, such as the HL7 format for entry into electronic medical records.

  • If there is more than one sensor monitoring a factor, you can do automatic validation of signals.

  • You can record and perhaps share alarm histories.

  • You can create what is functionally an API for the data your medical center is generating: a database that makes the information available to programs that need it via publish and subscribe.

  • You can aggregate tons of data (while following privacy protocols, of course) and use machine learning to look for unexpected correlations.
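
Here’s a sketch of the first item on that list. The thresholds and field names are illustrative only, not clinical logic from MD PnP:

```python
# Simple cross-signal logic that a single bedside monitor can't apply on
# its own. Illustrative thresholds, not medical guidance.

def should_alarm(readings):
    spo2 = readings.get("spo2_percent")          # pulse-ox clip on the finger
    heart_rate = readings.get("heart_rate_bpm")  # independent ECG leads
    resp_rate = readings.get("resp_rate")        # chest impedance sensor

    if not spo2:
        # No oxygen signal at all. If the other sensors still show a live,
        # breathing patient, the likeliest explanation is a fallen-off clip:
        # flag it for a nurse rather than sounding a code-level alarm.
        if heart_rate and heart_rate > 40 and resp_rate and resp_rate > 8:
            return "sensor check"
        return "alarm"
    if spo2 < 85:
        return "alarm"
    return "ok"

print(should_alarm({"spo2_percent": 0, "heart_rate_bpm": 74, "resp_rate": 14}))
print(should_alarm({"spo2_percent": 82, "heart_rate_bpm": 130, "resp_rate": 28}))
```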

MD PnP makes its stuff available under an open BSD license and publishes its projects on GitHub. This means, for example, that while PnP has created interfaces for 20-25 protocols and data standards used by device makers, you could program its connector to support another device if you need to.

Presumably not all the device manufacturers are thrilled about this. The big ones like to sell entire suites of devices to hospitals on the grounds that all those devices interoperate amongst themselves — what I like to call intraoperating. But beyond corporate greed, it’s hard to find a down side to enabling more market choice and more data integration.


Categories: future, interop Tagged with: 2b2k • interop • medical Date: October 28th, 2017 dw


June 12, 2016

Beyond bricolage

In 1962, Claude Levi-Strauss brought the concept of bricolage into the anthropological and philosophical lexicons. It has to do with thinking with one’s hands, putting together new things by repurposing old things. It has since been applied to the Internet (including, apparently, by me, thanks to a tip from Rageboy). The term “bricolage” uncovers something important about the Net, but it also covers up something fundamental about the Net that has been growing even more important.

In The Savage Mind (relevant excerpt), CLS argued against the prevailing view that “primitive” peoples were unable to form abstract concepts. After showing that they often have extensive sets of concepts for flora and fauna, he maintains that these concepts go beyond what they pragmatically need to know:

…animals and plants are not known as a result of their usefulness; they are deemed to be useful or interesting because they are first of all known.

It may be objected that science of this kind can scarcely be of much practical effect. The answer to this is that its main purpose is not a practical one. It meets intellectual requirements rather than or instead of satisfying needs.

It meets, in short, a “demand for order.”

CLS wants us to see the mythopoeic world as being as rich, complex, and detailed as the modern scientific world, while still drawing the relevant distinctions. He uses bricolage as a bridge for our understanding. A bricoleur scavenges the environment for items that can be reused, getting their heft, trying them out, fitting them together and then giving them a twist. The mythopoeic mind engages in this bricolage rather than in the scientific or engineering enterprise of letting a desired project assemble the “raw materials.” A bricoleur has what s/he has and shapes projects around that. And what the bricoleur has generally has been fashioned for some other purpose.

Bricolage is a very useful concept for understanding the Internet’s mashup culture, its culture of re-use. It expresses the way in which one thing inspires another, and the power of re-contextualization. It evokes the sense of invention and play that is dominant on so much of the Net. While the Engineer is King (and, all too rarely, Queen) of this age, the bricoleurs have kept the Net weird, and bless them for it.

But there are at least two ways in which this metaphor is inapt.

First, traditional bricoleurs don’t have search engines that let them in a single glance look across the universe for what they need. Search engines let materials assemble around projects, rather than projects be shaped by the available materials. (Yes, this distinction is too strong. Yes, it’s more complicated than that. Still, there’s some truth to it.)

Second, we have been moving with some consistency toward a Net that at its topmost layers replicates the interoperability of its lower layers. Those low levels specify the rules — protocols — by which networks can join together to move data packets to their destinations. Those packets are designed so they can be correctly interpreted as data by any recipient applications. As you move up the stack, you start to lose this interoperability: Microsoft Word can’t make sense of the data output by Pages, and a graphics program may not be able to make sense of the layer information output by Photoshop.

But, over time, we’re getting better at this:

Applications add import and export services as the market requires. More consequentially, more and richer standards for interoperability continue to emerge, as they have from the very beginning: FTP, HTML, XML, Dublin Core, Schema.org, the many Semantic Web vocabularies, ontologies, and schema, etc.

More important, we are now taking steps to make sure that what we create is available for re-use in ways we have not imagined. We do this by working within standards and protocols. We do it by putting our work into the sphere of reusable items, whether that’s by applying a Creative Commons license, putting our work into a public archive, or even just paying attention to what will make our work more findable.

This is very different from the bricoleur’s world in which objects are designed for one use, and it takes the ingenuity of the bricoleur to find new uses for them.

This movement continues the initial work of the Internet. From the beginning the Net has been predicated on providing an environment with the fewest possible assumptions about how it will be used. The Net was designed to move anyone’s information no matter what it’s about, what it’s for, where it’s going, or who owns it. The higher levels of the stack are increasingly realizing that vision. The Net is thus more than ever becoming a universe of objects explicitly designed for reuse in unexpected ways. (An important corrective to this sunny point of view: Christian Sandvig’s brilliant description of how the Net has incrementally become designed for delivering video above all else.)

Insofar as we are explicitly creating works designed for unexpected reuse, the bricolage metaphor is flawed, as all metaphors are. It usefully highlights the “found” nature of so much of Internet culture. It puts into the shadows, however, the truly transformative movement we are now living through in which we are explicitly designing objects for uses that we cannot anticipate.


Categories: culture, future, interop, philosophy Tagged with: reuse • standards Date: June 12th, 2016 dw


March 1, 2016

[berkman] Dries Buytaert

I’m at a Berkman [twitter: BerkmanCenter] lunchtime talk (I’m moderating, actually) where Dries Buytaert is giving a talk about some important changes in the Web.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He begins by recounting his early days as the inventor of Drupal, in 2001. He’s also the founder of Acquia, one of the fastest growing tech companies in the US. It currently has 750 people working on products and services for Drupal. Drupal is used by about 3% of the billion web sites in the world.

When Drupal started, he felt he “could wrap his arms” around everything going on on the Web. Now that’s impossible, he says. E.g., Google AdWords was just starting, but now AdWords is a $65B business. The mobile Web didn’t exist. Social media didn’t yet exist. Drupal was (and is) Open Source, a concept that most people didn’t understand. “Drupal survived all of these changes in the market because we thought ahead” and then worked with the community.

“The Internet has changed dramatically” in the past decade. Big platforms have emerged. They’re starting to squeeze smaller sites out of the picture. There’s research that shows that many people think that Facebook is the Internet. “How can we save the open Web?” Dries asks.

What do we mean by the open or closed Web? The closed Web consists of walled gardens. But these walled gardens also do some important good things: bringing millions of people online, helping human rights and liberties, and democratizing the sharing of information. But their scale is scary. FB has 1.6B active users every month; Apple has over a billion iOS devices. Such behemoths can shape the news. They record data about our behavior, and they won’t stop until they know everything about us.

Dries shows a table of what the different big platforms know about us. “Google probably knows the most about us” because of Gmail.

The closed web is winning “because it’s easier to use.” E.g., after Dries moved from Belgium to the US, Facebook and the like made it much easier for him to stay in touch with his friends and family.

The open web is characterized by:

  1. Creative freedom — you could create any site you wanted and style it any way you pleased

  2. Serendipity. That’s still there, but it’s less used. “We just scroll our FB feed and that’s it.”

  3. Control — you owned your own data.

  4. Decentralized — open standards connected the pieces

Closed Web:

  1. Templates dictate your creative license

  2. Algorithms determine what you see

  3. Privacy is in question

  4. Information is siloed

The big platforms are exerting control. E.g., Twitter closed down its open API so it could control the clients that access it. FB launched “Free Basics” that controls which sites you can access. Google lets people purchase results.

There are three major trends we can’t ignore, he says.

First, there’s the “Big Reverse of the Web,” about which Dries has been blogging. “We’re in a transformational stage of the Web,” flipping it on its head. We used to go to sites and get the information we wanted. Now information is coming to us. Info, products, and services will come to us at the right time on the right device.

Second, “Data is eating the world.”

Third, “Rise of the machines.”

For example, “content will find us,” AKA “mobile or contextual information.” If your flight is cancelled, what you see at the airport will be the relevant info, not an offer of car rentals for when you arrive. This creates a better user experience, and “user experience always wins.”

Will the Web be open or closed? “It could go either way.” So we should be thinking about how we can build data-driven, user-centric algorithms. “How can we take back control over our data?” “How can we break the silos” and decentralize them while still offering the best user experience? “How do we compete with Google in a decentralized way? Not exactly easy.”

For this, we need more transparency about how data is captured and used, but also how the algorithms work. “We need an FDA for data and algorithms.” (He says he’s not sure about this.) “It would be good if someone could audit these algorithms,” because, for example, Google’s can affect an election. But how to do this? Maybe we need algorithms to audit the algorithms?

Second, we need to protect our data. Perhaps we should “build personal information brokers.” You unbundle FB and Google, put the data in one place, and through APIs give apps access to them. “Some organizations are experimenting with this.”
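
A personal information broker might look something like this sketch, which is entirely hypothetical (no particular service or API is implied): your data sits in one store you control, and apps get revocable, category-scoped access:

```python
import secrets

# Hypothetical sketch of a personal information broker: you deposit your own
# data, and apps see only the categories you grant, via tokens you can revoke.

class PersonalBroker:
    def __init__(self):
        self.data = {}     # category -> list of records
        self.grants = {}   # token -> set of permitted categories

    def deposit(self, category, record):
        self.data.setdefault(category, []).append(record)

    def grant(self, app_name, categories):
        token = secrets.token_hex(8)
        self.grants[token] = set(categories)
        print(f"granted {app_name} access to {categories}")
        return token

    def revoke(self, token):
        self.grants.pop(token, None)

    def query(self, token, category):
        if category not in self.grants.get(token, set()):
            raise PermissionError(f"no access to {category}")
        return self.data.get(category, [])

broker = PersonalBroker()
broker.deposit("photos", {"file": "beach.jpg"})
broker.deposit("contacts", {"name": "A. Friend"})
token = broker.grant("photo-printing-app", ["photos"])
print(broker.query(token, "photos"))    # allowed
# broker.query(token, "contacts")       # would raise PermissionError
```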

Third, decentralization and a better user experience. “For the open web to win, we need to make it much easier to use.” This is where Open Source and open standards come in, for they allow us to build a “layer of tech that enables different apps to communicate, and that makes them very easy to use.” This is very tricky. E.g., how do you make it easy to leave a comment on many different sites without requiring people to log in to each?

It may look almost impossible, but global projects like Drupal can have an impact, Dries says. “We have to try. Today the Web is used by billions of people. Tomorrow by more people.” The Internet of Things will accelerate the Net’s effect. “The Net will change everything, every country, every business, every life.” So, “we have a huge responsibility to build the web that is a great foundation for all these people for decades to come.”

[Because I was moderating the discussion, I couldn’t capture it here. Sorry.]


Categories: big data, culture, future, interop, policy Tagged with: interop • open web Date: March 1st, 2016 dw


March 12, 2015

Corrections metadata

It’s certain that this has already been suggested many times, and it’s highly likely it’s been implemented at least several times. But here goes:

Currently the convention for correcting an online mistake is to strike through the errant text and then put in the correct text. Showing one’s errors is a wonderful norm, for it honors the links others have made to the piece: it’s at best confusing when you post criticism of someone else’s work but, when the reader goes there, the errant remarks have been totally excised. It’s also a visible display of humility.

But strikethrough text is only a visual cue for what is really a structural meaning. And it conveys only the fact that the text is wrong, not why it’s wrong.

So, why isn’t there Schema.org markup for corrections?

Schema.org is the set of simple markup for adding semantics to plain old Web pages. The reader can’t see the markup, but computers can. The major search engines are behind Schema.org, which means that if you mark up your page with the metadata they’ve specified, the search engines will understand your page better and are likely to up its ranking. (Here’s another post of mine about Schema.org.)

So, imagine there were simple markup you could put into your HTML that would let you note that some bit of text is errant, and let you express (in hidden text):

  • When the correction was made

  • Who made it

  • Who suggested the correction, if anyone.

  • When it was made

  • What was wrong with the text

  • A bit of further explanation

The corrected text might include the same sort of information. Plus, you’d want a way to indicate that these two pieces of text refer to one another; you wouldn’t want a computer getting confused about which correction corrects which errant text.
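
To make the idea concrete, here’s a sketch of what such metadata might look like, expressed as JSON-LD-ish records in Python. The “CorrectionAnnotation” type and its field names are invented for this example; they are not an existing Schema.org vocabulary, which is the point of the post:

```python
import json

# Invented "CorrectionAnnotation" vocabulary, purely for illustration.
# The errant text and its correction each get a record, and the records
# point at each other so software never has to guess which correction
# belongs to which errant passage.

errant = {
    "@id": "#claim-1-original",
    "@type": "CorrectionAnnotation",
    "role": "errantText",
    "text": "The meeting was held in 2013.",
    "correctedBy": {"@id": "#claim-1-corrected"},
}

corrected = {
    "@id": "#claim-1-corrected",
    "@type": "CorrectionAnnotation",
    "role": "correctedText",
    "text": "The meeting was held in 2014.",
    "corrects": {"@id": "#claim-1-original"},
    "dateCorrected": "2015-03-14",
    "correctionAuthor": "dw",
    "suggestedBy": "a reader",
    "reason": "wrong year",
    "note": "Confirmed against the meeting minutes.",
}

print(json.dumps([errant, corrected], indent=2))
```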

If this became standard, browsers could choose to display errant texts and their corrections however they’d like. Add-ons could be written to let users interact with corrections in different ways. For example, maybe you like seeing strikethroughs but I’d prefer to be able to hover to see the errant text. Maybe we can sign up to be notified of any corrections to an article, but not corrections that are just grammatical. Maybe we want to be able to do research about the frequency and type of corrections across sources, areas, languages, genders….

Schema.org could drive this through. Unless, of course, it already has.

Be sure to read the comment from Dan Brickley. Dan is deeply involved in Schema.org. (The prior comment is from my former college roommate.)


Categories: interop Tagged with: corrections • errors • interoperability • metadata Date: March 12th, 2015 dw


January 20, 2015

Fargo: an open outliner

Dave Winer loves outlines. I do, too, but Dave loves them More. We know this because Dave’s created the Fargo outliner, and, in the way of software that makes us freer, he’s made it available to us to use for free, without ads or spyware, and supporting the standards and protocols that make our ideas interoperable.

Fargo is simple and straightforward. You enter text. You indent lines to create structure. You can reorganize and rearrange as you would like. Type CMD-? or CTL-? for help.

Fargo is a deep product. It is backed by a CMS so you can use it as your primary tool for composing and publishing blog posts. (Dave knows a bit about blogging, after all.) It has workgroup tools. You can execute JavaScript code from it. It understands Markdown. You can use it to do presentations. You can create and edit attributes. You can include other files, so your outlines scale. You can include feeds, so your outlines remain fresh.

Fargo is generative. It supports open standards, and it’s designed to make it easy to let what you’ve written become part of the open Web. It’s written in HTML5 and runs in all modern browsers. Your outlines have URLs so other pages can link to them. Fargo files are saved in the OPML standard so other apps can open them. The files are stored in your Dropbox folder, which puts them in the Cloud but also on your personal device; look in Dropbox/Apps/smallpicture/. You can choose to encrypt your files to protect them from spies. The Concord engine that powers Fargo is Open Source.
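
Because the files are plain OPML, any program can read them back. Here’s a minimal sketch (the sample outline is made up) that parses an OPML document with nothing but Python’s standard library:

```python
import xml.etree.ElementTree as ET

# OPML is just XML: a <head> with a title and a <body> of nested <outline>
# elements whose "text" attributes hold the outline's lines.

sample_opml = """<opml version="2.0">
  <head><title>Trip notes</title></head>
  <body>
    <outline text="Places to visit">
      <outline text="Fargo, ND"/>
      <outline text="Winnipeg"/>
    </outline>
    <outline text="Packing list">
      <outline text="Warm socks"/>
    </outline>
  </body>
</opml>"""

def print_outline(node, depth=0):
    for child in node.findall("outline"):
        print("  " * depth + "- " + child.get("text", ""))
        print_outline(child, depth + 1)

root = ET.fromstring(sample_opml)
print(root.find("head/title").text)
print_outline(root.find("body"))
```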

Out of the box, Fargo is a heads-down outliner for people who think about what they write in terms of its structure. (I do.) It thus is light on the presentation side: You can’t easily muck about with the styles it uses to present various levels, and there isn’t an embedded way to display graphics, although you can include files that are displayed when the outline is rendered. But because it is a simple product with great depth, you can always go further with it.

And no matter how far you go, you’ll never be locked in.


Categories: free culture, free-making software, interop, reviews Tagged with: dave winer • outliners • word processing Date: January 20th, 2015 dw


February 1, 2014

Linked Data for Libraries: And we’re off!

I’m just out of the first meeting of the three universities participating in a Mellon grant — Cornell, Harvard, and Stanford, with Cornell as the grant instigator and leader — to build, demonstrate, and model the use of library resources expressed as Linked Data as a tool for researchers, students, teachers, and librarians. (Note that I’m putting all this in my own language, and I was certainly the least knowledgeable person in the room. Don’t get angry at anyone else for my mistakes.)

This first meeting, two days long, was very encouraging indeed: it’s a superb set of people, we are starting out on the same page in terms of values and principles, and we enjoyed working with one another.

The project is named Linked Data for Libraries (LD4L) (minimal home page), although that doesn’t entirely capture it, for the actual beneficiaries of it will not be libraries but scholarly communities taken in their broadest sense. The idea is to help libraries make progress with expressing what they know in Linked Data form so that their communities can find more of it, see more relationships, and contribute more of what the communities learn back into the library. Linked Data is not only good at expressing rich relations, it makes it far easier to update the dataset with relationships that had not been anticipated. This project aims at helping libraries continuously enrich the data they provide, and making it easier for people outside of libraries — including application developers and managers of other Web sites — to connect to that data.
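
As a toy illustration of the underlying idea (the catalog URIs below are made up, and LD4L’s actual ontologies are far richer), a record becomes a set of subject-predicate-object triples that anyone can extend:

```python
# A made-up catalog record as Linked Data triples, using the (real)
# Dublin Core terms vocabulary for the predicates. All other URIs here
# are invented for illustration.

DCT = "http://purl.org/dc/terms/"
book = "http://example.edu/catalog/book/12345"

triples = [
    (book, DCT + "title", '"Too Big to Know"'),
    (book, DCT + "creator", "http://example.edu/catalog/person/weinberger-david"),
    (book, DCT + "subject", "http://example.edu/subjects/knowledge"),
]

# Because each relationship is just another triple, new kinds of links
# (say, "cited in this syllabus" or "annotated by this scholar") can be
# added later without redesigning a schema.
for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```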

As the grant proposal promised, we will use existing ontologies, adapting them only when necessary. We do expect to be working on an ontology for library usage data of various sorts, an area in which the Harvard Library Innovation Lab has done some work, so that’s very exciting. But overall this is the opposite of an attempt to come up with new ontologies. Thank God. Instead, the focus is on coming up with implementations at all three universities that can serve as learning models, and that demonstrate the value of having interoperable sets of Linked Data across three institutions. We are particularly focused on showing the value of the high-quality resources that libraries provide.

There was a great deal of emphasis in the past two days on partnerships and collaboration. And there was none of the “We’ll show ‘em where they got it wrong, by gum!” attitude that in my experience all too often infects discussions on the pioneering edge of standards. So, I just got to spend two days with brilliant library technologists who are eager to show how a new generation of tech, architecture, and thought can amplify the already immense value of libraries.

There will be more coming about this effort soon. I am obviously not a source for tech info; that will come soon and from elsewhere.


Categories: interop, libraries Tagged with: 2b2k • eim • ld4l • libraries • linked data Date: February 1st, 2014 dw


June 21, 2013

[lodlam] Kevin Ford on the state of BIBFRAME

Kevin Ford, a principal member of the team behind the Library of Congress’ BIBFRAME effort — a modern replacement for the aging MARC standard — gives an update on its status and addresses a controversy about whether it’s “webby” enough. (I liveblogged a session about this at LODLAM.)


Categories: interop, libraries, podcast Tagged with: bibframe • libraries • linked data • lodlam • marc • podcast Date: June 21st, 2013 dw


[lodlam] Kitio Fofack on why Linked Data

Kitio Fofack turned to Linked Data when creating a prototype app that aggregated researcher events. He explains why.


Categories: interop, libraries, podcast Tagged with: libraries • linked data • lodlam • podcast Date: June 21st, 2013 dw


March 28, 2013

[annotation][2b2k] Critique^it

Ashley Bradford of Critique-It describes his company’s way of keeping review and feedback engaging.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

To what extent can and should we allow classroom feedback to be available in the public sphere? The classroom is a type of Habermasian civic society. Owning one’s discourse in that environment is critical. It has to feel human if students are to learn.

So, you can embed text, audio, and video feedback in documents, video, and images. It translates docs into HTML. To make the feedback feel human, it uses stamps. You can also type in comments, marking them as neutral, positive, or critique. A “critique panel” follows you through the doc as you read it, so you don’t have to scroll around. It rolls up comments and stats for the student or the faculty.

It works the same in different doc types, including PowerPoint, images, and video.

Critiques can be shared among groups. Groups can be arbitrarily defined.

It uses HTML5. It’s written in JavaScript and PHP, and uses MySQL.

“We’re starting with an environment. We’re building out tools.” Ashley aims for Critique^It to feel very human.


Categories: interop, too big to know Tagged with: 2b2k • annotation • interop • liveblog Date: March 28th, 2013 dw




Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
TL;DR: Share this post freely, but attribute it to me (name (David Weinberger) and link to it), and don't use it commercially without my permission.
