Joho the Blog » free-making software

November 25, 2018

Using the API to check links

My new book (Everyday Chaos, HBR Press, May 2019) has a few hundred footnotes with links to online sources. Because Web sites change and links rot, I decided to link to Perma.cc's pages instead. Perma.cc is a product of the Harvard Library Innovation Lab, which I used to co-direct with Kim Dulin, but Perma is a Jonathan Zittrain project from after I left.

When you give Perma.cc a link to a page on the Web, it comes back with a link to a page on the Perma.cc site. That page has an archived copy of the original page exactly as it was when you supplied the link. It also makes a screen capture of that original page. And of course it includes a link to the original. It also promises to maintain the copy and screen capture in perpetuity, a promise backed by the Harvard Law Library and dozens of other libraries. So, when you give a reader a Perma link, they are taken to a page where they'll always find the archived copy and the screen capture, no matter what happens to the original site. Also, the service is free for everyone, for real. Plus, the site doesn't require users to supply any information about themselves. And there are no ads.

So that's why my book's references are to Perma.cc pages.

But, over the course of the six years I spent writing this book, my references suffered some link rot on my side. Before I got around to creating the Perma links, I managed to make all the obvious errors and some not so obvious. As a result, now that I’m at the copyediting stage, I wanted to check all the Perma links.

I had already compiled a bibliography as a spreadsheet. (The book will point to the page for that spreadsheet.) So, I selected the Title and Perma Link columns, copied the content, and pasted it into a text document. Each line contains the page's headline and then the Perma link. Perma.cc has an API that made it simple to write a script that looks up each Perma link and prints out the title Perma has recorded next to the title of the page I intended to link. If there's a problem with a Perma link, such as a doubled "https://https://" (a mistake I managed to introduce about a dozen times), or if the Perma link is private and not accessible to the public, the script notes the problem. The human brain is good at scanning this sort of info, looking for inconsistencies.
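The doubled-scheme mistake can be screened for before calling the API at all. Here's a minimal sketch in Python; the line format (title, then link) follows the description above, and the sample links are made up:

```python
def check_line(line):
    """Return a problem description for one 'Title https://...' line, or None."""
    pos = line.find("https")
    if pos == -1:
        return "no link found"
    link = line[pos:].strip()
    if "https://https://" in link:
        return "doubled scheme: " + link
    return None

# Hypothetical sample lines; the links are placeholders, not real Perma codes.
lines = [
    "The Rand Corporation: The Think Tank That Controls America https://perma.cc/XXXX-XXXX",
    "Some Article https://https://perma.cc/YYYY-YYYY",
]
for line in lines:
    problem = check_line(line)
    if problem:
        print(problem)
```

Only the second sample line is flagged, since its link carries the doubled "https://https://" prefix.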

Here’s the script. I used PHP because I happen to know it better than a less embarrassing choice such as Python and because I have no shame.





&lt;?php

// This is a basic program for checking a list of page titles and Perma links.
// It's done badly because I am a terrible hobbyist programmer.
// I offer it under whatever open source license is most permissive. I'm really not
// going to care about anything you do with it. Except please note I'm a
// terrible hobbyist programmer who makes no claims about how well this works.
//
// David Weinberger
// Nov. 23, 2018
//
// See the Perma.cc API documentation for the endpoint used below.
//
// This program assumes there's a text file with one page title and one Perma link per line.
// E.g. The Rand Corporation: The Think Tank That Controls America, followed by its Perma link.

// Read that text file into an array
$lines = file('links-and-titles.txt');

for ($i = 0; $i < count($lines); $i++){
    $line = $lines[$i];

    // divide into title and permalink
    $p1 = strpos($line, "https"); // find the beginning of the perma link
    $fullperma = substr($line, $p1); // get the full perma link
    $origtitle = substr($line, 0, $p1); // get the title
    $origtitle = rtrim($origtitle); // trim the spaces from the end of the title

    // get the distinctive part of the perma link: the stuff after the last slash
    $permacode = strrchr($fullperma, "/"); // find the last forward slash
    $permacode = substr($permacode, 1); // get what's after that slash
    $permacode = rtrim($permacode); // trim any spaces from the end

    // create the url that will fetch this perma link
    // (the Perma.cc public archives API endpoint)
    $apiurl = "https://api.perma.cc/v1/public/archives/" . $permacode . "/";

    // fetch the data about this perma link
    $onelink = file_get_contents($apiurl);
    // echo $onelink; // this would print the full json

    // decode the json
    $j = json_decode($onelink, true);

    // Did you get any json, or just null?
    if ($j == null){
        // Hmm. This might be a private perma link. Or some other error.
        echo "<p>-- $permacode failed. Private?</p>";
    }
    // otherwise, you got something, so write some of the data into the page
    else {
        echo "<b>" . $j["guid"] . "</b><blockquote>" . $j["title"] . "<br>" . $origtitle . "<br>" . $j["url"] . "</blockquote>";
    }
}

// finish by noting how many lines have been read
echo "<h2>Read " . count($lines) . "</h2>";

?>




Run this script in a browser and it will create a page with the results. (The script is available at GitHub.)


By the way, and mainly because I keep losing track of this info, the table of code was created by a little service cleverly called Convert JS to Table.


August 1, 2015

Restoring the Network of Bloggers

It's good to have Hoder (Hossein Derakhshan) back. After spending six years in an Iranian jail, his voice is stronger than ever. The changes he sees in the Web he loves are distressingly real.

Hoder was in the cohort of early bloggers who believed that blogs were how people were going to find their voices and themselves on the Web. (I tried to capture some of that feeling in a post a year and a half ago.) Instead, in his great piece in Medium he describes what the Web looks like to someone extremely off-line for six years: endless streams of commercial content.

Some of the decline of blogging was inevitable. This was made apparent by Clay Shirky’s seminal post that showed that the scaling of blogs was causing them to follow a power law distribution: a small head followed by a very long tail.

Blogs could never do what I, and others, hoped they would. When the Web started to become a thing, it was generally assumed that everyone would have a home page that would be their virtual presence on the Internet. But home pages were hard to create back then: you had to know HTML, you had to find a host, you had to be so comfortable with FTP that you’d use it as a verb. Blogs, on the other hand, were incredibly easy. You went to one of the blogging platforms, got yourself a free blog site, and typed into a box. In fact, blogging was so easy that you were expected to do it every day.

And there’s the rub. The early blogging enthusiasts were people who had the time, skill, and desire to write every day. For most people, that hurdle is higher than learning how to FTP. So, blogging did not become everyone’s virtual presence on the Web. Facebook did. Facebook isn’t for writers. Facebook is for people who have friends. That was a better idea.

But bloggers still exist. Some of the early cohort have stopped, or blog infrequently, or have moved to other platforms. Many blogs now exist as part of broader sites. The term itself is frequently applied to professionals writing what we used to call “columns,” which is a shame since part of the importance of blogging was that it was a way for amateurs to have a voice.

That last value is worth preserving. It’d be good to boost the presence of local, individual, independent bloggers.

So, support your local independent blogger! Read what she writes! Link to it! Blog in response to it!

But I wonder if a little social tech might also help. What follows is a half-baked idea. I think of it as BOAB: Blogger of a Blogger.

Yeah, it's a dumb name, and I'm not seriously proposing it. It's an homage to Libby Miller [twitter:LibbyMiller] and Dan Brickley's [twitter:danbri] FOAF — Friend of a Friend — idea, which was both brilliant and well-named. While social networking sites like Facebook maintain a centralized, closed network of people, FOAF enables open, decentralized social networks to emerge. Anyone who wants to participate creates a FOAF file and hosts it on her site. Your FOAF file lists who you consider to be in your social network — your friends, family, colleagues, acquaintances, etc. It can also contain other information, such as your interests. Because FOAF files are typically open, they can be read by any application that wants to provide social networking services. For example, an app could see that Libby's FOAF file lists Dan as a friend, and that Dan's lists Libby, Carla and Pete. And now we're off and running in building a social network in which each person owns her own information in a literal and straightforward sense. (I know I haven't done justice to FOAF, but I hope I haven't been inaccurate in describing it.)
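The crawl an aggregator would do over such files can be sketched with plain data structures. This is a toy illustration, not real FOAF (which is RDF hosted on each person's own site); the names come from the example above:

```python
# FOAF-style data reduced to dicts: each "file" lists whom its owner knows.
foaf_files = {
    "libby": {"knows": ["dan"]},
    "dan":   {"knows": ["libby", "carla", "pete"]},
    "carla": {"knows": []},
    "pete":  {"knows": ["dan"]},
}

def network(start, files):
    """Crawl outward from one person's file, following 'knows' links."""
    seen, frontier = set(), [start]
    while frontier:
        person = frontier.pop()
        if person in seen:
            continue
        seen.add(person)
        frontier.extend(files.get(person, {}).get("knows", []))
    return seen

print(sorted(network("libby", foaf_files)))  # ['carla', 'dan', 'libby', 'pete']
```

Starting from Libby's file alone, the whole four-person network emerges, which is the decentralized point: no central service had to hold the list.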

BOAB would do the same, except it would declare which bloggers I read and recommend, just as the old “blogrolls” did. This would make it easier for blogging aggregators to gather and present networks of bloggers. Add in some tags and now we can browse networks based on topics.

In the modern age, we'd probably want to embed BOAB information in the HTML of a blog rather than in a separate file hidden from human view, although I don't know what the best practice would be. Maybe both. Anyway, I presume that the information embedded in HTML would be similar to what Schema.org does: information about what a page talks about is inserted into the HTML tags using a specified vocabulary. The great advantage of Schema.org is that the major search engines recognize and understand its markup, which means the search engines would be in a position to discover the initial blog networks.

In fact, Schema.org has a blog specification already. I don't see anything like markup for a blogroll, but I'm not very good at reading specifications. In any case, how hard could it be to extend that specification? Mark a link as being to a blogroll pal, and optionally supply some topics? (Dan Brickley works on Schema.org.)

So, imagine a BOAB widget that any blogger can easily populate with links to her favorite blog sites. The widget can then be easily inserted into her blog. Hidden from the users in this widget is the appropriate markup. Not only could the search engines then see the blogger network, so could anyone who wanted to write an app or a service.
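Here's a hypothetical sketch of what that hidden markup might look like, generated as a JSON-LD script tag. The boab: properties are invented for illustration, since schema.org has no blogroll vocabulary; only the schema: names are real:

```python
import json

# Hypothetical BOAB blogroll as JSON-LD. "boab:recommends" and
# "boab:topics" are made-up extension properties, not schema.org terms.
blogroll = {
    "@context": {"schema": "https://schema.org/",
                 "boab": "https://example.org/boab#"},
    "@type": "schema:Blog",
    "schema:name": "Joho the Blog",
    "boab:recommends": [
        {"schema:url": "http://scripting.com", "boab:topics": ["outliners", "open web"]},
        {"schema:url": "https://blogs.harvard.edu/doc", "boab:topics": ["VRM"]},
    ],
}

# The widget would insert something like this into the blog's HTML.
script_tag = '<script type="application/ld+json">\n%s\n</script>' % json.dumps(blogroll, indent=2)
print(script_tag)
```

A crawler or app that finds tags like this on enough blogs could assemble the blogger network programmatically.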

I have 0.02 confidence that I’m getting the tech right here. But enhancing blogrolls so that they are programmatically accessible seems to me to be a good idea. So good that I have 0.98 confidence that it’s already been done, probably 10+ years ago, and probably by Dave Winer :)

Ironically, I cannot find Hoder's personal site; it is down, at least at the moment.

More shamefully than ironically, I haven’t updated this blog’s blogroll in many years.

My recent piece in The Atlantic about whether the Web has been irremediably paved touches on some of the same issues as Hoder’s piece.


February 2, 2015

Future of libraries, Kenya style

This video will remind you, if you happen to have forgotten, what libraries mean to much of the world:

Internet, mesh, people eager to learn, the same people eager to share. A future for libraries.

You can contribute here.


January 20, 2015

Fargo: an open outliner

Dave Winer loves outlines. I do, too, but Dave loves them more. We know this because Dave created the Fargo outliner, and, in the way of software that makes us freer, he's made it available for us to use for free, without ads or spyware, and supporting the standards and protocols that make our ideas interoperable.

Fargo is simple and straightforward. You enter text. You indent lines to create structure. You can reorganize and rearrange as you would like. Type CMD-? or CTL-? for help.

Fargo is a deep product. It is backed by a CMS so you can use it as your primary tool for composing and publishing blog posts. (Dave knows a bit about blogging, after all.) It has workgroup tools. You can execute JavaScript code from it. It understands Markdown. You can use it to do presentations. You can create and edit attributes. You can include other files, so your outlines scale. You can include feeds, so your outlines remain fresh.

Fargo is generative. It supports open standards, and it's designed to make it easy to let what you've written become part of the open Web. It's written in HTML5 and runs in all modern browsers. Your outlines have URLs so other pages can link to them. Fargo files are saved in the OPML standard so other apps can open them. The files are stored in your Dropbox folder, which puts them in the Cloud but also on your personal device; look in Dropbox/Apps/smallpicture/. You can choose to encrypt your files to protect them from spies. The Concord engine that powers Fargo is Open Source.
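OPML, the format Fargo saves in, is a simple XML dialect in which nested outline elements carry the structure. A minimal sketch of the kind of file another app could open (the outline content here is made up):

```python
import xml.etree.ElementTree as ET

# Build a tiny OPML 2.0 document. The element names (opml, head, body,
# outline, the text attribute) come from the OPML standard; the content
# is illustrative.
opml = ET.Element("opml", version="2.0")
head = ET.SubElement(opml, "head")
ET.SubElement(head, "title").text = "Example outline"
body = ET.SubElement(opml, "body")
top = ET.SubElement(body, "outline", text="Fargo is a deep product")
ET.SubElement(top, "outline", text="Backed by a CMS for blogging")
ET.SubElement(top, "outline", text="Understands Markdown")

print(ET.tostring(opml, encoding="unicode"))
```

Because the structure is just nested elements, any app that can parse XML can open, transform, or republish the outline, which is what keeps the format from locking you in.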

Out of the box, Fargo is a heads-down outliner for people who think about what they write in terms of its structure. (I do.) It thus is light on the presentation side: You can’t easily muck about with the styles it uses to present various levels, and there isn’t an embedded way to display graphics, although you can include files that are displayed when the outline is rendered. But because it is a simple product with great depth, you can always go further with it.

And no matter how far you go, you'll never be locked in.


January 14, 2015

Install your own listicle

Dave Winer has made it easy to install your own “listicle”: a Web page that cycles through chunks of text one chunk at a time. For an example, see the listicle Dave created to display Doc and my New Clues clue by clue.

The text comes from a JSON file that you can of course alter. Take a look at the JSON file in a text editor and you’ll figure it out. A couple of things to know:

  • Be sure to end each quote with a comma, except the last one.

  • If your chunks contain any double quotes, put a backslash before them. Otherwise, the JSON will think it’s come to the end of a chunk and it will get confused.

  • Because JSON can be finicky, check what you’ve done at a site like JSON Formatter. (I broke Dave’s New Clues listicle for a while because I neglected to check my file after I added a dropped clue…and forgot to put a comma at the end of the line.)
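A JSON library handles the comma and backslash rules above automatically, which is one way to double-check a hand-edited file. A small Python sketch (the chunk text is made up):

```python
import json

# Hypothetical listicle chunks; the second one contains embedded
# double quotes, which must be backslash-escaped in the JSON file.
chunks = [
    'The Internet is us, connected.',
    'He said "the Web is paved" and walked away.',
]

# json.dumps escapes the embedded quotes and puts commas between
# items (and none after the last one).
text = json.dumps(chunks, indent=2)
print(text)

# Round-tripping through json.loads is the same validity check a
# site like JSON Formatter performs.
assert json.loads(text) == chunks
```

If a hand-edited file fails the json.loads step, the error message usually points near the missing comma or unescaped quote.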

Dave has not only made it easy for people to use his work and make it their own; it's also a good project for learning some coding. And it's a great example of the sort of software-that-makes-us-freer that Dave's urging us to recognize, share, and appreciate.


Software that makes us freer

Dave Winer has a couple of related posts up, one addressed to Doc Searls and me, and the other broadening the point: we need to be doing more to support software that makes us, and the Internet, freer.

Dave's first post addressed Doc and me because Dave not only likes Doc Searls' and my New Clues (and the Gillmor Gang podcast we did on Friday), he wrote a cool app — a "listicle" version of the Clues — and gave us some crucial advice before we posted. Dave's point is that there's software that increases our freedom and there's software that "siphons off and monetizes freedom." People like Dave write software that increases our freedom. People like Doc and me and you ought to be informing one another and the entire ecosystem about the freedom-increasing software we use.

No argument there. I don't blog a lot about specific pieces of software, except for the library software I'd been working on for the past five years (it's free-making software), and to whine. I can do more, but, frankly, if you're reading this blog, you're in a very elite club (and by "elite" I mean "tiny"), so the practical effect will be negligible. Still, I'll try.

I'm more distressed by how difficult it is to find freedom-making software. At the major download sites (note: do not use until you read this) you can restrict your results to "free," but not in Dave's sense…and even then many of the apps are only pretending to be monetarily free. It would help a lot if freedom-making software were a category you could search for, or if there were download sites devoted to aggregating such software. What am I forgetting or don't know about? (Source code sites are too geeky for most people.)

It would be good to come up with a better name than "freedom-making" apps, so that the concept is easier for people to talk about and understand.

Obviously we’d also want to have some criteria. As I understand it, this is software that doesn’t lock you in, doesn’t lock out other apps, and enables what you do with it to become part of the larger Web.

Heck, we might even want a badge. It works for non-GMO food and Fair Trade goods.

I agree with Dave that we all ought to be talking more audibly about the software we use that makes the Web a better place in the ways that matter: by making it richer with openly linkable and re-usable pieces. And I’ll try to do so, starting soon with a review of Dave’s Fargo outliner. It’d be even better to fill in the pieces missing from our infrastructure for supporting the makers who give us more liberty.