John Colagioia

Hi, I work on a variety of things, most of which I talk about more on my blog than on social media. Here, you’ll probably find me mostly talking about Free Culture works and sometimes technology.

  • 1 Post
  • 24 Comments
Joined 2 years ago
Cake day: July 23rd, 2023

  • I’ve been using different versions of SearX for a long while (sometimes on my server, sometimes through a provider like Disroot) as my standard search engine, since I’ve never had great luck with the big names. It’s decent, but between upstream provider quota limits and just the fact that it relies on corporate search APIs at all, the quality sometimes craters.

    Since I haven’t had the energy to run YaCy on my own, and public instances tend not to have a long life, I don’t have nearly as much experience with it. When I have gotten to try it out, the search itself looked great, but the index generally wasn’t as broad or current. Long-term, though, it (and its protocol) is probably going to be the way to go, if only because a company can’t randomly tank it like they can with the meta-search systems or their own interfaces.

    Looking at Presearch for the first time now, the search results look almost surprisingly good if poorly sorted, but the fact that I now know orders of magnitude more about their finances and their cryptocurrency token than what and how the thing actually searches makes me worry a bit about its future.


  • I believe that YouTube supports RSS. I haven’t used it in years, but gPodder allowed subscribing to channels.

    Ah, yeah. From this post:

    • Go to the YouTube channel page.
    • Click more for the About box.
    • Scroll down to click Share channel. Choose Copy channel ID.
    • Get the feed from https://www.youtube.com/feeds/videos.xml?channel_id= plus that channel ID from the previous step.

    From there, something (like a podcast client) needs to grab the video.
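A minimal sketch of that last step, assuming the feed comes back as standard Atom XML. The function names and the sample feed text are made up for illustration, and the channel ID is a placeholder, not a real channel.

```python
import xml.etree.ElementTree as ET

# Feed address from the steps above; the channel ID gets appended to it.
FEED_BASE = "https://www.youtube.com/feeds/videos.xml?channel_id="
# Standard Atom namespace (assumed to be what the feed uses).
ATOM = "{http://www.w3.org/2005/Atom}"


def feed_url(channel_id: str) -> str:
    """Build the feed address from a copied channel ID."""
    return FEED_BASE + channel_id.strip()


def list_entries(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for each Atom <entry> in the feed text."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title", default="")
        link = entry.find(f"{ATOM}link")
        href = link.get("href", "") if link is not None else ""
        entries.append((title, href))
    return entries


# Hypothetical, hand-written feed text standing in for a real download.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example channel</title>
  <entry>
    <title>First video</title>
    <link rel="alternate" href="https://www.youtube.com/watch?v=EXAMPLE1"/>
  </entry>
</feed>"""

if __name__ == "__main__":
    print(feed_url("CHANNEL_ID_PLACEHOLDER"))
    for title, href in list_entries(SAMPLE_FEED):
        print(title, href)
```

In practice, the feed text would come from fetching `feed_url(...)` (for example with `urllib.request.urlopen`), and the links would still need to be handed to whatever tool grabs the video.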

    Otherwise, I’ve been using Tartube to download to my media server, which is not great but fine, except for needing to delete the lock file when it (or the computer) crashes, and the fact that the media server hasn’t the foggiest idea of how to organize the “episodes.”




  • I’d say to ignore the platform licensing and just make sure that the license appears in the media itself (which it should, anyway, in case anybody finds it randomly) and is marked in descriptions.

    YouTube seems interesting, because there’s so much garbage listed as CC-BY that almost certainly doesn’t have any legitimate permission for it, and I’ve never found actual Creative Commons content through that route, so that probably informs my “just ignore it” thinking…




  • Always good to see more effort to surface these things. A couple of possible enhancements come to mind.

    • Pepper & Carrot probably belongs under comics, and/or comics belongs as a subset of fiction.
    • It’d be great to filter by license, maybe similar to what Openverse (which you already have listed) does. I know that Creative Commons doesn’t see a problem with incompatible licenses, but I feel like people in the space have strong feelings about how “free/libre” it is to say that something can’t be used commercially (whatever that means) or can’t be altered.
    • If you want a pile of fiction of various sorts, at the risk of self-promoting, I spotlight (and ideally have discussions around) Free Culture works on Saturdays. https://john.colagioia.net/blog/tag/bookclub/ (And a bunch of the links actually lead to collections.)
    • Another pile, you’ll need to figure out how to sift through on your own (I haven’t had the time to figure out how to parse it), but Chris “Sanglorian” Sakkas posted the (I imagine) final backup of his Free and Open Works wiki, sort of your predecessor project. (Edit: I stupidly forgot the link https://archive.org/details/freeand-open-works-20200811084450)
    • Too much manual labor, I realize, especially as the list expands, but ideally, it’d be nice to have some idea of what lives at the other end of a link beyond the format. The videos especially could plausibly be anything…

    Thanks for getting this rolling!





  • For clarity, your first interaction with me was to accuse me of lying. I have twice asked you to leave me out of your fantasies. And yet, you’re still here telling me that I’ve done something dishonest by looking at the FSF and having an opinion. I’ve been polite. I have not attacked you. You’ve been insulting and taken everything personally.

    Stop projecting your immaturity onto me. Stop imagining that you’re going to win my approval or respect. Stop imagining that my insistence that you stop bothering me is an attempt to have a conversation with you. And above all, go away, as I’ve requested three times.


  • Look, if you want to claim that “linguistic purism” doesn’t mean “overly precise,” that’s your problem. If you want to support someone who “underestimates people’s feelings” (a.k.a. “a creep”), that’s your problem. If you want to believe that, any day now, a group that has fallen on its face for decades will finally work out its issues, that’s your problem. As I’ve asked, please stop trying to make it my problem. You’ve made your point that you’re a true believer, now walk away, because you’re only going to convince me that you’re a terrible person, from here.


  • Hate to be the bearer of bad news, but I actually summarized a section of the hilariously reactionary open letter in support of Stallman.

    He is usually more focused on the philosophical underpinnings, and pursuing the objective truth and linguistic purism, while underemphasising people’s feelings on matters he’s commenting on. This makes his arguments vulnerable to misunderstanding and misrepresentation…

    People genuinely signed onto “objective truth” and “linguistic purism” making him “vulnerable to misunderstanding.” If strawmen happen to stand among his most vocal supporters, that’s not remotely my problem.

    But no, “there’s an AGPL that you can hunt for, and maybe someday they’ll have an opinion on machine learning” isn’t a counter-argument, to me. Those make my point for me, that they’ve never really cared about anything until it was far too late. I’m not going to tell you not to support them, but I’ll thank you for not telling me that I’m wrong for using their behavior and that of their supporters to assess them.


  • It’s not just the personalities, annoyingly. Even if supporters didn’t need to support Stallman with absurd statements like “he’s just too precise with his words for you to understand him,” the FSF still spent the '90s loudly dismissing people asking straightforward questions about what would happen if someone put GPL’d software onto an appliance or behind a web server. They mostly ignore anything that isn’t code. They’ve never looked at the future or how to convince people of their message. So, while I’ve donated to them in the past, I don’t really see them as relevant anymore. Putting Stallman back on the board with their “we miss him” press release also made it clear that they don’t see themselves as much more than his personal entourage, which even if he were the nicest, most progressive person in the world, would disqualify them as useful.

    Is the Conservancy a replacement? I don’t know, because I don’t know if I can see their missions as overlapping enough to do so. It’s been a decade since Kuhn (not to pick on him) has so much as mentioned Copyleft-Next, for example, and that repository hasn’t budged in seven years.

    Honestly, what I think that I’d really like to see is more of a grass-roots organization, where we’re not constantly waiting for “leaders” to show up. Especially since software has largely shifted to (on the ground) management through distributed systems and issue-tracking, it seems silly to keep imagining the Free Software movement as centralized.



  • I’m obviously a nobody in the field, but (since the mid-'90s) I’ve always seen two issues, here.

    First, there’s the general problem of not thinking of a public license as a kind of contract, where you get, because you give. If the community started thinking about copyleft or similar ideas as payment or contractual obligations for the use of the software, rather than “restrictions” or “a virus,” we’d probably see a rapid change in behavior, in enforcement or license-writing, if not compliance.

    …Except that, second, there’s the specific problem that the FSF has always ignored warnings until it was too late and a company went out of their way to offend them. For as long as I’ve watched the Free Software community, people asked what would happen if you embedded gcc on an appliance or served it up on the web, and those people essentially got a response of “the big guy only cares about code that he runs locally, so it’s fine.” And then a company did the thing, forcing them to gasp at the entirely-foreseen problem and issue a new license a decade too late to help.

    Which I guess is all to say that Perens is probably on the right track, and I wish that we had more people looking at these problems more frequently. I still vaguely remember the Copyleft-Next project, but that (clearly) went nowhere.


  • I keep saying “no” to this sort of thing, for a variety of reasons.

    1. “You can use this code for anything you want as long as you don’t work in a field that I don’t like” is pretty much the opposite of the spirit of the GPL.
    2. The enormous companies slurping up all content available on the Internet do not care about copyright. The GPL already forbids adapting and redistributing code without licensing under the GPL, and they’re not doing that. So another clause that says “hey, if you’re training an AI, leave me out” is wasted text that nobody is going to read.
    3. Making “AI” an issue instead of “big corporate abuse” means that academics and hobbyists can’t legally train a language model on your code, even if they would otherwise comply with the license.
    4. The FSF has never cared about anything unless Stallman personally cared about it on his personal computer, and they’ve recently proven that he matters to them more than the community, so we probably shouldn’t ever expect a new GPL.
    5. The GPL has so many problems, all stemming from one person’s personal focuses, which they either don’t care about or isolate in random silos (like the AGPL, as if the web is still a fringe thing), that AI barely seems relevant.

    I mean, I get it. The language-model people are exhausting, and their disinterest in copyright law is unpleasant. But asking an organization that doesn’t care to add restrictions to a license that the companies don’t read isn’t going to solve the problem.


  • In addition to YaCy and the varieties of SearX (both of which perform better for me than any of the commercial search engines), it’s not even out of the question to do this yourself, if you’re willing to start with the most recent Common Crawl dump and do some spidering in between releases. I don’t recommend it, unless you want to learn for yourself why search engines often give such miserable results, but it’s possible.

    However, that’s the issue, here. Can you self-host a search engine? Sure, if you want to maintain the storage to back it. That depends on how deep your pockets go…
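For a rough sense of where the storage goes, here is a toy sketch of the core structure that a self-hosted engine would need to keep: an inverted index mapping each word to the pages that contain it. Everything here is hypothetical (the function names, the sample pages, the in-memory dictionary), and it ignores ranking, crawling, and on-disk storage entirely, so treat it as an illustration rather than a design.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map every word to the set of page URLs whose text contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages standing in for crawled text.
SAMPLE_PAGES = {
    "https://example.org/a": "Free Culture comics",
    "https://example.org/b": "Free software licenses",
}

if __name__ == "__main__":
    index = build_index(SAMPLE_PAGES)
    print(sorted(search(index, "free comics")))
```

Every distinct word on every crawled page gets an entry, which is why the index grows along with the crawl and why the storage question ends up dominating.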