cross-posted from: https://discuss.tchncs.de/post/48831918

I have multiple .opml files (XML files that list the RSS feeds I subscribe to) with different contents on different devices. How can I merge them without duplicates? Are there tools for this?

  • xsukax@lemmy.zip
    2 days ago

    If you’re on Linux/macOS, you can do this with xmllint or with a short plain-Python script that uses only the standard library:

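    import xml.etree.ElementTree as ET

    files = ["feeds1.opml", "feeds2.opml", "feeds3.opml"]
    base = ET.parse(files[0]).getroot()
    body = base.find("body")

    # Seed the set with the feed URLs already in the first file *before* merging,
    # so feeds from later files that duplicate them are skipped.
    seen = {o.get("xmlUrl") for o in base.iter("outline") if o.get("xmlUrl")}

    # Append every feed not seen yet. Folder outlines (no xmlUrl) are skipped,
    # so feeds from later files land flat at the top level of <body>.
    for f in files[1:]:
        for outline in ET.parse(f).iter("outline"):
            url = outline.get("xmlUrl")
            if url and url not in seen:
                seen.add(url)
                body.append(outline)

    ET.ElementTree(base).write("merged.opml", encoding="utf-8", xml_declaration=True)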
    

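    Run it and you get merged.opml. Deduplication is by exact xmlUrl match, so two URLs that differ only in a trailing slash or in http vs. https count as different feeds.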
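If near-identical URLs are a concern, the same merge can compare a normalized form of each xmlUrl instead of the raw string. The following is only a sketch: `normalize` and `merge_opml` are hypothetical helper names, and the normalization rule (lowercase scheme and host, drop a trailing slash and any fragment) is an assumption that may be too loose or too strict for some feeds.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Assumed rule: lowercase scheme/host, drop trailing slash and fragment."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def merge_opml(roots):
    """Merge parsed OPML roots into the first one, skipping feeds already seen."""
    base = roots[0]
    body = base.find("body")
    # Seed with the feeds already present in the first file.
    seen = {normalize(o.get("xmlUrl")) for o in base.iter("outline") if o.get("xmlUrl")}
    for root in roots[1:]:
        for outline in root.iter("outline"):
            url = outline.get("xmlUrl")
            if url and normalize(url) not in seen:
                seen.add(normalize(url))
                body.append(outline)  # the original xmlUrl is kept unchanged
    return base
```

With files on disk, this could be called as `merge_opml([ET.parse(f).getroot() for f in files])` and the result written out the same way as above.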

  • kevincox@lemmy.ml
    4 months ago

    The “dumb” solution is to just import both into one feed reader then export a new OPML. I assume most readers will deduplicate (at least to a basic degree) on import.
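Whether a given reader really deduplicates on import can be checked on the exported file. A minimal sketch, assuming the export is ordinary OPML with feeds stored as `outline` elements carrying an `xmlUrl` attribute; `duplicate_feeds` is a hypothetical helper name:

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_feeds(root):
    """Return {xmlUrl: count} for every feed URL that appears more than once."""
    counts = Counter(o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl"))
    return {url: n for url, n in counts.items() if n > 1}
```

For example, `duplicate_feeds(ET.parse("export.opml").getroot())` returns an empty dict when the reader removed all exact duplicates (`export.opml` is a placeholder file name).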