Attempt to uniquify feed items with duplicate URLS (#120)
This pull-request closes #111 by attempting to make feed items which share a URL unique, appending a "#" followed by the item's GUID to the link.

This will fail if the link already contains a "#" fragment, but that risk is worthwhile given the nature of the feeds that will be affected.
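The change itself is small; below is a minimal, standalone sketch of the approach, assuming a simplified Item struct with Link and GUID fields (the real change operates on the parsed feed items inside processFeed, as shown in the diff further down):

package main

import "fmt"

// Item is a simplified stand-in for a feed entry; the real code
// works on the parsed feed items inside processor/processor.go.
type Item struct {
	Link string
	GUID string
}

func main() {
	items := []*Item{
		{Link: "https://example.com/post", GUID: "guid-1"},
		{Link: "https://example.com/post", GUID: "guid-2"},
		{Link: "https://example.com/other", GUID: "guid-3"},
	}

	// First pass: count each link, noting whether any appears twice.
	seen := make(map[string]int)
	dupes := false
	for _, it := range items {
		if seen[it.Link] > 0 {
			dupes = true
		}
		seen[it.Link]++
	}

	// Second pass: if any duplicates exist, append "#" plus the GUID
	// so every link becomes distinct.
	for _, it := range items {
		if dupes {
			it.Link += "#" + it.GUID
		}
		fmt.Println(it.Link)
	}
}

As the description notes, a link which already contains a fragment would simply gain a second "#"; the commit accepts that trade-off.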
skx authored Dec 15, 2023
1 parent a76100a commit 071ad1c
Showing 1 changed file with 31 additions and 0 deletions.
31 changes: 31 additions & 0 deletions processor/processor.go
@@ -256,9 +256,40 @@ func (p *Processor) processFeed(entry configfile.Feed, recipients []string) error
// Keep track of all the items in the feed.
items := []string{}

//
// Issue #111 reported an example feed which
// contained duplicate URLs
//
// We can look over the links in the feed, before
// we do anything else, and look to see if we have
// duplicates
//
// Do we have dupes?
//
dupes := false

//
// Temporary map
//
seenDupes := make(map[string]int)
for _, str := range feed.Items {
if seenDupes[str.Link] > 0 {
dupes = true
}

seenDupes[str.Link]++
}

// For each entry in the feed ..
for _, xp := range feed.Items {

// If the feed contains duplicate entries
// then we try to uniquify them.
if dupes {
xp.Link += "#"
xp.Link += xp.GUID
}

// Wrap the feed-item in a class of our own,
// so that we can get access to the content easily.
//
