-
-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add thumbnail support #76
Conversation
I like idea of having images for each species, that has been on my mind to implement, but I don't want it to be dependent on any online service. Using external resources is often against Terms of Service, and with BirdWeather there is also availability issues. So I am all in to have thumbnails for each species but they must be locally served, and for that we need to find a royalty free source for photos or have them done using generative AI like Midjourney or Dall-E. |
Yes those are some really good points! I am also quite curious about them. Especially the Terms of Service aspect... I suppose using the birdweather API to post things is different from utilizing their images. I can see that birdnet-pi is grabbing their images from flickr so that might be a more viable solution? However, does everything really need to be served locally? That mostly increases the size of the binary, as well as the bandwidth requirements. Sure you get better availability since there is less external dependencies, and have an easier time in air-gapped environments, but is it always necessary? Perhaps some sort of hybrid approach is also a viable option? |
Regarding ToS issues, perhaps wikidata could be a potential source? |
This!
To be honest, I really enjoy the "simple" view right now. Me personally would prefer pictures from an online source, so it's possible to turn this feature off and to save storage. I know how most birds in my area look like- (I get it why pictures are great tho!) - If needed I can provide some Bird-Pictures (photography brought me to birdnet ;) ) Obviously I don't have ALL birds on my camera |
Emailed tim about terms of service etc.. here's what he has to say "We're using images from Wiki Commons - which we credit as provided in their API." |
That is nice! In theory we could then create our own set of pictures from the same source and either bundle in the git repository itself or host somewhere. Though, I would prefer to use birdweather directly since their api providers some really cool features. Not sure how we should move forward on this topic... |
I created an issue as a feature tracker for stuff from birdweather. Probs discuss stuff there and see what other people think too. |
Awesome :) Do you mind linking to the issue? |
|
Makes use of the birdweather graphql api to fetch a given bird's thumbnail.
Pre-fetching the thumbnails during server boot makes it very responsive when the user accesses the webpage for the first time. Since all the birds' thumbnails have already been fetched.
WalkthroughThe recent updates encompass significant enhancements to the system's image handling and caching functionalities. Key additions include methods to fetch and cache bird thumbnails, improved database retrieval functions, and new image API integrations. These updates enhance the user interface by displaying bird images and attribution details in tables, leveraging asynchronous cache initialization for efficient image handling. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Server
participant Datastore
participant WikimediaAPI
User->>Server: Request bird species data
Server->>Datastore: Query for detected species
Datastore-->>Server: Return detected species list
Server->>WikimediaAPI: Request bird thumbnail for species
WikimediaAPI-->>Server: Return bird thumbnail and attribution
Server->>User: Display bird species with thumbnails
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (2)
go.mod
is excluded by!**/*.mod
go.sum
is excluded by!**/*.sum
Files selected for processing (5)
- internal/datastore/interfaces.go (3 hunks)
- internal/httpcontroller/routes.go (4 hunks)
- internal/httpcontroller/utils.go (3 hunks)
- views/fragments/birdsTableHTML.html (1 hunks)
- views/fragments/recentDetections.html (2 hunks)
Files skipped from review due to trivial changes (2)
- views/fragments/birdsTableHTML.html
- views/fragments/recentDetections.html
Additional Context Used
GitHub Check Runs (1)
lint failure (2)
internal/httpcontroller/routes.go: [failure] 47-47:
Error return value is not checked (errcheck)
Additional comments not posted (4)
internal/httpcontroller/routes.go (1)
64-64
: LGTM! Thethumbnail
function is correctly added to the function map.internal/httpcontroller/utils.go (2)
250-268
: LGTM! ThequeryGraphQL
function is well-implemented and handles errors appropriately.
271-299
: LGTM! Thethumbnail
function handles caching and concurrency appropriately.internal/datastore/interfaces.go (1)
315-325
: LGTM! TheGetAllDetectedSpecies
function retrieves all detected species correctly.
internal/httpcontroller/routes.go
Outdated
go func(note string) { | ||
defer wg.Done() | ||
thumbnail(note) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check the error returned by the thumbnail
function.
- thumbnail(note)
+ if _, err := thumbnail(note); err != nil {
+ log.Printf("Error fetching thumbnail for %s: %v", note, err)
+ }
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
go func(note string) { | |
defer wg.Done() | |
thumbnail(note) | |
go func(note string) { | |
defer wg.Done() | |
if _, err := thumbnail(note); err != nil { | |
log.Printf("Error fetching thumbnail for %s: %v", note, err) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- internal/httpcontroller/routes.go (4 hunks)
Files skipped from review as they are similar to previous changes (1)
- internal/httpcontroller/routes.go
Regarding of internet dependency I think it would be OK to have as an option. |
I am sorry but I am not going to accept this PR, I am not okay with consuming any resources from BirdWeather.com without prior agreement with Tim & Co., and even then I don't see any point of accessing resources from them if original source is publicly available at somewhere else. |
I am completely on board with this thought too. Was rebasing the thumbnail branch yesterday and planning on soon replacing the birdweather api calls for wikimedia ones. Though it might make more sense to close this PR and make a new for that? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
go.mod
is excluded by!**/*.mod
Files selected for processing (1)
- internal/httpcontroller/utils.go (2 hunks)
Additional comments not posted (3)
internal/httpcontroller/utils.go (3)
13-13
: Ensure proper usage of thesync
package.The import of the
sync
package is necessary for the new thumbnail caching mechanism. Good to see it added here.
16-16
: Verify the new dependency oncgt.name/pkg/go-mwclient
.Ensure that this package is properly integrated and maintained, as it is crucial for the new functionality.
Verification successful
The new dependency on
cgt.name/pkg/go-mwclient
is properly listed in thego.mod
file and is actively used in the codebase.
- The dependency is listed in
go.mod
as an indirect dependency.- It is imported and utilized in
internal/httpcontroller/utils.go
.Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the dependency on `cgt.name/pkg/go-mwclient` is properly managed. # Test: Check if the dependency is listed in the project's dependency management file. cat go.mod | grep 'cgt.name/pkg/go-mwclient'Length of output: 89
Script:
#!/bin/bash # Description: Verify the usage of `cgt.name/pkg/go-mwclient` in the codebase. # Test: Search for the usage of `cgt.name/pkg/go-mwclient` in the codebase. rg 'cgt.name/pkg/go-mwclient'Length of output: 142
Script:
#!/bin/bash # Description: Verify the usage of `mwclient` in `internal/httpcontroller/utils.go`. # Test: Search for the usage of `mwclient` in `internal/httpcontroller/utils.go`. rg 'mwclient' internal/httpcontroller/utils.goLength of output: 147
244-245
: Use ofsync.Map
for caching.The use of
sync.Map
forthumbnailMap
andthumbnailMutexMap
is appropriate given the concurrent access patterns expected with the thumbnail caching feature.
internal/httpcontroller/utils.go
Outdated
func queryWikiMedia(scientificName string) (string, error) { | ||
w, err := mwclient.New("https://wikipedia.org/w/api.php", "Birdnet-Go") | ||
if err != nil { | ||
return "", err | ||
} | ||
|
||
// Specify parameters to send. | ||
parameters := map[string]string{ | ||
"action": "query", | ||
"prop": "pageimages", | ||
"piprop": "thumbnail", | ||
"pilicense": "free", | ||
"titles": scientificName, | ||
"pithumbsize": "400", | ||
"redirects": "", | ||
} | ||
|
||
// Make the request. | ||
resp, err := w.Get(parameters) | ||
if err != nil { | ||
return "", err | ||
} | ||
|
||
// Print the *jason.Object | ||
pages, err := resp.GetObjectArray("query", "pages") | ||
if err != nil { | ||
return "", err | ||
} | ||
|
||
thumbnail, err := pages[0].GetString("thumbnail", "source") | ||
if err != nil { | ||
return "", err | ||
} | ||
|
||
return string(thumbnail), nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactor queryWikiMedia
function for error handling and readability.
func queryWikiMedia(scientificName string) (string, error) {
w, err := mwclient.New("https://wikipedia.org/w/api.php", "Birdnet-Go")
if err != nil {
return "", fmt.Errorf("failed to create mwclient: %w", err)
}
parameters := map[string]string{
"action": "query",
"prop": "pageimages",
"piprop": "thumbnail",
"pilicense": "free",
"titles": scientificName,
"pithumbsize": "400",
"redirects": "",
}
resp, err := w.Get(parameters)
if err != nil {
return "", fmt.Errorf("failed to get response: %w", err)
}
pages, err := resp.GetObjectArray("query", "pages")
if err != nil {
return "", fmt.Errorf("failed to get pages object array: %w", err)
}
thumbnail, err := pages[0].GetString("thumbnail", "source")
if err != nil {
return "", fmt.Errorf("failed to get thumbnail source: %w", err)
}
return string(thumbnail), nil
}
Refactor the function to improve error messages and overall readability.
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
func queryWikiMedia(scientificName string) (string, error) { | |
w, err := mwclient.New("https://wikipedia.org/w/api.php", "Birdnet-Go") | |
if err != nil { | |
return "", err | |
} | |
// Specify parameters to send. | |
parameters := map[string]string{ | |
"action": "query", | |
"prop": "pageimages", | |
"piprop": "thumbnail", | |
"pilicense": "free", | |
"titles": scientificName, | |
"pithumbsize": "400", | |
"redirects": "", | |
} | |
// Make the request. | |
resp, err := w.Get(parameters) | |
if err != nil { | |
return "", err | |
} | |
// Print the *jason.Object | |
pages, err := resp.GetObjectArray("query", "pages") | |
if err != nil { | |
return "", err | |
} | |
thumbnail, err := pages[0].GetString("thumbnail", "source") | |
if err != nil { | |
return "", err | |
} | |
return string(thumbnail), nil | |
} | |
func queryWikiMedia(scientificName string) (string, error) { | |
w, err := mwclient.New("https://wikipedia.org/w/api.php", "Birdnet-Go") | |
if err != nil { | |
return "", fmt.Errorf("failed to create mwclient: %w", err) | |
} | |
parameters := map[string]string{ | |
"action": "query", | |
"prop": "pageimages", | |
"piprop": "thumbnail", | |
"pilicense": "free", | |
"titles": scientificName, | |
"pithumbsize": "400", | |
"redirects": "", | |
} | |
resp, err := w.Get(parameters) | |
if err != nil { | |
return "", fmt.Errorf("failed to get response: %w", err) | |
} | |
pages, err := resp.GetObjectArray("query", "pages") | |
if err != nil { | |
return "", fmt.Errorf("failed to get pages object array: %w", err) | |
} | |
thumbnail, err := pages[0].GetString("thumbnail", "source") | |
if err != nil { | |
return "", fmt.Errorf("failed to get thumbnail source: %w", err) | |
} | |
return string(thumbnail), nil | |
} |
internal/httpcontroller/utils.go
Outdated
func thumbnail(scientificName string) (string, error) { | ||
// Check if thumbnail is already cached | ||
if thumbnail, ok := thumbnailMap.Load(scientificName); ok { | ||
log.Printf("Bird: %s, Thumbnail (cached): %s\n", scientificName, thumbnail) | ||
return thumbnail.(string), nil | ||
} | ||
|
||
// Use a per-item mutex to ensure only one GraphQL query is performed per item | ||
mu, _ := thumbnailMutexMap.LoadOrStore(scientificName, &sync.Mutex{}) | ||
mutex := mu.(*sync.Mutex) | ||
|
||
mutex.Lock() | ||
defer mutex.Unlock() | ||
|
||
// Check again if thumbnail is cached after acquiring the lock | ||
if thumbnail, ok := thumbnailMap.Load(scientificName); ok { | ||
log.Printf("Bird: %s, Thumbnail (cached): %s\n", scientificName, thumbnail) | ||
return thumbnail.(string), nil | ||
} | ||
|
||
thumbn, err := queryWikiMedia(scientificName) | ||
if err != nil { | ||
log.Printf("error querying wikimedia endpoint: %v", err) | ||
} | ||
|
||
thumbnailMap.Store(scientificName, thumbn) | ||
log.Printf("Bird: %s, Thumbnail (fetched): %s\n", scientificName, thumbn) | ||
return thumbn, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimize thumbnail caching logic.
func thumbnail(scientificName string) (string, error) {
if thumbnail, ok := thumbnailMap.Load(scientificName); ok {
log.Printf("Bird: %s, Thumbnail (cached): %s\n", scientificName, thumbnail)
return thumbnail.(string), nil
}
mu, _ := thumbnailMutexMap.LoadOrStore(scientificName, &sync.Mutex{})
mutex := mu.(*sync.Mutex)
mutex.Lock()
defer mutex.Unlock()
if thumbnail, ok := thumbnailMap.Load(scientificName); ok {
log.Printf("Bird: %s, Thumbnail (cached): %s\n", scientificName, thumbnail)
return thumbnail.(string), nil
}
thumbn, err := queryWikiMedia(scientificName)
if err != nil {
log.Printf("error querying wikimedia endpoint: %v", err)
return "", err
}
thumbnailMap.Store(scientificName, thumbn)
log.Printf("Bird: %s, Thumbnail (fetched): %s\n", scientificName, thumbn)
return thumbn, nil
}
Consider optimizing the thumbnail function to reduce redundancy and improve efficiency. The double-check locking pattern is correctly implemented, but ensure that the error from queryWikiMedia
is properly handled.
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
func thumbnail(scientificName string) (string, error) { | |
// Check if thumbnail is already cached | |
if thumbnail, ok := thumbnailMap.Load(scientificName); ok { | |
log.Printf("Bird: %s, Thumbnail (cached): %s\n", scientificName, thumbnail) | |
return thumbnail.(string), nil | |
} | |
// Use a per-item mutex to ensure only one GraphQL query is performed per item | |
mu, _ := thumbnailMutexMap.LoadOrStore(scientificName, &sync.Mutex{}) | |
mutex := mu.(*sync.Mutex) | |
mutex.Lock() | |
defer mutex.Unlock() | |
// Check again if thumbnail is cached after acquiring the lock | |
if thumbnail, ok := thumbnailMap.Load(scientificName); ok { | |
log.Printf("Bird: %s, Thumbnail (cached): %s\n", scientificName, thumbnail) | |
return thumbnail.(string), nil | |
} | |
thumbn, err := queryWikiMedia(scientificName) | |
if err != nil { | |
log.Printf("error querying wikimedia endpoint: %v", err) | |
} | |
thumbnailMap.Store(scientificName, thumbn) | |
log.Printf("Bird: %s, Thumbnail (fetched): %s\n", scientificName, thumbn) | |
return thumbn, nil | |
} | |
func thumbnail(scientificName string) (string, error) { | |
if thumbnail, ok := thumbnailMap.Load(scientificName); ok { | |
log.Printf("Bird: %s, Thumbnail (cached): %s\n", scientificName, thumbnail) | |
return thumbnail.(string), nil | |
} | |
mu, _ := thumbnailMutexMap.LoadOrStore(scientificName, &sync.Mutex{}) | |
mutex := mu.(*sync.Mutex) | |
mutex.Lock() | |
defer mutex.Unlock() | |
if thumbnail, ok := thumbnailMap.Load(scientificName); ok { | |
log.Printf("Bird: %s, Thumbnail (cached): %s\n", scientificName, thumbnail) | |
return thumbnail.(string), nil | |
} | |
thumbn, err := queryWikiMedia(scientificName) | |
if err != nil { | |
log.Printf("error querying wikimedia endpoint: %v", err) | |
return "", err | |
} | |
thumbnailMap.Store(scientificName, thumbn) | |
log.Printf("Bird: %s, Thumbnail (fetched): %s\n", scientificName, thumbn) | |
return thumbn, nil | |
} |
Ok finally got around to creating some sort of PoC of the wikimedia api. Thanks to @PovilasID for making me aware of the query feature in the wikipedia api in the discussion thread yesterday. Was quite lost on how to get the correct image for a given bird in some automatic way otherwise :) Image proof that it still works as expected: However, I am still not quite sure how we are actually supposed to give the correct credit for the images. I can see that the birdweather graphql api returns these fields along with the image:
I have never used birdweather directly myself, so I don't know how these fields are displayed inside the application itself? Do you have any idea @matthew73210 |
@isZumpo np. I do not think we can really get good way to get author info I think for now you can probably just split the string of the image url on the dash after 300px and replace everything before it with macaulaylibrary.org has en embed view https://macaulaylibrary.org/asset/211672961/embed to get pics from them you need to know image ID... whobird just made 3 60k arrays... to be as lookup table lol. So Sylvia atricapilla_Eurasian Blackcap |
Re credit, probs the best way is to have it mentioned on the github page and under options (if that's where you put the toggle). If memory serves they can be part of the metadata of the image itself. Not sure if this helps. |
On the birdweather website clicking on the 🛈 icon next to the bird thumbnail takes you to a page with information about the species, a larger image and image credit link at the bottom of the page. e.g. https://app.birdweather.com/species/golden-whistler In the birdnet iOS app clicking on the thumbnail takes you to the Wikipedia article. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
go.sum
is excluded by!**/*.sum
Files selected for processing (2)
- go.mod (2 hunks)
- views/fragments/recentDetections.html (2 hunks)
Additional context used
golangci-lint
Additional comments not posted (2)
go.mod (2)
28-28
: The addition ofgithub.com/antonholmquist/jason
as an indirect dependency is appropriate for JSON parsing in API interactions.
27-27
: Ensure the license ofcgt.name/pkg/go-mwclient
is compatible with your project's license.Verification successful
The
cgt.name/pkg/go-mwclient
package is licensed under the Creative Commons CC0 1.0 Universal license, which is a public domain dedication. This license is generally compatible with most other licenses, including open-source licenses.
- Ensure that the CC0 1.0 Universal license is compatible with your project's license.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check the license of the newly added dependency. # Test: Fetch the license file from the repository. curl -s https://raw.githubusercontent.com/cgt/go-mwclient/master/LICENSELength of output: 89
Script:
#!/bin/bash # Description: Search for any license-related files in the repository. # Test: List all files in the repository and filter for license-related files. curl -s https://api.github.com/repos/cgt/go-mwclient/contents | jq -r '.[] | select(.type == "file") | .name' | grep -i 'license\|copying\|readme'Length of output: 166
Script:
#!/bin/bash # Description: Fetch the contents of the COPYING file to determine the license type. # Test: Fetch the COPYING file from the repository. curl -s https://raw.githubusercontent.com/cgt/go-mwclient/master/COPYINGLength of output: 7122
@@ -24,6 +24,8 @@ | |||
hx-trigger="click" | |||
hx-push-url="true"> | |||
{{ .CommonName}}</a> | |||
<!-- Bird thumbnail --> | |||
<img loading="lazy" width="40" src="{{thumbnail .ScientificName}}" alt="Bird Image" class="h-auto rounded-md"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a title
attribute to the <img>
tags for better accessibility and to provide additional context on hover.
- <img loading="lazy" width="40" src="{{thumbnail .ScientificName}}" alt="Bird Image" class="h-auto rounded-md">
+ <img loading="lazy" width="40" src="{{thumbnail .ScientificName}}" alt="Bird Image" title="{{.CommonName}}" class="h-auto rounded-md">
Also applies to: 71-71
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
<img loading="lazy" width="40" src="{{thumbnail .ScientificName}}" alt="Bird Image" class="h-auto rounded-md"> | |
<img loading="lazy" width="40" src="{{thumbnail .ScientificName}}" alt="Bird Image" title="{{.CommonName}}" class="h-auto rounded-md"> |
Ok that looks great! Did some work today and managed to parse out the owner of the image and license itself: will have to do some more code improvement and management of edge cases before I update the pr :) |
looks good |
Finally managed to get something kind of ready. Let me tell you, getting the author information out of Wikipedia is very tricky since everyone is using different formats for the author block. Some just have their names, others have URLs, and then there are even some crazy people with tables and lots of links... I believe what I have now covers most of the different scenarios quite well! I am a bit unsure about the daily summary view since the credits do not really fit into the table itself. Any suggestions on how we could format it in a nicer way there? Anyways for those of you who want to try it out, feel free to either build this PR yourself. Or grab the image itself from my fork:
|
Is this ready for merge? We can tweak things further later |
Yeah, you are right let's not try to get everything perfect on the first try. There are observed issues for certain birds where Wikipedia does not return any "machine-readable" data. I believe that should be fixable without too much trouble in a future PR. Furthermore, it does not cause much more trouble than a missing image. Since there are mentions of birdweather etc in the previous commits of this PR. Perhaps it is best to merge it by squashing? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
Outside diff range and nitpick comments (3)
views/fragments/birdsTableHTML.html (1)
Line range hint
8-36
: The integration of the thumbnail column in the HTML table is well done, with appropriate attributes for lazy loading and responsive design. However, there's a TODO comment about adding credits back in. It's important to address this and ensure that proper attribution is displayed for the images.internal/httpcontroller/init.go (2)
8-8
: Consider using a more specific import alias forsync
if it's only used in limited contexts, to improve readability.
36-36
: The addition ofBirdImageCache
to theServer
struct is a significant architectural change. Ensure that its lifecycle is properly managed across server operations.
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
go.sum
is excluded by!**/*.sum
Files selected for processing (9)
- go.mod (2 hunks)
- internal/httpcontroller/imageprovider/imageprovider.go (1 hunks)
- internal/httpcontroller/imageprovider/imageprovider_test.go (1 hunks)
- internal/httpcontroller/imageprovider/wikipedia.go (1 hunks)
- internal/httpcontroller/init.go (4 hunks)
- internal/httpcontroller/routes.go (2 hunks)
- internal/httpcontroller/utils.go (2 hunks)
- views/fragments/birdsTableHTML.html (2 hunks)
- views/fragments/recentDetections.html (3 hunks)
Files skipped from review as they are similar to previous changes (2)
- go.mod
- views/fragments/recentDetections.html
Additional comments not posted (7)
internal/httpcontroller/imageprovider/imageprovider_test.go (1)
16-54
: The test implementation forTestBirdImageCache
is well-structured and effectively checks the caching behavior. Good job ensuring that the cache avoids unnecessary fetches and handles duplicate requests correctly.internal/httpcontroller/routes.go (1)
37-47
: The updates tofuncMap
in theinitRoutes
method are well implemented, correctly integrating thethumbnail
andthumbnailAttribution
functions for use in templates. This is a crucial step for enabling dynamic content rendering based on image data.internal/httpcontroller/init.go (2)
16-16
: Importing theimageprovider
package is crucial for the new functionality. Ensure that it is well integrated and tested.
109-109
: Initializing the bird image cache in a separate goroutine is a good practice for non-blocking operations. Ensure that all potential race conditions are handled.internal/httpcontroller/utils.go (2)
242-250
: Thethumbnail
method is straightforward and correctly retrieves the image URL from the cache. Ensure that empty strings are handled appropriately in the calling context.
252-273
: ThethumbnailAttribution
method is well-implemented with checks for missing author or license information. Consider adding more detailed logging for debugging purposes.
[REFACTOR_SUGGESTion]- log.Printf("Error getting thumbnail info for %s: %v", scientificName, err) + log.Printf("Error getting thumbnail attribution for %s: %v", scientificName, err)internal/httpcontroller/imageprovider/wikipedia.go (1)
25-33
: The constructor forwikiMediaProvider
is correctly setting up the client. Ensure that the user-agent string "Birdnet-Go" is appropriate and accepted by Wikipedia's API.
cache, err := imageprovider.CreateDefaultCache() | ||
if err != nil { | ||
log.Fatal(err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error handling during cache initialization is critical. Consider logging the error for better diagnostics before terminating the application.
- log.Fatal(err)
+ log.Fatalf("Failed to create default cache: %v", err)
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
cache, err := imageprovider.CreateDefaultCache() | |
if err != nil { | |
log.Fatal(err) | |
} | |
cache, err := imageprovider.CreateDefaultCache() | |
if err != nil { | |
log.Fatalf("Failed to create default cache: %v", err) | |
} |
func (s *Server) initBirdImageCache() { | ||
speciesList, err := s.ds.GetAllDetectedSpecies() | ||
if err != nil { | ||
s.Echo.Logger.Fatal(err) | ||
} | ||
|
||
var wg sync.WaitGroup | ||
|
||
for _, species := range speciesList { | ||
wg.Add(1) | ||
|
||
go func(speciesName string) { | ||
defer wg.Done() | ||
_, err := s.BirdImageCache.Get(speciesName) | ||
if err != nil { | ||
s.Echo.Logger.Error(err) | ||
} | ||
}(species.ScientificName) | ||
} | ||
|
||
wg.Wait() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method initBirdImageCache
efficiently initializes the cache with concurrency. However, consider handling errors more gracefully instead of just logging them, to avoid silent failures in cache initialization.
- s.Echo.Logger.Error(err)
+ if err != nil {
+ s.Echo.Logger.Errorf("Failed to initialize cache for species %s: %v", speciesName, err)
+ return
+ }
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
func (s *Server) initBirdImageCache() { | |
speciesList, err := s.ds.GetAllDetectedSpecies() | |
if err != nil { | |
s.Echo.Logger.Fatal(err) | |
} | |
var wg sync.WaitGroup | |
for _, species := range speciesList { | |
wg.Add(1) | |
go func(speciesName string) { | |
defer wg.Done() | |
_, err := s.BirdImageCache.Get(speciesName) | |
if err != nil { | |
s.Echo.Logger.Error(err) | |
} | |
}(species.ScientificName) | |
} | |
wg.Wait() | |
} | |
func (s *Server) initBirdImageCache() { | |
speciesList, err := s.ds.GetAllDetectedSpecies() | |
if err != nil { | |
s.Echo.Logger.Fatal(err) | |
} | |
var wg sync.WaitGroup | |
for _, species := range speciesList { | |
wg.Add(1) | |
go func(speciesName string) { | |
defer wg.Done() | |
_, err := s.BirdImageCache.Get(speciesName) | |
if err != nil { | |
s.Echo.Logger.Errorf("Failed to initialize cache for species %s: %v", speciesName, err) | |
return | |
} | |
}(species.ScientificName) | |
} | |
wg.Wait() | |
} |
// fetch retrieves the bird image for a given scientific name. | ||
func (l *wikiMediaProvider) fetch(scientificName string) (BirdImage, error) { | ||
// Query for the thumbnail image URL and source file name | ||
thumbnailUrl, thumbnailSourceFile, err := l.queryThumbnail(scientificName) | ||
if err != nil { | ||
return BirdImage{}, fmt.Errorf("failed to query thumbnail of bird: %s : %w", scientificName, err) | ||
} | ||
|
||
// Query for the image author information | ||
authorInfo, err := l.queryAuthorInfo(thumbnailSourceFile) | ||
if err != nil { | ||
return BirdImage{}, fmt.Errorf("failed to query thumbnail credit of bird: %s : %w", scientificName, err) | ||
} | ||
|
||
// Return the bird image struct with the image URL and author information | ||
return BirdImage{ | ||
Url: thumbnailUrl, | ||
AuthorName: authorInfo.name, | ||
AuthorUrl: authorInfo.url, | ||
LicenseName: authorInfo.licenseName, | ||
LicenseUrl: authorInfo.licenseUrl, | ||
}, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fetch
method is a key part of this implementation, integrating thumbnail and author queries. Consider adding retries or fallback mechanisms in case of API failures.
- return BirdImage{}, fmt.Errorf("failed to query thumbnail of bird: %s : %w", scientificName, err)
+ // Implement retry logic here
Committable suggestion was skipped due to low confidence.
// extractArtistInfo tries to extract the author information as best as possible | ||
// from the given input which may consist of nested html tags. | ||
func extractArtistInfo(htmlStr string) (href, text string, err error) { | ||
// Parse the HTML string into an HTML document | ||
doc, err := html.Parse(strings.NewReader(htmlStr)) | ||
if err != nil { | ||
return "", "", err | ||
} | ||
|
||
// Find all the links in the document | ||
links := findLinks(doc) | ||
|
||
// If no links are found, extract the inner text and return it | ||
if len(links) == 0 { | ||
return "", html2text.HTML2Text(htmlStr), nil | ||
} | ||
|
||
// If there is only one link, extract the href and inner text and return them | ||
if len(links) == 1 { | ||
link := links[0] | ||
href = extractHref(link) | ||
text = extractText(link) | ||
return href, text, nil | ||
} | ||
|
||
// Look for a Wikipedia user link and extract the href and inner text | ||
wikipediaUserLinks := findWikipediaUserLinks(links) | ||
|
||
if len(wikipediaUserLinks) == 0 { | ||
return "", "", fmt.Errorf("failed to extract link from HTML: %s", htmlStr) | ||
} | ||
|
||
if len(wikipediaUserLinks) == 1 { | ||
// Return the href and inner text of the Wikipedia user link | ||
wikipediaLink := wikipediaUserLinks[0] | ||
href = extractHref(wikipediaLink) | ||
text = extractText(wikipediaLink) | ||
return href, text, nil | ||
} | ||
|
||
// Check if all the links have the same href value | ||
firstHref := extractHref(wikipediaUserLinks[0]) | ||
allSameHref := true | ||
for _, link := range wikipediaUserLinks[1:] { | ||
if extractHref(link) != firstHref { | ||
allSameHref = false | ||
break | ||
} | ||
} | ||
|
||
if allSameHref { | ||
// Return the href and inner text of the first Wikipedia user link | ||
wikipediaLink := wikipediaUserLinks[0] | ||
href = extractHref(wikipediaLink) | ||
text = extractText(wikipediaLink) | ||
return href, text, nil | ||
} | ||
|
||
return "", "", fmt.Errorf("multiple Wikipedia user links found in HTML: %s", htmlStr) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The extractArtistInfo
function is complex and handles multiple scenarios. Refactor this to simplify the logic and improve maintainability.
- if len(links) == 0 {
+ if len(links) <= 1 {
+ // Simplify handling for one or no links
}
Committable suggestion was skipped due to low confidence.
type BirdImageCache struct { | ||
dataMap sync.Map | ||
dataMutexMap sync.Map | ||
birdImageProvider ImageProvider | ||
nonBirdImageProvider ImageProvider | ||
} | ||
|
||
type emptyImageProvider struct { | ||
} | ||
|
||
func (l *emptyImageProvider) fetch(scientificName string) (BirdImage, error) { | ||
return BirdImage{}, nil | ||
} | ||
|
||
func initCache(e ImageProvider) *BirdImageCache { | ||
return &BirdImageCache{ | ||
birdImageProvider: e, | ||
nonBirdImageProvider: &emptyImageProvider{}, // TODO: Use a real image provider for non-birds | ||
} | ||
} | ||
|
||
func CreateDefaultCache() (*BirdImageCache, error) { | ||
provider, err := NewWikiMediaProvider() | ||
if err != nil { | ||
return nil, err | ||
} | ||
return initCache(provider), nil | ||
} | ||
|
||
func (c *BirdImageCache) Get(scientificName string) (info BirdImage, err error) { | ||
// Check if the bird image is already in the cache | ||
birdImage, ok := c.dataMap.Load(scientificName) | ||
if ok { | ||
return birdImage.(BirdImage), nil | ||
} | ||
|
||
// Use a per-item mutex to ensure only one query is performed per item | ||
mu, _ := c.dataMutexMap.LoadOrStore(scientificName, &sync.Mutex{}) | ||
mutex := mu.(*sync.Mutex) | ||
|
||
mutex.Lock() | ||
defer mutex.Unlock() | ||
|
||
// Check again if bird image is cached after acquiring the lock | ||
birdImage, ok = c.dataMap.Load(scientificName) | ||
if ok { | ||
return birdImage.(BirdImage), nil | ||
} | ||
|
||
// Fetch the bird image from the image provider | ||
fetchedBirdImage, err := c.fetch(scientificName) | ||
if err != nil { | ||
// TODO for now store a empty result in the cache to avoid future queries that would fail. | ||
// In the future, look at the error and decide if it was caused by networking and is recoverable. | ||
// And if it was, do not store the empty result in the cache. | ||
c.dataMap.Store(scientificName, BirdImage{}) | ||
return BirdImage{}, err | ||
} | ||
|
||
// Store the fetched image information in the cache | ||
c.dataMap.Store(scientificName, fetchedBirdImage) | ||
|
||
return fetchedBirdImage, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of BirdImageCache
uses modern concurrency patterns effectively with sync.Map
and per-item mutexes. However, consider improving the error handling in the fetch
method to differentiate between recoverable and non-recoverable errors before deciding to cache an empty result.
func (c *BirdImageCache) fetch(scientificName string) (info BirdImage, err error) { | ||
var imageProviderToUse ImageProvider | ||
|
||
// Determine the image provider based on the scientific name | ||
if slices.Contains([]string{ | ||
"Dog", | ||
"Engine", | ||
"Environmental", | ||
"Fireworks", | ||
"Gun", | ||
"Human non-vocal", | ||
"Human vocal", | ||
"Human whistle", | ||
"Noise", | ||
"Power tools", | ||
"Siren", | ||
}, scientificName) { | ||
imageProviderToUse = c.nonBirdImageProvider | ||
} else { | ||
imageProviderToUse = c.birdImageProvider | ||
} | ||
|
||
// Fetch the image from the image provider | ||
return imageProviderToUse.fetch(scientificName) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fetch
method's logic for determining the image provider based on the scientific name is a smart design choice. Consider adding more comments to explain why certain categories use the non-bird image provider, enhancing the maintainability and understandability of the code.
"A picture is worth a thousand words"
I though it could be nice to have some images of the detected birds since I am really bad at bird names... I saw that the BirdWeather GraphQL API offers endpoints to receive thumbnail for a given bird. In this pull request, I have started on a very crude implementation for fetching these thumbnails and adding them in some of the html views.
Daily summary view:
Recent detections view:
Of course the images could be placed better inside the html view, I just did not spend too much time on it.
In addition to only fetching the urls, a in memory caching mechanism is utilized. The cache saves the result for each bird after each request, such that the next request for that bird will be received from the cache instead. In addition, during startup, the cache will be initialized with every detected bird in the database.
I am a bit unsure of where to place these functions. Maybe it would be better to put these inside the birdweather internal/birdweather folder instead? Please give me some advice, and I will do my best to update the pull request accordingly.