-
Notifications
You must be signed in to change notification settings - Fork 199
Tags and deps #551
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tags and deps #551
Conversation
Cateories are parsed into tags. Helper functions to allow for tags to be proposed and accepted set up Cleaned up the interface
Added Author and Maintainer fields to PackageItem jQuery DataTables used for the page Added votes as a queryable parameter UI changes to propose/accept tags
A clickable UI for accepting/rejecting tags Added get method to /package/:pkgname/tags shifted forms to a separate template
When a trustee aliases Tag abcd -> Tag abc, all packages that were earlier tagged `abcd` get tagged to `abc` and any new packages tagged `abcd` get retagged `abc` on upload
Added backup functions for tag alias Fixed tag aliasing bug on startup
Limit revdeps to keep track of latest 5 versions Represent all the dep stats as ReverseCount HTML views for Revdeps by PackageID Popularity Metric based on RevDeps/Votes/Downloads Tag pages with packages by popularity
Add maintainers link to dataTables interface
Fixes category parsing error. "a,b," now parsed to ["a","b"] as opposed to ["a","b",""] Minor UI fixes Fixes tag-edit.html error (caused by rebase)
multiple maintainers (if present) show up in the search tables as opposed to just the latest uploader
adaptive doubling for rev deps graph
reverse dependencies and voting don't redirect spacing and UI fixes as pointed
A simple related packages view (currently based on bm25f results)
A visualization of the package database
Edited the output of /packages/search to use the datatables interface
It doesn't need to be in the interface for the core
We were writing changes into a channel but never reading from it.
Looks like dead code. It was being updated but never read.
It's fully constructing everything.
These days we pretty much have to have all GETS provide some sane cache control headers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed a few changes directly. This will take a bit more review yet.
let revs = constructReverseIndex index | ||
update $ ReplaceReverseIndex revs | ||
} | ||
initReverseIndex :: IO () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is being run every time, so we're not actually benefiting from this as a persistent DB. We could just as well use a MemState and not have the complexities of persistent state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we keep this and get rid of the persistent component, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
index <- queryGetPackageIndex | ||
let ReverseIndex _ revdeps nodemap = constructReverseIndex index | ||
assoc = takeWhile (\(a,_) -> a < Bimap.size nodemap) $ Arr.assocs . Gr.transposeG $ revdeps | ||
forM_ (map (\(a,b) -> (nodemap Bimap.!> a, map (nodemap Bimap.!>) b)) assoc) $ uncurry setReverse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is almost certainly quadratic since it's doing each package one by one rather than in bulk.
packageNodeIdMap = nodemap | ||
} | ||
|
||
changePackage :: PackageId -> [PackageId] -> [PackageId] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unused. Should it be used?
|
||
arrDouble :: RevDeps -> RevDeps | ||
arrDouble rev = | ||
Gr.buildG (0, 2*revSize rev) (Gr.edges rev) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not how immutable graphs work. Doubling the size helps nothing, since updating the graph has to copy anyway. That is, insEdges is already O(n), and if needs to add more nodes then that can be done in one pass at the same time, using accumArray
with the new graph bounds.
let index = PackageIndex.insert pkgid pindex | ||
rd' | revSize revs < Bimap.size nodemap = arrDouble revs | otherwise = revs | ||
npm = Bimap.tryInsert pkgid (Bimap.size nodemap) nodemap | ||
rd = insEdges (map (\d -> (npm ! d, [npm ! pkgid])) deps) rd' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This graph enlarging ought to be done differently. Array size doubling is bogus for pure immutable arrays.
resourceGet = [("html", serveReverseList)] | ||
} | ||
] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly these all need cacheControl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto as above -- not quite sure what to do for these, though happy to consult.
Distribution/Server/Features/Html.hs
Outdated
putAliasEdit :: DynamicPath -> ServerPartE Response | ||
putAliasEdit dpath = do | ||
tagname <- tagInPath dpath | ||
mergeTags (Tag tagname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mergeTags
isn't at the right abstraction level. It includes front-end things like looking up query params and then performing some state update. Instead all the front-end bits should live here and the code in the underlying feature should just deal with validation (if any) and the state/cache updates.
We can fix this by moving parts of the mergeTags
code out to here.
Distribution/Server/Features/Tags.hs
Outdated
addns <- optional $ look "addns" | ||
delns <- optional $ look "delns" | ||
raddns <- optional $ look "raddns" | ||
rdelns <- optional $ look "rdelns" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These front-end bits don't really belong here. putTags
should accept all these things as function args and the html layer should get them from the POST params and do the parsing.
Distribution/Server/Features/Tags.hs
Outdated
@@ -195,20 +230,78 @@ tagsFeature CoreFeature{ queryGetPackageIndex | |||
pkgMap <- liftM packageTags $ queryState tagsState GetPackageTags | |||
return $ Map.fromDistinctAscList . map (\pkg -> (pkg, Map.findWithDefault Set.empty pkg pkgMap)) $ Set.toList pkgs | |||
|
|||
mergeTags :: Tag -> ServerPartE () | |||
mergeTags deprTag = do | |||
tags <- optional $ look "tags" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These front-end bits don't really belong here. mergeTags
should accept the tags as function args and the html layer should do the lookup and parsing.
@@ -1519,26 +1599,50 @@ mkHtmlTags HtmlUtilities{..} | |||
pkgname <- packageInPath dpath | |||
_ <- lookupPackageName pkgname -- TODO: necessary? | |||
putTags pkgname | |||
return $ toResponse $ Resource.XHtml $ hackagePage "Set tags" | |||
[toHtml "Put tags for ", packageNameLink pkgname] | |||
currTags <- queryTagsForPackage pkgname |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same point about mergeTags applies to putTags here.
-- The main reverse dependencies map | ||
reverseDependencies :: RevDeps, | ||
data ReverseIndex = ReverseIndex { | ||
duplicatedIndex :: !(PackageIndex PackageId), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this genuinely is redundant. It is used during lookups, but these are cases where we could grab the PackageIndex
from the core feature and use that in combination. This will be a tad easier if we switch to using a MemState
. Then the query functions will just need to take the ReverseIndex
and the core PackageIndex
to do their stuff.
It's not small so it'd be better if we can eliminate it.
where forComponent (_, iftree, elsetree) = harvestDependencies iftree ++ maybe [] harvestDependencies elsetree | ||
constructRevDeps :: PackageIndex PkgInfo -> Bimap PackageId NodeId -> RevDeps | ||
constructRevDeps index nodemap = | ||
let allPackages = concatMap (take 5) $ PackageIndex.allPackagesByName index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea what this take 5 is about? Is it trying to take the 5 latest versions? Are we sure it's not actually getting the 5 oldest?
Also this is not consistent. This is only used on startup, but once we're in a steady state then we get all new versions not just the last 5.
Distribution/Server/Features/Html.hs
Outdated
addns = toStr $ fst revTags | ||
delns = toStr $ snd revTags | ||
trustainer <- authorisedAsMaintainerOrTrustee pkgname | ||
user <- authorisedAsAnyUser |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't likad adding these new ad-hoc auth functions (and particular the new ones they require in turn), and it's better not to add new auth functions to reduce possible confusion. We can just fail immediately if they're not authorised as any user, so overall we can use:
uid <- guardAuthorised [AnyKnownUser]
trustainer <- checkPriviledged uid [InGroup (maintainersGroup pkgname), InGroup trusteesGroup]
now ok having said lets not add new auth functions there's a reasonable argument that we should provide a wrapper for checkPriviledged
in the Users
feature that does what the underlying checkPriviledged
does but supplies the Users
db appropriately (which is what I'm implicitly using above). This at least isn't ad-hoc since it follows the existing pattern.
|
||
authorisedAsAnyUser :: ServerPartE Bool | ||
authorisedAsAnyUser = | ||
guardAuthorised' [AnyKnownUser] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to add these new auth functions, as mentioned above, but adding a wrapper here for checkPriviledged
here is ok. e.g.
checkPriviledged :: UserId -> [PrivilegeCondition] -> ServerPartE Bool
checkPriviledged uid privconds = do
users <- queryGetUserDb
Auth.checkPriviledged users uid privconds
@@ -412,6 +421,11 @@ userFeature templates usersState adminsState | |||
overrideResponse <- msum <$> runHook authFailHook err | |||
throwError (fromMaybe defaultResponse overrideResponse) | |||
|
|||
-- Check if there is an authenticated userid, and return info, if so. | |||
getAuthentication :: ServerPartE (Maybe (UserId, UserInfo)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be called checkAuthenticated
since that's what it wraps and that's the pattern here.
instance MemSize a => MemSize (IntMap a) where | ||
memSize m = sum [ 8 + memSize v | v <- IntMap.elems m ] | ||
|
||
instance MemSize Graph where | ||
memSize m = sum [ 6 + memSize v | v <- Gr.edges m ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should just have an instance for ordinary boxed Arrays then this would be automatic. We've got one for unboxed arrays, boxed should be easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know the right instance offhand, but feel free to go for it... (shouldn't be a blocker either way)
…ime in mem, avoid spurious doubling, don't dup index
Ok, so this should fix everything except for the cache control stuff, which I'll leave to @dcoutts (though I'm happy to consult). Everything with the revdep index is now created directly at startup with no persistence, but it should still be relatively fast I think. Worth testing with a full hackage index just to be sure. Also redid the auth stuff as requested, using getAuthorised' from the users feature, which already existed and was good enough... |
Any chance of reviewing this soon? |
? |
@hvr I think that with the new revdep branch I made there's nothing left on here that hasn't found its way elsewhere? |
everything has now been merged in pieces i believe. |
This is a cleanup of #514 that should supersede it. Changed revdeps in the display to give more useful numbers (after nub applied), made maintainers actually maintainers and not metadata uploaders, made display of already made votes work, improved table styling, added deprecation info to table, and other small UI improvements and cleanups. I'm pretty happy with the result.
I think there's more work to be done on the deploy branch with regards to improving the frontpage and some other docs, but I'm relatively happy here.